3. Software setup

The Poplar SDK is already installed on the Graphcloud server. You will need to do some initial configuration to be able to use it, as described below.

3.1. Setting up the SDK environment

To use the Graphcore tools and Poplar libraries, several environment variables (such as library and binary paths) need to be set up, as shown below:

$ cd /opt/gc/poplar_sdk-ubuntu_18_04-[ver]
$ source poplar-ubuntu_18_04-[ver]/enable.sh
$ source popart-ubuntu_18_04-[ver]/enable.sh

Where [ver] is the current software version number of each package.

You will need to source both the Poplar and the PopART enable scripts if you are using PyTorch or PopART.

If you attempt to build Poplar software without first sourcing these scripts, the C++ compiler will report an error similar to the following (the exact message will depend on your code):

fatal error: 'poplar/Engine.hpp' file not found

Note

These scripts must be sourced in each new Bash shell. You can add the source commands to your .bashrc so that this happens automatically.
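A minimal sketch of what such a .bashrc entry might look like is given here. The function name enable_poplar_sdk is hypothetical, and the sketch assumes the enable scripts can be sourced by absolute path from any directory; if that does not hold for your SDK version, cd into the SDK directory first, as in the commands above.

```shell
# Sketch for ~/.bashrc: source the Poplar and PopART enable scripts if present.
# enable_poplar_sdk is a hypothetical helper name, not part of the SDK.
enable_poplar_sdk() {
    _sdk_dir="$1"
    _found=1    # non-zero return status until at least one script is sourced
    for _script in "$_sdk_dir"/poplar-*/enable.sh "$_sdk_dir"/popart-*/enable.sh; do
        # An unmatched glob stays literal, so check that the file really exists.
        if [ -f "$_script" ]; then
            . "$_script"    # "." is the portable spelling of "source"
            _found=0
        fi
    done
    unset _sdk_dir _script
    return "$_found"
}

# Replace [ver] with the version installed on your server.
enable_poplar_sdk "/opt/gc/poplar_sdk-ubuntu_18_04-[ver]" ||
    echo "Poplar SDK enable scripts not found" >&2
```

Checking that each file exists before sourcing it means a new shell still starts cleanly if the SDK directory is later moved or upgraded.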

You can verify that Poplar has been successfully set up by running:

$ popc --version

This will display the version number of the installed software.
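If you want a friendlier message when the environment has not been set up, the check can be wrapped as in the sketch below. The helper name require_tool is an assumption made for illustration; only popc --version comes from the SDK.

```shell
# require_tool is a hypothetical helper: succeed only if the command is on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1 not found: source the enable.sh scripts first" >&2
    return 1
}

# Only ask for the version when popc can actually be found.
if require_tool popc; then
    popc --version
fi
```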

No further setup is required to use PopART and Poplar. PopTorch and TensorFlow for the IPU are provided as Python wheel files that can be installed using pip as described in the following sections.

3.2. Setting up PyTorch for the IPU

PopTorch is part of the Poplar SDK. It provides functions to allow PyTorch models to run on the IPU.

Before running PopTorch, you must source the enable.sh scripts for Poplar and PopART as described in Section 3.1, Setting up the SDK environment.

PopTorch is packaged as a Python wheel file that can be installed using pip.

Note

PopTorch requires pip version 18.1 or later, so it is important to make sure you have an up-to-date version of pip before installing PopTorch.
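One way to check the pip version from a script is sketched below. The helper version_ge is hypothetical, and the comparison relies on the version sort option (sort -V) of GNU coreutils, which is assumed to be available on the host.

```shell
# version_ge HAVE WANT: succeed if HAVE >= WANT (hypothetical helper).
# Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v pip3 >/dev/null 2>&1; then
    # "pip3 --version" prints "pip <version> from <path> ..."; take field 2.
    pip_version=$(pip3 --version | awk '{print $2}')
    if version_ge "$pip_version" 18.1; then
        echo "pip $pip_version is recent enough"
    else
        echo "pip $pip_version is too old; run: pip3 install -U pip"
    fi
fi
```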

We recommend creating a virtual environment, using virtualenv, to isolate your PopTorch environment from the system Python environment. You can create a virtual environment in a workspace directory and install PopTorch as shown below:

$ virtualenv -p python3 ~/workspace/poptorch_env
$ source ~/workspace/poptorch_env/bin/activate
$ pip3 install -U pip
$ pip3 install poptorch-[ver].whl

Where [ver] is the SDK version.
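Because the wheel file name depends on the version, a script can look the file up instead of hard-coding it. The following is only a sketch: find_one_wheel and WHEEL_DIR are hypothetical names, and the location of the wheel file is an assumption you need to supply.

```shell
# find_one_wheel DIR PREFIX: print the only DIR/PREFIX*.whl file.
# Hypothetical helper; fails if there are zero or several candidates.
find_one_wheel() {
    set -- "$1"/"$2"*.whl    # reuse the positional parameters as the match list
    if [ "$#" -eq 1 ] && [ -f "$1" ]; then
        printf '%s\n' "$1"
    else
        echo "expected exactly one matching wheel" >&2
        return 1
    fi
}

# WHEEL_DIR is an assumed variable; point it at the directory holding the wheel.
WHEEL_DIR="${WHEEL_DIR:-.}"
if wheel=$(find_one_wheel "$WHEEL_DIR" poptorch-); then
    echo "would run: pip3 install $wheel"
fi
```

The sketch prints the install command rather than running it, so you can confirm the match before installing.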

To confirm that PopTorch has been installed, you can use pip list, which should include the poptorch package in the output.

You can also test that the module has been installed correctly by attempting to import it in Python, for example:

$ python3 -c "import poptorch; print(poptorch.__version__)"

For more information, refer to PyTorch for the IPU: User Guide.

3.3. Setting up TensorFlow for the IPU

Before running TensorFlow, you must source the enable.sh script for Poplar as described in Section 3.1, Setting up the SDK environment.

To use the Graphcore port of TensorFlow, you must set up a Python virtual environment.

You can create a virtual environment in a workspace directory and install TensorFlow as shown below:

$ virtualenv -p python3.6 ~/workspace/tensorflow_env

Then activate it.

$ source ~/workspace/tensorflow_env/bin/activate

Now all installations will be local to that virtual environment.
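Before installing anything, you may want to confirm that a virtual environment really is active. The sketch below uses the VIRTUAL_ENV variable, which the activate script exports; the helper name in_virtualenv is hypothetical.

```shell
# in_virtualenv: succeed if a virtual environment is active in this shell.
# The activate script exports VIRTUAL_ENV; the helper name is hypothetical.
in_virtualenv() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active virtual environment: $VIRTUAL_ENV"
        return 0
    fi
    echo "no virtual environment is active" >&2
    return 1
}

# Report the state without aborting the surrounding script.
in_virtualenv || true
```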

We support TensorFlow 1 and TensorFlow 2. There are versions of these compiled for Intel and AMD processors to provide the best performance on those hosts. As a result, there are four Python wheel files that can be installed with pip.

Warning

You must install the correct wheel file for your host CPU. You can use the command lscpu to determine the CPU type, if you are not sure.
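As a rough sketch, the vendor can be read from the "Vendor ID" line of the lscpu output. The helper vendor_to_arch and the labels it prints are illustrative assumptions; use whatever [arch] string appears in the names of the wheel files you have.

```shell
# vendor_to_arch VENDOR: map an lscpu "Vendor ID" value to a short label.
# The labels "intel" and "amd" are illustrative, not the official [arch] values.
vendor_to_arch() {
    case "$1" in
        GenuineIntel) echo intel ;;
        AuthenticAMD) echo amd ;;
        *) echo "unrecognised CPU vendor: $1" >&2; return 1 ;;
    esac
}

if command -v lscpu >/dev/null 2>&1; then
    # Extract the value of the "Vendor ID:" line and trim leading whitespace.
    vendor=$(lscpu | awk -F: '/^Vendor ID/ {gsub(/^[ \t]+/, "", $2); print $2; exit}')
    vendor_to_arch "$vendor" || true
fi
```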

For example, to install Graphcore’s TensorFlow distribution, compatible with v2.1.0 of TensorFlow, you would use a command similar to the following:

$ pip install tensorflow-2.1.0+[ver]+[arch].whl

Where [ver] is the TensorFlow version you have downloaded, and [arch] is the host CPU architecture (Intel or AMD).

To confirm that tensorflow has been installed, you can use pip list, which should include the tensorflow package in the output, for example:

(tensorflow_env) jsp$ pip list
Package        Version
-------------  ----------
future         0.18.2
numpy          1.19.5
pip            20.3.3
pkg-resources  0.0.0
tensorflow     2.1.0
setuptools     51.1.2
torch          1.6.0+cpu
wheel          0.36.2

You can also test that the module has been installed correctly by importing it in Python, for example:

$ python -c "from tensorflow.python import ipu"

For the next steps with TensorFlow, refer to the IPU user guide for the TensorFlow version you installed (TensorFlow 1 or TensorFlow 2).

3.4. Verifying the hardware configuration

Poplar needs a configuration file that describes how to connect to the IPU-M2000s that you have been allocated. This file is in the directory .ipuof.config.d in your home directory and is read-only.

This file contains information about:

  • The number of IPUs assigned to you as a “vPOD”

  • Networking details for how to connect to the vPOD

The file name is decided by the Graphcloud admin when setting up your account and can vary. There will only be one configuration file in this directory.

If this file is missing, then please contact your Graphcloud support team or use the resources on the Graphcore support portal https://www.graphcore.ai/support.
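The check described above can be scripted along the following lines. This is a sketch under the stated assumption that the directory holds exactly one regular file; find_ipuof_config is a hypothetical helper name.

```shell
# find_ipuof_config DIR: print the single regular file in DIR.
# Hypothetical helper; fails when the directory is missing, empty,
# or holds more than one file.
find_ipuof_config() {
    if [ ! -d "$1" ]; then
        echo "no such directory: $1" >&2
        return 1
    fi
    _files=$(find "$1" -maxdepth 1 -type f)
    # Count non-empty lines; awk exits 0 even when nothing matches.
    _count=$(printf '%s\n' "$_files" | awk '/./ {n++} END {print n + 0}')
    if [ "$_count" -ne 1 ]; then
        echo "expected one configuration file in $1, found $_count" >&2
        return 1
    fi
    printf '%s\n' "$_files"
}

if config=$(find_ipuof_config "$HOME/.ipuof.config.d"); then
    echo "Configuration file: $config"
fi
```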

You can use the IPU command line tools to check what IPU hardware is visible to the system. For example, gc-info lists the IPU hardware made available through this configuration file:

$ gc-info -a
Graphcore device listing:

-+- Id:  [0], target:    [Fabric], PCI Domain: [3]
-+- Id:  [1], target:    [Fabric], PCI Domain: [2]
-+- Id:  [2], target:    [Fabric], PCI Domain: [1]
-+- Id:  [3], target:    [Fabric], PCI Domain: [0]
-+- Id:  [4], target:    [Fabric], PCI Domain: [3]
-+- Id:  [5], target:    [Fabric], PCI Domain: [2]
-+- Id:  [6], target:    [Fabric], PCI Domain: [1]
-+- Id:  [7], target:    [Fabric], PCI Domain: [0]
-+- Id:  [8], target:    [Fabric], PCI Domain: [3]
-+- Id:  [9], target:    [Fabric], PCI Domain: [2]
-+- Id: [10], target:    [Fabric], PCI Domain: [1]
-+- Id: [11], target:    [Fabric], PCI Domain: [0]
-+- Id: [12], target:    [Fabric], PCI Domain: [3]
-+- Id: [13], target:    [Fabric], PCI Domain: [2]
-+- Id: [14], target:    [Fabric], PCI Domain: [1]
-+- Id: [15], target:    [Fabric], PCI Domain: [0]
-+- Id: [16], target: [Multi IPU]
|--- PCIe Id:  [0], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id:  [1], DNC Id: [1], PCI Domain: [2]
-+- Id: [17], target: [Multi IPU]
|--- PCIe Id:  [2], DNC Id: [0], PCI Domain: [1]
|--- PCIe Id:  [3], DNC Id: [1], PCI Domain: [0]
-+- Id: [18], target: [Multi IPU]
|--- PCIe Id:  [4], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id:  [5], DNC Id: [1], PCI Domain: [2]
-+- Id: [19], target: [Multi IPU]
|--- PCIe Id:  [6], DNC Id: [0], PCI Domain: [1]
|--- PCIe Id:  [7], DNC Id: [1], PCI Domain: [0]
-+- Id: [20], target: [Multi IPU]
|--- PCIe Id:  [8], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id:  [9], DNC Id: [1], PCI Domain: [2]
-+- Id: [21], target: [Multi IPU]
|--- PCIe Id: [10], DNC Id: [0], PCI Domain: [1]
|--- PCIe Id: [11], DNC Id: [1], PCI Domain: [0]
-+- Id: [22], target: [Multi IPU]
|--- PCIe Id: [12], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id: [13], DNC Id: [1], PCI Domain: [2]
-+- Id: [23], target: [Multi IPU]
|--- PCIe Id: [14], DNC Id: [0], PCI Domain: [1]
|--- PCIe Id: [15], DNC Id: [1], PCI Domain: [0]
-+- Id: [24], target: [Multi IPU]
|--- PCIe Id:  [0], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id:  [1], DNC Id: [1], PCI Domain: [2]
|--- PCIe Id:  [2], DNC Id: [2], PCI Domain: [1]
|--- PCIe Id:  [3], DNC Id: [3], PCI Domain: [0]
-+- Id: [25], target: [Multi IPU]
|--- PCIe Id:  [4], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id:  [5], DNC Id: [1], PCI Domain: [2]
|--- PCIe Id:  [6], DNC Id: [2], PCI Domain: [1]
|--- PCIe Id:  [7], DNC Id: [3], PCI Domain: [0]
-+- Id: [26], target: [Multi IPU]
|--- PCIe Id:  [8], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id:  [9], DNC Id: [1], PCI Domain: [2]
|--- PCIe Id: [10], DNC Id: [2], PCI Domain: [1]
|--- PCIe Id: [11], DNC Id: [3], PCI Domain: [0]
-+- Id: [27], target: [Multi IPU]
|--- PCIe Id: [12], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id: [13], DNC Id: [1], PCI Domain: [2]
|--- PCIe Id: [14], DNC Id: [2], PCI Domain: [1]
|--- PCIe Id: [15], DNC Id: [3], PCI Domain: [0]
-+- Id: [28], target: [Multi IPU]
|--- PCIe Id:  [0], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id:  [1], DNC Id: [1], PCI Domain: [2]
|--- PCIe Id:  [2], DNC Id: [2], PCI Domain: [1]
|--- PCIe Id:  [3], DNC Id: [3], PCI Domain: [0]
|--- PCIe Id:  [4], DNC Id: [4], PCI Domain: [3]
|--- PCIe Id:  [5], DNC Id: [5], PCI Domain: [2]
|--- PCIe Id:  [6], DNC Id: [6], PCI Domain: [1]
|--- PCIe Id:  [7], DNC Id: [7], PCI Domain: [0]
-+- Id: [29], target: [Multi IPU]
|--- PCIe Id:  [8], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id:  [9], DNC Id: [1], PCI Domain: [2]
|--- PCIe Id: [10], DNC Id: [2], PCI Domain: [1]
|--- PCIe Id: [11], DNC Id: [3], PCI Domain: [0]
|--- PCIe Id: [12], DNC Id: [4], PCI Domain: [3]
|--- PCIe Id: [13], DNC Id: [5], PCI Domain: [2]
|--- PCIe Id: [14], DNC Id: [6], PCI Domain: [1]
|--- PCIe Id: [15], DNC Id: [7], PCI Domain: [0]
-+- Id: [30], target: [Multi IPU]
|--- PCIe Id:  [0], DNC Id: [0], PCI Domain: [3]
|--- PCIe Id:  [1], DNC Id: [1], PCI Domain: [2]
|--- PCIe Id:  [2], DNC Id: [2], PCI Domain: [1]
|--- PCIe Id:  [3], DNC Id: [3], PCI Domain: [0]
|--- PCIe Id:  [4], DNC Id: [4], PCI Domain: [3]
|--- PCIe Id:  [5], DNC Id: [5], PCI Domain: [2]
|--- PCIe Id:  [6], DNC Id: [6], PCI Domain: [1]
|--- PCIe Id:  [7], DNC Id: [7], PCI Domain: [0]
|--- PCIe Id:  [8], DNC Id: [8], PCI Domain: [3]
|--- PCIe Id:  [9], DNC Id: [9], PCI Domain: [2]
|--- PCIe Id: [10], DNC Id: [10], PCI Domain: [1]
|--- PCIe Id: [11], DNC Id: [11], PCI Domain: [0]
|--- PCIe Id: [12], DNC Id: [12], PCI Domain: [3]
|--- PCIe Id: [13], DNC Id: [13], PCI Domain: [2]
|--- PCIe Id: [14], DNC Id: [14], PCI Domain: [1]
|--- PCIe Id: [15], DNC Id: [15], PCI Domain: [0]

If this command does not bring up a list of devices or shows an error when trying to connect to the IPU resources, then you should contact your Graphcloud support team or use the resources on the Graphcore support portal https://www.graphcore.ai/support.
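For a quick sanity check, the listing can be reduced to a pair of counts. The sketch below assumes the "target: [...]" layout shown in the listing above, which may differ between tool versions; summarise_devices is a hypothetical helper name.

```shell
# summarise_devices: read "gc-info -a" style text on stdin and count the
# entries whose target is [Fabric] and those whose target is [Multi IPU].
# Hypothetical helper; it assumes the layout shown in the listing above.
summarise_devices() {
    awk '
        /target: *\[Fabric\]/    { fabric++ }
        /target: *\[Multi IPU\]/ { multi++ }
        END { printf "fabric=%d multi=%d\n", fabric, multi }
    '
}

if command -v gc-info >/dev/null 2>&1; then
    gc-info -a | summarise_devices
fi
```

For the listing above, this would report 16 [Fabric] entries (Ids 0 to 15) and 15 [Multi IPU] entries (Ids 16 to 30). A count of zero for both is a sign that the configuration file is not being picked up.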

You can also run gc-monitor to view more IPU-related details. See the IPU Command Line Tools document for more details about other driver-level diagnostic tools available.