3. Quick start for beginners

This section provides more detail on the steps described in the Quick start for experts section.

Complete any setup necessary to use your IPU system (see Section 1.1, IPU systems) before performing the following steps.

3.1. Enable the Poplar SDK

Note

We recommend that you use the latest version of the Poplar SDK.

On some systems you must explicitly enable the Poplar SDK before you can use PyTorch or TensorFlow for the IPU, or the Poplar Graph Programming Framework. On other systems, the SDK is enabled as part of the login process.

Table 3.1 shows whether you have to explicitly enable the SDK on your system, and where to find it.

Table 3.1 Systems that need the Poplar SDK to be enabled and the SDK location

System        Enable SDK?   SDK location
------------  ------------  ----------------------------------------------------------------
Pod system    Yes           The SDK is in the directory where you extracted the SDK tarball.
Graphcloud    Yes           /opt/gc/poplar_sdk-ubuntu_18_04-[poplar_ver]+[build]
                            where [poplar_ver] is the software version number of the Poplar
                            SDK and [build] is the build information.
Gcore Cloud   No            The SDK has been enabled as part of the login process.

To enable the Poplar SDK:

For SDK versions 2.6 and later, there is a single enable script that determines whether you are using Bash or Zsh and runs the appropriate scripts to enable both Poplar and PopTorch/PopART.

Source the single script as follows:

$ source [path_to_SDK]/enable

where [path_to_SDK] is the location of the Poplar SDK on your system.
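
For example, on a Graphcloud system, using the SDK location given in Table 3.1:

$ source /opt/gc/poplar_sdk-ubuntu_18_04-[poplar_ver]+[build]/enable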

Note

You must source the Poplar enable script in each new shell. To do this on a more permanent basis, you can add the source command to your .bashrc (or to your .zshrc if you are using Zsh with SDK version 2.6 or later).
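
For example, assuming the SDK was extracted to ~/poplar_sdk-3.2.0 (a hypothetical location; use the path on your system), you could add the following line to your shell startup file:

source ~/poplar_sdk-3.2.0/enable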

If you attempt to run any Poplar software without having first sourced this script, you will get an error from the C++ compiler similar to the following (the exact message will depend on your code):

fatal error: 'poplar/Engine.hpp' file not found

If you try to source the script after it has already been sourced, then you will get an error similar to:

ERROR: A Poplar SDK has already been enabled.
Path of enabled Poplar SDK: /opt/gc/poplar_sdk-ubuntu_20_04-3.2.0-7cd8ade3cd/poplar-ubuntu_20_04-3.2.0-7cd8ade3cd
If this is not wanted then please start a new shell.

You can verify that Poplar has been successfully set up by running:

$ popc --version

This will display the version of the installed software.
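
You can also check that the enable script has set the POPLAR_SDK_ENABLED environment variable, which is used later in Section 3.3, Install the PopTorch wheel and validate:

$ echo $POPLAR_SDK_ENABLED

This should print the path of the enabled Poplar installation; if the SDK has not been enabled, the output is empty.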

3.2. Create and enable a Python virtual environment

It is good practice to work in a separate Python virtual environment for each framework, or even for each application. This section describes how to create and activate a Python virtual environment.

Note

You must activate the Python virtual environment before you can start using it.

The virtual environment must be created for the Python version you will be using. This cannot be changed after creation. Create a new Python virtual environment with:

$ virtualenv -p python3 [venv_name]

where [venv_name] is the location of the virtual environment.

Note

Make sure that the version of Python that is installed is compatible with the version of the Poplar SDK that you are using. See Supported tools in the Poplar SDK release notes for information about the supported operating systems and versions of tools.

To start using a virtual environment, activate it with:

$ source [venv_name]/bin/activate

where [venv_name] is the location of the virtual environment.

All subsequent package installations will be local to that virtual environment.
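
For example, to create and activate a virtual environment named poptorch_env in your home directory (the name and location here are arbitrary):

$ virtualenv -p python3 ~/poptorch_env
$ source ~/poptorch_env/bin/activate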

3.3. Install the PopTorch wheel and validate

This section describes how to install a wheel file containing PopTorch, a set of extensions to enable PyTorch models to run on the IPU.

Note

You should activate the Python virtual environment you created for PyTorch (Section 3.2, Create and enable a Python virtual environment) before performing the setup in this section.

The wheel file is included with the Poplar SDK and has a name of the form:

poptorch-[sdk_ver]+*.whl

where [sdk_ver] is the version of the Poplar SDK. An example of the wheel file is:

poptorch-3.2.0+109946_bb50ce43ab_ubuntu_20_04-cp38-cp38-linux_x86_64.whl

Install the PopTorch wheel file using the following command:

$ python -m pip install ${POPLAR_SDK_ENABLED?}/../poptorch-*.whl

where POPLAR_SDK_ENABLED is an environment variable, set when the SDK was enabled, that contains the location of the Poplar SDK. The ? causes the shell to display an error if the variable is not set, which indicates that Poplar has not been enabled.
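
If the installation fails because the wheel file cannot be found, you can check that the SDK is enabled and that the wheel exists by listing it:

$ ls ${POPLAR_SDK_ENABLED?}/../poptorch-*.whl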

To confirm that PopTorch has been installed, you can use:

$ pip list | grep poptorch

For the example wheel file, the output will be:

poptorch       3.2.0+109946

You can also test that the module has been installed correctly by attempting to import it in Python, for example:

$ python3 -c "import poptorch; print(poptorch.__version__)"

3.4. Clone the Graphcore examples

On some systems you may need to clone the Graphcore examples repository, as detailed in Table 3.2.

If you don’t need to clone the examples repository, then go straight to Section 3.5, Define environment variable.

Table 3.2 Systems that need the Graphcore tutorials and examples repositories to be cloned

System        Clone repos?  Comment
------------  ------------  ----------------------------------------------------------------
Pod system    Yes           You can clone the tutorials and examples repos in any location.
Graphcloud    Yes           You can clone the tutorials and examples repos in any location.
Gcore Cloud   No            The tutorials and examples have already been cloned in
                            ~/graphcore/tutorials and ~/graphcore/examples respectively.

You can clone the examples repository into a location of your choice.

To clone the examples repository for the latest version of the Poplar SDK:

$ cd ~/[base_dir]
$ git clone https://github.com/graphcore/examples.git

where [base_dir] is a location of your choice. This will place the contents of the examples repository under ~/[base_dir]/examples. The tutorials are in ~/[base_dir]/examples/tutorials.
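
You can confirm that the clone succeeded by listing the tutorials directory:

$ ls ~/[base_dir]/examples/tutorials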

Note

If you are using a version of the Poplar SDK prior to version 3.2, then refer to Section A, Install examples and tutorials for older Poplar SDK versions for how to install examples and tutorials.

3.5. Define environment variable

To simplify running the tutorials, define the environment variable POPLAR_TUTORIALS_DIR to point to the location of the cloned tutorials:

$ export POPLAR_TUTORIALS_DIR=~/[base_dir]/examples/tutorials

where [base_dir] is the location where you cloned the examples repository.
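
For example, if you cloned the repository under ~/graphcore_dev (a hypothetical location), you would run the following; the second command lists the MNIST tutorial directory used in Section 3.6 as a quick check:

$ export POPLAR_TUTORIALS_DIR=~/graphcore_dev/examples/tutorials
$ ls $POPLAR_TUTORIALS_DIR/simple_applications/pytorch/mnist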

3.6. Run the application

This section describes how to run a simple MNIST training application. You can find the source code and a description of the application in the tutorials repository.

  1. Install required packages

    You can now install the packages that the application requires.

$ cd $POPLAR_TUTORIALS_DIR/simple_applications/pytorch/mnist/
$ pip install -r requirements.txt

  2. Run the application

You run the application with the command:

$ python3 mnist_poptorch.py

There are some command line options for the application. You can experiment with the following options; an example invocation is shown after this list:

  • --batch-size: batch size

  • --device-iterations: number of iterations

  • --test-batch-size: size of the test batch

  • --epochs: number of epochs

  • --lr: learning rate
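
For example, to train with a larger batch size for fewer epochs (the values here are illustrative only):

$ python3 mnist_poptorch.py --batch-size 16 --epochs 2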

  3. If the application has run successfully, you should see output similar to that in Listing 3.1.

Listing 3.1 Example output from the PyTorch MNIST application.
$ python3 mnist_poptorch.py

  Epoch 1/10
    0%|   | 0/150 [00:00<00:00]
  Graph compilation: 100%|  | 100/100 [00:36<00:00]
  Loss:1.4709 | Accuracy:100.00%: 100%|  | 150/150 [00:52<00:00,  2.88it/s]
  Epoch 2/10
  Loss:1.4612 | Accuracy:100.00%: 100%|  | 150/150 [00:12<00:00, 12.42it/s]
  Epoch 3/10
  Loss:1.5861 | Accuracy:87.50%: 100%|  | 150/150 [00:11<00:00, 13.08it/s]
  Epoch 4/10
  Loss:1.4973 | Accuracy:100.00%: 100%|  | 150/150 [00:11<00:00, 12.94it/s]
  Epoch 5/10
  Loss:1.4612 | Accuracy:100.00%: 100%|  | 150/150 [00:11<00:00, 12.74it/s]
  Epoch 6/10
  Loss:1.4745 | Accuracy:100.00%: 100%|  | 150/150 [00:11<00:00, 12.73it/s]
  Epoch 7/10
  Loss:1.5811 | Accuracy:87.50%: 100%|| 150/150 [00:11<00:00, 13.04it/s]
  Epoch 8/10
  Loss:1.5861 | Accuracy:87.50%: 100%|| 150/150 [00:12<00:00, 12.40it/s]
  Epoch 9/10
  Loss:1.4612 | Accuracy:100.00%: 100%|| 150/150 [00:11<00:00, 12.81it/s]
  Epoch 10/10
  Loss:1.4612 | Accuracy:100.00%: 100%|| 150/150 [00:11<00:00, 12.91it/s]
  Graph compilation: 100%|  | 100/100 [00:22<00:00]
  100%|   | 125/125 [00:30<00:00,  4.03it/s]
  Accuracy on test set: 98.58%

You have now run an application that demonstrates how to use the IPU to train a neural network for classification on the MNIST dataset using PyTorch.

3.7. Exit the virtual environment

When you are done, exit the Python virtual environment.

$ deactivate

3.8. Try out other applications

The examples repo contains other tutorials and applications you can try. See Section 4, Next steps for more information.