1. Introduction
This guide explains how you can run applications in Docker on a Linux machine with one or more physical IPU devices.
Prerequisites:
A machine with IPU devices
Ubuntu 18.04 / CentOS 7.6
2. Initial setup
First, check whether the IPU device driver is installed on your machine by running the following command:
$ modinfo ipu_driver
If the driver is installed, you should see something similar to:
$ modinfo ipu_driver
filename: /lib/modules/4.15.0-55-generic/updates/dkms/ipu_driver.ko
version: 1.0.39
description: IPU PCI Driver
author: Graphcore Limited
license: GPL
srcversion: 49FFB7D8556EB58899AE41A
alias: pci:v00001D95d00000003sv*sd*bc*sc*i*
alias: pci:v00001D95d00000002sv*sd*bc*sc*i*
alias: pci:v00001D95d00000001sv*sd*bc*sc*i*
depends:
retpoline: Y
name: ipu_driver
vermagic: 4.15.0-55-generic SMP mod_unload
parm: memmap_start:array of ulong
parm: memmap_size:array of ulong
If so, proceed to the next section. If the command instead returns an error such as:
$ modinfo ipu_driver
modinfo: ERROR: Module ipu_driver not found.
you will need to install the driver. See the Getting Started Guide for your IPU system for more information.
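The same check can be scripted, for example as part of a machine provisioning step. The following is a minimal sketch, not part of the Poplar SDK: the function names are hypothetical, and the only assumptions are that modinfo exits with a non-zero status when it cannot find a module and that loaded modules are listed in /proc/modules.

```shell
#!/bin/sh
# Hypothetical helpers; "ipu_driver" is the module name used in this guide.

# Succeeds if modinfo is available and can find the named kernel module.
module_available() {
    command -v modinfo >/dev/null 2>&1 && modinfo "$1" >/dev/null 2>&1
}

# Succeeds if the named module is currently loaded (listed in /proc/modules).
module_loaded() {
    grep -q "^$1 " /proc/modules 2>/dev/null
}

# Prints a one-line status for the named module.
report_module() {
    if ! module_available "$1"; then
        echo "$1: not installed"
    elif ! module_loaded "$1"; then
        echo "$1: installed but not loaded"
    else
        echo "$1: installed and loaded"
    fi
}

report_module ipu_driver
```

Note that modinfo only reports whether the module is installed. The separate /proc/modules check is what distinguishes an installed driver from one that is actually loaded.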
3. Using gc-docker
The Graphcore Poplar SDK includes some command line tools for managing the IPU system.
The gc-docker command is a small wrapper around docker run that adds the flags needed to use a set of IPU devices inside a running container.
If gc-docker is not on your PATH, go to the Poplar installation directory and enable the command-line tools:
$ cd [poplar-installation-path]
$ source enable.sh
This must be done in each shell. Alternatively, you can run the following command to automatically source it in all new Bash login shells:
$ echo 'source [full-path-to-extracted-poplar]/enable.sh' >> ~/.bash_profile
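To confirm that the command-line tools are available in the current shell, you can look the command up on your PATH. This is a small sketch; the helper name on_path is hypothetical and uses only the standard command -v shell built-in.

```shell
#!/bin/sh
# Succeeds if the named command can be found on the current PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

if on_path gc-docker; then
    echo "gc-docker found at: $(command -v gc-docker)"
else
    echo "gc-docker not on PATH; source enable.sh from the Poplar installation directory"
fi
```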
3.1. Loading Docker images
First, download the Poplar image bundle from the Graphcore Downloads portal.
Then load the bundle into your local Docker daemon:
$ docker load --input=poplar-docker-images-1.3.0.tar.gz
Check that the images have loaded and have been tagged. For example (output trimmed):
$ docker images
REPOSITORY             TAG               IMAGE ID
graphcore/tools        1.3.0             c2f5ebc91d4b
graphcore/tensorflow   1                 6175c27cb631
graphcore/tensorflow   1-amd             6175c27cb631
graphcore/tensorflow   2                 ae99a3fd3181
graphcore/tensorflow   2-amd             ae99a3fd3181
graphcore/tensorflow   1-intel           db5fef31303d
graphcore/tensorflow   2-intel           cb3b3a41321e
graphcore/pytorch      1.3.0             d84478558ab0
graphcore/poplar       1.3.0             c744278a89b2
ubuntu                 bionic-20200903   c14bccfdea1c
graphcore/tools: contains tools to interact with IPU devices.
graphcore/poplar: contains Poplar and PopART.
graphcore/tensorflow: is based on graphcore/poplar, with TensorFlow installed on top. These images are tagged 1 and 2 to select TensorFlow 1 or TensorFlow 2. AMD-optimised builds are the default, but can also be selected explicitly with the 1-amd and 2-amd tags. Builds that use Intel-specific instructions are available with the 1-intel and 2-intel tags.
graphcore/pytorch: is based on graphcore/poplar, with PyTorch and PopTorch installed.
Note
This tarball method of container image delivery will be replaced with a Docker registry in future, which will enable docker pull to be used instead.
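If you want to check for the expected images in a script rather than by eye, you can compare a list of required images against the list Docker reports. The following is a minimal sketch: the function name missing_images is hypothetical, and the sample list stands in for real docker images output so that the logic can be followed without a Docker daemon.

```shell
#!/bin/sh
# Reads "repository:tag" lines on stdin and prints every required image
# (given as arguments) that does not appear in that list.
missing_images() {
    loaded=$(cat)
    for image in "$@"; do
        printf '%s\n' "$loaded" | grep -qxF -- "$image" || echo "$image"
    done
}

# Sample input standing in for the output of:
#   docker images --format '{{.Repository}}:{{.Tag}}'
sample='graphcore/tools:1.3.0
graphcore/tensorflow:2
graphcore/poplar:1.3.0'

# Reports graphcore/pytorch:1.3.0, which is absent from the sample list.
printf '%s\n' "$sample" | missing_images graphcore/tools:1.3.0 graphcore/pytorch:1.3.0
```

On a real system, pipe the docker images command shown in the comment into missing_images instead of the sample text.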
3.2. Verifying IPU access from inside a container
First, check that you have access to the IPU devices on the host. To do this, run gc-info -l and check that the output contains a list of devices.
Next, run the same command inside a container:
$ gc-docker -- --rm -ti graphcore/tools gc-info -l
The output should be the same.
Check that you can run a TensorFlow container with gc-docker, and make sure the IPUs are visible to TensorFlow:
$ gc-docker -- --rm -ti graphcore/tensorflow:2 python3
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
>>> tensorflow.config.list_physical_devices("IPU")
[PhysicalDevice(name='/physical_device:IPU:0', device_type='IPU')]
>>>
The syntax for running an image with gc-docker is similar to that of docker run, which is:
$ docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
The main difference is that docker run is replaced with gc-docker --.
So, in the TensorFlow example above, we used the graphcore/tensorflow:2 image and ran python3 as the command. No arguments were passed to python3.
The -- part of this command tells gc-docker that the rest of the arguments should be passed directly to docker run. gc-docker also has a few options of its own, which are given before the --. For example, you can pass a subset of IPU devices using --device-id n:
$ gc-docker --device-id 4 -- --rm -ti graphcore/tools gc-info -l
Note that device IDs inside the container always start from zero, so a selected subset of devices is renumbered from 0. For example, if you use devices 4 to 7, they will have IDs 0 to 3 in the container.
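The renumbering is simple arithmetic, illustrated by the sketch below. It assumes that a contiguous range of host devices is renumbered in ascending order, as in the example of devices 4 to 7. The function name container_ids is hypothetical, and the script does not query any devices.

```shell
#!/bin/sh
# Prints "host ID -> container ID" for a contiguous range of host device IDs.
# Illustration of the renumbering only; assumes ascending order is preserved.
container_ids() {
    first=$1
    last=$2
    host=$first
    while [ "$host" -le "$last" ]; do
        echo "host $host -> container $((host - first))"
        host=$((host + 1))
    done
}

# Prints four lines, from "host 4 -> container 0" to "host 7 -> container 3".
container_ids 4 7
```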
The --echo option is also useful. This makes gc-docker print the Docker command it would have run. For example:
$ gc-docker --echo --device-id 4 -- --rm -ti graphcore/tools gc-info -l
docker run --device=/dev/ipu4:/dev/ipu4 --device=/dev/ipu4_ex:/dev/ipu4_ex -ti graphcore/tools gc-info -l
For more information, use the --help option or refer to the IPU Command Line Tools document.
3.3. Mounting directories from the host
You can mount volumes to share data between the host machine and the Docker container environment. This is useful for cases where you need to read data to be processed or to output results.
Volumes are mounted using the -v option. The basic syntax is -v <path_on_host>:<path_in_container>. For example, to mount /home/me/cat_pics from your host machine as /cats in the container, you could run the following command:
$ gc-docker -- -ti -v /home/me/cat_pics:/cats graphcore/tensorflow ls -a /cats
. .. mog.jpg
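When the -v option is given a host path that does not exist, Docker typically creates an empty directory there instead of reporting an error, so a mistyped path can go unnoticed. The sketch below validates the host directory before building the argument. The function name volume_arg is hypothetical and is not part of gc-docker.

```shell
#!/bin/sh
# Prints "<absolute_host_dir>:<container_dir>" for use with -v, or fails
# with a message on stderr if the host path is not an existing directory.
volume_arg() {
    host_dir=$1
    container_dir=$2
    if [ ! -d "$host_dir" ]; then
        echo "error: $host_dir is not a directory" >&2
        return 1
    fi
    # Resolve the host directory to an absolute path.
    abs_dir=$(cd "$host_dir" && pwd)
    echo "$abs_dir:$container_dir"
}

# Example with the current directory, which always exists.
volume_arg . /cats

# Possible use with the command from this section:
#   vol=$(volume_arg /home/me/cat_pics /cats) &&
#       gc-docker -- -ti -v "$vol" graphcore/tensorflow ls -a /cats
```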
3.4. Setting environment variables
If you need to set environment variables inside the Docker environment, add -e VAR_NAME="var value" to your Docker options.
For example:
$ gc-docker -- -ti -e POPLAR_LOG_LEVEL=TRACE graphcore/tensorflow:2 python3
4. Running a TensorFlow application on an IPU
To demonstrate the workflow for running a TensorFlow application on IPUs in a Docker development environment, we will use one of the TensorFlow applications from the Graphcore public examples repository. First, get the code:
$ git clone https://github.com/graphcore/examples.git
$ cd examples
A common pattern when working with a Docker-based development environment is to mount the current directory into the container (as described in Mounting directories from the host), then set the working directory inside the container with -w <dir name>. For example, -v "$(pwd):/app" -w /app.
Applying this, you can run the LSTM example with the following command:
$ gc-docker -- -ti -v "$(pwd):/app" -w /app graphcore/tensorflow:1 python3 code_examples/tensorflow/kernel_benchmarks/lstm.py
Many machine learning applications make heavy use of shared memory. To avoid running out of it, it is recommended that you add the following Docker option: --ipc=host.
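As a sketch of where this option goes, the wrapper below assembles the LSTM command from this section with --ipc=host placed among the Docker options after the --. The function name run_in_container and the DRY_RUN variable are hypothetical conveniences, not features of gc-docker. By default the command is only printed, so you can inspect it first.

```shell
#!/bin/sh
# Builds a gc-docker command that mounts the current directory as /app and
# adds --ipc=host. With DRY_RUN=1 (the default) the command is only printed.
run_in_container() {
    # $1 is the image; the remaining arguments are the command to run.
    image=$1
    shift
    set -- gc-docker -- -ti --ipc=host -v "$(pwd):/app" -w /app "$image" "$@"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Printed without shell quoting, so paths containing spaces
        # would need quoting before the line is copied into a shell.
        echo "$*"
    else
        "$@"
    fi
}

run_in_container graphcore/tensorflow:1 python3 code_examples/tensorflow/kernel_benchmarks/lstm.py
```

Set DRY_RUN=0 in the environment to execute the command instead of printing it.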
5. Extending the images
These base images can be used to create new images for more specialised purposes, or to package an application for deployment to platforms such as Kubernetes or Kubeflow.
As an example, here’s a simple Dockerfile that creates a Jupyter notebook environment with TensorFlow and access to IPUs:
FROM graphcore/tensorflow:2
RUN pip3 install notebook
CMD ["jupyter", "notebook", "--allow-root", "--ip=0.0.0.0", "--port=8080"]
You can build and run this with the following commands:
$ docker build -t notebook .
$ gc-docker -- -p 8080:8080 notebook
6. Further reading
You can find documentation for the Graphcore software products on the Developer page of the Graphcore website.