The Poplar® SDK is the world’s first complete tool chain specifically designed for creating graph software for machine intelligence applications. Poplar seamlessly integrates with TensorFlow and Open Neural Network Exchange (ONNX), allowing developers to use their existing machine intelligence development tools and existing machine learning models.

Poplar enables you to exploit features of the Graphcore Intelligence Processing Unit (IPU), such as parallel execution and efficient floating-point operations. Models written in industry-standard machine learning (ML) frameworks are compiled by Poplar to run in parallel on one or more IPUs. You can import models from other ML frameworks using PopART™, the Poplar advanced runtime, and run them on the IPU. You can also use the Poplar graph library from C++ to create graph programs to run on IPU hardware.

_images/software-stack.jpg

Fig. 1 The Poplar SDK software stack

For more information about the IPU and its programming models, please refer to the IPU Programmer’s Guide.

1. TensorFlow

The Poplar SDK includes implementations of TensorFlow 1.15 and 2.1 for the IPU. This includes support for distributed TensorFlow, as well as IPU-specific estimators and optimisers.

For more information, refer to Targeting the IPU from TensorFlow.

2. PyTorch

The Poplar software stack provides support for running PyTorch training and inference models on the IPU. This requires minimal changes to your existing PyTorch code.

You can create a wrapper for your existing PyTorch model with a single function call. This will create a PopTorch model that runs in parallel on a single IPU.

You can then choose how to split the model across multiple IPUs, to create a pipelined implementation that exploits data parallelism across the IPUs.

3. PopART and ONNX

The Poplar advanced runtime (PopART) enables the efficient execution of both inference and training graphs on the IPU. Its main import format is Open Neural Network Exchange (ONNX), an open format for representing machine learning models.

You can import and execute models created in other industry-standard frameworks, and use PopART’s differentiation and optimisation support to train their parameters. You can also create graphs directly in PopART. Models can be exported in ONNX format, for example after training.

PopART includes Python and C++ APIs.

See the PopART User Guide for more information.

4. The Poplar libraries

The Poplar libraries are a set of C++ libraries consisting of the Poplar graph library and the open-source PopLibs™ libraries.

The Poplar graph library provides direct access to the IPU from code written in C++. You can write complete programs using Poplar, or use it to write functions to be called from your application written in a higher-level framework such as TensorFlow.

Poplar enables you to construct graphs, define tensor data and control how the code and data are mapped onto the IPU for execution. The host code and IPU code are both contained in a single program. Poplar compiles code for the IPU and copies it to the device to be executed. You can also pre-compile the device code for faster startup.

The open-source PopLibs library provides a range of higher-level functions commonly used in machine learning applications. This includes highly optimised and parallelised primitives for linear algebra, such as matrix multiplications and convolutions. There are also several functions used in neural networks (for example, non-linearities, pooling and loss functions) and many other operations on tensor data.

The source code for PopLibs is provided so you can use the library as a starting point for implementing your own functions.

For more information, please refer to the Poplar and PopLibs User Guide.

Each vertex of a Poplar graph runs a “codelet” that executes code directly on one of the many parallel cores in the IPU. These codelets can be written in C++ or, when more performance is required, assembly.

Writing assembly for the IPU is described in the Vertex Assembly Programming Guide.

5. PopVision™ Graph Analyser

The PopVision Graph Analyser is an analysis tool that helps you gain a deep understanding of how your application is performing and utilising the IPU.

The graphical user interface of the Graph Analyser rapidly gives you a deeper understanding of your code’s inner workings than is provided by other platforms. By integrating directly with the internal profiling support of the Poplar graph engine and compiler, it enables you to profile and optimise the performance and memory usage of your machine learning models, whether developed in TensorFlow, PopART or natively using Poplar and PopLibs.

_images/graph-analyser.png

Fig. 5.1 The PopVision Graph Analyser

The program provides a graphical display of information about your program, including:

  • Summary report: Essential program information.

    The summary shows detailed information about the program that has been compiled, including the type of processor used, graph size, host information and program command line.

  • Memory report: Analyse program memory consumption and layout on one or multiple IPUs.

    The interactive graph displays how memory has been allocated to each of the tiles and what each memory location within a tile has been used for (for example, a convolution). Selecting a single tile from the graph will provide detailed information on the memory usage of that tile. Memory allocation is colour-coded by variable type (code, weights, activations and other tensors).

  • Liveness report: Explore temporary peaks in memory and their impact.

    Liveness describes the use of memory by variables through the lifetime of the program. The liveness report’s interactive graph shows how memory is used by variables as the program runs, and is a valuable tool for optimising memory use.

  • Execution trace report: View program execution.

    This report shows how many cycles it takes each program step to execute. The report can be viewed either as a series of compute steps or as a “flame graph” in which program steps that are part of the same operation or layer are grouped together. When a program step or layer is selected, the interactive graph displays how many cycles are in each compute or exchange phase as well as details about each program step.

The Graph Analyser can be used with any of the programming models described above. It is available for Ubuntu, Windows and macOS. Full documentation is available as context-sensitive help.

You can read a more detailed overview of the Graph Analyser features in our blog post.

6. Installing the Poplar SDK

6.1. Requirements

To use the Poplar software you will need suitable hardware, such as the Dell EMC DSS8440 IPU Server, or access to a cloud-based service that supports IPUs, such as Microsoft’s Azure.

The Poplar SDK is available for Ubuntu 18.04 or CentOS 7.6.

The tools require Python 3.6 to be installed. Other packages required by TensorFlow will be automatically installed when you install TensorFlow. We recommend running TensorFlow in a Python virtual environment.
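Setting up such a virtual environment might look like the following. The environment name is illustrative, and the wheel filename (commented out) varies with each SDK release, so use the one from your SDK download:

```shell
# Create and activate an isolated Python environment (name is illustrative)
python3 -m venv ipu_tf_venv
. ipu_tf_venv/bin/activate

python -m pip install --upgrade pip

# Install the IPU TensorFlow wheel shipped in the SDK package
# (illustrative filename; use the file from your SDK download):
# pip install tensorflow-1.15.0-cp36-cp36m-linux_x86_64.whl
```

Keeping the IPU TensorFlow wheel in its own environment avoids conflicts with any stock TensorFlow installation on the same machine.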

TensorFlow requires a processor that supports the AVX-512 instruction set extensions (Skylake or later).

The SDK also includes a software model of the IPU, so it is possible to develop and test your code even when IPU hardware is not available.

6.2. Downloads

The Poplar SDK and the PopVision analysis tools can be downloaded from the Graphcore software download portal. Full documentation is included in the package and on the Graphcore developer site.

You will need a support account to access the software download portal.

Note that, on some cloud-based systems, each user sees a “clean” virtual machine and so you may need to install all the necessary software, including the IPU device drivers. See the Getting Started guide for your IPU system for information on checking what software is installed and the full installation instructions.

You can also download pre-configured Docker containers. These provide Poplar SDK images ready for deployment. The following Docker containers are available:

  • Tools: contains the necessary tools to interact with IPU devices.

  • Poplar: contains Poplar, PopART and tools to interact with IPU devices.

  • TensorFlow 1: contains everything in Poplar, with TensorFlow 1 pre-installed.

  • TensorFlow 2: contains everything in Poplar, with TensorFlow 2 pre-installed.

6.3. Contents of the SDK

The installed SDK contains the following components:

  • Device drivers and command line tools for managing the IPU hardware. See the IPU Command Line Tools document for details.

  • The Poplar and PopLibs libraries.

  • PopART with support for ONNX graphs.

  • Python wheel files for installing TensorFlow 1 and 2.

  • Documentation.

7. Support

Support is available from the Graphcore customer engineering team via the Graphcore support portal.

Graphcore maintains a GitHub repository at https://github.com/graphcore/examples, which contains:

  • TensorFlow and PopART versions of commonly used machine learning models, including CNNs such as ResNet and ResNeXt, LSTMs, and sequence modelling for both training and inference

  • Tutorials

  • Example programs

You can use the tags “ipu”, “poplar” and “popart” when asking questions or looking for answers on Stack Overflow.