1. Introduction

The Poplar® SDK is the world’s first complete tool chain specifically designed for creating graph software for machine intelligence applications. Poplar integrates with TensorFlow, PyTorch and Open Neural Network Exchange (ONNX), allowing developers to use their existing machine intelligence development tools and machine learning models.

Poplar enables you to exploit features of the Graphcore Intelligence Processing Unit (IPU), such as parallel execution and efficient floating-point operations. Models written in industry-standard machine learning (ML) frameworks such as TensorFlow and PyTorch are compiled by Poplar to run in parallel on one or more IPUs.

You can import models from other ML frameworks using PopART™, the Poplar advanced runtime, and run them on the IPU. You can also use the Poplar graph library from C++ to create graph programs to run on IPU hardware.


Fig. 1.1 The Poplar SDK software stack

For more information about the IPU and its programming models, refer to the IPU Programmer’s Guide.

2. TensorFlow

The Poplar SDK includes implementations of TensorFlow 1.15 and 2.6 for the IPU. This includes distributed TensorFlow IPU-specific estimators and optimisers.


There are TensorFlow wheel files built and optimised for both AMD and Intel processors. You must install the appropriate version for your system.

The SDK also includes a collection of addons created for TensorFlow for the IPU. These include IPU-specific TensorFlow layers and, for TensorFlow 2, IPU-specific Keras layers and optimisers.

For more information, refer to Targeting the IPU from TensorFlow 1 and Targeting the IPU from TensorFlow 2.

3. PyTorch

The Poplar software stack provides support for running PyTorch training and inference models on the IPU. This requires minimal changes to your existing PyTorch code.

You can create a wrapper for your existing PyTorch model with a single function call. This will create a PopTorch model that runs in parallel on a single IPU.
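As a minimal sketch of that single call, the helper below wraps an existing `torch.nn.Module` for inference using `poptorch.inferenceModel`. The helper name `wrap_for_inference` is hypothetical, and the import is deferred so that the file can still be loaded on a machine where the SDK is not installed. Treat this as an illustration only; the PyTorch for the IPU User Guide is the authoritative description of the API.

```python
def wrap_for_inference(model, options=None):
    """Return a PopTorch wrapper around an existing PyTorch model.

    `model` is assumed to be a torch.nn.Module that has already been
    trained. `options` is an optional poptorch.Options instance.
    """
    # Deferred import: poptorch is only available where the PopTorch
    # wheel from the Poplar SDK has been installed.
    import poptorch

    if options is None:
        return poptorch.inferenceModel(model)
    return poptorch.inferenceModel(model, options=options)
```

On a system with PopTorch installed, `ipu_model = wrap_for_inference(my_model)` followed by `ipu_model(inputs)` would be expected to compile the model on the first call and then run it on the IPU. `poptorch.trainingModel` is the corresponding call for training.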

You can then choose how to split the model across multiple IPUs, to create a pipelined implementation that exploits data parallelism across the IPUs.

See the PyTorch for the IPU User Guide for more information.

4. PopART and ONNX

The Poplar advanced runtime (PopART) enables the efficient execution of both inference and training graphs on the IPU. Its main import format is Open Neural Network Exchange (ONNX), an open format for representing machine learning models.

You can import and execute models created using other industry-standard frameworks, and use differentiation and optimisation to train their parameters. You can also create graphs in PopART directly. Models can be exported in ONNX format, for example after training.

PopART includes Python and C++ APIs.

See the PopART User Guide for more information.

5. The Poplar libraries

The Poplar libraries are a set of C++ libraries consisting of the Poplar graph library and the open-source PopLibs™ libraries.

The Poplar graph library provides direct access to the IPU by code written in C++. You can write complete programs using Poplar, or use it to write functions to be called from your application written in a higher-level framework such as TensorFlow.

Poplar enables you to construct graphs, define tensor data and control how the code and data are mapped onto the IPU for execution. The host code and IPU code are both contained in a single program. Poplar compiles code for the IPU and copies it to the device to be executed. You can also pre-compile the device code for faster startup.

The open-source PopLibs library provides a range of higher-level functions commonly used in machine learning applications. These include highly optimised and parallelised primitives for linear algebra, such as matrix multiplications and convolutions. There are also several functions used in neural networks (for example, non-linearities, pooling and loss functions) and many other operations on tensor data.

The source code for PopLibs is provided so you can use the library as a starting point for implementing your own functions.

For more information, refer to the Poplar and PopLibs User Guide.

Each vertex of a Poplar graph runs a “codelet” that executes code directly on one of the many parallel cores in the IPU. These codelets can be written in C++ or, when more performance is required, assembly.

6. Running programs on the IPU

6.1. Distributed applications

The PopRun tool is provided with the SDK to help you run an application across multiple IPUs. It is a command-line utility that launches distributed applications on Pod systems by creating multiple instances of the application. The instances can be launched on a single host server or across multiple host servers within the same Pod, depending on the number of host servers available on the target Pod.
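To illustrate launching several instances, the sketch below assembles a `poprun` command line in Python. The option names `--num-instances` and `--num-replicas`, the rule that replicas are shared evenly between instances, the helper name `build_poprun_command` and the script name `train.py` are all assumptions made for this example. Check `poprun --help` or the PopDist and PopRun: User Guide for the options supported by your SDK version.

```python
import shlex


def build_poprun_command(program_args, num_instances, num_replicas):
    """Build an argv list that launches `program_args` under poprun.

    Assumption: replicas are shared evenly between instances, so
    num_replicas must be a multiple of num_instances.
    """
    if num_instances < 1 or num_replicas < 1:
        raise ValueError("instance and replica counts must be positive")
    if num_replicas % num_instances != 0:
        raise ValueError("num_replicas must be a multiple of num_instances")
    return [
        "poprun",
        "--num-instances", str(num_instances),
        "--num-replicas", str(num_replicas),
    ] + list(program_args)


# Hypothetical application: two instances sharing four replicas.
command = build_poprun_command(["python3", "train.py"],
                               num_instances=2, num_replicas=4)

# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in command))
```

Building the command as a list keeps each argument separate, so it can be passed directly to `subprocess.run` without shell quoting issues.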

PopRun is implemented with the PopDist API. This provides functions you can use to write a distributed application.

For more information, see the PopDist and PopRun: User Guide.

6.2. Triton inference server

The Poplar SDK includes a backend for the Triton inference server to support IPU systems. This enables inference models that are written using PopART or TensorFlow for the IPU, and trained on the IPU, to be served for execution on Pod systems.

For more information, see the Poplar Triton Backend: User Guide.

6.3. TensorFlow Serving

The Poplar SDK includes the TensorFlow Serving 1 and TensorFlow Serving 2 applications for serving inference models. The applications support models exported from TensorFlow 1 and TensorFlow 2 into the IPU-optimized SavedModel format using the TensorFlow 1 export API and TensorFlow 2 export API respectively.

For more information, refer to the IPU TensorFlow Serving 1 User Guide and the IPU TensorFlow Serving 2 User Guide.

7. PopVision™ analysis tools

The PopVision analysis tools help you understand how applications are performing and utilising the IPU.

For more information, see the PopVision Graph Analyser and System Analyser User Guides.

The PopVision Graph Analyser is an analysis tool that helps you gain a deep understanding of how your application is performing and utilising the IPU. By integrating directly with the internal profiling support of the Poplar graph engine and compiler, it enables you to profile and optimise the performance and memory usage of your machine learning models, whether developed in TensorFlow, PyTorch or natively using Poplar and PopLibs.


Fig. 7.1 The PopVision Graph Analyser

The Graph Analyser provides a graphical display of information about your program, including:

  • Summary report: Essential program information.

  • Memory report: Analyse program memory consumption and layout on one or multiple IPUs.

  • Liveness report: Explore temporary peaks in memory and their impact.

  • Execution trace report: View program execution.

For more information, see the PopVision Graph Analyser blog post.

The PopVision System Analyser allows you to identify bottlenecks on the host CPU by showing the profiling information collected by the PopVision Trace Instrumentation library for Poplar, frameworks and the user application.


Fig. 7.2 The PopVision System Analyser

It provides information about the behaviour of the host-side application code. It shows an interactive graphical view of the timeline of execution steps, helping you to identify any bottlenecks between the CPUs and IPUs.

For more information, see the PopVision Analysis Tools blog post.

The PopVision analysis library (libpva) allows programmatic analysis of the IPU profiling information used by the Graph Analyser. The library provides both C++ and Python APIs that can be used to query the Poplar profiling information for your application.

The PopVision trace instrumentation library (libpvti) provides functions to manage the capture of profiling information for the host code of your IPU application. This data can then be explored with the PopVision System Analyser. The library provides C++ and Python APIs.

8. Contents of the SDK

The Poplar SDK can be downloaded from the Graphcore software download portal. The PopVision analysis tools can be downloaded from the PopVision tools web page. Full documentation is available on the Graphcore documentation portal. Further information and resources can be found on the Graphcore developer site.

You will need a support account to access the software download portal.

See the Getting Started Guide for your IPU system for full installation instructions.

The installed SDK contains the following components:

  • Command line tools for managing the IPU hardware. See the IPU Command Line Tools document for details.

  • The Poplar, PopLibs, PopDist and PopVision libraries.

  • PopART with support for ONNX graphs.

  • PopTorch wheel file for running PyTorch code on the IPU.

  • Python wheel files for installing versions of TensorFlow 1 and 2 that target the IPU.

  • A Horovod wheel file to support distributed training in PopART.

  • The PopRun command line tool for running distributed applications across multiple IPUs and hosts.

  • Documentation.

Note that, on some cloud-based systems, each user sees a virtual machine and so you may need to install all the necessary software. See the Getting Started Guide for information on checking what software is installed.

To use the Poplar software you will need suitable hardware, such as a Bow Pod, or access to a cloud-based service that supports IPUs, such as Graphcloud.

Supported operating systems:

  • Ubuntu 18.04

  • Ubuntu 20.04

  • CentOS 7.6

  • Debian 10.7

The tools require Python 3.6 to be installed. Other packages required by TensorFlow or PopTorch will be automatically installed when you install the wheel file. We recommend running TensorFlow and PopTorch in a Python virtual environment.
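The recommended virtual-environment setup can be scripted. The sketch below only builds the two commands (create the environment, then install a wheel into it) so that they can be inspected before being run. The helper name `venv_setup_commands`, the directory name and the wheel path are placeholders rather than names from the SDK, and a Linux-style `bin/` layout inside the environment is assumed.

```python
import sys
from pathlib import PurePosixPath


def venv_setup_commands(env_dir, wheel_path, python=sys.executable):
    """Return the commands that create a virtual environment and install a wheel.

    Assumes the Linux layout, where the environment's interpreter is
    found at <env_dir>/bin/python.
    """
    env_python = PurePosixPath(env_dir) / "bin" / "python"
    return [
        # Step 1: create the virtual environment with the chosen interpreter.
        [python, "-m", "venv", str(env_dir)],
        # Step 2: install the wheel using the environment's own pip, so
        # that its dependencies are installed inside the environment.
        [str(env_python), "-m", "pip", "install", str(wheel_path)],
    ]


# Placeholder names: substitute the wheel file shipped with your SDK.
for cmd in venv_setup_commands("poplar_env", "path/to/framework.whl"):
    print(" ".join(cmd))
```

To execute the steps rather than print them, each command list could be passed to `subprocess.run(cmd, check=True)`.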

The Intel build of TensorFlow requires a processor that supports the AVX-512 instruction set extensions (Skylake, or later).

The AMD build of TensorFlow requires a Ryzen class processor.
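One way to decide which TensorFlow build matches a host is to inspect the vendor and CPU flags that Linux reports in `/proc/cpuinfo`. The sketch below assumes that the `avx512f` (AVX-512 Foundation) flag is an adequate indicator of AVX-512 support. It does not verify that an AMD processor is Ryzen class, so treat the result as a hint rather than a guarantee. The helper names and the abbreviated sample text are hypothetical.

```python
def parse_cpuinfo(text):
    """Return (vendor_id, flags) for the first processor listed in cpuinfo text."""
    vendor, flags = None, set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines have no "key: value" pair
        key = key.strip()
        if key == "vendor_id" and vendor is None:
            vendor = value.strip()
        elif key == "flags" and not flags:
            flags = set(value.split())
    return vendor, flags


def suggest_tensorflow_build(vendor, flags):
    """Suggest "intel", "amd" or None (no suggestion) for a host CPU."""
    if vendor == "GenuineIntel" and "avx512f" in flags:
        return "intel"
    if vendor == "AuthenticAMD":
        return "amd"  # assumption: Ryzen class is not checked here
    return None


# Abbreviated sample in the style of /proc/cpuinfo.
SAMPLE = """processor : 0
vendor_id : GenuineIntel
flags     : fpu sse2 avx avx2 avx512f avx512dq
"""

print(suggest_tensorflow_build(*parse_cpuinfo(SAMPLE)))
```

On a real host, the contents of `/proc/cpuinfo` would be read and passed to `parse_cpuinfo` in place of the sample text.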

The SDK also includes a software model of the IPU so it is possible to develop and test your code even when IPU hardware is not available. This has limited functionality but can be useful for unit testing, for example.

9. Docker containers

You can pull pre-configured Docker containers from Docker Hub. These provide Poplar SDK images ready for deployment. The following Docker containers are available:

  • Tools: contains the necessary tools to interact with IPU devices

  • Poplar: contains Poplar, PopART and tools to interact with IPU devices

  • TensorFlow 1: contains everything in Poplar, with TensorFlow 1 pre-installed

  • TensorFlow 2: contains everything in Poplar, with TensorFlow 2 pre-installed

  • PyTorch: contains everything in Poplar, with PyTorch pre-installed

There are Intel and AMD variants of the TensorFlow containers. You must use the one that matches your hardware.

See Using IPUs from Docker for more information.

10. Support

Support is available from the Graphcore customer engineering team via the Graphcore support portal.

Graphcore also has GitHub repositories with further examples. For more information, see the Examples page.

You can use the tags “ipu”, “poplar” and “popart” when asking questions or looking for answers on Stack Overflow.