1. Introduction

If you find that you need an operation (op) that is not currently implemented in the framework you are using with the Poplar SDK, you can create a custom op.

This document assumes you have already installed the Poplar SDK, and have some familiarity with the IPU programming model and at least one of the supported frameworks.

2. Creating a custom op in PyTorch

There are two steps to create a custom op in PyTorch for the IPU:

  1. Implement the op in C++ using the PopART API

  2. Make the op available in PopTorch so you can use it in your PyTorch model

2.1. Implementing the custom op

You will need to implement the new op as C++ code by creating subclasses of, at least, the Op and Opx base classes provided by the PopART API.

If you are going to use the custom op for training, then you will also need to define the classes that implement the gradient operation. For details of how to do this, see the “Custom ops” chapter of the PopART User Guide.
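
To give an idea of the overall structure, the sketch below shows a minimal forward op and its Opx. The class names, the operator identifier and the identity-style grow() body are placeholders for illustration, and the exact signatures can vary between SDK versions; refer to the PopART User Guide for the definitive interface.

    #include <memory>
    #include <popart/op.hpp>
    #include <popart/popx/opx.hpp>

    namespace {
    // Hypothetical operator identifier: domain, op type and version.
    const popart::OperatorIdentifier MyCustomOpId = {"custom.ops", "MyCustomOp", 1};
    } // namespace

    // Forward op: describes the operation (inputs, outputs, attributes) to the
    // PopART IR.
    class MyCustomOp : public popart::Op {
    public:
      MyCustomOp(const popart::OperatorIdentifier &opid,
                 const popart::Op::Settings &settings)
          : popart::Op(opid, settings) {}

      std::unique_ptr<popart::Op> clone() const final {
        return std::make_unique<MyCustomOp>(*this);
      }

      // Declare the shape and type of the output from the input.
      void setup() final { outInfo(0) = inInfo(0); }

      float getSubgraphValue() const final { return getLowSubgraphValue(); }
    };

    // Opx: lowers the op to Poplar by adding tensors and programs to the graph.
    class MyCustomOpx : public popart::popx::Opx {
    public:
      MyCustomOpx(popart::Op *op, popart::popx::Devicex *devicex)
          : popart::popx::Opx(op, devicex) {}

      void grow(poplar::program::Sequence &prog) const final {
        // A real op would add Poplar/PopLibs operations here; an identity
        // copy is used as a placeholder.
        setOutTensor(0, cloneNcopy(prog, getInTensor(0)));
      }
    };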

2.2. Making the op available in PyTorch

After you have created the C++ implementation of the custom op, you can load the object file and call the op from your PyTorch program using the poptorch.custom_op class. This is described in the custom ops section of the PyTorch for the IPU User Guide.

3. Creating a custom op in TensorFlow

There are two options for creating a custom op in TensorFlow, each with advantages and disadvantages:

  1. A custom op that executes on the IPU

  2. A custom op that executes on the host CPU

3.1. A custom op that executes on the IPU

The process for creating a custom op that runs on the IPU in TensorFlow is similar to the process for PopART: first write an implementation in C++, using the Poplar graph programming framework, and then call this from your TensorFlow program. Your custom op can use all the features of Poplar and PopLibs, including writing your own vertex code to run on the IPU. For more information about writing Poplar graph programs and codelets, refer to the Poplar and PopLibs User Guide.
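
For example, vertex (codelet) code is written as a C++ class derived from poplar::Vertex. The sketch below is a minimal, hypothetical vertex that scales a vector of floats; the class and field names are illustrative only, and the Poplar and PopLibs User Guide describes how codelets are compiled and added to the graph.

    #include <poplar/Vertex.hpp>

    // Hypothetical codelet: multiplies each element of the input by a scale
    // factor. An instance of this vertex runs on a single tile.
    class ScaleVertex : public poplar::Vertex {
    public:
      poplar::Input<poplar::Vector<float>> in;   // data mapped to this vertex
      poplar::Output<poplar::Vector<float>> out; // result written by this vertex
      float scale;                               // scalar vertex field

      // compute() is executed on the tile; returning true indicates success.
      bool compute() {
        for (unsigned i = 0; i < in.size(); ++i) {
          out[i] = in[i] * scale;
        }
        return true;
      }
    };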

You need to write a function that builds your custom op. The interface for this function is defined by TensorFlow, and it will be called when you instantiate the custom op in your TensorFlow program. If your program is doing training, then you will also need to define a function that calculates gradients for the backward pass. You can also provide extra information about your custom op with an associated “metadata” function.

These functions are described in the TensorFlow user guide.
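
As an indication of the overall shape, the sketch below shows a builder function that receives the Poplar graph and input tensors, appends its output tensor, and returns a Poplar program. The function name, the exact prototype and the trivial copy body are assumptions for illustration only; refer to the TensorFlow user guide for the definitive signatures and export requirements.

    #include <string>
    #include <vector>

    #include <poplar/Graph.hpp>
    #include <poplar/Program.hpp>
    #include <poplar/Tensor.hpp>

    // Hypothetical builder: called to add the custom op to the Poplar graph
    // (check the TensorFlow user guide for the exact prototype).
    extern "C" poplar::program::Program
    Build(poplar::Graph &graph, const std::vector<poplar::Tensor> &inputs,
          std::vector<poplar::Tensor> &outputs, const std::string &attributes,
          const std::string &debug_prefix) {
      poplar::program::Sequence seq;

      // A real op would add Poplar/PopLibs operations (or custom vertices)
      // here; an identity copy of the first input is used as a placeholder.
      poplar::Tensor out = graph.clone(inputs[0], debug_prefix + "/out");
      seq.add(poplar::program::Copy(inputs[0], out));
      outputs.push_back(out);

      return seq;
    }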

Your TensorFlow program can then instantiate the custom op and use it in your model. See Custom operation on the IPU in the TensorFlow documentation.

3.2. A custom op that executes on the host CPU

You can implement a custom op that runs on the host CPU, rather than the IPU. This still needs to be written in C++, but may be easier to implement as you can use development tools and libraries that may be more familiar to you. However, performance will be limited because it will not exploit the massive parallelism of the IPU and because of the overhead of moving data to the host and back.

Nevertheless, this may be a good initial route for prototyping new operations. See Custom host CPU operations for more information.

4. Creating a custom op in PopART

There are two steps to create a custom op in PopART:

  1. Implement the op in C++ using the PopART API

  2. Make the op available in PopART

4.1. Implementing the custom op in PopART

You will need to implement the new op as C++ code by creating subclasses of, at least, the Op and Opx base classes.

If you are going to use the custom op for training, then you will also need to define the classes that implement the gradient operation. For details of how to do this, see the “Custom ops” chapter of the PopART User Guide.
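
As an indication of what the gradient classes involve, the sketch below outlines a gradient op for a forward op like the one sketched in Section 2.1. The class name, the input and output indices and the GradOpInType spellings are assumptions for illustration; the gradient op also needs its own Opx subclass and registration, as described in the PopART User Guide.

    #include <map>
    #include <memory>
    #include <vector>

    #include <popart/op.hpp>

    namespace {
    // Hypothetical identifier for the gradient op.
    const popart::OperatorIdentifier MyCustomGradOpId = {"custom.ops",
                                                         "MyCustomOpGrad", 1};
    } // namespace

    // Gradient op for the (hypothetical) MyCustomOp sketched earlier.
    class MyCustomGradOp : public popart::Op {
    public:
      explicit MyCustomGradOp(const popart::Op &fwdOp)
          : popart::Op(MyCustomGradOpId, fwdOp.settings) {}

      std::unique_ptr<popart::Op> clone() const final {
        return std::make_unique<MyCustomGradOp>(*this);
      }

      // The gradient of the forward input has the same shape and type as that
      // input (available here as input 1).
      void setup() final { outInfo(0) = inInfo(1); }

      // Input 0 is the gradient of the forward op's output; input 1 is the
      // forward op's input (enum spellings may differ between SDK versions).
      const std::vector<popart::GradInOutMapper> &gradInputInfo() const final {
        static const std::vector<popart::GradInOutMapper> inInfo = {
            {0, 0, popart::GradOpInType::GradOut},
            {1, 0, popart::GradOpInType::In}};
        return inInfo;
      }

      // Output 0 of this op is the gradient of input 0 of the forward op.
      const std::map<int, int> &gradOutToNonGradIn() const final {
        static const std::map<int, int> outInfo = {{0, 0}};
        return outInfo;
      }

      float getSubgraphValue() const final { return getLowSubgraphValue(); }
    };

The forward op returns its gradient op (or ops) from its getGradOps() method, typically by returning a vector containing a single MyCustomGradOp constructed from the forward op.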

4.2. Making the op available in PopART

After you have written the classes that implement the op, you will need to make the op available to PopART. This means defining an op identifier and using the op creator class to register the op with PopART. This is described in detail in the PopART User Guide.
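
As an indicative sketch, registering the (hypothetical) MyCustomOp and MyCustomOpx classes from Section 2.1 might look something like the following. The OpDefinition contents and the particular OpCreator overload shown are assumptions for illustration; the PopART User Guide gives the exact arguments.

    #include <memory>

    #include <popart/opmanager.hpp>
    #include <popart/popx/opxmanager.hpp>

    namespace {

    // Hypothetical op identifier; the domain, type and version must match the
    // identifier used when the op is added to the model.
    const popart::OperatorIdentifier MyCustomOpId = {"custom.ops", "MyCustomOp", 1};

    // Tensor types the op accepts (illustrative).
    popart::OpDefinition::DataTypes T = {popart::DataType::FLOAT16,
                                         popart::DataType::FLOAT};

    popart::OpDefinition myCustomOpDef(
        {popart::OpDefinition::Inputs({{"input", T}}),
         popart::OpDefinition::Outputs({{"output", T}}),
         popart::OpDefinition::Attributes({})});

    // Register the Op so PopART can construct it when it meets the identifier.
    popart::OpCreator<MyCustomOp> myCustomOpCreator(
        popart::OpDefinitions({{MyCustomOpId, myCustomOpDef}}),
        [](const popart::OpCreatorInfo &info) {
          return std::make_unique<MyCustomOp>(info.opid, info.settings);
        },
        true);

    // Register the Opx that lowers MyCustomOp to Poplar.
    popart::popx::OpxCreator<MyCustomOpx> myCustomOpxCreator(MyCustomOpId);

    } // namespace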

After that, you can use the op in your PopART code.

5. Creating a custom op in PopXL

Custom ops in PopXL are essentially PopART custom ops “under the hood”, so it is recommended that you share custom op implementations between PopXL and PopART rather than duplicating the work.

There are three steps to create a custom op in PopXL:

  1. Set up the environment

  2. Implement the op in C++

  3. Implement the Python bindings

Full details are given in the Custom operations chapter of the PopXL User Guide and API (experimental).

5.1. Setting up the environment

You will need to set up the development environment to allow for easy compilation of C++ ops and creation of Python bindings. Refer to the Environment section in the Custom operations chapter for details.

5.2. Implementing the custom op in C++

You will need to implement the new op as C++ code by creating subclasses of, at least, the Op and Opx base classes. The detailed steps are:

  1. Creating the Parameter struct to encapsulate the parameters needed by the custom op.

  2. Creating the operation classes (Op and Opx).

  3. Creating the gradient operation classes (Op and Opx) if you are going to use the op for training.

5.3. Making the op available in PopXL

After you have written the classes that implement the op, you will need to make the op available to PopXL. This means:

  1. Creating the Python bindings so the op can be used in Python.

  2. Creating the Python wrapper so the op can be added to an IR.

You can use the op in PopXL by simply importing the Python wrapper function you created.

Note

You don’t have to explicitly compile the code; this is done automatically by the cppimport module.

6. Examples

You can find examples of the use of custom ops in the Graphcore tutorials repository on GitHub.

There are also examples in the documentation for TensorFlow 1 and TensorFlow 2.

Note

From Poplar SDK 3.1, TensorFlow 1 will only be supported on CentOS 7. In addition, examples and tutorials for TensorFlow 1 are only available up to version 3.0 of the SDK. There has been limited testing of the 3.0 versions of the TensorFlow 1 tutorials and examples with Poplar SDK 3.1.

The PopART tutorials are only supported up to Poplar SDK 3.1. They may work with later versions but have had only limited testing.