4. Model compilation
As described in Section 2.3, Using the IPU Inference Toolkit, working with the IPU Inference Toolkit is divided into two phases: model compilation and model runtime. This chapter uses ONNX, TensorFlow and PyTorch models as examples to describe how to convert and compile models exported from these frameworks into binary PopEF files using the PopRT tool.
Note
The examples in this chapter require a Python virtual environment. Make sure you have installed python and virtualenv. The commands in the examples are relatively long; when copying from a PDF document, ensure that the commands are not truncated by line breaks or page breaks.
Note
The examples in this section will use the container started in Section 3.6.1, Installation with a Docker container.
4.1. ONNX model
This section describes how to convert and compile an ONNX model to generate a PopEF model. It uses the BERT-Squad model from the ONNX model zoo as an example.
Create the directory and Python virtual environment required for this example.
$ mkdir -p onnx_bert && \
cd onnx_bert && \
virtualenv -p python3 venv && \
source venv/bin/activate && \
pip install protobuf==3.19 onnx==1.11 && \
deactivate
4.1.1. Model exporting
Download the bertsquad-12.onnx model from the ONNX model zoo using the wget command.
$ wget https://github.com/onnx/models/raw/87d452a218093f6a60ceb62712ffe1186dce6d64/text/machine_comprehension/bert-squad/model/bertsquad-12.onnx
4.1.2. Batch size selection
The batch size needs to be determined in the compilation phase. In PopRT, the batch size is specified with the --input_shape parameter. To set a value for --input_shape, you need to know the names of the input tensors, which can be obtained using the ONNX Python API.
$ source venv/bin/activate
$ cat > list_input.py << EOF
import onnx
model = onnx.load('bertsquad-12.onnx')
graph = model.graph
print(graph.input)
EOF
$ python list_input.py
$ deactivate
Running the script returns the following:
[name: "unique_ids_raw_output___9:0"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "unk__492"
}
}
}
}
, name: "segment_ids:0"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "unk__493"
}
dim {
dim_value: 256
}
}
}
}
, name: "input_mask:0"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "unk__494"
}
dim {
dim_value: 256
}
}
}
}
, name: "input_ids:0"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "unk__495"
}
dim {
dim_value: 256
}
}
}
}
]
The input names and dimensions can be read from this output. In this case, we use a batch size of 16, and the corresponding --input_shape parameter settings are as follows:
--input_shape input_ids:0=16,256 input_mask:0=16,256 segment_ids:0=16,256 unique_ids_raw_output___9:0=16
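If you prefer not to read the shapes off the printed protobuf by hand, a small script along the following lines can assemble the --input_shape argument for a chosen batch size directly from the model. This is an optional sketch: make_input_shape.py and BATCH_SIZE are illustrative names that are not part of the toolkit, and the script simply substitutes the batch size for every symbolic (dim_param) dimension.
$ source venv/bin/activate
$ cat > make_input_shape.py << EOF
# Sketch: build the --input_shape argument from the model's declared inputs.
import onnx
BATCH_SIZE = 16
model = onnx.load('bertsquad-12.onnx')
parts = []
for inp in model.graph.input:
    dims = []
    for d in inp.type.tensor_type.shape.dim:
        # Replace each symbolic dimension (dim_param) with the chosen batch size;
        # keep fixed dimensions (dim_value) as they are.
        dims.append(str(BATCH_SIZE) if d.dim_param else str(d.dim_value))
    parts.append(inp.name + '=' + ','.join(dims))
print(' '.join(parts))
EOF
$ python make_input_shape.py
$ deactivate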
4.1.3. Precision selection
The bertsquad-12.onnx model in the ONNX model zoo uses float32 precision, while float16 is recommended on the IPU. In the PopRT tool, the desired precision is set with the --precision parameter. For more information about the parameters, refer to the PopRT User Guide.
--precision fp16
4.1.4. Model conversion and compilation
Use the following command to convert the bertsquad-12.onnx model to an ONNX model that can be run in PopART, with a precision of float16 and a batch size of 16, and save it as bertsquad-12_fp16_bs_16.onnx.
If you specify the --export_popef parameter, the converted model is also compiled into a PopEF binary file, executable.popef, as part of this conversion. If --export_popef is not specified, only the model conversion is performed. For more information about the model conversion parameters, refer to the PopRT documentation.
$ gc-docker -- --rm \
-v `pwd -P`:/model_conversion \
-w /model_conversion \
graphcorecn/poprt-staging:latest \
--input_model bertsquad-12.onnx \
--output_model bertsquad-12_fp16_bs_16.onnx \
--input_shape input_ids:0=16,256 input_mask:0=16,256 segment_ids:0=16,256 unique_ids_raw_output___9:0=16 \
--precision fp16 \
--convert_version 11 \
--export_popef \
--ipu_version ipu21
Note
If testing on an IPU-M2000 or Bow-2000, use --ipu_version ipu2 instead.
If the model conversion is successful, the following files will appear in the directory:
$ tree . -L 1
.
├── bertsquad-12.onnx
├── bertsquad-12_fp16_bs_16.onnx
├── executable.popef
├── list_input.py
└── venv
where:
bertsquad-12.onnx is the original model.
bertsquad-12_fp16_bs_16.onnx is the model converted by PopRT.
executable.popef is the PopEF file generated from compilation.
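As an optional sanity check, you can load the converted model with the onnx Python API and confirm that every input now has the fixed batch dimension of 16 chosen above. This is only a sketch: check_converted.py is an illustrative file name, and the exact list of inputs printed may differ slightly depending on the PopRT version.
$ source venv/bin/activate
$ cat > check_converted.py << EOF
# Sketch: print the input names and (now fixed) shapes of the converted model.
import onnx
model = onnx.load('bertsquad-12_fp16_bs_16.onnx')
for inp in model.graph.input:
    dims = [d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
EOF
$ python check_converted.py
$ deactivate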
Note
This example is complete. Use cd .. to go back to the test root directory.
4.2. TensorFlow model
This section describes how to convert and compile a TensorFlow SavedModel model to generate a PopEF file. It uses the ResNet_v2_50 model from TensorFlow Hub as an example.
Create the directory and Python virtual environment required for this example.
$ mkdir -p tensorflow_resnet
$ cd tensorflow_resnet
$ virtualenv -p python3 venv
$ source venv/bin/activate
$ pip install protobuf==3.19 tensorflow==2.6 onnx==1.11 tf2onnx==1.12.1 packaging
$ deactivate
4.2.1. Model exporting
Download the ResNet_v2_50 model from TensorFlow Hub and unzip it into the specified directory:
$ wget -O resnet_v2_50.tar.gz https://tfhub.dev/google/imagenet/resnet_v2_50/classification/5?tf-hub-format=compressed
$ mkdir resnet_v2_50
$ tar -xf resnet_v2_50.tar.gz -C resnet_v2_50
As mentioned in Section 2.3, Using the IPU Inference Toolkit, the IPU Inference Toolkit uses an ONNX model as the unified input model. To convert the ResNet_v2_50 SavedModel to ONNX, use the tf2onnx tool.
Execute the following command to convert a SavedModel model to ONNX:
$ source venv/bin/activate
$ python -m tf2onnx.convert \
--saved-model resnet_v2_50 \
--output resnet_v2_50.onnx \
--inputs-as-nchw inputs \
--outputs-as-nchw logits \
--opset 11
$ deactivate
Since the tensor format supported by PopART is NCHW, you need to use --inputs-as-nchw inputs and --outputs-as-nchw logits to convert the input and output tensors from the NHWC format to the NCHW format. The parameters inputs and logits are the names of the input and output tensors respectively, which can be read from the model using the standard TensorFlow tool saved_model_cli:
$ source venv/bin/activate
$ python -m tensorflow.python.tools.saved_model_cli \
show \
--dir resnet_v2_50/ \
--all
$ deactivate
The outputs include the following:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, -1, 3)
name: serving_default_inputs:0
The given SavedModel SignatureDef contains the following output(s):
outputs['logits'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1001)
name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict
where inputs['inputs'] tensor_info shows that the name of the input tensor is inputs. Similarly, outputs['logits'] tensor_info shows that the name of the output tensor is logits.
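If you prefer to inspect the signature from Python rather than with saved_model_cli, a short script along the following lines gives the same information. This is a sketch only; show_signature.py is an illustrative name and the script is not required for the rest of the example.
$ source venv/bin/activate
$ cat > show_signature.py << EOF
# Sketch: print the serving signature of the SavedModel from Python.
import tensorflow as tf
loaded = tf.saved_model.load('resnet_v2_50')
sig = loaded.signatures['serving_default']
print(sig.structured_input_signature)
print(sig.structured_outputs)
EOF
$ python show_signature.py
$ deactivate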
4.2.2. Model conversion and compilation
This section only describes using PopRT to compile a float16 model. For other optimisation parameters refer to Section 4.1, ONNX model.
Since the IPU uses static compilation, the input shape must be fixed at compile time. For this model, we use an input shape of (4, 3, 224, 224) for the compilation. The input model is the ONNX model resnet_v2_50.onnx obtained in Section 4.2.1, Model exporting.
$ gc-docker -- --rm \
-v `pwd -P`:/model_conversion \
-w /model_conversion \
graphcorecn/poprt-staging:latest \
--input_model resnet_v2_50.onnx \
--precision fp16 \
--output_model resnet_v2_50_optimized.onnx \
--input_shape inputs=4,3,224,224 \
--convert_version 11 \
--export_popef \
--ipu_version ipu21
Note
If testing on an IPU-M2000 or Bow-2000, use --ipu_version ipu2 instead.
After the model conversion completes successfully, the optimised ONNX file resnet_v2_50_optimized.onnx from conversion and the PopEF file executable.popef from compilation are generated in the directory:
$ tree . -L 1
.
├── executable.popef
├── resnet_v2_50
├── resnet_v2_50.onnx
├── resnet_v2_50.tar.gz
├── resnet_v2_50_optimized.onnx
└── venv
Note
This example is complete. Use cd .. to go back to the test root directory.
4.3. PyTorch model
This section describes how to convert and compile a PyTorch model to generate a PopEF file. It uses the resnet50 model from torchvision as an example.
Create the directory and Python virtual environment required for this example.
$ mkdir -p torch_resnet
$ cd torch_resnet
$ virtualenv -p python3 venv
$ source venv/bin/activate
$ pip install torchvision==0.11.2
$ deactivate
4.3.1. Model exporting
First, load the resnet50 model from torchvision, set the batch size and input shape, and export the model as an ONNX file.
$ source venv/bin/activate
$ cat > export.py << EOF
import torch.onnx
import torchvision.models as models
model = models.resnet50(pretrained=True)
BATCH_SIZE = 4
dummy_input = torch.randn(BATCH_SIZE, 3, 244, 244)
model.eval()
# save it to ONNX
torch.onnx.export(model, dummy_input, 'resnet50.onnx', opset_version=11)
print('The model saved into resnet50.onnx')
EOF
$ python export.py
$ deactivate
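Before moving on, you can optionally validate the exported file and print the input shape that was baked in at export time. This sketch assumes the onnx package is also installed in this virtual environment (it is not installed by the pip command above); check_export.py is an illustrative name.
$ source venv/bin/activate
$ pip install onnx==1.11
$ cat > check_export.py << EOF
# Sketch: run the ONNX checker and print the input name and shape of the exported model.
import onnx
model = onnx.load('resnet50.onnx')
onnx.checker.check_model(model)
for inp in model.graph.input:
    dims = [d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
EOF
$ python check_export.py
$ deactivate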
4.3.2. Model conversion and compilation
The exported ONNX model (resnet50.onnx) is converted and compiled with the following command. For an explanation of the parameters in the command, refer to Section 4.1, ONNX model.
$ gc-docker -- --rm \
-v `pwd -P`:/model_conversion \
-w /model_conversion \
graphcorecn/poprt-staging:latest \
--input_model resnet50.onnx \
--precision fp16 \
--export_popef \
--ipu_version ipu21
Note
If testing on an IPU-M2000 or Bow-2000, use --ipu_version ipu2 instead.
If the model conversion is successful, the following files will appear in the directory:
$ tree . -L 1
.
├── executable.popef
├── export.py
├── resnet50.onnx
├── resnet50.onnx.optimized.onnx
└── venv
where:
resnet50.onnx is the original model.
resnet50.onnx.optimized.onnx is the model converted by PopRT.
executable.popef is the PopEF file generated from compilation.
Note
This example is complete. Use cd .. to go back to the test root directory.