4. Model compilation
As described in Section 2.3, Using the IPU Inference Toolkit, working with the IPU Inference Toolkit is divided into two phases: model compilation and model runtime. This chapter uses ONNX, TensorFlow and PyTorch models as examples to describe how to convert and compile models exported from these frameworks into binary PopEF files using the PopRT tool.
Note
The examples in this chapter require a Python virtual environment. Make sure you have installed python and virtualenv. The commands in the examples are relatively long; when copying from a PDF document, ensure that the commands are not truncated by line breaks or page breaks.
Note
The examples in this section will use the container started in Section 3.6.1, Installation with a Docker container.
4.1. ONNX model
This section describes how to convert and compile an ONNX model to generate a PopEF model. It uses the BERT-Squad model from the ONNX model zoo as an example.
Create the directory and Python virtual environment required for this example.
$ mkdir -p onnx_bert && \
cd onnx_bert && \
virtualenv -p python3 venv && \
source venv/bin/activate && \
pip install protobuf==3.19 onnx==1.11 && \
deactivate
4.1.1. Model exporting
Download the bertsquad-12.onnx model from the ONNX model zoo using the wget command.
$ wget https://github.com/onnx/models/raw/87d452a218093f6a60ceb62712ffe1186dce6d64/text/machine_comprehension/bert-squad/model/bertsquad-12.onnx
4.1.2. Batch size selection
The batch size needs to be determined in the compilation phase. In PopRT, the batch size is specified with the --input_shape parameter. To set a value for --input_shape, you need to know the names of the input tensors, which can be obtained using the ONNX Python API.
$ source venv/bin/activate
$ cat > list_input.py << EOF
import onnx
model = onnx.load('bertsquad-12.onnx')
graph = model.graph
print(graph.input)
EOF
$ python list_input.py
$ deactivate
Running the script returns the following:
[name: "unique_ids_raw_output___9:0"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "unk__492"
}
}
}
}
, name: "segment_ids:0"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "unk__493"
}
dim {
dim_value: 256
}
}
}
}
, name: "input_mask:0"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "unk__494"
}
dim {
dim_value: 256
}
}
}
}
, name: "input_ids:0"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "unk__495"
}
dim {
dim_value: 256
}
}
}
}
]
The input names and dimensions can be read from this output. In this case, we use a batch size of 16, and the corresponding --input_shape parameter settings are as follows:
--input_shape input_ids:0=16,256 input_mask:0=16,256 segment_ids:0=16,256 unique_ids_raw_output___9:0=16
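If you prefer not to read the shapes off the printed protobuf by hand, a small script along the following lines can assemble the --input_shape argument for a chosen batch size directly from the model. This is an optional sketch: make_input_shape.py and BATCH_SIZE are illustrative names that are not part of the toolkit, and the script simply substitutes the batch size for every symbolic (dim_param) dimension.
$ source venv/bin/activate
$ cat > make_input_shape.py << EOF
# Sketch: build the --input_shape argument from the model's declared inputs.
import onnx
BATCH_SIZE = 16
model = onnx.load('bertsquad-12.onnx')
parts = []
for inp in model.graph.input:
    dims = []
    for d in inp.type.tensor_type.shape.dim:
        # Replace each symbolic dimension (dim_param) with the chosen batch size;
        # keep fixed dimensions (dim_value) as they are.
        dims.append(str(BATCH_SIZE) if d.dim_param else str(d.dim_value))
    parts.append(inp.name + '=' + ','.join(dims))
print(' '.join(parts))
EOF
$ python make_input_shape.py
$ deactivate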
4.1.3. Precision selection
The bertsquad-12.onnx model in the ONNX model zoo uses float32 precision, while float16 is recommended on the IPU. In the PopRT tool, the desired precision is set with the --precision parameter. For more information about the parameters, refer to the PopRT User Guide.
--precision fp16
4.1.4. Model conversion and compilation
Use the following command to convert the bertsquad-12.onnx model to an ONNX model that can be run in PopART, with a precision of float16 and a batch size of 16, and save it as bertsquad-12_fp16_bs_16.onnx.
If you specify the --export_popef parameter, the converted model is also compiled into a PopEF binary file, executable.popef, as part of this conversion. If --export_popef is not specified, only the model conversion is performed. For more information about the model conversion parameters, refer to the PopRT documentation.
$ gc-docker -- --rm \
-v `pwd -P`:/model_conversion \
-w /model_conversion \
graphcorecn/poprt-staging:latest \
--input_model bertsquad-12.onnx \
--output_model bertsquad-12_fp16_bs_16.onnx \
--input_shape input_ids:0=16,256 input_mask:0=16,256 segment_ids:0=16,256 unique_ids_raw_output___9:0=16 \
--precision fp16 \
--convert_version 11 \
--export_popef \
--ipu_version ipu21
Note
If testing on an IPU-M2000 or Bow-2000, use --ipu_version ipu2 instead.
If the model conversion is successful, the following files will appear in the directory:
$ tree . -L 1
.
├── bertsquad-12.onnx
├── bertsquad-12_fp16_bs_16.onnx
├── executable.popef
├── list_input.py
└── venv
where:
bertsquad-12.onnx is the original model.
bertsquad-12_fp16_bs_16.onnx is the model converted by PopRT.
executable.popef is the PopEF file generated from compilation.
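As an optional sanity check, you can load the converted model with the onnx Python API and confirm that every input now has the fixed batch dimension of 16 chosen above. This is only a sketch: check_converted.py is an illustrative file name, and the exact list of inputs printed may differ slightly depending on the PopRT version.
$ source venv/bin/activate
$ cat > check_converted.py << EOF
# Sketch: print the input names and (now fixed) shapes of the converted model.
import onnx
model = onnx.load('bertsquad-12_fp16_bs_16.onnx')
for inp in model.graph.input:
    dims = [d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
EOF
$ python check_converted.py
$ deactivate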
Note
This example is complete. Use cd .. to go back to the test root directory.
4.2. TensorFlow model
This section describes how to convert and compile a TensorFlow SavedModel model to generate a PopEF file. It uses the ResNet_v2_50 model from TensorFlow Hub as an example.
Create the directory and Python virtual environment required for this example.
$ mkdir -p tensorflow_resnet
$ cd tensorflow_resnet
$ virtualenv -p python3 venv
$ source venv/bin/activate
$ pip install protobuf==3.19 tensorflow==2.6 onnx==1.11 tf2onnx==1.12.1 packaging
$ deactivate
4.2.1. Model exporting
Download the ResNet_v2_50 model from TensorFlow Hub and unzip it into the specified directory:
$ wget -O resnet_v2_50.tar.gz https://tfhub.dev/google/imagenet/resnet_v2_50/classification/5?tf-hub-format=compressed
$ mkdir resnet_v2_50
$ tar -xf resnet_v2_50.tar.gz -C resnet_v2_50
As mentioned in Section 2.3, Using the IPU Inference Toolkit, the IPU Inference Toolkit uses an ONNX model as the unified input model. To convert the ResNet_v2_50 SavedModel to ONNX, use the tf2onnx tool.
Execute the following command to convert a SavedModel model to ONNX:
$ source venv/bin/activate
$ python -m tf2onnx.convert \
--saved-model resnet_v2_50 \
--output resnet_v2_50.onnx \
--inputs-as-nchw inputs \
--outputs-as-nchw logits \
--opset 11
$ deactivate
Since the tensor format supported by PopART is NCHW, you need to use --inputs-as-nchw inputs and --outputs-as-nchw logits to convert the input and output tensors from the NHWC format to the NCHW format. The parameters inputs and logits are the names of the input and output tensors respectively, which can be read from the model using the standard TensorFlow tool saved_model_cli:
$ source venv/bin/activate
$ python -m tensorflow.python.tools.saved_model_cli \
show \
--dir resnet_v2_50/ \
--all
$ deactivate
The outputs include the following:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, -1, 3)
name: serving_default_inputs:0
The given SavedModel SignatureDef contains the following output(s):
outputs['logits'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1001)
name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict
where inputs['inputs'] tensor_info shows that the name of the input tensor is inputs. Similarly, outputs['logits'] tensor_info shows that the name of the output tensor is logits.
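If you prefer to inspect the signature from Python rather than with saved_model_cli, a short script along the following lines gives the same information. This is a sketch only; show_signature.py is an illustrative name and the script is not required for the rest of the example.
$ source venv/bin/activate
$ cat > show_signature.py << EOF
# Sketch: print the serving signature of the SavedModel from Python.
import tensorflow as tf
loaded = tf.saved_model.load('resnet_v2_50')
sig = loaded.signatures['serving_default']
print(sig.structured_input_signature)
print(sig.structured_outputs)
EOF
$ python show_signature.py
$ deactivate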
4.2.2. Model conversion and compilation
This section only describes using PopRT to compile a float16 model. For other optimisation parameters refer to Section 4.1, ONNX model.
Since the IPU uses static compilation, the input shape must be fixed at compile time. For this model, we use an input shape of (4, 3, 224, 224) for the compilation. The input model is the ONNX model resnet_v2_50.onnx obtained in Section 4.2.1, Model exporting.
$ gc-docker -- --rm \
-v `pwd -P`:/model_conversion \
-w /model_conversion \
graphcorecn/poprt-staging:latest \
--input_model resnet_v2_50.onnx \
--precision fp16 \
--output_model resnet_v2_50_optimized.onnx \
--input_shape inputs=4,3,224,224 \
--convert_version 11 \
--export_popef \
--ipu_version ipu21
Note
If testing on an IPU-M2000 or Bow-2000, use --ipu_version ipu2 instead.
After the model conversion completes successfully, the optimised ONNX file resnet_v2_50_optimized.onnx from conversion and the PopEF file executable.popef from compilation are generated in the directory:
$ tree . -L 1
.
├── executable.popef
├── resnet_v2_50
├── resnet_v2_50.onnx
├── resnet_v2_50.tar.gz
├── resnet_v2_50_optimized.onnx
└── venv
Note
This example is complete. Use cd .. to go back to the test root directory.
4.3. PyTorch model
This section describes how to convert and compile a PyTorch model to generate a PopEF file. It uses the resnet50 model from torchvision as an example.
Create the directory and Python virtual environment required for this example.
$ mkdir -p torch_resnet
$ cd torch_resnet
$ virtualenv -p python3 venv
$ source venv/bin/activate
$ pip install torchvision==0.11.2
$ deactivate
4.3.1. Model exporting
First, load the resnet50 model from torchvision, set the batch size and input shape, and export the model as an ONNX file.
$ source venv/bin/activate
$ cat > export.py << EOF
import torch.onnx
import torchvision.models as models
model = models.resnet50(pretrained=True)
BATCH_SIZE = 4
dummy_input = torch.randn(BATCH_SIZE, 3, 244, 244)
model.eval()
# save it to ONNX
torch.onnx.export(model, dummy_input, 'resnet50.onnx', opset_version=11)
print('The model saved into resnet50.onnx')
EOF
$ python export.py
$ deactivate
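Before moving on, you can optionally validate the exported file and print the input shape that was baked in at export time. This sketch assumes the onnx package is also installed in this virtual environment (it is not installed by the pip command above); check_export.py is an illustrative name.
$ source venv/bin/activate
$ pip install onnx==1.11
$ cat > check_export.py << EOF
# Sketch: run the ONNX checker and print the input name and shape of the exported model.
import onnx
model = onnx.load('resnet50.onnx')
onnx.checker.check_model(model)
for inp in model.graph.input:
    dims = [d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
EOF
$ python check_export.py
$ deactivate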
4.3.2. Model conversion and compilation
The exported ONNX model (resnet50.onnx) is converted and compiled with the following command. For an explanation of the parameters in the command, refer to Section 4.1, ONNX model.
$ gc-docker -- --rm \
-v `pwd -P`:/model_conversion \
-w /model_conversion \
graphcorecn/poprt-staging:latest \
--input_model resnet50.onnx \
--precision fp16 \
--export_popef \
--ipu_version ipu21
Note
If testing on an IPU-M2000 or Bow-2000, use --ipu_version ipu2 instead.
If the model conversion is successful, the following files will appear in the directory:
$ tree . -L 1
.
├── executable.popef
├── export.py
├── resnet50.onnx
├── resnet50.onnx.optimized.onnx
└── venv
where:
resnet50.onnx is the original model.
resnet50.onnx.optimized.onnx is the model converted by PopRT.
executable.popef is the PopEF file generated from compilation.
Note
This example is complete. Use cd .. to go back to the test root directory.