2. Model compilation
As a first step, you may need to download a model if you do not have your own. Once that is done, you can compile the model.
2.1. Model download
In this example, we will use the BERT-Squad model from the ONNX Model Zoo. Use the wget
command to download the model:
wget https://github.com/onnx/models/raw/87d452a218093f6a60ceb62712ffe1186dce6d64/text/machine_comprehension/bert-squad/model/bertsquad-12.onnx
2.2. Model conversion and compilation
The following command converts the bertsquad-12.onnx
model to an FP16 precision model with a batch size of 16, and compiles it into an executable binary file, executable.popef,
in the PopEF format. For more details on how to use PopRT, see the PopRT User Guide.
gc-docker -- --rm \
-v `pwd -P`:/model_conversion \
-w /model_conversion \
graphcorecn/poprt-staging:latest \
--input_model bertsquad-12.onnx \
--input_shape input_ids:0=16,256 input_mask:0=16,256 segment_ids:0=16,256 unique_ids_raw_output___9:0=16 \
--precision fp16 \
--convert_version 11 \
--export_popef \
--ipu_version ipu21
Note
If you are testing in an IPU-M2000 or Bow-2000 environment, you will need to use --ipu_version ipu2 instead of --ipu_version ipu21.
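The --input_shape options above fix the batch size to 16 and the sequence length to 256 for the three token inputs, and a batch of 16 for the unique-ID input. As a quick sanity check before running inference, you can build dummy input arrays matching those shapes; this is a minimal sketch assuming NumPy is available (the int32 dtype is an assumption and may need to match the model's actual input types):

```python
import numpy as np

batch_size, seq_len = 16, 256

# Dummy inputs matching the shapes passed to --input_shape above.
# Note: the int32 dtype is an assumption; check the model's input
# signature for the exact types it expects.
inputs = {
    "input_ids:0": np.zeros((batch_size, seq_len), dtype=np.int32),
    "input_mask:0": np.zeros((batch_size, seq_len), dtype=np.int32),
    "segment_ids:0": np.zeros((batch_size, seq_len), dtype=np.int32),
    "unique_ids_raw_output___9:0": np.zeros((batch_size,), dtype=np.int32),
}

for name, arr in inputs.items():
    print(name, arr.shape)
```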
Once the model has been converted and compiled, the model directory will contain the following files:
$ tree . -L 1
├── bertsquad-12.onnx
├── bertsquad-12.onnx.optimized.onnx
├── executable.popef
bertsquad-12.onnx: the original model
bertsquad-12.onnx.optimized.onnx: the model generated after the conversion with PopRT
executable.popef: the PopEF file generated during compilation
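If you script the conversion step, it can be convenient to check that all three expected artifacts were produced before moving on. This is a small hypothetical helper (not part of PopRT) that lists any of the files above that are missing from the model directory:

```python
from pathlib import Path

# Expected artifacts after conversion and compilation
# (file names taken from the tree output above).
EXPECTED_FILES = [
    "bertsquad-12.onnx",
    "bertsquad-12.onnx.optimized.onnx",
    "executable.popef",
]

def missing_artifacts(model_dir):
    """Return the expected files that are absent from model_dir."""
    directory = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]
```

An empty return value means the conversion and compilation produced everything this guide expects.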