2. Model compilation

If you do not already have a model of your own, the first step is to download one. Once that is done, you can compile the model.

2.1. Model download

In this example, we use the BERT-Squad model from the ONNX Model Zoo. Download the model with the wget command:

wget https://github.com/onnx/models/raw/87d452a218093f6a60ceb62712ffe1186dce6d64/text/machine_comprehension/bert-squad/model/bertsquad-12.onnx
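If you prefer to script the download rather than use wget, a minimal Python sketch is shown below. The URL is the same pinned one as above; `download_model` is an illustrative helper, not part of PopRT.

```python
import os
import urllib.request

# Pinned URL of the BERT-Squad model in the ONNX Model Zoo (same as the
# wget command above).
MODEL_URL = ("https://github.com/onnx/models/raw/"
             "87d452a218093f6a60ceb62712ffe1186dce6d64/"
             "text/machine_comprehension/bert-squad/model/bertsquad-12.onnx")

def download_model(url: str, dest_dir: str = ".") -> str:
    """Download the model into dest_dir, skipping it if already present."""
    dest = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest

# The file is saved in the current directory as bertsquad-12.onnx:
print(os.path.basename(MODEL_URL))   # bertsquad-12.onnx
```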

2.2. Model conversion and compilation

The following command converts the bertsquad-12.onnx model to an FP16 precision model with a batch size of 16, and compiles it into an executable binary file, executable.popef, in the PopEF format. For more details on how to use PopRT, see the PopRT User Guide.

gc-docker -- --rm \
    -v `pwd -P`:/model_conversion \
    -w /model_conversion \
    graphcorecn/poprt-staging:latest \
    --input_model bertsquad-12.onnx \
    --input_shape input_ids:0=16,256 input_mask:0=16,256 segment_ids:0=16,256 unique_ids_raw_output___9:0=16 \
    --precision fp16 \
    --convert_version 11 \
    --export_popef \
    --ipu_version ipu21
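The `--input_shape` argument fixes each input tensor's shape at compile time: the three sequence inputs take the shape (batch, sequence length) = (16, 256), while unique_ids_raw_output___9:0 only has a batch dimension. The mapping can be sketched in Python (`format_input_shapes` is an illustrative helper that reproduces the argument string used above):

```python
BATCH = 16
SEQ_LEN = 256

# The three sequence inputs share the same (batch, seq_len) shape;
# unique_ids_raw_output___9:0 only has a batch dimension.
shapes = {
    "input_ids:0": [BATCH, SEQ_LEN],
    "input_mask:0": [BATCH, SEQ_LEN],
    "segment_ids:0": [BATCH, SEQ_LEN],
    "unique_ids_raw_output___9:0": [BATCH],
}

def format_input_shapes(shapes):
    """Render the shapes as the name=dim,dim space-separated syntax
    passed to --input_shape."""
    return " ".join(f"{name}={','.join(map(str, dims))}"
                    for name, dims in shapes.items())

print(format_input_shapes(shapes))
```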

Note

If you are testing in an IPU-M2000 or Bow-2000 environment you will need to use --ipu_version ipu2 instead of --ipu_version ipu21.
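The choice in the note can be sketched as a small helper. This is illustrative only: the platform names are the ones mentioned above, and anything else falls back to ipu21, the value used in the compilation command.

```python
def ipu_version_flag(platform: str) -> str:
    """Pick the --ipu_version value for the target hardware.

    IPU-M2000 and Bow-2000 environments need ipu2; this sketch falls
    back to ipu21 (the value used in the command above) otherwise.
    """
    if platform in ("IPU-M2000", "Bow-2000"):
        return "ipu2"
    return "ipu21"

print(ipu_version_flag("Bow-2000"))   # ipu2
```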

Once the model has been converted and compiled, the model directory will contain the following files:

$ tree . -L 1

  ├── bertsquad-12.onnx
  ├── bertsquad-12.onnx.optimized.onnx
  └── executable.popef

  • bertsquad-12.onnx: The original model

  • bertsquad-12.onnx.optimized.onnx: The optimized model generated by the PopRT conversion

  • executable.popef: The PopEF file generated during compilation