7. Tools

The following tools exercise the Model Runtime API with PopEF models. Use them to measure processing throughput and to check the validity of the data computed by the IPU.

7.1. Callbacks benchmark

The model_runner_benchmark_callbacks tool measures the throughput achieved by the IPU through the Model Runtime API. The input data fed to the IPU is random. Communication with the IPU uses the callback mechanism described in the "Under the hood" chapter. The tool measures the time between sending a query to the IPU and receiving its result.

Note

To set the number of iterations performed by the IPU, set the NUM environment variable. If NUM is not set, the default is 10.

Listing 7.1 Callbacks benchmark usage.
$ NUM=1000 model_runner_benchmark_callbacks bert.popef

[140000941486208] Parsed bert.popef
[140000941486208] Model created
[140000941486208] Device acquired
[140000941486208] Preparing device
NUM=1000
[140000941486208] Creating callback for input 'input/1' Shape: [1, 110] Data type: S32
[140000941486208] Creating callback for input 'input' Shape: [1, 110] Data type: S32
[140000941486208] Creating callback for output 'Reshape:0/1' Shape: [1, 110] Data type: F32
[140000941486208] Creating callback for output 'Reshape:0' Shape: [1, 110] Data type: F32

The model has multiple input anchors (2):
1. input/1
2. input
Please select the input for which throughput will be measured from the list above: 2
[140000941486208] Running load programs
[140000941486208] Total elapsed time in seconds: 0.744733 sec
[140000941486208] Processed batch(1) * iterations(1000) = 1000
[140000941486208] 1342.76 items / sec
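
The last three log lines are related by a simple formula: items per second equals batch size times iterations, divided by elapsed time. The following is a minimal sketch of that arithmetic using the values from the listing above. The function name `throughput` is hypothetical and is not part of the tool or of the Model Runtime API.

```python
def throughput(batch_size: int, iterations: int, elapsed_seconds: float) -> float:
    """Return the number of processed items per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return batch_size * iterations / elapsed_seconds


# Values copied from the log above: batch(1), iterations(1000), 0.744733 sec.
print(f"{throughput(1, 1000, 0.744733):.2f} items / sec")  # 1342.76 items / sec
```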

7.2. Queues benchmark

The model_runner_benchmark_queues tool takes the same measurements as model_runner_benchmark_callbacks, but passes the data to the IPU through queues, which are also described in the "Under the hood" chapter.

Listing 7.2 Queues benchmark usage.
$ NUM=1000 model_runner_benchmark_queues bert.popef

[140619973836928] Parsed bert.popef
[140619973836928] Model created
NUM=1000
[140619973836928] constructor, num_iterations = 1000
[140619973836928] Device acquired
[140619973836928] Preparing device
[140619973836928] Running load program

The model has multiple input anchors (2):
1. input
2. input/1
Please select the input for which throughput will be measured from the list above: 1
[140619973836928] Starting threads
[140618722395904] Program run starting now
[140619973836928] Waiting for main thread
[140618722395904] Last output callback: exiting
[140618722395904] Exiting main program
[140619973836928] Main thread returned
[140619973836928] Total elapsed time in seconds: 0.768687 sec
[140619973836928] Processed batch(1) * iterations(1000) = 1000
[140619973836928] 1300.92 items / sec
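
To compare the two benchmarks in a script, the throughput figure can be extracted from the final log line. The sketch below assumes the summary line always has the form shown in the listings above ("[thread id] <number> items / sec"), which may differ between SDK versions. The helper `parse_throughput` is hypothetical and is not shipped with the tools.

```python
import re

# Matches the summary line, for example "[140000941486208] 1342.76 items / sec".
# The pattern is inferred from the example logs only, so treat it as an assumption.
_THROUGHPUT_RE = re.compile(r"\]\s*([0-9]+(?:\.[0-9]+)?) items / sec")


def parse_throughput(log_text: str) -> float:
    """Return the items-per-second value found in a benchmark log."""
    match = _THROUGHPUT_RE.search(log_text)
    if match is None:
        raise ValueError("no throughput line found in log")
    return float(match.group(1))


# Summary lines copied from the callbacks and queues listings.
callbacks = parse_throughput("[140000941486208] 1342.76 items / sec")
queues = parse_throughput("[140619973836928] 1300.92 items / sec")
print(f"callbacks: {callbacks} items/s, queues: {queues} items/s")
```

A single run of each tool is not enough for a reliable comparison. Repeating the measurement with a larger NUM value gives more stable numbers.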

7.3. Real data

The model_runner_real_data tool executes the model and creates a file in the PopEF format containing the values calculated by the IPU. The values are saved as feed data blobs.

The example below runs the tool on an MNIST model that was saved with the PopTorch API in the executable.popef file. The input data that the tool provides to the IPU is in the feed.popef file. The tool writes its output to the out_feed.popef file, which contains the processing results for the first four inputs from feed.popef.

Listing 7.3 Input data for the mnist model.
$ popef_dump feed.popef

PopEF file: feed.popef
Executables:
Feeds:
  Name: "input":
    Version:
    Available read size: 349408
    Number of tensors: 101
    TensorInfo: { dtype: S32, sizeInBytes: 3136, shape [1, 1, 28, 28] }

Listing 7.4 Usage and output of the model_runner_real_data tool.
$ model_runner_real_data -o out_feed.popef -n 4 executable.popef feed.popef

NUM=10
[140253022322816] Parsed executable.popef
[140253022322816] Parsed feed.popef
[140253022322816] Model created
[140253022322816] Output filename supplied: using NUM=4
[140253022322816] constructor, num_iterations = 4
[140253022322816] Device acquired
[140253022322816] Preparing device
[140253022322816] Running load program
[140253022322816] Starting threads
[140252157048576] Program run starting now
[140253022322816] Waiting for main thread
[140252157048576] Last output callback: exiting
[140252157048576] Exiting main program
[140253022322816] Main thread returned

Listing 7.5 Content of the out_feed.popef file.
$ popef_dump out_feed.popef

PopEF file: out_feed.popef
Executables:
Feeds:
  Name: "softmax/Softmax:0":
    Version:
    Available read size: 320
    Number of tensors: 4
    TensorInfo: { dtype: F32, sizeInBytes: 40, shape [1, 10] }

Listing 7.6 Example script that reads the content of the feed data blob.
import popef

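# Open the PopEF file written by model_runner_real_data.
reader = popef.Reader()
reader.parseFile("out_feed.popef")

# Print every tensor stored in the first feed data blob.
for out_tensor in reader.feeds()[0].data():
    print(out_tensor)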

Listing 7.7 Output of the script above, saved as read_feed.py.
$ python3 read_feed.py

[[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]]
[[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]]
[[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
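
Each printed row can be reduced to a single predicted class by taking the index of its largest value. The sketch below does this for the four rows shown above. It assumes that each row is a softmax output over ten classes and that index i corresponds to digit i. The feed name and the [1, 10] shape suggest this, but the source does not state it. The helper `predicted_class` is hypothetical.

```python
# The four output rows printed by read_feed.py, with the batch dimension removed.
outputs = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]


def predicted_class(probabilities):
    """Return the index of the largest value (argmax) in one output row."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Assumption: class index i corresponds to digit i.
print([predicted_class(row) for row in outputs])  # [7, 2, 1, 0]
```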