8. Tools

The following tools exercise the Model Runtime API with PopEF models so that you can measure processing throughput and check the validity of the data computed by the IPU.

8.1. Callbacks benchmark

The model_runner_benchmark_callbacks tool measures the throughput achieved by the IPU through the Model Runtime API. The input data that the IPU processes is random. Communication with the IPU uses the callback mechanism described in Section 2.5, Managing data sources and targets. The tool measures the time between sending the query to the IPU and receiving a result.

Note

To set the number of iterations performed by the IPU, define the NUM environment variable. If NUM is not set, the default is 10.

Listing 8.1 Callbacks benchmark usage.

$ NUM=1000 model_runner_benchmark_callbacks bert.popef

[140000941486208] Parsed bert.popef
[140000941486208] Model created
[140000941486208] Device acquired
[140000941486208] Preparing device
NUM=1000
[140000941486208] Creating callback for input 'input/1' Shape: [1, 110] Data type: S32
[140000941486208] Creating callback for input 'input' Shape: [1, 110] Data type: S32
[140000941486208] Creating callback for output 'Reshape:0/1' Shape: [1, 110] Data type: F32
[140000941486208] Creating callback for output 'Reshape:0' Shape: [1, 110] Data type: F32

The model has multiple input anchors (2):
1. input/1
2. input
Please select the input for which throughput will be measured from the list above: 2
[140000941486208] Running load programs
[140000941486208] Total elapsed time in seconds: 0.744733 sec
[140000941486208] Processed batch(1) * iterations(1000) = 1000
[140000941486208] 1342.76 items / sec
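
The last three log lines appear to be related by simple arithmetic: the reported throughput matches the number of processed items (batch size times iterations) divided by the elapsed time. The following Python sketch reproduces the figure from Listing 8.1 under that assumption. The helper name throughput_items_per_sec is hypothetical and is not part of the tool.

```python
def throughput_items_per_sec(batch_size: int, iterations: int, elapsed_sec: float) -> float:
    """Return processed items per second.

    Hypothetical helper for illustration; it is not part of the benchmark tool.
    """
    if elapsed_sec <= 0:
        raise ValueError("elapsed_sec must be positive")
    return batch_size * iterations / elapsed_sec


# Values copied from the log in Listing 8.1:
# batch(1) * iterations(1000), elapsed time 0.744733 sec.
print(f"{throughput_items_per_sec(1, 1000, 0.744733):.2f} items / sec")
```

With the values from the log, the result agrees with the 1342.76 items / sec reported by the tool.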

8.2. Queues benchmark

The model_runner_benchmark_queues tool reports the same measurements as model_runner_benchmark_callbacks, but supplies the data to the IPU through queues, as described in Section 2.6, Queues of data.

Listing 8.2 Queues benchmark usage.

$ NUM=1000 model_runner_benchmark_queues bert.popef

[140619973836928] Parsed bert.popef
[140619973836928] Model created
NUM=1000
[140619973836928] constructor, num_iterations = 1000
[140619973836928] Device acquired
[140619973836928] Preparing device
[140619973836928] Running load program

The model has multiple input anchors (2):
1. input
2. input/1
Please select the input for which throughput will be measured from the list above: 1
[140619973836928] Starting threads
[140618722395904] Program run starting now
[140619973836928] Waiting for main thread
[140618722395904] Last output callback: exiting
[140618722395904] Exiting main program
[140619973836928] Main thread returned
[140619973836928] Total elapsed time in seconds: 0.768687 sec
[140619973836928] Processed batch(1) * iterations(1000) = 1000
[140619973836928] 1300.92 items / sec
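
To compare the two benchmarks, the throughput can be extracted from the captured output of each tool. The following Python sketch assumes that the final log line has the format shown in Listings 8.1 and 8.2; the pattern and the helper name parse_throughput are illustrative, and the log format may differ between releases.

```python
import re

# Matches the final log line, for example "[140619973836928] 1300.92 items / sec".
# The format is assumed from the listings in this section.
THROUGHPUT_RE = re.compile(
    r"^\[\d+\]\s+([0-9]+(?:\.[0-9]+)?) items / sec\s*$", re.MULTILINE
)


def parse_throughput(log_text: str) -> float:
    """Return the items-per-second value found in a benchmark log (hypothetical helper)."""
    match = THROUGHPUT_RE.search(log_text)
    if match is None:
        raise ValueError("no throughput line found in log")
    return float(match.group(1))


# Log excerpts copied from Listings 8.1 and 8.2.
callbacks_log = (
    "[140000941486208] Processed batch(1) * iterations(1000) = 1000\n"
    "[140000941486208] 1342.76 items / sec\n"
)
queues_log = (
    "[140619973836928] Processed batch(1) * iterations(1000) = 1000\n"
    "[140619973836928] 1300.92 items / sec\n"
)

callbacks = parse_throughput(callbacks_log)
queues = parse_throughput(queues_log)
print(f"callbacks: {callbacks} items / sec, queues: {queues} items / sec")
print(f"difference: {callbacks - queues:.2f} items / sec")
```

A single run of each tool is a small sample, so a difference of this size should not be read as a general ranking of the two mechanisms.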

8.3. Real data

The model_runner_real_data tool runs the model and creates a PopEF file containing the values calculated by the IPU. The values are saved as feed data blobs.

The following example uses an MNIST model that was exported with the PopTorch API and saved in the executable.popef file. The input data that the tool feeds to the IPU is in the feed.popef file. The tool writes its output to the out_feed.popef file, which contains the processing results for the first four inputs from the feed.popef file.

Listing 8.3 Input data for the mnist model.

$ popef_dump feed.popef

PopEF file: feed.popef
Executables:
Feeds:
  Name: "input":
    Version:
    Available read size: 349408
    Number of tensors: 101
    TensorInfo: { dtype: S32, sizeInBytes: 3136, shape [1, 1, 28, 28] }

Listing 8.4 Input and output of the model_runner_real_data tool.

$ model_runner_real_data -o out_feed.popef -n 4 executable.popef feed.popef

NUM=10
[140253022322816] Parsed executable.popef
[140253022322816] Parsed feed.popef
[140253022322816] Model created
[140253022322816] Output filename supplied: using NUM=4
[140253022322816] constructor, num_iterations = 4
[140253022322816] Device acquired
[140253022322816] Preparing device
[140253022322816] Running load program
[140253022322816] Starting threads
[140252157048576] Program run starting now
[140253022322816] Waiting for main thread
[140252157048576] Last output callback: exiting
[140252157048576] Exiting main program
[140253022322816] Main thread returned

Listing 8.5 Contents of the out_feed.popef file.

$ popef_dump out_feed.popef

PopEF file: out_feed.popef
Executables:
Feeds:
  Name: "softmax/Softmax:0":
    Version:
    Available read size: 320
    Number of tensors: 4
    TensorInfo: { dtype: F32, sizeInBytes: 40, shape [1, 10] }
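
The sizeInBytes values reported by popef_dump can be cross-checked against the tensor shape and data type. The following Python sketch does this for the tensors in Listings 8.3 and 8.5. It assumes that S32 and F32 elements each occupy 4 bytes; the table and the helper name tensor_size_in_bytes are illustrative and are not part of PopEF.

```python
from math import prod

# Assumed element sizes in bytes for the dtype names that appear in the listings.
DTYPE_SIZE_BYTES = {"S32": 4, "F32": 4}


def tensor_size_in_bytes(dtype: str, shape: list) -> int:
    """Return the size of one tensor: element size times number of elements."""
    return DTYPE_SIZE_BYTES[dtype] * prod(shape)


# Input tensor from Listing 8.3: dtype S32, shape [1, 1, 28, 28].
print(tensor_size_in_bytes("S32", [1, 1, 28, 28]))
# Output tensor from Listing 8.5: dtype F32, shape [1, 10].
print(tensor_size_in_bytes("F32", [1, 10]))
```

The two results, 3136 and 40, match the sizeInBytes fields in the listings.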

Listing 8.6 Example script to read the contents of the feed data blob from the out_feed.popef file.

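import popef

# Open the PopEF file written by model_runner_real_data.
reader = popef.Reader()
reader.parseFile("out_feed.popef")

# Print every tensor stored in the first feed data blob.
for out_tensor in reader.feeds()[0].data():
    print(out_tensor)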

Listing 8.7 The contents of the feed data blob as printed by the Python script in Listing 8.6.

[[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]]
[[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]]
[[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
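
Each printed tensor holds ten values from the softmax output. Assuming that position i in a row is the score for digit class i (the source does not state this mapping), the predicted class is the index of the largest value. The following Python sketch applies that rule to the rows shown above; the helper name predicted_class is illustrative.

```python
# Rows copied from Listing 8.7, flattened from shape [1, 10] to plain lists.
rows = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]


def predicted_class(scores: list) -> int:
    """Return the index of the largest score (argmax) in one output row."""
    return max(range(len(scores)), key=scores.__getitem__)


# Assumption: index i corresponds to digit class i.
print([predicted_class(row) for row in rows])  # prints [7, 2, 1, 0]
```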
