8. Tools
You can use the following tools to test the Model Runtime API with PopEF models in order to verify the processing throughput and validity of the data computed by the IPU.
8.1. Callbacks benchmark
The model_runner_benchmark_callbacks tool measures the throughput the IPU achieves through the Model Runtime API. The input data used by the IPU for the calculations is random. Communication with the IPU is carried out through the callback mechanism described in Section 2.5, Managing data sources and targets. The tool measures the time between sending the query to the IPU and receiving a result.
Note
To specify the number of iterations performed by the IPU, define the NUM environment variable. If NUM is not set, the default is 10.
$ NUM=1000 model_runner_benchmark_callbacks bert.popef
[140000941486208] Parsed bert.popef
[140000941486208] Model created
[140000941486208] Device acquired
[140000941486208] Preparing device
NUM=1000
[140000941486208] Creating callback for input 'input/1' Shape: [1, 110] Data type: S32
[140000941486208] Creating callback for input 'input' Shape: [1, 110] Data type: S32
[140000941486208] Creating callback for output 'Reshape:0/1' Shape: [1, 110] Data type: F32
[140000941486208] Creating callback for output 'Reshape:0' Shape: [1, 110] Data type: F32
The model has multiple input anchors (2):
1. input/1
2. input
Please select the input for which throughput will be measured from the list above: 2
[140000941486208] Running load programs
[140000941486208] Total elapsed time in seconds: 0.744733 sec
[140000941486208] Processed batch(1) * iterations(1000) = 1000
[140000941486208] 1342.76 items / sec
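The last three log lines are related by simple arithmetic: the number of processed items (batch size times iterations) divided by the elapsed time gives the reported rate. The following is a minimal Python sketch of that calculation, assuming this is how the figure is derived; the function name `throughput` is hypothetical and is not part of the tool or the Model Runtime API.

```python
def throughput(batch_size: int, iterations: int, elapsed_seconds: float) -> float:
    """Return the number of items processed per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return batch_size * iterations / elapsed_seconds


# Values copied from the log above: batch(1), iterations(1000), 0.744733 sec.
rate = throughput(batch_size=1, iterations=1000, elapsed_seconds=0.744733)
print(f"{rate:.2f} items / sec")
```

With the values from the log, the result agrees with the 1342.76 items / sec reported by the tool.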
8.2. Queues benchmark
The model_runner_benchmark_queues tool provides the same measurements as model_runner_benchmark_callbacks, but passes the data to the IPU using queues, as described in Section 2.6, Queues of data.
$ NUM=1000 model_runner_benchmark_queues bert.popef
[140619973836928] Parsed bert.popef
[140619973836928] Model created
NUM=1000
[140619973836928] constructor, num_iterations = 1000
[140619973836928] Device acquired
[140619973836928] Preparing device
[140619973836928] Running load program
The model has multiple input anchors (2):
1. input
2. input/1
Please select the input for which throughput will be measured from the list above: 1
[140619973836928] Starting threads
[140618722395904] Program run starting now
[140619973836928] Waiting for main thread
[140618722395904] Last output callback: exiting
[140618722395904] Exiting main program
[140619973836928] Main thread returned
[140619973836928] Total elapsed time in seconds: 0.768687 sec
[140619973836928] Processed batch(1) * iterations(1000) = 1000
[140619973836928] 1300.92 items / sec
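To compare the two benchmarks, the final "items / sec" line can be extracted from each tool's output. The sketch below assumes the log format shown in the examples in this section (a thread ID in square brackets followed by the rate); the helper name `parse_items_per_sec` is hypothetical, and the log excerpts are copied from the runs above rather than captured from a live run.

```python
import re

# Matches the summary line, for example "[140000941486208] 1342.76 items / sec".
_RATE_LINE = re.compile(
    r"^\[\d+\]\s+([0-9]+(?:\.[0-9]+)?) items / sec\s*$", re.MULTILINE
)


def parse_items_per_sec(log_text: str) -> float:
    """Return the throughput reported in a benchmark log."""
    match = _RATE_LINE.search(log_text)
    if match is None:
        raise ValueError("no 'items / sec' line found in log")
    return float(match.group(1))


# Excerpts copied from the example runs in this section.
callbacks_log = (
    "[140000941486208] Processed batch(1) * iterations(1000) = 1000\n"
    "[140000941486208] 1342.76 items / sec\n"
)
queues_log = (
    "[140619973836928] Processed batch(1) * iterations(1000) = 1000\n"
    "[140619973836928] 1300.92 items / sec\n"
)

callbacks_rate = parse_items_per_sec(callbacks_log)
queues_rate = parse_items_per_sec(queues_log)
print(f"callbacks: {callbacks_rate} items / sec, queues: {queues_rate} items / sec")
```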
8.3. Real data
The model_runner_real_data tool automatically executes the model and creates a PopEF file containing the values computed by the IPU. The values are saved as feed data blobs.
The following example uses the mnist model, saved using the PopTorch API in the executable.popef file. The input data that the tool provides to the IPU is in the feed.popef file.
The tool writes its output to the out_feed.popef file, which contains the processing results for the first four inputs from the feed.popef file.
$ popef_dump feed.popef
PopEF file: feed.popef
Executables:
Feeds:
Name: "input":
Version:
Available read size: 349408
Number of tensors: 101
TensorInfo: { dtype: S32, sizeInBytes: 3136, shape [1, 1, 28, 28] }
$ model_runner_real_data -o out_feed.popef -n 4 executable.popef feed.popef
NUM=10
[140253022322816] Parsed executable.popef
[140253022322816] Parsed feed.popef
[140253022322816] Model created
[140253022322816] Output filename supplied: using NUM=4
[140253022322816] constructor, num_iterations = 4
[140253022322816] Device acquired
[140253022322816] Preparing device
[140253022322816] Running load program
[140253022322816] Starting threads
[140252157048576] Program run starting now
[140253022322816] Waiting for main thread
[140252157048576] Last output callback: exiting
[140252157048576] Exiting main program
[140253022322816] Main thread returned
$ popef_dump out_feed.popef
PopEF file: out_feed.popef
Executables:
Feeds:
Name: "softmax/Softmax:0":
Version:
Available read size: 320
Number of tensors: 4
TensorInfo: { dtype: F32, sizeInBytes: 40, shape [1, 10] }
import popef
reader = popef.Reader()
reader.parseFile("out_feed.popef")
# Print every output tensor stored in the first feed of the file.
for out_tensor in reader.feeds()[0].data():
    print(out_tensor)
[[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]]
[[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]]
[[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
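Each printed tensor has shape [1, 10] and comes from the feed named "softmax/Softmax:0", so the index of the largest value in a row can be read as the predicted digit. The following minimal sketch illustrates this with the four tensors above written out as plain Python lists; the helper `predicted_class` is hypothetical, and the interpretation of the ten values as per-digit scores is an assumption based on the feed name and the mnist model.

```python
def predicted_class(scores):
    """Return the index of the largest value in a [1, N] tensor."""
    row = scores[0]
    return max(range(len(row)), key=lambda i: row[i])


# The four output tensors printed above, written as nested lists.
outputs = [
    [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]],
    [[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
    [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
]

predictions = [predicted_class(tensor) for tensor in outputs]
print(predictions)  # [7, 2, 1, 0]
```

The helper only relies on integer indexing, so it should work equally on the objects yielded by the reader loop above if they behave like two-dimensional arrays.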