9. gc-hosttraffictest
This tool tests the data transfer between the host machine and the IPUs (in both directions).
On IPU-M2000 systems, transfers between the host and the IPU are remotely buffered. The tool measures simultaneous host and IPU access to that buffer.
On PCIe card systems, the IPU has direct access to the buffer in the host memory. In this case, the performance of memory accesses to the local buffer is not measured.
To use it, run:
gc-hosttraffictest -d {device_id} -j
where {device_id} is the ID number returned by the gc-inventory tool.
If JSON output is selected, the output on an IPU-M2000 system will look something like this:
{
    "configuration": {
        "number_of_ipus": "1",
        "tile_transfers_enabled": "true",
        "tiles_per_ipu": "32",
        "transfer_size_64byte_blocks": "4",
        "iterations": "100000",
        "host_transfers_enabled": "true"
    },
    "repeat_0": {
        "host_from_buffer_from_tile": {
            "description": "host <- buffer <- tile",
            "duration_seconds": "3.4895363750000001",
            "host_data_transferred_bytes": "10234101760",
            "tile_data_transferred_bytes": "13107200000",
            "host_transfer_speed_gbps": "23.462375880807375",
            "tile_transfer_speed_gbps": "30.049149437509445"
        },
        "host_to_buffer_to_tile": {
            "description": "host -> buffer -> tile",
            "duration_seconds": "3.970404303",
            "host_data_transferred_bytes": "28034727936",
            "tile_data_transferred_bytes": "13107200000",
            "host_transfer_speed_gbps": "56.487401879586365",
            "tile_transfer_speed_gbps": "26.409804140291353"
        },
        "host_to_buffer_from_tile": {
            "description": "host -> buffer <- tile",
            "duration_seconds": "4.2574377639999996",
            "host_data_transferred_bytes": "11576279040",
            "tile_data_transferred_bytes": "13107200000",
            "host_transfer_speed_gbps": "21.752574542156008",
            "tile_transfer_speed_gbps": "24.629273711680266"
        },
        "host_from_buffer_to_tile": {
            "description": "host <- buffer -> tile",
            "duration_seconds": "3.8982566639999998",
            "host_data_transferred_bytes": "19612565504",
            "tile_data_transferred_bytes": "13107200000",
            "host_transfer_speed_gbps": "40.248895225642841",
            "tile_transfer_speed_gbps": "26.89858801970357"
        }
    }
}
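The speed fields in the JSON output can be cross-checked against the byte counts and the duration. The following Python sketch does this for one test entry copied from the sample above. It assumes that Gbps means 10^9 bits per second and that every value is emitted as a string, as in the sample; the helper name `recompute_gbps` is hypothetical and not part of the tool.

```python
import json

# One test entry copied from the sample output above (all values are strings).
SAMPLE = """
{
    "repeat_0": {
        "host_from_buffer_from_tile": {
            "description": "host <- buffer <- tile",
            "duration_seconds": "3.4895363750000001",
            "host_data_transferred_bytes": "10234101760",
            "tile_data_transferred_bytes": "13107200000",
            "host_transfer_speed_gbps": "23.462375880807375",
            "tile_transfer_speed_gbps": "30.049149437509445"
        }
    }
}
"""


def recompute_gbps(num_bytes: str, seconds: str) -> float:
    """Convert a byte count and a duration to gigabits per second.

    Assumption: 1 Gb = 1e9 bits, which is consistent with the sample values.
    """
    return int(num_bytes) * 8 / float(seconds) / 1e9


report = json.loads(SAMPLE)
entry = report["repeat_0"]["host_from_buffer_from_tile"]

host_gbps = recompute_gbps(entry["host_data_transferred_bytes"], entry["duration_seconds"])
tile_gbps = recompute_gbps(entry["tile_data_transferred_bytes"], entry["duration_seconds"])

print(f"host: {host_gbps:.2f} Gbps, reported {float(entry['host_transfer_speed_gbps']):.2f}")
print(f"tile: {tile_gbps:.2f} Gbps, reported {float(entry['tile_transfer_speed_gbps']):.2f}")
```

Because the numeric fields are strings, they need to be converted with `int()` or `float()` before any arithmetic or comparison.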
The output will be plain text if the -j option is not specified.
For example:
$ gc-hosttraffictest --device 0
Running test sequence 1/1
host <- buffer <- tile: 23.46 Gbps RDMA host read, 30.00 Gbps tile write, for 3.50 seconds.
host -> buffer -> tile: 55.78 Gbps RDMA host write, 26.40 Gbps tile read, for 3.97 seconds.
host -> buffer <- tile: 21.75 Gbps RDMA host write, 24.63 Gbps tile write, for 4.26 seconds.
host <- buffer -> tile: 40.38 Gbps RDMA host read, 26.97 Gbps tile read, for 3.89 seconds.
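When only the plain-text output is available, the result lines can be parsed with a regular expression. This is a minimal Python sketch that assumes the line layout matches the examples in this section exactly; the layout is not a documented interface, so the JSON output is the safer choice for scripts. The function name `parse_result_line` is hypothetical.

```python
import re

# "<description>: <rate> Gbps <label>[, <rate> Gbps <label>], for <secs> seconds."
LINE_RE = re.compile(r"^(?P<desc>[^:]+): (?P<body>.+), for (?P<secs>[\d.]+) seconds\.$")
RATE_RE = re.compile(r"(?P<gbps>[\d.]+) Gbps (?P<label>[^,]+)")


def parse_result_line(line):
    """Parse one result line; return None for lines that are not results."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # Map each label (for example "tile write") to its rate in Gbps.
    rates = {r["label"]: float(r["gbps"]) for r in RATE_RE.finditer(match["body"])}
    return {
        "description": match["desc"],
        "rates_gbps": rates,
        "duration_seconds": float(match["secs"]),
    }


example = "host <- buffer <- tile: 23.46 Gbps RDMA host read, 30.00 Gbps tile write, for 3.50 seconds."
print(parse_result_line(example))
```

Header lines such as "Running test sequence 1/1" do not match the pattern and are returned as `None`, so they can be filtered out when iterating over the full output.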
There are options to measure just IPU or host transfer performance independently.
For example:
$ gc-hosttraffictest --device 0 --tile
Running test sequence 1/1
buffer <- tile: 37.38 Gbps tile write, for 2.80 seconds.
buffer -> tile: 51.92 Gbps tile read, for 2.02 seconds.
$ gc-hosttraffictest --device 0 --host
Running test sequence 1/1
host -> buffer: 90.72 Gbps RDMA host write, for 5.01 seconds.
host <- buffer: 74.85 Gbps RDMA host read, for 5.00 seconds.
More detailed information and a progress bar are shown when using the -v option.
If an error occurs during the test, gc-hosttraffictest returns a non-zero exit code and prints an error message to the terminal. When JSON output is enabled, an error field is added to the failing test.
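A script that wraps the tool can combine the exit code with the JSON output to report which tests failed. The following Python sketch rests on several assumptions that the text above does not confirm: that the error field's key is literally `"error"`, that the JSON report is written to standard output, and that it remains valid JSON when a test fails. The function names are hypothetical.

```python
import json
import subprocess


def find_failures(report):
    """Return {"repeat/test": error} for each test entry that has an error field.

    Assumption: the field added to a failing test is keyed "error".
    """
    failures = {}
    for repeat_name, tests in report.items():
        # Skip "configuration" and anything else that is not a repeat block.
        if not repeat_name.startswith("repeat_") or not isinstance(tests, dict):
            continue
        for test_name, fields in tests.items():
            if isinstance(fields, dict) and "error" in fields:
                failures[f"{repeat_name}/{test_name}"] = fields["error"]
    return failures


def run_host_traffic_test(device_id):
    """Run the tool with JSON output; return (exit_code, failures)."""
    proc = subprocess.run(
        ["gc-hosttraffictest", "-d", str(device_id), "-j"],
        capture_output=True,
        text=True,
    )
    try:
        report = json.loads(proc.stdout)
    except json.JSONDecodeError:
        report = {}  # No parseable report; rely on the exit code alone.
    return proc.returncode, find_failures(report)
```

A caller would treat any non-zero exit code as a failure and use the returned dictionary only to explain which test failed.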
9.1. Usage
9.1.1. Allowed options
| Option | Description |
| -j | Emit JSON output |
| -d, --device | Device id |
| | Number of tiles: 1 to 32 (default: 32) |
| | Use remote buffers (HEXOATT, without use HEXOPT) |
| | Number of 64 byte blocks per transfer: 2|4 (default: 4) |
| | Number of 4KB transfers per tile (affects test duration) (default: 100000) |
| | Test duration in seconds for tests without IPU tile access (default: 5) |
| --tile-read | Measure IPU tile read performance |
| --tile-write | Measure IPU tile write performance |
| --tile | Alternative for --tile-read --tile-write |
| --host-read | Measure host RDMA read performance |
| --host-write | Measure host RDMA write performance |
| --host | Alternative for --host-read --host-write |
| | Minimum tile bandwidth (Gbps) expected - fails if not reached (default: 0) |
| | Minimum bandwidth (Gbps) expected - fails if not reached (default: 0) |
| | Suppress test timeout |
| | Suppress safety reset on failure (for debug) |
| -v | Verbose output |
| | Use C600 secondary PCIe interface |
| | Dump tile overview upon failure |
| | Produce help message |
| | Version number |
| | Number of times to repeat test (default: 1) |
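The tile count and the number of 4KB transfers per tile together determine how much tile data a test moves, which is why the transfer count affects test duration. The following Python sketch works through the arithmetic for the default values. It assumes that 4KB means 4096 bytes, which is consistent with the `tile_data_transferred_bytes` value in the sample JSON output for one IPU; the function names are hypothetical.

```python
BYTES_PER_TILE_TRANSFER = 4096  # assumption: "4KB" in the option text means 4096 bytes


def expected_tile_bytes(tiles=32, transfers_per_tile=100000):
    """Total tile data moved by one test on a single IPU."""
    return tiles * transfers_per_tile * BYTES_PER_TILE_TRANSFER


def estimated_seconds(total_bytes, tile_gbps):
    """Rough test duration at a given tile bandwidth (1 Gb = 1e9 bits)."""
    return total_bytes * 8 / (tile_gbps * 1e9)


total = expected_tile_bytes()
print(total)  # 13107200000, matching the sample JSON output
print(f"{estimated_seconds(total, 30.0):.2f} s at 30 Gbps")
```

Halving the number of transfers per tile halves the data volume and, at the same bandwidth, roughly halves the run time.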