3. Capturing execution information

To capture execution data from your program to a file, set the following option when you run your program. It is specified at the same time as POPLAR_ENGINE_OPTIONS:

PVTI_OPTIONS='{"enable":"true"}'

Additional options include:

  • directory: To specify where the .pvti file will be saved.

  • channels: To specify a list of channels to capture from. See Using the libpvti API for details.

.pvti files are streamed to disk as the program executes, so capturing should not have a significant effect on your host system’s memory usage (although it may affect the speed of execution).

The libpvti User Guide gives more details.
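
As a rough sketch of how these options fit together, the following Python snippet builds a PVTI_OPTIONS value and places it in the environment for a child process. Only the option names (enable, directory, channels) come from the text above. The directory path, the channel name, the assumption that channels takes a JSON list of names, and the script name are illustrative, so check the libpvti User Guide for the exact value formats.

```python
import json
import os


def build_pvti_options(directory=None, channels=None):
    """Return a JSON string suitable for the PVTI_OPTIONS environment variable."""
    options = {"enable": "true"}
    if directory is not None:
        options["directory"] = directory  # where the .pvti file is written
    if channels is not None:
        # Assumed format: a JSON list of channel names.
        options["channels"] = list(channels)
    return json.dumps(options)


# Hypothetical values, for illustration only.
pvti_options = build_pvti_options(directory="./traces", channels=["MyChannel"])

# Copy the current environment and add PVTI_OPTIONS for a child process.
child_env = dict(os.environ, PVTI_OPTIONS=pvti_options)
print(child_env["PVTI_OPTIONS"])

# To launch a program with tracing enabled (the script name is hypothetical):
# subprocess.run(["python3", "my_script.py"], env=child_env, check=True)
```

Building the value with json.dumps avoids quoting mistakes that are easy to make when writing the JSON by hand in a shell.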

3.1. Capturing function entry and exit

For most basic cases, there are POPLAR_TRACEPOINT macros that can be used inside your program to capture the timing of function entry and exit. For example:

void Engine::prepareForStreamAccess(StreamAndIndex stream) const {
  POPLAR_TRACEPOINT();
  const DataStream &streamInfo = getStreamInfo(stream.id);
  logging::engine::debug("prepareForStreamAccess {} \"{}\" (id {}, index {})",
                         isHostToDevice(streamInfo.type) ? "host write of"
                                                         : "host read of",
                         streamInfo.handle, stream.id, stream.index);
  // ... rest of the method
}

This will capture the name of the method, and, using object construction and destruction, record the entry and exit time.

Similar macros are available for PopART (POPART_TRACEPOINT) and TensorFlow (TENSORFLOW_TRACEPOINT).

3.2. Using the libpvti API

The API allows you to mark the beginning and end of a section of code. This is not limited to functions: markers can be placed anywhere in your code. More information on the instrumentation API (available in Python and C++) can be found in the libpvti User Guide.

To use the API to indicate the beginning and end of trace events, you must first create a channel.

For example, in Python you would:

import libpvti as pvti

channel = pvti.createTraceChannel("MyChannel")

# Functional implementation
def foo():
  pvti.Tracepoint.begin(channel, "foo")
  ...  # code to be traced
  pvti.Tracepoint.end(channel, "foo")

# Context manager implementation
def bar():
  with pvti.Tracepoint(channel, "bar"):
    ...  # code to be traced

# Wrapped object
class Bob:
  def somemethod(self):
    ...

bob = Bob()
pvti.instrument(bob, ["somemethod"], channel)

# Decorator: apply the libpvti decorator to cat() (see the libpvti User Guide)
def cat():
  ...

and in C++ you would:

// Create channel
pvti::TraceChannel channel = {"MyChannel"};

// Functional implementation
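void foo() {
  pvti::Tracepoint::begin(channel, "foo");
  // ... code to be traced
  pvti::Tracepoint::end(channel, "foo");
}

// Scoped tracepoint object
void bar() {
  pvti::Tracepoint tp(channel, "bar");
  // ... code to be traced; the end is recorded when tp goes out of scope
}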
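
The difference between explicit begin/end calls and a scoped or context-manager tracepoint matters most when the traced code can fail. The following sketch illustrates the pattern in plain Python. It is not libpvti code: the names events, begin, end, tracepoint and risky are hypothetical stand-ins, and a list of tuples takes the place of a trace channel.

```python
import time
from contextlib import contextmanager

# Stand-in for a trace channel: a list of (label, phase, timestamp) tuples.
events = []


def begin(label):
    events.append((label, "begin", time.perf_counter()))


def end(label):
    events.append((label, "end", time.perf_counter()))


@contextmanager
def tracepoint(label):
    """Record a begin event on entry and an end event on exit, even on error."""
    begin(label)
    try:
        yield
    finally:
        end(label)


def risky():
    with tracepoint("risky"):
        raise ValueError("something went wrong")


try:
    risky()
except ValueError:
    pass

# The end event is still recorded although risky() raised an exception.
print([(label, phase) for label, phase, _ in events])
```

With separate begin and end calls, an exception between them would leave the event unclosed unless you add your own try/finally, which is why the scoped forms are usually the safer choice.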

3.3. Capturing scalar values

The API also allows you to capture scalar values over time. This can be used to log any scalar value, for example host memory usage or CPU load. More information on the instrumentation API can be found in the libpvti User Guide.

To use the API to capture scalar values, you must first create a graph (with units) and then a series against that graph.

For example, in Python you would:

import libpvti as pvti
# Create graph
graph = pvti.Graph("MyGraph", "%")

# Create series
series1 = graph.addSeries("series1")
series2 = graph.addSeries("series2")

# Capture values

and in C++ you would:

// Create graph
pvti::Graph graph("MyGraph", "%");

// Create series
auto series1 = graph.addSeries("series1");
auto series2 = graph.addSeries("series2");

// Capture values

You can capture default driver monitoring information by setting the environment variable GCDA_MONITOR=1. See the online documentation for the gc-monitor command for more details of the data provided by the drivers.
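
As an illustration of the kind of scalar you might log against a graph with "%" units, the following sketch measures the CPU usage of the current process around a unit of work, using only the Python standard library. The call that adds a value to a libpvti series is not shown in this section, so it is left as a hypothetical record parameter; here a plain list stands in for the series. The names cpu_percent, capture and step are also hypothetical.

```python
import time


def cpu_percent(work):
    """Run work() and return this process's CPU time as a percentage of wall time."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return 100.0 * cpu / wall if wall > 0 else 0.0


def capture(record, work, iterations=3):
    """Call record(value) once per iteration with the measured CPU percentage."""
    for _ in range(iterations):
        record(cpu_percent(work))


def step():
    # Hypothetical unit of work, for illustration only.
    sum(range(100_000))


values = []  # stand-in for a series; pass your series' add call instead
capture(values.append, step)
print(values)
```

Keeping the measurement separate from the recording call means the same sampling code can be tested without libpvti installed.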

3.4. Capturing tensor data for heatmaps

Underflow or overflow of model activations, weights and gradients when training a model can lead to instabilities and failures, or hinder convergence. With libpvti and the System Analyser, you can capture the distributions of these values and visualise changes in the distributions over time, which may help to identify underflow or overflow issues. You may also be able to identify opportunities to use a lower-precision floating point representation for some values. See Heatmaps of tensor data for more details.

Here’s how you can use libpvti to capture tensor data.

In Python:

import libpvti as pvti
import numpy as np

# Create heatmap
heatmapX = pvti.HeatmapDouble(
    "Heatmap X", np.linspace(-16, 16, 33).tolist(), "2^x"
)

# Within model iteration, capture tensor values

And in C++:

auto heatmapX = pvti::Heatmap<double>("Heatmap X", {
  -16.0, -15.0, -14.0, -13.0, -12.0, -11.0, -10.0, -9.0,
  -8.0, -7.0, -6.0, -5.0, -4.0, -3.0, -2.0, -1.0, 0.0,
  1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0,
  10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0
}, "2^x");

// Within model iteration, capture tensor values

You can add multiple heatmaps to a single trace file, and each will appear as a separate heatmap chart in the System Analyser.

When using an exponential scale, it can be helpful to remove zeroes from your list of values and take the absolute values of the rest before binning, because the logarithm is only defined for positive numbers.
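
The preprocessing described above can be sketched in plain Python as follows. This is a minimal illustration, not part of libpvti: the function names and sample values are hypothetical, and whether you pass raw or log-transformed values to a heatmap depends on how you set up its bins, which this section does not specify. The bin edges match the 33 edges from -16 to 16 used in the heatmap examples, and the counting step is only there so you can preview the distribution.

```python
import math
from bisect import bisect_right


def to_log2_magnitudes(values):
    """Drop zeroes, take absolute values, and return log2 of each magnitude."""
    return [math.log2(abs(v)) for v in values if v != 0]


def bin_counts(exponents, edges):
    """Count exponents per half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for e in exponents:
        # Values outside the bin range are ignored in this sketch.
        if edges[0] <= e < edges[-1]:
            counts[bisect_right(edges, e) - 1] += 1
    return counts


edges = [float(x) for x in range(-16, 17)]  # 33 edges, 32 bins
tensor = [0.0, -0.5, 0.25, 3.0, -1024.0, 1e-6]  # hypothetical tensor values

exponents = to_log2_magnitudes(tensor)
counts = bin_counts(exponents, edges)
print(sum(counts))  # 4: the zero is dropped and 1e-6 falls below 2^-16
```

Values that fall below the lowest bin edge, such as 1e-6 here, are a hint that the tensor may be close to underflow for the chosen range.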

When you’re ready to run the script, enable PVTI logging by setting the PVTI_OPTIONS environment variable as follows:

PVTI_OPTIONS='{"enable":"true"}' python3 my_script.py