2. PopVision Analysis library C++ API

#include <pva.hpp>

2.1. Compilation reports

class PoplarVersion

This class contains the version of Poplar that was used when generating the report.

Public Functions

PoplarVersion(const FileReaderPtr filereader)

Constructor.

Parameters

filereader – Pointer to the file reader

std::string string() const

Poplar version as a string.

std::string packageHash() const

Poplar Git hash.

uint32_t majorVer() const

Major version.

uint32_t minorVer() const

Minor version.

uint32_t pointVer() const

Point version.

Friends

friend std::ostream &operator<<(std::ostream &out, const PoplarVersion &obj)
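
The three numeric accessors above lend themselves to an ordered comparison, for example to check that a report was produced by a recent enough Poplar. The following is a minimal sketch only: `VersionTriple` and `isAtLeast` are hypothetical names that do not exist in the library, and the struct simply holds values that would be copied from `majorVer()`, `minorVer()` and `pointVer()`.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical stand-in holding the values returned by majorVer(),
// minorVer() and pointVer(); it is not part of the library.
struct VersionTriple {
  std::uint32_t majorVer;
  std::uint32_t minorVer;
  std::uint32_t pointVer;
};

// True when `v` is the same as or newer than `minimum`, comparing the
// major, minor and point numbers in that order.
inline bool isAtLeast(const VersionTriple &v, const VersionTriple &minimum) {
  return std::tie(v.majorVer, v.minorVer, v.pointVer) >=
         std::tie(minimum.majorVer, minimum.minorVer, minimum.pointVer);
}
```

With this sketch, `isAtLeast(VersionTriple{3, 1, 0}, VersionTriple{3, 0, 2})` is true because the minor number decides the comparison before the point number is considered.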
class ProfileReportVersion

This class contains the version of the report format that has been used.

Public Functions

ProfileReportVersion(const FileReaderPtr filereader)

Constructor.

Parameters

filereader – Pointer to the file reader

uint32_t majorVer() const

Major version.

uint32_t minorVer() const

Minor version.

uint32_t pointVer() const

Point version.

bool isUnstableFormat() const

Indicates that this report has development changes.

Friends

friend std::ostream &operator<<(std::ostream &out, const ProfileReportVersion &obj)
class InstrumentationSettings

This class contains information about the instrumentation settings used when compiling the graph.

Public Types

enum class ComputeInstrumentationLevel

The type of compute instrumentation.

Values:

enumerator Off
enumerator Vertex
enumerator Tile
enumerator Ipu
enumerator Device
enumerator Unknown
enum class ExternalExchangeInstrumentationLevel

The type of external exchange instrumentation.

Values:

enumerator Off
enumerator Tile
enumerator Unknown

Public Functions

InstrumentationSettings(const FileReaderPtr filereader)

Constructor.

Parameters

filereader – Pointer to the file reader

ComputeInstrumentationLevel compute() const

Compute instrumentation level.

ExternalExchangeInstrumentationLevel externalExchange() const

External exchange instrumentation level.

Friends

friend std::ostream &operator<<(std::ostream &out, const InstrumentationSettings &obj)
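
When logging or filtering reports it can help to turn the instrumentation level into text. The sketch below is an illustration under assumptions: it declares a local copy of the enumerators listed above so that it compiles on its own, and `levelName` is a hypothetical helper, not a library function. Real code would switch over the enum nested in InstrumentationSettings instead.

```cpp
#include <string>

// Local copy of the enumerators documented above, declared here only so
// the sketch is self-contained.
enum class ComputeInstrumentationLevel { Off, Vertex, Tile, Ipu, Device, Unknown };

// Hypothetical helper: maps each level to a printable name.
inline std::string levelName(ComputeInstrumentationLevel level) {
  switch (level) {
  case ComputeInstrumentationLevel::Off:     return "Off";
  case ComputeInstrumentationLevel::Vertex:  return "Vertex";
  case ComputeInstrumentationLevel::Tile:    return "Tile";
  case ComputeInstrumentationLevel::Ipu:     return "Ipu";
  case ComputeInstrumentationLevel::Device:  return "Device";
  case ComputeInstrumentationLevel::Unknown: return "Unknown";
  }
  return "Unknown"; // Fallback for values outside the listed enumerators.
}
```

The same pattern applies to ExternalExchangeInstrumentationLevel, which has only the Off, Tile and Unknown enumerators.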
class Graph

This class contains basic details about the graph.

Public Functions

Graph(const FileReaderPtr filereader)

Constructor.

Parameters
  • filereader – Pointer to the file reader

uint64_t numComputeSets() const

The number of compute sets.

uint64_t numEdges() const

The number of edges.

uint64_t numVars() const

The number of variables.

uint64_t numVertices() const

The number of vertices.

std::unordered_map<std::string, double> optimisationMetrics() const

Internal metrics of the optimisations performed on the graph.

The keys may change, but this will always be a map from strings to doubles.

Friends

friend std::ostream &operator<<(std::ostream &out, const Graph &obj)
class Target

This class contains information about the target hardware.

Public Types

enum class Type

The type of hardware.

Values:

enumerator Ipu
enumerator IpuModel
enumerator Cpu
enumerator Unknown

Public Functions

Target(const FileReaderPtr filereader)

Constructor.

Parameters

filereader – Pointer to the file reader

std::uint64_t numIPUs() const

The number of IPU chips in the system.

std::uint64_t tilesPerIpu() const

The number of tiles on each IPU chip.

std::uint64_t numTiles() const

The total number of tiles.

This is the product of numIPUs() and tilesPerIpu(). It is stored redundantly for convenience.

Bytes bytesPerTile() const

The number of bytes of memory on a tile.

Bytes bytesPerIpu() const

The number of bytes of memory on an IPU.

std::uint64_t totalMemory() const

The total memory.

This is the product of bytesPerTile() and numTiles() (or bytesPerIpu() and numIPUs()). It is stored redundantly for convenience.
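
The two redundant totals described above can be recomputed from their factors. This is a minimal sketch with made-up names: `TargetSummary`, `totalTiles` and `totalBytes` are hypothetical, and the struct holds plain integers that would be copied from the corresponding Target accessors (treating Bytes as a 64-bit count is an assumption).

```cpp
#include <cstdint>

// Hypothetical plain-data copy of three Target values; not a library type.
struct TargetSummary {
  std::uint64_t numIPUs;
  std::uint64_t tilesPerIpu;
  std::uint64_t bytesPerTile;
};

// Total tiles = IPUs * tiles per IPU.
inline std::uint64_t totalTiles(const TargetSummary &t) {
  return t.numIPUs * t.tilesPerIpu;
}

// Total memory = bytes per tile * total tiles.
inline std::uint64_t totalBytes(const TargetSummary &t) {
  return t.bytesPerTile * totalTiles(t);
}
```

For an illustrative (not real) system of 2 IPUs with 4 tiles of 1000 bytes each, this gives 8 tiles and 8000 bytes.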

double clockFrequency() const

The tile clock frequency in Hz.
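
Because the frequency is given in Hz, a cycle count from a report can be converted to wall-clock time by dividing by it. The helper below is a hypothetical sketch, not a library function, and it assumes the frequency passed in is non-zero.

```cpp
#include <cstdint>

// Converts a cycle count to seconds for a clock running at
// `clockFrequencyHz` cycles per second. The caller must pass a non-zero
// frequency.
inline double cyclesToSeconds(std::uint64_t cycles, double clockFrequencyHz) {
  return static_cast<double>(cycles) / clockFrequencyHz;
}
```

For example, 2000 cycles at an illustrative 1000 Hz clock correspond to 2 seconds.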

std::uint64_t numReplicas() const

The number of replicas.

std::uint64_t ipusPerReplica() const

The number of IPUs in a replica.

std::uint64_t tilesPerReplica() const

The number of tiles in a replica.

Bytes memoryPerReplica() const

The total memory in a replica.

Type type() const

The target type.

IPU::Architecture architecture() const

The IPU architecture.

uint64_t minSyncDelay() const

The minimum sync delay for any tile.

std::uint64_t supervisorInstrFetchDelay() const

The delay of a supervisor instruction fetch.

std::uint64_t interleavedMemoryStart() const

The start of interleaved memory.

std::vector<std::uint64_t> memoryElementOffsets() const

Offsets for each memory element (a memory element is an address range that does not allow simultaneous access).

std::vector<std::uint64_t> memoryRegionOffsets() const

Offsets for each memory region (a memory region is a set of contiguous memory elements with the same access characteristics).

std::uint64_t tileMemoryBaseAddress() const

The physical address where tile memory starts.

std::uint64_t tilesPerSuperTile() const

The number of tiles per supertile.

Will return 0 if this data is not present in the report

std::uint64_t numWorkerContexts() const

The number of worker contexts per tile.

Will return 0 if this data is not present in the report

Bytes exchangeBytesPerCycle() const

The bandwidth of internal IPU exchange in bytes per cycle.

Will return 0 if this data is not present in the report
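
Since many of these accessors return 0 when the data is missing from the report, any calculation that divides by them needs a guard. The sketch below shows one way to do that for a bandwidth value such as the one above. `transferCycles` is a hypothetical helper, the bandwidth is treated as a plain 64-bit integer (an assumption about how Bytes converts), and the result is only a rough lower bound that ignores all exchange overheads.

```cpp
#include <cstdint>

// Estimates the cycles needed to move `bytes` at `bytesPerCycle`, rounding
// up to a whole cycle. Returns false and leaves `cyclesOut` untouched when
// `bytesPerCycle` is 0, which the report uses to signal missing data.
inline bool transferCycles(std::uint64_t bytes, std::uint64_t bytesPerCycle,
                           std::uint64_t &cyclesOut) {
  if (bytesPerCycle == 0) {
    return false;
  }
  cyclesOut = bytes / bytesPerCycle + (bytes % bytesPerCycle != 0 ? 1 : 0);
  return true;
}
```

Returning a flag rather than a sentinel cycle count keeps "no data" distinct from a genuine result of 0 cycles.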

Bytes memcpyBytesPerCycle() const

The number of bytes per cycle that can be copied from one location to another using a memcpy.

Will return 0 if this data is not present in the report

Bytes instructionBytes() const

The size of an instruction in bytes.

Will return 0 if this data is not present in the report

bool supportsSuperTileSendReceive() const

Whether a tile in a supertile can use all the exchange bandwidth of the supertile to send or receive, when the other tile is idle or receiving the same data.

Will return false if this data is not present in the report

Cycles globalSyncCycles() const

The number of clock cycles required to synchronize all IPUs.

Will return 0 if this data is not present in the report

Bytes globalExchangePacketBytes() const

Size of the packet used to transfer data between tiles in bytes.

Will return 0 if this data is not present in the report

Cycles tileLocalSyncSyncDelay() const

Number of cycles from issuing a sync instruction to the earliest time that instructions can resume.

Will return 0 if this data is not present in the report

Cycles tileLocalSyncExitDelay() const

Number of cycles after a worker has issued its exit instruction that the supervisor can resume.

Will return 0 if this data is not present in the report

std::uint64_t numStrideBits() const

Number of stride bits.

Will return 0 if this data is not present in the report

std::uint64_t dataPathWidth() const

The width of the load/store data path within the tile.

Will return 0 if this data is not present in the report

std::uint64_t fp8ConvUnitMaxPipelineDepth() const

The maximum pipeline depth of the convolution units within the tile for fp8.

Will return 0 if this data is not present in the report

std::uint64_t fp16ConvUnitMaxPipelineDepth() const

The maximum pipeline depth of the convolution units within the tile for fp16.

Will return 0 if this data is not present in the report

std::uint64_t fp32ConvUnitMaxPipelineDepth() const

The maximum pipeline depth of the convolution units within the tile for fp32.

Will return 0 if this data is not present in the report

std::uint64_t fp8ConvUnitInputLoadElemsPerCycle() const

The number of input elements loaded per cycle for fp8 convolutions.

Will return 0 if this data is not present in the report

std::uint64_t fp16ConvUnitInputLoadElemsPerCycle() const

The number of input elements loaded per cycle for fp16 convolutions.

Will return 0 if this data is not present in the report

std::uint64_t fp32ConvUnitInputLoadElemsPerCycle() const

The number of input elements loaded per cycle for fp32 convolutions.

Will return 0 if this data is not present in the report

std::uint64_t fp16InFp16OutConvUnitsPerTile() const

The number of convolution units in the tile that can be used when partial results are output as 16 bits and inputs are 16 bits.

Will return 0 if this data is not present in the report

std::uint64_t fp16InFp32OutConvUnitsPerTile() const

The number of convolution units in the tile that can be used when partial results are output as 32 bits and inputs are 16 bits.

Will return 0 if this data is not present in the report

std::uint64_t fp32InFp32OutConvUnitsPerTile() const

The number of convolution units in the tile that can be used when accumulating to 32-bit values.

Will return 0 if this data is not present in the report

std::uint64_t fp8InFp16OutConvUnitsPerTile() const

The number of convolution units in the tile that can be used when partial results are output as 16 bits and inputs are 8 bits.

Will return 0 if this data is not present in the report

std::uint64_t convUnitCoeffLoadBytesPerCycle() const

The number of convolutional weights that can be loaded in a cycle.

Will return 0 if this data is not present in the report

Bytes workerInstrFetchDelay() const

The number of bytes ahead of the current PC from which a worker context may be loading instructions from memory.

Will return 0 if this data is not present in the report

std::uint64_t maxImmediateOffsetInRunInstr() const

The maximum range of the immediate operand in the run instruction. The zimm16 operand is implicitly multiplied by 4 when added to the register operand.

Will return 0 if this data is not present in the report

std::uint64_t rptCountMax() const

The maximum value of the repeat counter.

Will return 0 if this data is not present in the report

std::uint64_t atomicStoreGranularity() const

The atomic store granularity.

Will return 0 if this data is not present in the report

Friends

friend std::ostream &operator<<(std::ostream &out, const Target &obj)
class VertexInstances

Subclassed by pva::execution::VertexInstances

Public Functions

VertexInstances(const FileReaderPtr filereader, const VertexTypeId vertexTypeId, const uint32_t count, std::vector<uint64_t> countByTile, const Cycles estimatedCycles)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • vertexTypeId – The vertex type id

  • count – The number of instances of this vertex type

  • countByTile – The number of instances of this vertex type by tile (can be empty)

  • estimatedCycles – Estimated cycles for this vertex type

VertexType type() const

The type of vertex.

uint32_t count() const

The number of instances of a vertex type.

std::vector<uint64_t> countByTile() const

The number of instances of a vertex type by tile.

Cycles estimatedCycles() const

The estimated cycles for this vertex type.

Returns 0 if cycle estimates are not available.
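
The per-tile counts can be scanned to find where a vertex type is most concentrated. The following sketch is illustrative only: `busiestTile` is a hypothetical helper, and it assumes that the position of each entry in the vector corresponds to the tile id, which the text above does not state. The empty case is handled because the vector is documented as possibly empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the entry with the largest count (the first one if
// several tie), or -1 when the vector is empty.
inline std::ptrdiff_t busiestTile(const std::vector<std::uint64_t> &countByTile) {
  if (countByTile.empty()) {
    return -1;
  }
  return std::max_element(countByTile.begin(), countByTile.end()) -
         countByTile.begin();
}
```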

Friends

friend std::ostream &operator<<(std::ostream &out, const VertexInstances &obj)
class VertexType

This class represents a vertex type.

Public Types

enum class Source

Enum of how the vertex has been implemented.

Values:

enumerator Asm

Implemented in assembler.

enumerator CPlusPlus

Implemented in C++.

enumerator Unknown

Implementation unknown.

Public Functions

VertexType(const FileReaderPtr filereader, const VertexTypeId vertexId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • vertexId – The vertex type id

std::string name() const

The name of the vertex type.

Bytes size() const

The size of the vertex type in bytes.

Contains the size of the vertex state (the class members) of each vertex type. For example Doubler might have 4 bytes of state.

Source source() const

How the vertex type has been implemented.

bool operator==(const VertexType &other) const
bool operator!=(const VertexType &other) const

Friends

friend std::ostream &operator<<(std::ostream &out, const VertexType &obj)
class LoweredVariable

This class contains information about a lowered variable.

Public Functions

LoweredVariable(std::unique_ptr<Impl> impl)
LoweredVariable(const LoweredVariable &other)
LoweredVariable(LoweredVariable &&other)
~LoweredVariable()
LoweredVariableId _id() const

The id of this variable.

bool allocated() const

True if this variable has been allocated to a tile.

If not, the rest of the data in this class should be ignored.
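
In practice this means checking allocated() before reading any other field. The sketch below shows the pattern on a hypothetical stand-in struct, `VarSketch`, which copies just two values from a lowered variable. Neither the struct nor `allocatedBytes` exists in the library.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical copy of two LoweredVariable values; not a library type.
struct VarSketch {
  bool allocated;
  std::uint64_t bytes;
};

// Sums the sizes of allocated variables only. Unallocated entries are
// skipped because their remaining fields carry no meaning.
inline std::uint64_t allocatedBytes(const std::vector<VarSketch> &vars) {
  std::uint64_t total = 0;
  for (const VarSketch &v : vars) {
    if (!v.allocated) {
      continue;
    }
    total += v.bytes;
  }
  return total;
}
```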

TileId tileId() const

The tile this variable is on.

std::string name() const

The name of this lowered variable.

std::uint64_t bytes() const

The size in bytes of the variable.

std::uint64_t alignment() const

The alignment in bytes.

Normally at least 4 bytes.

std::uint64_t offset() const

The offset within the tile’s variable storage area.

VariableCategory category() const

Category for the data stored in the variable.

bool constant() const

True if this variable stores constant data that is never written to.

bool alwaysLive() const

If true, this variable can never overlap with others.

Other variables may end up always live as a result of their usage, but variables flagged here are known to be always live and are discounted from liveness analysis and reporting.

bool executable() const

If true, this variable needs to be allocated in an executable region of memory.

bool vertexCode() const

If true, this variable contains vertex code.

bool lowest256KBytes() const

If true, this variable contains vertex state that should be in the first 256KB of memory so that it can be addressed via the run assembly instruction.

bool startSection() const

If true, this variable contains an entrypoint that may need special allocation.

bool inInterleavedMem() const

If true, this variable must be stored in interleaved memory.

bool accessedInVertex() const

Special flag for variables that may be accessed in a vertex.

EquivalenceClass equivalenceClass() const

The equivalence class of this variable.

Variables with the same equivalence class have the same liveness profile.

List<LoweredVariable> elementConstraints() const

The lowered variables that cannot be stored in the same memory element as this variable.

List<LoweredVariable> regionConstraints() const

The lowered variables that cannot be stored in the same memory region as this variable.

bool hasUnloweredVar() const

True if this variable comes from an unlowered var whose id can be retrieved.

Variable unloweredVar() const

The unlowered variable that originated this variable.

Note: throws std::runtime_error if hasUnloweredVar() == false.
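
The check-then-call pattern implied by the note can be sketched as follows. `FakeLoweredVariable` is a hypothetical stand-in that mimics only the documented contract (it throws std::runtime_error when there is no unlowered variable) and returns a name string rather than a Variable so the example stays self-contained. `sourceNameOr` is likewise made up for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in reproducing the hasUnloweredVar()/unloweredVar()
// contract described above; not a library type.
struct FakeLoweredVariable {
  bool hasSource;
  std::string sourceName;

  bool hasUnloweredVar() const { return hasSource; }

  std::string unloweredVar() const {
    if (!hasSource) {
      throw std::runtime_error("no unlowered variable");
    }
    return sourceName;
  }
};

// Calls unloweredVar() only after hasUnloweredVar() confirms it is safe,
// so the exception path is never taken.
inline std::string sourceNameOr(const FakeLoweredVariable &v,
                                const std::string &fallback) {
  return v.hasUnloweredVar() ? v.unloweredVar() : fallback;
}
```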

bool operator==(const LoweredVariable &other) const
bool operator!=(const LoweredVariable &other) const

Friends

friend std::ostream &operator<<(std::ostream &out, const LoweredVariable &obj)
class LoweredVariables

This class allows access to lowered variables by tile and by unlowered variable.

Public Functions

inline LoweredVariables(const FileReaderPtr filereader)
List<LoweredVariable> forTile(TileId tileId) const
List<LoweredVariable> forVar(Variable var) const
std::vector<LoweredVariableBrief> allBriefVars() const

Brief information for each of the lowered variables.

Note this vector may be very long.

Friends

friend std::ostream &operator<<(std::ostream &out, const LoweredVariables &obj)
class VertexMemory

Public Functions

VertexMemory(const FileReaderPtr filereader, const VertexTypeId vertexTypeId, const TileId tileId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • vertexTypeId – The vertex type id

  • tileId – The tile id

VertexType type() const

The type of vertex.

Bytes codeBytes() const

The amount of code.

Bytes copyPtrBytes() const

The amount of copy pointers.

Bytes descriptorBytes() const

The amount of descriptors.

Bytes edgePtrBytes() const

The amount of edge pointers.

Bytes paddingBytes() const

The amount of padding.

Bytes vertexDataBytes() const

The amount of vertex data.

Friends

friend std::ostream &operator<<(std::ostream &out, const VertexMemory &obj)
class NotAlwaysLiveMemory

Not-always-live memory info for a given step.

Public Functions

inline NotAlwaysLiveMemory(const Bytes bytes, const List<VariableSize> variables)
inline Bytes bytes() const

The total not always live bytes in this step.

inline List<VariableSize> variables() const

List of not always live variables in this step.

Friends

friend std::ostream &operator<<(std::ostream &out, const NotAlwaysLiveMemory &obj)
class MemoryOverlap

This class represents how much memory within a category or region is overlapped or not overlapped.

The memory used by some variables can be overlapped with others, because they are not live at the same time. Hence, the usage is split into overlapped and nonOverlapped components.

Public Functions

inline MemoryOverlap(const Bytes nonOverlapped, const Bytes overlapped)

Constructor.

Parameters
  • nonOverlapped – nonOverlapped memory

  • overlapped – overlapped memory

inline Bytes nonOverlapped() const

The memory not overlapped.

inline Bytes overlapped() const

The memory overlapped.

inline Bytes total() const

The sum of overlapped and not overlapped memory.

Friends

friend std::ostream &operator<<(std::ostream &out, const MemoryOverlap &obj)
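
The relationship between the two components and the total can be sketched with a small stand-in. `OverlapSketch` and `overlappedFraction` are hypothetical names, and the byte counts are modelled as plain 64-bit integers, which is an assumption about the Bytes type.

```cpp
#include <cstdint>

// Hypothetical stand-in mirroring the nonOverlapped/overlapped/total
// relationship described above; not a library type.
struct OverlapSketch {
  std::uint64_t nonOverlapped;
  std::uint64_t overlapped;

  // The total is simply the sum of the two components.
  std::uint64_t total() const { return nonOverlapped + overlapped; }
};

// Fraction of the total that is overlapped, or 0 when there is no memory.
inline double overlappedFraction(const OverlapSketch &m) {
  const std::uint64_t total = m.total();
  if (total == 0) {
    return 0.0;
  }
  return static_cast<double>(m.overlapped) / static_cast<double>(total);
}
```

With 300 non-overlapped and 100 overlapped bytes, the total is 400 and a quarter of it is overlapped.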
class CategoryMemory

This class represents the breakdown of memory by region for a category.

Category is a breakdown of memory usage across the whole system by the type of data, and the region it is in.

There are two memory regions on each tile, interleaved and non-interleaved, and the use of each is reported separately. If the memory requirement is greater than the available memory, then this is reported as overflowed.

Public Functions

inline CategoryMemory(const MemoryOverlap interleaved, const MemoryOverlap nonInterleaved, MemoryOverlap overflowed)

Constructor.

Parameters
  • interleaved – interleaved memory region

  • nonInterleaved – non-interleaved memory region

  • overflowed – overflowed memory region

inline MemoryOverlap interleaved() const

The interleaved memory region.

inline MemoryOverlap nonInterleaved() const

The non-interleaved memory region.

inline MemoryOverlap overflowed() const

The overflowed memory region.

inline Bytes total() const

The sum of interleaved, non-interleaved, and overflowed memory.

Friends

friend std::ostream &operator<<(std::ostream &out, const CategoryMemory &obj)
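
A category's total and its overflow status follow directly from the three regions. The sketch below uses hypothetical stand-ins (`RegionBytes`, `CategorySketch`, `categoryTotal`, `doesNotFit`) with plain integer byte counts, and treats any non-zero overflowed amount as meaning the category did not fit, which is an interpretation of the description above.

```cpp
#include <cstdint>

// Hypothetical stand-ins for one region and for a category made of three
// regions; not library types.
struct RegionBytes {
  std::uint64_t nonOverlapped;
  std::uint64_t overlapped;
};

struct CategorySketch {
  RegionBytes interleaved;
  RegionBytes nonInterleaved;
  RegionBytes overflowed;
};

inline std::uint64_t regionTotal(const RegionBytes &r) {
  return r.nonOverlapped + r.overlapped;
}

// Sum of the interleaved, non-interleaved and overflowed regions.
inline std::uint64_t categoryTotal(const CategorySketch &c) {
  return regionTotal(c.interleaved) + regionTotal(c.nonInterleaved) +
         regionTotal(c.overflowed);
}

// True when some of the category's memory was reported as overflowed.
inline bool doesNotFit(const CategorySketch &c) {
  return regionTotal(c.overflowed) > 0;
}
```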
class Categories

Public Functions

Categories(const FileReaderPtr filereader, const TileId tileid)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • tileId – The ID of the tile

inline CategoryMemory constant() const

Constants memory category.

Constants added by the user. Variables added by the compiler that happen to be constant will be categorised as “variable”.

inline CategoryMemory controlCode() const

Control code memory category.

Code for Program objects and running compute sets.

inline CategoryMemory controlId() const

Control id memory category.

Variables that are used in switch programs or variables that store a sync ID for tracking host/device synchronisation points.

inline CategoryMemory controlTable() const

Control table memory category.

A table that lists the vertices to run in each compute set. Only used if the table scheduler is enabled.

inline CategoryMemory copyDescriptor() const

Copy descriptor memory category.

Copy descriptors are special variable-sized fields used by copy vertices.

inline CategoryMemory globalExchangeCode() const

Global exchange code memory category.

Code for performing exchange operations between IPUs.

inline CategoryMemory globalExchangePacketHeader() const

Global exchange packet header category.

Packet headers for exchange operations between IPUs.

inline CategoryMemory globalMessage() const

Global message memory category.

Message variables holding data being sent between IPUs.

inline CategoryMemory hostExchangeCode() const

Host exchange code memory category.

Code for performing exchange operations to and from the host.

inline CategoryMemory hostExchangePacketHeader() const

Host exchange packet header memory category.

Data used as packet headers for host exchange.

inline CategoryMemory hostMessage() const

Host message memory category.

Message variables holding data being sent or received from the host.

inline CategoryMemory instrumentationResults() const

Instrumentation results memory category.

Storage for profiling information.

inline CategoryMemory internalExchangeCode() const

Internal exchange code memory category.

Code for performing internal exchanges.

inline CategoryMemory message() const

Message memory category.

Message data for internal exchanges.

inline CategoryMemory multiple() const

Multiple memory category.

Space shared by variables from multiple different categories.

inline CategoryMemory outputEdge() const

Output edge memory category.

Storage for output edge data before an exchange takes place.

inline CategoryMemory rearrangement() const

Rearrangement memory category.

Variables holding rearranged versions of tensor data. A rearranged variable will never be always live as it is only required in the context of a specific compute set.

inline CategoryMemory sharedCodeStorage() const

Shared code storage memory category.

Code shared by vertices.

inline CategoryMemory sharedDataStorage() const

Shared data storage memory category.

Data shared by vertices.

inline CategoryMemory stack() const

Stack memory category.

The worker and supervisor stacks allocated on the specified tile. For more information about worker stack allocation see the Vertex Assembly Programming Guide.

inline CategoryMemory variable() const

Variable memory category.

Variables created in the program (for example, created by the Poplar graph.addVariable() function).

inline CategoryMemory vectorListDescriptor() const

Vector list descriptor memory category.

The data for VectorList<Input<…>, DeltaN> fields.

inline CategoryMemory vertexCode() const

Vertex code memory category.

Code for vertex functions (codelets).

inline CategoryMemory vertexFieldData() const

Vertex field data memory category.

Variable-sized fields, e.g. the data for Vector<float>, Vector<Input<…>> and InputSet<…> fields.

inline CategoryMemory vertexInstanceState() const

Vertex instance state memory category.

An instance of a Vertex class object. This will be sizeof(VertexName) for each vertex.

inline CategoryMemory dwarf() const

DWARF memory category.

DWARF debugging information.

Friends

friend std::ostream &operator<<(std::ostream &out, const Categories &obj)
class MemoryWithAndWithoutGaps

Public Functions

inline MemoryWithAndWithoutGaps(const Bytes excludingGaps, const Bytes includingGaps)

Constructor.

Parameters
  • excludingGaps – Memory excluding gaps

  • includingGaps – Memory including gaps

inline Bytes excludingGaps() const

The memory excluding gaps.

inline Bytes includingGaps() const

The memory including gaps.

Friends

friend std::ostream &operator<<(std::ostream &out, const MemoryWithAndWithoutGaps &obj)
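
The difference between the two figures is the memory lost to gaps. The helper below is a hypothetical sketch over plain integer byte counts (an assumption about the Bytes type). It clamps to 0 in case the including-gaps figure is ever smaller than the excluding-gaps figure.

```cpp
#include <cstdint>

// Bytes occupied by gaps: the including-gaps figure minus the
// excluding-gaps figure, clamped at 0 to avoid unsigned wrap-around.
inline std::uint64_t gapBytes(std::uint64_t excludingGaps,
                              std::uint64_t includingGaps) {
  return includingGaps >= excludingGaps ? includingGaps - excludingGaps : 0;
}
```

For instance, 900 bytes excluding gaps and 1000 bytes including gaps means 100 bytes of gaps.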
class Memory

This class represents the memory layout of a tile in various different ways.

The memory object contains a lot of information about memory use. All memory is statically allocated so you don’t need to run the program to gather this data.

Public Functions

Memory(const FileReaderPtr filereader, const TileId &tileId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • tileId – The ID of the tile

Bytes alwaysLiveBytes() const
Bytes notAlwaysLiveBytes() const
MemoryWithAndWithoutGaps nonInterleaved() const

Details of memory within the non-interleaved region for this tile.

MemoryWithAndWithoutGaps interleaved() const

Details of memory within the interleaved region for this tile.

MemoryWithAndWithoutGaps overflowed() const

Details of memory within the overflowed region for this tile.

MemoryWithAndWithoutGaps total() const

Details of the total memory usage for this tile.

Categories category() const

Details of memory by category.

virtual List<VertexMemory> vertices() const

List of vertex memory usage on this tile.

Friends

friend std::ostream &operator<<(std::ostream &out, const Memory &obj)
class Tile

Public Functions

Tile(const FileReaderPtr filereader, const TileId tileId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • tileId – The ID of the tile

inline TileId tileId() const

The software tile ID.

Memory memory() const

Details of memory usage by this tile.

uint64_t relativeSyncDelay() const

The sync delay for this tile (relative to the minimum value).
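
Because this value is relative to the minimum, an absolute figure can presumably be recovered by adding the minimum sync delay reported by Target::minSyncDelay(). The helper below is a hypothetical sketch of that reading. The text above does not spell out the addition, so treat it as an assumption.

```cpp
#include <cstdint>

// Absolute sync delay for a tile, assuming it equals the target-wide
// minimum plus the tile's relative delay.
inline std::uint64_t absoluteSyncDelay(std::uint64_t minSyncDelay,
                                       std::uint64_t relativeSyncDelay) {
  return minSyncDelay + relativeSyncDelay;
}
```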

double clockFrequency() const

The tile clock frequency in Hz.

Friends

friend std::ostream &operator<<(std::ostream &out, const Tile &obj)

Stack memory category.

The worker and supervisor stacks allocated on the specified tile. For more information about worker stack allocation see the Vertex Assembly Programming Guide.

inline CategoryMemory variable() const

Variable memory category.

Variables created in the program (for example, created by the Poplar graph.addVariable() function).

inline CategoryMemory vectorListDescriptor() const

Vector list descriptor memory category.

The data for VectorList<Input<…>, DeltaN> fields.

inline CategoryMemory vertexCode() const

Vertex code memory category.

Code for vertex functions (codelets).

inline CategoryMemory vertexFieldData() const

Vertex field data memory category.

Variable-sized fields, e.g. the data for Vector<float>, Vector<Input<…>> and InputSet<…> fields.

inline CategoryMemory vertexInstanceState() const

Vertex instance state memory category.

An instance of a Vertex class object. This will be sizeof(VertexName) for each vertex.

inline CategoryMemory dwarf() const

DWARF memory category.

DWARF debugging information.

Friends

friend std::ostream &operator<<(std::ostream &out, const Categories &obj)
class MemoryOverlap

This class represents how much memory within a category or region is overlapped or not overlapped.

The memory used by some variables can be overlapped with others, because they are not live at the same time. Hence, the usage is split into overlapped and nonOverlapped components.

Public Functions

inline MemoryOverlap(const Bytes nonOverlapped, const Bytes overlapped)

Constructor.

Parameters
  • nonOverlapped – nonOverlapped memory

  • overlapped – overlapped memory

inline Bytes nonOverlapped() const

The memory not overlapped.

inline Bytes overlapped() const

The memory overlapped.

inline Bytes total() const

The sum of overlapped and not overlapped memory.

Friends

friend std::ostream &operator<<(std::ostream &out, const MemoryOverlap &obj)
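The condition for overlapping described above (two variables are never live at the same time) can be illustrated with a minimal, self-contained sketch. It does not use pva.hpp: `LiveRange` and `canOverlap` are hypothetical names introduced only for illustration, and modelling liveness as a half-open range of program steps is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of when a variable is live, as a half-open
// range of program steps [firstStep, endStep). Not part of the pva API.
struct LiveRange {
  std::size_t firstStep; // first program step at which the variable is live
  std::size_t endStep;   // one past the last step at which it is live
};

// Two variables can share (overlap) memory only if they are never live at
// the same time, that is, if their live ranges do not intersect.
inline bool canOverlap(const LiveRange &a, const LiveRange &b) {
  assert(a.firstStep <= a.endStep && b.firstStep <= b.endStep);
  return a.endStep <= b.firstStep || b.endStep <= a.firstStep;
}
```

In this sketch, memory for variables whose ranges intersect would count towards the nonOverlapped figure.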
class MemoryWithAndWithoutGaps

Public Functions

inline MemoryWithAndWithoutGaps(const Bytes excludingGaps, const Bytes includingGaps)

Constructor.

Parameters
  • excludingGaps – Memory excluding gaps

  • includingGaps – Memory including gaps

inline Bytes excludingGaps() const

The memory excluding gaps.

inline Bytes includingGaps() const

The memory including gaps.

Friends

friend std::ostream &operator<<(std::ostream &out, const MemoryWithAndWithoutGaps &obj)
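The two accessors can be combined to see how many bytes of a region are taken up by gaps. The following is a minimal sketch that does not include pva.hpp: `GapReport` is a hypothetical stand-in that mirrors only the two documented accessors, `gapBytes` is an illustrative helper, and `Bytes` is assumed to behave like an unsigned 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: Bytes behaves like an unsigned 64-bit integer.
using Bytes = std::uint64_t;

// Hypothetical stand-in mirroring the two documented accessors of
// MemoryWithAndWithoutGaps; it is not the library class.
class GapReport {
public:
  GapReport(Bytes excludingGaps, Bytes includingGaps)
      : excluding_(excludingGaps), including_(includingGaps) {}
  Bytes excludingGaps() const { return excluding_; }
  Bytes includingGaps() const { return including_; }

private:
  Bytes excluding_;
  Bytes including_;
};

// Bytes occupied by gaps alone: the difference between the two figures.
inline Bytes gapBytes(const GapReport &memory) {
  // The figure that includes gaps cannot be smaller than the one without.
  assert(memory.includingGaps() >= memory.excludingGaps());
  return memory.includingGaps() - memory.excludingGaps();
}
```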
class IPU

This class represents details of a single IPU.

Public Functions

List<Tile> tiles() const

List of tiles on this IPU.

Architecture architecture() const

The architecture type of this IPU.

enum class pva::IPU::Architecture

Enum of the different IPU architectures.

Values:

enumerator Ipu1

IPU Mk1

enumerator Ipu2

IPU Mk2

enumerator Ipu21

IPU Mk21

enumerator Ipu30
enumerator Unknown

Unknown

class Replica

This class represents details of a single replica.

Public Functions

Replica(const FileReaderPtr filereader, const ReplicaId replicaId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • replicaId – The ID of the replica

inline ReplicaId replicaId() const

The replica id.

List<IPU> ipus() const

List of IPUs on this replica.

Friends

friend std::ostream &operator<<(std::ostream &out, const Replica &obj)
class ComputeSet

This class represents details of a single compute set.

Subclassed by pva::execution::ComputeSet

Public Functions

ComputeSet(const FileReaderPtr filereader, const ComputeSetId computeSetId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • computeSetId – The ID of the compute set

std::string name() const

The name of the compute set.

List<VertexInstances> vertices() const

The vertices used by this compute set.

std::vector<Cycles> estimatedCyclesByTile() const

Estimated cycles by tile for this compute set.

Returns an empty vector if cycle estimates are unavailable.

std::vector<LoweredVariable> vars() const

The lowered variables used by this compute set.

std::vector<LoweredVariableId> varIds() const

The lowered variable ids used by this compute set.

std::vector<DebugContext> debugContexts() const

List of debug contexts of this compute set (may be empty).

Friends

friend std::ostream &operator<<(std::ostream &out, const ComputeSet &obj)
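Because estimatedCyclesByTile() returns an empty vector when cycle estimates are unavailable, callers should check for that case before reducing the values. The sketch below works on a plain vector instead of a real ComputeSet. It assumes that `Cycles` is an unsigned 64-bit integer and that the vector is indexed by tile; `hasEstimates` and `slowestTile` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Assumption: Cycles behaves like an unsigned 64-bit integer.
using Cycles = std::uint64_t;

// True if the report contained cycle estimates for this compute set.
inline bool hasEstimates(const std::vector<Cycles> &byTile) {
  return !byTile.empty();
}

// Index of the entry with the largest estimate (the first one on a tie).
// Callers must check hasEstimates() first.
inline std::size_t slowestTile(const std::vector<Cycles> &byTile) {
  assert(hasEstimates(byTile));
  const auto slowest = std::max_element(byTile.begin(), byTile.end());
  return static_cast<std::size_t>(std::distance(byTile.begin(), slowest));
}
```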
class Program

This class is the base class for all program types.

Subclassed by pva::BlockProgram, pva::CallProgram, pva::CodeCopyProgram, pva::DoExchangeProgram, pva::GetGlobalConsensusProgram, pva::GlobalExchangeProgram, pva::IfElseProgram, pva::OnEveryTileSwitchProgram, pva::OnTileExecuteProgram, pva::OnTileSwitchProgram, pva::RepeatProgram, pva::RepeatWhileProgram, pva::SansProgram, pva::SequenceProgram, pva::SetLocalConsensusFromVarProgram, pva::SetLocalConsensusProgram, pva::StreamCopyBeginProgram, pva::StreamCopyEndProgram, pva::StreamCopyMidProgram, pva::SyncAnsProgram, pva::SyncProgram, pva::UnknownProgram, pva::WriteUndefProgram

Public Types

enum class Type

Type of program.

Values:

enumerator Unknown
enumerator Sequence
enumerator OnTileExecute
enumerator Repeat
enumerator RepeatWhile
enumerator OnTileSwitch
enumerator OnEveryTileSwitch
enumerator IfElse
enumerator DoExchange
enumerator GlobalExchange
enumerator StreamCopyBegin
enumerator StreamCopyMid
enumerator StreamCopyEnd
enumerator WriteUndef
enumerator Sync
enumerator SetLocalConsensus
enumerator SetLocalConsensusFromVar
enumerator GetGlobalConsensus
enumerator Sans
enumerator Call
enumerator SyncAns
enumerator CodeCopy
enumerator Block
enumerator Execute
enumerator UnloweredWhile
enumerator UnloweredExecuteDst
enumerator UnloweredSwitch
enumerator OnTileCopy
enumerator Copy
enumerator CrossReplicaCopy
enumerator StreamCopy
enumerator InstrumentationPlaceholder
enum class SyncType

Type of synchronisation.

Internal means all tiles on the current IPU. GS1, GS2, GS3 and GS4 are group syncs that can span multiple IPUs. Poplar determines which IPUs are in each group, and this is subject to change. Unknown means the sync type was not recorded.

Values:

enumerator Internal
enumerator GS1
enumerator GS2
enumerator GS3
enumerator GS4
enumerator Unknown

Public Functions

Program(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual ~Program() = default

Destructor.

virtual std::string name() const

The name of the program.

May be empty.

std::vector<DebugContext> debugContexts() const

List of debug contexts of the program.

May be empty.

virtual List<std::shared_ptr<Program>> children() const

List of child programs.

The list of child programs may be empty.

virtual Type type() const = 0

The type of the program.

inline virtual std::vector<LoweredVariable> vars() const

The lowered variables used by this program.

std::vector<std::uint64_t> controlCodeByTile() const

The size in bytes of the control code of this program and its children (if any) for each tile.

May be empty for old reports.

virtual void accept(ProgramVisitor &visitor) const = 0

The visitor pattern accept method.

The visitor pattern allows you to handle a program differently depending on its type.

inline bool operator==(const Program &other) const
inline bool operator!=(const Program &other) const
inline ProgramId _id() const

The id of the program.

Friends

friend std::ostream &operator<<(std::ostream &out, const Program &obj)
friend std::ostream &operator<<(std::ostream &out, const Program::Type &type)
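Since children() returns a list of shared pointers to further programs, a program and its descendants form a tree that can be walked recursively. The sketch below shows a depth-first (pre-order) walk. It does not include pva.hpp: `ProgramNode` is a hypothetical stand-in that exposes only name() and children(), and `addChild`, `collectNames`, `namesDepthFirst` and `makeSampleTree` exist only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a program: only name() and children() mirror
// the documented interface. It is not the library class.
class ProgramNode {
public:
  explicit ProgramNode(std::string name) : name_(std::move(name)) {}
  const std::string &name() const { return name_; }
  const std::vector<std::shared_ptr<ProgramNode>> &children() const {
    return children_;
  }
  void addChild(std::shared_ptr<ProgramNode> child) {
    assert(child != nullptr);
    children_.push_back(std::move(child));
  }

private:
  std::string name_;
  std::vector<std::shared_ptr<ProgramNode>> children_;
};

// Pre-order walk: record the node itself, then each child subtree in order.
inline void collectNames(const ProgramNode &node,
                         std::vector<std::string> &out) {
  out.push_back(node.name());
  for (const auto &child : node.children()) {
    collectNames(*child, out);
  }
}

inline std::vector<std::string> namesDepthFirst(const ProgramNode &root) {
  std::vector<std::string> names;
  collectNames(root, names);
  return names;
}

// Sample tree: Sequence -> { Repeat -> { OnTileExecute }, Sync }.
inline std::shared_ptr<ProgramNode> makeSampleTree() {
  auto root = std::make_shared<ProgramNode>("Sequence");
  auto repeat = std::make_shared<ProgramNode>("Repeat");
  repeat->addChild(std::make_shared<ProgramNode>("OnTileExecute"));
  root->addChild(repeat);
  root->addChild(std::make_shared<ProgramNode>("Sync"));
  return root;
}
```

For the sample tree, the walk visits Sequence, Repeat, OnTileExecute and Sync, in that order.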
class UnknownProgram : public pva::Program

UnknownProgram details.

Public Functions

UnknownProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class SequenceProgram : public pva::Program

SequenceProgram details.

Public Functions

SequenceProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class OnTileExecuteProgram : public pva::Program

OnTileExecuteProgram details.

Public Functions

OnTileExecuteProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual std::string name() const override

The name of the program.

The name of this program is based on the name of the compute set.

virtual Type type() const override

The type of the program.

ComputeSet computeset() const

The compute set this program uses.

virtual std::vector<LoweredVariable> vars() const override

The lowered variables used by this on tile execute.

class RepeatProgram : public pva::Program

RepeatProgram details.

Public Functions

RepeatProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

bool hasRepeatCount() const

Return whether this Repeat program has a hard-coded repeat count.

This method will return false for older reports which do not contain information about repeat counts.

uint32_t repeatCount() const

Return how many times this Repeat program repeats.

This method will return 0 for older reports which do not contain information about repeat counts or if this Repeat program does not have a hard-coded repeat count.
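A return value of 0 from repeatCount() is ambiguous on its own, so it is safer to consult hasRepeatCount() first. The following minimal sketch shows that check. It does not include pva.hpp: `RepeatInfo` is a hypothetical stand-in mirroring the two documented accessors, and `describeRepeat` is an illustrative helper.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in mirroring hasRepeatCount() and repeatCount();
// it is not the library class.
class RepeatInfo {
public:
  RepeatInfo(bool hasRepeatCount, std::uint32_t repeatCount)
      : hasCount_(hasRepeatCount), count_(repeatCount) {
    // As documented, the count is 0 whenever no hard-coded count exists.
    assert(hasCount_ || count_ == 0);
  }
  bool hasRepeatCount() const { return hasCount_; }
  std::uint32_t repeatCount() const { return count_; }

private:
  bool hasCount_;
  std::uint32_t count_;
};

// Only interpret repeatCount() when hasRepeatCount() is true; otherwise a
// 0 just means that the information is missing.
inline std::string describeRepeat(const RepeatInfo &repeat) {
  if (!repeat.hasRepeatCount()) {
    return "repeat count not recorded";
  }
  return "repeats " + std::to_string(repeat.repeatCount()) + " time(s)";
}
```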

class RepeatWhileProgram : public pva::Program

RepeatWhileProgram details.

Public Functions

RepeatWhileProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class OnTileSwitchProgram : public pva::Program

OnTileSwitchProgram details.

Public Functions

OnTileSwitchProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class OnEveryTileSwitchProgram : public pva::Program

OnEveryTileSwitchProgram details.

Public Functions

OnEveryTileSwitchProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class IfElseProgram : public pva::Program

IfElseProgram details.

Public Functions

IfElseProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class DoExchangeProgram : public pva::Program

DoExchangeProgram details.

Public Functions

DoExchangeProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

std::vector<Bytes> bytesReceivedByTile() const

The number of bytes received on each tile.

std::vector<Bytes> bytesSentByTile() const

The number of bytes sent on each tile.

std::vector<uint64_t> estimatedCyclesByTile() const

The estimated number of cycles used to execute this program.

std::vector<Bytes> codeBytesByTile() const

The size of the code required for this program on each tile.

virtual std::vector<LoweredVariable> vars() const override

The lowered variables used by this exchange.

class GlobalExchangeProgram : public pva::Program

GlobalExchangeProgram details.

Public Functions

GlobalExchangeProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

std::vector<uint64_t> bytesReceivedByTile() const

The number of bytes received on each tile.

std::vector<uint64_t> bytesSentByTile() const

The number of bytes sent on each tile.

std::vector<uint64_t> estimatedCyclesByTile() const

The estimated number of cycles used to execute this program.

std::vector<uint64_t> exchangeCyclesByTile() const

The number of exchange cycles for this program.

std::vector<uint64_t> syncCyclesByTile() const

The number of sync cycles for this program.

SyncType syncType() const

The type of synchronisation used by this program.

virtual std::vector<LoweredVariable> vars() const override

The lowered variables used by this exchange.

class StreamCopyBeginProgram : public pva::Program

StreamCopyBeginProgram details.

Public Functions

StreamCopyBeginProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class StreamCopyMidProgram : public pva::Program

StreamCopyMidProgram details.

Public Types

enum class StreamCopyType

The source or destination of the stream copy.

Values:

enumerator Host
enumerator RemoteBuffer
enumerator Mixed
enumerator Unknown

Public Functions

StreamCopyMidProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

std::vector<uint64_t> bytesReceivedByTile() const

The number of bytes received on each tile.

std::vector<uint64_t> bytesSentByTile() const

The number of bytes sent on each tile.

std::vector<uint64_t> estimatedCyclesByTile() const

The estimated number of cycles used to execute this program.

virtual std::vector<LoweredVariable> vars() const override

The lowered variables used by this exchange.

StreamCopyType streamCopyType() const
class StreamCopyEndProgram : public pva::Program

StreamCopyEndProgram details.

Public Functions

StreamCopyEndProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class WriteUndefProgram : public pva::Program

WriteUndefProgram details.

Public Functions

WriteUndefProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class SyncProgram : public pva::Program

SyncProgram details.

Subclassed by pva::ImplicitSyncProgram

Public Functions

SyncProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual std::string name() const override

The name of this program.

virtual Type type() const override

The type of the program.

SyncType syncType() const

The type of synchronisation for this program.

class ImplicitSyncProgram : public pva::SyncProgram

ImplicitSyncProgram details.

This program represents the implicit syncs that are added to the execution trace. It will always be an Internal Sync.

Public Functions

ImplicitSyncProgram(const FileReaderPtr filereader, const std::string &name)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • name – The name of this implicit sync

virtual std::string name() const override

The name of this program.

virtual List<std::shared_ptr<Program>> children() const override

Will always be an empty list.

class SetLocalConsensusProgram : public pva::Program

SetLocalConsensusProgram details.

Public Functions

SetLocalConsensusProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class SetLocalConsensusFromVarProgram : public pva::Program

SetLocalConsensusFromVarProgram details.

Public Functions

SetLocalConsensusFromVarProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class GetGlobalConsensusProgram : public pva::Program

GetGlobalConsensusProgram details.

Public Functions

GetGlobalConsensusProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

class SansProgram : public pva::Program

SansProgram details.

Public Functions

SansProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

uint64_t numTiles() const

Number of tiles involved in this Sans program.

class CallProgram : public pva::Program

CallProgram details.

Public Functions

CallProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

std::shared_ptr<Program> target() const

The program that this call program invokes.

class SyncAnsProgram : public pva::Program

SyncAnsProgram details.

The non-participatory sync.

Public Functions

SyncAnsProgram(const FileReaderPtr filereader, const ProgramId programId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • programId – The ID of the program

virtual void accept(ProgramVisitor &visitor) const override

The visitor pattern accept method.

virtual Type type() const override

The type of the program.

uint64_t numTiles() const

Number of tiles involved in this SyncAns program.

class ProgramVisitor

ProgramVisitor interface.

It is expected that users would create a subclass of this class and then visit the programs.

Public Functions

virtual void visitUnknown(const UnknownProgram &unknown)
virtual void visitSequence(const SequenceProgram &sequence)
virtual void visitOnTileExecute(const OnTileExecuteProgram &onTileExecute)
virtual void visitRepeat(const RepeatProgram &repeat)
virtual void visitRepeatWhile(const RepeatWhileProgram &repeatWhile)
virtual void visitOnTileSwitch(const OnTileSwitchProgram &onTileSwitch)
virtual void visitOnEveryTileSwitch(const OnEveryTileSwitchProgram &onEveryTileSwitch)
virtual void visitIfElse(const IfElseProgram &ifElse)
virtual void visitDoExchange(const DoExchangeProgram &doExchange)
virtual void visitGlobalExchange(const GlobalExchangeProgram &globalExchange)
virtual void visitStreamCopyBegin(const StreamCopyBeginProgram &streamCopyBegin)
virtual void visitStreamCopyMid(const StreamCopyMidProgram &streamCopyMid)
virtual void visitStreamCopyEnd(const StreamCopyEndProgram &streamCopyEnd)
virtual void visitWriteUndef(const WriteUndefProgram &writeUndef)
virtual void visitSync(const SyncProgram &sync)
virtual void visitSetLocalConsensus(const SetLocalConsensusProgram &setLocalConsensus)
virtual void visitSetLocalConsensusFromVar(const SetLocalConsensusFromVarProgram &setLocalConsensusFromVar)
virtual void visitGetGlobalConsensus(const GetGlobalConsensusProgram &getGlobalConsensus)
virtual void visitSans(const SansProgram &sans)
virtual void visitCall(const CallProgram &call)
virtual void visitSyncAns(const SyncAnsProgram &syncAns)
virtual void visitCodeCopy(const CodeCopyProgram &codeCopy)
virtual void visitBlock(const BlockProgram &block)
virtual void visitProgram(const Program &program)
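The subclass-and-visit usage described above can be sketched with a reduced, self-contained model of the pattern. It does not include pva.hpp. Everything in the `sketch` namespace is a hypothetical stand-in with only two program types, and the default behaviour shown here (each type-specific visit method forwarding to visitProgram) is an assumption about the interface, not something the listing above states.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

namespace sketch {

class Program;
class Sequence;
class Sync;

// Reduced visitor interface: one method per program type plus a catch-all.
class Visitor {
public:
  virtual ~Visitor() = default;
  virtual void visitProgram(const Program &) {}
  virtual void visitSequence(const Sequence &sequence);
  virtual void visitSync(const Sync &sync);
};

class Program {
public:
  virtual ~Program() = default;
  virtual void accept(Visitor &visitor) const = 0;
};

// Each concrete program dispatches to the visit method for its own type.
class Sequence : public Program {
public:
  void accept(Visitor &visitor) const override { visitor.visitSequence(*this); }
};

class Sync : public Program {
public:
  void accept(Visitor &visitor) const override { visitor.visitSync(*this); }
};

// Assumed default behaviour: unhandled types fall back to visitProgram().
inline void Visitor::visitSequence(const Sequence &s) { visitProgram(s); }
inline void Visitor::visitSync(const Sync &s) { visitProgram(s); }

// User-defined visitor: counts syncs, and everything else via the catch-all.
class SyncCounter : public Visitor {
public:
  void visitSync(const Sync &) override { ++syncs_; }
  void visitProgram(const Program &) override { ++others_; }
  std::size_t syncs() const { return syncs_; }
  std::size_t others() const { return others_; }

private:
  std::size_t syncs_ = 0;
  std::size_t others_ = 0;
};

inline SyncCounter
countSyncs(const std::vector<std::shared_ptr<Program>> &programs) {
  SyncCounter counter;
  for (const auto &program : programs) {
    assert(program != nullptr);
    program->accept(counter);
  }
  return counter;
}

} // namespace sketch
```

The benefit of this design is that the caller never needs to inspect type() or cast: the accept() call selects the right visit method.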
class Variable

This class represents details of a variable.

Subclassed by pva::VariableSize

Public Functions

inline Variable(const FileReaderPtr filereader, const VariableId id, const std::string name)
inline VariableId _id() const

The id of this variable.

inline bool operator==(const Variable &other) const
inline bool operator!=(const Variable &other) const
inline std::string name() const

The name of the variable.

std::vector<DebugContext> debugContexts() const

List of debug contexts of the variable.

May be empty.

Friends

friend std::ostream &operator<<(std::ostream &out, const Variable &obj)
class VariableCategory

This class describes the category of the data stored in a variable (Message, Constant, Stack, etc.).

Public Functions

inline VariableCategory(const FileReaderPtr filereader, std::uint64_t id)
inline std::uint64_t id() const

The id of this category.

std::string name() const

The name of this category.

std::string description() const

The description of this category.

Friends

friend std::ostream &operator<<(std::ostream &out, const VariableCategory &obj)
class LivenessProgramStep

This class represents liveness details for a step in the program.

Public Functions

LivenessProgramStep(const FileReaderPtr filereader, const StepId stepId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • stepId – The ID of the step

std::shared_ptr<Program> program() const

The program this step is for.

NotAlwaysLiveMemory notAlwaysLiveMemory() const
NotAlwaysLiveMemory notAlwaysLiveMemoryForTile(const TileId tileId) const
NotAlwaysLiveMemory notAlwaysLiveMemoryForIpu(const IpuId ipuId) const
Bytes notAlwaysLiveBytesForVariable(const Variable &var) const

Friends

friend std::ostream &operator<<(std::ostream &out, const LivenessProgramStep &obj)
class CompilationReport

This class contains information known at the end of graph compilation.

Public Functions

CompilationReport(const FileReaderPtr filereader)

Constructor.

Parameters

filereader – Pointer to the file reader

Target target() const

Details of the target hardware.

Graph graph() const

Details of the graph.

List<VariableSize> alwaysLiveVariables() const

List of always live variables.

List<VariableSize> alwaysLiveVariablesForTile(const TileId tileId) const

List of always live variables by tile.

List<LivenessProgramStep> livenessProgramSteps() const

List of program steps with liveness information.

Based on a depth-first order.

List<Tile> tilesWithLivenessInfo() const

Tiles whose liveness info has been recorded and can be retrieved with LivenessProgramStep::notAlwaysLiveMemoryForTile.

List<Variable> variables() const

List of unlowered variables.

LoweredVariables loweredVariables() const
std::vector<std::vector<LoweredVariableId>> allocationOrderByTile() const

For each tile, a list of variable ids in the same order they were allocated in memory.

const List<Tile> &tiles() const

List of tiles.

Provided as a convenience so that you do not need to iterate over IPUs or replicas.

List<IPU> ipus() const

List of IPUs.

List<Replica> replicas() const

List of replicas.

List<std::shared_ptr<Program>> programs() const

List of all programs.

List<std::shared_ptr<Program>> controlPrograms() const

List of all control programs.

List<std::shared_ptr<Program>> functions() const

List of all functions.

List<CompilationStep> compilationSteps() const
List<ComputeSet> computeSets() const

A list of all compute sets.

std::string timestamp() const

A timestamp of the date and time when the application was compiled.

bool isDebugInfoPresent() const

True if debug info is present.

List<DebugContext> debugContexts(const DebugContext::Filter &filter = {}) const

List of debug contexts filtered by Filter.

Throws if isDebugInfoPresent() is false.

DebugContext debugContext(const DebugContextId id) const

Get DebugContext by ID.

Throws if isDebugInfoPresent() is false.

std::vector<EngineOption> compilationParameters() const

A list of parameters used during compilation of the graph.

std::vector<EngineOption> targetParameters() const

A list of parameters describing the target upon which the graph will execute.

Friends

friend std::ostream &operator<<(std::ostream &out, const CompilationReport &obj)
class DebugContext

This class contains information about each debug context.

Public Functions

DebugContext(const FileReaderPtr filereader, DebugContextId id)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • id – The debug context id

std::string layer() const

The layer of a DebugContext (e.g. xla_op, poplibs, poplar).

std::string name() const

The name of a DebugContext.

DebugContextId id() const

Unique identifier for DebugContext.

std::string fullPath() const

The full path name of a DebugContext.

This method will walk the parents to build a full name for this debug context, using / as the delimiter.

std::string json() const

The information for the debug context as a json string.

DebugContextLocation location() const

The location in code where the debug context was created.

List<DebugContext> parents() const

The immediate parent debug contexts.

List<DebugContext> children() const

The immediate children debug contexts.

List<std::shared_ptr<Program>> programs(bool inlineCalls = true) const

The programs that correspond to this debug context or any of its descendants.

Parameters

inlineCalls – If true, programs that have been outlined into functions will appear in the returned list directly after the corresponding Call program (recursively).

List<Variable> variables() const

The variables that correspond to this debug context or any of its descendants.

Public Static Attributes

static const DebugContextId INVALID_ID = {0}

Friends

friend std::ostream &operator<<(std::ostream &out, const DebugContext &obj)
class Filter

This class describes a filter to be used when fetching Debug Contexts (e.g. via CompilationReport::debugContexts).

Public Functions

inline Filter()

No filter.

inline Filter(std::string layer)

Retrieve Debug Contexts of a given layer.

inline Filter(char const *layer)
inline Filter(bool withoutParent)

Retrieve Debug Contexts that have no parent.

inline std::string layer() const
inline bool withoutParent() const
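The separate char const * constructor is presumably there because of how C++ ranks conversions: with only the std::string and bool constructors, a string literal would select the bool one, since pointer-to-bool is a standard conversion and is preferred over the user-defined conversion to std::string. The sketch below demonstrates this without pva.hpp. `LayerFilter` and `NoLiteralOverload` are hypothetical stand-ins that copy only the documented constructor signatures, and the stated motivation is an inference, not something the listing above states.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in with the same constructor set as documented above.
class LayerFilter {
public:
  LayerFilter() = default;
  LayerFilter(std::string layer) : layer_(std::move(layer)) {}
  LayerFilter(char const *layer) {
    assert(layer != nullptr); // std::string cannot be built from a null pointer
    layer_ = layer;
  }
  LayerFilter(bool withoutParent) : withoutParent_(withoutParent) {}
  std::string layer() const { return layer_; }
  bool withoutParent() const { return withoutParent_; }

private:
  std::string layer_;
  bool withoutParent_ = false;
};

// The same class without the char const * overload, to show the pitfall.
class NoLiteralOverload {
public:
  NoLiteralOverload(std::string layer) : layer_(std::move(layer)) {}
  NoLiteralOverload(bool withoutParent) : withoutParent_(withoutParent) {}
  std::string layer() const { return layer_; }
  bool withoutParent() const { return withoutParent_; }

private:
  std::string layer_;
  bool withoutParent_ = false;
};
```

With these definitions, `LayerFilter("poplar")` filters by layer as intended, whereas `NoLiteralOverload("poplar")` silently becomes a "without parent" filter with an empty layer.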
class DebugContextLocation

This class contains information about the location in the source code (file name, line, etc.) where a debug context was created.

Public Functions

DebugContextLocation(std::string filename, unsigned linenumber, std::string function)
inline DebugContextLocation()

Constructs an empty (invalid) location.

std::string fileName()

The path of the file where the DebugContext was created.

unsigned lineNumber()

The line number where the DebugContext was created.

std::string functionName()

The name of the function where the DebugContext was created.

inline bool isValid()

Returns false if this is an invalid location, in which case the remaining fields should be disregarded.

2.2. Execution reports

class Run

This class contains information about each Poplar Engine::run.

Public Functions

Run(const FileReaderPtr filereader, const RunId runId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • runId – The id of a run

std::string name() const

Name given to this run.

List<ExecutionStep> steps() const

A list of program steps in this run.

Period<Microseconds> microseconds() const

The start and end of this run.

std::vector<Period<Cycles>> cyclesByIpu() const

The cycles by IPU at the start and end of this run.

Note: The vector may be empty if the run was too short to record cycles.

std::vector<EngineOption> executionParameters() const

If the report contains information about per-run execution parameters then this vector will contain the options set for this run.

Otherwise it will be empty.

Friends

friend std::ostream &operator<<(std::ostream &out, const Run &obj)
class Ipu

The class contains the measured information for an IPU.

Public Functions

Ipu(const FileReaderPtr filereader, const IpuId ipuId, const StepId stepid)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • ipuId – The id of an IPU

  • stepId – The id of a step in the execution

CyclesInfo allCycles() const

The range of cycles for all tiles on this IPU.

Note: Returns zeros if this IPU has not been profiled.

CyclesInfo activeCycles() const

The range of cycles for all tiles involved on this IPU.

Note: Returns zeros if this IPU has not been profiled.

uint64_t cycles() const

Total cycles executed on this IPU.

This is the sum of cycles on all tiles.

Note: Returns 0 if this IPU has not been profiled.

uint64_t activeTiles() const

The number of tiles involved.

The number of tiles that are computing (or exchanging for exchanges).

Note: Returns 0 if this IPU has not been profiled.

float threadBalance() const

Indication of hardware thread utilisation.

Measures how well-utilised the hardware threads are. If you always run 6 threads or 0 threads this will be 1.0 even if the total computation on each tile takes a different amount of time.

Note: Returns 0 if this IPU has not been profiled.

float tileBalance() const

Indication of hardware tile utilisation.

Measures how well-utilised the hardware tiles are. Larger values (up to a maximum of 1.0) indicate that work has been shared amongst tiles on this IPU more evenly.

Note: Only valid for OnTileExecute, GlobalExchange, StreamCopy, DoExchange.

Note: Returns 0 if this IPU has not been profiled.

float activeTileBalance() const

Indication of active hardware tile utilisation.

Measures how well-utilised the active hardware tiles are. Larger values (up to a maximum of 1.0) indicate that work has been shared amongst the active tiles on this IPU more evenly.

Note: Only valid for OnTileExecute, GlobalExchange, StreamCopy, DoExchange.

Note: Returns 0 if this IPU has not been profiled.

Bytes dataIn() const

Total data received by this IPU.

Note: Only valid for GlobalExchange, StreamCopy, DoExchange.

Note: Returns 0 if this IPU has not been profiled.

Bytes dataOut() const

Total data sent by this IPU.

Note: Only valid for GlobalExchange, StreamCopy, DoExchange.

Note: Returns 0 if this IPU has not been profiled.

float dataBalance() const

Indication of how well-balanced the data transfer was.

Note: Only valid for GlobalExchange, StreamCopy, DoExchange.

Note: Returns 0 if this IPU has not been profiled.

bool profiled() const

Indication if this IPU was profiled.

With the Poplar engine option ‘replicaToProfile’, only a subset of the IPUs may have execution profile information.

IpuId id() const

The ID of this IPU.

Friends

friend std::ostream &operator<<(std::ostream &out, const Ipu &obj)
class ExecutionStep

The class contains information about a step of execution.

Each step represents the execution of a Poplar program.

Public Functions

ExecutionStep(const FileReaderPtr filereader, const StepId stepId)

Constructor.

Parameters
  • filereader – Pointer to the file reader

  • stepId – The id of a step in the execution

std::shared_ptr<Program> program() const

The program being executed.

std::vector<Cycles> cyclesByTile() const

The measured cycles by tile for this execution step.

When using the ‘replicaToProfile’ option, tiles on the profiled replica will have measured values; the other tiles will return 0.

List<execution::ComputeSet> computeSets() const

A list of all compute sets in this step.

List<Ipu> ipus() const

Details of a program step execution per IPU.

Friends

friend std::ostream &operator<<(std::ostream &out, const ExecutionStep &obj)
class CycleRange

The class contains the range of cycles executed for this step.

Each tile can take a different number of cycles to execute a step.

Public Functions

inline CycleRange(Cycles min, Cycles max, Cycles average)

Constructor.

Parameters
  • min – Minimum number of cycles

  • max – Maximum number of cycles

  • average – Average number of cycles

inline Cycles min() const

The minimum number of cycles.

inline Cycles max() const

The maximum number of cycles.

inline Cycles average() const

The average number of cycles.

Friends

friend std::ostream &operator<<(std::ostream &out, const CycleRange &obj)
class CyclesInfo

The class contains the start and end cycles.

Public Functions

inline CyclesInfo(CycleRange from, CycleRange to)

Constructor.

Parameters
  • from – The range of cycles at the start

  • to – The range of cycles at the end

inline CycleRange from() const

The start of the cycles.

inline CycleRange to() const

The end of the cycles.

Friends

friend std::ostream &operator<<(std::ostream &out, const CyclesInfo &obj)
class TileCycleTotals

This class contains total cycles for various activities, such as exchange and compute.

Public Functions

TileCycleTotals(const Cycles activeCompute, const Cycles compute, const Cycles interIpuExchange, const Cycles streamCopyBegin, const Cycles streamCopy, const Cycles streamCopyEnd, const Cycles internalExchange, const Cycles sync, const Cycles total)

Constructor.

Parameters
  • activeCompute – Active compute cycles

  • compute – Cycles when at least one thread is still executing

  • interIpuExchange – Cycles during inter-IPU exchange

  • streamCopyBegin – Cycles during beginning part of stream copy

  • streamCopy – Cycles during main part of stream copy

  • streamCopyEnd – Cycles during end part of stream copy

  • internalExchange – Cycles during internal exchange

  • sync – Cycles during sync

  • total – Total cycles

Cycles activeCompute() const

Active compute cycles.

Cycles compute() const

Cycles when at least one thread is still executing.

Cycles interIpuExchange() const

Cycles during inter-IPU exchange.

Cycles streamCopyBegin() const

Cycles during beginning part of stream copy.

Cycles streamCopy() const

Cycles during main part of stream copy.

Cycles streamCopyEnd() const

Cycles during end part of stream copy.

Cycles internalExchange() const

Cycles during internal exchange.

Cycles sync() const

Cycles during sync.

Cycles total() const

Total cycles.

Friends

friend std::ostream &operator<<(std::ostream &out, const TileCycleTotals &obj)
template<class T>
class Period

This class represents a period of time measured in units of T (seconds, cycles, etc.).

Public Functions

inline Period(const T start, const T end)
inline T start() const
inline T end() const

Friends

template<class U>
friend std::ostream &operator<<(std::ostream &out, const Period<U> &obj)
class ExecutionReport

This class contains information collected from the execution of a model.

Public Functions

ExecutionReport(const FileReaderPtr filereader)

Constructor.

Parameters

filereader – Pointer to the file reader

List<Run> runs() const

A list of Poplar engine runs.

List<ExecutionStep> steps() const

A list of all program steps for the entire execution.

TileCycleTotals totalCycles() const

Total cycles executed on all IPUs.

For all IPUs and replicas.

List<Block> blocks(unsigned tile) const

A list of block measurements.

Friends

friend std::ostream &operator<<(std::ostream &out, const ExecutionReport &obj)

2.3. Trace reports

TraceReport pva::openTrace(const std::string &tracePath)

The main entry-point into a trace file; loads the specified trace and returns a TraceReport object providing access to information about the processes within the trace.

Parameters

tracePath – The path to the trace file you want to read from.

Throws

std::exception – If an error occurs attempting to open the specified file.

class TraceReport

Provides access to information within a trace file.

Most importantly, you can retrieve a list of processes from this object.

Public Functions

TraceReport() = delete
~TraceReport() = default
const std::vector<Process> &processes() const

Returns the list of processes recorded in this trace.

const Process &process(PID pid) const

Retrieves and returns a specific process by its PID.

Parameters

pid – The PID of the process you want to retrieve.

Throws

std::out_of_range – If this trace contains no process with the provided PID.

class Process

Contains all information about a process.

Most notably, this contains info about threads, from which you can get information about events.

Public Functions

Process() = default
~Process() = default
PID pid() const
const std::vector<Thread> &threads() const

Returns the list of threads that this process owns.

const Thread &thread(TID tid) const

Retrieves and returns a specific thread by its TID.

Parameters

tid – The TID of the thread you want to retrieve.

Throws

std::out_of_range – If this process has no thread with the provided TID.

class Thread

Each thread contains a sequential list of events that were caught by the trace.

These are just the root events, whose children must be accessed separately via the respective event object itself.

Public Functions

Thread() = default
~Thread() = default
PID pid() const

Returns the PID of the thread’s owning process.

TID tid() const

Returns this thread’s TID.

const std::vector<Event> &events() const

Returns the root level events that occurred on this thread.

For their sub-events, call the children() function on a specific Event object.

class Event

Contains all information about a specific event.

A vector of its child events can be obtained with the children() function.

Public Functions

Event() = default
~Event() = default
PID pid() const

Returns the PID of the process that owns the thread on which this event occurred.

TID tid() const

Returns the TID of the thread on which this event occurred.

const std::string &label() const

Returns this event’s label.

const std::string &channel() const

Returns the name of this event’s channel.

const uint64_t timestamp() const

Returns the timestamp at which this event occurred in microseconds.

Timestamps and durations are recorded using std::chrono::steady_clock::now(), so timestamps do not share a reliable epoch across different trace files. See https://en.cppreference.com/w/cpp/chrono/steady_clock

const uint64_t duration() const

Returns the event’s duration in microseconds.

Timestamps and durations are recorded using std::chrono::steady_clock::now(), so timestamps do not share a reliable epoch across different trace files. Durations are always relative to the event’s initial timestamp, though. See https://en.cppreference.com/w/cpp/chrono/steady_clock

const std::vector<Event> &children() const

Retrieves and returns a list of events that occurred on the same thread during this event’s timespan.

bool isComplete() const

Returns whether or not the event is ‘complete’.

This can be false if the trace file is unfinished, such as in the case of a program crash.