Graph

#include <poplar/Graph.hpp>

namespace poplar

Poplar classes and functions.

Functions

StringRef versionString()

StringRef packageHash()

class Graph

#include <Graph.hpp>

This class represents a graph program to be executed on the IPU.

Public Types

using TileToTensorMapping = std::vector<std::vector<Interval>>

using TraceFn = std::function<void()>

Record some compilation time as part of the graph construction phase.

Deprecated: Tracing via Poplar is deprecated and will be removed.

Param name: The name of the phase. This can be composed of multiple parts.
Param fn: The construction code to be timed.

Public Functions

Graph()

Graph(const Target &target, replication_factor r = replication_factor(1))

Construct a graph object.

This constructor creates a Graph object using the given graph programming environment.

Parameters

target – The target the graph is being constructed to work with.
r – Number of times the graph is to be replicated (default is no replication).

Graph(const Device &device, replication_factor r = replication_factor(1))

Construct a graph object.

This constructor creates a Graph object using the given graph programming environment.

Parameters

device – The device the graph is being constructed to work with.
r – Number of times the graph is to be replicated (default is no replication).

Graph(Graph&&) noexcept

Graph &operator=(Graph&&) noexcept

~Graph()

const Target &getTarget() const: Retrieve the target that this graph is targeting.

bool addCodelets(StringRef src, CodeletFileType type = CodeletFileType::Auto, StringRef compileFlags = "")

Add a codelet to the graph.

A codelet is either a C, C++, or assembly source file, or a .gp object file. If a source file is given it is compiled for the graph’s target and then loaded into the graph. If it is an object file then it is loaded into the graph.

Symbols that codelets use are not resolved until the engine is built, so codelets can use symbols from each other by calling addCodelets() for each source or object file (or passing a list of files as a vector).

Parameters

src – The path to a source or object file containing codelets.
type – Specify the type of the codelet (source or precompiled). If CodeletFileType::Auto is used, the type is determined from the filename extension.
compileFlags – Additional flags to pass to the compiler if using source code. For example, -g to generate debug info.

Returns

True if the codelet is added to the graph successfully, or false if the codelet already existed in the graph.

bool addCodelets(StringRef src, CodeletFileType type, StringRef compileFlags, std::ostream &compileOutput)

Add a codelet to the graph and write error messages from the compilation process to the given output stream.

By default they are printed to cerr.

inline bool addCodelets(ArrayRef<std::string> xs, StringRef compileFlags = "")

Add a set of codelets to the graph.

These codelets can depend on each other. For example, symbols defined in one can be used by any other. The order is not important.

Returns: True if all the codelets are added successfully, or false if any of the codelets are not added because they already exist in the graph.

void addCodelets(std::stringstream &stream, StringRef compileFlags = "", CodeletFileType type = CodeletFileType::CppSource)

Take a codelet contained within the stream and store it in a temporary file which we then use to compile the codelet.

The language type of the codelet in the stream can be specified, defaulting to C++.

Note that this is not idempotent: the function throws an exception if called twice with the same stream, unlike the overload that takes a file path.

void addCodelets(std::stringstream &stream, StringRef compileFlags, std::ostream &compileOutput, CodeletFileType type = CodeletFileType::CppSource)

VertexRef addVertex(ComputeSet cs, StringRef vertexType)

Add a vertex to the graph.

Parameters

cs – The compute set to add the vertex to.
vertexType – The name of the type of the vertex. This must be a declared vertex type in the graph programming environment used to create the graph builder.

inline VertexRef addVertex(ComputeSet cs, StringRef vertexType, ArrayRef<ConnectionDesc> connections)

Add a vertex to the graph and connect graph elements to some of its fields.

This variant of add vertex allows you to pass in a list of connection descriptions to connect graph elements to fields of the newly created vertex. The connection descriptions can be initialized with:

{ string, Tensor } - connect a tensor to a field.
{ string, FieldRef, bool } - connect a vertex field to a field.
{ string, T v } - connect a constant value to an input field.

For example, the following:

addVertex(cs, "MyVertex", {{"x", tensor[4]}, {"y", v["z"], false}});

This creates a vertex and connects a tensor to its x field and the vertex field v["z"] to its y field.

Parameters

cs – The compute set to add the vertex to.
vertexType – The name of the type of the vertex. This must be a declared vertex type in the graph programming environment used to create the graph builder.
connections – A list of connection descriptions.

VertexRef addExternalExchangeVertex(ComputeSet cs, StringRef vertexType, unsigned incomingDownCount, bool usesEastEdge, bool sendsXReq)

Add an external exchange vertex to the graph.

A compute set can contain at most one external exchange vertex per tile. External exchange vertices cannot be mixed with non external exchange vertices in the same compute set. Before an external vertex is called we set the INCOMING_DCOUNT and INCOMING_MUX mux registers and synchronize all tiles containing external exchange vertices.

Parameters

cs – The compute set to add the vertex to.
vertexType – The name of the type of the vertex. This must be a declared vertex type in the graph programming environment used to create the graph builder.
incomingDownCount – The value to set the INCOMING_DCOUNT register to.
usesEastEdge – Whether the vertex uses an east edge exchange block. The INCOMING_MUX register is set to point to either the east edge or west edge depending on this argument.
sendsXReq – Whether this vertex is responsible for sending the XREQ packet. There must be at most one tile per exchange block context that sends the XREQ and the tile must be the same in every compute set containing external exchange vertices.

Tensor addVariable(const Type &type, ArrayRef<std::size_t> shape, const DebugContext &debugContext = {})

Add a variable to the graph.

If using this function with a target with multiple tiles then the variable will initially have no tile mapping. It is expected that the tile mapping will be set later with Graph::setTileMapping(). If the target of the graph has only one tile then the tensor will be automatically mapped to that tile.

Parameters

type – The type of the elements of the variable.
shape – The shape of the variable.
debugContext – An optional debug context (for example, a name) to identify the variable for debugging/profiling purposes.

Returns

A tensor referring to the variable in the graph.

Tensor addVariable(const Type &type, ArrayRef<std::size_t> shape, VariableMappingMethod mappingMethod, const DebugContext &debugContext = {})

Add a variable to the graph.

Parameters

type – The type of the elements of the variable.
shape – The shape of the variable.
mappingMethod – The method to use to initially map the variable to tiles.
debugContext – An optional debug context (for example, a name) to identify the variable for debugging/profiling purposes.

Returns

A tensor referring to the variable in the graph.

template<typename T> inline Tensor addConstant(const Type &type, ArrayRef<std::size_t> shape, ArrayRef<T> values, const DebugContext &debugContext = {"<const>"})

Add a constant to the graph.

A constant tensor is a tensor with every element initialized.

Parameters

type – The type of the elements of the constant.
shape – The shape of the constant.
values – Vector of values to initialize tensor elements to.
debugContext – An optional debug context (for example, a name) to identify the constant for debugging/profiling purposes.

template<typename T> inline Tensor addConstant(const Type &type, ArrayRef<std::size_t> shape, T val, const DebugContext &debugContext = {"<const>"}, typename std::enable_if<TypeTraits::isSimpleType<T>()>::type* = nullptr)

Add a constant to the graph.

A constant tensor is a tensor with every element initialized to the same value. It cannot be connected to a vertex output.

Parameters

type – The type of the elements of the constant.
shape – The shape of the constant.
val – The value to initialize tensor elements to.
debugContext – An optional debug context (for example, a name) to identify the constant for debugging/profiling purposes.

template<typename T> inline Tensor addConstant(const Type &type, ArrayRef<std::size_t> shape, const T *val, const DebugContext &debugContext = {"<const>"}, typename std::enable_if<TypeTraits::isSimpleType<T>()>::type* = nullptr)

Add a constant to the graph with multiple cell values.

A constant tensor is a tensor with every element initialized. It cannot be connected to a vertex output.

Parameters

type – The type of the elements of the constant.
shape – The shape of the constant.
val – Pointer to the values to initialize the tensor elements to.
debugContext – An optional debug context (for example, a name) to identify the constant for debugging/profiling purposes.

Tensor addConstant(const Type &type, ArrayRef<std::size_t> shape, const void *val, const TypeTraits &traits, bool broadcast, const DebugContext &debugContext = {"<const>"})

inline Tensor addConstantHalf(const Type &type, ArrayRef<std::size_t> shape, uint16_t val, const DebugContext &debugContext = {"<const>"})

Add a constant to the graph, where the host data is type IEEE half.

A constant tensor is a tensor with every element initialized to the same value. It cannot be connected to a vertex output.

Parameters

type – The type of the elements of the constant.
shape – The shape of the constant.
val – The value to initialize tensor elements to.
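
The uint16_t argument holds the raw bit pattern of an IEEE half-precision value rather than an integer quantity. The following is a minimal standalone sketch (it does not call Poplar, and halfBitsToFloat is a hypothetical helper name) that decodes such a bit pattern, which may help when checking the value passed as val:

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of Poplar: decode an IEEE 754 binary16
// bit pattern (1 sign bit, 5 exponent bits, 10 mantissa bits) to float.
float halfBitsToFloat(std::uint16_t bits) {
  const int sign = (bits >> 15) & 0x1;
  const int exponent = (bits >> 10) & 0x1F;
  const int mantissa = bits & 0x3FF;

  float magnitude;
  if (exponent == 0) {
    // Zero or subnormal: mantissa * 2^-24.
    magnitude = std::ldexp(static_cast<float>(mantissa), -24);
  } else if (exponent == 0x1F) {
    // All exponent bits set: infinity or NaN.
    magnitude = (mantissa == 0) ? std::numeric_limits<float>::infinity()
                                : std::numeric_limits<float>::quiet_NaN();
  } else {
    // Normal number: (1024 + mantissa) * 2^(exponent - 25).
    magnitude = std::ldexp(static_cast<float>(1024 + mantissa), exponent - 25);
  }
  return sign ? -magnitude : magnitude;
}
```

With this decoding, 0x3C00 corresponds to 1.0 and 0xC000 to -2.0.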

inline Tensor addConstantHalf(const Type &type, ArrayRef<std::size_t> shape, const uint16_t *val, const DebugContext &debugContext = {"<const>"})

Add a constant to the graph with multiple cell values, where the host data is type IEEE half.

A constant tensor is a tensor with every element initialized. It cannot be connected to a vertex output.

Parameters

type – The type of the elements of the constant.
shape – The shape of the constant.
val – Pointer to the values to initialize the tensor elements to.

Tensor clone(const Type &type, const Tensor &t, const DebugContext &debugContext = {}, TensorCloneMethod method = TensorCloneMethod::PRESERVE_ORDER_UNLESS_ALIASES)

Add a tensor to the graph that has the same size and tile mapping as Tensor t.

Parameters

type – The element type of the new tensor.
t – The tensor to be cloned.
debugContext – A debug context (name) to give to any new tensors allocated in the graph during the clone. If this is empty then the debug names will be derived from existing tensor debug names.
method – The method to use for the cloning (decides whether to preserve ordering/aliasing in the new tensor).

Tensor cloneN(const Type &type, const Tensor &t, std::size_t N, const DebugContext &debugContext = {}, TensorCloneMethod method = TensorCloneMethod::PRESERVE_ORDER_UNLESS_ALIASES, TensorCloneDuplicationMethod duplicationMethod = TensorCloneDuplicationMethod::DUPLICATE_BY_OUTER_DIMENSION)

Clone a tensor N times.

Given a tensor of shape [D1, D2, … Dn], this function will create a new tensor of shape [N, D1, D2, …, Dn] where each of the N sub-tensors is a clone of the original tensor (meaning it has the same layout and tile mapping).

See also

TensorCloneDuplicationMethod

Parameters

type – The element type of the new tensor.
t – The tensor to clone.
N – The replication factor to clone with.
debugContext – The debug context (name) for the new variables created.
method – The tensor cloning method (see Graph::clone()).
duplicationMethod – The behaviour used when a tensor is cloned.
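
As a rough illustration of the shape rule described above, the standalone sketch below computes the shape of the cloned tensor from the original shape and N. It does not call Poplar; clonedShape is a hypothetical helper, and it assumes the default duplication by outer dimension.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Poplar: the shape [N, D1, ..., Dn]
// obtained by cloning a tensor of shape [D1, ..., Dn] N times,
// assuming duplication by the outer dimension.
std::vector<std::size_t> clonedShape(const std::vector<std::size_t> &shape,
                                     std::size_t n) {
  std::vector<std::size_t> result;
  result.reserve(shape.size() + 1);
  result.push_back(n);                                      // new outer dimension
  result.insert(result.end(), shape.begin(), shape.end());  // original dimensions
  return result;
}
```

For example, cloning a [4, 8] tensor three times gives a [3, 4, 8] shape under this model.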

inline Tensor clone(const Tensor &t, const DebugContext &debugContext = {}, TensorCloneMethod method = TensorCloneMethod::PRESERVE_ORDER_UNLESS_ALIASES)

Add a tensor to the graph that has the same size and tile mapping as Tensor t.

Parameters

t – The tensor to be cloned.
debugContext – A debug context (name) to give to any new tensors allocated in the graph during the clone. If this is empty then the debug names will be derived from existing tensor debug names.
method – The method to use for the cloning (decides whether to preserve ordering/aliasing in the new tensor).

inline Tensor cloneN(const Tensor &t, std::size_t N, const DebugContext &debugContext = {}, TensorCloneMethod method = TensorCloneMethod::PRESERVE_ORDER_UNLESS_ALIASES, TensorCloneDuplicationMethod duplicationMethod = TensorCloneDuplicationMethod::DUPLICATE_BY_OUTER_DIMENSION)

Clone a tensor N times.

Given a tensor of shape [D1, D2, … Dn], this function will create a new tensor of shape [N, D1, D2, …, Dn] where each of the N sub-tensors is a clone of the original tensor (meaning it has the same layout and tile mapping).

See also

TensorCloneDuplicationMethod

Parameters

t – The tensor to clone.
N – The replication factor to clone with.
debugContext – The debug context (name) for the new variables created.
method – The tensor cloning method (see Graph::clone()).
duplicationMethod – The behaviour used when a tensor is cloned.

void connect(FieldRef field, const Tensor &tensor)

Connect a tensor to a vertex field.

This function connects a tensor to a vertex field. If the vertex field is a scalar input/output then a simple edge is added (and the tensor must be of zero dimension; in other words, a scalar).

If the vertex field is an input/output of a vector then a vector edge is added (and the tensor must be of dimension 1).

If the vertex field is a vector of inputs or outputs then the size of the field is set to the correct size and edges are added for every element of the tensor (and the tensor must be of dimension 1).

If the vertex field is a vector of input or output vectors then the tensor must be 2-dimensional. In this case, the size of the vector field is set to the size of the first dimension and vector edges are added for every sub-vector of the two dimensional tensor.

Parameters

tensor – The tensor.
field – Reference to the vertex field to connect.
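
The rank requirements described above can be summarised in a small standalone sketch. The FieldKind enum and requiredTensorRank function are hypothetical names introduced only for illustration; they are not Poplar types.

```cpp
#include <cstddef>

// Hypothetical classification of vertex fields, for illustration only.
enum class FieldKind {
  Scalar,          // scalar input/output
  Vector,          // input/output of a vector
  VectorOfScalars, // vector of inputs or outputs
  VectorOfVectors  // vector of input or output vectors
};

// Rank the connected tensor must have for each kind of field,
// following the rules stated in the description of connect().
std::size_t requiredTensorRank(FieldKind kind) {
  switch (kind) {
  case FieldKind::Scalar:
    return 0;
  case FieldKind::Vector:
    return 1;
  case FieldKind::VectorOfScalars:
    return 1;
  case FieldKind::VectorOfVectors:
    return 2;
  }
  return 0; // not reached for valid enum values
}
```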

template<typename T> inline void connect(FieldRef field, T v, typename std::enable_if<TypeTraits::isSimpleType<T>()>::type* = nullptr)

Connect a constant value to an input field.

This method creates a single-element tensor containing a specified value and connects that tensor element to an input field.

Parameters

v – The value to connect.
field – The field to connect to.

inline void connect(FieldRef field, ArrayRef<Tensor> tensors)

Connect a vector of tensors to a vertex field.

This function connects a vector of tensors to a vertex field. The field must be a vector of inputs or outputs. The field will be sized to the provided vector and each tensor will be connected to the corresponding element of the field.

Parameters

tensors – The vector of tensors.
field – Reference to the vertex field to connect.

void setPerfEstimate(const VertexRef &v, std::uint64_t cycles, std::uint64_t flops = 0)

Set the performance estimate for a vertex.

Parameters

v – The vertex to set the estimate for.
cycles – The number of cycles that this vertex will use when run.
flops – The number of flops that this vertex will use when run.

void setPerfEstimate(const VertexRef &v, const VertexPerfEstimate &estimate)

Set the performance estimate for a vertex.

Parameters

v – The vertex to set the estimate for.
estimate – The performance estimates for this vertex when run.

VertexPerfEstimate getPerfEstimate(const VertexRef &v) const

Get the performance estimate for the specified vertex.

Parameters: v – The vertex to get the estimate for.
Throws: missing_perf_estimate – if the performance estimate is not available (for example, because the graph hasn’t been executed yet).
Returns: The performance estimates used when this vertex is run.

void registerPerfEstimator(StringRef vertexTypeName, PerfEstimateFunc f)

Parameters

vertexTypeName – Type of vertex to register the estimator for.
f – Callback function that will compute a performance estimate for all vertices of this type.

unsigned getNumVertices(void) const

Get the number of vertices currently in the graph.

Returns: The number of vertices currently in the graph.

ComputeSet addComputeSet(const DebugContext &debugContext = {})

Create a compute set within the graph.

Parameters: debugContext – An optional identifier for the compute set that may be used during profiling/debugging.
Returns: The reference to the compute set.

void setFieldSize(FieldRef field, std::size_t size)

Set the size of a vector field.

Parameters

field – The reference to the field.
size – The size of the field.

std::size_t getFieldSize(FieldRef field) const

Get the size of a vector field.

Parameters: field – The reference to the field.
Returns: The size of the field.

std::size_t getMaxFieldDim(StringRef vertexName, StringRef fieldName, unsigned dimIndex) const

Find the maximum size for a dimension of a field.

Parameters

vertexName – The type of vertex.
fieldName – The field.
dimIndex – The index of the dimension.

Throws

index_error – If there is no such dimension.
poplar_error – If the field is not indexable.

double getMaxVertexFieldValue(StringRef vertexName, StringRef fieldName) const

Find the maximum value that can be represented by an element of a field.

Parameters

vertexName – The type of vertex.
fieldName – The field.

template<typename T> inline void setInitialValue(FieldRef field, T val, typename std::enable_if<TypeTraits::isSimpleType<T>()>::type* = nullptr)

Set the initial value of a field.

Parameters

field – The reference to the field.
val – The value to set the field to when the graph engine is created.

template<typename T> inline void setInitCallback(FieldRef field, LateInitCallback<T> callback, typename std::enable_if<std::is_arithmetic<T>::value>::type* = nullptr)

Set the init callback for a field; the callback function will be called after graph construction and must return the init value of the field.

This can be called instead of setInitialValue(), or both can be called for the same field. Calling both ensures that the field has an (at least partially) valid starting value, for instance if it needs to be retrieved at an early stage of graph compilation, before storage allocation (for example, during cycle estimation).

Note that you must explicitly provide the template parameter T in the specialisation when using this function. For example:

 setInitCallback<uint16_t>(vertex["size"], sizeCallback)

This is because the compiler will not be able to detect the correct type from the callback parameter.
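
The deduction problem can be reproduced in plain C++ without Poplar. In the sketch below, InitCallback and resolveInitValue are simplified, hypothetical stand-ins (the real LateInitCallback type may take arguments); they only show why a lambda cannot be used to deduce T.

```cpp
#include <cstdint>
#include <functional>

// Simplified stand-in for a callback type parameterised on the field type.
template <typename T> using InitCallback = std::function<T()>;

// Hypothetical function that, like setInitCallback, takes the callback
// as a std::function whose signature depends on T.
template <typename T> T resolveInitValue(InitCallback<T> callback) {
  return callback();
}

std::uint16_t sizeFromLambda() {
  auto sizeCallback = []() -> std::uint16_t { return 7; };
  // resolveInitValue(sizeCallback);  // does not compile: a lambda is not a
  //                                  // std::function, so T cannot be deduced
  return resolveInitValue<std::uint16_t>(sizeCallback); // T given explicitly
}
```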

Parameters

field – The reference to the field.
callback – The callback that will return the value for the field.
<unnamed> – This exists only to allow the insertion of the is_arithmetic<T> check for the type T.

template<typename T> inline void setInitialValue(FieldRef field, const std::vector<T> &v)

template<typename T> inline void setInitialValue(FieldRef field, const std::initializer_list<T> &l)

inline void setInitialValueHalf(FieldRef field, uint16_t val)

Set the initial value of a field of type IEEE half.

Parameters

field – The reference to the field.
val – The value to set the field to when the graph engine is created.

template<typename T> inline void setInitialValue(FieldRef field, ArrayRef<T> val)

Set initial values of a vector field.

Parameters

field – The reference to the vector field.
val – A vector value to set the field to when the graph engine is created.

inline void setInitialValueHalf(FieldRef field, ArrayRef<uint16_t> val)

Set initial values of a vector field of type IEEE half.

Parameters

field – The reference to the vector field.
val – A vector value to set the field to when the graph engine is created.

template<typename T> inline void setInitialValue(const Tensor &t, T val, typename std::enable_if<TypeTraits::isSimpleType<T>()>::type* = nullptr)

Set the initial value of a tensor element.

Parameters

t – The tensor representing the value to set.
val – The value to set the field to when the graph engine is created. A buffer of values can be provided to set the elements of a non-scalar tensor.

template<typename T> inline void setInitialValue(const Tensor &t, ArrayRef<T> values)

inline void setInitialValueHalf(const Tensor &t, uint16_t val)

Set the initial value of a tensor element of type IEEE half.

Parameters

t – The tensor representing the value to set.
val – The value to set the field to when the graph engine is created. A buffer of values can be provided to set the elements of a non-scalar tensor.

inline void setInitialValueHalf(const Tensor &t, ArrayRef<uint16_t> values)

void createHostWrite(StringRef handle, const Tensor &t, bool rearrangeOnHost = false)

Mark a Tensor as being available as the destination of host to device copies.

This is a convenience function that creates a host-to-device FIFO, and a Copy program that copies data from the FIFO to the tensor. When you call Engine::writeTensor() it copies the input data to the FIFO and then executes the Copy program on the device.

See also

Engine::writeTensor()

Parameters

handle – A name to be associated with this host copy.
t – The tensor to be marked as an input.
rearrangeOnHost – Save IPU memory at the cost of exchange speed by rearranging the data on the host before sending it to the IPU, rather than doing an internal exchange. Note that due to alignment and size requirements of host exchange packets this may still require part of the transfer to be received to a temporary variable and copied to its destination.

void createHostRead(StringRef handle, const Tensor &t, bool rearrangeOnHost = false)

Mark a Tensor as being available as the source of device to host copies.

This is a convenience function that creates a device-to-host FIFO, and a Copy program that copies data to the FIFO from the tensor. When you call Engine::readTensor() it executes the Copy program on the device and then outputs the data from the FIFO.

See also

Engine::readTensor()

Parameters

handle – A name to be associated with this host copy.
t – The tensor to be marked as an output.
rearrangeOnHost – Save IPU memory at the cost of exchange speed by sending data in any order and rearranging it on the host, rather than doing an internal exchange before sending it.

DataStream addHostToDeviceFIFO(StringRef handle, const Type &elementType, std::size_t numElements, ReplicatedStreamMode replicatedMode = ReplicatedStreamMode::REPLICATE, const OptionFlags &options = {})

Add a data stream to the graph for copying data from the host to the device.

Supported options:

splitLimit Integer [=50 * 1024 * 1024]

The maximum size of the FIFO before it is split into multiple FIFOs. This is a useful option to avoid exceeding the stream buffer size limit. If the original FIFO is larger than the specified split limit, then it is replaced by a number of FIFOs which represent chunks of the original FIFO, and are read from sequentially. Setting splitLimit to 0 or UINT_MAX disables this option.

bufferingDepth Integer [=1]

The depth of the FIFO which can be prefetched before being read by the device. By default the FIFO size is 1, so it prefetches a single entry, after it has been read, to refill the FIFO. Increasing the size of the FIFO allows for prefetching of multiple entries, increasing the probability there will be a valid entry in the FIFO for the device to read before falling back to synchronously fetching the next entry.

addressSpace (pageTable, addressTranslationTable, serviceTable) [=pageTable]

The type of address mapping used by the hardware to translate an exchange address to a host physical address.
- pageTable: The stream uses a lookup table which maps one memory page per entry.
- addressTranslationTable: This uses a translation table. This table contains very few entries but each of them can map large regions. This type of address mapping is only supported for replicated streams. It is also necessary to set the Target option gatewayMode to false.
- serviceTable: This translation mode supports a large address space. Requires Target option gatewayMode to be enabled.

Parameters

handle – A name to be associated with this stream.
elementType – The type of data in the stream.
numElements – The number of elements to be transferred from the stream by a Copy program.
replicatedMode – How the stream is replicated if this is a replicated graph.
options – List of options.

DataStream addDeviceToHostFIFO(StringRef handle, const Type &elementType, std::size_t numElements, const OptionFlags &options = {})

Add a data stream to the graph for copying data from the device to the host.

Supported options:

splitLimit Integer [=50 * 1024 * 1024]

The maximum size of the FIFO before it is split into multiple FIFOs. This is a useful option to avoid exceeding the stream buffer size limit. If the original FIFO is larger than the specified split limit, then it is replaced by a number of FIFOs which represent chunks of the original FIFO, and are read from sequentially. Setting splitLimit to 0 or UINT_MAX disables this option.

Parameters

handle – A name to be associated with this stream.
elementType – The type of data in the stream.
numElements – The number of elements to be transferred to the stream by a Copy program.
options – List of options.
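
To get a feel for the splitLimit option, the following standalone sketch estimates how many FIFOs a stream might be split into. It is only a model: it assumes sizes are measured in bytes and that each chunk is as large as the limit allows, neither of which is stated above, and numFifoChunks is a hypothetical name. The chunking Poplar actually performs may differ.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical model, not part of Poplar: number of FIFOs after splitting,
// assuming byte sizes and chunks that are as large as the limit allows.
std::size_t numFifoChunks(std::size_t fifoBytes, std::size_t splitLimit) {
  // A limit of 0 or UINT_MAX disables splitting, as documented above.
  const bool splittingDisabled = (splitLimit == 0 || splitLimit == UINT_MAX);
  if (splittingDisabled || fifoBytes <= splitLimit) {
    return 1;
  }
  // Ceiling division without risking overflow in the numerator.
  return fifoBytes / splitLimit + (fifoBytes % splitLimit != 0 ? 1 : 0);
}
```

Under these assumptions, a 120 MiB FIFO with the default limit of 50 * 1024 * 1024 would be modelled as three chunks.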

RemoteBuffer addRemoteBuffer(StringRef handle, const Type &elementType, std::size_t numElements, std::size_t repeats = 1, bool rearrangeOnHost = false, bool optimiseMemory = false)

Add a remote buffer to the graph.

A remote buffer is memory outside the IPU which can be read and written by the IPU. A read returns the last written value. The remote buffer is (repeats * numElements * sizeof(elementType) + padding) bytes in size. Padding is added to meet any alignment constraints of the hardware.

Parameters

handle – A name to be associated with this remote buffer.
elementType – The type of data in the remote buffer.
numElements – The number of elements to be transferred to the remote buffer by a Copy program.
repeats – The buffer can store multiple blocks of data to be transferred. The total number of data elements in the buffer is numElements * repeats.
rearrangeOnHost – Perform any necessary data rearrangement on the host instead of on the IPU.
optimiseMemory – Optimise for memory use rather than speed.
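
The size formula above can be sketched as a standalone calculation. The padding rule is not specified here, so the alignment parameter below is purely an assumption for illustration (the whole buffer is rounded up to a multiple of it); remoteBufferBytes is a hypothetical helper, not a Poplar function.

```cpp
#include <cstddef>

// Hypothetical model of the remote buffer size in bytes:
// repeats * numElements * elementBytes, rounded up to a multiple of
// `alignment`. The real hardware alignment constraints are not given
// in this reference, so `alignment` is an assumed input.
std::size_t remoteBufferBytes(std::size_t repeats, std::size_t numElements,
                              std::size_t elementBytes, std::size_t alignment) {
  const std::size_t payload = repeats * numElements * elementBytes;
  if (alignment <= 1) {
    return payload; // no padding needed
  }
  const std::size_t remainder = payload % alignment;
  return remainder == 0 ? payload : payload + (alignment - remainder);
}
```

For instance, two repeats of ten 4-byte elements give an 80-byte payload, which this model pads to 128 bytes for an assumed 64-byte alignment.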

void outputVertexGraph(std::ostream &outputStream, ArrayRef<program::Program> progs = {}) const

Output the vertex graph to a stream in dot file format.

Parameters: outputStream – The C++ stream to output the dot file onto.

void outputComputeGraph(std::ostream &outputStream, ArrayRef<program::Program> progs = {}) const

Output the compute graph to a stream in dot file format.

The graph will contain the following:

Green boxes represent variables (tensors), with the number of elements in square brackets.
Blue boxes represent compute sets.
Orange boxes within the blue boxes represent the number of vertices in the compute set.
Yellow boxes that are not linked to anything else in the graph represent individual variables. The code attempts to simplify the layout by merging them when they have the same name, size and connectivity.
Black edges are the inputs and outputs of a compute set.
Red edges represent data copies.

Parameters

outputStream – The C++ stream to output the dot file onto.
progs – The list of programs to generate a graph of.

void setTileMapping(VertexRef v, unsigned tileNum)

Map a vertex to a specific tile on the device.

Parameters

v – Reference to the vertex to map.
tileNum – The tile number to map the vertex to.

void setTileMapping(const Tensor &t, unsigned tileNum)

Map a tensor slice to a specific tile on the device.

Parameters

t – The tensor or tensor slice to map.
tileNum – The tile number to map to.

TileToTensorMapping getTileMapping(const Tensor &t, bool requireComplete = true, bool allowExternal = false) const

Inspect the tile mapping of a tensor.

Parameters

t – The tensor to inspect.
requireComplete – If t is not fully mapped and requireComplete is true then an invalid_tile_mapping exception will be thrown.
allowExternal – Allow some of the tensor to be mapped to tiles not belonging to this Graph. These elements will not be included in the result.

Returns

The mapping from tiles to a vector of intervals mapped to the tile (implemented as a vector indexed by the tile number). The lower and upper bounds of each interval are element numbers in the flattened tensor.
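
The structure of the returned mapping can be explored with a standalone model. In the sketch below, Interval is a hypothetical stand-in for the Poplar interval type and is assumed to be half-open, [begin, end); the helper functions are illustrative and not part of Poplar.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in for the interval type; assumed half-open: [begin, end).
struct Interval {
  std::size_t begin;
  std::size_t end;
};
using TileToTensorMapping = std::vector<std::vector<Interval>>;

// Number of tensor elements mapped to each tile (indexed by tile number).
std::vector<std::size_t> elementsPerTile(const TileToTensorMapping &mapping) {
  std::vector<std::size_t> counts(mapping.size(), 0);
  for (std::size_t tile = 0; tile < mapping.size(); ++tile) {
    for (const Interval &interval : mapping[tile]) {
      counts[tile] += interval.end - interval.begin;
    }
  }
  return counts;
}

// True if every element of a flattened tensor with numElements elements
// appears in some interval of the mapping.
bool isCompleteMapping(const TileToTensorMapping &mapping,
                       std::size_t numElements) {
  std::vector<bool> mapped(numElements, false);
  for (const auto &tileIntervals : mapping) {
    for (const Interval &interval : tileIntervals) {
      for (std::size_t i = interval.begin; i < interval.end && i < numElements; ++i) {
        mapped[i] = true;
      }
    }
  }
  return std::all_of(mapped.begin(), mapped.end(), [](bool b) { return b; });
}

// Illustrative mapping over three tiles; elements 6 and 7 are left unmapped.
TileToTensorMapping exampleMapping() {
  TileToTensorMapping mapping(3);
  mapping[0].push_back(Interval{0, 4});
  mapping[2].push_back(Interval{4, 6});
  mapping[2].push_back(Interval{8, 10});
  return mapping;
}
```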

TileToTensorMapping getTileMapping(const Tensor &t, bool *isComplete, bool allowExternal = false) const

Inspect the tile mapping of a tensor.

Parameters

t – The tensor to inspect
isComplete – If non-null, updated to indicate whether the mapping is complete.
allowExternal – Allow some of the tensor to be mapped to tiles not belonging to this Graph. These elements will not be included in the result.

Returns

The mapping from tiles to a vector of intervals mapped to the tile (implemented as a vector indexed by the tile number). The lower and upper bounds of each interval are element numbers in the flattened tensor.

TileToTensorMapping getVariableTileMapping(const Tensor &t) const

Inspect the tile mapping of a tensor.

This excludes any constant regions.

Parameters: t – The tensor to inspect
Returns: The mapping from tiles to a vector of intervals mapped to the tile (implemented as a vector indexed by the tile number). The lower and upper bounds of each interval are element numbers in the flattened tensor.

void setTileMapping(const Tensor &t, const TileToTensorMapping &mapping)

Set the tile mapping of a tensor based on an explicit map from tiles to tensor intervals.

Parameters

t – The tensor to map
mapping – The mapping from tiles to a vector of intervals to be placed on that tile (implemented as a vector indexed by the tile number). The lower and upper bounds of each interval are element numbers in the flattened tensor.
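
One simple way to build such a mapping is to give each tile a contiguous, roughly equal slice of the flattened tensor. The standalone sketch below does this with a hypothetical Interval struct (assumed half-open, [begin, end)) and a hypothetical makeEvenMapping helper; it does not use Poplar types, so the result would need converting before being passed to setTileMapping().

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in for the interval type; assumed half-open: [begin, end).
struct Interval {
  std::size_t begin;
  std::size_t end;
};
using TileToTensorMapping = std::vector<std::vector<Interval>>;

// Hypothetical helper: split numElements contiguous elements across
// numTiles tiles, giving each tile at most ceil(numElements / numTiles).
TileToTensorMapping makeEvenMapping(std::size_t numElements,
                                    std::size_t numTiles) {
  TileToTensorMapping mapping(numTiles);
  if (numTiles == 0) {
    return mapping;
  }
  const std::size_t perTile = (numElements + numTiles - 1) / numTiles;
  for (std::size_t tile = 0; tile < numTiles; ++tile) {
    const std::size_t begin = std::min(tile * perTile, numElements);
    const std::size_t end = std::min(begin + perTile, numElements);
    if (begin < end) {
      mapping[tile].push_back(Interval{begin, end}); // tiles past the end stay empty
    }
  }
  return mapping;
}
```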

Tensor getVariable(VariableRef v) const

Get a tensor representing an entire variable.

Parameters: v – The variable to retrieve.
Returns: A Tensor object representing that variable.

bool isConstant(VariableRef v) const

Check whether a variable reference refers to a constant.

When Graph::addConstant() is called, a variable is created to represent that constant. This call checks whether a variable was created by that method or by Graph::addVariable().

Parameters: v – The variable to examine.
Returns: True if and only if the variable refers to a constant.

std::vector<std::vector<Interval>> getSortedContiguousRegions(const Tensor &t, ArrayRef<Interval> regions, bool removeAliasedIntervals = false, std::vector<std::size_t> *aliases = nullptr) const

Get a list of sequences of intervals over a tensor such that each sequence represents a contiguous region of memory.

Parameters

t – The tensor to get intervals over.
regions – A list of intervals representing the elements to sort in to contiguous sequences in memory.
removeAliasedIntervals – If true, remove intervals which alias others in the given regions from the result.
aliases – Optional list of indices for each region in the returned intervals where an index is always the same for a region representing the same underlying elements in memory. If this is nullptr, then no aliases will be returned.

Returns

A list of sequences of intervals. The intervals will cover the same elements as the input tensor.

void reorderToSimplify(Tensor *t, ArrayRef<Tensor*> ts, bool requireSimplestOrder = true) const

Reorder a set of tensors in order to simplify the view on data.

This function will update t to be a (simpler) reordered view on the same data. The same reordering will be applied to all elements of ts. The reordering will be the same for all tensors, so order-invariant or elementwise operations on t and ts can still be performed.

The main purpose of this function is to provide a way to implement more efficient graph construction of elementwise or order-invariant operations.

If requireSimplestOrder is set to true then, after execution, t will consist of the minimum possible number of contiguous regions. If not, then no guarantee is given on the order of t.

All the tensors provided to this function must be of rank 1 (flattened tensors) and have the same number of elements.
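
A small standalone model can show why applying one shared reordering keeps elementwise operations valid. This sketch does not use Poplar: `applyOrder` and `addElementwise` are hypothetical helpers that work on plain vectors standing in for rank-1 tensors, and the reordering is passed in explicitly rather than chosen by the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return a reordered copy of `data`, where out[i] = data[order[i]].
// `order` is assumed to be a permutation of 0 .. data.size() - 1.
std::vector<int> applyOrder(const std::vector<int> &data,
                            const std::vector<std::size_t> &order) {
  assert(order.size() == data.size());
  std::vector<int> out;
  out.reserve(data.size());
  for (std::size_t idx : order) {
    out.push_back(data[idx]);
  }
  return out;
}

// Elementwise sum of two equally sized rank-1 arrays.
std::vector<int> addElementwise(const std::vector<int> &a,
                                const std::vector<int> &b) {
  assert(a.size() == b.size());
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = a[i] + b[i];
  }
  return out;
}
```

For any permutation `p`, `addElementwise(applyOrder(a, p), applyOrder(b, p))` equals `applyOrder(addElementwise(a, b), p)`, because element `i` of each reordered input still comes from the same original position.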

TensorRearranger getSimplifyingRearranger(const Tensor &t) const

Get a rearranger object for simplifying the underlying representation of a tensor.

This rearranger will rearrange the tensor to simplify the underlying representation, reducing the processing time for functions such as getContiguousRegions() and getTileMapping().

The actual reordering is unspecified and depends on the underlying representation within the Poplar library. However, it can always be undone using the TensorRearranger object.

Parameters: t – The tensor to simplify.
Returns: A TensorRearranger object that can perform the rearrangement.

Tensor findUnbroadcastTensor(const Tensor &t) const

Attempt to determine the shape of a Tensor prior to it having been broadcast.

Under some circumstances this may not be possible. Failure is indicated by the returned tensor having the same shape as the input tensor.

Parameters: t – The input tensor.
Returns: A tensor which will be set to the unbroadcast (sliced from t) tensor if it is possible to do so. Each dimension of the returned tensor will be a factor of the same dimension of the input tensor. The returned tensor will have the same rank as the input tensor. If it is not possible to determine the shape of the unbroadcast tensor the input tensor will be returned.
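
The shape guarantees stated for the return value can be written as a simple check. The sketch below works on plain shape vectors, not Poplar tensors. `Shape`, `isPlausibleUnbroadcastShape` and `sameShapeAsInput` are hypothetical names introduced for this example, and the check only covers the documented shape properties, not how the library finds the unbroadcast tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Check the two shape properties stated for the returned tensor:
// same rank as the input, and each dimension is a factor of the
// corresponding input dimension.
bool isPlausibleUnbroadcastShape(const Shape &input, const Shape &result) {
  if (input.size() != result.size()) {
    return false; // rank must be preserved
  }
  for (std::size_t d = 0; d < input.size(); ++d) {
    if (result[d] == 0) {
      if (input[d] != 0) {
        return false; // zero is only acceptable for a zero-sized input dim
      }
    } else if (input[d] % result[d] != 0) {
      return false; // result[d] must divide input[d]
    }
  }
  return true;
}

// The documented failure signal: the result has the input's shape.
bool sameShapeAsInput(const Shape &input, const Shape &result) {
  return input == result;
}
```

For an input of shape `{4, 6}`, a result of shape `{1, 6}` satisfies the check, while `{3, 6}` does not because 3 is not a factor of 4.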

void serializeTensors(std::ostream &out, ArrayRef<Tensor> tensors, SerializationFormat format) const

Serialize a set of tensors to JSON or CapnProto.

The tensors must all be from this graph or an exception is thrown. The information saved is:

The type, shape and expression of the tensors.
The type and number of elements of any variables used.

This is intended to be used for debugging, testing and visualisation.

Parameters

out – Stream to write to.
tensors – A set of tensors to serialize.
format – Serialize in JSON or CapnProto format. JSON is pretty printed.

Throws

poplar_error – if any tensor is not from this graph. CapnProto may also throw an exception if serialization fails.

std::vector<Tensor> deserializeTensors(std::istream &in, SerializationFormat format)

Deserialize a set of tensors from a CapnProto message.

JSON deserialization is not currently supported and an exception will be thrown if format is SerializationFormat::JSON.

This will recreate the tensors in this graph. It throws an exception on failure (for example, if the tensor type does not match the variable types). Whenever a variable is used by a tensor a new variable is added to the graph.

The layout of the tensors and variables should be the same as when they were serialized.

This function is primarily intended for testing and benchmarks. You should not use it as a general method of creating tensors.

Parameters

in – A stream from which serialised tensor data can be read.
format – Must be SerializationFormat::Binary.

Returns

The deserialized set of tensors.

Graph createVirtualGraph(unsigned numTilesPerIPU)

Create a “virtual” graph that uses a subset of the target’s tiles.

This method returns a graph object that references the same state as this graph but has a virtual target that only uses a subset of the target’s tiles.

If the getTarget() method is called on the new graph it will return a target with the new number of tiles.

Parameters: numTilesPerIPU – The number of tiles per IPU for the new graph to use.
Returns: The virtual graph object.

Graph createVirtualGraph(unsigned lowerTile, unsigned upperTile)

Create a “virtual” graph that uses a subset of the target’s tiles.

This method returns a graph object that references the same state as this graph but has a virtual target that only uses a subset of the target’s tiles.

This variant of the method takes a tile range for the new virtual graph to use. The range is [lowerTile, upperTile). This tile range must be contained within a single IPU.

If the getTarget() method is called on the new graph it will return a target with the new number of tiles.

Parameters

lowerTile – The starting tile of the tile range for the virtual graph to use.
upperTile – The upper bound of the tile range for the virtual graph to use. This is a non-inclusive upper bound.

Returns

The virtual graph object.
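
The half-open range rule for this variant can be sketched as a standalone check. This is not Poplar code. `rangeFitsSingleIpu` and `tilesInRange` are hypothetical helpers, and the sketch assumes that tiles are numbered consecutively across IPUs, so that tile `t` belongs to IPU `t / tilesPerIpu`. Treating an empty range as invalid is a choice made for this sketch only.

```cpp
#include <cassert>

// Check that [lowerTile, upperTile) is non-empty and lies within one IPU.
// Assumption for this sketch: tiles are numbered consecutively across IPUs,
// so tile t belongs to IPU number t / tilesPerIpu.
bool rangeFitsSingleIpu(unsigned lowerTile, unsigned upperTile,
                        unsigned tilesPerIpu) {
  assert(tilesPerIpu > 0);
  if (lowerTile >= upperTile) {
    return false; // empty or inverted range (rejected in this sketch)
  }
  const unsigned firstIpu = lowerTile / tilesPerIpu;
  const unsigned lastIpu = (upperTile - 1) / tilesPerIpu; // last included tile
  return firstIpu == lastIpu;
}

// Number of tiles in the half-open range; upperTile itself is excluded.
unsigned tilesInRange(unsigned lowerTile, unsigned upperTile) {
  assert(lowerTile <= upperTile);
  return upperTile - lowerTile;
}
```

With four tiles per IPU in this model, the range `[4, 8)` covers exactly the second IPU, whereas `[2, 6)` straddles two IPUs and fails the check.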

Graph createVirtualGraph(const std::vector<unsigned> &perIpuTiles)

Create a “virtual” graph that uses a subset of the target’s tiles.

This method returns a graph object that references the same state as this graph but has a virtual target that only uses a subset of the target’s tiles.

This variant of the method takes the set of tiles in each IPU that should be included in the new graph.

If the getTarget() method is called on the new graph it will return a target with the new number of tiles.

Parameters: perIpuTiles – The tiles to include in the graph. Tiles are specified by their index in the IPU. Each tile index must be unique and less than the number of tiles per IPU.
Returns: The virtual graph object.
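
The constraints on perIpuTiles (unique indices, each less than the number of tiles per IPU) can be sketched as a pre-check on the caller's side. `isValidPerIpuTiles` is a hypothetical helper written for this example and is not part of Poplar. How the library itself reports a violation is not covered here.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Check that every tile index is below tilesPerIpu and appears only once.
bool isValidPerIpuTiles(const std::vector<unsigned> &perIpuTiles,
                        unsigned tilesPerIpu) {
  std::set<unsigned> seen;
  for (unsigned tile : perIpuTiles) {
    if (tile >= tilesPerIpu) {
      return false; // index out of range for one IPU
    }
    if (!seen.insert(tile).second) {
      return false; // duplicate index
    }
  }
  return true;
}
```

In this model, `{0, 2, 3}` is acceptable for an IPU with four tiles, while `{0, 2, 2}` (duplicate) and `{0, 4}` (out of range) are not.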

Graph getTopLevelGraph()

Return the top level graph.

The createVirtualGraph() and createReplicatedGraph() methods can be used to create graph objects that are views on an underlying graph. If this is a virtual or replicated graph then this function returns the top level underlying graph, otherwise it returns the current graph.

unsigned getReplicationFactor() const: Return the replication factor of the graph.

Tensor addReplicationIndexConstant(const DebugContext &debugContext = {}): Add a constant that is initialized with the replication index.

void serialize(std::ostream &out, SerializationFormat format) const

Serialize a graph to JSON or binary (CapnProto) format.

This is equivalent to serialize(out, {}, format).

Note that this does not currently serialize every bit of graph data, so it cannot be used to save and reload a graph.

Parameters

out – Stream to write to.
format – Serialize in JSON or CapnProto format. JSON is pretty printed.

void serialize(std::ostream &out, ArrayRef<program::Program> progs, SerializationFormat format) const

Serialize a graph to JSON or binary (CapnProto) format.

Programs can be passed so that information about Copy programs can be serialized (the Graph class itself does not know about them).

Note that this does not currently serialize every bit of graph data, so it cannot be used to save and reload a graph.

Parameters

out – Stream to write to.
progs – A set of programs that are searched for Copy programs. Information about the variables copied is serialised.
format – Serialize in JSON or CapnProto format. JSON is pretty printed.

Function addFunction(const program::Program &program)

Add a function to the graph.

A function is a partial control program that can be reused. Registering a repeated program as a function and calling it generates less control code than repeating the sequence inline.

Parameters: program – The control program to register as a callable function.
Returns: The Function object that can be used by a Call program.

unsigned convertVirtualTileToPhysicalTile(unsigned virtualTileId) const

Convert a virtual tile ID to a physical tile ID.

This provides the conversion required by the Graphcore communication library (GCL) to know which exchange-block context a tile is associated with.

Parameters: virtualTileId – A virtual tile ID.
Returns: The corresponding physical tile ID.

unsigned convertPhysicalTileToVirtualTile(unsigned physicalTileId) const

Convert a physical tile ID to a virtual tile ID.

This provides the conversion required by the Graphcore communication library (GCL) to know which exchange-block context a tile is associated with.

Parameters: physicalTileId – A physical tile ID.
Returns: The corresponding virtual tile ID.

unsigned convertPhysicalTileToVirtualTile(unsigned ipuId, unsigned physicalTileId) const

Convert a physical tile ID to a virtual tile ID.

This returns the virtual tile ID for a given pair of IPU ID and physical tile ID. This is required by the Graphcore communication library (GCL) to know which exchange-block context a tile is associated with.

Parameters

ipuId – The IPU ID.
physicalTileId – The physical tile ID.

Returns

The corresponding virtual tile ID.

bool hasCodelet(StringRef codeletName) const

Check if a graph contains a codelet with this name.

Parameters: codeletName – The name of the codelet to check for.
Returns: True if the codelet is in the graph.

void trace(ArrayRef<StringRef> name, const TraceFn &fn)

Graph(std::unique_ptr<core::GraphBuilder>, Target target)

inline core::GraphBuilder &getImpl() const

Private Functions

void setInitialValue(FieldRef field, const void *val, const TypeTraits&)

template<typename T> void setInitCallback(FieldRef field, LateInitCallback<T> callback, const TypeTraits&)

void setInitialValue(const Tensor &t, const void *val, const TypeTraits&)

void connect(FieldRef field, void *val, const TypeTraits&)

void checkFieldSubgraph(const FieldRef &f) const

void checkVertexSubgraph(const VertexRef &v) const

Private Members

std::unique_ptr<core::GraphBuilder> impl

Target target

class ConnectionDesc

Public Functions

inline ConnectionDesc(StringRef field, Tensor t)

inline ConnectionDesc(StringRef field, ArrayRef<Tensor> tsArr)

template<typename T> inline ConnectionDesc(StringRef field, T v, typename std::enable_if<TypeTraits::isSimpleType<T>()>::type* = nullptr)

Private Types

enum Kind

Values:

enumerator TensorEdge

enumerator ValueEdge

enumerator VectorTensorEdge

Private Functions

inline void connect(Graph &g, const VertexRef &v) const

Private Members

Kind kind

std::string field

std::vector<Tensor> ts

std::unique_ptr<char[]> val

TypeTraits traits

Friends

friend class Graph

namespace core

namespace program