Program

#include <poplar/Program.hpp>

namespace poplar

Poplar classes and functions.

namespace core

namespace program

Functions

bool operator==(const Program &lhs, const Program &rhs)

void dumpProgram(const Graph &graph, const Program &program, std::ostream &out): Print the resulting lowered program from input ‘program’ to ostream ‘out’.

class Abort : public poplar::program::Program 

Public Functions

Abort(const DebugContext &debugContext = {})

Throws an exception.

Parameters: debugContext – Optional DebugId and program name.

Abort(const std::string &message, const DebugContext &debugContext = {})

Reports message to the host and then throws an exception.

Parameters

message – Optional string to be reported.
debugContext – Optional DebugId and program name.

class AbortOnCondition : public poplar::program::Program 

Public Functions

AbortOnCondition(Tensor predicate, const DebugContext &debugContext = {})

Throws an exception if the predicate tensor tests to true.

Parameters

predicate – Scalar tensor to test.
debugContext – Optional DebugId and program name.

AbortOnCondition(Tensor predicate, const std::string &message, const DebugContext &debugContext = {})

If the predicate tensor tests to true, reports message to the host and then throws an exception.

Parameters

predicate – Scalar tensor to test.
message – Optional string to be reported.
debugContext – Optional DebugId and program name.

class AssumeEqualAcrossReplicas : public poplar::program::Program 

#include <Program.hpp>

A program to mark a tensor as equal across replicas.

This can be used to tell Poplar that the value of a tensor is the same in all replicas (for example, if it is the result of a cross-replica all-gather operation). Poplar will use this information while checking for divergence in the control flow. This allows it to accept programs that it would otherwise reject because of lack of knowledge of tensor values.

Public Functions

AssumeEqualAcrossReplicas(Tensor t, const DebugContext &debugContext = {})

class Block : public poplar::program::Program 

#include <Program.hpp>

A program to scope another program.

This can be used together with Engine option profiler.blocks.filter to profile the cycle length of each execution of the scoped program.

Public Functions

Block(const Program &p, const DebugContext &debugContext = {})

class Call : public poplar::program::Program 

#include <Program.hpp>

A program to perform a function call to a previously stored program.

Public Functions

Call(Function f, const DebugContext &debugContext = {})

Call the function.

Parameters

f – A program that has been added to the graph using Graph::addFunction.
debugContext – Optional DebugId and program name.

Call(HostFunction f, ArrayRef<Tensor> inputs, ArrayRef<Tensor> outputs, const DebugContext &debugContext = {})

Call a function in the host.

Parameters

f – A host function that has been added to the graph using Graph::addHostFunction.
inputs – Array of tensors containing the host function’s inputs.
outputs – Array of tensors that receive the host function’s output arguments.
debugContext – Optional DebugId and program name.

class Copy : public poplar::program::Program 

#include <Program.hpp>

A program that copies data.

Public Functions

Copy(Tensor src, Tensor dst, bool dontOutline = false, const DebugContext &debugContext = {})

Construct a program to copy data from one tensor to another.

This constructor creates a program that will copy data from the src tensor to the dst tensor.

Parameters

src – The tensor to copy from.
dst – The tensor to copy to.
dontOutline – Do not outline this copy as a function call. Default is false (the copy will be outlined).
debugContext – Optional DebugId and program name.

Copy(const DataStream &stream, Tensor dst, bool optimiseMemory = false, const DebugContext &debugContext = {})

Construct a program to copy from a data stream to a tensor.

See also

See the Poplar User Guide for more information about prefetching.

Parameters

stream – The stream to copy from.
dst – The tensor to copy to.
optimiseMemory – If set to true, Poplar will sacrifice speed to reduce memory use. For example, it may rearrange data on the host and outline writes. Setting this will disable prefetch.
debugContext – Optional DebugId and program name.

Copy(Tensor src, const DataStream &stream, bool optimiseMemory = false, const DebugContext &debugContext = {})

Construct a program to copy a tensor to a data stream.

See also

See the Poplar User Guide for more information about prefetching.

Parameters

src – The tensor to copy from.
stream – The stream to copy to.
optimiseMemory – If set to true, Poplar will sacrifice speed to reduce memory use. For example, it may rearrange data on the host and outline writes. Setting this will disable prefetch.
debugContext – Optional DebugId and program name.

Copy(const RemoteBuffer &buffer, Tensor dst, const DebugContext &debugContext = {})

Construct a program to copy a remote buffer to a tensor.

Parameters

buffer – The remote buffer to copy from.
dst – The tensor to copy to.
debugContext – Optional DebugId and program name.

Copy(const RemoteBuffer &buffer, Tensor dst, Tensor offset, const DebugContext &debugContext = {})

Construct a program to copy a remote buffer to a tensor.

The data to be transferred is controlled by the definition of the buffer and the offset parameter.

The buffer has repeat data-transfer “rows”, each containing numElements data items. (These are not necessarily the same as rows in the destination tensor.) The size of offset defines the number of rows to copy, and each element of offset is the index of a row to be copied.

The size of dst must be equal to the data transfer size: sizeof(offset) * numElements.

If the offset tensor has more than one element then the dst must be a rank 2 tensor with dimensions [offset.numElements(), remoteBuffer.numElements()].

Multiple values in the offset tensor with the same value will result in undefined behaviour because the order of writes to the buffer is not guaranteed.

See also

Graph::addRemoteBuffer()

Parameters

buffer – The remote buffer to copy from.
dst – The tensor to copy to.
offset – The “rows” in the remote buffer to copy from.
debugContext – Optional DebugId and program name.
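
The row selection described above can be modelled on the host in plain C++. The sketch below is not Poplar code: gatherRows is a hypothetical helper, the remote buffer is represented as a flat std::vector<float>, and numElements is the row length. It only illustrates how each element of offset selects one row, and why the destination ends up with the shape [offset.numElements(), numElements].

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-side model (not part of Poplar): the buffer is a flat
// array of rows, each row holding numElements items.
std::vector<std::vector<float>> gatherRows(const std::vector<float> &buffer,
                                           std::size_t numElements,
                                           const std::vector<unsigned> &offset) {
  if (numElements == 0 || buffer.size() % numElements != 0)
    throw std::invalid_argument("buffer size must be a multiple of numElements");
  const std::size_t numRows = buffer.size() / numElements;

  std::vector<std::vector<float>> dst;
  dst.reserve(offset.size());
  for (unsigned row : offset) {
    if (row >= numRows)
      throw std::out_of_range("row index outside the buffer");
    // Copy one whole row; the destination row order follows the offset order.
    const auto first =
        buffer.begin() + static_cast<std::ptrdiff_t>(row * numElements);
    dst.emplace_back(first, first + static_cast<std::ptrdiff_t>(numElements));
  }
  return dst;
}
```

With a three-row buffer and offset {2, 0}, the model returns two rows: the buffer's row 2 followed by its row 0.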

Copy(Tensor src, const RemoteBuffer &buffer, const DebugContext &debugContext = {})

Construct a program to copy a tensor to a remote buffer.

Parameters

src – The tensor to copy from.
buffer – The remote buffer to copy to.
debugContext – Optional DebugId and program name.

Copy(Tensor src, const RemoteBuffer &buffer, Tensor offset, const DebugContext &debugContext = {})

Construct a program to copy a tensor to a remote buffer.

The data that is transferred is controlled by the definition of the buffer and the offset parameter.

The buffer has repeat data-transfer “rows”, each containing numElements data items. (These are not necessarily the same as rows in the source tensor.) The rows to be copied are defined by offset. The size of offset defines the number of rows to copy. Each element of offset is the index of a row to be copied.

The size of src must be equal to the data transfer size: sizeof(offset) * numElements.

If the offset tensor has more than one element then the src must be a rank 2 tensor with dimensions [offset.numElements(), remoteBuffer.numElements()].

Multiple values in the offset tensor with the same value will result in undefined behaviour.

See also

Graph::addRemoteBuffer()

Parameters

src – The tensor to copy from.
buffer – The remote buffer to copy to.
offset – The “rows” in the remote buffer to copy to.
debugContext – Optional DebugId and program name.

Copy(const FunctionBuffer &buffer, const Function &function, const DebugContext &debugContext = {})

Construct a program to copy the contents of a FunctionBuffer to a Function.

Note that there is no Copy program for the inverse direction (Function to FunctionBuffer) because no mutable state is stored in a FunctionBuffer.

See also

Graph::addFunctionBuffer()

Parameters

buffer – The FunctionBuffer to copy from.
function – The Function to copy to.
debugContext – Optional DebugId and program name.

Private Functions

Copy(const DataStream &stream, Tensor dst, bool rearrangeOnHost, Tensor offset, size_t repeats, bool optimiseMemory, const OptionFlags &options = {}, const DebugContext &debugContext = {})

Copy(Tensor src, const DataStream &stream, bool rearrangeOnHost, Tensor offset, size_t repeats, bool optimiseMemory, const OptionFlags &options = {}, const DebugContext &debugContext = {})

class CrossReplicaCopy : public poplar::program::Program 

#include <Program.hpp>

A program that copies tensors between replicated sub-graphs.

Public Functions

CrossReplicaCopy(Tensor src, Tensor dst, std::map<unsigned, unsigned> replicaMap, const DebugContext &debugContext = {})

Constructor to create a program to copy a tensor to the equivalent tensor in a different replica sub-graph.

When the replicated graphs are created, this will create a Copy program in each replica. Each replica sends to exactly one other replica and receives from exactly one other replica. A replica may not copy to itself.

Parameters

src – Replicated tensor to copy from.
dst – Replicated tensor to copy to.
replicaMap – Each key in this map specifies a source replica from which to copy. The corresponding value specifies the replica to copy to. The map describes how to copy the replicated tensor src to dst for each replica. For example, given a replicationFactor of 4, a clockwise ring may be constructed with the following replica map: {{0, 2}, {2, 3}, {3, 1}, {1, 0}}. Each replica must be represented once as a key (source) and once as a value (destination).
debugContext – Optional DebugId and program name.
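
The constraints on replicaMap can be checked on the host before building the program. The following is a minimal sketch, not part of the Poplar API: isValidReplicaMap is a hypothetical helper that tests the rules stated above (every replica appears once as a source and once as a destination, and no replica copies to itself).

```cpp
#include <map>
#include <set>

// Hypothetical checker (not part of Poplar) for the replicaMap constraints.
bool isValidReplicaMap(const std::map<unsigned, unsigned> &replicaMap,
                       unsigned replicationFactor) {
  // std::map keys are unique, so a size match plus the range check below
  // means every replica appears exactly once as a source.
  if (replicaMap.size() != replicationFactor)
    return false;

  std::set<unsigned> destinations;
  for (const auto &entry : replicaMap) {
    const unsigned src = entry.first;
    const unsigned dst = entry.second;
    if (src >= replicationFactor || dst >= replicationFactor)
      return false;  // replica index out of range
    if (src == dst)
      return false;  // a replica may not copy to itself
    if (!destinations.insert(dst).second)
      return false;  // two sources target the same destination
  }
  return true;
}
```

The ring {{0, 2}, {2, 3}, {3, 1}, {1, 0}} passes this check for a replication factor of 4. A map containing {0, 0}, or one that names the same destination twice, does not.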

class ErrorProgram : public poplar::program::Program 

Public Functions

ErrorProgram(StringRef message, Tensor debugTensor, const DebugContext &debugContext = {})

Throw an error.

Prints out a message and then throws an error.

Parameters

message – String to print.
debugTensor – Tensor that will be printed after the message to aid debugging.
debugContext – Optional DebugId and program name.

class Execute : public poplar::program::Program 

#include <Program.hpp>

Program that executes a compute set in the graph.

Public Functions

explicit Execute(ComputeSet cs, const DebugContext &debugContext = {})

Construct a graph execution program.

Parameters

cs – The compute set to execute.
debugContext – Optional DebugId and program name.

Execute(ComputeSet cs, Tensor t, const DebugContext &debugContext = {})

Construct a graph execution program and write the exit status to a scalar tensor.

The exit status is the logical AND of the values returned by the compute() method of all the vertices in the compute set.

Deprecated: Use vertices to explicitly write to a result Tensor instead.

See also

Vertex and MultiVertex

Parameters

cs – The compute set to execute.
t – The tensor to write the exit status to.
debugContext – Optional DebugId and program name.

class If : public poplar::program::Program 

#include <Program.hpp>

A program that runs one of two programs depending on the value of a scalar tensor.

Public Functions

If(Tensor predicate, const Program &trueBody, const Program &falseBody, const DebugContext &debugContext = {})

A program that executes trueBody or falseBody depending on the value of predicate.

You can pass an empty Sequence to either trueBody or falseBody if you don’t want that branch to do anything. Any non-zero value of the predicate is treated as true.

Parameters

predicate – The scalar tensor (of type BOOL, UNSIGNED_INT, INT, SHORT or UNSIGNED_SHORT) that determines which branch to execute.
trueBody – This program is run if the predicate is true.
falseBody – This program is run if the predicate is false.
debugContext – Optional DebugId and program name.

class Loop : public poplar::program::Program 

#include <Program.hpp>

A program that executes for an indefinite number of iterations.

Public Functions

Loop(const Program &prog, const DebugContext &debugContext = {})

Construct a program which repeats indefinitely.

Parameters

prog – The program to repeatedly execute.
debugContext – Optional DebugId and program name.

class PrintTensor : public poplar::program::Program 

Public Functions

PrintTensor(Tensor t, const PrintTensorFmt &fmt, const DebugContext &debugContext = {})

Print the contents of a tensor.

You can send the output to a different stream by using the Engine::setPrintTensorStream function.

Parameters

t – The tensor to print.
fmt – The print format.
debugContext – Optional DebugId and program name.

PrintTensor(StringRef title, Tensor t, const PrintTensorFmt &fmt, const DebugContext &debugContext = {})

Print the name and contents of a tensor.

Parameters

title – The name of the tensor.
t – The tensor to print.
fmt – The print format.
debugContext – Optional DebugId and program name.

PrintTensor(Tensor t, const DebugContext &debugContext = {})

Print the contents of a tensor.

Parameters

t – The tensor to print.
debugContext – Optional DebugId and program name.

PrintTensor(StringRef title, Tensor t, const DebugContext &debugContext = {})

Print the name and contents of a tensor.

Parameters

title – The name of the tensor.
t – The tensor to print.
debugContext – Optional DebugId and program name.

class Program

#include <Program.hpp>

This class represents a control program that executes operations on the graph.

The class should not be explicitly constructed but one of its sub-classes should be constructed instead.

Subclassed by poplar::program::Abort, poplar::program::AbortOnCondition, poplar::program::AssumeEqualAcrossReplicas, poplar::program::Block, poplar::program::Call, poplar::program::Copy, poplar::program::CrossReplicaCopy, poplar::program::ErrorProgram, poplar::program::Execute, poplar::program::If, poplar::program::Loop, poplar::program::PrintTensor, poplar::program::Repeat, poplar::program::RepeatWhileFalse, poplar::program::RepeatWhileTrue, poplar::program::Sequence, poplar::program::Switch, poplar::program::Sync, poplar::program::WriteUndef

Public Functions

Program()

Program(const Program &p)

Program(Program &&p) noexcept

Program &operator=(const Program &p)

Program &operator=(Program &&p) noexcept

virtual ~Program()

inline bool isEmpty() const

inline core::ProgramImpl &getImpl()

inline const core::ProgramImpl &getImpl() const

Protected Attributes

std::unique_ptr<core::ProgramImpl> impl

Friends

friend bool operator==(const Program &lhs, const Program &rhs)

class Repeat : public poplar::program::Program 

#include <Program.hpp>

A program that repeatedly executes for a fixed number of iterations.

For more flexible loop operations see the PopLibs functions popops::countedLoop() and popops::countedForLoop().

Public Functions

Repeat(unsigned count, const Program &prog, const DebugContext &debugContext = {})

Construct a repeat program.

Parameters

count – The number of iterations to repeat for.
prog – The program to repeatedly execute.
debugContext – Optional DebugId and program name.

class RepeatWhileFalse : public poplar::program::Program 

#include <Program.hpp>

A program that executes a program repeatedly while a condition is false.

The program starts by executing the condition program, cond, which should set the value of predicate. If predicate is true, then the loop exits. If predicate is false then the body program is executed and then it loops to execute cond program again.

This is like a C while statement with an inverted condition.
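
The order of evaluation can be modelled in ordinary C++. This is a host-side sketch under stated assumptions, not Poplar code: repeatWhileFalse is a hypothetical function, the programs are represented as std::function callbacks, and the predicate tensor is represented as an int that cond updates.

```cpp
#include <functional>

// Hypothetical host-side model (not Poplar code) of the control flow:
// run cond, test the predicate, and run body only while the predicate is false.
void repeatWhileFalse(const std::function<void()> &cond, const int &predicate,
                      const std::function<void()> &body) {
  while (true) {
    cond();              // cond is expected to update predicate
    if (predicate != 0)  // any non-zero value counts as true
      break;
    body();
  }
}
```

Note that cond always runs one more time than body, because the loop exits only after cond has set the predicate to a non-zero value.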

Public Functions

RepeatWhileFalse(const Program &cond, Tensor predicate, const Program &body, const DebugContext &debugContext = {})

Construct a repeat-while-false program.

Parameters

cond – The program executed before predicate is evaluated. The normal use case is that this will set the value of predicate.
predicate – The scalar tensor (of type BOOL, UNSIGNED_INT, INT, SHORT or UNSIGNED_SHORT) that determines whether to execute body. Any non-zero value of the predicate is treated as true.
body – The body to execute when predicate is false.
debugContext – Optional DebugId and program name.

class RepeatWhileTrue : public poplar::program::Program 

#include <Program.hpp>

A program that executes a program repeatedly while a condition is true.

The program starts by executing the condition program, cond, which should set the value of predicate. If predicate is false, then the loop exits. If predicate is true then the body program is executed, and then it loops to execute cond program again.

This is like a C while statement.

Public Functions

RepeatWhileTrue(const Program &cond, Tensor predicate, const Program &body, const DebugContext &debugContext = {})

Construct a repeat-while-true program.

Parameters

cond – The program executed before predicate is evaluated. The normal use case is that this will set the value of predicate.
predicate – The scalar tensor (of type BOOL, UNSIGNED_INT, INT, SHORT or UNSIGNED_SHORT) that determines whether to execute body. Any non-zero value of the predicate is treated as true.
body – The body to execute when predicate is true.
debugContext – Optional DebugId and program name.

class Sequence : public poplar::program::Program 

#include <Program.hpp>

Program that executes a sequence of programs.

Public Functions

inline Sequence(const DebugContext &debugContext = {}): Construct an empty execution sequence (with optional debug context).

inline Sequence(std::initializer_list<Program> programs, const DebugContext &debugContext = {})

Construct an execution sequence from a list of programs.

This constructor is used to create a sequence of programs where the programs are provided as arguments to the constructor.

Sequence{prog1, prog2, prog3}
Sequence({prog1, prog2, prog3}, {debugId})
Sequence({prog1, prog2, prog3}, {debugId, "debugName"})

Parameters

programs – List of programs in the sequence.
debugContext – Optional DebugId and program name.

void add(const Program &p)

Add a program to the end of the sequence.

Parameters: p – The program to add.

Private Functions

template<class ...T> inline void add_many(const Program &first, T&&... rest)

inline void add_many()

void init()

void init(const DebugContext &debugContext)

class Switch : public poplar::program::Program 

#include <Program.hpp>

A program that runs one of many programs depending on the value of a tensor.

The controlling tensor must be a scalar of type INT or UNSIGNED_INT.

A switch consists of a number of switch cases, each with a case value and a case body, and a default case. The case values must be unique. If the value of the controlling tensor matches the case value of a case, the corresponding case body is run; otherwise the default case is run.
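
The dispatch rule can be sketched as a small host-side model. This is not Poplar code: runSwitch and Body are hypothetical names, case bodies are represented as std::function callbacks, and the controlling tensor is represented as a plain std::int32_t.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Body = std::function<void()>;

// Hypothetical host-side model (not Poplar code): run the body whose case
// value equals control, otherwise run the default body.
void runSwitch(std::int32_t control,
               const std::vector<std::pair<std::int32_t, Body>> &cases,
               const Body &defaultBody) {
  for (const auto &c : cases) {
    if (c.first == control) {
      c.second();
      return;  // case values are unique, so at most one body runs
    }
  }
  defaultBody();
}
```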

Public Functions

Switch(Tensor control, const std::vector<std::pair<std::int32_t, Program>> &cases, const DebugContext &debugContext = {})

Construct a switch with the specified set of cases and an empty default case.

Parameters

control – The controlling tensor.
cases – The cases of the switch: value and program to run.
debugContext – Optional DebugId and program name.

Switch(Tensor control, const std::vector<std::pair<std::int32_t, Program>> &cases, const Program &defaultCaseBody, const DebugContext &debugContext = {})

Construct a switch with the specified set of cases and default case.

Parameters

control – The controlling tensor.
cases – The cases of the switch: value and program to run.
defaultCaseBody – The body of the default case.
debugContext – Optional DebugId and program name.

Switch(Tensor control, const DebugContext &debugContext = {})

Construct a switch with no cases and an empty default case.

The add() method can be used to add cases after the switch is constructed.

Parameters

control – The controlling tensor.
debugContext – Optional DebugId and program name.

Switch(Tensor control, const Program &defaultCaseBody, const DebugContext &debugContext = {})

Construct a switch with no cases and the specified default case.

The add() method can be used to add cases after the switch is constructed.

Parameters

control – The controlling tensor.
defaultCaseBody – The body of the default case.
debugContext – Optional DebugId and program name.

Switch &add(std::int32_t value, const Program &body)

Add a case with the specified case value and body.

Parameters

value – The case value.
body – The case body.

Returns

A reference to the switch program.

Public Static Functions

static Switch switchWithBoundsChecking(Tensor control, const std::vector<std::pair<std::int32_t, Program>> &cases, const DebugContext &debugContext = {}): A helper function that causes the default case to throw an error.

static Switch switchWithUnreachableDefault(Tensor control, const DebugContext &debugContext = {})

This function lets the compiler assume the default case is unreachable.

If the control value is something other than one of the cases, it results in undefined behaviour (although there is some very minimal error checking at runtime).

Private Functions

Switch(Tensor control, const Program &defaultCaseBody, const bool unreachableDefault, const DebugContext &debugContext = {})

class Sync : public poplar::program::Program 

#include <Program.hpp>

A program to synchronise at a certain granularity dictated by the SyncType.

Public Functions

Sync(SyncType type, const DebugContext &debugContext = {})

Parameters

type – The type of sync to perform.
debugContext – Optional DebugId and program name.

class WriteUndef : public poplar::program::Program 

#include <Program.hpp>

A program to mark a tensor as containing an undefined value.

This can be used to improve the liveness analysis of tensors and save memory in some situations.

Poplar does liveness analysis using the standard algorithm, except that Poplar’s variables are not scalar values; they are arrays (tensors). In the standard analysis, a variable is “killed” when it is written to with a new value. This means that it is dead immediately before that point because its value there can never be read.

int a = 1;
// a is dead here because its current value (1) can never be read.
a = 2; // a is killed here, which makes it dead on the line above.

In Poplar, a variable is killed when all of its elements are written in the same compute set. Consider the pseudo-code:

var = graph.addVariable(FLOAT, {2}, ...);

seq.add(Execute( var[0] = 1, var[1] = 2 ));
// var is dead here (it is killed on the line below) because none of its
// element values (1, 2) can ever be read.
seq.add(Execute( var[0] = 3, var[1] = 4 ));

If only some of the elements are written then the entire variable is still live before the write because we may still need the values of the elements that were not written to.

seq.add(Execute( var[0] = 1, var[1] = 2 ));
// var is alive here because the value 2 might be read later.
seq.add(Execute( var[0] = 3 ));

var is still alive because no compute set writes to every element. If the entire variable is overwritten but in separate compute sets, then it will still be considered to be live because Poplar does not track the liveness of each variable element, only the entire variable.

seq.add(Execute( var[0] = 1, var[1] = 2 ));
// var is alive here even though 1 and 2 can never be read.
seq.add(Execute( var[0] = 3 ));
seq.add(Execute( var[1] = 4 ));

This means var is alive more than necessary which may lead to increased memory use. One solution is for Poplar to track the liveness of every variable element separately, but that would be prohibitively expensive.

Instead, this program provides a way to manually mark a tensor as being dead by writing an undefined value to it. Changing the above code to the following results in the correct liveness.

seq.add(Execute( var[0] = 1, var[1] = 2 ));
// Manually kill var because we know - even if Poplar does not - that
// it is about to be completely overwritten.
seq.add(WriteUndef(var));
seq.add(Execute( var[0] = 3 ));
seq.add(Execute( var[1] = 4 ));

For more information about liveness analysis see https://en.wikipedia.org/wiki/Live_variable_analysis and https://www.cl.cam.ac.uk/teaching/2006/OptComp/slides/lecture03.pdf
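
The whole-variable rule described above can be reproduced with a small backward pass. The sketch below is a simplified model, not Poplar's implementation: Step and liveBefore are hypothetical names, only one variable is tracked, and each step records which elements a compute set writes, whether it reads the variable, and whether it is a WriteUndef.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical model (not Poplar code) of one step acting on one variable.
struct Step {
  std::set<std::size_t> writes;  // elements written by this compute set
  bool reads;                    // true if the step reads any element
  bool writeUndef;               // true if the step is a WriteUndef
};

// Returns result[i]: whether the variable is live just before step i.
// liveAtEnd says whether the variable is read after the last step.
std::vector<bool> liveBefore(const std::vector<Step> &steps,
                             std::size_t numElements, bool liveAtEnd) {
  std::vector<bool> result(steps.size());
  bool live = liveAtEnd;
  for (std::size_t i = steps.size(); i-- > 0;) {
    const Step &s = steps[i];
    // The variable is killed only by a WriteUndef or by a single compute
    // set that writes every element.
    const bool kills = s.writeUndef || s.writes.size() == numElements;
    live = s.reads || (live && !kills);
    result[i] = live;
  }
  return result;
}
```

In this model, a full write followed by two partial writes leaves the variable live between the first and second steps. Inserting a WriteUndef step after the full write makes it dead at that point, which matches the behaviour described in the text.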

Public Functions

WriteUndef(Tensor t, const DebugContext &debugContext = {})

Parameters

t – The tensor to mark as undefined.
debugContext – Optional DebugId and program name.

#include <poplar/PrintTensor.hpp>

namespace poplar

Poplar classes and functions.

Functions

bool operator==(const PrintTensorFmt &lhs, const PrintTensorFmt &rhs)

class PrintTensorFmt

#include <PrintTensor.hpp>

PrintTensorFmt specifies how the print output of PrintTensor should be formatted.

Public Types

enum class FloatFormat

Floating point format to use when printing a tensor.

Values:

enumerator Auto = 0

enumerator Fixed = 1

enumerator Scientific = 2

enumerator None = 3

Public Functions

PrintTensorFmt(unsigned summariseThreshold = 1000, unsigned edgeItems = 3, unsigned maxLineWidth = 75, unsigned digits = 8, FloatFormat floatFormat = FloatFormat::Auto, char separator = ' ', char openBracket = '[', char closeBracket = ']')

PrintTensorFmt specifies how the print output of PrintTensor should be formatted.

The default output format will split large lines, print all elements in the same format, pad elements so that they align and summarise large tensors.

You can use the static function disableFormatting() to disable all types of formatting.

Parameters

summariseThreshold – (default 1000) If the number of elements of the tensor exceeds this threshold the output will be summarised. Only the edge elements will be displayed with an ellipsis indicating skipped elements. A value of 0 will disable summarisation.
edgeItems – (default 3) number of edge elements to include at the beginning and end when summarisation is enabled.
maxLineWidth – (default 75) lines longer than this limit will be split across multiple lines. A value of 0 will disable line splitting.
digits – (default 8) number of digits to display. For integers this limit can be exceeded if any number is large enough. For floating-point values this does not include the exponent. The number of digits is used in conjunction with analysis of the tensor to determine the width of each element, so that all elements are aligned when printed. A value of 0 disables this analysis and each element will be printed in an unaligned format.
floatFormat – (default Auto) determines the floating point format to use. Automatic mode determines the appropriate format based on the data. If digits==0 this option is disregarded and the floatFormat is set to None.
separator – (default space) character used to delineate values.
openBracket – (default square bracket) character used to open a tensor.
closeBracket – (default square bracket) character used to close a tensor.
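
The interaction between summariseThreshold and edgeItems can be illustrated with a host-side model for a one-dimensional tensor. This is a sketch under stated assumptions, not Poplar code: summarise is a hypothetical function, and the exact text that PrintTensor produces (padding, alignment, line splitting) may differ. Only the selection of edge elements is shown.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical model (not Poplar code) of summarisation for a 1-D tensor.
std::string summarise(const std::vector<int> &values,
                      std::size_t summariseThreshold = 1000,
                      std::size_t edgeItems = 3, char separator = ' ',
                      char openBracket = '[', char closeBracket = ']') {
  const std::size_t n = values.size();
  // A threshold of 0 disables summarisation.
  const bool summarised = summariseThreshold != 0 && n > summariseThreshold &&
                          n > 2 * edgeItems;
  std::ostringstream out;
  out << openBracket;
  for (std::size_t i = 0; i < n; ++i) {
    if (summarised && i == edgeItems) {
      if (i != 0)
        out << separator;
      out << "...";       // ellipsis stands in for the skipped elements
      i = n - edgeItems;  // jump to the trailing edge elements
      if (i == n)
        break;            // edgeItems == 0: nothing left to print
    }
    if (i != 0)
      out << separator;
    out << values[i];
  }
  out << closeBracket;
  return out.str();
}
```

For ten values 0 to 9 with a threshold of 5 and two edge items, the model yields "[0 1 ... 8 9]". With a threshold of 0, or with the default threshold of 1000, all ten values are printed.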

~PrintTensorFmt()

PrintTensorFmt(const PrintTensorFmt &other) noexcept

PrintTensorFmt(PrintTensorFmt &&other) noexcept

PrintTensorFmt &operator=(PrintTensorFmt &&other) noexcept

explicit PrintTensorFmt(std::unique_ptr<core::PrintTensorFmt> impl)

inline const core::PrintTensorFmt &getImpl() const

Public Static Functions

static PrintTensorFmt disableFormatting()

PrintTensorFmt specifies how the output of PrintTensor should be formatted.

This static function returns a format that disables all types of formatting.

Private Members

std::unique_ptr<core::PrintTensorFmt> impl

namespace core
