4. PopLibs API reference¶
The PopLibs libraries provide application-level functions that can be used in Poplar programs for the IPU.
| Library | Depends on | Description |
|---|---|---|
| poplin | | Linear algebra functions (matrix multiplications, convolutions) |
| popnn | | Functions used in neural networks (for example, non-linearities, pooling and loss functions) |
| popops | | Operations on tensors in control programs (elementwise functions and reductions) |
| poprand | | Functions for populating tensors with random numbers |
| popsparse | | Functions for operating on sparse tensors |
| poputil | | General utility functions for building graphs |
4.1. Utility functions (poputil)¶
General utility functions for building graphs.
4.1.1. poputil/Broadcast.hpp¶
-
namespace
poputil
Functions
-
void
expandToMatchRanks
(poplar::Tensor &a, poplar::Tensor &b) Insert singleton dimensions into either of two tensors such that their ranks match following numpy style expansion rules.
The tensor with the lower rank has singleton dimensions inserted as outer-most dimensions.
- Parameters
- a: First tensor to match.
- b: Second tensor to match.
-
void
broadcastToMatch
(poplar::Tensor &a, const std::vector<std::size_t> &shape) Match dimensions of a tensor to a shape by broadcasting using numpy style broadcast rules:
1) If the rank of the tensor is less than the rank of the required shape, extend its dimensions to the left with dimensions of size 1 to match the rank of the required shape.
2) For each dimension, the size of the dimension in the tensor must either be the same as the required shape or must be 1. Where it is of size 1, the tensor is broadcast in that dimension to match the shape. If neither of these conditions holds, an exception is thrown.
- Parameters
- a: The tensor to broadcast to match the shape. This will be updated in place with broadcast dimensions.
- shape: The shape to match.
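The broadcasting rules above can be made concrete with a small host-side sketch. This is an illustrative reimplementation of the shape arithmetic only, not the poputil source; `broadcastShapeToMatch` is a hypothetical name, and the real function updates a poplar::Tensor in place rather than returning a shape.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical sketch of numpy-style broadcast-to-shape: compute the shape
// a tensor of shape `a` would have after broadcasting to `shape`.
std::vector<std::size_t>
broadcastShapeToMatch(std::vector<std::size_t> a,
                      const std::vector<std::size_t> &shape) {
  if (a.size() > shape.size())
    throw std::runtime_error("Cannot broadcast to a lower rank");
  // Rule 1: pad on the left with singleton dimensions to match rank.
  a.insert(a.begin(), shape.size() - a.size(), 1);
  for (std::size_t d = 0; d < shape.size(); ++d) {
    if (a[d] == shape[d])
      continue;
    if (a[d] == 1)
      a[d] = shape[d]; // Rule 2: broadcast a size-1 dimension.
    else
      throw std::runtime_error("Dimensions cannot be matched");
  }
  return a;
}
```

For example, shape `{3, 1}` broadcast against `{2, 3, 4}` is first padded to `{1, 3, 1}` and then expanded to `{2, 3, 4}`.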
-
void
broadcastToMatch
(poplar::Tensor &a, poplar::Tensor &b) Match dimensions of two tensors by broadcasting using numpy style broadcast rules:
1) If the rank of one tensor is less than the other's, extend its dimensions to the left with dimensions of size 1.
2) For each dimension, the size of the dimension in both tensors must be the same, or one of them must be 1. Where one is of size 1, that tensor is broadcast in that dimension to match the other. If neither of these conditions holds, an exception is thrown.
- Parameters
- a: First tensor to match. This will be updated in place with broadcast dimensions.
- b: Second tensor to match. This will be updated in place with broadcast dimensions.
-
void
broadcastToMatch
(poplar::Tensor &a, poplar::Tensor &b, poplar::Tensor &c) Match dimensions of three tensors by broadcasting using numpy style broadcast rules:
1) If the rank of one tensor is less than the others', extend its dimensions to the left with dimensions of size 1.
2) For each dimension, the size of the dimension in the tensors must be the same, or all but one must be 1. Where one is of size 1, that tensor is broadcast in that dimension to match the others. If neither of these conditions holds, an exception is thrown.
- Parameters
- a: First tensor to match. This will be updated in place with broadcast dimensions.
- b: Second tensor to match. This will be updated in place with broadcast dimensions.
- c: Third tensor to match. This will be updated in place with broadcast dimensions.
-
bool
canBroadcastToMatch
(const poplar::Tensor &a, const poplar::Tensor &b) Test if the given tensors can be broadcast to match one another using the rules for broadcastToMatch.
- Return
True if the two tensors may be broadcast to match one another and false if they do not match following the broadcastToMatch broadcast rules.
- Parameters
- a: First tensor to match.
- b: Second tensor to match.
4.1.2. poputil/GraphFunction.hpp¶
-
namespace
poputil
-
namespace
graphfn
¶ -
Enums
Functions
-
struct
ArgSig
¶
-
class
ProgramFunction
¶ Public Functions
Private Members
-
VoidFunction
voidFunc
¶
4.1.3. poputil/Loop.hpp¶
-
namespace
poputil
Typedefs
4.1.4. poputil/TileMapping.hpp¶
-
namespace
poputil
Functions
-
std::vector<std::vector<poplar::Interval>>
calcLinearTileMapping
(const poplar::Graph &graph, std::vector<std::size_t> shape, unsigned minElementsPerTile, unsigned grainSize)¶ Calculate a tile mapping that spreads the tensor evenly over the tiles in a linear manner (i.e.
with the indices of the flattened tensor mapped across from low -> high tile numbers).
-
std::vector<std::vector<poplar::Interval>>
calcLinearTileMapping
(const poplar::Graph &graph, const poplar::Tensor &t)¶ Calculate a tile mapping that spreads the tensor evenly over the tiles in a linear manner (i.e.
with the indices of the flattened tensor mapped across from low -> high tile numbers).
In this case the elements are split so as not to break up vectors of elements matching the device's natural vector widths, and to try to keep at least 128 bytes of data on each tile to avoid high exchange costs.
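The linear-mapping idea can be sketched on the host. This is an illustrative, simplified version under the assumption that the mapping assigns whole grains from low to high tile numbers; `linearMapping` is a hypothetical name, and the real calcLinearTileMapping takes a poplar::Graph and returns poplar::Interval objects, with extra handling for minimum elements per tile.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch: spread numElements over numTiles from low to high tile numbers,
// in whole multiples of grainSize. Each entry is a [begin, end) interval
// of the flattened tensor; unused tiles keep the empty interval {0, 0}.
std::vector<std::pair<std::size_t, std::size_t>>
linearMapping(std::size_t numElements, unsigned numTiles,
              unsigned grainSize) {
  std::vector<std::pair<std::size_t, std::size_t>> mapping(numTiles);
  std::size_t grains = (numElements + grainSize - 1) / grainSize;
  std::size_t grainsPerTile = (grains + numTiles - 1) / numTiles;
  std::size_t begin = 0;
  for (unsigned tile = 0; tile < numTiles && begin < numElements; ++tile) {
    std::size_t end =
        std::min(numElements, begin + grainsPerTile * grainSize);
    mapping[tile] = {begin, end};
    begin = end;
  }
  return mapping;
}
```

With 10 elements, 4 tiles and a grain size of 4, the first three tiles receive intervals [0, 4), [4, 8) and [8, 10), and the last tile receives nothing.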
-
void
mapTensorLinearly
(poplar::Graph &graph, const poplar::Tensor &t, unsigned minElementsPerTile, unsigned grainSize)¶
-
unsigned
getTileImbalance
(const poplar::Graph::TileToTensorMapping &mapping, unsigned minElementsPerTile = 0, unsigned grainSize = 1)¶ Determine how unbalanced a tensor's mapping over tiles is.
- Return
The maximum number of elements above the expected number on any tile.
- Parameters
- mapping: The tile mapping of the tensor.
- minElementsPerTile: The expected minimum number of elements per tile.
- grainSize: The expected grain size, that is, the atomic grain used to split elements over tiles.
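The imbalance measure can be illustrated with a host-side sketch. This is an assumption-laden reimplementation, not the poputil source: it takes a per-tile element count rather than a tile mapping, and defines the expected load as an even, grain-aligned share of the total.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the imbalance measure: how far the most heavily
// loaded tile is above an even, grain-aligned share of the elements.
unsigned tileImbalance(const std::vector<std::size_t> &elemsPerTile,
                       unsigned grainSize = 1) {
  std::size_t total = 0;
  for (auto e : elemsPerTile)
    total += e;
  std::size_t numTiles = elemsPerTile.size();
  // Even share per tile, rounded up to whole grains.
  std::size_t grains = (total + grainSize - 1) / grainSize;
  std::size_t expected = ((grains + numTiles - 1) / numTiles) * grainSize;
  std::size_t maxOnTile =
      *std::max_element(elemsPerTile.begin(), elemsPerTile.end());
  return maxOnTile > expected ? unsigned(maxOnTile - expected) : 0u;
}
```

For example, 12 elements spread over 4 tiles as {10, 2, 0, 0} has an expected share of 3 per tile, so the imbalance is 7.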
-
unsigned
getTileImbalance
(const poplar::Graph &graph, const poplar::Tensor &t, unsigned minElementsPerTile = 0, unsigned grainSize = 1)¶ Determine how unbalanced a tensor's mapping over tiles is.
- Return
The maximum number of elements above the expected number on any tile.
- Parameters
- graph: The graph.
- t: The tensor to be inspected.
- minElementsPerTile: The expected minimum number of elements per tile.
- grainSize: The expected grain size, that is, the atomic grain used to split elements over tiles.
-
void
mapOutputForElementWiseOp
(poplar::Graph &graph, const std::vector<poplar::Tensor> &inputs, const poplar::Tensor &output, unsigned grainSize = 1, unsigned minGrainsPerTile = 0)¶ Update a tensor's tile mapping so that it is suitable for use as the output of an element-wise operation (an operation with no dependency between more than one element of the output and any given element of any input tensor).
Use the resulting tensor to map element-wise operations to tiles to produce an operation that is computationally balanced across tiles and which minimises exchange.
- Parameters
- graph: A graph which the given inputs/output belong to.
- inputs: List of input tensors for the operation.
- output: Output tensor for the operation.
- grainSize: Grain size for elements mapped to each tile.
- minGrainsPerTile: Minimum number of grains mapped to a tile.
-
poplar::Tensor
cloneToIpu
(poplar::Graph &graph, const poplar::Tensor &t, unsigned dstIPU, poplar::StringRef name = "", poplar::TensorCloneMethod method = poplar::TensorCloneMethod::PRESERVE_ORDER_UNLESS_ALIASES)¶ Create a clone of the specified tensor.
Elements of the cloned tensor are mapped to the specified IPU such that the index of the tile an element is mapped to within an IPU is preserved.
- Return
The cloned tensor.
- Parameters
- graph: The graph representing the entire multi-IPU device.
- t: The tensor to clone.
- dstIPU: The index of the IPU to clone the tensor onto.
- name: A debug name to give to any new tensors allocated in the graph during the clone. If this is empty then the debug names will be derived from existing tensor debug names.
- method: The method to use for the cloning.
-
poplar::Tensor
copyToIpu
(poplar::Graph &masterGraph, const poplar::Tensor &t, poplar::program::Sequence &prog, unsigned dstIPU, poplar::StringRef name = "", poplar::TensorCloneMethod method = poplar::TensorCloneMethod::PRESERVE_ORDER_UNLESS_ALIASES)¶ Move a tensor from one IPU to another by duplicating it, mapping the clone onto another IPU, and copying the original to the new one.
- Return
The new tensor on the specified IPU.
- Parameters
- masterGraph: The graph representing the entire multi-IPU device.
- t: The tensor to move from one IPU to another.
- prog: A program sequence to which the copy will be added.
- dstIPU: The index of the IPU onto which the tensor will be moved.
- name: A debug name to give to the tensor created on dstIPU. If this is empty then the debug names will be derived from existing tensor debug names.
- method: The method to use for cloning of the tensor on the destination IPU.
-
poplar::Tensor
createIpuCopy
(poplar::Graph &graph, const poplar::Tensor &t, unsigned dstIpu, poplar::Tensor &copySrc, poplar::Tensor &copyDst, poplar::StringRef name = "", poplar::TensorCloneMethod method = poplar::TensorCloneMethod::PRESERVE_ORDER_AND_ALIASES)¶ Move a tensor from one IPU to another by duplicating it, mapping the clone onto another IPU, and providing the source and destination tensors of an inter-IPU copy (but do not add that copy to a program at this point).
- Return
The new tensor on the specified IPU.
- Parameters
- graph: The graph representing the entire multi-IPU device.
- t: The tensor to move from one IPU to another.
- dstIpu: The index of the IPU onto which the tensor will be moved.
- copySrc: A tensor that can be used as the source of the copy.
- copyDst: A tensor that can be used as the destination of the copy.
- name: A debug name to give to the tensor created on dstIpu. If this is empty then the debug names will be derived from existing tensor debug names.
- method: The method to use for cloning of the tensor on the destination IPU.
-
bool
dimIsSplitOverTiles
(const poplar::Graph &graph, const poplar::Tensor &t, unsigned dimension)¶ Check whether the tile mapping of the given tensor is such that the given dimension is split over more than one tile.
- Return
True if any slice of the given dimension is spread over more than one tile.
- Parameters
- graph: The graph to introspect.
- t: The tensor to introspect.
- dimension: The dimension to check.
-
bool
dimIsSplitOverIPUs
(const poplar::Graph &graph, const poplar::Tensor &t, unsigned dimension)¶ Check whether the tile mapping of the given tensor is such that the given dimension is split over more than one IPU.
- Return
True if any slice of the given dimension is spread over more than one IPU.
- Parameters
- graph: The graph to introspect.
- t: The tensor to introspect.
- dimension: The dimension to check.
-
class
TensorUseTracker
¶ - #include <TileMapping.hpp>
Class that tracks the usage of data on different tiles.
If data is broadcast to many tiles, it is sometimes efficient to map the data to be spread evenly amongst the tiles that use it.
This class can collect uses of data and then calculate such a tile mapping.
Public Types
Public Functions
-
TensorUseTracker
(unsigned numTiles)¶
-
TensorUseTracker
(const TensorUseTracker &other)¶
-
TensorUseTracker
(TensorUseTracker &&other)¶
-
TensorUseTracker &
operator=
(const TensorUseTracker &other)¶
-
TensorUseTracker &
operator=
(TensorUseTracker &&other)¶
-
~TensorUseTracker
()¶
-
void
add
(const poplar::Graph &graph, unsigned tile, const poplar::Tensor &t)¶ Add a data use case.
- Parameters
- graph: The Poplar graph.
- tile: The tile that the use occurs on.
- t: The tensor representing the data being used.
-
void
add
(TensorUseTracker other)¶ Add data use cases from another tracker.
- Parameters
- other: The TensorUseTracker from which to merge data uses.
-
void
resolve
(const poplar::Graph &graph, unsigned grainSize, unsigned minElementsPerTile, bool extendPartialUsage = false, TensorUseTracker::MappingMethod mappingMethod = TensorUseTracker::MappingMethod::None)¶ Resolve data uses for mapping.
Data used on multiple tiles will have their uses spread across those tiles.
- Parameters
- grainSize: The number of elements that cannot be split amongst tiles.
- minElementsPerTile: The minimum number of elements that must be mapped to a tile.
- extendPartialUsage: When set, partial uses of tensors will be extended to cover the entire tensor, based on the usage of neighbouring regions.
- mappingMethod: Method used for mapping elements.
-
void
mapTensorsByUse
(poplar::Graph &graph, unsigned grainSize, unsigned minElementsPerTile, bool extendPartialUsage = false, TensorUseTracker::MappingMethod mappingMethod = TensorUseTracker::MappingMethod::None)¶ Map data according to use.
This function will set the tile mapping of variable regions based on tracked data uses. Variable regions with uses on multiple tiles will have their elements spread across those tiles.
- Parameters
- graph: The Poplar graph.
- grainSize: The number of elements that cannot be split amongst tiles.
- minElementsPerTile: The minimum number of elements that must be mapped to a tile.
- extendPartialUsage: When set, partial uses of tensors will be extended to cover the entire tensor, based on the usage of neighbouring regions before mapping.
- mappingMethod: Method used for mapping elements.
-
bool
empty
() const¶ Have any use cases been registered?
- Return
True if no data use cases have been registered, false otherwise.
-
4.1.5. poputil/Util.hpp¶
-
namespace
poputil
Functions
-
std::vector<std::vector<poplar::Interval>>
splitRegions
(const std::vector<poplar::Interval> &regions, unsigned grainSize, unsigned maxPartitions, unsigned minElementsPerPartition = 0, unsigned maxElementsPerPartition = UINT_MAX, unsigned maxElementsPerRegion = UINT_MAX)¶ Given a set of contiguous regions, partition these regions while trying to balance the number of elements in each partition, respecting the specified grain size.
At most maxPartitions partitions are created. Regions may be split to achieve a better balance.
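The splitting strategy can be sketched in a simplified host-side form. This is an assumption: it balances whole grains greedily and ignores the min/max element limits that the real splitRegions also honours; `splitRegionsSketch` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Interval = std::pair<std::size_t, std::size_t>; // [begin, end)

// Sketch: balance the total elements over at most maxPartitions partitions
// in whole grains, splitting a region where a partition boundary falls
// inside it.
std::vector<std::vector<Interval>>
splitRegionsSketch(const std::vector<Interval> &regions, unsigned grainSize,
                   unsigned maxPartitions) {
  std::size_t total = 0;
  for (const auto &r : regions)
    total += r.second - r.first;
  std::size_t grains = (total + grainSize - 1) / grainSize;
  std::size_t grainsPerPart = (grains + maxPartitions - 1) / maxPartitions;
  std::size_t target = grainsPerPart * grainSize;

  std::vector<std::vector<Interval>> parts;
  std::vector<Interval> current;
  std::size_t room = target; // elements still to place in this partition
  for (auto r : regions) {
    while (r.second - r.first > 0) {
      std::size_t take = std::min<std::size_t>(room, r.second - r.first);
      current.push_back({r.first, r.first + take});
      r.first += take;
      room -= take;
      if (room == 0) { // partition full; start the next one
        parts.push_back(std::move(current));
        current.clear();
        room = target;
      }
    }
  }
  if (!current.empty())
    parts.push_back(std::move(current));
  return parts;
}
```

For instance, the regions [0, 5) and [10, 15) split four ways with grain size 1 give partitions of 3, 3, 3 and 1 elements, with the second partition straddling both regions.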
-
std::vector<std::vector<poplar::Interval>>
splitRegionsBetweenWorkers
(const poplar::Target &target, const std::vector<poplar::Interval> &regions, unsigned grainSize, unsigned minElementsPerPartition = 0, unsigned maxElementsPerPartition = UINT_MAX, unsigned maxElementsPerRegion = UINT_MAX)¶ Given a set of contiguous regions per tile, partition these regions between workers on that tile, respecting the specified grain size.
Regions may be split to balance the work across workers.
-
std::vector<std::vector<std::vector<poplar::Interval>>>
splitRegions
(const std::vector<std::vector<poplar::Interval>> &regions, unsigned grainSize, unsigned maxPartitions, unsigned minElementsPerPartition = 0, unsigned maxElementsPerPartition = UINT_MAX, unsigned maxElementsPerRegion = UINT_MAX)¶
-
std::vector<std::vector<std::vector<poplar::Interval>>>
splitRegionsBetweenWorkers
(const poplar::Target &target, const std::vector<std::vector<poplar::Interval>> &regions, unsigned grainSize, unsigned minElementsPerPartition = 0, unsigned maxElementsPerPartition = UINT_MAX, unsigned maxElementsPerRegion = UINT_MAX)¶ Given a set of sequences of regions per tile, partition these sequences between workers on that tile, respecting the specified grain size.
Regions may be split to balance the work across workers.
-
template<class T>
std::vector<T> unflattenIndex
(const std::vector<T> &shape, std::size_t index)¶ Given an index into a flattened tensor, return the indices into the dimensions of the original tensor.
-
template<class T>
std::size_t flattenIndex
(const std::vector<T> &shape, const std::vector<T> &indices)¶ Given a list of indices into a tensor, return the corresponding index in a flattened version of the tensor.
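The row-major index arithmetic behind these two functions can be sketched directly. The names mirror the API, but this is an illustrative reimplementation of the arithmetic, not the poputil source.

```cpp
#include <cstddef>
#include <vector>

// Row-major unflatten: peel off each dimension from innermost to outermost.
template <class T>
std::vector<std::size_t> unflattenIndex(const std::vector<T> &shape,
                                        std::size_t index) {
  std::vector<std::size_t> indices(shape.size());
  for (std::size_t d = shape.size(); d-- > 0;) {
    indices[d] = index % shape[d];
    index /= shape[d];
  }
  return indices;
}

// Row-major flatten: Horner's scheme over the dimensions.
template <class T>
std::size_t flattenIndex(const std::vector<T> &shape,
                         const std::vector<T> &indices) {
  std::size_t index = 0;
  for (std::size_t d = 0; d < shape.size(); ++d)
    index = index * shape[d] + indices[d];
  return index;
}
```

For a shape of {2, 3, 4}, the flat index 17 corresponds to the indices {1, 1, 1}, and flattening those indices recovers 17.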
-
std::size_t
intervalSequenceNumElements
(const std::vector<std::vector<poplar::Interval>> &seq)¶ Total number of elements in the interval sequence.
-
poplar::Tensor
duplicate
(poplar::Graph &graph, const poplar::Tensor &in, poplar::program::Sequence &p, const std::string &name = "", poplar::TensorCloneMethod method = poplar::TensorCloneMethod::PRESERVE_ORDER_UNLESS_ALIASES)¶ Copy a tensor’s data to a new tensor.
The duplicated tensor has the same tile mapping as the original tensor.
-
poplar::Tensor
cloneN
(poplar::Graph &graph, const poplar::Tensor &t, unsigned N, poplar::StringRef name = "", poplar::TensorCloneMethod method = poplar::TensorCloneMethod::PRESERVE_ORDER_UNLESS_ALIASES)¶ Clone a tensor N times.
Given a tensor of shape [D1, D2, … Dn], this function will create a new tensor of shape [N, D1, D2, …, Dn] where each of the N sub-tensors is a clone of the original tensor (i.e. has the same layout).
- Parameters
graph
: The Poplar grapht
: The tensor to cloneN
: The replication factor to clone withname
: The name for the new variables createdmethod
: The tensor cloning method (see Graph::clone)
4.1.6. poputil/VarStructure.hpp¶
-
namespace
poputil
Typedefs
Functions
-
unsigned
detectInnermostGrouping
(const poplar::Graph &graph, const poplar::Tensor &t)¶ Detect the grouping, if any, in the innermost dimension of the given tensor.
-
poplar::Tensor
createPartitionableTensor
(poplar::Graph &graph, const poplar::Type &type, const std::vector<std::size_t> &shape, const std::vector<std::size_t> &nPartitions, const std::string &debugName = "")¶ Create a tensor with the given shape such that when it is partitioned into slices according to the given number of partitions in each dimension, each slice is a single contiguous region.
This function partitions such that the maximum number of elements in each partition of a dimension is minimised, as well as the number of partitions. That is, if a dimension has n elements and the number of partitions in that dimension is d, then:
a * ceil(n/d) + b * floor(n/d) = n
There will be a partitions with ceil(n/d) elements, followed by b partitions with floor(n/d) elements, and possibly some number of partitions with 0 elements.
The returned tensor has no tile mapping set.
- Return
A tensor with the given shape where each partition is contiguous.
- Parameters
- graph: The graph to add the variable to.
- type: The type of the elements in the returned tensor.
- shape: The shape of the returned tensor.
- nPartitions: How many partitions the given shape will be partitioned into in each dimension.
- debugName: The debug name associated with the returned tensor.
-
void
iterateTensorPartitions
(const poplar::Tensor &t, const std::vector<std::size_t> &nPartitions, const std::function<void(const std::vector<std::size_t> &i, const poplar::Tensor &s)> &f)¶ Iterate the partitions of a tensor.
Partitioning follows the same definition as described above for addVariableWithSplits.
- Parameters
- t: The tensor to iterate.
- nPartitions: How many partitions the given tensor is partitioned into in each dimension.
- f: A function taking the indices of the partition, in the range [0, nPartitions[d]) in each dimension d of the tensor, as well as the slice of the tensor corresponding to that partition.
4.1.7. poputil/VertexTemplates.hpp¶
-
namespace
poputil
Functions
-
template<typename... Args>
std::string templateVertexParams
(bool first, const std::string &val, Args&&... args)¶
-
template<typename... Args>
std::string templateVertexParams
(bool first, const char *val, Args&&... args)¶
-
template<typename... Args>
std::string templateVertexParams
(bool first, const poplar::Type &type, Args&&... args)¶
-
template<typename T>
struct VertexTemplateToString
¶
4.2. Tensor operations (popops)¶
Functions for building operations on tensors in control programs (such as element-wise functions and reductions).
4.2.1. popops/AllTrue.hpp¶
-
namespace
popops
¶ Common functions, such as elementwise and reductions.
Functions
-
poplar::Tensor
allTrue
(poplar::Graph &graph, poplar::Tensor A, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Given a boolean tensor, compute the logical AND of all its elements.
A new variable is created to store the result.
- Return
A variable that holds the result of the operation
- Parameters
- graph: The Poplar graph.
- A: The boolean tensor.
- prog: The program sequence to add this operation to.
- debugPrefix: A debug name for the operation.
4.2.2. popops/Cast.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Functions
-
poplar::Tensor
cast
(poplar::Graph &graph, const poplar::Tensor &src, const poplar::Type &dstType, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Cast elements of the specified src tensor to dstType, returning the result as a new tensor.
Note: if dstType == src.elementType(), then the operation is a copy.
- Return
The resultant cast tensor.
- Parameters
- graph: The graph that the operation will be added to.
- src: Source tensor to cast.
- dstType: Type of the destination tensor.
- prog: Program to add the cast operation to.
- debugPrefix: Name of the operation, for debugging.
-
poplar::program::Program
cast
(poplar::Graph &graph, poplar::Tensor src, poplar::Tensor dst, const std::string &debugPrefix = "")¶ Create a program to copy a tensor, casting between types (for example, half to float).
Precondition: src.shape() == dst.shape()
Note: if dst.elementType() == src.elementType(), then the operation is just a copy.
- Return
The program to perform this operation.
- Parameters
- graph: The graph that the operation will be added to.
- src: Source tensor.
- dst: Destination tensor.
- debugPrefix: Name of the operation, for debugging.
-
void
cast
(poplar::Graph &graph, poplar::Tensor src, poplar::Tensor dst, poplar::ComputeSet cs)¶ Create vertices to copy, element-wise, from the src tensor to the dst tensor, casting between types (for example, half to float). The vertices are added to the specified compute set.
Precondition: src.shape() == dst.shape()
- Parameters
- graph: The graph that the operation will be added to.
- src: Source tensor.
- dst: Destination tensor.
- cs: Compute set to add the vertices to.
-
poplar::Tensor
cast
(poplar::Graph &graph, poplar::Tensor src, const poplar::Type &dstType, poplar::ComputeSet cs, const std::string &debugPrefix = "")¶ Create vertices to cast elements of the specified src tensor to dstType, returning the result as a new tensor. The vertices are added to the specified compute set.
- Return
Resultant destination tensor.
- Parameters
- graph: The graph that the operation will be added to.
- src: Source tensor.
- dstType: Destination type.
- cs: Compute set to add the vertices to.
- debugPrefix: Name of the operation, for debugging.
-
poplar::Tensor
checkAccuracyWhenCast
(poplar::Graph &graph, const poplar::Tensor &input, poplar::Type outputType, double tolerance, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Helper function which checks the relative error in the tensor input when casting it to type outputType. The result is a single-element bool tensor which is set to true if the error is less than tolerance.
Preconditions:
- input.elementType() == FLOAT
- outputType == HALF
- input.numElements() == 1
- Return
Boolean tensor indicating that the error is less than tolerance.
- Parameters
- graph: The graph that the operation will be added to.
- input: Input tensor.
- outputType: Output type after the cast operation.
- tolerance: Allowed tolerance in error from the cast operation.
- prog: Program to add the check onto.
- debugPrefix: Name of the operation, for debugging.
4.2.3. popops/CircBuf.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
-
class
CircBuf
¶ Public Functions
-
CircBuf
(poplar::Graph &graph, const poplar::Type &dataType, unsigned size, const std::vector<std::size_t> &shape, const std::string &debugPrefix = "")¶ CircBuf represents a circular buffer of tensors which can be indexed using prev().
Each call to add() will add the given tensor to the circular buffer with the potential to overwrite a previous element if the buffer is full.
- Parameters
- graph: Graph to add the circular buffer to.
- dataType: Datatype of the tensor elements in the buffer.
- size: Size of the circular buffer.
- shape: Shape of the tensor elements in the buffer.
- debugPrefix: Prefix of the circular buffer tensor, for debugging.
-
poplar::Tensor
prev
(unsigned i, poplar::program::Sequence &seq, const std::string &debugPrefix = "")¶ Return the element added i entries ago; i must be less than size.
- Return
Tensor returned from the circular buffer.
- Parameters
- i: Index into the circular buffer.
- seq: Program to add the operation to.
- debugPrefix: Name of the operation, for debugging.
-
void
add
(poplar::Tensor t, poplar::program::Sequence &seq, const std::string &debugPrefix = "")¶ Append an element to the end of the circular buffer.
- Parameters
- t: Tensor to append to the circular buffer.
- seq: Program to add the operation to.
- debugPrefix: Name of the operation, for debugging.
-
unsigned
size
() const¶ Size of the circular buffer.
-
poplar::Graph::TileToTensorMapping
getTileMapping
()¶ Return the tile mapping of the tensor returned by indexing into the circular buffer.
-
4.2.4. popops/Collectives.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Functions
-
Chunks
reduceScatter
(poplar::Graph &graph, const poplar::Tensor &toReduce, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Given a tensor of rank 2, reduce across the outermost dimension using the specified reduction operator.
This function assumes index i in the outermost dimension is mapped to IPU i. The result is distributed over IPUs such that each IPU has a slice of the final result. The return value is a vector of chunks where chunk i resides on IPU i. The chunks may have different numbers of elements (for example, when the number of IPUs does not exactly divide the number of elements).
- Parameters
- graph: The graph.
- toReduce: The tensor to reduce. Each partial should be mapped identically to the others across the IPUs within the rank.
- op: The reduction operator (for example, Operation::ADD).
- prog: The program sequence to add operations to.
- debugPrefix: String used as a prefix for compute sets.
- options: Collective options (not currently used).
-
poplar::Tensor
allGather
(poplar::Graph &graph, const Chunks &toGather, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Broadcast data distributed over IPUs to all IPUs.
This function assumes chunk i is mapped to IPU i. The result is a 2D tensor that contains a copy of the data for each IPU. Index i in the outermost dimension of the result is mapped to IPU i.
- Parameters
- graph: The graph.
- toGather: The chunks to gather.
- prog: The program sequence to add operations to.
- debugPrefix: String used as a prefix for compute sets.
- options: Collective options. See reduceScatter().
-
poplar::Tensor
allReduce
(poplar::Graph &graph, const poplar::Tensor &toReduce, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Perform an all-reduce operation on the specified tensor.
This operation reduces across the outermost dimension of the input and produces a tensor with the same shape, where the innermost dimension is the result of the reduction and the outermost dimension is a number of copies of the result. This function assumes index i in the outermost dimension of the input is mapped to IPU i. Index i in the outermost dimension of the result is mapped to IPU i.
- Parameters
- graph: The graph.
- toReduce: The tensor to reduce. Each partial should be mapped identically to the others across the IPUs within the rank.
- op: The reduction operator (for example, Operation::ADD).
- prog: The program sequence to add operations to.
- debugPrefix: String used as a prefix for compute sets.
- options: Collective options. See reduceScatter().
-
poplar::Tensor
replicatedAllReduce
(poplar::Graph &graph, const poplar::Tensor &data, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Perform an all-reduce operation on the specified replicated tensor.
This operation reduces across the tensors that the replicated tensor is a handle for. The result is returned as a replicated tensor.
- Parameters
- graph: The replicated graph the input tensor belongs to.
- data: The replicated tensor to reduce.
- op: The reduction operator (for example, Operation::ADD).
- prog: The program sequence to add operations to.
- debugPrefix: String used as a prefix for compute sets.
- options: Collective options. See reduceScatter().
-
void
replicatedAllReduceWithOutput
(poplar::Graph &graph, const poplar::Tensor &data, poplar::Tensor &output, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Same as replicatedAllReduce but writes the result to the output tensor instead of creating a new one.
-
void
replicatedAllReduceInPlace
(poplar::Graph &graph, poplar::Tensor &data, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Perform an all-reduce operation on the specified replicated tensor.
This operation reduces across the tensors that the replicated tensor is a handle for. The result is written back to the input data tensor.
- Parameters
- graph: The replicated graph the input tensor belongs to.
- data: The replicated tensor to reduce and write back to.
- op: The reduction operator (for example, Operation::ADD).
- prog: The program sequence to add operations to.
- debugPrefix: String used as a prefix for compute sets.
- options: Collective options. See reduceScatter().
-
poplar::Tensor
replicatedAllReduce
(poplar::Graph &graph, poplar::Graph &parentGraph, const poplar::Tensor &data, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Perform an all-reduce operation on the specified replicated tensor.
This variant of replicatedAllReduce() is deprecated and may be removed in future.
-
poplar::Tensor
replicatedReduceScatter
(poplar::Graph &graph, const poplar::Tensor &toReduce, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Reduce the replicated rank-1 tensor “toReduce” with the result scattered across the replicas.
For an input of shape [numElements] mapped to a single IPU per replica, the output will have shape [ceil(numElements / replicationFactor)]. If replicationFactor does not evenly divide numElements, the result is zero-padded. For instance:
Before:
Replica0: toReduce[x0, y0, z0]
Replica1: toReduce[x1, y1, z1]
After:
Replica0: result[op(x0, x1), op(y0, y1)]
Replica1: result[op(z0, z1), 0]
For an input of shape [numElementsIPU0 + numElementsIPU1 + …] mapped to multiple IPUs per replica, the output will have shape: [ceil(numElementsIPU0 / replicationFactor) + ceil(numElementsIPU1 / replicationFactor) + …] with the result grouped per IPU. If replicationFactor does not evenly divide the number of elements on an IPU, the result is zero-padded per IPU. For instance:
Before:
Replica0: toReduce[x0, y0, z0, w0]
Replica1: toReduce[x1, y1, z1, w1]
Replica2: toReduce[x2, y2, z2, w2]
Replica3: toReduce[x3, y3, z3, w3]
Mapping: toReduce[IPU0, IPU0, IPU0, IPU1]
After:
Replica0: result[op(x0, x1, x2, x3), op(w0, w1, w2, w3)]
Replica1: result[op(y0, y1, y2, y3), 0]
Replica2: result[op(z0, z1, z2, z3), 0]
Replica3: result[0, 0]
Mapping: result[IPU0, IPU1]
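The single-IPU data movement shown in the first example above can be simulated on the host to make it concrete. This is a simulation of the semantics only (with op = ADD), not Poplar code; `simulateReduceScatter` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Simulate reduce-scatter for replicas each holding one vector on a single
// IPU: sum element-wise across replicas, then give replica r its slice of
// the result, zero-padded when the replica count does not divide evenly.
std::vector<std::vector<double>>
simulateReduceScatter(const std::vector<std::vector<double>> &replicas) {
  std::size_t numReplicas = replicas.size();
  std::size_t numElements = replicas[0].size();
  std::size_t perReplica = (numElements + numReplicas - 1) / numReplicas;
  // Element-wise reduction (ADD) across replicas.
  std::vector<double> reduced(numElements, 0.0);
  for (const auto &r : replicas)
    for (std::size_t i = 0; i < numElements; ++i)
      reduced[i] += r[i];
  // Scatter: replica r receives [r*perReplica, (r+1)*perReplica),
  // zero-padded past the end of the reduced data.
  std::vector<std::vector<double>> result(
      numReplicas, std::vector<double>(perReplica, 0.0));
  for (std::size_t r = 0; r < numReplicas; ++r)
    for (std::size_t i = 0; i < perReplica; ++i)
      if (r * perReplica + i < numElements)
        result[r][i] = reduced[r * perReplica + i];
  return result;
}
```

Running this on two replicas holding {x0, y0, z0} and {x1, y1, z1} reproduces the first example: replica 0 receives {x0 + x1, y0 + y1} and replica 1 receives {z0 + z1, 0}.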
-
poplar::Tensor
replicatedAllGather
(poplar::Graph &graph, const poplar::Tensor &toGather, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Gather the replicated tensor toGather and return the result, so that each replica will have a copy of all other replicas’ toGather tensors.
For instance:
Before:
Replica0: toGather[x,y]
Replica1: toGather[z,w]
Replica2: toGather[x1, y1]
After allGather:
Replica0: result[x,y,z,w,x1,y1]
Replica1: result[x,y,z,w,x1,y1]
Replica2: result[x,y,z,w,x1,y1]
For an input of shape [incomingShape] the output will be [replicationFactor][incomingShape].
-
poplar::Tensor
allToAllPersonalizedExchange
(poplar::Graph &graph, const poplar::Tensor &input, poplar::program::Sequence &sequence, const std::string &debugPrefix = "")¶ Perform an all-to-all exchange of the elements of the input tensor based on replica ID.
The shape of the input must have the number of replicas in the graph as its first or only dimension. That dimension will be used to split up the tensor being sent, with each replica sending all splits except for the split index which matches its replica ID. That is, replica 2 will not send input[2] and so on.
The replica receiving the slice will copy that incoming slice into the output at the index which matches the replica ID of the replica which sent it. For instance:
Input tensor:
Replica0: Tensor T[x0,x1,x2]
Replica1: Tensor T[y0,y1,y2]
Replica2: Tensor T[z0,z1,z2]
Output tensor:
Replica0: Tensor T[x0,y0,z0]
Replica1: Tensor T[x1,y1,z1]
Replica2: Tensor T[x2,y2,z2]
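Viewed across replicas, this exchange is a transpose: replica r's output gathers slice r from every replica's input. A host-side sketch with one element per slice (plain C++, not the Poplar API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// perReplica[s] is the input tensor on replica s; its first (and here only)
// dimension indexes the destination replica. The output on replica r takes
// slice r from each replica in order: out[s] = perReplica[s][r].
std::vector<int> allToAll(const std::vector<std::vector<int>> &perReplica,
                          std::size_t r) {
  std::vector<int> out;
  for (const auto &t : perReplica)
    out.push_back(t[r]);
  return out;
}
```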
-
struct
Chunk
¶
-
struct
Chunks
¶
4.2.5. popops/CollectivesInterface.hpp¶
-
namespace
popops
Common functions, such as elementwise operations and reductions.
-
class
ReplicatedCollectivesInterface
: public popops::VersionedInterface¶ Public Functions
-
~ReplicatedCollectivesInterface
()¶
-
poplar::Tensor
replicatedAllReduce
(poplar::Graph &graph, const poplar::Tensor &data, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}) = 0¶ Perform an all-reduce operation on the specified replicated tensor.
This operation reduces across the tensors the replicated tensor is a handle for. The result is returned as a replicated tensor.
- Parameters
graph
: The replicated graph the input tensor belongs to.data
: The replicated tensor to reduce.op
: The reduction operator (for example, Operation::ADD).prog
: The program sequence to add operations to.debugPrefix
: String used as a prefix for compute sets.options
: Collective options.
-
void
replicatedAllReduceWithOutput
(poplar::Graph &graph, const poplar::Tensor &data, poplar::Tensor &output, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}) = 0¶ Same as above but writes the result to the output tensor instead of creating a new one.
-
poplar::Tensor
replicatedAllReduce
(poplar::Graph &graph, poplar::Graph &parentGraph, const poplar::Tensor &data, popops::Operation op, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}) = 0¶ Perform an all-reduce operation on the specified replicated tensor.
This variant of replicatedAllReduce() is deprecated and may be removed in future.
Public Static Attributes
-
std::shared_ptr<ReplicatedCollectivesInterface>
defaultImpl
¶
-
-
class
VersionedInterface
¶ Subclassed by popops::ReplicatedCollectivesInterface
4.2.6. popops/DynamicSlice.hpp¶
-
namespace
poplar
-
namespace
popops
Common functions, such as elementwise operations and reductions.
Functions
-
poplar::Tensor
createSliceableTensor
(poplar::Graph &graph, const poplar::Type &type, const std::vector<size_t> &shape, const std::vector<size_t> &dims, const std::vector<size_t> &sizes, std::size_t minGrainSize = 0, const std::string &debugPrefix = "")¶ Create and map a tensor to be sliced/updated efficiently.
The returned tensor will be laid out according to the plan.
- Return
A tensor of shape
shape
that is suitably mapped.- Parameters
graph
: The Poplar graph.type
: The type of the elements.shape
: The shape of the tensor to be sliced/updated.dims
: The dimensions of the tensor that will be sliced/updated.sizes
: The size of the slice in each of the dimensions.minGrainSize
: The minimum number of elements per slice mapped to each tile.debugPrefix
: A string prepended to debugging info.
-
poplar::Tensor
createSliceableTensor
(poplar::Graph &graph, const poplar::Type &type, const std::vector<size_t> &shape, const std::vector<size_t> &dims, const std::vector<size_t> &sizes, const SlicePlan &plan, const poplar::OptionFlags &options, const std::string &debugPrefix = "")¶ Create and map a tensor to be sliced/updated efficiently.
The returned tensor will be spread over as many tiles as possible while respecting the minimum number of elements per tile and still being in a form that can be sliced/updated efficiently.
- Return
A tensor of shape
shape
that is suitably mapped.- Parameters
graph
: The Poplar graph.type
: The type of the elements.shape
: The shape of the tensor to be sliced/updated.dims
: The dimensions of the tensor that will be sliced/updated.sizes
: The size of the slice in each of the dimensions.plan
: Plan describing how the slicing/updating operation will be implemented.options
: Flags controlling how the operation will be implemented.debugPrefix
: A string prepended to debugging info.
-
poplar::Tensor
createSliceTensor
(poplar::Graph &graph, const poplar::Tensor &t, const std::vector<size_t> &dims, const std::vector<size_t> &sizes, std::size_t numIndices, const std::string &debugPrefix = "")¶ Create and map a tensor to be sliced into/updated from efficiently.
Introspection on the tensor to update is used to lay out the returned tensor such that it can be used to update that tensor efficiently.
- Return
A tensor with shape [numIndices, shape…] mapped appropriately to be sliced into/updated from.
- Parameters
graph
: The Poplar graph.t
: The tensor to be updated.dims
: The dimensions of the tensor that will be sliced/updated.sizes
: The number of elements of each dimension indims
that will be sliced/updated.numIndices
: The number of slices this tensor should contain.debugPrefix
: A string prepended to debugging info.
-
poplar::Tensor
createSliceTensor
(poplar::Graph &graph, const poplar::Type &type, const std::vector<std::size_t> &shape, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, std::size_t numIndices, const SlicePlan &plan, const poplar::OptionFlags &options, const std::string &debugPrefix = "")¶ Create and map a tensor to be sliced into/updated from efficiently.
The returned tensor is laid out according to the plan for the slice/update operation.
- Return
A tensor with shape [numIndices, shape…] mapped appropriately to be sliced into/updated from.
- Parameters
graph
: The Poplar graph.type
: The type of the elements.shape
: The shape of the tensor to be sliced/updated.dims
: The dimensions of the tensor that will be sliced/updated.sizes
: The number of elements of each dimension indims
that will be sliced/updated.numIndices
: The number of slices this tensor should contain.plan
: Plan describing how the slicing/updating operation will be implemented.options
: Flags controlling how the operation will be implemented.debugPrefix
: A string prepended to debugging info.
-
poplar::Tensor
createIndicesTensor
(poplar::Graph &graph, const std::vector<std::size_t> &dims, std::size_t numIndices, const SlicePlan &plan, const poplar::OptionFlags &options, const std::string &debugPrefix = "")¶ Create and map a tensor to contain indices for slicing/updating a tensor efficiently.
- Return
A tensor of shape [numIndices, dims.size()] mapped appropriately to be used as the indices for a slice/update operation. Element type is always UNSIGNED_INT.
- Parameters
graph
: The Poplar graph.dims
: The dimensions of the tensor that will be sliced/updated using these indices.numIndices
: The number of indices this tensor should contain.plan
: Plan describing how the slicing/updating operation will be implemented.options
: Flags controlling how the operation will be implemented.debugPrefix
: The prefix prepended to debugging info.
-
poplar::Tensor
createSliceableTensorFromSlice
(poplar::Graph &graph, const poplar::Tensor &s, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &numSlices, const std::string &debugPrefix = "")¶
-
poplar::Tensor
dynamicSlice
(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &offset, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Slice a tensor based on offsets specified by a tensor.
dims
gives the dimensions to slice,sizes
defines the size of the slice in those dimensions andoffset
gives the base offsets on each execution.offset
[0],dims
andsizes
must have the same size.offset
may have a second dimension with an element per tile, which can eliminate exchange.- Return
The specified subtensor.
- Parameters
graph
: The Poplar graph.t
: The source tensor.offset
: A tensor of offsets at which the output is extracted.dims
: The dimensions oft
to slice.sizes
: The size of the slice in each of the dimensions indims
.prog
: The program to be extended.debugPrefix
: The prefix prepended to debugging info.
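The slicing semantics for the simplest case, a rank-1 tensor with dims = {0}, amount to extracting sizes[0] elements starting at the runtime offset (host-side C++ sketch, not the Poplar API; on the device the offset is itself a tensor, so it can change between executions without rebuilding the graph):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of dynamicSlice on a 1-D tensor: out[i] = t[offset + i].
std::vector<int> dynamicSlice1D(const std::vector<int> &t, std::size_t offset,
                                std::size_t size) {
  std::vector<int> out(size);
  for (std::size_t i = 0; i < size; ++i)
    out[i] = t[offset + i];
  return out;
}
```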
-
poplar::Graph::TileToTensorMapping
getSliceMapping
(poplar::Graph &graph, const poplar::Tensor &t, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes)¶ Get the tile mapping for a slice of a tensor.
dims
gives the dimensions to slice,sizes
defines the size of the slice in those dimensions.- Parameters
graph
: The Poplar graph.t
: The source tensor.dims
: The dimensions oft
to slice.sizes
: The size of the slice in each of the dimensions indims
.
-
void
dynamicUpdate
(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &s, const poplar::Tensor &offset, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Update a subtensor at offsets read from a tensor.
dims
gives the dimensions that are partially updated, bysizes
elements, at offsetsoffset
. Unspecified dimensions are copied in full with zero offset.offset
[0],dims
andsizes
must have the same size.offset
may have a second dimension with an element per tile, which can eliminate exchange.- Parameters
graph
: The Poplar graph.t
: The tensor to update.s
: The updates.offset
: The offset withint
to be updated.dims
: The dimensions to be dynamically updated.sizes
: The size of the update in each of the dimensions indims
.prog
: The program to be extended.debugPrefix
: The prefix prepended to debugging info.
-
poplar::Tensor
multiSlice
(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &offsets, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const SlicePlan &plan, const poplar::OptionFlags &options, const std::string &debugPrefix = "")¶ Take multiple slices from a base tensor.
The returned tensor will have a rank one greater than
t
. Its outer dimension will beoffsets.dim(0)
. Note thatdims
refers to the dimensions oft
.t
can be created usingcreateSliceableTensor()
to ensure efficient mapping.- Parameters
graph
: The Poplar graph.t
: The tensor being sliced.offsets
: The offsets withint
to be sliced.dims
: The dimensions oft
to be sliced.sizes
: The size of the slice in each of the dimensions indims
.prog
: The program to be extended.plan
: Plan describing how the operation will be implemented.options
: Flags controlling how the operation will be implemented.debugPrefix
: The prefix prepended to debugging info.
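For a rank-1 base tensor sliced in dims = {0} with slice size 1, the result is the embedding-lookup pattern: one row per offset (host-side C++ sketch, not the Poplar API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of multiSlice: the output gains an outer dimension of length
// offsets.size(), with result[i] = t[offsets[i]].
std::vector<std::vector<int>>
multiSlice1D(const std::vector<int> &t,
             const std::vector<std::size_t> &offsets) {
  std::vector<std::vector<int>> result;
  for (std::size_t off : offsets)
    result.push_back({t[off]});
  return result;
}
```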
-
void
multiUpdate
(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &s, const poplar::Tensor &offsets, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const SlicePlan &plan, const poplar::OptionFlags &options, const std::string &debugPrefix = "")¶ Update multiple slices in a tensor.
- Parameters
graph
: The Poplar graph.t
: The tensor being updated.s
: The slices to insert.offsets
: The offsets withint
to be updated.dims
: The dimensions oft
to be updated.sizes
: The size of the update in each of the dimensions indims
.prog
: The program to be extended.plan
: Plan describing how the operation will be implemented.options
: Flags controlling how the operation will be implemented.debugPrefix
: The prefix prepended to debugging info.
-
void
multiUpdateAdd
(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &s, const poplar::Tensor &offsets, const poplar::Tensor &scale, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const SlicePlan &plan, const poplar::OptionFlags &options, const std::string &debugPrefix = "")¶ Accumulate multiple slices in a tensor: for each index i, t[offsets[i]] += scale * s[i]
t
,s
andscale
must have the same element type.- Parameters
graph
: The Poplar graph.t
: The tensor being updated (must be rank 2).s
: The slices to accumulate.offsets
: The offsets withint
to be accumulated.scale
: The scaling to apply to the update.dims
: The dimensions oft
to be accumulated (must be rank 1).sizes
: The size of the accumulate in each of the dimensions indims
.prog
: The program to be extended.plan
: Plan describing how the operation will be implemented.options
: Flags controlling how the operation will be implemented.debugPrefix
: The prefix prepended to debugging info.
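The accumulation rule t[offsets[i]] += scale * s[i] can be modelled on the host for a rank-2 t with dims = {0} and slice size 1 (plain C++ sketch, not the Poplar API). Note that repeated offsets accumulate, which is what makes this operation suitable for embedding gradient updates:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of multiUpdateAdd: for each index i, add scale * s[i] into row
// t[offsets[i]]. Rows named more than once in offsets accumulate.
void multiUpdateAdd(std::vector<std::vector<float>> &t,
                    const std::vector<std::vector<float>> &s,
                    const std::vector<std::size_t> &offsets, float scale) {
  for (std::size_t i = 0; i < offsets.size(); ++i) {
    auto &row = t[offsets[i]];
    for (std::size_t j = 0; j < row.size(); ++j)
      row[j] += scale * s[i][j];
  }
}
```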
-
class
SlicePlan
¶
-
namespace
embedding
¶ Functions
-
SlicePlan
plan
(const poplar::Graph &graph, const poplar::Type &dataType, const std::size_t numEntries, const std::size_t outputSize, const std::vector<std::size_t> &numLookups, const poplar::OptionFlags &options)¶ Create a plan for implementing a set of operations on an embedding matrix.
- Return
A plan which describes how the embedding matrix lookup/update operations should be implemented.
- Parameters
graph
: The graph the operation will be added to.dataType
: The data type of the entries in the embedding matrix and the resulting lookups from the matrix.numEntries
: Input size of embedding matrix.outputSize
: Output size of embedding matrix lookup.numLookups
: Vector of numbers of indices which will be looked up in the embedding matrix.options
: Set of option flags controlling how the operation will be implemented.
4.2.7. popops/ElementWise.hpp¶
These functions perform the same operation on each element of one or more tensors.
Every function has an in-place overload that writes the result of the function to the first tensor argument of the function.
The functions that perform operations on two tensors also have overloads for one of the tensors being a constant scalar. These functions perform the same operation on each element in the remaining tensor using the scalar as the other side of the operation for all elements.
-
namespace
popops
Common functions, such as elementwise operations and reductions.
Functions
-
poplar::Tensor
varianceToInvStdDev
(poplar::Graph &graph, const poplar::Tensor &src, const poplar::Tensor &epsilon, poplar::program::Sequence &prog, const poplar::Type dstType = poplar::HALF, const std::string &debugPrefix = "")¶ Variance conversion operations can be created using the map functions below, but that requires the input and output to be of the same type.
It can be an advantage to maintain variance in full precision and inverse standard deviation in half precision. These supplementary functions make that possible.
- Return
A tensor containing the elements resulting from the variance to/from standard deviation conversion.
- Parameters
graph
: The graph to update.src
: The source Tensorepsilon
: A tensor initialised with the epsilon parameter used in conversion. Must have a single element and have the same type as the input type. Alternatively a float value can be used and the appropriate tensor will be created.prog
: The sequence to extend with the execution of conversion.dstType
: The type of the tensor to be output. Must be FLOAT when outputting variance, HALF when outputting standard deviation, or equal to the input type.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.
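The underlying arithmetic is the usual batch-normalisation pair of conversions; a host-side sketch of the scalar maths (plain C++, assuming the conventional definitions; the device functions additionally allow the output type to differ from the input type):

```cpp
#include <cassert>
#include <cmath>

// varianceToInvStdDev: y = 1 / sqrt(v + epsilon)
float varianceToInvStdDev(float variance, float epsilon) {
  return 1.0f / std::sqrt(variance + epsilon);
}

// invStdDevToVariance: v = 1 / y^2 - epsilon (the inverse conversion)
float invStdDevToVariance(float invStdDev, float epsilon) {
  return 1.0f / (invStdDev * invStdDev) - epsilon;
}
```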
-
poplar::Tensor
invStdDevToVariance
(poplar::Graph &graph, const poplar::Tensor &src, const poplar::Tensor &epsilon, poplar::program::Sequence &prog, const poplar::Type dstType = poplar::FLOAT, const std::string &debugPrefix = "")¶
-
poplar::Tensor
varianceToInvStdDev
(poplar::Graph &graph, const poplar::Tensor &src, const float epsilon, poplar::program::Sequence &prog, const poplar::Type dstType = poplar::HALF, const std::string &debugPrefix = "")¶
-
poplar::Tensor
invStdDevToVariance
(poplar::Graph &graph, const poplar::Tensor &src, const float epsilon, poplar::program::Sequence &prog, const poplar::Type dstType = poplar::FLOAT, const std::string &debugPrefix = "")¶
-
poplar::Tensor
map
(poplar::Graph &graph, const expr::Expr &expr, const std::vector<poplar::Tensor> &ts, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Map an expression across tensors.
Element Wise Options
enableGenerateCodelet
(true, false) [=true]If true (and all of the inputs are the same size and do not alias), a codelet is generated to execute this map operation. A codelet will not be generated if there is only a single operation unless
forceGenerateCodelet
is true.- Return
A tensor containing the elements resulting from the application of the expression across the tensors.
- Parameters
graph
: The graph to update.expr
: The expression to map across the tensors. The placeholders in the expressions will be substituted with corresponding elements from the tensors ints
.ts
: The list of tensors to map the expression across. If elements from these tensors are used in binary/ternary operations in the expression the numpy-style broadcast rules are used to match the shapes of the tensors (see poputil::broadcastToMatch()).prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: A list of flags to pass to the expression evaluator.
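The numpy-style shape matching that map() applies to its operands (via poputil::broadcastToMatch) can be sketched on the host (plain C++; the helper name is illustrative):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Align ranks from the right by prepending singleton dimensions, then require
// each dimension pair to be equal or contain a 1; the result takes the larger
// extent in each dimension.
std::vector<std::size_t> broadcastShape(std::vector<std::size_t> a,
                                        std::vector<std::size_t> b) {
  if (a.size() < b.size())
    a.insert(a.begin(), b.size() - a.size(), 1);
  else if (b.size() < a.size())
    b.insert(b.begin(), a.size() - b.size(), 1);
  std::vector<std::size_t> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] != b[i] && a[i] != 1 && b[i] != 1)
      throw std::runtime_error("shapes cannot be broadcast");
    out[i] = std::max(a[i], b[i]);
  }
  return out;
}
```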
-
poplar::Tensor
map
(poplar::Graph &graph, expr::UnaryOpType op, const poplar::Tensor &t, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
map
(poplar::Graph &graph, expr::BinaryOpType op, const poplar::Tensor &a, const poplar::Tensor &b, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
map
(poplar::Graph &graph, expr::TernaryOpType op, const poplar::Tensor &a, const poplar::Tensor &b, const poplar::Tensor &c, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
mapInPlace
(poplar::Graph &graph, const expr::Expr &expr, const std::vector<poplar::Tensor> &ts, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Map an expression across tensors and assign it back to the first tensor given.
- Parameters
graph
: The graph to update.expr
: The expression to map across the tensors. The placeholders in the expressions will be substituted with corresponding elements from the tensors ints
. The result of the expression is then written to the elements of the first tensor ints
.ts
: The list of tensors to map the expression across. If elements from these tensors are used in binary/ternary operations in the expression the numpy-style broadcast rules are used to match the shapes of the tensors (see poputil::broadcastToMatch()).prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element wise options. See map().
-
void
mapInPlace
(poplar::Graph &graph, expr::UnaryOpType op, const poplar::Tensor &t, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
mapInPlace
(poplar::Graph &graph, expr::BinaryOpType op, const poplar::Tensor &a, const poplar::Tensor &b, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
mapInPlace
(poplar::Graph &graph, expr::TernaryOpType op, const poplar::Tensor &a, const poplar::Tensor &b, const poplar::Tensor &c, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
abs
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the absolute value of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::abs(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
absInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
asin
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the arc-sine of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::asin(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
asinInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
bitwiseNot
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the bitwise NOT operation for each element in
A
.- Return
A tensor where each element is equivalent to the result of
~a
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
bitwiseNotInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
ceil
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the ceiling of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::ceil(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
ceilInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
countLeadingZeros
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the number of binary leading zeros of each element in
A
.- Note
If the element is zero then it is treated as 32 leading zeros.
- Return
A tensor where each element is equivalent to the result of
a ? __builtin_clz(a) : 32
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
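The documented zero case can be reproduced with a portable host-side sketch (plain C++; equivalent to a ? __builtin_clz(a) : 32 for 32-bit elements):

```cpp
#include <cassert>
#include <cstdint>

// Count leading zero bits of a 32-bit value; zero yields 32.
int countLeadingZeros32(std::uint32_t a) {
  int n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (a & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}
```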
-
void
countLeadingZerosInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
cos
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the cosine of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::cos(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
cosInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
exp
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the exponential of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::exp(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
expInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
expm1
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the exponential of each element in
A
minus one (that is, exp(a) - 1).
A tensor where each element is equivalent to the result of
std::expm1(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
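The point of a dedicated expm1 is numerical accuracy near zero, where computing exp(a) and then subtracting 1 cancels almost all significant digits; a host-side illustration with the C++ standard library:

```cpp
#include <cassert>
#include <cmath>

// exp(a) - 1 computed naively loses precision for small a...
double naiveExpMinusOne(double a) { return std::exp(a) - 1.0; }

// ...while expm1 evaluates the same quantity accurately.
double accurateExpMinusOne(double a) { return std::expm1(a); }
```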
-
void
expm1InPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
floor
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the floor of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::floor(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
floorInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
inv
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the inverse of each element in
A
.- Return
A tensor where each element is equivalent to the result of
1 / a
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
invInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
log
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the log base-e of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::log(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
logInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
log1p
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the log base-e of each element in
A
plus one (that is, log(1 + a)).
A tensor where each element is equivalent to the result of
std::log1p(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
log1pInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
logicalNot
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the logical NOT of each element in
A
.- Return
A tensor where each element is equivalent to the result of
!a
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
logicalNotInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
neg
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the negation of each element in
A
.- Return
A tensor where each element is equivalent to the result of
-1 * a
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
negInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
popcount
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the popcount of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::popcount(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
popcountInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
signum
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the signum of each element in
A
.- Return
A tensor where each element is one of -1, 0 or +1 if the corresponding element in
A
was less than, equal to or greater than 0 respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
signumInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
sin
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the sine of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::sin(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
sinInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
tan
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the tangent of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::tan(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
tanInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
tanh
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the hyperbolic tangent of each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::tanh(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
tanhInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
round
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Round each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::round(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
roundInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
sqrt
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the square root for each element in
A
.- Return
A tensor where each element is equivalent to the result of
std::sqrt(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
sqrtInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
square
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the square for each element in
A
.- Return
A tensor where each element is equivalent to the result of
a * a
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
squareInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
sigmoid
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the sigmoid for each element in
A
.- Return
A tensor where each element is equivalent to the result of
1 / (1 + exp(-a))
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
sigmoidInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
rsqrt
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the reciprocal square root for each element in
A
.- Return
A tensor where each element is equivalent to the result of
1 / sqrt(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
rsqrtInPlace
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
isFinite
(poplar::Graph &graph, const poplar::Tensor &A, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Check if each element in
A
is finite.- Return
A tensor where each element is equivalent to the result of
std::isfinite(a)
where a is an element ofA
.- Parameters
graph
: The graph to update.A
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
voidcheckTypes
(poplar::Type elementType, constType constant)¶ Check that the host compile-time type
constType
is compatible with the run-time IPU typeelementType
.- Parameters
elementType
: The run-time IPU type.constant
: Unused.
- Template Parameters
constType
: The host compile-time type.
- Exceptions
std::runtime_error
: if the types are not compatible.
-
poplar::Tensor
add
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Add each element in
A
to the corresponding element inB
.- Return
A tensor where each element is the result of
a + b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensoradd
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensoradd
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
addInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidaddInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
atan2
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the two-argument arctangent of each element in
A
with the corresponding element inB
.- Return
A tensor where each element is the result of
atan2(a, b)
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensoratan2
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensoratan2
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
atan2InPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidatan2InPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
bitwiseAnd
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the bitwise AND of each element in
A
with the corresponding element inB
.- Return
A tensor where each element is the result of
a & b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::TensorbitwiseAnd
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorbitwiseAnd
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
bitwiseAndInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidbitwiseAndInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
bitwiseOr
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the bitwise OR of each element in
A
with the corresponding element inB
.- Return
A tensor where each element is the result of
a | b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::TensorbitwiseOr
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorbitwiseOr
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
bitwiseOrInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidbitwiseOrInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
bitwiseXor
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the bitwise XOR of each element in
A
with the corresponding element inB
.- Return
A tensor where each element is the result of
a ^ b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::TensorbitwiseXor
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorbitwiseXor
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
bitwiseXorInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidbitwiseXorInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
bitwiseXnor
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the bitwise XNOR of each element in
A
with the corresponding element inB
.- Return
A tensor where each element is the result of
~(a ^ b)
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::TensorbitwiseXnor
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorbitwiseXnor
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
bitwiseXnorInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidbitwiseXnorInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
div
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Divide each element in
A
by the corresponding element inB
.- Return
A tensor where each element is the result of
a / b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: The tensor of dividends.B
: The tensor of divisors.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensordiv
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensordiv
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
divInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voiddivInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
eq
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Check if each element in
A
is equal to the corresponding element inB
.- Return
A tensor where each element is the result of
a == b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensoreq
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensoreq
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
eqInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voideqInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
gteq
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Check if each element in
A
is greater than or equal to the corresponding element inB
.- Return
A tensor where each element is the result of
a >= b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensorgteq
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensorgteq
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
gteqInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidgteqInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
gt
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Check if each element in
A
is greater than the corresponding element inB
.- Return
A tensor where each element is the result of
a > b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensorgt
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensorgt
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
gtInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidgtInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
invStdDevToVariance
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ See invStdDevToVariance().
-
template<typename
constType
>
poplar::TensorinvStdDevToVariance
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorinvStdDevToVariance
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
invStdDevToVarianceInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidinvStdDevToVarianceInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
lteq
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Check if each element in
A
is less than or equal to the corresponding element inB
.- Return
A tensor where each element is the result of
a <= b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensorlteq
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensorlteq
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
lteqInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidlteqInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
logicalAnd
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the logical AND (
&&
) of each element inA
with the corresponding element inB
.- Return
A tensor where each element is the result of
a && b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::TensorlogicalAnd
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorlogicalAnd
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
logicalAndInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidlogicalAndInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
logicalOr
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the logical OR (
||
) of each element inA
with the corresponding element inB
.- Return
A tensor where each element is the result of
a || b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::TensorlogicalOr
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorlogicalOr
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
logicalOrInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidlogicalOrInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
lt
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Check if each element in
A
is less than the corresponding element inB
.- Return
A tensor where each element is the result of
a < b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensorlt
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensorlt
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
ltInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidltInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
max
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the maximum of each element in
A
with the corresponding element inB
.- Return
A tensor where each element is the result of
max(a, b)
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensormax
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensormax
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
maxInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidmaxInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
min
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the minimum of each element in
A
with the corresponding element inB
.- Return
A tensor where each element is the result of
min(a, b)
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensormin
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensormin
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
minInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidminInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
mul
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Multiply each element in
A
by the corresponding element inB
.- Return
A tensor where each element is the result of
a * b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensormul
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensormul
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
mulInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidmulInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
neq
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Check if each element in
A
is not equal to the corresponding element inB
.- Return
A tensor where each element is the result of
a != b
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: A tensor of elements.B
: A tensor of elements.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensorneq
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensorneq
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
neqInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidneqInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
pow
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute each element in
A
to the power of the corresponding element inB
.- Return
A tensor where each element is equal to
pow(a, b)
, where a and b are the corresponding elements ofA
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: The tensor of bases.B
: The tensor of exponents.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensorpow
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensorpow
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
powInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidpowInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
rem
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute the remainder of each element in
A
divided by the corresponding element inB
.- Return
A tensor where each element is equal to a % b, where a and b are the corresponding elements of
A
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: The tensor of dividends.B
: The tensor of divisors.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensorrem
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensorrem
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
remInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidremInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
shiftLeft
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Shift the elements of
A
left by the corresponding elements ofB
.- Return
A tensor where each element is equal to a << b, where a and b are the corresponding elements of
A
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: The tensor of elements to left-shift.B
: The tensor of elements that describe the amount to left-shift A by.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::TensorshiftLeft
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorshiftLeft
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
shiftLeftInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidshiftLeftInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
shiftRight
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Shift the elements of
A
right by the corresponding elements ofB
.- Return
A tensor where each element is equal to a >> b (without sign extension), where a and b are the corresponding elements of
A
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: The tensor of elements to right-shift.B
: The tensor of elements that describe the amount to right-shift A by.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::TensorshiftRight
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ See shiftRight().
-
template<typename
constType
>
poplar::TensorshiftRight
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ See shiftRight().
-
void
shiftRightInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ See shiftRight().
-
template<typename
constType
>
voidshiftRightInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ See shiftRight().
-
poplar::Tensor
shiftRightSignExtend
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Shift the elements of
A
right with sign extension by the corresponding elements ofB
.- Return
A tensor where each element is equal to a >> b with sign extension, where a and b are the corresponding elements of
A
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: The tensor of elements to right-shift.B
: The tensor of elements that describe the amount to right-shift A by.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::TensorshiftRightSignExtend
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorshiftRightSignExtend
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
shiftRightSignExtendInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidshiftRightSignExtendInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
sub
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Subtract the elements of
B
fromA
and return the result in a new tensor.- Return
A tensor where each element is equal to a - b, where a and b are the corresponding elements of
A
andB
tensors respectively.- Parameters
graph
: The graph to update.A
: The tensor of elements which will be subtracted from.B
: The tensor of elements to subtract fromA
.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
template<typename
constType
>
poplar::Tensorsub
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::Tensorsub
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
subInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidsubInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
varianceToInvStdDev
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorvarianceToInvStdDev
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
poplar::TensorvarianceToInvStdDev
(poplar::Graph &graph, const constType A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
varianceToInvStdDevInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
template<typename
constType
>
voidvarianceToInvStdDevInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const constType B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
select
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, const poplar::Tensor &C, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Populate the returned tensor with elements from
A
orB
depending on the corresponding element ofC
.That is, for each element in the output compute
c ? a : b
, where a, b, c are the corresponding elements in the tensorsA
,B
,C
respectively.- Return
A tensor containing the elements from
A
where the corresponding elements inC
were not equal to zero and containing the elements fromB
where the corresponding elements inC
were zero.- Parameters
graph
: The graph to update.A
: One of the tensors containing the elements to select from.B
: One of the tensors containing the elements to select from.C
: The tensor containing the elements to use as predicates.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
-
void
selectInPlace
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, const poplar::Tensor &C, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor
clamp
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, const poplar::Tensor &C, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Populate the returned tensor with elements from
A
but clamp them such that each element is greater than or equal to the corresponding element inB
and less than or equal to the corresponding element inC
.That is, for each element in the returned tensor compute:
min(max(a, b), c)
where a, b, c are the corresponding elements in the tensorsA
,B
,C
respectively.- Return
A tensor containing the elements resulting from the application of the expression across the tensors.
- Parameters
graph
: The graph to update.A
: The tensor containing the elements to clamp.B
: The tensor containing the elements to use as minimums.C
: The tensor containing the elements to use as maximums.prog
: The sequence to extend with the execution of the expression evaluation.debugPrefix
: A debug prefix to be added to debug strings in compute sets and variables created by this function.options
: Element-wise options. See map().
4.2.8. popops/elementwiseCodelets.hpp¶
Defines
-
INSTANTIATE_OP_1
(v, op, t)¶
-
INSTANTIATE_OP_2
(v, op, t, ...)¶
-
INSTANTIATE_OP_3
(v, op, t, ...)¶
-
INSTANTIATE_OP_4
(v, op, t, ...)¶
-
INSTANTIATE_OP_5
(v, op, t, ...)¶
-
SELECT_VARGS
(_1, _2, _3, _4, _5, NAME, ...)¶
-
INSTANTIATE_OP
(v, op, ...)¶
Functions
-
__attribute__((always_inline)) static unsigned
getWsr
(void)¶
-
__attribute__ ((noinline)) unsigned divideWork(const unsigned size, const unsigned vectorWidthShifts, const unsigned worker)¶
Variables
-
constexpr auto
ONE_PTR
= poplar::VectorLayout::ONE_PTR¶
-
constexpr auto
SPAN
= poplar::VectorLayout::SPAN¶
-
constexpr auto
SCALED_PTR64
= poplar::VectorLayout::SCALED_PTR64¶
-
constexpr auto
SCALED_PTR32
= poplar::VectorLayout::SCALED_PTR32¶
4.2.9. popops/ElementWiseUtil.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Functions
-
poplar::Tensor
createOutputForElementWiseOp
(poplar::Graph &graph, const std::vector<poplar::Tensor> &inputs, const poplar::Type &outputType, const std::string &debugName = "")¶ Create a tensor for use as the output of an element-wise operation (operation with no dependency between more than one element of the output and any given element of any input tensor).
Use the mapping of this tensor to map element-wise operations to tiles to produce an operation that is computationally balanced across tiles and which minimises exchange.
All input tensors must have the same shape.
- Return
A tensor with the same shape as the given inputs, with a complete tile mapping.
- Parameters
graph
: A graph to add the tensor to and which the inputs belong to.inputs
: List of input tensors for the element-wise operation.outputType
: The element type of the tensor.debugName
: Debug name given to the tensor.
4.2.10. popops/Encoding.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Functions
-
void
encodeOneHot
(poplar::Graph &graph, const poplar::Tensor &indices, const poplar::Tensor &encoded, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Encode a given set of indices as a set of one-hot vectors per-index with a hot element at that index.
i.e. given a 1-dimensional
indices
tensor with length N and a 2-dimensionalencoded
tensor with shape N * x, each row ofencoded
has a single element equal to 1, and all others equal to 0. The single hot element in each row is given by the indices inindices
.- Parameters
graph
: The graph to add the tensor and any vertices needed for the encoding to.encoded
: Tensor to encode output to.indices
: 1-dimensional tensor containing indices to encode as one-hot vectors. A codepoint MASKED_LABEL_CODE is reserved to indicate that the encoding is not done for that index.prog
: Sequence which the programs that perform the encoding are added to.debugPrefix
: Optional debug prefix for programs/variables used to perform the encoding.
-
void
encodeOneHot
(poplar::Graph &graph, const poplar::Tensor &indices, const poplar::Tensor &encoded, poplar::program::Sequence &prog, const poplar::Tensor &on, const poplar::Tensor &off, const std::string &debugPrefix = "")¶ Encode a given set of indices as a set of one-hot vectors per-index with a hot element at that index.
i.e. given a 1-dimensional
indices
tensor with length N and a 2-dimensionalencoded
tensor with shape N * x, each row ofencoded
has a single element equal toon
, and all others equal tooff
, as given by the user. The single hot element in each row is given by the indices inindices
.- Parameters
graph
: The graph to add the tensor and any vertices needed for the encoding to.encoded
: Tensor to encode output to.indices
: 1-dimensional tensor containing indices to encode as one-hot vectors.prog
: Sequence which the programs that perform the encoding are added to.debugPrefix
: Optional debug prefix for programs/variables used to perform the encoding.on
: Value which represents the “On” state in the one hot encoded output.off
: Value which represents the “Off” state.
-
void
iota
(poplar::Graph &graph, const poplar::Tensor &t, unsigned startInteger, poplar::program::Sequence &prog, const std::string &debugPrefix)¶ Fills the mapped 1-D output tensor with the right-open range of integers [startInteger, startInteger + length), where length is the number of elements of
t
.Output tensor can be of type INT or UNSIGNED_INT.
- Parameters
graph
: The graph to add the tensor and any vertices needed for the operation.t
: 1-D tensor to write the encoded output to. Tensor must be mapped.startInteger
: The start integer in the output range.prog
: Sequence which the programs that perform the encoding are added to.debugPrefix
: Optional debug prefix for programs/variables used to perform the encoding.
-
void
iota
(poplar::Graph &graph, const poplar::Tensor &t, int startInteger, poplar::program::Sequence &prog, const std::string &debugPrefix)¶ Fills the mapped 1-D output tensor with the right-open range of integers [startInteger, startInteger + length), where length is the number of elements of
t
.Output tensor can be of type INT or UNSIGNED_INT.
- Parameters
graph
: The graph to add the tensor and any vertices needed for the operation.t
: 1-D tensor to write the encoded output to. Tensor must be mapped.startInteger
: The start integer in the output range.prog
: Sequence which the programs that perform the encoding are added to.debugPrefix
: Optional debug prefix for programs/variables used to perform the encoding.
4.2.12. popops/Expr.hpp¶
Defines
-
POPLIBS_DEFINE_EXPR_UNARY_OP
(Name, Op)¶
-
POPLIBS_DEFINE_EXPR_UNARY_OP_AND_SYMBOL
(Name, Op, Sym)¶
-
POPLIBS_DEFINE_EXPR_BINARY_OP
(Name, Op)¶
-
POPLIBS_DEFINE_EXPR_BINARY_OP_AND_SYMBOL
(Name, Op, Sym)¶
-
POPLIBS_DEFINE_EXPR_TERNARY_OP
(Name, Op)¶
-
namespace
popops
Common functions, such as elementwise and reductions.
-
namespace
expr
¶ Functions
-
const PlaceHolder _1 (1)
-
const PlaceHolder _2 (2)
-
const PlaceHolder _3 (3)
-
const PlaceHolder _4 (4)
-
const PlaceHolder _5 (5)
-
const PlaceHolder _6 (6)
-
const PlaceHolder _7 (7)
-
const PlaceHolder _8 (8)
-
const PlaceHolder _9 (9)
-
const PlaceHolder _10 (10)
-
const PlaceHolder _11 (11)
-
const PlaceHolder _12 (12)
-
const PlaceHolder _13 (13)
-
const PlaceHolder _14 (14)
-
const PlaceHolder _15 (15)
-
const PlaceHolder _16 (16)
-
const PlaceHolder _17 (17)
-
const PlaceHolder _18 (18)
-
const PlaceHolder _19 (19)
-
const PlaceHolder _20 (20)
-
BitwiseNot
operator~
(const Expr &a)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, Add>::typeoperator+
(const T &a, const Expr &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, Add>::typeoperator+
(const Expr &a, const T &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, BitwiseAnd>::typeoperator&
(const T &a, const Expr &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, BitwiseAnd>::typeoperator&
(const Expr &a, const T &b)¶
-
BitwiseAnd
operator&
(const Expr &a, const Expr &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, BitwiseOr>::typeoperator|
(const T &a, const Expr &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, BitwiseOr>::typeoperator|
(const Expr &a, const T &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, BitwiseXor>::typeoperator^
(const T &a, const Expr &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, BitwiseXor>::typeoperator^
(const Expr &a, const T &b)¶
-
BitwiseXor
operator^
(const Expr &a, const Expr &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, Divide>::typeoperator/
(const T &a, const Expr &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, Divide>::typeoperator/
(const Expr &a, const T &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, Equal>::typeoperator==
(const T &a, const Expr &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, Equal>::typeoperator==
(const Expr &a, const T &b)¶
-
template<typename
T
>
std::enable_if<!std::is_base_of<Expr, T>::value, Gte>::type operator>=(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Gte>::type operator>=(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Gt>::type operator>(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Gt>::type operator>(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Lte>::type operator<=(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Lte>::type operator<=(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, And>::type operator&&(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, And>::type operator&&(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Or>::type operator||(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Or>::type operator||(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Lt>::type operator<(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Lt>::type operator<(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Mul>::type operator*(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Mul>::type operator*(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, NotEqual>::type operator!=(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, NotEqual>::type operator!=(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Rem>::type operator%(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Rem>::type operator%(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Shl>::type operator<<(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Shl>::type operator<<(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Shr>::type operator>>(const T &a, const Expr &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Shr>::type operator>>(const Expr &a, const T &b)¶
-
template<typename T>
std::enable_if<!std::is_base_of<Expr, T>::value, Sub>::type operator-(const T &a, const Expr &b)¶
-
class Any¶
-
class BinaryOp : public popops::expr::ExprType<BinaryOp>¶
Subclassed by popops::expr::Add, popops::expr::And, popops::expr::Atan2, popops::expr::BitwiseAnd, popops::expr::BitwiseOr, popops::expr::BitwiseXnor, popops::expr::BitwiseXor, popops::expr::Divide, popops::expr::Equal, popops::expr::Gt, popops::expr::Gte, popops::expr::InvStdDevToVariance, popops::expr::Lt, popops::expr::Lte, popops::expr::Max, popops::expr::Min, popops::expr::Mul, popops::expr::NotEqual, popops::expr::Or, popops::expr::Pow, popops::expr::Rem, popops::expr::Shl, popops::expr::Shr, popops::expr::ShrSE, popops::expr::Sub, popops::expr::VarianceToInvStdDev
Public Functions
-
BinaryOp(BinaryOpType type, const Expr &a, const Expr &b)¶
-
BinaryOpType getOpType() const¶
-
-
class Const : public popops::expr::ExprType<Const>¶
Subclassed by popops::expr::ConstHalf
Public Functions
-
Const(poplar::TypeTraits typeTraits_, poplar::Type type_, const char *data_)¶
-
char *getData() const¶
-
const poplar::TypeTraits &getTypeTraits() const¶
-
-
class Expr¶
- #include <Expr.hpp>
Type to represent element expressions.
This class represents an expression that can be applied to elements of tensors.
The type is an abstract type which can be instantiated by its sub-classes to build up expressions, for example: Tanh(Add(Square(_1), Const(3))).
Expressions can be applied to tensors with the popops::map() and popops::mapInPlace() functions.
Subclassed by popops::expr::ExprType< BinaryOp >, popops::expr::ExprType< Cast >, popops::expr::ExprType< Const >, popops::expr::ExprType< PlaceHolder >, popops::expr::ExprType< TernaryOp >, popops::expr::ExprType< UnaryOp >, popops::expr::ExprType< T >
Protected Types
-
using ExprClassID = void (*)(void)¶
Protected Functions
-
Expr(ExprClassID classId)¶
Protected Attributes
-
ExprClassID classId¶
-
template<class T>
class ExprType : public popops::expr::Expr¶
Subclassed by popops::expr::BinaryOp, popops::expr::Cast, popops::expr::Const, popops::expr::PlaceHolder, popops::expr::TernaryOp, popops::expr::UnaryOp
Public Functions
-
ExprType()¶
Friends
- friend class Expr
-
-
class TernaryOp : public popops::expr::ExprType<TernaryOp>¶
Subclassed by popops::expr::Clamp, popops::expr::Select
Public Functions
-
TernaryOp(TernaryOpType type, const Expr &a, const Expr &b, const Expr &c)¶
-
TernaryOpType getOpType() const¶
Private Members
-
TernaryOpType type¶
-
-
class UnaryOp : public popops::expr::ExprType<UnaryOp>¶
Subclassed by popops::expr::Abs, popops::expr::Asin, popops::expr::BitwiseNot, popops::expr::Ceil, popops::expr::Cos, popops::expr::Exp, popops::expr::Expm1, popops::expr::Floor, popops::expr::Inv, popops::expr::IsFinite, popops::expr::IsInf, popops::expr::IsNaN, popops::expr::Log, popops::expr::Log1p, popops::expr::Neg, popops::expr::Not, popops::expr::Round, popops::expr::Rsqrt, popops::expr::Sigmoid, popops::expr::Signum, popops::expr::Sin, popops::expr::Sqrt, popops::expr::Square, popops::expr::Tan, popops::expr::Tanh
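The expression classes above are usually built via the overloaded operators and applied with popops::map(). A minimal sketch (not compiled here; the graph, tensors and sequence names are illustrative assumptions):

```cpp
#include <popops/ElementWise.hpp>

using namespace popops::expr;

// Assumes a constructed poplar::Graph "graph", input tensors "t1" and "t2"
// of the same shape, and a poplar::program::Sequence "prog".
poplar::Tensor scaledRelu(poplar::Graph &graph, const poplar::Tensor &t1,
                          const poplar::Tensor &t2,
                          poplar::program::Sequence &prog) {
  // Max(_1, 0) * _2: the overloaded operator* builds a Mul expression node
  // from the Max sub-expression and the second placeholder.
  auto e = Max(_1, Const(0.0f)) * _2;
  // Evaluate the expression elementwise over both input tensors.
  return popops::map(graph, e, {t1, t2}, prog, "scaledRelu");
}
```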
4.2.13. popops/ExprOp.hpp¶
-
namespace popops
Common functions, such as elementwise and reductions.
-
namespace expr
Enums
-
enum BinaryOpType¶
Values:
-
enumerator ADD¶
-
enumerator ATAN2¶
-
enumerator BITWISE_AND¶
-
enumerator BITWISE_OR¶
-
enumerator BITWISE_XOR¶
-
enumerator BITWISE_XNOR¶
-
enumerator DIVIDE¶
-
enumerator EQUAL¶
-
enumerator GREATER_THAN_EQUAL¶
-
enumerator GREATER_THAN¶
-
enumerator INV_STD_DEV_TO_VARIANCE¶
-
enumerator LESS_THAN_EQUAL¶
-
enumerator LOGICAL_AND¶
-
enumerator LOGICAL_OR¶
-
enumerator LESS_THAN¶
-
enumerator MAXIMUM¶
-
enumerator MINIMUM¶
-
enumerator MULTIPLY¶
-
enumerator NOT_EQUAL¶
-
enumerator POWER¶
-
enumerator REMAINDER¶
-
enumerator SHIFT_LEFT¶
-
enumerator SHIFT_RIGHT¶
-
enumerator SHIFT_RIGHT_SIGN_EXTEND¶
-
enumerator SUBTRACT¶
-
enumerator VARIANCE_TO_INV_STD_DEV¶
-
enum UnaryOpType¶
Values:
-
enumerator ABSOLUTE¶
-
enumerator ASIN¶
-
enumerator BITWISE_NOT¶
-
enumerator CEIL¶
-
enumerator COS¶
-
enumerator COUNT_LEADING_ZEROS¶
-
enumerator EXPONENT¶
-
enumerator EXPONENT_MINUS_ONE¶
-
enumerator FLOOR¶
-
enumerator INVERSE¶
-
enumerator IS_FINITE¶
-
enumerator IS_INF¶
-
enumerator IS_NAN¶
-
enumerator LOGARITHM¶
-
enumerator LOGARITHM_ONE_PLUS¶
-
enumerator LOGICAL_NOT¶
-
enumerator NEGATE¶
-
enumerator POPCOUNT¶
-
enumerator SIGNUM¶
-
enumerator SIN¶
-
enumerator TAN¶
-
enumerator TANH¶
-
enumerator ROUND¶
-
enumerator SQRT¶
-
enumerator SQUARE¶
-
enumerator SIGMOID¶
-
enumerator RSQRT¶
4.2.14. popops/Fill.hpp¶
-
namespace popops
Common functions, such as elementwise and reductions.
Functions
-
void fill(poplar::Graph &graph, const poplar::Tensor &t, poplar::program::Sequence &prog, const void *fillValue, const poplar::TypeTraits &traits, const std::string &debugPrefix = "")¶
-
template<typename FillValueType>
void fill(poplar::Graph &graph, const poplar::Tensor &t, poplar::program::Sequence &prog, FillValueType fillValue, const std::string &debugPrefix = "")¶
Appends programs to prog which fill all elements of the tensor t with the value fillValue.
- Note
The type of fillValue must be compatible with the element type of t.
- Parameters
graph: The graph that the operation will be added to.
t: The tensor whose elements are to be filled.
prog: Poplar program sequence to append the operation onto.
fillValue: The value to fill t with.
debugPrefix: Name of the operation, for debugging.
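A hedged usage sketch of the templated fill() overload (not compiled here; the graph, tensor and sequence are assumed to exist elsewhere):

```cpp
#include <popops/Fill.hpp>

// Assumes a constructed poplar::Graph "graph", a float tensor "t" and a
// poplar::program::Sequence "prog" (names are illustrative).
void initOnes(poplar::Graph &graph, const poplar::Tensor &t,
              poplar::program::Sequence &prog) {
  // Fill every element of t with 1.0f; the fill value's type must be
  // compatible with t's element type.
  popops::fill(graph, t, prog, 1.0f, "initOnes");
}
```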
4.2.15. popops/Gather.hpp¶
-
namespace popops
Common functions, such as elementwise and reductions.
Functions
-
poplar::Tensor createGatherInput(poplar::Graph &graph, const poplar::Type &type, const std::vector<std::size_t> &operandShape, unsigned axis, GatherParams params = {}, const std::string &name = "")¶
Create the input of a gather with a single gather axis.
This is designed to spread the gather, and each dynamic slice within the gather, evenly across the tiles.
- Return
A tensor with the desired shape.
- Parameters
graph: The Poplar graph.
type: The data type of the required tensor.
operandShape: The desired shape of the input.
axis: The axis that will be gathered on.
params: The same parameters as used by gather().
name: The name of the tensor.
-
poplar::Tensor gather(poplar::Graph &graph, const poplar::Tensor &input, const poplar::Tensor &indices, unsigned axis, poplar::program::Sequence &prog, GatherParams params, const std::string &debugPrefix = "")¶
The gather operation stitches together several slices (each slice at a potentially different runtime offset) of an input tensor.
To achieve the best performance, the input tensor should be created with createGatherInput().
- Note
The indices are treated as offsets along the chosen axis. At each offset a slice of depth 1 in the axis dimension is taken.
- Return
The gathered slices from the input, with rank y + (x - 1).
- Parameters
graph: The Poplar graph.
input: The tensor we are gathering from, of rank x.
indices: Tensor of rank y containing the indices of the slices we gather.
axis: The axis to gather on; axis must be less than x.
prog: The program sequence to add this operation to.
params: Parameters for the form of the gather.
debugPrefix: A debug name for the operation.
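A hedged sketch of the single-axis gather (not compiled here; the graph, sequence, shapes and the unsigned-integer indices tensor are illustrative assumptions):

```cpp
#include <popops/Gather.hpp>

// Assumes a constructed poplar::Graph "graph", a poplar::program::Sequence
// "prog" and an indices tensor of unsigned integers (names illustrative).
poplar::Tensor pickRows(poplar::Graph &graph, poplar::program::Sequence &prog,
                        const poplar::Tensor &indices) {
  // Create a {4, 3} gather input laid out for an efficient gather on axis 0.
  auto input = popops::createGatherInput(graph, poplar::FLOAT, {4, 3},
                                         /*axis=*/0, {}, "gatherIn");
  // Each index selects one depth-1 slice (a row) along axis 0, so the
  // result has shape {indices.dim(0), 3}.
  return popops::gather(graph, input, indices, /*axis=*/0, prog, {},
                        "pickRows");
}
```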
-
poplar::Tensor createGatherInput(poplar::Graph &graph, const poplar::Type &type, const std::vector<std::size_t> &inputShape, const std::vector<std::size_t> &sliceSizes, std::vector<unsigned> startIndexMap, const std::string &name = "")¶
Create the input of a gather given a start index map.
This is designed to spread the gather, and each dynamic slice within the gather, evenly across the tiles.
- Return
A tensor with the desired shape.
- Parameters
graph: The Poplar graph.
type: The data type of the required tensor.
inputShape: The desired shape of the input.
sliceSizes: sliceSizes[i] is the bounds for the slice on dimension i.
startIndexMap: A map that describes how to map indices in startIndices to legal indices into input.
name: The name of the tensor.
-
poplar::Tensor gather(poplar::Graph &graph, const poplar::Tensor &input, const poplar::Tensor &indices, std::size_t indexVectorDim, const std::vector<std::size_t> &offsetDims, const std::vector<std::size_t> &sliceSizes, const std::vector<std::size_t> &collapsedSliceDims, const std::vector<unsigned> &startIndexMap, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
The gather operation stitches together several slices (each slice at a potentially different runtime offset) of an input tensor.
To achieve the best performance, the input tensor should be created with createGatherInput().
Example usage where we want to take 2 elements from a given tensor:
// The runtime defined input tensor
input = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}; // shape = {3, 3}
// The runtime defined indices tensor containing the coords we want to
// extract
indices = {{1, 1}, {2, 1}}; // shape = {2, 2}
// We want to extract elems at [1, 1] and [2, 1] from the input.
// To achieve this we need to define the other parameters correctly.
// We want to treat the rows of indices as coords into the input tensor
indexVectorDim = 1;
// None of the output dims will correspond to any of the input dims
offsetDims = {};
// We will be taking 1x1 slices to pick single elements
sliceSizes = {1, 1};
// We will collapse both dims of the input slices
collapsedSliceDims = {0, 1};
// An identity mapping between the indices coords and the input dims
startIndexMap = {0, 1};
// Perform the desired gather
result = gather(input, indices, indexVectorDim, offsetDims, sliceSizes,
                collapsedSliceDims, startIndexMap); // = {5, 8}, shape = {2}
- Note
When indexVectorDim == indices.rank(), the indices are interpreted as scalar values.
- Note
This is a near direct port of https://www.tensorflow.org/xla/operation_semantics#gather from tensorflow/compiler/xla/service/gather_expander.cc
- Return
The gathered slices from the input.
- Parameters
graph: The Poplar graph.
input: The tensor we are gathering from.
indices: Tensor containing the starting indices of the slices we gather.
indexVectorDim: The dimension in indices that “contains” the starting indices.
offsetDims: The set of dimensions in the output shape that offset into a tensor sliced from input.
sliceSizes: sliceSizes[i] is the bounds for the slice on dimension i.
collapsedSliceDims: The set of dimensions in each slice that are collapsed away. These dimensions must have size 1.
startIndexMap: A map that describes how to map indices in startIndices to legal indices into input.
prog: The program sequence to add this operation to.
debugPrefix: A debug name for the operation.
-
struct GatherParams¶
4.2.16. popops/HostSliceTensor.hpp¶
-
namespace poplar
4.2.17. popops/NaN.hpp¶
-
namespace popops
Common functions, such as elementwise and reductions.
Functions
-
poplar::Tensor hasNaN(poplar::Graph &graph, const poplar::Tensor &src, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
Takes a tensor of any shape, of type float or half, and returns a new scalar bool tensor whose only element is true if any element of the src tensor is a NaN.
- Parameters
graph: The graph to add the tensor and any vertices to.
src: The input tensor; the type must be floating point.
prog: Sequence to add programs to, to perform the check.
debugPrefix: Optional debug prefix for programs/variables.
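A hedged sketch of the NaN check (not compiled here; the graph, source tensor and sequence are assumed to exist elsewhere):

```cpp
#include <popops/NaN.hpp>

// Assumes a constructed poplar::Graph "graph", a float or half tensor "src"
// and a poplar::program::Sequence "prog" (names are illustrative).
poplar::Tensor checkForNaN(poplar::Graph &graph, const poplar::Tensor &src,
                           poplar::program::Sequence &prog) {
  // Returns a scalar bool tensor: true if any element of src is NaN.
  return popops::hasNaN(graph, src, prog, "nanCheck");
}
```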
4.2.18. popops/Operation.hpp¶
-
namespace popops
Common functions, such as elementwise and reductions.
Enums
4.2.19. popops/Pad.hpp¶
-
namespace popops
Common functions, such as elementwise and reductions.
Functions
-
poplar::Tensor pad(poplar::Graph &graph, const poplar::Tensor &t, const std::vector<std::ptrdiff_t> &paddingLower, const std::vector<std::ptrdiff_t> &paddingUpper, float val = 0.0f, padding::MappingMethod mappingMethod = padding::MappingMethod::ZERO)¶
Return a tensor with constant padding added.
- Return
The tensor with padding added.
- Parameters
graph: The graph containing the tensor.
t: The tensor to pad.
paddingLower: A vector specifying the amount of padding to add at the start of each dimension. Negative padding truncates.
paddingUpper: A vector specifying the amount of padding to add at the end of each dimension. Negative padding truncates.
val: The input tensor will be padded with this value.
mappingMethod: The method that should be used to map added padding elements.
-
poplar::Tensor pad(poplar::Graph &graph, const poplar::Tensor &t, const std::vector<std::ptrdiff_t> &paddingLower, const std::vector<std::ptrdiff_t> &paddingUpper, int val, padding::MappingMethod mappingMethod = padding::MappingMethod::ZERO)¶
-
poplar::Tensor pad(poplar::Graph &graph, const poplar::Tensor &t, const std::vector<std::ptrdiff_t> &paddingLower, const std::vector<std::ptrdiff_t> &paddingUpper, const poplar::Tensor &val, padding::MappingMethod mappingMethod = padding::MappingMethod::ZERO)¶
-
poplar::Tensor pad(poplar::Graph &graph, const poplar::Tensor &t, std::ptrdiff_t paddingLower, std::ptrdiff_t paddingUpper, unsigned dim, float val = 0.0f, padding::MappingMethod mappingMethod = padding::MappingMethod::ZERO)¶
Return a tensor with constant padding added to one dimension.
- Return
The tensor with padding added.
- Parameters
graph: The graph containing the tensor.
t: The tensor to pad.
paddingLower: The amount of padding to add at the start of the dimension. Negative padding truncates.
paddingUpper: The amount of padding to add at the end of the dimension. Negative padding truncates.
dim: The dimension to pad.
val: The input tensor will be padded with this value.
mappingMethod: The method that should be used to map added padding elements.
-
poplar::Tensor pad(poplar::Graph &graph, const poplar::Tensor &t, std::ptrdiff_t paddingLower, std::ptrdiff_t paddingUpper, unsigned dim, int val, padding::MappingMethod mappingMethod = padding::MappingMethod::ZERO)¶
-
poplar::Tensor pad(poplar::Graph &graph, const poplar::Tensor &t, std::ptrdiff_t paddingLower, std::ptrdiff_t paddingUpper, unsigned dim, const poplar::Tensor &val, padding::MappingMethod mappingMethod = padding::MappingMethod::ZERO)¶
-
poplar::Tensor pad(const poplar::Tensor &t, const std::vector<std::ptrdiff_t> &paddingLower, const std::vector<std::ptrdiff_t> &paddingUpper, padding::Type type)¶
Return a tensor with numpy-style padding added.
- Return
The tensor with padding added.
- Parameters
t: The tensor to pad.
paddingLower: A vector specifying the amount of padding to add at the start of each dimension. Negative padding truncates.
paddingUpper: A vector specifying the amount of padding to add at the end of each dimension. Negative padding truncates.
type: The type of padding.
-
poplar::Tensor pad(const poplar::Tensor &t, std::ptrdiff_t paddingLower, std::ptrdiff_t paddingUpper, unsigned dim, padding::Type type)¶
Return a tensor with numpy-style padding added to one dimension.
- Return
The tensor with padding added.
- Parameters
t: The tensor to pad.
paddingLower: The amount of padding to add at the start of the dimension. Negative padding truncates.
paddingUpper: The amount of padding to add at the end of the dimension. Negative padding truncates.
dim: The dimension to pad.
type: The type of padding.
-
namespace padding¶
Enums
-
enum Type¶
Padding types as per numpy.pad.
Values:
-
enumerator EDGE¶
Also known as nearest-neighbour padding: each new pad element has its value set to that of the nearest pre-padded element.
Any such nearest neighbour lies on the edge of the pre-padded tensor, hence the name.
-
enumerator REFLECT¶
The tensor is reflected outwards.
Specifically, a new pad element has its value set to that of the element which is the same distance from the pad element's nearest neighbour as the pad element, but in the opposite direction.
-
enum MappingMethod¶
Methods to map added padding elements to tiles.
Values:
-
enumerator NONE¶
Padding won't be mapped.
-
enumerator ZERO¶
Set the tile mapping of padding elements to tile 0 for the graph.
-
enumerator EDGE¶
Set the tile mapping of padding elements to match the nearest-neighbour element which lies on the edge of the tensor prior to padding.
Requires a non-empty tensor to be padded with a complete tile mapping.
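A hedged sketch of constant padding on one dimension (not compiled here; the graph and tensor are illustrative assumptions):

```cpp
#include <popops/Pad.hpp>

// Assumes a constructed poplar::Graph "graph" and a 2D tensor "t" of shape
// {4, 4} (names are illustrative).
poplar::Tensor padRows(poplar::Graph &graph, const poplar::Tensor &t) {
  // Add one element of zero padding at the start and end of dimension 0
  // only; the result has shape {6, 4}. The padding value defaults to 0.0f
  // and the added elements are mapped to tile 0 (MappingMethod::ZERO).
  return popops::pad(graph, t, /*paddingLower=*/1, /*paddingUpper=*/1,
                     /*dim=*/0);
}
```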
4.2.20. popops/Rearrange.hpp¶
-
namespace popops
Common functions, such as elementwise and reductions.
-
namespace rearrange¶
Functions
-
bool canUseFastTranspose(const poplar::Target &target, const poplar::Type &type, unsigned numRows, unsigned numColumns, unsigned numTranspositions)¶
Determine whether a fast transposition codelet may be used, based on the given target, data type, and number of rows and columns.
- Return
A boolean indicating whether the fast transposition codelets can be targeted based on the given parameters.
- Parameters
target: The target the operation will be targeted at.
type: The data type of the tensor to transpose.
numRows: The number of rows in each transposition to perform.
numColumns: The number of columns in each transposition to perform.
-
void addTransposeVertices(poplar::Graph &graph, const poplar::ComputeSet &cs, const poplar::Type &dType, unsigned rows, unsigned cols, const poplar::Graph::TileToTensorMapping &mapping, std::function<std::pair<const poplar::Tensor, const poplar::Tensor>(size_t)> getInOut)¶
Transpose a set of matrices stored on multiple tiles.
This adds all the needed vertices to the graph.
- Parameters
graph, cs: The graph and compute set to add the vertices to.
dType, rows, cols: The type and dimensions of the matrices to be transposed, the same for all of them.
mapping: A vector with 'number of tiles' elements, where each element is a vector of intervals indicating which matrices to be transposed are mapped (possibly partially) on each tile.
getInOut: A function, pair<Tensor, Tensor> getInOut(size_t index), which, given as input an index inside the intervals specified in mapping, returns a std::pair of tensors (in, out) which are the input and output matrices for the index-th transposition. The in and out values are 2D matrices, but they must be flattened to a single dimension.
-
poplar::Tensor partialTranspose(poplar::Graph &graph, const poplar::Tensor &in, const poplar::ComputeSet &cs, const std::string &debugPrefix = "")¶
Transpose the innermost pair of dimensions of the specified tensor, writing the results to a new tensor.
This function assumes that the order of the underlying storage matches the order of the elements in the tensor. It is optimized for group sizes that are typical of the underlying memory layout of convolution activations/weights; it may be inefficient for other group sizes.
-
unsigned getMinimumRegroupGrainSize(const poplar::Type &type)¶
Get the smallest grouping we can transpose between for the given type using fast transposition codelets.
- Return
The smallest size of grouping that can be efficiently transposed for the given type.
- Parameters
type: The data type to be transposed.
-
poplar::Tensor regroupTensor(poplar::Graph &graph, const poplar::Tensor &t, poplar::program::Sequence &copies, const poplar::ComputeSet &transposeCS, const poputil::GroupingInfo &from, const poputil::GroupingInfo &to, const std::string &debugPrefix)¶
Insert copies or other operations into the given programs/compute sets to transform the grouping found on the given tensor from from to to.
This is a no-op for a one-dimensional tensor.
- Return
A tensor with the contents of t but laid out such that it has the grouping specified in to.
- Parameters
graph: The graph to add the operation to.
t: The tensor to regroup.
copies: A Poplar sequence to add pre-arranging copies to.
transposeCS: A compute set that may or may not have vertices added to it to perform the regrouping operation.
from: A grouping that applies to the given tensor t, to rearrange from.
to: The grouping wanted on the returned tensor.
debugPrefix: An optional string to be prepended to any debug info.
-
poplar::Tensor regroupTensor(poplar::Graph &graph, const poplar::Tensor &t, std::vector<poplar::program::Copy> &copies, const poplar::ComputeSet &transposeCS, const poputil::GroupingInfo &from, const poputil::GroupingInfo &to, const std::string &debugPrefix)¶
Insert copies or other operations into the given programs/compute sets to transform the grouping found on the given tensor from from to to.
This is a no-op for a one-dimensional tensor.
Overload that takes a vector of Copy programs instead of a Sequence.
- Return
A tensor with the contents of t but laid out such that it has the grouping specified in to.
- Parameters
graph: The graph to add the operation to.
t: The tensor to regroup.
copies: A vector to add pre-arranging copies to.
transposeCS: A compute set that may or may not have vertices added to it to perform the regrouping operation.
from: A grouping that applies to the given tensor t, to rearrange from.
to: The grouping wanted on the returned tensor.
debugPrefix: An optional string to be prepended to any debug info.
-
poplar::Tensor regroupIfBeneficial(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &ref, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
If possible and runtime efficient, add an operation to rearrange the given tensor in memory such that the grouping of the resulting tensor matches that of the reference tensor, or a factor of that grouping if it balances memory usage across the target better.
- Return
A tensor with the contents of the given tensor in, rearranged in memory to have a grouping matching ref.
- Parameters
graph: The graph to add the operation to.
in: The tensor to maybe regroup.
ref: A reference tensor which will be introspected to find a grouping to apply to the returned tensor.
prog: A Poplar sequence to add the regrouping operation to.
debugPrefix: An optional string to be prepended to any debug info.
-
poplar::Tensor regroupIfBeneficial(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &ref, std::vector<poplar::program::Copy> &copies, poplar::ComputeSet transposeCS, const std::string &debugPrefix = "")¶
If possible and runtime efficient, add an operation to rearrange the given tensor in memory such that the grouping of the resulting tensor matches that of the reference tensor, or a factor of that grouping if it balances memory usage across the target better.
Overload that takes a vector of Copy programs instead of a Sequence.
- Return
A tensor with the contents of the given tensor in, rearranged in memory to have a grouping matching ref.
- Parameters
graph: The graph to add the operation to.
in: The tensor to maybe regroup.
ref: A reference tensor which will be introspected to find a grouping to apply to the returned tensor.
copies: A vector to add pre-arranging copies to.
debugPrefix: An optional string to be prepended to any debug info.
-
poplar::Tensor regroupIfBeneficial(poplar::Graph &graph, const poplar::Tensor &in, std::size_t preferredGrouping, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
If possible and runtime efficient, add an operation to rearrange the given tensor in memory such that the resulting tensor has a grouping in the innermost dimension equivalent to, or a factor of, the given preferred grouping if it balances memory usage across the target better.
- Return
A tensor with the contents of the given tensor in, rearranged in memory to have a grouping matching preferredGrouping.
- Parameters
graph: The graph to add the operation to.
in: The tensor to maybe regroup.
preferredGrouping: A size of grouping of the innermost dimension of the given tensor to regroup to.
prog: A Poplar sequence to add the regrouping operation to.
debugPrefix: An optional string to be prepended to any debug info.
4.2.21. popops/Reduce.hpp¶
-
namespace popops
Common functions, such as elementwise and reductions.
Functions
-
poplar::Tensor reduce(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Type &outType, const std::vector<std::size_t> &dims, ReduceParams params, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
Apply a reduction operation to a tensor.
scale and update are currently only valid with the ADD or SQUARE_ADD operations.
Internally, this creates a new variable for the output then calls reduceWithOutput(). The type of the output will be outType.
The options parameter accepts the following:
accumType.interTile (float, half)
The type to use for intermediate values between tiles.
accumType.inVertex (float, half)
The type to use for intermediate values within a vertex.
If either of the above options is not set then the intermediate type will default to either the input tensor element type, or to float if the input is of type half and the reduction operation benefits from higher precision (for example, add).
The input and output types that are supported depend on the operation:
ADD, SQUARE_ADD, MUL: float->float, half->half, int->int, float->half, half->float
MAX, MIN: float->float, half->half, int->int
LOGICAL_AND, LOGICAL_OR: bool->bool
- Parameters
graph: The graph to add the operation to.
in: The tensor to be reduced.
outType: The output type of the reduce operation.
dims: The dimensions to reduce in.
prog: The program sequence to add the operation to.
debugPrefix: Identifying prefix for debugging information.
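A hedged sketch of a sum reduction over one dimension (not compiled here; the graph, input tensor and sequence are illustrative assumptions):

```cpp
#include <popops/Reduce.hpp>

// Assumes a constructed poplar::Graph "graph", a float tensor "in" of shape
// {4, 8} and a poplar::program::Sequence "prog" (names are illustrative).
poplar::Tensor rowSums(poplar::Graph &graph, const poplar::Tensor &in,
                       poplar::program::Sequence &prog) {
  // ADD-reduce over dimension 1; the result has shape {4} and type float.
  return popops::reduce(graph, in, poplar::FLOAT, /*dims=*/{1},
                        {popops::Operation::ADD}, prog, "rowSums");
}
```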
-
poplar::Tensor reduce(poplar::Graph &graph, const poplar::Tensor &in, const std::vector<std::size_t> &dims, ReduceParams params, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void reduceWithOutput(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &out, const std::vector<std::size_t> &dims, ReduceParams params, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
This is similar to reduce() but allows you to specify the output.
If the tile mapping of out is not complete it will be set. Otherwise it won't be changed.
-
poplar::Tensor reduce(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Type &outType, const std::vector<std::size_t> &dims, ReduceParams params, std::vector<poplar::ComputeSet> &css, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
These are alternate forms that add their vertices to a vector of compute sets instead of a Sequence.
The caller is expected to add each compute set to a Sequence (in an Execute) themselves, like this:
Sequence seq;
std::vector<ComputeSet> css;
auto A = reduce(..., css);
auto B = reduce(..., css);
for (const auto &cs : css) {
  seq.add(Execute(cs));
}
This allows you to do multiple reductions in parallel. Note that the reductions are not aware of each other, so it may be more efficient to concatenate tensors and do a single reduction instead if they have the same shape, operation, and input and output types.
-
poplar::Tensor reduce(poplar::Graph &graph, const poplar::Tensor &in, const std::vector<std::size_t> &dims, ReduceParams params, std::vector<poplar::ComputeSet> &css, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void reduceWithOutput(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &out, const std::vector<std::size_t> &dims, ReduceParams params, std::vector<poplar::ComputeSet> &css, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
poplar::Tensor reduce(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Type &outType, const std::vector<std::size_t> &dims, ReduceParams params, poplar::program::Sequence &prog, const std::string &debugPrefix, const poplar::OptionFlags &options, ReductionDebug *debug)¶
DEPRECATED - Use the overloaded function without the ReductionDebug parameter instead.
-
poplar::Tensor reduce(poplar::Graph &graph, const poplar::Tensor &in, const std::vector<std::size_t> &dims, ReduceParams params, poplar::program::Sequence &prog, const std::string &debugPrefix, const poplar::OptionFlags &options, ReductionDebug *debug)¶
DEPRECATED - Use the overloaded function without the ReductionDebug parameter instead.
-
void reduceWithOutput(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &out, const std::vector<std::size_t> &dims, ReduceParams params, poplar::program::Sequence &prog, const std::string &debugPrefix, const poplar::OptionFlags &options, ReductionDebug *debug)¶
DEPRECATED - Use the overloaded function without the ReductionDebug parameter instead.
-
poplar::Tensor reduce(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Type &outType, const std::vector<std::size_t> &dims, ReduceParams params, std::vector<poplar::ComputeSet> &css, const std::string &debugPrefix, const poplar::OptionFlags &options, ReductionDebug *debug)¶
DEPRECATED - Use the overloaded function without the ReductionDebug parameter instead.
-
poplar::Tensor reduce(poplar::Graph &graph, const poplar::Tensor &in, const std::vector<std::size_t> &dims, ReduceParams params, std::vector<poplar::ComputeSet> &css, const std::string &debugPrefix, const poplar::OptionFlags &options, ReductionDebug *debug)¶
DEPRECATED - Use the overloaded function without the ReductionDebug parameter instead.
-
void reduceWithOutput(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &out, const std::vector<std::size_t> &dims, ReduceParams params, std::vector<poplar::ComputeSet> &css, const std::string &debugPrefix, const poplar::OptionFlags &options, ReductionDebug *debug)¶
DEPRECATED - Use the overloaded function without the ReductionDebug parameter instead.
-
struct ReduceParams¶
- #include <Reduce.hpp>
A reduce operation can optionally scale the output, and can also be an “update”, that is, out += reduce(in) rather than out = reduce(in).
ReduceParams stores that information, as well as the basic operation being performed (add, mul, and so on).
Public Functions
-
ReduceParams() = default¶
-
4.2.22. popops/ScaledAdd.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Functions
-
void
scaledAddTo
(poplar::Graph &graph, poplar::Tensor A, poplar::Tensor B, float scaleB, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Add the elements of one tensor multiplied by a scalar to another tensor.
Performs the calculations
A
+=scaleB
*B
The operation is performed after casting
B
to the type ofA
.Scaled add options
optimizeForSpeed
(true, false) [=false]The scaledAdd vertices default to being optimized to aid memory allocation. To optimise them for speed instead, set this option to true.
scaleFloatToHalfTolerance
(double) [=1e-6]Where the tensors
A
,B
are of type half and ascaleB
is provided as a float or a tensor of type float, it is possible to to implement the scaledAddTo in half precision ifscaleB
can be cast to half precision with acceptable accuracy. Otherwise full precision arithmetic can be used internally, but at the cost of speed. Floating point arithmetic will be selected if the relative error in casting is greater than the relative tolerance.Only applies to ScaledAddTo with scaleB.
- Parameters
graph
: The Poplar graph.A
: The destination tensor.B
: The second tensor to add elements from (must be of the same shape asA
).scaleB
: The scalar to multiply elements ofB
with before addition.prog
: A sequence program to which the code performing the add will be appended.debugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations.
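The arithmetic this function performs can be sketched host-side in plain C++ (an illustrative model on std::vector, not the Poplar API; scaledAddToRef is a hypothetical name — the real function appends an IPU program to a poplar::program::Sequence):

```cpp
#include <cstddef>
#include <vector>

// Host-side model of the arithmetic scaledAddTo performs: A += scaleB * B,
// element by element, with A and B of the same shape. Illustrative only.
std::vector<float> scaledAddToRef(std::vector<float> A,
                                  const std::vector<float> &B, float scaleB) {
  for (std::size_t i = 0; i < A.size(); ++i)
    A[i] += scaleB * B[i];
  return A;
}
```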
-
void
scaledAddTo
(poplar::Graph &graph, poplar::Tensor A, poplar::Tensor B, poplar::Tensor scaleB, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Add the elements of one tensor each multiplied by a (scalar) tensor to another tensor.
Performs the calculation
A
+=scaleB
*B
The operation is performed after casting
scaleB
andB
to the type ofA
.- Parameters
graph
: The Poplar graph.A
: The destination tensor.B
: The second tensor to add elements from (must be of the same shape asA
).scaleB
: The scalar tensor to multiply elements ofB
with before addition.prog
: A sequence program to which the code performing the add will be appended.debugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations. See scaledAddTo().
-
void
scaledSubtractFrom
(poplar::Graph &graph, poplar::Tensor A, poplar::Tensor B, float scaleB, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Subtract the elements of one tensor multiplied by a scalar from another tensor.
Performs the calculation
A
-=scaleB
*B
The operation is performed after casting
B
to typeA
.- Parameters
graph
: The Poplar graph.A
: The destination tensor.B
: The second tensor providing the elements to subtract (must be of the same shape asA
).scaleB
: The scalar to multiply elements ofB
with before subtraction.prog
: A sequence program to which the code performing the subtract will be appended.debugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations. See scaledAddTo().
-
void
scaledSubtractFrom
(poplar::Graph &graph, poplar::Tensor A, poplar::Tensor B, poplar::Tensor scaleB, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Subtract the elements of one tensor each multiplied by a (scalar) tensor from another tensor.
Performs the calculation
A
-=scaleB
*B
The operation is performed after casting
scaleB
andB
to the type ofA
.- Parameters
graph
: The Poplar graph.A
: The destination tensor.B
: The second tensor providing the elements to subtract (must be of the same shape asA
).scaleB
: The scalar tensor to multiply elements ofB
with before subtraction.prog
: A sequence program to which the code performing the subtract will be appended.debugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations. See scaledAddTo().
-
void
scaledAddTo
(poplar::Graph &graph, poplar::Tensor A, poplar::Tensor scaleA, poplar::Tensor B, poplar::Tensor scaleB, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Scale the elements of one tensor and add the scaled elements of another tensor to it.
The two scaling factors are (scalar) tensors.
Performs the calculation
A
=scaleA
*A
+scaleB
*B
The operation is performed after casting
scaleA
,scaleB
andB
to the type ofA
.- Parameters
graph
: The Poplar graph.A
: The destination tensor.scaleA
: The scalar tensor to multiply elements ofA
with before addition.B
: The second tensor to add elements from (must be of the same shape asA
).scaleB
: The scalar tensor to multiply elements ofB
with before addition.prog
: A sequence program to which the code performing the add will be appended.debugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations. See scaledAddTo().
-
void
scaledAddTo
(poplar::Graph &graph, poplar::Tensor A, poplar::Tensor scaleA, poplar::Tensor B, poplar::Tensor scaleB, poplar::program::Sequence &prog, const ScaledAddSpecialisation speciality, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Scale the elements of one tensor and add the scaled elements of another tensor to it.
The two scaling factors are (scalar) tensors.
Performs the calculation
A
=scaleA'
*A
+scaleB
*B
where scaleA’ is a function of scaleA specified by the “speciality” option.The operation is performed after casting
scaleA
,scaleB
andB
to the type ofA
.- Parameters
graph
: The Poplar graph.A
: The destination tensor.scaleA
: The scalar tensor to multiply elements ofA
with before addition.B
: The second tensor to add elements from (must be of the same shape asA
).scaleB
: The scalar tensor to multiply elements ofB
with before addition.prog
: A sequence program to which the code performing the add will be appended.speciality
: Choice of ScaledAdd expression formulationdebugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations. See scaledAddTo().
-
void
scaledAddTo
(poplar::Graph &graph, poplar::Tensor A, float scaleA, poplar::Tensor B, float scaleB, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Scale the elements of one tensor and add the scaled elements of another tensor to it.
The two scaling factors are constants.
Performs the calculation
A
=scaleA
*A
+scaleB
*B
If
A
andB
are of different types,B
is first cast to the type ofA
and the operation performed.- Parameters
graph
: The Poplar graph.A
: The destination tensor.scaleA
: The constant to multiply elements ofA
with before addition.B
: The second tensor to add elements from (must be of the same shape asA
).scaleB
: The constant to multiply elements ofB
with before addition.prog
: A sequence program to which the code performing the add will be appended.debugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations. See scaledAddTo().
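This constant-scale overload computes an axpby-style update, sketched host-side below (scaledAxpbyRef is a hypothetical name, not part of popops; the real function builds an IPU program):

```cpp
#include <cstddef>
#include <vector>

// Host-side model of this overload's arithmetic: A = scaleA * A + scaleB * B
// with constant scaling factors and A, B of the same shape. Illustrative only.
std::vector<float> scaledAxpbyRef(std::vector<float> A,
                                  const std::vector<float> &B, float scaleA,
                                  float scaleB) {
  for (std::size_t i = 0; i < A.size(); ++i)
    A[i] = scaleA * A[i] + scaleB * B[i];
  return A;
}
```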
-
void
scaledAddTo
(poplar::Graph &graph, poplar::Tensor A, float scaleA, poplar::Tensor B, float scaleB, poplar::program::Sequence &prog, const ScaledAddSpecialisation speciality, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Scale the elements of one tensor and add the scaled elements of another tensor to it.
The two scaling factors are constants.
Performs the calculation
A
=scaleA'
*A
+scaleB
*B
where scaleA’ is a function of scaleA specified by the “speciality” option.If
A
andB
are of different types,B
is first cast to the type ofA
and the operation performed.- Parameters
graph
: The Poplar graph.A
: The destination tensor.scaleA
: The constant to multiply elements ofA
with before addition.B
: The second tensor to add elements from (must be of the same shape asA
).scaleB
: The constant to multiply elements ofB
with before addition.prog
: A sequence program to which the code performing the add will be appended.speciality
: Choice of ScaledAdd expression formulationdebugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations. See scaledAddTo().
-
void
scaledSubtractFrom
(poplar::Graph &graph, poplar::Tensor A, poplar::Tensor scaleA, poplar::Tensor B, poplar::Tensor scaleB, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Scale the elements of one tensor and subtract the scaled elements of another tensor from it.
The two scaling factors are (scalar) tensors.
Performs the calculation
A
= scaleA
*A
-scaleB
*B
The operation is performed after casting
scaleA
,scaleB
andB
to the type ofA
.- Parameters
graph
: The poplar graph.A
: The destination tensor.scaleA
: The scalar tensor to multiply elements ofA
with before subtraction.B
: The second tensor providing the elements to subtract (must be of the same shape asA
).scaleB
: The scalar tensor to multiply elements ofB
with before subtraction.prog
: A sequence program to which the code performing the subtract will be appended.debugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations. See scaledAddTo().
-
void
scaledSubtractFrom
(poplar::Graph &graph, poplar::Tensor A, float scaleA, poplar::Tensor B, float scaleB, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Scale the elements of one tensor and subtract the scaled elements of another tensor from it.
The two scaling factors are constants.
Performs the calculation
A
=scaleA
*A
-scaleB
*B
If
A
andB
are of different types,B
is first cast to the type ofA
and the operation performed.- Parameters
graph
: The poplar graph.A
: The destination tensor.scaleA
: The constant to multiply elements ofA
with before subtraction.B
: The second tensor providing the elements to subtract (must be of the same shape asA
).scaleB
: The constant to multiply elements ofB
with before subtraction.prog
: A sequence program to which the code performing the subtract will be appended.debugPrefix
: A debug prefix to add to any tensors/compute set names.options
: A list of flags to control optimizations. See scaledAddTo().
4.2.23. popops/Scatter.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Typedefs
Functions
-
void
scatter
(poplar::Graph &graph, const poplar::Tensor &operand, const poplar::Tensor &indices, const poplar::Tensor &updates, std::size_t indexVectorDim, std::vector<unsigned> updateWindowDims, std::vector<std::size_t> insertWindowDims, std::vector<unsigned> scatterDimsToOperandDims, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ The scatter operation generates a result which is the value of the input array
operand
, with several slices (at indices specified byscatter_indices
) updated with the values inupdates
.- Note
This is a near direct port of https://www.tensorflow.org/xla/operation_semantics#scatter from tensorflow/compiler/xla/service/scatter_expander.cc
- Parameters
graph
: The Poplar graph.operand
: Array to be scattered into.indices
: Array containing the starting indices of the slices that must be scattered to.updates
: Array containing the values that must be used for scattering.indexVectorDim
: The dimension in scatter_indices that contains the starting indices.updateWindowDims
: The set of dimensions in updates shape that are window dimensions.insertWindowDims
: The set of window dimensions that must be inserted into updates shape.scatterDimsToOperandDims
: A dimensions map from the scatter indices to the operand index space. This array is interpreted as mapping i to scatterDimsToOperandDims[i] . It has to be one-to-one and total.prog
: The program to be extended.debugPrefix
: The prefix prepended to debugging info.
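The full operation is highly configurable; for the common special case of scattering whole rows into a 2D operand it reduces to the following host-side sketch (scatterRowsRef is a hypothetical helper, not the popops API):

```cpp
#include <cstddef>
#include <vector>

// Simplified model of scatter for 2D operands where each index selects a
// whole row to replace: result[indices[i]] = updates[i]. The real operation
// supports arbitrary indexVectorDim / window-dimension configurations.
std::vector<std::vector<float>>
scatterRowsRef(std::vector<std::vector<float>> operand,
               const std::vector<std::size_t> &indices,
               const std::vector<std::vector<float>> &updates) {
  for (std::size_t i = 0; i < indices.size(); ++i)
    operand[indices[i]] = updates[i];
  return operand;
}
```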
-
void
scatter
(poplar::Graph &graph, const poplar::Tensor &operand, const poplar::Tensor &indices, const poplar::Tensor &updates, std::size_t indexVectorDim, std::vector<unsigned> updateWindowDims, std::vector<std::size_t> insertWindowDims, std::vector<unsigned> scatterDimsToOperandDims, UpdateComputationFunc &updateComputation, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Similar to the above scatter, but allows for a user defined update computation.
This computation is used to combine the existing values in the input tensor and the updates during the scatter.
- Note
The first tensor parameter that is passed into the updateComputation will always be the current value from the operand tensor and the second parameter will always be the value from the updates tensor. This is important specifically for cases when the updateComputation is not commutative.
- Parameters
updateComputation
: Computation to be used for combining the existing values in the input tensor and the updates during scatter.
4.2.24. popops/SelectScalarFromRows.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Functions
-
poplar::Tensor
selectScalarFromRows
(poplar::Graph &graph, const poplar::Tensor &params, const poplar::Tensor &indices, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ For each row in the 2D tensor params, select a single scalar value.
Aggregate the resulting scalars into a 1D tensor.
The size of the
indices
tensor must be equal to the size of dimension-0 ofparams
. The ith element ofindices
represents an index in the ith row of the params tensor.- Parameters
params
: A 2D tensor, element-type must be either float or half.indices
: A 1D tensor, element-type must be unsigned integer.
- Return
A 1D tensor containing in the ith position the scalar
params[indices[i]]
.If ith element of the
indices
tensor is less than 0 or greater than the width ofparams
then a NaN is stored into the ith element of the output. If the ith element of theindices
tensor is equal toMASKED_LABEL_CODE
then zero is stored into the ith element of the output.
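The selection rules above can be modelled host-side as follows (selectScalarFromRowsRef is a hypothetical name; the real value of MASKED_LABEL_CODE is defined by the library and is assumed to be ~0u here purely for illustration):

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Assumption for illustration only; the library defines the real value.
constexpr unsigned kMaskedLabelCode = ~0u;

// Host-side model of selectScalarFromRows: per row i, pick
// params[i][indices[i]], with NaN for out-of-range indices and zero for
// masked labels.
std::vector<float>
selectScalarFromRowsRef(const std::vector<std::vector<float>> &params,
                        const std::vector<unsigned> &indices) {
  std::vector<float> out(indices.size());
  for (std::size_t i = 0; i < indices.size(); ++i) {
    if (indices[i] == kMaskedLabelCode)
      out[i] = 0.0f; // masked label: store zero
    else if (indices[i] >= params[i].size())
      out[i] = std::numeric_limits<float>::quiet_NaN(); // out of range: NaN
    else
      out[i] = params[i][indices[i]];
  }
  return out;
}
```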
4.2.25. popops/Sort.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Functions
-
poplar::Tensor
sort
(poplar::Graph &graph, const poplar::Tensor &t, unsigned dim, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Sort a given tensor along the given dimension.
This will return a tensor that is a permutation of the input tensor with the property that all 1D slices in the chosen dimension are in ascending order.
This aims to match TensorFlow’s XLA sort https://www.tensorflow.org/xla/operation_semantics#sort
- Return
A tensor which is a permutation of
t
such that all elements in the given dimension are in order.- Parameters
graph
: The Poplar graph.t
: The source tensor.dim
: The dimension to sort on.prog
: The program to be extended.debugPrefix
: The prefix prepended to debugging info.
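The post-condition can be illustrated with a host-side model for a 2D tensor sorted along dimension 1 (sortDim1Ref is a hypothetical name, not the popops API):

```cpp
#include <algorithm>
#include <vector>

// Host-side model of sort for a 2D tensor along dimension 1: every 1D slice
// (row) along the sorted dimension ends up in ascending order.
std::vector<std::vector<int>> sortDim1Ref(std::vector<std::vector<int>> t) {
  for (auto &row : t)
    std::sort(row.begin(), row.end());
  return t;
}
```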
-
void
sortInPlace
(poplar::Graph &graph, const poplar::Tensor &t, unsigned dim, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ In-place sort a given tensor along the given dimension.
This will permute the input tensor so that all 1D slices in the chosen dimension are in ascending order.
- Parameters
graph
: The Poplar graph.t
: The source tensor to be sorted.dim
: The dimension to sort on.prog
: The program to be extended.debugPrefix
: The prefix prepended to debugging info.
-
poplar::Tensor
sortKeyValue
(poplar::Graph &graph, const poplar::Tensor &k, const poplar::Tensor &v, unsigned dim, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Sort a given tensor by a key tensor along the given dimension.
This will return a tensor that is a permutation of the input value tensor with the property that all 1D slices in the chosen dimension are in ascending order with respect to the key tensor.
This aims to match TensorFlow’s XLA sort https://www.tensorflow.org/xla/operation_semantics#sort
- Return
A tensor which is a permutation of
v
such that it is in order with respect to the tensork
in the given dimension.- Note
If
k
andv
alias, the result is undefined.- Parameters
graph
: The Poplar graph.k
: The key tensor to sort on.v
: The value tensor to be sorted.dim
: The dimension to sort on.prog
: The program to be extended.debugPrefix
: The prefix prepended to debugging info.
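The key/value semantics amount to an argsort of the keys followed by a gather of the values, sketched host-side for 1D tensors below (sortKeyValueRef is a hypothetical name, not the popops API):

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Host-side model of sortKeyValue for 1D tensors: compute the permutation
// that would sort the keys, then apply it to the values.
std::vector<float> sortKeyValueRef(const std::vector<float> &k,
                                   const std::vector<float> &v) {
  std::vector<std::size_t> perm(k.size());
  std::iota(perm.begin(), perm.end(), 0);
  std::stable_sort(perm.begin(), perm.end(),
                   [&](std::size_t a, std::size_t b) { return k[a] < k[b]; });
  std::vector<float> out(v.size());
  for (std::size_t i = 0; i < perm.size(); ++i)
    out[i] = v[perm[i]];
  return out;
}
```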
-
void
sortKeyValueInPlace
(poplar::Graph &graph, const poplar::Tensor &k, const poplar::Tensor &v, unsigned dim, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ In-place sort a given tensor by a key tensor along the given dimension.
This will permute the key and value tensors so that all 1D slices in the chosen dimension are in ascending order with respect to the key tensor.
- Note
the ‘k’ tensor is also sorted by this in-place operation.
- Note
If the
k
tensor and thev
tensor alias, the result is undefined.- Parameters
graph
: The Poplar graph.k
: The key tensor to sort on.v
: The value tensor to be sorted.dim
: The dimension to sort on.prog
: The program to be extended.debugPrefix
: The prefix prepended to debugging info.
4.2.26. popops/UpdateScalarInRows.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Functions
-
void
updateScalarInRows
(poplar::Graph &graph, const poplar::Tensor ¶ms, const poplar::Tensor &indices, poplar::program::Sequence &program, const std::string &debugPrefix = "")¶ Update in-place one scalar per row of the tensor
params
.For each row, the index of the value to update is specified by the tensor
indices
. If the index fromindices
is equal to MASKED_LABEL_CODE then no update is carried out. Pseudo-code:
for each row r:
    if indices[r] != MASKED_LABEL_CODE:
        params[r][indices[r]] = params[r][indices[r]] - 1.f
If the ith index is less than 0 or greater than the width of params then the whole row of the params tensor is set to NaN. This is to match the interface of the backward phase of tf.nn.sparse_softmax_cross_entropy_with_logits.
- Parameters
params
: The 2D tensor to be updated, element-type must be either float or half.indices
: 1D tensor, element-type must be unsigned integer.
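The update rules above can be modelled host-side as follows (updateScalarInRowsRef is a hypothetical name; MASKED_LABEL_CODE's real value is defined by the library and is assumed to be ~0u here purely for illustration):

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Assumption for illustration only; the library defines the real value.
constexpr unsigned kMaskedCode = ~0u;

// Host-side model of updateScalarInRows: subtract 1 from one scalar per row,
// skipping masked rows and NaN-filling rows with out-of-range indices.
void updateScalarInRowsRef(std::vector<std::vector<float>> &params,
                           const std::vector<unsigned> &indices) {
  for (std::size_t r = 0; r < params.size(); ++r) {
    if (indices[r] == kMaskedCode)
      continue; // masked label: no update for this row
    if (indices[r] >= params[r].size()) {
      for (auto &x : params[r]) // out-of-range index: whole row becomes NaN
        x = std::numeric_limits<float>::quiet_NaN();
    } else {
      params[r][indices[r]] -= 1.0f;
    }
  }
}
```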
4.2.27. popops/Zero.hpp¶
-
namespace
popops
Common functions, such as elementwise and reductions.
Functions
-
void
zero
(poplar::Graph &graph, poplar::Tensor t, const std::vector<poplar::Interval> &tileRegions, unsigned tile, poplar::ComputeSet zeroCS)¶ Appends vertices to
zeroCS
which zeroes elements intileRegions
oft
which reside on tiletile
.- Parameters
graph
: The graph that the operation will be added to.t
: The tensor whose elements are to be set to zero.tileRegions
: Region mapping of the tensor ontile
.tile
: Tile which the regions relate to.zeroCS
: Compute set to add the operation into.
-
void
zero
(poplar::Graph &graph, const poplar::Tensor &t, unsigned tile, poplar::ComputeSet zeroCS)¶ Appends vertices to
zeroCS
which zeroes all elements oft
which reside on tiletile
.- Parameters
graph
: The graph that the operation will be added to.t
: The tensor whose elements are to be set to zero.tile
: Tile on which the tensor is mapped to.zeroCS
: Compute set to add the operation into.
-
void
zero
(poplar::Graph &graph, const poplar::Tensor &t, const std::vector<std::vector<poplar::Interval>> &mapping, poplar::ComputeSet zeroCS)¶ Appends vertices to
zeroCS
which zeroes elements inmapping
oft
which reside on tiles represented withmapping
.- Parameters
graph
: The graph that the operation will be added to.t
: The tensor whose elements are to be set to zero.mapping
: The tensor’s region mapping per tile. Each element describes a region mapping of a tile (ordered). i.e. mapping[0] -> tile 0’s region mapping fort
.zeroCS
: Compute set to add the operation into.
-
void
zero
(poplar::Graph &graph, const poplar::Tensor &t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Appends programs to
prog
which zeroes all elements of the Tensort
.- Parameters
graph
: The graph that the operation will be added to.t
: The tensor whose elements are to be set to zero.prog
: Poplar program sequence to append the operation onto.debugPrefix
: Name of the operation, for debugging.
4.3. Linear algebra functions (poplin)¶
Linear algebra functions (matrix multiplications, convolutions).
4.3.1. poplin/ConvParams.hpp¶
-
template<>
structstd
::
hash
<poplin::ConvParams::InputTransform>¶ Public Functions
-
std::size_t
operator()
(const poplin::ConvParams::InputTransform &it) const¶
-
template<>
structstd
::
hash
<poplin::ConvParams::OutputTransform>¶ Public Functions
-
std::size_t
operator()
(const poplin::ConvParams::OutputTransform &ot) const¶
-
template<>
structstd
::
hash
<poplin::ConvParams>¶ Public Functions
-
std::size_t
operator()
(const poplin::ConvParams &params) const¶
-
namespace
poplin
¶ Linear algebra functions.
A collection of utility functions to assist calculation of input/output ranges when moving a 2-dimensional kernel over a larger 2-dimensional space (for example in convolution or pooling layers)
Functions
-
std::ostream &
operator<<
(std::ostream &os, const ConvParams &p)¶
-
std::size_t
hash_value
(const ConvParams::InputTransform &it)¶
-
std::size_t
hash_value
(const ConvParams::OutputTransform &ot)¶
-
struct
ConvParams
¶ Public Functions
-
ConvParams
() = default¶
-
ConvParams
(poplar::Type dataType, std::size_t batchSize, std::vector<std::size_t> inputFieldShape, std::vector<std::size_t> kernelShape, std::size_t inputChannels, std::size_t outputChannels, std::size_t numConvGroups)¶
-
ConvParams
(poplar::Type inputType, poplar::Type outputType, std::size_t batchSize, std::vector<std::size_t> inputFieldShape, std::vector<std::size_t> kernelShape, std::size_t inputChannels, std::size_t outputChannels, std::size_t numConvGroups)¶
-
ConvParams
(poplar::Type inputType, poplar::Type outputType, std::size_t batchSize, std::vector<std::size_t> inputFieldShape, std::vector<std::size_t> kernelShape, std::size_t inputChannels, std::size_t outputChannels, std::size_t numConvGroups, InputTransform inputTransform, InputTransform kernelTransform, OutputTransform outputTransform)¶
-
std::size_t
getUntransformedOutputSize
(unsigned dim) const¶ Return the size of the output of the convolution operation, before output transformations are applied.
-
unsigned
getTruncatedInputSize
(unsigned dim) const¶ Return the size of input in the specified dimension after truncation.
-
unsigned
getTruncatedKernelSize
(unsigned dim) const¶ Return the size of kernel in the specified dimension after truncation.
-
unsigned
getTransformedInputSize
(unsigned dim) const¶ Return the size of input in the specified dimension after applying the input transforms.
-
unsigned
getTransformedKernelSize
(unsigned dim) const¶ Return the size of kernel in the specified dimension after applying the kernel transforms.
-
void
validate
() const¶
-
ConvParams
canonicalize
() const¶
Public Members
-
std::size_t
numConvGroups
¶ number of groups in a grouped convolution (G).
The input and output channels are divided by G such that G kernels are applied to input tensors of size {B, {dims}, Ci/G} to produce output tensors of size {B, O{dims}, Co/G}, where O{dims} are the output field dimensions.
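As a quick arithmetic check of the channel split described above (groupedChannels is an illustrative helper, not part of poplin):

```cpp
// With G conv groups, each of the G kernels sees Ci/G input channels and
// produces Co/G output channels (Ci and Co must be divisible by G).
struct GroupShapes {
  unsigned inPerGroup;
  unsigned outPerGroup;
};

GroupShapes groupedChannels(unsigned Ci, unsigned Co, unsigned G) {
  return {Ci / G, Co / G};
}
```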
-
InputTransform
inputTransform
¶ The transform applied to the input.
-
InputTransform
kernelTransform
¶ The transform applied to the kernel.
-
OutputTransform
outputTransform
¶ The transform applied to the output.
Friends
-
friend bool
operator<
(const ConvParams &a, const ConvParams &b)¶
-
friend bool
operator==
(const ConvParams &a, const ConvParams &b)¶
-
friend bool
operator!=
(const ConvParams &a, const ConvParams &b)¶
-
struct
InputTransform
¶ Public Functions
-
InputTransform
() = default¶
Public Members
-
std::vector<unsigned>
dilation
Dilation applied to each spatial dimension after truncation and before padding.
Dilation is performed by placing zeroed elements between the elements of the field.
Friends
-
friend bool
operator<
(const InputTransform &a, const InputTransform &b)¶
-
friend bool
operator==
(const InputTransform &a, const InputTransform &b)¶
-
friend bool
operator!=
(const InputTransform &a, const InputTransform &b)¶
-
-
struct
OutputTransform
¶ Public Functions
-
OutputTransform
() = default¶
Public Members
Friends
-
friend bool
operator<
(const OutputTransform &a, const OutputTransform &b)¶
-
friend bool
operator==
(const OutputTransform &a, const OutputTransform &b)¶
-
friend bool
operator!=
(const OutputTransform &a, const OutputTransform &b)¶
-
-
-
4.3.2. poplin/ConvUtil.hpp¶
-
namespace
poplin
Linear algebra functions.
A collection of utility functions to assist calculation of input/output ranges when moving a 2-dimensional kernel over a larger 2-dimensional space (for example in convolution or pooling layers)
Functions
-
unsigned
getDilatedSize
(unsigned size, unsigned dilation)¶ Return the output size when the specified dilation is applied to an input of the specified size.
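Dilation inserts (dilation - 1) zeroed elements between neighbouring elements, so the result follows a simple closed form (getDilatedSizeRef is a hypothetical host-side sketch of what the function documents):

```cpp
// Dilated size: inserting (dilation - 1) zeros between neighbouring elements
// turns n elements into 1 + (n - 1) * dilation positions; an empty input
// stays empty.
unsigned getDilatedSizeRef(unsigned size, unsigned dilation) {
  if (size == 0)
    return 0;
  return 1 + (size - 1) * dilation;
}
```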
-
unsigned
getInputIndex
(unsigned dim, unsigned outputIndex, unsigned kernelIndex, const ConvParams &params)¶ Return the index of the input element that is multiplied by the specified kernel index to produce the specified output.
Return ~0U if there is no such input element.
-
unsigned
getKernelIndex
(unsigned dim, unsigned outputIndex, unsigned inputIndex, const ConvParams &params)¶ Return the index of the kernel element that is multiplied by the specified input index to produce the specified output.
Return ~0U if there is no such kernel element.
-
std::pair<unsigned, unsigned>
getOutputRangeForKernelIndex
(unsigned dim, std::pair<unsigned, unsigned> outputRange, unsigned kernelIndex, const ConvParams &params)¶ Given an output range, return the subset whose calculation involves the specified kernel index.
-
std::pair<unsigned, unsigned>
getOutputRangeForInputIndex
(unsigned dim, std::pair<unsigned, unsigned> outputRange, unsigned inputIndex, const ConvParams &params)¶ Given an output range, return the subset whose calculation involves the specified input.
-
std::pair<unsigned, unsigned>
getOutputRangeForKernelRange
(unsigned dim, std::pair<unsigned, unsigned> outputRange, std::pair<unsigned, unsigned> kernelIndexRange, const ConvParams &params)¶ Given an output range, return the subset whose calculation involves the specified range of kernel indices.
-
std::pair<unsigned, unsigned>
getOutputRangeForInputRange
(unsigned dim, std::pair<unsigned, unsigned> outputRange, std::pair<unsigned, unsigned> inputRange, const ConvParams &params)¶ Given an output range, return the subset whose calculation involves the specified range of input indices.
-
std::pair<unsigned, unsigned>
getInputRange
(unsigned dim, std::pair<unsigned, unsigned> outputRange, unsigned kernelIndex, const ConvParams &params)¶ Return the input range that is associated with the specified kernel index when calculating the specified output range.
-
std::pair<unsigned, unsigned>
getKernelRange
(unsigned dim, std::pair<unsigned, unsigned> outputRange, unsigned inputIndex, const ConvParams &params)¶ Return the kernel range that is associated with the specified input index when calculating the specified output range.
-
std::pair<unsigned, unsigned>
getInputRange
(unsigned dim, std::pair<unsigned, unsigned> outputRange, std::pair<unsigned, unsigned> kernelIndexRange, const ConvParams &params)¶ Return the input range that is associated with the specified kernel index range when calculating the specified output range.
-
std::pair<unsigned, unsigned>
getKernelRange
(unsigned dim, std::pair<unsigned, unsigned> outputRange, std::pair<unsigned, unsigned> inputRange, const ConvParams &params)¶ Return the kernel range that is associated with the specified input index range when calculating the specified output range.
-
std::pair<unsigned, unsigned>
getInputRange
(unsigned dim, unsigned outputIndex, std::pair<unsigned, unsigned> kernelIndexRange, const ConvParams &params)¶
-
std::pair<unsigned, unsigned>
getInputRange
(unsigned dim, unsigned outputIndex, const ConvParams &params)¶
-
std::pair<unsigned, unsigned>
getInputRange
(unsigned dim, std::pair<unsigned, unsigned> outputRange, const ConvParams &params)¶
-
ConvParams
getGradientParams
(const ConvParams &params)¶ Given a set of parameters, return the set of params that represent the convolution to be applied to the output gradients to get the input gradients (provided the weights have been transposed in the channel axes and flipped in the spatial axes).
-
ConvParams
getWeightUpdateParams
(const ConvParams &fwdParams)¶ Given a set of convolution parameters, return the set of params that represent the convolution to be applied to the output gradients to get the weight update gradients.
4.3.3. poplin/Convolution.hpp¶
-
namespace
poplin
Linear algebra functions.
A collection of utility functions to assist calculation of input/output ranges when moving a 2-dimensional kernel over a larger 2-dimensional space (for example in convolution or pooling layers)
Typedefs
-
using
ConvPlanParams
= std::tuple<const poplar::Target*, const ConvParams, const poplar::OptionFlags*>¶
Functions
-
uint64_t
getFwdFlops
(const ConvParams &params)¶ Calculate the minimum number of floating point operations required to perform the forward pass convolution given a set of
params
.
-
uint64_t
getBwdFlops
(const ConvParams &params)¶ Calculate the minimum number of floating point operations required to perform the backward pass convolution given a set of
params
.
-
uint64_t
getWuFlops
(const ConvParams &params)¶ Calculate the minimum number of floating point operations required to perform the weight update pass convolution given a set of
params
.
-
double
getFwdPerfectCycleCount
(const poplar::Graph &graph, const ConvParams &params)¶ Calculate the number of cycles to perform the forward pass assuming maximal utilisation of target hardware performing the minimum number of floating point operations.
This takes into account the number of tiles available and vectorization support on the target.
This is an optimistic number useful for estimating efficiency. cycleCount = getFwdFlops() / maximumHardwareVectorization
- Parameters
graph
: Provides target the convolution will run on.params
: Description of convolution.
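The estimate quoted above reduces to a one-line formula (perfectCycleCountRef is an illustrative sketch; the real functions query the target for the hardware vectorization width):

```cpp
#include <cstdint>

// cycleCount = flops / maximumHardwareVectorization: an optimistic lower
// bound assuming every cycle retires a full hardware vector of operations.
double perfectCycleCountRef(std::uint64_t flops, unsigned hwVectorization) {
  return static_cast<double>(flops) / hwVectorization;
}
```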
-
double
getBwdPerfectCycleCount
(const poplar::Graph &graph, const ConvParams &params)¶ Calculate the number of cycles to perform the backward pass assuming maximal utilisation of target hardware performing the minimum number of floating point operations.
This takes into account the number of tiles available and vectorization support on the target.
This is an optimistic number useful for estimating efficiency. cycleCount = getBwdFlops() / maximumHardwareVectorization
- Parameters
graph
: Provides target the convolution will run on.params
: Description of convolution.
-
double
getWuPerfectCycleCount
(const poplar::Graph &graph, const ConvParams &params)¶ Calculate the number of cycles to perform the weight update pass assuming maximal utilisation of target hardware performing the minimum number of floating point operations.
This takes into account the number of tiles available and vectorization support on the target.
This is an optimistic number useful for estimating efficiency. cycleCount = getWuFlops() / maximumHardwareVectorization
- Parameters
graph
: Provides target the convolution will run on.params
: Description of convolution.
-
poplar::Tensor
createWeights
(poplar::Graph &graph, const ConvParams &params, const std::string &name, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Create a weight tensor suitable for use with convolution().
The shape of the tensor will be [convGroups x outChansPerConvGroup x inChansPerConvGroup x H x W]
Convolution options
availableMemoryProportion
Decimal between 0 and 1 (inclusive) [=0.6]The proportion of tile memory to be made available as temporary memory for this convolution. This constraint will be ignored (with a warning) if a conforming plan cannotot be found, in which case the planner will replan for the smallest possible memory usage. Less temporary memory will generally result in a convolution that takes more cycles to complete. However, because always-live memory (such as code and vertex state) is not tracked by the planner, a convolution using less temporary memory may use more memory overall due to an increase in always-live memory.
Note: We recommend using a value greater than 0.05. Below this value the volume of always-live memory quickly increases and can result in OOM errors.
partialsType
(half, float) [=float]Data type used for intermediate calculations.
pass
(NONE, INFERENCE_FWD, TRAINING_FWD, TRAINING_BWD, TRAINING_WU, FC_INFERENCE_FWD, FC_TRAINING_FWD, FC_TRAINING_BWD, FC_TRAINING_WU) [=NONE]use128BitConvUnitLoad
(true, false) [=false]If true, convolution weights are loaded 128-bits at a time. Otherwise, they are loaded 64-bits at a time. Not all codelets support 128-bit loads. This option affects memory usage and cycle count.
enableMultiStageReduce
(true, false) [=true]If true, perform the reduction following the convolution in multiple stages if it would significantly reduce code size. This comes at the cost of increasing the number of cycles.
enableFastReduce
(true, false) [=false]If true, use a faster reduction vertex if the data types and widths allow it. This comes at the cost of further constraints on memory allocation.
remapOutputTensor
(true, false) [=true]If true, the output of the convolution is remapped if the output is detected to have a poor layout.
enableConvDithering
(true, false) [=true]If true, then convolutions with different parameters will be laid out from different tiles in an effort to improve tile balance in models.
- Return
The weights tensor suitable for use with convolution().
- Parameters
graph
: The graph that the tensor will be added to.params
: The same parameters as used by the convolution().name
: Debugging name for the tensor.options
: Options controlling the implementation.cache
: Optional pointer to planning cache to use.
-
poplar::Tensor
createBiases
(poplar::Graph &graph, const poplar::Tensor &activations, const std::string &name = "biases")¶ Create a bias tensor suitable for input to addBias() function.
The tensor will have the shape [outChans]
- Return
The tensor of biases.
- Parameters
graph
: The graph that the tensor will be added to.activations
: The activation tensor which is output from the convolution.name
: Debugging name for the tensor.
-
poplar::Tensor
createInput
(poplar::Graph &graph, const ConvParams &params, const std::string &name, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Create an input tensor for a convolution.
Use this when required to create an input data tensor for a convolution. The same set of parameters which will be passed to the convolution() should also be passed to createInput().
The returned tensor has the shape [B x inChans x H x W].
- Return
The allocated input tensor.
- Parameters
graph
: The tensor will be added to this graph.params
: Parameters as passed to the target convolution.name
: Debugging name for the tensor.options
: Options controlling the implementation. See createWeights().cache
: Optional pointer to planning cache to use.
-
poplar::Tensor
convolution
(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &weights, const ConvParams &params, bool transposeAndFlipWeights, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Convolve an input with a set of weights.
This is for a 2D convolution.
The input tensor is in the form [B x inChans x H x W], and can be allocated using createInput(). The weights tensor is in the form [convGroups x outChansPerConvGroup x inChansPerConvGroup x H x W], and can be allocated using createWeights().
The returned tensor has the shape [B x outChans x H x W]
Padding and striding are specified in the ConvParams structure.
- Return
The convolved output tensor.
- Parameters
graph
: The graph that the operation will be added to.in
: Input data tensor.weights
: Weights tensor.params
: Parameters for the form of the convolution.transposeAndFlipWeights
: For the weight update pass.prog
: Poplar program sequence to append the operation onto.debugPrefix
: Name of the operation, for debugging.options
: Options that control the implementation. See createWeights().cache
: Optional pointer to planning cache to use.
-
void
preplanConvolutions
(const std::set<ConvPlanParams> &convs, PlanningCache &cache)¶ Plan the specified convolutions.
- Parameters
convs
: A set of tuples of:
conv-specific target for tile / IPU sizing
convolution parameters
implementation options. See createWeights().
All entries must have matching machine parameters.
cache
: The planning cache to update.
-
void
weightsTransposeChansFlipXY
(poplar::Graph &graph, const poplar::Tensor &weightsIn, const poplar::Tensor &weightsOut, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Copy the weights in
weightsIn
intoweightsOut
such that each element of the kernel is transposed with respect to the input and output channels, and each spatial dimension of the kernel is flipped.See
transposeAndFlipWeights
parameter in convolution().- Parameters
graph
: The graph that the operation will be added to.weightsIn
: The input weights tensor.weightsOut
: The output weights tensor.prog
: Poplar program sequence to append the operation onto.debugPrefix
: Name of the operation, for debugging.
-
void
weightsTransposeChansFlipXY
(poplar::Graph &graph, const poplar::Tensor &weightsInUnGrouped, const poplar::Tensor &weightsOutUnGrouped, std::vector<poplar::program::Copy> &preTranspose, poplar::ComputeSet transposeCS, std::vector<poplar::program::Copy> &postTranspose, const std::string &debugPrefix = "")¶ Copy the weights in
weightsIn
intoweightsOut
such that each element of the kernel is transposed with respect to the input and output channels, and each spatial dimension of the kernel is flipped.Overload that takes vectors of Copy programs and a ComputeSet instead of a Sequence.
See
transposeAndFlipWeights
parameter in convolution().- Parameters
graph
: The graph that the operation will be added to.weightsIn
: The input weights tensor.weightsOut
: The output weights tensor.prog
: Poplar program sequence to append the operation onto.debugPrefix
: Name of the operation, for debugging.
-
poplar::Tensor
calculateWeightDeltas
(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &activations, const ConvParams &params, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Append an operation to generate the tensor of weight deltas onto
prog
.- Return
A tensor containing the weight deltas, which are the gradients with respect to the weights of the convolution. These are populated when the operation runs.
- Parameters
graph
: The tensor will be added to this graph.zDeltas
: Tensor containing the gradients with respect to the output of the convolution.activations
: Tensor containing the inputs to the convolution in the forward pass.params
: Parameters of the convolution.prog
: Poplar program sequence to append the operation onto.debugPrefix
: Name of the operation, for debugging.options
: Options controlling the implementation.cache
: Optional pointer to planning cache to use.
-
void
convolutionWeightUpdate
(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &weights, const poplar::Tensor &activations, ConvParams params, const poplar::Tensor &scale, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Append operations to generate and apply the weight update.
- Parameters
graph
: The graph that the operation will be added to.zDeltas
: Tensor containing the gradients with respect to the output of the convolution.weights
: Weights tensor.activations
: Tensor containing the inputs to the convolution in the forward pass.params
: Parameters of the convolution.scale
: Scale to apply to the zDeltas.prog
: Poplar program sequence to append the operations onto.debugPrefix
: Name of the operation, for debugging.options
: Options controlling the implementation.cache
: Optional pointer to planning cache to use.
-
void
convolutionWeightUpdate
(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &weights, const poplar::Tensor &activations, ConvParams params, float scale, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Append operations to generate and apply the weight update.
- Parameters
graph
: The graph that the operation will be added to.zDeltas
: Tensor containing the gradients with respect to the output of the convolution.weights
: Weights tensor.activations
: Tensor containing the inputs to the convolution in the forward pass.params
: Parameters of the convolution.scale
: Scale to apply to the zDeltas.prog
: Poplar program sequence to append the operations onto.debugPrefix
: Name of the operation, for debugging.options
: Options controlling the implementation.cache
: Optional pointer to planning cache to use.
-
void
convolutionBiasUpdate
(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &biases, const poplar::Tensor &scale, const poplar::OptionFlags &options, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Add a program to update
biases
tensor with the gradients derived from thezDeltas
tensor.- Parameters
graph
: The graph that the operation will be added to.zDeltas
: Tensor containing the gradients with respect to the output of the convolution.biases
: Biases tensor to update.scale
: Scale to apply to the zDeltas tensor.options
: Options controlling the implementation.prog
: Poplar program sequence to append the operation onto.debugPrefix
: Name of the operation, for debugging.
-
void
convolutionBiasUpdate
(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &biases, float scale, const poplar::OptionFlags &options, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Add a program to update
biases
tensor with the gradients derived from thezDeltas
tensor.- Parameters
graph
: The graph that the operation will be added to.zDeltas
: Tensor containing the gradients with respect to the output of the convolution.biases
: Biases tensor to update.scale
: Scale to apply to the zDeltas tensor.options
: Options controlling the implementation.prog
: Poplar program sequence to append the operation onto.debugPrefix
: Name of the operation, for debugging.
-
void
addBias
(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &biases, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Adds a program to
prog
which adds biases
to the input
tensor.- Parameters
graph
: The graph that the operation will be added to.in
: Tensor containing the values to which the biases are added.biases
: Biases to add to the in
tensor.prog
: Poplar program sequence to append the operation onto.debugPrefix
: Name of the operation, for debugging.
-
void
reportPlanInfo
(std::ostream &out, const poplar::Graph &graph, const ConvParams &params, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Report the convolution plan corresponding to the
params
andoptions
provided.- Parameters
out
: ostream to report the plan to.graph
: The graph that the convolution is planned with.params
: The same parameters as used by the convolution().options
: Options controlling the implementation.cache
: Optional pointer to planning cache to use.
-
void
reportWeightUpdatePlanInfo
(std::ostream &out, const poplar::Graph &graph, const ConvParams &fwdParams, const poplar::OptionFlags &fwdOptions = {}, PlanningCache *cache = nullptr)¶ Report the convolution plan corresponding to the weight update pass given the fwd pass
params
andoptions
.- Parameters
out
: ostream to report the plan to.graph
: The graph that the convolution is planned with.fwdParams
: Fwd pass parameters as used by the convolution().fwdOptions
: Fwd pass options controlling the implementation.cache
: Optional pointer to planning cache to use.
-
poplar::Tensor
fullyConnectedWeightTranspose
(poplar::Graph &graph, poplar::Tensor activations, const ConvParams &params, poplar::program::Sequence &prog, const std::string &debugPrefix, const poplar::OptionFlags &options, PlanningCache *cache = nullptr)¶
-
class
PlanningCache
¶
4.3.4. poplin/FullyConnected.hpp¶
-
namespace
poplin
Linear algebra functions.
A collection of utility functions to assist calculation of input/output ranges when moving a 2-dimensional kernel over a larger 2-dimensional space (for example in convolution or pooling layers)
-
namespace
fc
¶ Functions
-
std::vector<std::pair<MatMulParams, poplar::OptionFlags>>
getMatMulPrePlanParameters
(FullyConnectedParams parameters, poplar::OptionFlags matmulOptions, poplar::Type type, bool inferenceOnly)¶ Predict what matrix multiplications will be needed for the given parameters and return a list of corresponding matmul parameters and options.
- Return
Vector of pairs of {MatMulParams, OptionFlags} representing the complete set of matmul parameters for planning
- Parameters
parameters
: Parameters for the fully connected layer. See above for definitions.matmulOptions
: Option flags are the same as those from matmul. They are passed through to the underlying matmul, updating thefullyConnectedPass
option only.type
: Input and output data type.inferenceOnly
: Whether the FullyConnected layer is for inference only. If true, we can ignore backwards and weight update passes
-
struct
FullyConnectedParams
¶
4.3.5. poplin/MatMul.hpp¶
-
namespace
poplin
Linear algebra functions.
A collection of utility functions to assist calculation of input/output ranges when moving a 2-dimensional kernel over a larger 2-dimensional space (for example in convolution or pooling layers)
Typedefs
-
using
MatMulPlanParams
= std::tuple<const poplar::Target*, const MatMulParams, const poplar::OptionFlags*>¶ A tuple containing the required parameters to preplan a matmul:
matmul-specific target for tile / IPU sizing
matmul parameters
implementation options (see matMul() above). All entries must have matching machine parameters.
Functions
-
poplar::Tensor
matMul
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Multiply two matrices.
Calculates C = A * B where A and B are matrices.
Matmul options
availableMemoryProportion
Decimal between 0 and 1 (inclusive) [=0.6]See createWeights().
fullyConnectedPass
(NONE, INFERENCE_FWD, TRAINING_FWD, TRAINING_BWD, TRAINING_WU) [=NONE]Optimize the plan for the specified type of pass. Note the abbreviations: FWD (forward), BWD (backward), WU (weight-update).
inputRHSIsPreArranged
(true, false) [=false]Indicates to matMul functions whether the input data has already been re-arranged (using preArrangeMatMulInputRHS()). This allows data to be re-arranged once then used many times.
use128BitConvUnitLoad
(true, false) [=false]If true, weights are loaded into the convolution unit 128-bits at a time. Otherwise, they are loaded 64-bits at a time. Not all codelets support 128-bit loads. This option affects memory usage and cycle count.
enableMultiStageReduce
(true, false) [=true]If true, perform the reduction following the matrix multiplication in multiple stages if it would significantly reduce code size. This comes at the cost of increasing the number of cycles.
enableFastReduce
(true, false) [=false]If true, use a faster reduction vertex if the data types and widths allow it. This comes at the cost of further constraints on memory allocation.
remapOutputTensor
(true, false) [=true]If true, the output of the convolution is remapped if the output is detected to have a poor layout.
partialsType
(half, float) [=float]See createWeights().
- Return
The tensor holding the result of the multiplication. This tensor will be created, added to the graph and mapped to tiles.
- Parameters
graph
: The Poplar graph.A
: The left argument to the multiplication. This 2D tensor must be already mapped to tiles.B
: The right argument to the multiplication. This 2D tensor must be already mapped to tiles.prog
: A reference to a program sequence which will be appended with the code to perform the multiplication.outputType
: Optional via overloaded function. Element type of returned tensor. The default isA.elementType()
if omitted.debugPrefix
: A debug prefix added to compute set and tensor names.options
: The structure describing options on how the multiplication should be implemented.cache
: Optional pointer to a planning cache to use.
-
poplar::Tensor
matMul
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶
-
void
matMulReportPlan
(std::ostream &out, const poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶
-
poplar::Tensor
matMulGrouped
(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Multiply two grouped matrices.
Calculates C[g] = A[g] * B[g] where A[g] and B[g] are matrices for each element in the group. g is an element of the set {0, 1, …, G-1}.
The multiplication is done for every element in the group. The first dimension of the matrices is the group dimension with value equal to G.
- Return
The tensor holding the result of the grouped multiplication. This tensor will be created, added to the graph and mapped to tiles.
- Parameters
graph
: The Poplar graph.A
: The left argument to the grouped multiplication. This 3D tensor must be already mapped to tiles.B
: The right argument to the grouped multiplication. This 3D tensor must be already mapped to tiles.prog
: A reference to a program sequence which will be appended with the code to perform the multiplication.outputType
: Data type to be used for the returned tensor.debugPrefix
: A debug prefix added to compute set and tensor names.options
: The structure describing options on how the grouped multiplication should be implemented. See matMul().cache
: Optional pointer to a planning cache to use.
-
void
matMulGroupedReportPlan
(std::ostream &out, const poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶
-
void
matMulAcc
(poplar::Graph &graph, const poplar::Tensor &C, float k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Multiply two matrices and add to a third (with a scaling factor).
Calculates C += k * A * B where A, B are matrices and k is a constant scalar.
- Parameters
graph
: The Poplar graph.C
: The matrix to add to. This 2D tensor must be already mapped to tiles.k
: The constant or single-element tensor by which the result of the multiplication is scaled. Ifk
is a tensor, it must be of the same type asA
A
: The left argument to the multiplication. This 2D tensor must be already mapped to tiles.B
: The right argument to the multiplication. This 2D tensor must be already mapped to tiles.prog
: A reference to a program sequence which will be appended with the code to perform the multiplication and add.debugPrefix
: A debug prefix added to compute set and tensor names.options
: The structure describing options on how the multiplication should be implemented. See matMul().cache
: Optional pointer to a planning cache to use.
-
void
matMulAcc
(poplar::Graph &graph, const poplar::Tensor &C, const poplar::Tensor &k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶
-
void
matMulGroupedAcc
(poplar::Graph &graph, const poplar::Tensor &C, float k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Multiply two grouped matrices and add to a third (with a scaling factor).
Calculates C[g] += k * A[g] * B[g] where A[g], B[g] are matrices and k is a constant scalar. g is an element of the set {0, 1, …, G-1}.
The multiplication is done for every element in the group. The first dimension of the matrices is the group dimension with value equal to G.
- Parameters
graph
: The Poplar graph.C
: The matrix to add to. This 3D tensor must be already mapped to tiles.k
: The constant or single-element tensor by which the result of the multiplication is scaled. Ifk
is a tensor, it must be of the same type asA
A
: The left argument to the grouped multiplication. This 3D tensor must be already mapped to tiles.B
: The right argument to the multiplication. This 3D tensor must be already mapped to tiles.prog
: A reference to a program sequence which will be appended with the code to perform the grouped multiplication and add.debugPrefix
: A debug prefix added to compute set and tensor names.options
: The structure describing options on how the multiplication should be implemented. See matMul().cache
: Optional pointer to planning cache to use.
-
void
matMulGroupedAcc
(poplar::Graph &graph, const poplar::Tensor &C, const poplar::Tensor &k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶
-
poplar::Tensor
createMatMulInputLHS
(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const std::string &name, const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Create a tensor that is used as the left operand of matrix multiplication.
This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the left argument efficient.
- Return
A matrix of type
type
and shapeaShape
. The tensor will have been mapped to tiles.- Parameters
graph
: The Poplar graph.inputType
: The input data type.outputType
: The data type of the returned tensor.aShape
: The shape of the required matrix.bShape
: The shape of the matrix that the required matrix will be multiplied by.name
: The debug name of the required matrix.options
: The implementation options of the multiplication. See matMul().cache
: Optional pointer to a planning cache to use.
-
poplar::Tensor
createMatMulInputLHS
(poplar::Graph &graph, const poplar::Type &dataType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const std::string &name, const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Overloaded function for when inputType == outputType (represented by the dataType parameter).
-
poplar::Tensor
createMatMulGroupedInputLHS
(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const std::string &name, const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Create a tensor that is used as the left operand of a grouped matrix multiplication.
This will create a 3D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a grouped matrix multiplication with this tensor as the left argument efficient.
The first dimension of the required matrix and of the matrix it is multiplied by must be the number of groups.
- Return
A matrix of type
type
and grouped shapeaShape
. The tensor will have been mapped to tiles.- Parameters
graph
: The Poplar graph.inputType
: The input data type.outputType
: The data type of the returned tensor.aShape
: The grouped shape {g, r, c} of the required matrix.bShape
: The grouped shape {g, r, c} of the matrix that the required matrix will be multiplied by.name
: The debug name of the required matrix.options
: The implementation options of the multiplication. See matMul().cache
: Optional pointer to a planning cache to use.
-
poplar::Tensor
createMatMulInputRHS
(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const std::string &name, const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Create a tensor that is used as the right operand of matrix multiplication.
This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the right argument efficient.
- Return
A matrix of type
type
and shapebShape
. The tensor will have been mapped to tiles.- Parameters
graph
: The Poplar graph.inputType
: The input data type.outputType
: The data type of the returned tensor.aShape
: The shape of the matrix that the required matrix will be multiplied by.bShape
: The shape of the required matrix.name
: The debug name of the required matrix.options
: The implementation options of the multiplication. See matMul().cache
: Optional pointer to a planning cache to use.
-
poplar::Tensor
createMatMulInputRHS
(poplar::Graph &graph, const poplar::Type &dataType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const std::string &name, const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Overloaded function for when inputType == outputType (represented by the dataType parameter).
-
poplar::Tensor
createMatMulGroupedInputRHS
(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const std::string &name, const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Create a tensor that is used as the right operand of grouped matrix multiplication.
This will create a 3D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a grouped matrix multiplication with this tensor as the right argument efficient.
The first dimension of the required matrix and of the matrix it is multiplied by must be the number of groups.
- Return
A matrix of type
type
and grouped shapebShape
. The tensor will have been mapped to tiles.- Parameters
graph
: The Poplar graph.inputType
: The input data type.outputType
: The data type of the returned tensor.aShape
: The grouped shape {g, r, c} of the matrix that the required matrix will be multiplied by.bShape
: The grouped shape {g, r, c} of the required matrix.name
: The debug name of the required matrix.options
: The implementation options of the multiplication. See matMul().cache
: Optional pointer to planning cache to use.
-
poplar::Tensor
preArrangeMatMulInputRHS
(poplar::Graph &graph, const std::vector<std::size_t> &aShape, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶ Re-arrange memory for RHS operand to an upcoming matmul operation.
This allows the rearrangement of the memory of a tensor that would otherwise be rearranged as part of the matmul operation for efficiency.
Use this function and the matMul* functions with the
inputRHSIsPreArranged
option flag to do any re-arrangement necessary once and then re-use that input multiple times.Only valid for fully connected layers.
- Return
New tensor holding the rearranged input. This tensor has the same shape as the given tensor.
- Parameters
graph
: The Poplar graph.aShape
: The shape of the left argument to the multiplication.B
: The right argument to the multiplication. This 2D tensor must be already mapped to tiles.prog
: A reference to a program sequence which will be appended with the code to perform the arrangement.outputType
: Optional via overloaded function. Element type of returned tensor. The default isB.elementType()
if omitted.debugPrefix
: A debug prefix added to compute set and tensor names.options
: Flags describing options for how the multiplication should be implemented. See matMul().cache
: Optional pointer to planning cache to use.
-
poplar::Tensor
preArrangeMatMulInputRHS
(poplar::Graph &graph, const std::vector<std::size_t> &aShape, const poplar::Tensor &B, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶
-
poplar::Tensor
preArrangeMatMulGroupedInputRHS
(poplar::Graph &graph, const std::vector<std::size_t> &aShape, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, matmul::PlanningCache *cache = nullptr)¶
-
poplar::Tensor
transposeGroupedMatrix
(const poplar::Tensor &A)¶ Transposes a grouped matrix tensor.
- Return
Transposed tensor
- Parameters
A
: Tensor to transpose
-
void
preplanMatMuls
(const std::set<MatMulPlanParams> &matmuls, matmul::PlanningCache &cache)¶ Plan the specified matrix multiplications.
- Parameters
matmuls
: A set of parameters to preplan matmulscache
: The planning cache to update
-
struct
MatMulParams
¶ - #include <MatMul.hpp>
Parameters to define a matrix multiplication C = A * B.
Public Members
Friends
-
friend bool
operator<
(const MatMulParams &a, const MatMulParams &b)¶
4.3.6. poplin/MeshGrid.hpp¶
-
namespace
poplin
Linear algebra functions.
A collection of utility functions to assist calculation of input/output ranges when moving a 2-dimensional kernel over a larger 2-dimensional space (for example in convolution or pooling layers)
Functions
-
poplar::Tensor
linspace
(poplar::Graph &graph, const poplar::Type &type, float left, float right, size_t count, const std::string &debugPrefix = "")¶ Create a constant variable that contains values equally spaced in the specified closed range [left, right].
- Return
Constant Tensor of rank 1 (vector) containing the linspace values.
- Parameters
graph
: Graph to which the variable is added.left
: The first value in the range.right
: The last value in the range.type
: Data type of variable to create. Must be FLOAT or HALF.count
: The number of values in the returned vector.
-
std::vector<poplar::Tensor>
meshgrid2d
(poplar::Graph &graph, poplar::Tensor x, poplar::Tensor y)¶ Create a coordinate grid for each axis by broadcasting the input tensors.
This 2D specialisation only supports two inputs that must be of rank 1 (vectors) and hence the output coordinate grids are always two matrices (so two outputs of rank two).
- Return
A list of (two) tensors that form co-ordinate grids for each input axis. These output tensors will be views of the inputs (reshaped and broadcast)
- Parameters
graph
: Graph to which the variables are added.x
: Co-ordinates for the x-axisy
: Co-ordinates for the y-axis
4.3.7. poplin/MultiConvolution.hpp¶
-
namespace
poplin
Linear algebra functions.
A collection of utility functions to assist calculation of input/output ranges when moving a 2-dimensional kernel over a larger 2-dimensional space (for example in convolution or pooling layers)
-
namespace
multiconv
¶ Functions
-
poplar::Tensor
createWeights
(poplar::Graph &graph, const std::vector<CreateTensorArgs> &args, unsigned weightsIndex, const poplar::OptionFlags &options = {}, poplin::PlanningCache *cache = nullptr)¶ Create a specific weights tensor for the multiconvolution.
- Return
A weights tensor suitable for use with convolution().
- Parameters
graph
: The graph that the tensors will be added to.args
: The same set of parameters as used by convolution().weightsIndex
: Index into args identifying the convolution for which to create the weights.options
: Options controlling the implementation.cache
: Optional pointer to a planning cache to use.
-
poplar::Tensor
createInput
(poplar::Graph &graph, const std::vector<CreateTensorArgs> &args, unsigned inputIndex, const poplar::OptionFlags &options = {}, poplin::PlanningCache *cache = nullptr)¶ Create a specific input tensor for the multiconvolution.
- Return
An input tensor suitable for use with convolution().
- Parameters
graph
: The graph that the tensors will be added to.args
: The same set of parameters as used by convolution().inputIndex
: Index into args identifying the convolution for which to create the input.options
: Options controlling the implementation.cache
: Optional pointer to a planning cache to use.
-
std::vector<poplar::Tensor>
convolution
(poplar::Graph &graph, const std::vector<ConvolutionArgs> &args, bool transposeAndFlipWeights, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, poplin::PlanningCache *cache = nullptr)¶ Convolve a set of inputs with a set of weights.
See Convolution.hpp for more information.
- Return
Set of convolved output tensors.
- Parameters
graph
: The graph that the operations will be added to.args
: Collection of inputs, weights, and convolution parameters specifying each convolution in the multiconvolution.transposeAndFlipWeights
: Prepare the weights for the backwards pass.prog
: Poplar program sequence to append the operations onto.debugPrefix
: Name of the operation, for debugging.options
: Options controlling the implementation.cache
: Optional pointer to a planning cache to use.
-
std::vector<poplar::Tensor>
calculateWeightDeltas
(poplar::Graph &graph, const std::vector<CalculateWeightDeltasArgs> &args, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, poplin::PlanningCache *cache = nullptr)¶ Append an operation to generate the set of weight delta tensors.
See Convolution.hpp for more information.
- Return
Set of weight deltas.
- Parameters
graph
: The graph that the operations will be added to.args
: Collection of zDeltas, activations, and convolution parameters specifying each convolution in the multiconvolution.prog
: Poplar program sequence to append the operations onto.debugPrefix
: Name of the operation, for debugging.options
: Options controlling the implementation.cache
: Optional pointer to a planning cache to use.
-
void
convolutionWeightUpdate
(poplar::Graph &graph, const std::vector<ConvWeightUpdateArgs> &args, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, poplin::PlanningCache *cache = nullptr)¶ Append operations to
prog
to generate and apply the weight update.See Convolution.hpp for more information.
- Parameters
graph
: The graph that the operations will be added to.args
: Collection of zDeltas, activations, scale, and convolution parameters for the weight updates in the multiconvolution.prog
: Poplar program sequence to append the operations onto.debugPrefix
: Name of the operation, for debugging.options
: Options controlling the implementation.cache
: Optional pointer to a planning cache to use.
-
void
convolutionWeightUpdate
(poplar::Graph &graph, const std::vector<ConvWeightUpdateArgsScalar> &args, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, poplin::PlanningCache *cache = nullptr)¶ Append operations to
prog
to generate and apply the weight update.See Convolution.hpp for more information.
- Parameters
graph
: The graph that the operations will be added to.args
: Collection of zDeltas, activations, scale, and convolution parameters for the weight updates in the multiconvolution.prog
: Poplar program sequence to append the operations onto.debugPrefix
: Name of the operation, for debugging.options
: Options controlling the implementation.cache
: Optional pointer to a planning cache to use.
-
struct
CalculateWeightDeltasArgs
¶ - #include <MultiConvolution.hpp>
- Parameters
zDeltas
: Tensor containing gradients with respect to the output of the convolution.activations
: Tensor containing the inputs of the convolution in the forward pass.params
: Parameters specifying the convolution.options
: Options controlling the implementation.
-
struct
ConvolutionArgs
¶ - #include <MultiConvolution.hpp>
- Parameters
in
: Input tensor.weights
: Weights tensor.params
: Parameters specifying the convolution.options
: Options controlling the implementation.
-
struct
ConvWeightUpdateArgs
¶ - #include <MultiConvolution.hpp>
- Parameters
zDeltas
: Tensor containing gradients with respect to the output of the convolution.weights
: Weights tensor.activations
: Tensor containing the inputs of the convolution in the forward pass.scale
: Scale to apply to the zDeltas.params
: Parameters specifying the convolution.options
: Options controlling the implementation.
-
struct
ConvWeightUpdateArgsScalar
¶ - #include <MultiConvolution.hpp>
- Parameters
zDeltas
: Tensor containing gradients with respect to the output of the convolution.weights
: Weights tensor.activations
: Tensor containing the inputs of the convolution in the forward pass.scale
: Scale to apply to the zDeltas.params
: Parameters specifying the convolution.options
: Options controlling the implementation.
-
struct
CreateTensorArgs
¶ - #include <MultiConvolution.hpp>
Multi-convolutions allow for a set of convolutions to be executed in parallel.
The benefit of executing convolutions in parallel is an increase in data throughput. Specifically, executing N independent convolutions in parallel will be faster than executing them sequentially because less time is spent on the roughly constant per-tile vertex overhead.
Note that the allocation of associated tensors for convolutions should be done through the same API so that they are mapped across tiles appropriately for the operation.
See Convolution.hpp for information about convolutions and each individual operation.
Multi-Convolution options
planType
(serial, parallel) [=parallel]Which multi-conv implementation to use. Serial is the same as using the normal API for each convolution.
perConvReservedTiles
Integer [=50]The number of tiles to reserve for each convolution when planning.
cycleBackOff
Double [=0.1]A proportion between 0 and 1 specifying how far from the fastest plan to back off when attempting to plan the largest convolution using the fewest tiles.
This number is scaled up according to how many convolutions are being run in parallel.
- Parameters
params
: Parameters specifying the convolution.options
: Options controlling the implementation.name
: Debugging name for the tensor.
-
poplar::Tensor
-
namespace
4.3.8. poplin/Norms.hpp¶
-
namespace
poplin
Linear algebra functions.
A collection of utility functions to assist calculation of input/output ranges when moving a 2-dimensional kernel over a larger 2-dimensional space (for example in convolution or pooling layers)
Functions
-
poplar::Tensor
createNormGamma
(poplar::Graph &graph, const poplar::Tensor &acts)¶ Create and map the per channel multiplicative gamma parameter tensor used for normalisation in convolution layers.
- Return
Gamma vector of dimension
C
.- Parameters
graph
: The graph with the activations and gamma tensor.acts
: The activations tensor has shape[N][C][..F..]
whereN
is the batch sizeC
is the number of channels..F..
are the dimensions of an N-dimensional field.
-
poplar::Tensor
createNormBeta
(poplar::Graph &graph, const poplar::Tensor &acts)¶ Create and map the per channel additive beta parameter tensor used for normalisation in convolution layers.
- Return
Beta vector of dimension
C
.- Parameters
graph
: The graph with the activations and beta tensor.acts
: The activations tensor has shape[N][C][..F..]
whereN
is the batch sizeC
is the number of channels..F..
are the dimensions of an N-dimensional field.
-
std::pair<poplar::Tensor, poplar::Tensor>
createNormParams
(poplar::Graph &graph, const poplar::Tensor &acts)¶ Creates a tensor pair of normalisation parameters (gamma, beta).
- Return
A pair of vectors of dimension
C
.- Parameters
graph
: The graph with the activations and beta/gamma tensors.acts
: The activations tensor has shape[N][C][..F..]
whereN
is the batch sizeC
is the number of channels..F..
are the dimensions of an N-dimensional field.
-
std::pair<poplar::Tensor, poplar::Tensor>
normStatistics
(poplar::Graph &graph, const poplar::Tensor &actsUngrouped, float eps, poplar::program::Sequence &prog, bool unbiasedVarEstimate, bool stableAlgo = false, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "")¶ Compute the normalisation statistics from the activations tensor.
The activations tensor is of shape
[N][C][..F..]
. The mean and inverse standard deviation are computed over the dimensions [N][..F..]
and vectors of lengthC
are returned as estimates.The input activations tensor must be rearranged such that statistics are computed for
C
channels.- Return
A vector pair with mean and inverse standard deviation.
- Parameters
graph
: The graph in which the computation is performed.actsUngrouped
: The activation with shape[N][C][..F..]
whereN
is the batch sizeC
is the number of channels..F..
are the dimensions of an N-dimensional field.
eps
: The epsilon added to the variance to avoid divide by zero.prog
: A reference to a program sequence which will be appended with the code to perform the normalisation.unbiasedVarEstimate
: Compute unbiased variance estimate.stableAlgo
: If true, computes the mean first and subtracts it from the activations before computing the variance. The implementation with this flag set to true is more numerically stable.partialsType
: Poplar type used for partials.debugPrefix
: A debug prefix added to compute set and tensor names.
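The statistics can be reproduced on the host for small tensors. A sketch of the per-channel computation, assuming `acts` is indexed as [n][c][f] and that eps is added to the variance as described; `normStatisticsHost` is a hypothetical reference, not the device implementation:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Per-channel mean and inverse standard deviation over the batch (N)
// and field (F) dimensions, with optional unbiased variance estimate.
void normStatisticsHost(
    const std::vector<std::vector<std::vector<float>>> &acts, // [n][c][f]
    float eps, bool unbiasedVarEstimate,
    std::vector<float> &mean, std::vector<float> &iStdDev) {
  const std::size_t N = acts.size(), C = acts[0].size(), F = acts[0][0].size();
  mean.assign(C, 0.f);
  iStdDev.assign(C, 0.f);
  for (std::size_t c = 0; c < C; ++c) {
    double sum = 0, sumSq = 0;
    for (std::size_t n = 0; n < N; ++n)
      for (std::size_t f = 0; f < F; ++f) {
        sum += acts[n][c][f];
        sumSq += acts[n][c][f] * acts[n][c][f];
      }
    const double count = double(N * F);
    const double mu = sum / count;
    double var = sumSq / count - mu * mu; // biased estimate
    if (unbiasedVarEstimate)
      var *= count / (count - 1);
    mean[c] = float(mu);
    iStdDev[c] = float(1.0 / std::sqrt(var + eps));
  }
}
```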
-
poplar::Tensor
normWhiten
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &mean, const poplar::Tensor &iStdDev, poplar::program::Sequence &prog, const std::string &debugPrefix)¶ Compute the whitened activations using the supplied mean and inverse standard deviation.
The input activations undergo a prior rearrangement such that
C
is the size of the statistics mean and iStdDev.- Return
Whitened activations.
- Parameters
graph
: The graph which the computation is in.acts
: The activations tensor of shape [N][C][..F..].mean
: Mean of the activations with dimension C.iStdDev
: Inverse standard deviation with dimension C.prog
: A reference to a program sequence which will be appended with the code to perform the normalisation.debugPrefix
: A debug prefix added to compute set and tensor names.
-
poplar::Tensor
normalise
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gamma, const poplar::Tensor &beta, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Computes the normalised output given whitened activations.
- Parameters
graph
: The graph to which the normalisaton operation is added.actsWhitened
: Whitened activations.gamma
: Per channel multiplicative normalisation parameter.beta
: Per channel additive normalisation parameter.prog
: A reference to a program sequence which will be appended with the code to perform the normalisation.debugPrefix
: A debug prefix added to compute set and tensor names.
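Elementwise, whitening and normalisation compose as out = gamma[c] * (x - mean[c]) * iStdDev[c] + beta[c]. A minimal host-side sketch of the two steps (hypothetical helper names, independent of Poplar):

```cpp
// Whitening with the channel statistics, as performed by normWhiten.
float whitenHost(float act, float mean, float iStdDev) {
  return (act - mean) * iStdDev;
}

// Per-channel affine transform applied by normalise.
float normaliseHost(float whitened, float gamma, float beta) {
  return gamma * whitened + beta;
}
```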
-
std::pair<poplar::Tensor, poplar::Tensor>
normParamGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "")¶ Compute gradients with respect to parameters required for parameter update.
- Parameters
graph
: The graph to which the normalisaton operation is added.actsWhitened
: Whitened activations.gradsIn
: Input gradients to the normalisation layer.prog
: A reference to a program sequence which will be appended with the code to perform the normalisation.partialsType
: The intermediate type kept in the computation.debugPrefix
: A debug prefix added to compute set and tensor names.
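For a normalisation layer the parameter gradients take the standard form: gamma's gradient is the sum over batch and field of gradsIn * actsWhitened, and beta's is the sum of gradsIn. A host-side sketch under that assumption (`normParamGradientsHost` is a hypothetical reference, inputs indexed [n][c][f]):

```cpp
#include <cstddef>
#include <vector>

// Per-channel parameter gradients of a normalisation layer.
void normParamGradientsHost(
    const std::vector<std::vector<std::vector<float>>> &actsWhitened,
    const std::vector<std::vector<std::vector<float>>> &gradsIn,
    std::vector<float> &gradGamma, std::vector<float> &gradBeta) {
  const std::size_t N = gradsIn.size(), C = gradsIn[0].size(),
                    F = gradsIn[0][0].size();
  gradGamma.assign(C, 0.f);
  gradBeta.assign(C, 0.f);
  for (std::size_t n = 0; n < N; ++n)
    for (std::size_t c = 0; c < C; ++c)
      for (std::size_t f = 0; f < F; ++f) {
        gradGamma[c] += gradsIn[n][c][f] * actsWhitened[n][c][f];
        gradBeta[c] += gradsIn[n][c][f];
      }
}
```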
-
poplar::Tensor
normGradients
(poplar::Graph &graph, const poplar::Tensor &gradsIn, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Propagate the gradients through the normalisation layer.
- Parameters
graph
: The graph to which the normalisaton operation is added.gradsIn
: Input gradients to the normalisation layer.gamma
: Multiplicative parameter used in the normalisation.prog
: A reference to a program sequence which will be appended with the code to perform the normalisation.debugPrefix
: A debug prefix added to compute set and tensor names.
-
poplar::Tensor
normStatisticsGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "")¶ Propagate the gradients through the norm statistics layer.
The input to the layer is the output gradients from the normalisation layer. The whitened activations and the input gradients must have undergone a prior rearrangement such that the channel dimension has the same elements as
invStdDev
.- Parameters
graph
: The graph to which the normalisaton operation is added.actsWhitened
: Forward whitened activations.gradsIn
: Input gradients to the normalisation layer.invStdDev
: Inverse standard deviation from norm statistics.prog
: A reference to a program sequence which will be appended with the code to perform the normalisation.debugPrefix
: A debug prefix added to compute set and tensor names.
-
poplar::Tensor
4.4. Random number operations (poprand)¶
Functions for tensor operations using random numbers. These make use of the hardware pseudo-random number generators (PRNG) on each tile. There is a separate PRNG for each worker thread. These are designed to allow every vertex to generate a different pseudo-random sequence but also, importantly, to ensure that the same sequence can be regenerated when required.
These functions have an optional seed parameter for initialising the tiles’ PRNGs. Because there is no 64-bit integer type in device code, this is passed as a tensor of two 32-bit integers. This seed value is common to an entire graph or subgraph.
A “seed modifier” parameter is also used, which enables each vertex to generate a different pseudo-random sequence from the same seed. This is ignored if the seed is not specified.
The pseudo-random sequence is determined by a combination of tile-id, thread-id, seed and seed modifier.
If a seed is provided then, at the end of the operation, the PRNG state is restored to be the same as it was before the operation.
The functions have a reference tensor as a parameter. This is used to define the layout of the output tensor in order to guarantee deterministic results when a seed is specified. It ensures that if the same seed and seed modifier values are used then the same output is obtained.
4.4.1. poprand/RandomGen.hpp¶
-
namespace
poprand
¶ Pseudo-random number generator (PRNG) functions.
Functions
-
poplar::Tensor
dropout
(poplar::Graph &graph, const poplar::Tensor *seed, const uint32_t seedModifier, const poplar::Tensor &input, const poplar::Tensor &reference, double keepProbability, double scale, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Apply dropout to a tensor.
The elements of tensor
input
are multiplied by a mask consisting of a sequence of randomly generated ones and zeros. The keep probability of the dropout is P(1) =keepProbability
. The contents of the mask depend on the keep probability, seed, seed modifier and layout of the reference tensor.- Return
A tensor with elements randomly set to either zero or the scaled input value.
- Parameters
graph
: The graph to add this operation to.seed
: If not null, this is a pair of 32-bit integers used to seed the random number generator that generates the dropout mask.seedModifier
: Provides a further modification of the seed value. Ignored ifseed
is null.input
: The input tensor to be masked.reference
: A tensor that specifies the layout of the output tensor. Must be the same shape as the input.keepProbability
: The probability of keeping an input value.scale
: Scales the output tensor. This is typically the inverse of the dropout probability, (1 / P(1)).prog
: The program to add this operation to.debugPrefix
: A prefix string for debugging.
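The elementwise behaviour can be sketched on the host: each element is either zeroed (with probability 1 - keepProbability) or multiplied by scale. This illustration uses a C++ standard-library generator rather than the tile PRNG, so it does not reproduce device masks; `dropoutHost` is hypothetical:

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Host-side illustration of the dropout semantics. The on-device mask
// additionally depends on the seed modifier and the reference tensor's
// layout, which this sketch ignores.
std::vector<float> dropoutHost(const std::vector<float> &input,
                               double keepProbability, double scale,
                               unsigned seed) {
  std::mt19937 rng(seed);
  std::bernoulli_distribution keep(keepProbability);
  std::vector<float> out(input.size());
  for (std::size_t i = 0; i < input.size(); ++i)
    out[i] = keep(rng) ? float(input[i] * scale) : 0.f; // keep-and-scale, or zero
  return out;
}
```

With keepProbability p, scale is typically 1 / p so the expected value of each element is unchanged.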
-
poplar::Tensor
shapedDropout
(poplar::Graph &graph, const poplar::Tensor *seed, const uint32_t seedModifier, const poplar::Tensor &input, const poplar::Tensor &reference, double keepProbability, double scale, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Apply shaped dropout to a tensor.
The elements of tensor
input
are multiplied by a mask consisting of a sequence of randomly generated ones and zeros. The keep probability of the dropout is P(1) =keepProbability
.Shaped dropout allows row, column and dimension wise dropout, versus element-wise standard dropout. The shape of the dropout must be compatible (broadcastable) to
input
.The contents of the mask depend on the keep probability, seed, seed modifier and layout of the reference tensor.
- Return
A tensor with elements randomly set to either zero or the scaled input value.
- Parameters
graph
: The graph to add this operation to.seed
: If not null, this is a pair of 32-bit integers used to seed the random number generator that generates the dropout mask.seedModifier
: Provides a further modification of the seed value. Ignored ifseed
is null.input
: The input tensor to be masked.reference
: A tensor that specifies the shape and layout of the dropout. Must be broadcastable to the input.keepProbability
: The probability of keeping an input value.scale
: Scales the output tensor. This is typically the inverse of the dropout probability, (1 / P(1)).prog
: The program to add this operation to.debugPrefix
: A prefix string for debugging.
-
poplar::Tensor
uniform
(poplar::Graph &graph, const poplar::Tensor *seed, uint32_t seedModifier, const poplar::Tensor &reference, const poplar::Type &outType, double minVal, double maxVal, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Uniform distribution in a given interval with
maxVal
>minVal
.Generates random data with uniform distribution in the interval [
minVal
,maxVal
]. The output may be of typefloat
,half
orint
.For type
int
, data is generated in the interval [minVal
,maxVal
] with uniform probability if (maxVal
-minVal
) is a power of 2. Otherwise there will be a small bias in the probability generated, with the bias directly proportional to the ratio (maxVal
-minVal
+ 1 ) / 2^32.- Return
A tensor with elements having a uniform distribution of random values.
- Parameters
graph
: The graph to add this operation to.seed
: If not null, this is a pair of 32-bit integers used to seed the random number generator that generates the distribution.seedModifier
: Provides a further modification of the seed value. Ignored ifseed
is null.reference
: A tensor that specifies the layout of the output tensor.outType
: Type of the output tensor. One offloat
,half
orint
.minVal
: The minimum value of the distribution.maxVal
: The maximum value of the distribution.prog
: The program to add this operation to.debugPrefix
: A prefix string for debugging.
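The integer bias noted above is the familiar modulo bias: mapping a uniform 32-bit word onto a range of size r is exactly uniform only when r is a power of two; otherwise the first (2^32 mod r) values of the range each receive one extra source word. A small host-side illustration of the counting argument:

```cpp
#include <cstdint>

// Number of 32-bit source words that map, by modulo, to a given value
// in a range of size r. Uniform exactly when 2^32 mod r == 0.
std::uint64_t wordsMappingTo(std::uint32_t r, std::uint32_t value) {
  const std::uint64_t total = std::uint64_t(1) << 32; // 2^32 source words
  return total / r + (value < total % r ? 1 : 0);
}
```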
-
poplar::Tensor
bernoulli
(poplar::Graph &graph, const poplar::Tensor *seed, uint32_t seedModifier, const poplar::Tensor &reference, const poplar::Type &outType, double prob, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Bernoulli distribution which has the value 1 with the specified probability.
Generates a tensor with random values of 0 and 1, determined by
prob
.- Return
A tensor with elements randomly set to either 0 or 1.
- Parameters
graph
: The graph to add this operation to.seed
: If not null, this is a pair of 32-bit integers used to seed the random number generator that generates the distribution.seedModifier
: Provides a further modification of the seed value. Ignored ifseed
is null.reference
: A tensor that specifies the layout of the output tensor.outType
: Type of the output tensor. One offloat
,half
orint
.prob
: Probability of an element being 1.prog
: The program to add this operation to.debugPrefix
: A prefix string for debugging.
-
poplar::Tensor
normal
(poplar::Graph &graph, const poplar::Tensor *seed, uint32_t seedModifier, const poplar::Tensor &reference, const poplar::Type &outType, double mean, double stdDev, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Normal distribution with given mean and standard deviation.
Generates random data with a normal (Gaussian) distribution. The mean is given by
mean
and the standard deviation bystdDev
.- Return
A tensor with elements drawn from a normal distribution with the given mean and standard deviation.
- Parameters
graph
: The graph to add this operation to.seed
: If not null, this is a pair of 32-bit integers used to seed the random number generator that generates the distribution.seedModifier
: Provides a further modification of the seed value. Ignored ifseed
is null.reference
: A tensor that specifies the layout of the output tensor.outType
: Type of the output tensor. One offloat
orhalf
.mean
: The mean value of the distribution.stdDev
: The standard deviation of the distribution.prog
: The program to add this operation to.debugPrefix
: A prefix string for debugging.
-
poplar::Tensor
truncatedNormal
(poplar::Graph &graph, const poplar::Tensor *seed, uint32_t seedModifier, const poplar::Tensor &reference, const poplar::Type &outType, double mean, double stdDev, double alpha, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Truncated normal distribution.
Generates a distribution derived from a normal distribution with mean
mean
and standard deviationstdDev
. This normal distribution is truncated symmetrically about the mean at (mean
-alpha
*stdDev
) and (mean
+alpha
*stdDev
).- Return
A tensor with elements drawn from the truncated normal distribution.
- Parameters
graph
: The graph to add this operation to.seed
: If not null, this is a pair of 32-bit integers used to seed the random number generator that generates the distribution.seedModifier
: Provides a further modification of the seed value. Ignored ifseed
is null.reference
: A tensor that specifies the layout of the output tensor.outType
: Type of the output tensor. One offloat
orhalf
.mean
: The mean value of the distribution.stdDev
: The standard deviation of the distribution.alpha
: Defines the minimum and maximum values of the distribution.prog
: The program to add this operation to.debugPrefix
: A prefix string for debugging.
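One common way to realise such a distribution is rejection sampling: draw from N(mean, stdDev) and discard samples more than alpha standard deviations from the mean. A host-side sketch of that method (illustrative only, not necessarily what the device implementation does):

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Symmetric truncated normal by rejection: keep only draws within
// [mean - alpha * stdDev, mean + alpha * stdDev].
std::vector<float> truncatedNormalHost(std::size_t n, double mean,
                                       double stdDev, double alpha,
                                       unsigned seed) {
  std::mt19937 rng(seed);
  std::normal_distribution<double> dist(mean, stdDev);
  std::vector<float> out;
  out.reserve(n);
  while (out.size() < n) {
    const double s = dist(rng);
    if (std::fabs(s - mean) <= alpha * stdDev) // reject out-of-range draws
      out.push_back(float(s));
  }
  return out;
}
```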
-
void
setSeed
(poplar::Graph &graph, const poplar::Tensor &masterSeed, uint32_t seedModifier, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Sets the random number generator seed on all tiles.
- Parameters
graph
: The graph to add this operation to.masterSeed
: A 64-bit integer used to seed the random number generator on every tile.seedModifier
: Provides a further modification of the seed value.prog
: The program to add this operation to.debugPrefix
: A prefix string for debugging.
-
poplar::Tensor
4.5. Sparse tensor operations (popsparse)¶
Functions for operating on block sparse tensors. Static block and dynamic sparsity are supported.
4.5.1. popsparse/BlockSparse.hpp¶
-
namespace
popsparse
¶ Support for sparse matrices.
-
namespace
experimental
¶ Enums
-
enum
SubBlockMask
¶ Define the sparsity mask inside a block.
The diagonal is defined across all the non-sparse matrix dimensions, where the row index is equal to the column index.
Values:
-
enumerator
None
¶ No elements are zeroed out.
-
enumerator
ZeroUpperTriangle
¶ Elements in the upper triangle, above the diagonal, are zeroed out.
-
enumerator
ZeroLowerTriangle
¶ Elements in the lower triangle, below the diagonal, are zeroed out.
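The effect of the mask can be illustrated on a small dense matrix, with row and column indices taken in the non-sparse matrix coordinates. This host-side sketch uses a local enum mirroring the one above:

```cpp
#include <cstddef>
#include <vector>

enum class SubBlockMask { None, ZeroUpperTriangle, ZeroLowerTriangle };

// Zero out elements above (or below) the diagonal, where the diagonal
// is row == column.
void applySubBlockMask(std::vector<std::vector<float>> &m, SubBlockMask mask) {
  for (std::size_t r = 0; r < m.size(); ++r)
    for (std::size_t c = 0; c < m[r].size(); ++c) {
      if (mask == SubBlockMask::ZeroUpperTriangle && c > r)
        m[r][c] = 0.f; // above the diagonal
      if (mask == SubBlockMask::ZeroLowerTriangle && c < r)
        m[r][c] = 0.f; // below the diagonal
    }
}
```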
-
enumerator
Functions
-
poplar::Tensor
bsSoftmax
(poplar::Graph &graph, poplar::Tensor sparseTensor, const std::array<int, 2> &dim, const std::array<int, 2> &blockSize, const std::vector<unsigned char> &sparsity, SubBlockMask subBlockMaskType, poplar::program::Sequence &prog, const std::string &debugStr = "")¶ This function computes softmax on a sparse tensor.
- Parameters
graph
: The Poplar graph.sparseTensor
: The input sparse 2D tensor. It must be in a block-sparse format.dim[0]
: Number of rows of the original dense tensor.dim[1]
: Number of columns of the original dense tensor.blockSize[0]
: Block size of the rows.blockSize[1]
: Block size of the columns.sparsity
: The 2D sparsity mask for the block-sparse tensor, in which ‘1’ is a non zero block and ‘0’ is a zero block.subBlockMaskType
: Sub-block mask type. Elements in upper (or lower) triangle are filled by zeroes in the result.prog
: A reference to the program sequence to which the code to perform the softmax will be appended.
-
void
bsSoftmaxInPlace
(poplar::Graph &graph, poplar::Tensor sparseTensor, const std::array<int, 2> &dim, const std::array<int, 2> &blockSize, const std::vector<unsigned char> &sparsity, SubBlockMask subBlockMaskType, poplar::program::Sequence &prog, const std::string &debugStr = "")¶ This function computes softmax on a sparse tensor, in place.
- Parameters
graph
: The Poplar graph.sparseTensor
: The input sparse 2D tensor. It must be in a block-sparse format.dim[0]
: Number of rows of the original dense tensor.dim[1]
: Number of columns of the original dense tensor.blockSize[0]
: Block size of the rows.blockSize[1]
: Block size of the columns.sparsity
: The 2D sparsity mask for the block-sparse tensor, in which ‘1’ is a non zero block and ‘0’ is a zero block.subBlockMaskType
: Sub-block mask type. Elements in upper (or lower) triangle are filled by zeroes in the result.prog
: A reference to a program sequence which will be appended with the code to perform the softmax.
-
poplar::Tensor
bsSoftmaxGrad
(poplar::Graph &graph, poplar::Tensor sparseOut, poplar::Tensor sparseOutGrad, const std::array<int, 2> &dim, const std::array<int, 2> &blockSize, const std::vector<unsigned char> &sparsity, poplar::program::Sequence &prog, const std::string &debugStr = "")¶ This function computes softmax gradient on a sparse tensor.
- Parameters
graph
: The Poplar graphsparseOut
: The outer (activation) sparse 2D tensor. It must be in block-sparse format.sparseOutGrad
: The outer gradient sparse 2D tensor. It must be in a block-sparse format.dim[0]
: Number of rows of the original dense tensor.dim[1]
: Number of columns of the original dense tensor.blockSize[0]
: Block size of the rows.blockSize[1]
: Block size of the columns.sparsity
: The 2D sparsity mask for the block-sparse tensor, in which ‘1’ is a non zero block and ‘0’ is a zero block.prog
: A reference to a program sequence which will be appended with the code to perform the softmax.
-
enum
-
namespace
4.5.2. popsparse/BlockSparseMatMul.hpp¶
-
namespace
popsparse
Support for sparse matrices.
-
namespace
experimental
Functions
-
poplar::Tensor
createBSMatMulInputLHS
(poplar::Graph &graph, const BSMatMulParams &bsMatMul, const std::string &name)¶ Create a tensor for use as the left operand of block-sparse matrix multiplication.
- Return
For a non-grouped BSMatMulParams object, if the left matrix is dense the returned tensor is a regular 2D matrix; if it is sparse, the returned tensor is an array of non-zero blocks. For a grouped BSMatMulParams object, the returned tensor is the concatenation, along dimension 0, of the tensors for all matrices in the group.
- Parameters
graph
: The Poplar graph.bsMatMul
: The object for block-sparse information, includes the sparsity mask, the matrix size, the block size, and the data type.name
: The debug name of the created matrix.
-
poplar::Tensor
createBSMatMulInputRHS
(poplar::Graph &graph, const BSMatMulParams &bsMatMul, const std::string &name)¶ Create a tensor for use as the right operand of block-sparse matrix multiplication.
- Return
For a non-grouped BSMatMulParams object, if the right matrix is dense the returned tensor is a regular 2D matrix; if it is sparse, the returned tensor is an array of non-zero blocks. For a grouped BSMatMulParams object, the returned tensor is the concatenation, along dimension 0, of the tensors for all matrices in the group.
- Parameters
graph
: The Poplar graph.bsMatMul
: The object for block-sparse information, includes the sparsity mask, the matrix size, the block size, and the data type.name
: The debug name of the created matrix.
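For the sparse case, the returned tensor holds only the non-zero blocks selected by the sparsity mask. A host-side sketch of how the mask determines the element count (illustrative; the actual storage order is defined by the library):

```cpp
#include <cstddef>
#include <vector>

// Given a row-major sparsity mask over a grid of blocks, the sparse
// operand stores one blockRows * blockCols block per '1' in the mask.
std::size_t sparseElementCount(const std::vector<unsigned char> &sparsity,
                               std::size_t blockRows, std::size_t blockCols) {
  std::size_t nonZeroBlocks = 0;
  for (unsigned char b : sparsity)
    nonZeroBlocks += (b != 0);
  return nonZeroBlocks * blockRows * blockCols;
}
```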
-
class
BSMatMulParams
¶ - #include <BlockSparseMatMul.hpp>
This class supports block-sparse matrix multiplication.
The class only saves the sparsity mask, the matrix size, the block size, and the data type, which are used to generate the computation graph.
The matrix data is passed in when function
bsMatMul()
orbsUpdate()
gets called.The purpose of this design is to reuse the instance of this class when only the data of the matrix is changed, and the matrix sparsity does not change.
The current implementation uses Zoltan to generate the hypergraph partition across all tiles. Zoltan typically takes around 2 minutes for ~16k non-zero blocks, which is expensive if it has to run for every matrix multiplication.
The right matrix is always sparse, and the left matrix can be dense or sparse.
Public Functions
-
BSMatMulParams
(const std::array<int, 3> &dim, const std::array<int, 3> &blockSize, const std::vector<unsigned char> &rhsSparsity, bool rhsNeedTranspose, poplar::Type inDataType, poplar::Type outDataType, poplar::Type partialDataType, unsigned numGroupsIn = 1)¶ This constructor is for a dense matrix (left side) multiplying a sparse matrix (right side).
- Parameters
dim[0]
: Number of rows in the left-hand matrix.dim[1]
: Number of columns in the left-hand matrix.dim[2]
: If the right matrix needs to be transposed, this is the number of rows in the right-hand matrix. Otherwise, it is number of columns in the right-hand matrix.blockSize[0]
: Block size of the rows in the left-hand matrix.blockSize[1]
: Block size of the columns in the left-hand matrix.blockSize[2]
: Block size of the columns in the right-hand matrix.rhsSparsity
: The 2D sparsity mask for the right-hand block-sparse matrix, in which ‘1’ is a non-zero block and ‘0’ is a zero block. For a grouped operation this parameter is the concatenation of the sparsity masks for all operations in the group.rhsNeedTranspose
: Whether the right-hand matrix needs to be transposed. This is mostly to support the backward pass. If this parameter is true:dim and blockSize must conform to the transposed shape
rhsSparsity must be in the original, non-transposed order
rhsMatrix in bsMatMul() must contain data within blocks in the original, non-transposed order
inDataType
: Input data type.outDataType
: Output data type.partialDataType
: Partial data type.numGroupsIn
: The number of groups for a grouped operation, or 1 for a non-grouped operation.
-
BSMatMulParams
(const std::array<int, 3> &dim, const std::array<int, 3> &blockSize, const std::vector<unsigned char> &lhsSparsity, bool lhsNeedTranspose, const std::vector<unsigned char> &rhsSparsity, bool rhsNeedTranspose, poplar::Type inDataType, poplar::Type outDataType, poplar::Type partialDataType, unsigned numGroupsIn = 1)¶ This constructor is for a sparse matrix multiplied by a sparse matrix.
It is not supported.
-
BSMatMulParams
(const std::array<int, 3> &dim, const std::array<int, 3> &blockSize, const std::vector<unsigned char> &resSparsity, poplar::Type inDataType, poplar::Type outDataType, poplar::Type partialDataType, SubBlockMask subBlockMask = SubBlockMask::None, unsigned numGroupsIn = 1)¶ This constructor is for a dense matrix multiplying a dense matrix.
The multiply is performed as a sparse operation and the result stored as a sparse matrix.
- Parameters
dim[0]
: Number of rows in the left-hand matrix.
dim[1]
: Number of columns in the left-hand matrix.
dim[2]
: Number of columns in the right-hand matrix.
blockSize[0]
: Block size of the rows in the left-hand matrix.
blockSize[1]
: Block size of the columns in the left-hand matrix.
blockSize[2]
: Block size of the columns in the right-hand matrix. The block size of the columns in the left-hand matrix equals the block size of the rows in the right-hand matrix.
resSparsity
: The 2D sparsity mask for the result block-sparse matrix, in which ‘1’ is a non-zero block and ‘0’ is a zero block. For a group operation, this parameter is the concatenated sparsity masks of all operations in the group.
inDataType
: Input data type.
outDataType
: Output data type.
partialDataType
: Partial data type.
subBlockMask
: The mask inside a block. See SubBlockMask in BlockSparse.hpp for details.
numGroupsIn
: Number of groups for a group operation, or 1 for a non-group operation.
-
BSMatMulParams
(BSMatMulParams &&other)¶
-
~BSMatMulParams
()¶
-
-
poplar::Tensor
-
namespace
Note: in the API, the sparse weight matrix representing the parameters of the fully connected layer per group is W, with a dense shape of [outputChannelsPerGroup, inputChannelsPerGroup].
The equivalent dense operations for the different passes are given below, where each multiplication is per group.
Fwd/Inf: Ao = W * Ai
Where: Ao has shape [outputChannelsPerGroup, batchSize] and Ai has shape [inputChannelsPerGroup, batchSize]
GradA: Gi = W’ * Go
Where: Go has shape [outputChannelsPerGroup, batchSize] and Gi has shape [inputChannelsPerGroup, batchSize]
GradW: Gw = Go * Ai’
4.5.3. popsparse/FullyConnected.hpp¶
-
namespace
popsparse
Support for sparse matrices.
-
namespace
dynamic
¶ Support for dynamic sparse matrices.
Functions
-
SparseTensor
createFullyConnectedWeights
(poplar::Graph &graph, const poplar::Type &inputType, const FullyConnectedParams ¶ms, const std::string &debugName = "", const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Create a sparse tensor that is used as the weights W for a fully connected layer.
The following options are available:
availableMemoryProportion
Decimal between 0 and 1 [=0.6]The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation.
metaInfoBucketOversizeProportion
Decimal between 0 and 1 [=0.3]This specifies additional elements to allocate in each bucket of meta-information as a proportion of the required size for a perfectly uniformly distributed sparsity pattern.
doGradAPass
(true, false) [=false]doGradWPass
(true, false) [=false]Indicate which passes are present for the operation of the layer as a whole. It is assumed that the forward pass is always present.
partialsType
poplar::Type [=poplar::FLOAT]The type to use for partial results.
sharedBuckets
(true, false) [=true]If set, forces the same buckets to be used for all three passes.
- Return
A tensor with sparse representation of weights for the fully connected layer.
- Parameters
graph
: The Poplar graph.
inputType
: The type for inputs to the operation.
params
: Parameters for the fully connected layer.
debugName
: Optional prefix for all debug names added to the graph.
options
: Implementation options for the fully connected layer.
cache
: Optional pointer to planning cache to use.
-
poplar::Tensor
createFullyConnectedInput
(poplar::Graph &graph, const poplar::Type &inputType, const FullyConnectedParams ¶ms, const std::string &debugName = "", const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Create a dense tensor that is used as the input activations for a fully connected layer.
The returned tensor is of shape [batchSize, inputChannelsPerGroup].
- Parameters
graph
: The Poplar graph.
inputType
: The type for inputs to the operation.
params
: Parameters for the fully connected layer.
debugName
: Optional prefix for all debug names added to the graph.
options
: Implementation options for the fully connected layer. See createFullyConnectedWeights() for details.
cache
: Optional pointer to planning cache to use.
-
poplar::Tensor
fullyConnectedFwd
(poplar::Graph &graph, const SparseTensor &weights, const poplar::Tensor &activations, const FullyConnectedParams &fcParams, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Run a fully connected forward (or inference) pass.
The sparse-weights tensor is made up of meta information for the sparsity and the non-zero values. It performs the Fwd operation described in the Note above, but with input and output transposed.
The meta information for the sparse weights tensor must be created for the forward (or inference) pass and should be created by use of the createFullyConnectedWeights() function.
- Return
The tensor holding the result. This tensor will be created, added to the graph and mapped to tiles. The result tensor is of shape [batchSize][outputChannelsPerGroup * numGroups]
- Parameters
graph
: The Poplar graph.weights
: Sparsity information of the weights tensor.activations
: The dense activations have shape [batchSize][inputChannelsPerGroup * numGroups]fcParams
: Fully connected layer parameters.prog
: A reference to a program sequence which will be appended with the code to perform the forward operation.debugPrefix
: A debug prefix added to compute set and tensor names.options
: The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details.cache
: Optional pointer to planning cache to use.
-
poplar::Tensor
fullyConnectedGradA
(poplar::Graph &graph, const SparseTensor &weights, const poplar::Tensor &gradients, const FullyConnectedParams &fcParams, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Run a fully connected GradA pass.
The sparse-weights tensor is made up of meta information for the sparsity and the non-zero values. It performs the GradA computation described in the Note above, but with input and output transposed.
The meta information for the sparse-weights tensor must be created for the GradA pass and should be created by use of the createFullyConnectedWeights() function.
- Return
The tensor holding the result. This tensor will be created, added to the graph and mapped to tiles. The tensor is of shape [batchSize][inputChannelsPerGroup * numGroups]
- Parameters
graph
: The Poplar graph.weights
: Sparsity information of the weights tensor.gradients
: The dense loss gradients with respect to output activations and are of shape [batchSize][outputChannelsPerGroup] .fcParams
: Fully connected layer parameters.prog
: A reference to a program sequence which will be appended with the code to perform the GradA operation.debugPrefix
: A debug prefix added to compute set and tensor names.options
: The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details.cache
: Optional pointer to planning cache to use.
-
poplar::Tensor
fullyConnectedSparseGradW
(poplar::Graph &graph, const poplar::Tensor sparsityMetaInfo, const poplar::Tensor &gradA, const poplar::Tensor &activations, const FullyConnectedParams &fcParams, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)¶ Run a fully connected GradW pass to compute sparse gradients.
The layout of the returned tensor is exactly that of the representation of the weights' non-zero values, so that any elementwise operation may be done between the two.
The actual implementation differs from that in the Note above as the transpose of the gradients and activations are supplied as parameters to this function.
- Return
The tensor holding the result. This tensor will be created, added to the graph and mapped to tiles.
- Parameters
graph
: The Poplar graph.
sparsityMetaInfo
: Meta information for the sparse weights. See the SparseTensor representation.
gradA
: Dense gradients with respect to output activations, of shape [batchSize][outputChannelsPerGroup * numGroups].
activations
: Input activations, of shape [batchSize][inputChannelsPerGroup * numGroups].
fcParams
: Fully connected layer parameters.
prog
: A reference to a program sequence which will be appended with the code to perform the GradW operation.
debugPrefix
: A debug prefix added to compute set and tensor names.
options
: The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details.
cache
: Optional pointer to planning cache to use.
-
std::tuple<unsigned, unsigned, unsigned>
fullyConnectedDenseGradWSerialSplits
(const poplar::Graph &graph, const poplar::Type &inputType, const FullyConnectedParams &fcParams, const poplar::OptionFlags &options_ = {}, PlanningCache *cache = nullptr)¶ Report the serial splitting of a dense gradW output given the memory proportion limit given in options.
A dense gradW output is of shape [numGroups][inputSize][outputSize]
- Return
Serial splits for each of the output dimensions [numGroups][inputSize][outputSize].
- Parameters
graph
: The Poplar graph.
inputType
: The type of input.
fcParams
: Fully connected layer parameters.
options
: The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details.
cache
: Optional pointer to planning cache to use.
-
class
PlanningCache
¶
-
SparseTensor
-
namespace
4.5.4. popsparse/FullyConnectedParams.hpp¶
-
namespace
popsparse
Support for sparse matrices.
-
namespace
dynamic
Support for dynamic sparse matrices.
Functions
-
std::ostream &
operator<<
(std::ostream &os, const FullyConnectedParams &p)¶
-
class
FullyConnectedParams
¶ Fully connected parameters
These are the parameters which define a fully connected layer.
Matrix multiplications for the different passes are as follows
For pass = FC_INFERENCE or FC_TRAINING_FWD:
[numGroups][outputChannelsPerGroup][inputChannelsPerGroup] * [numGroups][inputChannelsPerGroup][batchSize]
For pass = FC_TRAINING_GRADA:
[numGroups][inputChannelsPerGroup][outputChannelsPerGroup] * [numGroups][outputChannelsPerGroup][batchSize]
For pass = FC_TRAINING_GRADW:
[numGroups][outputChannelsPerGroup][batchSize] * [numGroups][batchSize][inputChannelsPerGroup]
-
FullyConnectedParams
createWithNzRatio
(const SparsityParams &sparsityParams, double nzRatio, std::size_t batchSize, std::size_t numGroups, std::size_t inputChannels, std::size_t outputChannels)¶ Create parameters with the specified ratio of non-zero elements.
-
FullyConnectedParams
createWithNumNonZeroValues
(const SparsityParams &sparsityParams, std::size_t numNonZeroElems, std::size_t batchSize, std::size_t numGroups, std::size_t inputChannels, std::size_t outputChannels)¶ Create parameters with the specified number of non-zero elements.
Private Members
-
SparsityParams
sparsityParams
¶ Sparsity parameters.
-
double
nzRatio
¶ Proportion of weights which are non-zero in range [0,1].
Friends
-
friend bool
operator<
(const FullyConnectedParams &a, const FullyConnectedParams &b)¶
-
std::ostream &
-
namespace
4.5.5. popsparse/SparsePartitioner.hpp¶
-
namespace
popsparse
Support for sparse matrices.
-
template<typename
T
>
classPartitionerImpl
¶
-
namespace
dynamic
Support for dynamic sparse matrices.
-
template<typename
T
>
classPartitioner
¶ - #include <SparsePartitioner.hpp>
Class to translate and encode sparsity information for a fully connected layer.
See createFullyConnectedWeights() for details of the options.
Public Functions
-
const PartitionerImpl<T> &
getImpl
() const¶
-
Partitioner
(const FullyConnectedParams ¶ms, const poplar::Type &dataType, const poplar::Target &target, const poplar::OptionFlags &options, PlanningCache *cache = {})¶
-
~Partitioner
()¶
-
SparsityDataImpl<T>
createSparsityDataImpl
(const CSCMatrix<T> &matrix_) const¶ Create the implementation sparsity representation for a compressed sparse columns (CSC) matrix.
-
SparsityDataImpl<T>
createSparsityDataImpl
(const CSRMatrix<T> &matrix_) const¶ Create the implementation sparsity representation for a compressed sparse rows (CSR) matrix.
-
SparsityDataImpl<T>
createSparsityDataImpl
(const COOMatrix<T> &matrix_) const¶ Create the implementation sparsity representation for a coordinate (COO) format matrix.
-
COOMatrix<T>
sparsityDataImplToCOOMatrix
(const SparsityDataImpl<T> &sparsityDataImpl) const¶ Create a coordinate (COO) representation matrix from implementation sparsity representation.
The COO entries are ordered by row first, and then columns.
-
CSRMatrix<T>
sparsityDataImplToCSRMatrix
(const SparsityDataImpl<T> &sparsityDataImpl) const¶ Create compressed sparse rows (CSR) representation from implementation sparsity representation.
-
CSCMatrix<T>
sparsityDataImplToCSCMatrix
(const SparsityDataImpl<T> &sparsityDataImpl) const¶ Create compressed sparse columns (CSC) representation from implementation sparsity representation.
Private Members
-
std::unique_ptr<PartitionerImpl<T>>
impl
¶
-
const PartitionerImpl<T> &
-
template<typename
T
>
structSparsityDataImpl
¶ - #include <SparsePartitioner.hpp>
Encoding of sparsity representation.
-
template<typename
-
template<typename
4.5.6. popsparse/SparseStorageFormats.hpp¶
-
namespace
popsparse
Support for sparse matrices.
-
template<typename
T
>
structCOOMatrix
¶ - #include <SparseStorageFormats.hpp>
Sparse matrix stored as coordinate (COO) or triplets format.
-
template<typename
T
>
structCSCMatrix
¶ - #include <SparseStorageFormats.hpp>
Sparse matrix stored in compressed sparse columns (CSC) format for a matrix of size [M x N].
There is no explicit encoding of M in the storage. The number of column indices is equal to N + 1.
Public Functions
-
CSCMatrix
(const std::vector<T> &nzValues, const std::vector<std::size_t> &columnIndices, const std::vector<std::size_t> &rowIndices)¶
-
CSCMatrix
(std::vector<T> &&nzValues, std::vector<std::size_t> &&columnIndices, std::vector<std::size_t> &&rowIndices)¶
-
CSCMatrix
() = default¶
-
-
template<typename
T
>
structCSRMatrix
¶ - #include <SparseStorageFormats.hpp>
Sparse matrix stored in compressed sparse rows (CSR) format for a matrix of size [M x N].
There is no explicit encoding of N in the storage. The number of row indices is equal to M + 1.
Public Functions
-
CSRMatrix
(const std::vector<T> &nzValues, const std::vector<std::size_t> &columnIndices, const std::vector<std::size_t> &rowIndices)¶
-
CSRMatrix
(std::vector<T> &&nzValues, std::vector<std::size_t> &&columnIndices, std::vector<std::size_t> &&rowIndices)¶
-
CSRMatrix
() = default¶
-
-
template<typename
4.5.7. popsparse/SparseTensor.hpp¶
-
namespace
popsparse
Support for sparse matrices.
-
namespace
dynamic
Support for dynamic sparse matrices.
-
class
SparseTensor
¶ - #include <SparseTensor.hpp>
Representation of a sparse tensor.
-
class
-
namespace
4.5.8. popsparse/SparsityParams.hpp¶
-
namespace
popsparse
Support for sparse matrices.
-
namespace
dynamic
Support for dynamic sparse matrices.
Enums
Functions
-
std::ostream &
operator<<
(std::ostream &os, const SparsityType &t)¶
-
std::ostream &
operator<<
(std::ostream &os, const SparsityStructure &s)¶
-
struct
SparsityParams
¶ Public Functions
-
SparsityParams
(SparsityType type = SparsityType::Element, SparsityStructure structure = SparsityStructure::Unstructured)¶
-
SparsityParams
(const SparsityParams&) = default¶
Friends
-
friend bool
operator<
(const SparsityParams &a, const SparsityParams &b)¶
-
friend std::ostream &
operator<<
(std::ostream &os, const SparsityParams &p)¶
-
-
std::ostream &
-
namespace
4.6. Neural network functions (popnn)¶
Functions used in neural networks (for example, non-linearities, pooling, loss functions).
4.6.1. popnn/BatchNorm.hpp¶
-
namespace
popnn
¶ Functions used in neural networks.
-
namespace
bn
¶ Functions
-
std::pair<poplar::Tensor, poplar::Tensor>
batchNormStatistics
(poplar::Graph &graph, const poplar::Tensor acts, float eps, poplar::program::Sequence &prog, bool unbiasedVarEstimate, bool stableAlgo = false, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Estimate mean and inverse of standard deviation of batched activations.
-
poplar::Tensor
batchNormWhiten
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Whiten activations given mean and standard deviation.
-
std::pair<poplar::Tensor, poplar::Tensor>
batchNormalise
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gamma, const poplar::Tensor &beta, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Batch normalise activations given mean, standard deviation and batch norm parameters.
The result is two tensors
normalised activations
whitened activations
-
poplar::Tensor
batchNormalise
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &combinedMultiplicand, const poplar::Tensor &addend, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Computes the output of batch normalisation given:
combinedMultiplicand = gamma / stdDev
addend = beta - gamma * mean / stdDev
-
std::pair<poplar::Tensor, poplar::Tensor>
batchNormParamGradients
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &iStdDev, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t parameters required for parameter update.
-
std::pair<poplar::Tensor, poplar::Tensor>
batchNormParamGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t parameters required for parameter update.
-
poplar::Tensor
batchNormGradients
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t input activations for the batch norm layer.
i.e. gradients are propagated through the complete layer including statistics computation.
-
poplar::Tensor
batchNormGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t input activations for the batch norm layer.
i.e. gradients are propagated through the complete layer including statistics computation.
-
void
batchNormParamUpdate
(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, float scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
batchNormParamUpdate
(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, const poplar::Tensor &scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
std::pair<poplar::Tensor, poplar::Tensor>
-
namespace
4.6.2. popnn/GroupNorm.hpp¶
-
namespace
popnn
Functions used in neural networks.
-
namespace
gn
¶ Functions
-
std::pair<poplar::Tensor, poplar::Tensor>
groupNormStatistics
(poplar::Graph &graph, const poplar::Tensor acts, float eps, poplar::program::Sequence &prog, unsigned numGroups, bool unbiasedVarEstimate, bool stableAlgo = false, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Estimate mean and inverse of standard deviation of activations.
-
poplar::Tensor
groupNormWhiten
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Whiten activations given mean and standard deviation.
-
std::pair<poplar::Tensor, poplar::Tensor>
groupNormalise
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gamma, const poplar::Tensor &beta, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Group normalise activations given mean, standard deviation and batch norm parameters.
The result is two tensors
normalised activations
whitened activations
-
std::pair<poplar::Tensor, poplar::Tensor>
groupNormParamGradients
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &iStdDev, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t parameters for parameter update.
-
std::pair<poplar::Tensor, poplar::Tensor>
groupNormParamGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t parameters for parameter update.
-
poplar::Tensor
groupNormGradients
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t input activations for the group norm layer.
Gradients are propagated through the complete layer including statistics computation.
-
poplar::Tensor
groupNormGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t input activations for the group norm layer.
Gradients are propagated through the complete layer including statistics computation.
-
void
groupNormParamUpdate
(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, float scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
void
groupNormParamUpdate
(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, const poplar::Tensor &scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
std::pair<poplar::Tensor, poplar::Tensor>
-
namespace
4.6.3. popnn/Gru.hpp¶
-
namespace
popnn
Functions used in neural networks.
-
namespace
gru
¶ Functions
-
poplar::Tensor
createInput
(poplar::Graph &graph, const GruParams ¶ms, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create an input tensor of shape [numSteps, batchSize, inputSize] which is optimally mapped to multiply the whole input sequence in a single matrix multiply operation.
GRU options
availableMemoryProportion
Decimal between 0 and 1 (inclusive)See createWeights().
inferenceOnly
(true, false) [=true]Sets convolution pass to INFERENCE_FWD if true; TRAINING_FWD otherwise. See createWeights().
partialsType
(half, float) [=float]See createWeights().
- Return
A tensor created in the graph of shape: [timeSteps, batchSize, inputSize]
- Parameters
graph
: Graph objectparams
: The GRU parametersname
: String annotationoptions
: Any implementation/debug options for the GRUplanningCache
: A poplin matrix multiply planning cache
-
poplar::Tensor
createInitialState
(poplar::Graph &graph, const GruParams ¶ms, const std::string &debugPrefix, const poplar::OptionFlags &options, poplin::matmul::PlanningCache *cache)¶
-
std::pair<poplar::Tensor, poplar::Tensor>
createWeightsKernel
(poplar::Graph &graph, const GruParams ¶ms, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the weights kernel used to weight the input and output of a GRU.
Returns the inputWeights and outputWeights.
-
poplar::Tensor
createWeightsBiases
(poplar::Graph &graph, const GruParams ¶ms, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the weights biases.
-
GruWeights
createWeights
(poplar::Graph &graph, const GruParams ¶ms, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the weights (both kernel and biases) used to weight the input and output of a GRU.
-
poplar::Tensor
gruFwd
(poplar::Graph &graph, const GruParams ¶ms, const poplar::Tensor &stateInit, const poplar::Tensor &in, const GruWeights &weights, poplar::Tensor *intermediates, poplar::program::Sequence &fwdProg, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Calculate the result of applying a GRU across a sequence.
The following are the formulas for a GRU cell:
r_t = sigmoid(w_r * x_t + u_r * h_t-1 + b_r)
u_t = sigmoid(w_u * x_t + u_u * h_t-1 + b_u)
c_t = tanh(w_c * x_t + u_c * (r_t x h_t-1) + b_c)
h_t = u_t x h_t-1 + (1 - u_t) x c_t
Where:
* is matrix multiplication
x is Hadamard product
The GRU is run for seqSize steps, each with a batch of size batchSize, input size inputSize, and output size outputSize. The total number of units within each GRU cell is
BASIC_GRU_CELL_NUM_UNITS
.- Return
The output of the GRU. Depending on the outputFullSequence parameter the output tensor is either the output of the last timestep in the shape [batch, outputSize] or it is the sequence of outputs for every timestep in the shape [timesteps, batch, outputSize]
- Parameters
graph
: Graph to which the GRU cell belongs.params
: The parameters of the GRU.stateInit
: Initial state for the GRU.in
: The input tensor to the GRU of dimension [timesteps, batch, inputSize].weights
: The GRU weights structure.[out] intermediates
: Intermediate results that are retained in the forward pass of training for use in the backward pass. It includes the data for the reset gate, update gate, candidate, and the output if outputFullSequence is false. This argument should be set to null if only inference is being performed.fwdProg
: Program sequence.debugPrefix
: String used as prefix for compute sets.options
: GRU implementation options. See createInput().planningCache
: The matmul planning cache.
-
poplar::Tensor
gruBwd
(poplar::Graph &graph, const GruParams ¶ms, poplar::program::Sequence &prog, const poplar::Tensor &fwdOutputInit, const poplar::Tensor &fwdIntermediatesSeq, const GruWeights &weights, const poplar::Tensor &fwdInputSeq, const poplar::Tensor &fwdOutput, const poplar::Tensor &gradLayerNext, poplar::Tensor *inputGrad, poplar::Tensor *bwdIntermediates, const std::string &debugPrefix, const poplar::OptionFlags &options_, poplin::matmul::PlanningCache *planningCache)¶ Run GRU backward pass.
The backward pass executes in reverse order compared to the forward pass. If the forward steps for a GRU layer are sf = {0, 1, 2, …, S - 1} then the backward steps run as sb = {S - 1, S - 2, …, 1, 0}.
- Return
The gradient of the initial output.
- Parameters
graph
: Graph to which the GRU cell belongs.params
: The parameters of the GRU.prog
: Program sequence.fwdOutputInit
: Forward output tensor for initial step.fwdIntermediatesSeq
: Intermediate results from the forward pass.weights
: The GRU weights structure.fwdInputSeq
: The input tensor to the GRU of shape: [timesteps, batch, inputSize]fwdOutput
: The output tensor from the forward pass. Depending on the outputFullSequence parameter this is either the output for the last timestep or it is a sequence of outputs for each timestep.gradLayerNext
: The gradients of the output. Depending on the outputFullSequence parameter this is either the gradient of the output for the last timestep or it is a sequence of output gradients for each timestep.[out] *inputGrad
: The gradients of the inputs - may be null if this information is not required.[out] *bwdIntermediates
: Intermediates gradients that are retained in the backward pass of training for use in the weight update. It includes the derivatives for reset gate, update gate, and candidate. This argument should be set to null if you do not need to calculate weight deltas.debugPrefix
: String used as prefix for compute sets.options
: GRU implementation options. See createInput().planningCache
: The matmul planning cache.
-
GruWeights
gruWU
(poplar::Graph &graph, const GruParams ¶ms, poplar::program::Sequence &prog, const poplar::Tensor &fwdOutputInit, const poplar::Tensor &fwdIntermediates, const poplar::Tensor &bwdIntermediates, const GruWeights &weights, const poplar::Tensor &input, const poplar::Tensor &output, const std::string &debugPrefix, const poplar::OptionFlags &options_, poplin::matmul::PlanningCache *planningCache)¶ Run a standalone weight update pass.
Takes intermediates and gradients from the backward pass and calculates and returns weight deltas.
- Return
A set of weight gradients to sum with weights.
- Parameters
graph
: Graph to which the GRU cell belongs.params
: The parameters of the GRU.prog
: Program sequence to add operations to.fwdOutputInit
: Forward output tensor for initial step.fwdIntermediates
: Intermediate results from the forward pass.bwdIntermediates
: Intermediate results from the backward pass.weights
: The GRU weights structure.input
: The input tensor to the GRU of shape: [timesteps, batch, inputSize]output
: The output tensor from the forward pass. Depending on the outputFullSequence parameter this is either the output for the last timestep or it is a sequence of outputs for each timestep.debugPrefix
: String used as a prefix to compute sets and tensors added to the graph.options
: GRU implementation options. See createInput().planningCache
: The matmul planning cache.
-
poplar::Tensor
gruBwdWithWU
(poplar::Graph &graph, const GruParams ¶ms, poplar::program::Sequence &prog, const poplar::Tensor &fwdOutputInit, const poplar::Tensor &fwdIntermediates, const GruWeights &weights, const poplar::Tensor &input, const poplar::Tensor &output, const poplar::Tensor &outputGrad, poplar::Tensor *inputGrad, GruWeights &weightsGrad, const std::string &debugPrefix, const poplar::OptionFlags &options_, poplin::matmul::PlanningCache *planningCache)¶ Run a combined GRU backward and weight update pass.
Use this combined backward and weight update pass in preference to
gruBwd
andgruWU
separately in order to allow the most efficient implementation to be chosen if you do not need to split the operation.- Return
The gradient of the initial output.
- Parameters
graph
: Graph to which the GRU cell belongs.params
: The parameters of the GRU.prog
: Program sequence.fwdOutputInit
: Forward output tensor for initial step.fwdIntermediates
: Intermediate results from the forward pass.weights
: The GRU weights structure.input
: The input tensor to the GRU of shape: [timesteps, batch, inputSize]output
: The output tensor from the forward pass. Depending on the outputFullSequence parameter this is either the output for the last timestep or it is a sequence of outputs for each timestep.outputGrad
: The gradients of the output. Depending on the outputFullSequence parameter this is either the gradient of the output for the last timestep or it is a sequence of output gradients for each timestep.[out] *inputGrad
: The gradients of the inputs - may be null if this information is not required.weightsGrad
: A set of weight deltas to sum with weights.debugPrefix
: String used as prefix for compute sets.options
: GRU implementation options. See createInput().planningCache
: The matmul planning cache.
-
struct
GruParams
¶ - #include <Gru.hpp>
Structure representing the parameters of the GRU.
-
struct
GruWeights
¶ - #include <Gru.hpp>
Structure holding all the parameters of a GRU cell, or the deltas for those parameters (depending on the context).
4.6.4. popnn/GruDef.hpp¶
Enums
-
enum
BasicGruCellUnit
¶ The units within a basic GRU cell.
In general all of these require a weight matrix, a bias and a non-linearity. Typically, a fixed type of non-linearity is associated with each type of unit.
Values:
-
enumerator
BASIC_GRU_CELL_RESET_GATE
¶
-
enumerator
BASIC_GRU_CELL_UPDATE_GATE
¶
-
enumerator
BASIC_GRU_CELL_CANDIDATE
¶
-
enumerator
BASIC_GRU_CELL_NUM_UNITS
¶
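As an illustration of how these units combine, here is a scalar, host-side sketch of one GRU step. This is reference math only, not the popnn API; the weight and bias names are made up, and gate conventions vary between implementations.

```cpp
#include <cmath>

// Scalar sketch of one GRU step in terms of the units above
// (illustrative names; one common convention):
//   r  = sigmoid(wr*x + ur*h + br)    // BASIC_GRU_CELL_RESET_GATE
//   u  = sigmoid(wu*x + uu*h + bu)    // BASIC_GRU_CELL_UPDATE_GATE
//   c  = tanh(wc*x + uc*(r*h) + bc)   // BASIC_GRU_CELL_CANDIDATE
//   h' = u*h + (1 - u)*c
float sigmoidRef(float z) { return 1.f / (1.f + std::exp(-z)); }

float gruStepRef(float x, float h, float wr, float ur, float br,
                 float wu, float uu, float bu, float wc, float uc, float bc) {
  const float r = sigmoidRef(wr * x + ur * h + br);
  const float u = sigmoidRef(wu * x + uu * h + bu);
  const float c = std::tanh(wc * x + uc * (r * h) + bc);
  return u * h + (1.f - u) * c;
}
```

With all weights and biases zero, both gates evaluate to 0.5 and the candidate to 0, so the new state is half the old state.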
4.6.5. popnn/InstanceNorm.hpp¶
-
namespace
popnn
Functions used in neural networks.
-
namespace
in
¶ Functions
-
std::pair<poplar::Tensor, poplar::Tensor>
instanceNormStatistics
(poplar::Graph &graph, const poplar::Tensor acts, float eps, poplar::program::Sequence &prog, bool unbiasedVarEstimate, bool stableAlgo, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Estimate mean and inverse of standard deviation of activations.
-
poplar::Tensor
instanceNormWhiten
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Whiten activations given mean and standard deviation.
-
std::pair<poplar::Tensor, poplar::Tensor>
instanceNormalise
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gamma, const poplar::Tensor &beta, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Instance normalise activations given mean, standard deviation and norm parameters.
The result is two tensors:
normalised activations
whitened activations
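The underlying per-element math can be sketched on the host as follows; the `whitenRef` and `normaliseRef` helper names are hypothetical, not part of popnn.

```cpp
#include <vector>

// Whitening: subtract the mean and multiply by the inverse standard
// deviation, per element of a channel.
std::vector<float> whitenRef(const std::vector<float> &acts, float mean,
                             float invStdDev) {
  std::vector<float> out;
  for (float x : acts)
    out.push_back((x - mean) * invStdDev);
  return out;
}

// Normalisation: scale the whitened activations by gamma and shift by beta.
std::vector<float> normaliseRef(const std::vector<float> &whitened,
                                float gamma, float beta) {
  std::vector<float> out;
  for (float w : whitened)
    out.push_back(gamma * w + beta);
  return out;
}
```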
-
std::pair<poplar::Tensor, poplar::Tensor>
instanceNormParamGradients
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &iStdDev, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t parameters for parameter update.
-
std::pair<poplar::Tensor, poplar::Tensor>
instanceNormParamGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t parameters for parameter update.
-
poplar::Tensor
instanceNormGradients
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t input activations for the instance norm layer.
Gradients are propagated through the complete layer including statistics computation.
-
poplar::Tensor
instanceNormGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t input activations for the instance norm layer.
Gradients are propagated through the complete layer including statistics computation.
-
void
instanceNormParamUpdate
(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, float scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Update parameters given gradients w.r.t. parameters.
-
void
instanceNormParamUpdate
(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, const poplar::Tensor &scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
-
uint64_t
getFwdFlops
(uint64_t numChannels, uint64_t actsPerChannel, bool computeEstimates)¶ In flop computation, the following applies:
Acts per channel:
for fc layers: the total number of batches.
for conv layers: the field size per channel * batch size.
Number of channels:
for fc layers: the total number of activations in a batch.
for conv layers: the total number of channels.
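For example, the arguments would be derived as follows for a hypothetical convolution layer and a hypothetical fully connected layer (all layer sizes here are made up for illustration):

```cpp
#include <cstdint>

// Convolution layer: 28x28 field, batch size 8, 64 channels.
const uint64_t convActsPerChannel = 28 * 28 * 8; // field size per channel * batch size
const uint64_t convNumChannels = 64;             // total number of channels

// Fully connected layer: 1024 activations in a batch, batch size 8.
const uint64_t fcActsPerChannel = 8;             // total number of batches
const uint64_t fcNumChannels = 1024;             // total activations in a batch
```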
-
uint64_t
getBwdFlops
(uint64_t numChannels, uint64_t actsPerChannel)¶
-
uint64_t
getWuFlops
(uint64_t numChannels, uint64_t actsPerChannel)¶
4.6.6. popnn/LayerNorm.hpp¶
-
namespace
popnn
Functions used in neural networks.
-
namespace
ln
¶ Functions
-
std::pair<poplar::Tensor, poplar::Tensor>
layerNormStatistics
(poplar::Graph &graph, const poplar::Tensor acts, float eps, poplar::program::Sequence &prog, bool unbiasedVarEstimate, bool stableAlgo = false, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Estimate mean and inverse of standard deviation of activations.
-
poplar::Tensor
layerNormWhiten
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Whiten activations given mean and standard deviation.
-
std::pair<poplar::Tensor, poplar::Tensor>
layerNormalise
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gamma, const poplar::Tensor &beta, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Layer normalise activations given mean, standard deviation and norm parameters.
The result is two tensors:
normalised activations
whitened activations
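The statistics fed into the functions above can be sketched on the host like this; `normStatisticsRef` is a hypothetical helper illustrating the math of `layerNormStatistics`, not the popnn implementation.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Returns the mean and the inverse standard deviation of the
// activations. eps is added to the variance for numerical stability;
// unbiasedVarEstimate selects the N-1 divisor.
std::pair<float, float> normStatisticsRef(const std::vector<float> &acts,
                                          float eps,
                                          bool unbiasedVarEstimate) {
  const float n = static_cast<float>(acts.size());
  float mean = 0.f;
  for (float x : acts)
    mean += x;
  mean /= n;
  float sumSq = 0.f;
  for (float x : acts)
    sumSq += (x - mean) * (x - mean);
  const float divisor = unbiasedVarEstimate ? n - 1.f : n;
  const float invStdDev = 1.f / std::sqrt(sumSq / divisor + eps);
  return {mean, invStdDev};
}
```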
-
std::pair<poplar::Tensor, poplar::Tensor>
layerNormParamGradients
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &iStdDev, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t parameters for parameter update.
-
std::pair<poplar::Tensor, poplar::Tensor>
layerNormParamGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t parameters for parameter update.
-
poplar::Tensor
layerNormGradients
(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t input activations for the layer norm layer.
Gradients are propagated through the complete layer including statistics computation.
-
poplar::Tensor
layerNormGradients
(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Compute gradients w.r.t input activations for the layer norm layer.
Gradients are propagated through the complete layer including statistics computation.
-
void
layerNormParamUpdate
(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, float scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Update layer norm parameters given the gradients w.r.t. parameters.
-
void
layerNormParamUpdate
(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, const poplar::Tensor &scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶
4.6.7. popnn/Loss.hpp¶
-
namespace
popnn
Functions used in neural networks.
Functions
-
poplar::program::Program
calcLoss
(poplar::Graph &graph, const poplar::Tensor &modelOutputs, const poplar::Tensor &expected, const poplar::Tensor &loss, const poplar::Tensor &deltas, const poplar::Tensor &deltasScale, const poplar::Tensor &modelOutputScaling, LossType lossType, const std::string &debugPrefix = "")¶ Calculate loss and gradient for a set of activations and expected labels.
- Parameters
graph
: Graph to add operations and tensors to.modelOutputs
: 2D tensor of model outputs per-batch to calculate loss for.expected
: One-hot encoded tensor (Labels per-batch) with the same number of rows as modelOutputs. Elements of the expected labels may be masked by using MASKED_LABEL_CODE. Such labels will not contribute to loss calculation.loss
: 1D Tensor to store the loss per-batch. Has the same number of rows as modelOutputs.deltas
: 2D Tensor to store deltas for each activation from the expected per-batch. Has the same dimensions as modelOutputs.deltasScale
: Optional Tensor to scale output deltas with when the lossType is CROSS_ENTROPY_LOSS. Scaling will be deltasScale / modelOutputScaling. If no tensor is specified a default will be created initialised with 1.0.modelOutputScaling
: Optional Tensor indicating the scaling of the modelOutputs when lossType is CROSS_ENTROPY_LOSS, normally from a softMax layer when the nonLinearity used is SOFTMAX_SCALED. If no tensor is specified a default will be created initialised with 1.0.lossType
: Method for calculating loss measurement.debugPrefix
: Optional debug prefix for operations and tensors for this operation.
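The per-row loss for the cross-entropy case can be sketched on the host as below. This assumes modelOutputs already hold normalised probabilities and that expected holds the index of the correct class per row; masking via MASKED_LABEL_CODE and the deltas/scaling tensors are omitted, and `crossEntropyLossRef` is a hypothetical helper, not the popnn API.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// For each batch row, the cross-entropy loss of a one-hot label is
// -log(probability assigned to the expected class).
std::vector<float>
crossEntropyLossRef(const std::vector<std::vector<float>> &modelOutputs,
                    const std::vector<std::size_t> &expected) {
  std::vector<float> loss;
  for (std::size_t row = 0; row < modelOutputs.size(); ++row)
    loss.push_back(-std::log(modelOutputs[row][expected[row]]));
  return loss;
}
```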
-
poplar::program::Program
calcLoss
(poplar::Graph &graph, const poplar::Tensor &modelOutputs, const poplar::Tensor &expected, const poplar::Tensor &loss, const poplar::Tensor &deltas, LossType lossType, const std::string &debugPrefix = "")¶
-
poplar::program::Program
calcLoss
(poplar::Graph &graph, const poplar::Tensor &modelOutputs, const poplar::Tensor &expected, const poplar::Tensor &loss, const poplar::Tensor &deltas, const poplar::Tensor &deltasScale, const poplar::Tensor &modelOutputScaling, const poplar::Tensor &numCorrect, LossType lossType, const std::string &debugPrefix = "")¶ Calculate loss, gradient, and number of correct classifications per-batch for a set of activations and expected labels.
Elements of the expected labels may be masked by using MASKED_LABEL_CODE. Such labels will not contribute to the accuracy and loss calculation.
- See
calcLoss
, andcalcAccuracy
which this function is simply a combination of.
-
poplar::program::Program
calcLoss
(poplar::Graph &graph, const poplar::Tensor &modelOutputs, const poplar::Tensor &expected, const poplar::Tensor &loss, const poplar::Tensor &deltas, const poplar::Tensor &numCorrect, LossType lossType, const std::string &debugPrefix = "")¶
-
poplar::program::Program
calcAccuracy
(poplar::Graph &graph, const poplar::Tensor &modelOutputs, const poplar::Tensor &expected, const poplar::Tensor &numCorrect, const std::string &debugPrefix = "")¶ Calculate the number of correct classifications for a set of activations and expected labels.
- Parameters
graph
: Graph to add operations and tensors to.modelOutputs
: 2D tensor of model outputs per-batch to calculate loss for.expected
: Labels per-batch. Elements of the expected labels may be masked by using MASKED_LABEL_CODE. Such labels will not contribute to the accuracy calculation.numCorrect
: Tensor to store the number of correct classifications. Must be scalar, or single-element Tensor.activationType
: Device type used for activations.expectedType
: Device type used for expected labels.debugPrefix
: Optional debug prefix for operations and tensors for this operation.
-
poplar::Tensor
argMax
(poplar::Graph &graph, const poplar::Tensor &input, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Compute argmax for each of the outer dimensions of
input
tensor.If
input
is a tensor of dim [y][x] then argmax is computed over the x elements for each of the y outer-dimension elements.- Parameters
graph
: Graph to add operations and tensors to.input
: 2D tensor of inputsprog
: Program to which the graph for this operation is addeddebugPrefix
: Optional debug prefix for operations and tensors for this operation.
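The semantics can be sketched on the host like this; `argMaxRef` is a hypothetical reference helper, not the device operation.

```cpp
#include <cstddef>
#include <vector>

// For a [y][x] input, return the index of the maximum element of each
// of the y rows.
std::vector<std::size_t>
argMaxRef(const std::vector<std::vector<float>> &input) {
  std::vector<std::size_t> result;
  for (const auto &row : input) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < row.size(); ++i)
      if (row[i] > row[best])
        best = i;
    result.push_back(best);
  }
  return result;
}
```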
-
poplar::Tensor
argMin
(poplar::Graph &graph, const poplar::Tensor &input, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Compute argmin for each of the outer dimensions of
input
tensor.If
input
is a tensor of dim [y][x] then argmin is computed over the x elements for each of the y outer-dimension elements.- Parameters
graph
: Graph to add operations and tensors to.input
: 2D tensor of inputsprog
: Program to which the graph for this operation is addeddebugPrefix
: Optional debug prefix for operations and tensors for this operation.
-
poplar::Tensor
topK
(poplar::Graph &graph, const poplar::Tensor &input, poplar::Tensor &indices, unsigned K, bool sort, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Find the top K elements of the input tensor.
Takes a 2D tensor of the form [batch][values] and returns a tensor of shape [batch][K] containing the K largest values of each batch.
- Parameters
graph
: Graph to add operations and tensors to.input
: 2D tensor of inputsindices
: The tensor to store the indices in.K
: The number of values to return.sort
: If true values will be sorted in descending order.prog
: Program to which the graph for this operation is addeddebugPrefix
: Optional debug prefix for operations and tensors for this operation.
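For one batch row, the result can be sketched on the host as follows; `topKRowRef` is a hypothetical reference helper and always returns the values sorted descending (with sort set to false the device operation leaves the order of the K values unspecified).

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Keep the K largest values of one batch row, sorted in descending order.
std::vector<float> topKRowRef(std::vector<float> values, unsigned K) {
  std::partial_sort(values.begin(), values.begin() + K, values.end(),
                    std::greater<float>());
  values.resize(K);
  return values;
}
```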
4.6.8. popnn/Lstm.hpp¶
-
namespace
popnn
Functions used in neural networks.
-
namespace
lstm
¶ Functions
-
std::vector<std::pair<poplin::MatMulParams, poplar::OptionFlags>>
getMatMulPrePlanParameters
(LstmParams params, poplar::OptionFlags opts)¶ Predict what matrix multiplications will be needed for the given parameters and return a list of corresponding matmul parameters and options.
-
uint64_t
getBasicLstmCellFwdFlops
(const LstmParams &params)¶
-
uint64_t
getBasicLstmCellBwdFlops
(const LstmParams &params)¶
-
uint64_t
getBasicLstmCellWuFlops
(const LstmParams &params)¶
-
poplar::Tensor
createInput
(poplar::Graph &graph, const LstmParams &params, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create an input tensor of shape {numSteps, batchSize, inputSize} which is optimally mapped to multiply the whole input sequence in a single matrix multiply operation.
LSTM options
availableMemoryProportion
Decimal between 0 and 1 (inclusive)inferenceOnly
(true, false) [=false]Sets convolution pass to INFERENCE_FWD if true; TRAINING_FWD otherwise. See poplin::createWeights().
partialsType
(half, float) [=float]weightAccumulatorsType
(half, float) [=data type of lstm]Data type of the weight accumulators for the LSTM's weight matrices and biases.
preCalcWeights
(true, false) [=false]If true, use one big matrix multiply before the recurrent calculation to perform the part of the calculation that only depends on the input sequence.
recomputationMode
(none, cellAndTanh, full) [=none]none: No recomputation in the backwards pass.
cellAndTanh: Small amount of recomputation in the backwards pass, yielding some reduction in memory footprint for the layer.
full: Recompute everything from the forward pass. Saves the most memory at the cost of an extra forward pass of cycles.
- Return
A tensor created in the graph of shape {timeSteps, batchSize, inputSize}.
- Parameters
graph
: Graph object.params
: The LSTM parameters.name
: String annotation.options
: Any implementation/debug options for the LSTM.planningCache
: A poplin matrix multiply planning cache.
-
poplar::Tensor
createInitialOutput
(poplar::Graph &graph, const LstmParams &params, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the initial output that can be combined with the initial cell state using an LstmState.
This then can be fed into the LSTM call at the first timestep.
- Return
A tensor which is the cell state for the forward operation of the LSTM cell.
- Parameters
graph
: Graph object.params
: The LSTM parameters.name
: String annotation.options
: Any implementation/debug options for the LSTM. See createInput().planningCache
: A poplin matrix multiply planning cache.
-
poplar::Tensor
createInitialCellState
(poplar::Graph &graph, const LstmParams &params, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the initial cell state that can be combined with the initial output using an LstmState.
This then can be fed into the LSTM call at the first timestep.
- Return
A tensor which is the cell state for the forward operation of the LSTM cell.
- Parameters
graph
: Graph object.params
: The LSTM parameters.name
: String annotation.options
: Any implementation/debug options for the LSTM. See createInput().planningCache
: A poplin matrix multiply planning cache.
-
LstmState
createInitialState
(poplar::Graph &graph, const LstmParams &params, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Creates the initial state (both output and cellState) that is fed into the LSTM call at the first timestep.
It can be initialised by writing the appropriate member or using zeroInitialState().
- Return
A tensor which is the state for the forward operation of the LSTM cell.
- Parameters
graph
: Graph object.params
: The LSTM parameters.name
: String annotation.options
: Any implementation/debug options for the LSTM. See createInput().planningCache
: A poplin matrix multiply planning cache.
-
void
zeroInitialState
(poplar::Graph &graph, const LstmState &initialState, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Initialize the forward state of an LSTM with zeros.
- Parameters
graph
: Graph object.initialState
: The initial state to zero.prog
: The program to extend with the initialization code.debugPrefix
: A debug string to prepend to debug identifiers in the added code.
-
std::pair<poplar::Tensor, poplar::Tensor>
createWeightsKernel
(poplar::Graph &graph, const LstmParams &params, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the weights kernel used to weight the input of an LSTM.
Returns the inputWeights and outputWeights.
-
poplar::Tensor
createWeightsBiases
(poplar::Graph &graph, const LstmParams &params, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the weights biases.
-
LstmWeights
createWeights
(poplar::Graph &graph, const LstmParams &params, const std::string &name, const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the weights (both kernel and biases) used to weight the input of an LSTM.
-
std::pair<poplar::Tensor, poplar::Tensor>
lstmFwd
(poplar::Graph &graph, const LstmParams &params, const LstmState &stateInit, const poplar::Tensor &in, const LstmWeights &weights, poplar::Tensor *intermediates, poplar::program::Sequence &fwdProg, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Calculate the result of applying an LSTM across a sequence.
The LSTM runs for seqSize steps, each with a batch of size batchSize, an input size of inputSize and an output size of outputSize. The total number of units within each LSTM cell is lstmUnits = BASIC_LSTM_CELL_NUM_UNITS.
- Return
The output of the LSTM and the final cell state.
Depending on the outputFullSequence parameter the output tensor is either the output of the last timestep in the shape [batch, outputSize] or it is the sequence of outputs for every timestep in the shape [timesteps, batch, outputSize].
- Parameters
graph
: Graph to which the LSTM cell belongs.params
: The parameters of the LSTM.stateInit
: Initial state for the LSTM.in
: The input tensor to the LSTM of dimension [timesteps, batch, inputSize].weights
: The LSTM weights structure.[out] intermediates
: Intermediate results that are retained in the forward pass of training for use in the backward pass. This argument should be set to null if only inference is required.fwdProg
: Program sequence.debugPrefix
: String used as prefix for compute sets.options
: LSTM implementation options. See createInput().planningCache
: The matmul planning cache.
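A single cell step of the recurrence can be sketched on the host in scalar form, in terms of the four units named by BasicLstmCellUnit. This is reference math only, not the popnn API; the pre-activation inputs zf, zi, zc, zo stand in for the weighted sums the device computes.

```cpp
#include <cmath>

// One LSTM cell step (common convention):
//   f = sigmoid(zf); i = sigmoid(zi); cand = tanh(zc); o = sigmoid(zo)
//   c' = f*c + i*cand;  h' = o*tanh(c')
struct LstmStepRef { float h, c; };

float sigmoidRef(float z) { return 1.f / (1.f + std::exp(-z)); }

LstmStepRef lstmStepRef(float zf, float zi, float zc, float zo, float c) {
  const float f = sigmoidRef(zf);    // BASIC_LSTM_CELL_FORGET_GATE
  const float i = sigmoidRef(zi);    // BASIC_LSTM_CELL_INPUT_GATE
  const float cand = std::tanh(zc);  // BASIC_LSTM_CELL_CANDIDATE
  const float o = sigmoidRef(zo);    // BASIC_LSTM_CELL_OUTPUT_GATE
  const float cNew = f * c + i * cand;
  return {o * std::tanh(cNew), cNew};
}
```

With all pre-activations zero, every gate evaluates to 0.5 and the candidate to 0, so the new cell state is half the old one.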
-
LstmState
lstmBwd
(poplar::Graph &graph, const LstmParams &params, poplar::program::Sequence &prog, const LstmState &fwdStateInit, const poplar::Tensor &fwdIntermediates, const LstmWeights &weights, const poplar::Tensor &input, const poplar::Tensor &output, const poplar::Tensor &outputGrad, const poplar::Tensor *lastCellStateGrad, poplar::Tensor *inputGrad, poplar::Tensor *bwdIntermediates, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Run LSTM backward pass.
The backward pass executes in reverse order compared to the forward pass. If the forward steps for an LSTM layer are sf = {0, 1, 2, …, S - 1} then the backward steps run for sb = {S - 1, S - 2, …, 1, 0}.
- Return
The gradient of the initial state.
- Parameters
graph
: Graph to which the LSTM cell belongs.params
: The parameters of the LSTM.prog
: Program sequence.fwdStateInit
: Forward state tensor for initial step.fwdIntermediates
: Intermediates results from the forward pass.weights
: The LSTM weights structure.input
: The input tensor to the LSTM of shape: [timesteps, batch, inputSize].output
: The output tensor from the forward pass. Depending on the outputFullSequence parameter this is either the output for the last timestep or it is a sequence of outputs for each timestep.outputGrad
: The gradients of the output. Depending on theoutputFullSequence
parameter this is either the gradient of the output for the last timestep or it is a sequence of output gradients for each timestep.lastCellStateGrad
: The gradient of the last cell state - may be null if there is no incoming gradient.[out] *inputGrad
: The gradients of the inputs - may be null if this information is not required.[out] *bwdIntermediates
: Intermediates gradients that are retained in the backward pass of training for use in the weight update. This argument should be set to null if you do not need to calculate weight deltas.debugPrefix
: String used as prefix for compute sets.options
: LSTM implementation options. See createInput().planningCache
: The matmul planning cache.
-
LstmWeights
lstmWU
(poplar::Graph &graph, const LstmParams &params, poplar::program::Sequence &prog, const LstmState &fwdStateInit, const poplar::Tensor &fwdIntermediates, const poplar::Tensor &bwdIntermediates, const LstmWeights &weights, const poplar::Tensor &input, const poplar::Tensor &output, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Run a standalone weight update pass.
Takes intermediates and gradients from the backward pass and calculates and returns weight deltas.
- Return
A set of weight gradients to sum with weights.
- Parameters
graph
: Graph to which the LSTM cell belongs.params
: The parameters of the LSTM.prog
: Program sequence to add operations to.fwdStateInit
: Forward state tensor for initial step.fwdIntermediates
: Intermediate results from the forward pass.bwdIntermediates
: Intermediate results from the backward pass.weights
: The LSTM weights structure.input
: The input tensor to the LSTM of shape: [timesteps, batch, inputSize].output
: The output tensor from the forward pass. Depending on the outputFullSequence parameter this is either the output for the last timestep or it is a sequence of outputs for each timestep.debugPrefix
: String used as a prefix to compute sets and tensors added to the graph.options
: LSTM implementation options. See createInput().planningCache
: The matmul planning cache.
-
LstmState
lstmBwdWithWU
(poplar::Graph &graph, const LstmParams &params, poplar::program::Sequence &prog, const LstmState &fwdStateInit, const poplar::Tensor &fwdIntermediates, const LstmWeights &weights, const poplar::Tensor &input, const poplar::Tensor &output, const poplar::Tensor &outputGrad, const poplar::Tensor *lastCellStateGrad, poplar::Tensor *inputGrad, LstmWeights &weightsGrad, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {}, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Run a combined LSTM backward and weight update pass.
Use this combined backward and weight update pass in preference to
lstmBwd
andlstmWU
separately in order to allow the most efficient implementation to be chosen if you do not need to split the operation.- Return
The gradient of the initial state.
- Parameters
graph
: Graph to which the LSTM cell belongs.params
: The parameters of the LSTM.prog
: Program sequence.fwdStateInit
: Forward state tensor for initial step.fwdIntermediates
: Intermediates results from the forward pass.weights
: The LSTM weights structure.input
: The input tensor to the LSTM of shape: [timesteps, batch, inputSize].output
: The output tensor from the forward pass. Depending on the outputFullSequence parameter this is either the output for the last timestep or it is a sequence of outputs for each timestep.outputGrad
: The gradients of the output. Depending on theoutputFullSequence
parameter this is either the gradient of the output for the last timestep or it is a sequence of output gradients for each timestep.lastCellStateGrad
: The gradient of the last cell state - may be null if there is no incoming gradient.[out] *inputGrad
: The gradients of the inputs. May be null if this information is not required.weightsGrad
: A set of weight deltas to sum with weights.debugPrefix
: String used as prefix for compute sets.options
: LSTM implementation options. See createInput().planningCache
: The matmul planning cache.
-
struct
LstmParams
¶ - #include <Lstm.hpp>
Structure representing the parameters of the LSTM.
Public Functions
-
LstmParams
() = default¶
Public Members
-
std::vector<std::size_t>
layerSizes
¶ The number of neurons before and after each layer of the LSTM.
If the LSTM consists of N layers, then this should be a vector of size N+1. The first element is the input size and each subsequent element is the output size of the LSTM layer.
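The convention can be illustrated for a single-layer LSTM (sizes below are made-up example values, not defaults):

```cpp
#include <cstddef>
#include <vector>

// One layer (N = 1), so layerSizes has N + 1 = 2 entries:
// the input size followed by the layer's output size.
const std::vector<std::size_t> layerSizes = {16, 32};
const std::size_t inputSize = layerSizes.front();  // size of each input step
const std::size_t outputSize = layerSizes.back();  // size of each output step
```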
-
bool
outputFullSequence
= true¶ If true, the LSTM function returns the entire sequence of outputs; otherwise it returns just the final output.
-
bool
doInputWeightCalc
= true¶ If this parameter is set to false then the LSTM will skip the calculation of weighted inputs (only useful for benchmarking).
-
bool
calcInputGradients
= true¶ If this parameter is set to false then the LSTM will skip the calculation of the gradients of the inputs.
-
-
struct
LstmState
¶ - #include <Lstm.hpp>
Structure holding the state of a LSTM cell, or the gradients for the state (depending on the context).
-
struct
LstmWeights
¶ - #include <Lstm.hpp>
Structure holding all the parameters of an LSTM cell, or the deltas for those parameters (depending on the context).
4.6.9. popnn/LstmDef.hpp¶
Enums
-
enum
BasicLstmCellUnit
¶ The units within a basic LSTM cell.
The term unit is used to refer to either a gate, or a cell state vector computation. In general all of these require a weight matrix, a bias and a non-linearity. Typically, a fixed type of non-linearity is associated with each type of unit.
Values:
-
enumerator
BASIC_LSTM_CELL_FORGET_GATE
¶
-
enumerator
BASIC_LSTM_CELL_INPUT_GATE
¶
-
enumerator
BASIC_LSTM_CELL_CANDIDATE
¶
-
enumerator
BASIC_LSTM_CELL_OUTPUT_GATE
¶
-
enumerator
BASIC_LSTM_CELL_NUM_UNITS
¶
4.6.10. popnn/NonLinearity.hpp¶
Defines
-
DEF_NONLINEARITY_INPLACE
(fn, nlType)¶
-
DEF_NONLINEARITY_
(fn, nlType)¶
-
DEF_NONLINEARITY
(fn, nlType)¶
-
namespace
popnn
Functions used in neural networks.
Functions
-
void
nonLinearityInPlace
(poplar::Graph &graph, NonLinearityType nonLinearityType, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Update tensor
t
by applying the given non-linearity in-place.- Parameters
graph
: The graph to add the operation to.nonLinearityType
: The type of non-linearity to apply tot
.t
: The tensor to apply the non-linearity to.prog
: The sequence to add the operation to.debugPrefix
: Optional string to use as a prefix to debug information.
-
void
nonLinearityInPlace
(poplar::Graph &graph, NonLinearityType nonLinearityType, poplar::Tensor t, poplar::ComputeSet &cs, const std::string &debugPrefix = "")¶ Update tensor
t
by applying the given non-linearity in-place.- Parameters
graph
: The graph to add the operation to.nonLinearityType
: The type of non-linearity to apply tot
.t
: The tensor to apply the non-linearity to.cs
: The compute set to add vertices to.debugPrefix
: Optional string to use as a prefix to debug information.
-
void
nonLinearityInPlace
(poplar::Graph &graph, NonLinearityType nonLinearityType, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Update tensor
t
by applying the given non-linearity in-place and return the scaling factor by which outputs from this operation are multiplied innonLinearityScaling
.For NonLinearityType other than SOFTMAX_SCALED
nonLinearityScaling
will be 1.0f upon return.- Parameters
graph
: The graph to add the operation to.nonLinearityType
: The type of non-linearity to apply tot
.t
: The tensor to apply the non-linearity to.nonLinearityScaling
: Reference to a float which will be overwritten with the scaling factor by which outputs from this operation int
are multiplied.prog
: The sequence to add the operation to.debugPrefix
: Optional string to use as a prefix to debug information.
-
void
nonLinearityInPlace
(poplar::Graph &graph, NonLinearityType nonLinearityType, poplar::Tensor t, float &nonLinearityScaling, poplar::ComputeSet &cs, const std::string &debugPrefix = "")¶ Update tensor
t
by applying the given non-linearity in-place and return the scaling factor by which outputs from this operation are multiplied innonLinearityScaling
. For any NonLinearityType other than SOFTMAX_SCALED,
nonLinearityScaling
will be 1.0f upon return.- Parameters
graph
: The graph to add the operation to.nonLinearityType
: The type of non-linearity to apply tot
.t
: The tensor to apply the non-linearity to.nonLinearityScaling
: Reference to a float which will be overwritten with the scaling factor by which outputs from this operation in t
are multiplied.cs
: The compute set to add vertices to.debugPrefix
: Optional string to use as a prefix to debug information.
-
poplar::Tensor
nonLinearity
(poplar::Graph &graph, NonLinearityType nonLinearityType, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Apply the given non-linearity to tensor
t
and return the result.- Return
A new tensor containing the contents of
t
with the given non-linearity applied.- Parameters
graph
: The graph to add the operation to.nonLinearityType
: The type of non-linearity to apply.t
: The tensor to apply the non-linearity to.prog
: The sequence to add the operation to.debugPrefix
: Optional string to use as a prefix to debug information.
-
poplar::Tensor
nonLinearity
(poplar::Graph &graph, NonLinearityType nonLinearityType, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Apply the given non-linearity to tensor
t
and return the result.Also returns the scaling factor by which outputs from this operation are multiplied in
nonLinearityScaling
. For any NonLinearityType other than SOFTMAX_SCALED,
nonLinearityScaling
will be 1.0f upon return.- Return
A new tensor containing the contents of
t
with the given non-linearity applied.- Parameters
graph
: The graph to add the operation to.nonLinearityType
: The type of non-linearity to apply tot
.t
: The tensor to apply the non-linearity to.nonLinearityScaling
: Reference to a float which will be overwritten with the scaling factor by which outputs from this operation in t
are multiplied.prog
: The sequence to add the operation to.debugPrefix
: Optional string to use as a prefix to debug information.
-
void
sigmoidInPlace
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
sigmoidInPlace
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
sigmoid
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
sigmoid
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
reluInPlace
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
reluInPlace
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
relu
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
relu
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
tanhInPlace
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
tanhInPlace
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
tanh
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
tanh
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
geluInPlace
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
geluInPlace
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
gelu
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
gelu
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
softmaxInPlace
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
softmaxInPlace
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
softmax
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
softmax
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
softmaxStableInPlace
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
softmaxStableInPlace
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
softmaxStable
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
softmaxStable
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
scaledSoftmaxStableInPlace
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
void
scaledSoftmaxStableInPlace
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
scaledSoftmaxStable
(poplar::Graph &graph, poplar::Tensor t, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
scaledSoftmaxStable
(poplar::Graph &graph, poplar::Tensor t, float &nonLinearityScaling, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶
-
poplar::Tensor
nonLinearityInputGradient
(poplar::Graph &graph, NonLinearityType nonLinearityType, poplar::Tensor act, poplar::Tensor outGradient, poplar::ComputeSet &cs, const std::string &debugPrefix = "")¶ Computes and returns the input gradient for a non-linearity from the activations and gradients at the output of the non-linearity.
- Return
A new tensor with the calculated gradient for the input of the non-linearity.
- Parameters
graph
: The graph to add the operation to.nonLinearityType
: The type of non-linearity to compute the input gradient for.act
: The output activations from the non-linearity. For the GELU non-linearity only, this is the input to the non-linearity.outGradient
: The gradients at the output of the non-linearity.cs
: The compute set to add vertices to.debugPrefix
: Optional string to use as a prefix to debug information.
-
poplar::Tensor
nonLinearityInputGradient
(poplar::Graph &graph, NonLinearityType nonLinearityType, poplar::Tensor act, poplar::Tensor outGradient, poplar::program::Sequence &prog, const std::string &debugPrefix = "")¶ Computes and returns the input gradient for a non-linearity from the activations and gradients at the output of the non-linearity.
- Return
A new tensor with the calculated gradient for the input of the non-linearity.
- Parameters
graph
: The graph to add the operation to.nonLinearityType
: The type of non-linearity to compute the input gradient for.act
: The output activations from the non-linearity. For the GELU non-linearity only, this is the input to the non-linearity.outGradient
: The gradients at the output of the non-linearity.prog
: The sequence to add the operation to.debugPrefix
: Optional string to use as a prefix to debug information.
4.6.11. popnn/NonLinearityDef.hpp¶
-
namespace
popnn
Functions used in neural networks.
Enums
-
enum
NonLinearityType
¶ Values:
-
enumerator
SIGMOID
¶ Sigmoid:
y = 1 / (1 + e^(-x))
-
enumerator
RELU
¶ Rectified Linear Unit:
x >= 0 -> y = x
x < 0 -> y = 0
-
enumerator
TANH
¶ Hyperbolic tangent:
y = tanh(x)
-
enumerator
GELU
¶ Gaussian Error Linear Unit:
y = x * Phi(x), where Phi(x) is the cumulative distribution function of the standard normal (Gaussian) distribution. Phi(x) is approximated as:
Phi(x) = 0.5 * (1 + (tanh(x * 0.7978845608 * (1 + 0.044715 * x * x))))
-
enumerator
SOFTMAX
¶ Softmax:
Always applied over the innermost dimension of the given tensor. Outer dimensions are independent of one another.
-
enumerator
SOFTMAX_STABLE
¶ Same as SOFTMAX, but a slower, more numerically stable algorithm is used.
-
enumerator
SOFTMAX_SCALED
¶ Same as SOFTMAX, but a slower, more numerically stable algorithm is used.
Outputs are scaled to allow use of a greater dynamic range in the outputs.
4.6.12. popnn/NonLinearityDefUtil.hpp¶
-
template<>
structpoputil
::
VertexTemplateToString
<popnn::NonLinearityType>¶ - #include <NonLinearityDefUtil.hpp>
Specialise vertex template stringification for non-linearity type.
Public Static Functions
-
std::string
to_string
(const popnn::NonLinearityType &nlType)¶
-
namespace
popnn
Functions used in neural networks.
Functions
-
const char *
asString
(const popnn::NonLinearityType &type)¶
-
std::ostream &
operator<<
(std::ostream &os, const popnn::NonLinearityType &type)¶
-
std::istream &
operator>>
(std::istream &in, popnn::NonLinearityType &type)¶
-
namespace
poputil
-
template<> struct VertexTemplateToString<popnn::NonLinearityType>¶
- #include <NonLinearityDefUtil.hpp>
Specialise vertex template stringification for non-linearity type.
Public Static Functions
-
std::string
to_string
(const popnn::NonLinearityType &nlType)¶
4.6.13. popnn/Norms.hpp¶
-
namespace
popnn
Functions used in neural networks.
Functions
-
std::uint64_t
getNormFwdFlops
(std::size_t statisticsSize, std::size_t numActsElements, bool computeStats = true)¶ Returns the number of flops for the forward pass of a norm layer, given the size of the statistics vector and the total number of elements in the activations input to the layer.
For Batch Norm,
computeStats
should be set to false for inference if batch statistics are not computed, as averaged batch statistics may be combined with the norm parameters.
4.6.14. popnn/Pooling.hpp¶
-
namespace
popnn
Functions used in neural networks.
-
namespace
pooling
¶ Functions
-
std::ostream &
operator<<
(std::ostream &o, const PoolParams &params)¶
-
const char *
asString
(const PoolingType &method)¶
-
std::vector<std::size_t>
getOutputFieldShape
(const PoolParams &params)¶
-
uint64_t
getFwdFlops
(const PoolParams &params)¶
-
uint64_t
getBwdFlops
(const PoolParams &params)¶
-
double
getFwdPerfectCycleCount
(const poplar::Graph &graph, const PoolParams &params)¶
-
double
getBwdPerfectCycleCount
(const poplar::Graph &graph, const PoolParams &params)¶
-
poplar::Tensor
pool
(poplar::Graph &graph, const PoolParams &params, const poplar::Tensor &in, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ Add a pooling operation to the graph.
This performs a pooling over the spatial dimensions […]. The shape of the input should be [B x inChans x …].
- Return
A tensor with the results of the pooling operation
- Parameters
graph
: The operation will be added to this graphparams
: Pooling parametersin
: Input tensorprog
: Program sequence to append the operation todebugPrefix
: Debug name for the operationoptions
: Pooling options (not currently used)
-
poplar::Tensor
poolInputGradient
(poplar::Graph &graph, const PoolParams &params, const poplar::Tensor &in, const poplar::Tensor &pooled, const poplar::Tensor &pooledGradient, bool useScaledGradient, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ For MAX, AVG or SUM pooling.
Note: for AVG or SUM pooling, the dedicated function below is recommended. Calculate the gradient with respect to the input of a pooling operation, given the gradient of the output.
This performs a pooling over the spatial dimensions […]. The shape of the input should be [B x inChans x …].
- Return
A tensor with the results of the pooling operation
- Parameters
graph
: The operation will be added to this graphparams
: Pooling parametersin
: Forward activations tensor input to poolingpooled
: Output of pooling in the forward passpooledGradient
: Gradients to the pooling operationuseScaledGradient
: Use scaled gradient if set to true. Otherwise, the gradient is propagated to all the positions which matched the pooled value in the forward pass.prog
: Program sequence to append the operation todebugPrefix
: Debug name for the operationoptions
: Pooling options. See pool().
-
poplar::Tensor
poolInputGradient
(poplar::Graph &graph, const PoolParams &params, const unsigned fwdChansPerGroup, const poplar::Tensor &pooledGradient, poplar::program::Sequence &prog, const std::string &debugPrefix = "", const poplar::OptionFlags &options = {})¶ For AVG and SUM pooling.
Calculate the gradient with respect to the input of a pooling operation, given the gradient of the output.
This performs a pooling over the spatial dimensions […]. The shape of the output will be [B x inChans x …].
- Return
A tensor with the results of the pooling operation
- Parameters
graph
: The operation will be added to this graphparams
: Pooling parametersfwdChansPerGroup
: Used in creating the output tensorpooledGradient
: Gradients to the pooling operationprog
: Program sequence to append the operation todebugPrefix
: Debug name for the operationoptions
: Pooling options. See pool().
-
struct
PoolParams
¶ Public Functions
-
PoolParams
(PoolingType poolingType, std::vector<std::size_t> inputFieldShape, std::vector<std::size_t> kernelShape, std::vector<unsigned> stride, std::vector<int> inputTruncationOrPaddingLower, std::vector<int> inputTruncationOrPaddingUpper, std::size_t numChannels, std::size_t batchSize, poplar::Type dType)¶
Public Members
-
PoolingType
poolingType
¶
4.6.15. popnn/PoolingDef.hpp¶
-
namespace
popnn
Functions used in neural networks.
4.6.16. popnn/Recurrent.hpp¶
-
namespace
poplin
Linear algebra functions.
A collection of utility functions to assist calculation of input/output ranges when moving a 2-dimensional kernel over a larger 2-dimensional space (for example, in convolution or pooling layers).
-
namespace
matmul
-
namespace
popnn
Functions used in neural networks.
-
namespace
rnn
¶ Functions
-
std::vector<std::pair<poplin::MatMulParams, poplar::OptionFlags>>
getMatMulPrePlanParameters
(std::size_t numSteps, std::size_t batchSize, std::size_t inputSize, std::size_t outputSize, const poplar::Type &dType, const poplar::Type &partialsType = poplar::FLOAT, bool inferenceOnly = false, bool hasFeedforwardWeights = true)¶ Predict what matrix multiplications will be needed for the given parameters and return list of corresponding matmul parameters and options.
-
uint64_t
getFwdFlops
(unsigned sequenceSize, unsigned batchSize, unsigned inputSize, unsigned outputSize, bool weightInput = true)¶ Compute the total flops for the forward pass of RNN.
-
uint64_t
getBwdFlops
(unsigned sequenceSize, unsigned batchSize, unsigned inputSize, unsigned outputSize, bool calcInputGrad = true)¶ Compute the total flops for the backward pass of RNN.
-
uint64_t
getWuFlops
(unsigned sequenceSize, unsigned batchSize, unsigned inputSize, unsigned outputSize)¶ Compute the total flops for the weight update pass of RNN.
-
poplar::Tensor
createInput
(poplar::Graph &graph, unsigned numSteps, unsigned batchSize, unsigned inputSize, unsigned outputSize, const poplar::Type &dType, const poplar::Type &partialsType = poplar::FLOAT, bool inferenceOnly = false, const std::string &name = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create a tensor which is input to a vanilla RNN.
The layout of the tensor is best for a multiplication of the input weight matrix with the given number of steps.
- Return
Tensor of shape {numSteps, batchSize, inputSize}
- Parameters
graph
: Graph objectnumSteps
: Number of steps used in the forward weighting of inputbatchSize
: Number of batch elementsinputSize
: Size of the input for each sequence stepoutputSize
: Output(hidden) size of each sequence elementinferenceOnly
: Whether the RNN layer is for inference only. If true, we can ignore backwards and weight update passesdType
: Data type of the created tensorpartialsType
: Data type of intermediate calculationsname
: Name of the tensorplanningCache
: The matmul planning cache.
-
poplar::Tensor
createFwdState
(poplar::Graph &graph, const poplar::Type &dType, unsigned batchSize, unsigned outputSize, poplar::program::Sequence &prog, bool initState, bool inferenceOnly, const std::string &debugPrefix = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create initial state for a vanilla RNN.
The state, apart from the activations, is initialised by the control program.
The amount of hidden state may depend on whether the RNN is used for inference or training.
- Return
A 2D tensor of shape {batchSize, outputSize}
- Parameters
graph
: Graph objectdType
: data type of the created tensorbatchSize
: Number of batch elementsoutputSize
: Output(hidden) of each sequence elementprog
: Control programinitState
: Initialise the stateinferenceOnly
: Whether the RNN layer is for inference only. If true, we can ignore backwards and weight update passesdebugPrefix
: String annotationplanningCache
: The matmul planning cache.
-
poplar::Tensor
getOutputFromFwdState
(const poplar::Tensor &fwdState)¶ Extract the previous output tensor from the hidden state.
The returned tensor is a view of the state tensor and can be used to initialise that tensor if required.
-
poplar::Tensor
createWeightsInput
(poplar::Graph &graph, unsigned sequenceSize, unsigned batchSize, unsigned inputSize, unsigned outputSize, const poplar::Type &dType, const poplar::Type &partialsType = poplar::FLOAT, bool inferenceOnly = false, const std::string &namePrefix = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the weights used to weight the input of a vanilla RNN layer.
The tile mapping of the weight tensor is best for multiplication with a sequence size in the input activation tensor used to multiply with the input weights.
- Parameters
graph
: Graph objectsequenceSize
: Number of sequence steps used in the forward weighting of the input. The best tile mapping is when this matches the sequence size of the input activation tensorbatchSize
: Number of batch elementsinputSize
: Input size of each sequenceoutputSize
: Output(hidden) size of each sequencedType
: Data type of the created tensorpartialsType
: Data type of partial results in the computationinferenceOnly
: Whether the RNN layer is for inference only. If true, we can ignore backwards and weight update passesnamePrefix
: A string description of the weights tensorplanningCache
: The matmul planning cache.
-
poplar::Tensor
createWeightsFeedback
(poplar::Graph &graph, unsigned batchSize, unsigned outputSize, const poplar::Type &dType, const poplar::Type &partialsType = poplar::FLOAT, bool inferenceOnly = false, const std::string &namePrefix = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create the weights used in the recurrent part of a vanilla RNN layer.
- Parameters
graph
: Graph objectbatchSize
: Number of batch elementsoutputSize
: Output(hidden) size of each sequencedType
: Data type of the created tensorpartialsType
: Data type of partial results in the computationinferenceOnly
: Whether the RNN layer is for inference only. If true, we can ignore backwards and weight update passesnamePrefix
: A string description of the created tensorplanningCache
: The matmul planning cache.
-
poplar::Tensor
forwardWeightInput
(poplar::Graph &graph, const poplar::Tensor &actIn, const poplar::Tensor &weights, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, bool inferenceOnly = false, const std::string &debugPrefix = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Perform feedforward part of a RNN layer.
The feedforward part of the RNN layer must be followed by the feedback part to complete the RNN layer. i.e. the output must be fed as the feedforward input to the feedback part.
The following definitions are used below:
numSteps is the number of sequence steps
batchSize is the batch size
inputSize is the size of the input for each step
outputSize is the size of the output for each step
- See
forwardIterate
- Return
Output tensor with shape {numSteps, batchSize, outputSize}
- Parameters
graph
: Graph objectactIn
: Input activation tensor with shape {numSteps, batchSize, inputSize}weights
: Feedforward weights with shape {outputSize, inputSize}prog
: Program sequence to which programs added by this function are appended topartialsType
: Data type for intermediatesinferenceOnly
: Whether the RNN layer is for inference only. If true, we can ignore backwards and weight update passesdebugPrefix
: Debug prefix stringplanningCache
: The matmul planning cache.
-
poplar::Tensor
forwardIterate
(poplar::Graph &graph, const poplar::Tensor &feedFwdIn, const poplar::Tensor &initState, const poplar::Tensor &feedbackWeights, const poplar::Tensor &biases, poplar::program::Sequence &prog, popnn::NonLinearityType nonLinearityType, const poplar::Type &partialsType = poplar::FLOAT, bool inferenceOnly = false, const std::string &debugPrefix = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Perform the feedback part of the RNN layer.
The feedback part of the RNN layer must be preceded by the feedforward part of the RNN layer to complete the layer.
The following definitions are used below:
numSteps is the number of steps
batchSize is the batch size
inputSize is the size of the input for each step
outputSize is the size of the output for each step
- See
forwardWeightInput
- Return
Output activations of RNN layer
- Parameters
graph
: Graph objectfeedFwdIn
: Input to this function (output from the feedforward part of the RNN layer)initState
: The initial state of the RNN layer(i.e. the previous output)feedbackWeights
: Feedback weightsbiases
: Biasesprog
: Program sequence to which programs added by this function are appended tononLinearityType
: Non linearity used for the output activationspartialsType
: Data type for intermediatesinferenceOnly
: Whether the RNN layer is for inference only. If true, we can ignore backwards and weight update passesdebugPrefix
: Debug prefix stringplanningCache
: The matmul planning cache.
-
poplar::Tensor
createBwdState
(poplar::Graph &graph, const poplar::Type &dType, unsigned batchSize, unsigned outputSize, poplar::program::Sequence &prog, const std::string &debugPrefix = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Create initial state for backward pass of a vanilla RNN.
- Return
Tile mapped initial state tensor
- Parameters
graph
: Graph objectdType
: Data type of the created tensorbatchSize
: Number of batch elements processedoutputSize
: Number of output activationsprog
: Control programdebugPrefix
: String annotationplanningCache
: The matmul planning cache.
-
std::pair<poplar::Tensor, poplar::Tensor>
backwardGradientStep
(poplar::Graph &graph, const poplar::Tensor &nextLayerGrad, const poplar::Tensor &bwdState, const poplar::Tensor &actOut, const poplar::Tensor &weightsInput, const poplar::Tensor &weightsFeedback, poplar::program::Sequence &prog, popnn::NonLinearityType nonLinearityType, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Compute a single step of backward pass of a vanilla RNN layer.
Two gradient outputs are produced. The first is at the input of the RNN layer for the step. The second is at the adder and can be used to backpropagate through the earlier steps.
- Return
A pair of tensors. The first is the loss gradient at the input layer. The second is the backward state needed to run the next backward step
- Parameters
graph
: Graph objectnextLayerGrad
: Loss gradient fed as input to this stepbwdState
: Gradient state for previous stepactOut
: Output activationweightsInput
: Input weightsweightsFeedback
: Feedback weightsprog
: Control program to which to add programs tononLinearityType
: Type of non-linearityfirstStep
: Set to true to indicate the first step in the backward passpartialsType
: Data type used in intermediate calculationsdebugPrefix
: A string annotationplanningCache
: The matmul planning cache.
-
poplar::Tensor
backwardGradientStep
(poplar::Graph &graph, const poplar::Tensor &nextLayerGrad, const poplar::Tensor &bwdState, const poplar::Tensor &actOut, const poplar::Tensor &weightsFeedback, poplar::program::Sequence &prog, popnn::NonLinearityType nonLinearityType, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Same as the function above, except that the input gradients are not computed.
-
void
paramDeltaUpdate
(poplar::Graph &graph, const poplar::Tensor &bwdState, const poplar::Tensor &actIn, const poplar::Tensor &prevOut, poplar::Tensor &weightsInputDeltasAcc, poplar::Tensor &weightsFeedbackDeltasAcc, poplar::Tensor &biasDeltasAcc, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const std::string &debugPrefix = "", poplin::matmul::PlanningCache *planningCache = nullptr)¶ Update parameter deltas for a vanilla RNN step.
The parameter deltas updated are:
Feedback Weights
Input Weights
Biases
The new deltas computed for this step are added to the accumulated deltas from previous steps. The caller must zero the accumulated tensors on the first call if the tensors maintaining the result are updated in-place.
- Parameters
graph
: Graph object.bwdState
: Gradient state for this step.actIn
: Input activations for this step.prevOut
: Previous RNN output activations for this step.weightsInputDeltasAcc
: Previous weights input deltas tensor. This tensor must be tile-mapped. The deltas from this step are added to this tensor.weightsFeedbackDeltasAcc
: Previous feedback weights deltas tensor. This tensor must be tile-mapped. The deltas from this step are added to this tensor.biasDeltasAcc
: Previous bias deltas tensor. This tensor must be tile-mapped. The deltas from this step are added to this tensor.prog
: Control program to which to add programs to.partialsType
: Data type used in intermediate calculations.debugPrefix
: String annotation.planningCache
: The matmul planning cache.
-
poplar::Tensor
rnnFwdSequence
(poplar::Graph &graph, poplar::program::Sequence &prog, const poplar::Tensor &fwdStateInit, const poplar::Tensor *weightedIn, const poplar::Tensor &biases, const poplar::Tensor &feedFwdWeights, const poplar::Tensor &feedbackWeights, const poplar::Tensor &prevLayerActs, const popnn::NonLinearityType &nonLinearityType, const poplar::Type &partialsType, bool inferenceOnly, const std::string &debugPrefix, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Perform the forward part of the RNN layer.
The feedback part of the RNN layer must be preceded by the feedforward part of the RNN layer to complete the layer.
The following definitions are used below:
numSteps is the number of steps
batchSize is the batchSize
inputSize is the size of the input for each step
outputSize is the size of the output for each step
- See
forwardWeightInput
- Return
Forward state tensor for all steps [0:seqSize)
- Parameters
graph
: Graph object.prog
: Control program.fwdStateInit
: Forward state tensor for initial step.weightedIn
: Preweighted input, or nullptr if Wff is to be applied.biases
: Biases.feedFwdWeights
: Input weights Wff.feedbackWeights
: Feedback weights Wfb.prevLayerActs
: Activations from previous layer (output from the feedforward part of the RNN layer).nonLinearityType
: Non linearity used for the output activations.partialsType
: Data type for intermediates.inferenceOnly
: Whether the RNN layer is for inference only. If true, we can ignore backwards and weight update passesdebugPrefix
: Debug prefix string.planningCache
: The matmul planning cache.
-
std::tuple<poplar::Tensor, poplar::Tensor, poplar::Tensor, poplar::Tensor>
rnnBwdSequence
(poplar::Graph &graph, bool doWU, bool ignoreInputGradientCalc, poplar::program::Sequence &prog, const poplar::Tensor &fwdStateInit, const poplar::Tensor &fwdState, const poplar::Tensor &biases, const poplar::Tensor &feedFwdWeights, const poplar::Tensor &feedbackWeights, const poplar::Tensor &outGradient, const poplar::Tensor &actIn, const popnn::NonLinearityType &nonLinearityType, const poplar::Type &partialsType, const std::string &debugPrefix, poplin::matmul::PlanningCache *planningCache = nullptr)¶ Perform the feedback part of the RNN layer.
The feedback part of the RNN layer must be preceded by the feedforward part of the RNN layer to complete the layer.
The following definitions are used below:
numSteps is the number of steps
batchSize is the batchSize
inputSize is the size of the input for each step
outputSize is the size of the output for each step
- See
forwardWeightInput
- Return
Returns four tensors:
gradients for previous layer
input weight deltas
output weight deltas
bias deltas
When doWU is false, the weight and bias deltas are not calculated.
- Parameters
graph
: Graph objectdoWU
: Calculate weight updatesignoreInputGradientCalc
: Do not calculate the gradients over the input weightsprog
: Control programfwdStateInit
: Forward state tensor for initial stepfwdState
: Forward state tensor for all steps [0:seqSize)biases
: BiasesfeedFwdWeights
: Input weights WfffeedbackWeights
: Feedback weights WfboutGradient
: Gradient from next layeractIn
: Activations from previous layer (output from the feedforward part of the RNN layer)nonLinearityType
: Non linearity used for the output activationspartialsType
: Data type for intermediatesdebugPrefix
: Debug prefix stringplanningCache
: The matmul planning cache.
4.6.17. popnn/SpatialSoftMax.hpp¶
-
namespace
popnn
Functions used in neural networks.
Functions
-
std::pair<poplar::Tensor, poplar::Tensor>
spatialSoftMax2D
(poplar::Graph &graph, poplar::program::Sequence &prog, const poplar::Tensor &fields, float temperature, bool disableSoftmax = false, const std::string &name = "")¶ Implements a spatial softmax specialised for 2D input fields.
This computes the expected coordinates (normalised to be in [-1.0, 1.0]) for every 2D field in the input tensor. A (trainable) temperature scalar is added which normalises the softmax across the fields.
The output of the spatial softmax (first tensor in the returned pair) is a set of expected x and y coordinates for the maximum activation in each field. This result has shape {F, 2} where F is the number of fields. Y-coordinates run down the first column and X-coordinates down the second column to preserve (row,column) indexing order into the original fields.
- Return
A pair of tensors. First is the output of the spatial-softmax, second is scalar temperature variable.
- Parameters
graph
: Graph to which variables and vertices will be added.prog
: Program to which operations will be added.fields
: The input Tensor. Must have rank 3. Interpretation is a set of 2D scalar fields of identical height (H) and width (W) given by the two inner dimensions (so shape is {F, H, W} where F is the number of fields).temperature
: Initial value for the softmax scaling/normalisation parameter.name
: Optional name used as prefix for introduced variables.disableSoftmax
: Turns off softmax computation in this function. This is useful if you have already computed a softmax over all the fields due to other processing or for test/debug.