DynamicSlice

#include <popops/DynamicSlice.hpp>

Support for dynamic slices.

namespace poplar: Poplar classes and functions.

namespace popops

Common functions, such as elementwise and reductions.

Functions

bool operator<(const SlicePlan &a, const SlicePlan &b) noexcept

bool operator==(const SlicePlan &a, const SlicePlan &b) noexcept

bool operator!=(const SlicePlan &a, const SlicePlan &b) noexcept

poplar::Tensor createSliceableTensor(poplar::Graph &graph, const poplar::Type &type, const std::vector<size_t> &shape, const std::vector<size_t> &dims, const std::vector<size_t> &sizes, std::size_t minGrainSize = 0, const poplar::DebugContext &debugContext = {})

Create and map a tensor to be sliced/updated efficiently.

The returned tensor will be spread over as many tiles as possible while respecting the minimum number of elements per tile (minGrainSize) and still being in a form that can be sliced/updated efficiently.

Parameters

graph – The Poplar graph.
type – The type of the elements.
shape – The shape of the tensor to be sliced/updated.
dims – The dimensions of the tensor that will be sliced/updated.
sizes – The size of the slice in each of the dimensions.
minGrainSize – The minimum number of elements per slice mapped to each tile.
debugContext – Optional debug information.

Returns

A tensor of shape shape that is suitably mapped.

poplar::Tensor createSliceableTensor(poplar::Graph &graph, const poplar::Type &type, const std::vector<size_t> &shape, const std::vector<size_t> &dims, const std::vector<size_t> &sizes, const SlicePlan &plan, const poplar::OptionFlags &options, const poplar::DebugContext &debugContext = {})

Create and map a tensor to be sliced/updated efficiently.

The returned tensor will be laid out according to the plan.

Parameters

graph – The Poplar graph.
type – The type of the elements.
shape – The shape of the tensor to be sliced/updated.
dims – The dimensions of the tensor that will be sliced/updated.
sizes – The size of the slice in each of the dimensions.
plan – Plan describing how the slicing/updating operation will be implemented.
options – Flags controlling how the operation will be implemented.
debugContext – Optional debug information.

Returns

A tensor of shape shape that is suitably mapped.

poplar::Tensor createSliceTensor(poplar::Graph &graph, const poplar::Tensor &t, const std::vector<size_t> &dims, const std::vector<size_t> &sizes, std::size_t numIndices, const poplar::DebugContext &debugContext = {})

Create and map a tensor to be sliced into or updated from efficiently.

Introspection on the tensor t is used to lay out the created tensor such that it can be used to efficiently update t.

Parameters

graph – The Poplar graph.
t – The tensor to be updated.
dims – The dimensions of the tensor that will be sliced/updated.
sizes – The number of elements of each dimension in dims that will be sliced/updated.
numIndices – The number of slices this tensor should contain.
debugContext – Optional debug information.

Returns

A tensor with shape [numIndices, shape…] mapped appropriately to be sliced into/updated from.

poplar::Tensor createSliceTensor(poplar::Graph &graph, const poplar::Type &type, const std::vector<std::size_t> &shape, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, std::size_t numIndices, const SlicePlan &plan, const poplar::OptionFlags &options, const poplar::DebugContext &debugContext = {})

Create and map a tensor to be sliced into or updated from efficiently.

The returned tensor is laid out according to the plan for the slice/update operation.

Parameters

graph – The Poplar graph.
type – The type of the elements.
shape – The shape of the tensor to be sliced/updated.
dims – The dimensions of the tensor that will be sliced/updated.
sizes – The number of elements of each dimension in dims that will be sliced/updated.
numIndices – The number of slices this tensor should contain.
plan – Plan describing how the slicing/updating operation will be implemented.
options – Flags controlling how the operation will be implemented.
debugContext – Optional debug information.

Returns

A tensor with shape [numIndices, shape…] mapped appropriately to be sliced into/updated from.

poplar::Tensor createIndicesTensor(poplar::Graph &graph, const std::vector<std::size_t> &dims, std::size_t numIndices, const SlicePlan &plan, const poplar::OptionFlags &options, const poplar::DebugContext &debugContext = {})

Create and map a tensor to contain indices for slicing or updating a tensor efficiently.

Parameters

graph – The Poplar graph.
dims – The dimensions of the tensor that will be sliced/updated using these indices.
numIndices – The number of indices this tensor should contain.
plan – Plan describing how the slicing/updating operation will be implemented.
options – Flags controlling how the operation will be implemented.
debugContext – Optional debug information.

Returns

A tensor of shape [numIndices, dims.size()] mapped appropriately to be used as the indices for a slice/update operation. Element type is always UNSIGNED_INT.

poplar::Tensor createSliceableTensorFromSlice(poplar::Graph &graph, const poplar::Tensor &s, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &numSlices, const poplar::DebugContext &debugContext = {})

Create and map a tensor to be sliced/updated.

The tensor is mapped in a way that can be efficiently sliced and updated to/from the given slice tensor. It will be distributed across as many tiles as the given slice and with the same contiguous regions on each tile. The tensor’s shape and mapping are derived from the reference slice tensor.

Parameters

graph – The Poplar graph.
s – The reference slice.
dims – The dimensions of the returned tensor that will be sliced.
numSlices – The number of independent slices in each sliced dimension.
debugContext – Optional debug information.

Returns

A tensor to be sliced/updated.

poplar::Tensor dynamicSlice(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &offset, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {})

Slice a tensor based on offsets specified by a tensor.

dims gives the dimensions to slice, sizes defines the size of the slice in those dimensions and offset gives the base offsets on each execution.

offset[0], dims and sizes must have the same size. offset may have a second dimension with an element per tile, which can eliminate exchange.

Parameters

graph – The Poplar graph.
t – The source tensor.
offset – A tensor of offsets at which the output is extracted.
dims – The dimensions of t to slice.
sizes – The size of the slice in each of the dimensions in dims.
prog – The program to be extended.
debugContext – Optional debug information.

Returns

The specified subtensor.
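The relationship between dims, sizes and offset described above can be illustrated with a small host-side model. This is a sketch in plain C++ and not Poplar code: Matrix and referenceDynamicSlice are hypothetical names, the "tensor" is a rank-2 std::vector, and the model simply asserts that the requested window fits inside t rather than reproducing how the library treats out-of-range offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a rank-2 tensor (row-major, outer index = dim 0).
using Matrix = std::vector<std::vector<float>>;

// Reference semantics only. dims lists the sliced dimensions (0 or 1),
// sizes gives the window extent in each of them and offset gives the
// run-time start position. Dimensions not listed in dims are taken in full.
Matrix referenceDynamicSlice(const Matrix &t,
                             const std::vector<std::size_t> &offset,
                             const std::vector<std::size_t> &dims,
                             const std::vector<std::size_t> &sizes) {
  assert(!t.empty());
  assert(offset.size() == dims.size() && dims.size() == sizes.size());
  std::size_t start[2] = {0, 0};
  std::size_t extent[2] = {t.size(), t[0].size()};
  for (std::size_t i = 0; i < dims.size(); ++i) {
    assert(dims[i] < 2);
    start[dims[i]] = offset[i];
    extent[dims[i]] = sizes[i];
  }
  // Assumption of this model: the window must lie entirely inside t.
  assert(start[0] + extent[0] <= t.size());
  assert(start[1] + extent[1] <= t[0].size());
  Matrix out(extent[0], std::vector<float>(extent[1]));
  for (std::size_t r = 0; r < extent[0]; ++r)
    for (std::size_t c = 0; c < extent[1]; ++c)
      out[r][c] = t[start[0] + r][start[1] + c];
  return out;
}
```

For a 3x4 input, calling the model with offset {1}, dims {0} and sizes {2} yields the last two rows in full, because dimension 1 is not listed in dims.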

void dynamicSliceWithOutput(poplar::Graph &graph, const poplar::Tensor &output, const poplar::Tensor &t, const poplar::Tensor &offset, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {})

Slice a tensor based on offsets specified by a tensor.

dims gives the dimensions to slice, sizes defines the size of the slice in those dimensions and offset gives the base offsets on each execution.

offset[0], dims and sizes must have the same size. offset may have a second dimension with an element per tile, which can eliminate exchange.

Parameters

graph – The Poplar graph.
output – The output tensor. This should ideally be created with createSliceTensor to maximise efficiency.
t – The source tensor.
offset – A tensor of offsets at which the output is extracted.
dims – The dimensions of t to slice.
sizes – The size of the slice in each of the dimensions in dims.
prog – The program to be extended.
debugContext – Optional debug information.

poplar::Graph::TileToTensorMapping getSliceMapping(poplar::Graph &graph, const poplar::Tensor &t, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes)

Get the tile mapping for a slice of a tensor.

dims gives the dimensions to slice, sizes defines the size of the slice in those dimensions.

Parameters

graph – The Poplar graph.
t – The source tensor.
dims – The dimensions of t to slice.
sizes – The size of the slice in each of the dimensions in dims.

void dynamicUpdate(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &s, const poplar::Tensor &offset, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {})

Update a subtensor at offsets read from a tensor.

dims gives the dimensions that are partially updated, by sizes elements, at offsets offset. Unspecified dimensions are copied in full with zero offset.

offset[0], dims and sizes must have the same size. offset may have a second dimension with an element per tile, which can eliminate exchange.

Parameters

graph – The Poplar graph.
t – The tensor to update.
s – The updates.
offset – The offset within t to be updated.
dims – The dimensions to be dynamically updated.
sizes – The size of the update in each of the dimensions in dims.
prog – The program to be extended.
debugContext – Optional debug information.
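The update is the inverse of the slice: the window selected by dims, sizes and offset is overwritten with s, and dimensions not listed in dims are written in full from position zero. A minimal host-side sketch in plain C++ follows. Matrix and referenceDynamicUpdate are hypothetical names, this is not Poplar code, and the model asserts that the window fits inside t.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a rank-2 tensor (outer index = dim 0).
using Matrix = std::vector<std::vector<float>>;

// Reference semantics only: overwrite a window of t with the contents of s.
// Dimensions not listed in dims are updated in full, starting at offset 0.
void referenceDynamicUpdate(Matrix &t, const Matrix &s,
                            const std::vector<std::size_t> &offset,
                            const std::vector<std::size_t> &dims,
                            const std::vector<std::size_t> &sizes) {
  assert(!t.empty());
  assert(offset.size() == dims.size() && dims.size() == sizes.size());
  std::size_t start[2] = {0, 0};
  std::size_t extent[2] = {t.size(), t[0].size()};
  for (std::size_t i = 0; i < dims.size(); ++i) {
    assert(dims[i] < 2);
    start[dims[i]] = offset[i];
    extent[dims[i]] = sizes[i];
  }
  // Assumption of this model: the window must lie entirely inside t.
  assert(start[0] + extent[0] <= t.size());
  assert(start[1] + extent[1] <= t[0].size());
  assert(s.size() == extent[0]);
  for (std::size_t r = 0; r < extent[0]; ++r) {
    assert(s[r].size() == extent[1]);
    for (std::size_t c = 0; c < extent[1]; ++c)
      t[start[0] + r][start[1] + c] = s[r][c];
  }
}
```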

poplar::Tensor multiSlice(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &offsets, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const SlicePlan &plan, const poplar::OptionFlags &options, const poplar::DebugContext &debugContext = {})

Take multiple slices from a base tensor.

The returned tensor will have a rank one greater than t. Its outer dimension will be offsets.dim(0). Note that dims refers to the dimensions of t. t can be created using createSliceableTensor() to ensure efficient mapping.

Parameters

graph – The Poplar graph.
t – The tensor being sliced.
offsets – The offsets within t to be sliced.
dims – The dimensions of t to be sliced.
sizes – The size of the slice in each of the dimensions in dims.
prog – The program to be extended.
plan – Plan describing how the operation will be implemented.
options – Flags controlling how the operation will be implemented.
debugContext – Optional debug information.
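The shape rule stated above (rank one greater than t, outer dimension equal to the number of offsets) can be seen in a small host-side model of the common embedding-lookup case. The sketch assumes dims = {0} and sizes = {1}, represents the offsets as a flat list of row indices, and uses the hypothetical names Matrix, Tensor3 and referenceMultiSlice. It is plain C++ rather than Poplar code and asserts that every offset is in range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>; // hypothetical rank-2 tensor
using Tensor3 = std::vector<Matrix>;            // hypothetical rank-3 tensor

// Reference semantics for dims = {0}, sizes = {1}: each offset selects one
// row of t, and each slice keeps the sliced dimension with extent 1.
// The result therefore has shape [offsets.size(), 1, columns of t].
Tensor3 referenceMultiSlice(const Matrix &t,
                            const std::vector<unsigned> &offsets) {
  Tensor3 out;
  out.reserve(offsets.size());
  for (unsigned row : offsets) {
    assert(row < t.size()); // assumption of this model: offsets are in range
    out.push_back(Matrix{t[row]});
  }
  return out;
}
```

The same row may appear more than once in offsets, in which case it is simply copied into several output slices.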

void multiUpdate(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &s, const poplar::Tensor &offsets, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const SlicePlan &plan, const poplar::OptionFlags &options, const poplar::DebugContext &debugContext = {})

Update multiple slices in a tensor.

Parameters

graph – The Poplar graph.
t – The tensor being updated.
s – The slices to insert.
offsets – The offsets within t to be updated.
dims – The dimensions of t to be updated.
sizes – The size of the update in each of the dimensions in dims.
prog – The program to be extended.
plan – Plan describing how the operation will be implemented.
options – Flags controlling how the operation will be implemented.
debugContext – Optional debug information.

void multiUpdateAdd(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &s, const poplar::Tensor &offsets, const poplar::Tensor &scale, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const SlicePlan &plan, const poplar::OptionFlags &options, const poplar::DebugContext &debugContext = {})

Accumulate multiple slices in a tensor. For each offset index i: t[offsets[i]] += scale * s[i]. t and s must be of the same type.

Parameters

graph – The Poplar graph.
t – The tensor being updated (must be rank 2).
s – The slices to accumulate.
offsets – The offsets within t to be accumulated.
scale – The scaling to apply to the update. The type of the tensor should be the same as that of t and s except for the case when t and s are of type HALF. In which case scale can be of type FLOAT or HALF.
dims – The dimensions of t to be accumulated (must be rank 1).
sizes – The size of the accumulate in each of the dimensions in dims.
prog – The program to be extended.
plan – Plan describing how the operation will be implemented.
options – Flags controlling how the operation will be implemented.
debugContext – Optional debug information.
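The accumulation rule t[offsets[i]] += scale * s[i] can be written out as a host-side reference loop. This is a sketch in plain C++ with hypothetical names (Matrix, referenceMultiUpdateAdd), not Poplar code. It assumes a rank-2 t updated along dimension 0 with one row per slice, takes scale as a plain float, and asserts that offsets are in range because the text above does not say how out-of-range offsets are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>; // hypothetical rank-2 tensor

// Reference semantics only: for each i, add scale * s[i] to row offsets[i]
// of t. Repeated offsets accumulate into the same row.
void referenceMultiUpdateAdd(Matrix &t, const Matrix &s,
                             const std::vector<unsigned> &offsets,
                             float scale) {
  assert(s.size() == offsets.size());
  for (std::size_t i = 0; i < offsets.size(); ++i) {
    assert(offsets[i] < t.size()); // assumption of this model
    std::vector<float> &row = t[offsets[i]];
    assert(s[i].size() == row.size());
    for (std::size_t c = 0; c < row.size(); ++c)
      row[c] += scale * s[i][c];
  }
}
```

With offsets {1, 1, 0}, the first two slices both land in row 1, which is the behaviour that makes this operation suitable for accumulating gradients of repeated lookups.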

void multiUpdateMax(poplar::Graph &graph, const poplar::Tensor &t, const poplar::Tensor &s, const poplar::Tensor &offsets, const std::vector<std::size_t> &dims, const std::vector<std::size_t> &sizes, poplar::program::Sequence &prog, const SlicePlan &plan, const poplar::OptionFlags &options, const poplar::DebugContext &debugContext = {})

Find the maximum over multiple slices in a tensor. For each offset index i: t[offsets[i]] = max(t[offsets[i]], s[i]). t and s must have the same element type. Entries with offsets[i] >= t.dim(0) are ignored.

Parameters

graph – The Poplar graph.
t – The tensor being updated (must be rank 2).
s – The slices to find maximum over.
offsets – The offsets within t to find maximum over.
dims – The dimensions of t to find maximum over (must be rank 1).
sizes – The size of the update in each of the dimensions in dims.
prog – The program to be extended.
plan – Plan describing how the operation will be implemented.
options – Flags controlling how the operation will be implemented.
debugContext – Optional debug information.
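The rule t[offsets[i]] = max(t[offsets[i]], s[i]), including the documented skipping of offsets that are not smaller than t.dim(0), can be sketched as a host-side reference loop. The names Matrix and referenceMultiUpdateMax are hypothetical, the code is plain C++ rather than Poplar code, and it assumes a rank-2 t updated along dimension 0 with one row per slice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>; // hypothetical rank-2 tensor

// Reference semantics only: element-wise maximum of row offsets[i] of t and
// slice s[i]. Offsets at or beyond the number of rows in t are skipped.
void referenceMultiUpdateMax(Matrix &t, const Matrix &s,
                             const std::vector<unsigned> &offsets) {
  assert(s.size() == offsets.size());
  for (std::size_t i = 0; i < offsets.size(); ++i) {
    if (offsets[i] >= t.size())
      continue; // out-of-range offsets are ignored
    std::vector<float> &row = t[offsets[i]];
    assert(s[i].size() == row.size());
    for (std::size_t c = 0; c < row.size(); ++c)
      row[c] = std::max(row[c], s[i][c]);
  }
}
```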

class SlicePlan

#include <DynamicSlice.hpp>

An object representing a plan that describes how to implement a slice or update.

This can be used as a parameter to a function that will slice or update a tensor.

Public Functions

SlicePlan()

~SlicePlan()

SlicePlan(const SlicePlan &other)

SlicePlan(SlicePlan &&other)

SlicePlan &operator=(const SlicePlan &other)

SlicePlan &operator=(SlicePlan &&other)

SlicePlan(std::unique_ptr<SlicePlanInternal> internal)

inline SlicePlanInternal &getImpl() const

Private Members

std::unique_ptr<SlicePlanInternal> internal

Friends

friend std::ostream &operator<<(std::ostream &o, const SlicePlan &p)

friend bool operator<(const SlicePlan &a, const SlicePlan &b) noexcept

friend bool operator==(const SlicePlan &a, const SlicePlan &b) noexcept

friend poplar::ProfileValue toProfileValue(const SlicePlan &p)

namespace embedding

Functions

SlicePlan plan(const poplar::Graph &graph, const poplar::Type &dataType, const std::size_t numEntries, const std::size_t outputSize, const std::vector<std::size_t> &numLookups, const poplar::OptionFlags &options)

Create a plan for implementing a set of operations on an embedding matrix.

Embedding plan options

usedForSlice (true, false) [=true]

If true, you intend to use this embedding plan for a multiSlice operation. An error is thrown if both this option and usedForUpdate are set to false.
usedForUpdate (true, false) [=true]

If true, you intend to use this embedding plan for a multiUpdate operation. An error is thrown if both this option and usedForSlice are set to false.
operationForUpdate (“none”, “add”, “max”) [=“add”]

Only applicable when usedForUpdate = true. The type of operation used in the multi-update. Set to “none” for multiUpdate, “add” for multiUpdateAdd, or “max” for multiUpdateMax.
availableMemoryProportion Positive decimal

If set, gives the proportion of tile memory made available for temporary variables (variables that become live and die during the operation) for this operation. If not set, the operation has the freedom to use unlimited temporary memory.
indicesDistribution (uniform, onePoint) [=uniform]

A description of the statistical distribution of the indices that will be sliced/updated over the input size (numEntries) of the operation. This is used when estimating the runtime of the multiSlice and multiUpdate* operations.
- uniform Indices are assumed to be uniformly distributed over the input size of the embedding.
- onePoint Indices are assumed to all be equal.
planMinimisationTarget (memory, cycles) [=memory]

Select what should be minimised when planning this operation.
- memory Minimise a weighted combination of estimated maximum tile memory needed for code, for input/indices/output operands, and temporary variables for the operation.
- cycles Minimise estimated total cycles for the operation.
indicesAreSorted (true, false) [=false]

Plan on the assumption that the indices used in MultiUpdate/MultiUpdateOp are sorted in increasing order. The same option must then be passed, along with the plan, when calling MultiUpdate with or without an operation.
partialType (half, float)

If not provided, defaults to the data type. The partials type should always be the same as, or of higher precision than, the data type of the embedding matrix. Only applicable when operationForUpdate is “add”; it is ignored for all other operations and for multiSlice.

Parameters

graph – The graph the operation will be added to.
dataType – The data type of the entries in the embedding matrix and the resulting lookups from the matrix.
numEntries – Input size of embedding matrix.
outputSize – Output size of embedding matrix lookup.
numLookups – Vector of numbers of indices which will be looked up in the embedding matrix.
options – Set of option flags controlling how the operation will be implemented.

Returns

A plan which describes how the embedding matrix lookup/update operations should be implemented.
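The embedding plan options are passed as string key/value pairs. The sketch below uses a std::map as a stand-in for poplar::OptionFlags to show how the documented defaults and the rule about usedForSlice and usedForUpdate fit together. Options, getOr and checkPlanOptions are hypothetical helpers for illustration only. They are not part of the library, and the library's own validation may differ.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for poplar::OptionFlags: option name -> value.
using Options = std::map<std::string, std::string>;

// Return the value stored for key, or fallback when the option is unset.
std::string getOr(const Options &o, const std::string &key,
                  const std::string &fallback) {
  assert(!key.empty());
  auto it = o.find(key);
  return it == o.end() ? fallback : it->second;
}

// Mirror of the documented rule: usedForSlice and usedForUpdate both
// default to "true", and setting both to "false" is an error.
void checkPlanOptions(const Options &o) {
  const bool forSlice = getOr(o, "usedForSlice", "true") == "true";
  const bool forUpdate = getOr(o, "usedForUpdate", "true") == "true";
  if (!forSlice && !forUpdate)
    throw std::invalid_argument(
        "usedForSlice and usedForUpdate must not both be false");
}
```

A lookup-only plan would set usedForUpdate to "false" and leave usedForSlice at its default, which passes the check above.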