Rearrange

#include <popops/Rearrange.hpp>

Operations to rearrange tensors on tiles.

namespace popops

Common functions, such as elementwise and reductions.

namespace rearrange

Functions

bool canUseFastTranspose(const poplar::Target &target, const poplar::Type &type, unsigned numRows, unsigned numColumns, unsigned numTranspositions)

Determine if a fast transposition codelet may be used based on the given target, data type, number of rows and number of columns.

Parameters
  • target – The target the operation will be targeted at.

  • type – The data type of the tensor to transpose.

  • numRows – The no. of rows in each transposition to perform.

  • numColumns – The no. of columns in each transposition to perform.

Returns

A boolean indicating whether or not the fast transposition codelets can be targeted based on the given parameters.

void addTransposeVertices(poplar::Graph &graph, const poplar::ComputeSet &cs, const poplar::Type &dType, unsigned rows, unsigned cols, const poplar::Graph::TileToTensorMapping &mapping, std::function<std::pair<const poplar::Tensor, const poplar::Tensor>(size_t)> getInOut, const poplar::DebugContext &debugContext = {})

Transpose a set of matrices stored on multiple tiles.

This adds all the needed vertices on the graph.

Parameters
  • graph, cs – The graph and compute set to add the vertices to.

  • dType, rows, cols – The element type and dimensions of the matrices to be transposed. These are the same for all of the matrices.

  • mapping – A vector with one element per tile, where each element is a vector of intervals indicating which of the matrices to be transposed are mapped (possibly partially) to that tile.

  • getInOut – A function: pair<Tensor, Tensor> getInOut(size_t index), which, given an index within the intervals specified in ‘mapping’, returns a std::pair of Tensors (in, out) which are the input and output matrices for the transposition with that index. The ‘in’ and ‘out’ values represent 2D matrices, but each must be flattened to a single dimension.
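
To make the roles of ‘mapping’ and ‘getInOut’ concrete, the following is a host-side sketch in plain C++ that does not use Poplar at all. Interval, TileMapping and Flat are stand-ins for poplar::Interval, poplar::Graph::TileToTensorMapping and a flattened poplar::Tensor, and transposeAll is a hypothetical helper, not part of popops. It assumes each matrix index appears in exactly one interval, and it only models the result of the transpositions, not how vertices are placed on tiles.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Stand-in for poplar::Interval: the half-open index range [begin, end).
struct Interval {
  std::size_t begin;
  std::size_t end;
};

// Stand-in for the tile mapping: one vector of intervals per tile.
using TileMapping = std::vector<std::vector<Interval>>;

// Stand-in for a tensor flattened to a single dimension.
using Flat = std::vector<float>;

// Stand-in for the getInOut callback: returns (in, out) for a matrix index.
using GetInOut = std::function<std::pair<const Flat *, Flat *>(std::size_t)>;

// Hypothetical host-side helper: for every matrix index listed in `mapping`,
// fetch the flattened (in, out) pair and write the transpose of the
// rows x cols matrix `in` into the cols x rows matrix `out`.
void transposeAll(const TileMapping &mapping, std::size_t rows,
                  std::size_t cols, const GetInOut &getInOut) {
  for (const auto &tileIntervals : mapping) {
    for (const Interval &interval : tileIntervals) {
      for (std::size_t index = interval.begin; index < interval.end; ++index) {
        const auto inOut = getInOut(index);
        const Flat &in = *inOut.first;
        Flat &out = *inOut.second;
        assert(in.size() == rows * cols && out.size() == rows * cols);
        for (std::size_t r = 0; r < rows; ++r)
          for (std::size_t c = 0; c < cols; ++c)
            out[c * rows + r] = in[r * cols + c];
      }
    }
  }
}
```

A tile with no matrices mapped to it simply has an empty vector of intervals, so the outer vector can still have one entry per tile.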

poplar::Tensor partialTranspose(poplar::Graph &graph, const poplar::Tensor &in, const poplar::ComputeSet &cs, const poplar::DebugContext &debugContext = {})

Transpose the innermost pair of dimensions of the specified tensor, writing the results to a new tensor.

This function assumes that the order of the underlying storage matches the order of the elements in the tensor. It is optimized for group sizes that are typical of the underlying memory layout of convolution activations and weights, and may be inefficient for other group sizes.
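
As an illustration of what transposing the innermost pair of dimensions means for the data, here is a host-side sketch on a flat row-major buffer. It does not call Poplar. transposeInnermostPair is a hypothetical name, and collapsing all outer dimensions into a single ‘outer’ count is a simplification made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model: `in` holds `outer` row-major matrices of
// shape rows x cols stored back to back. The result is a new buffer with
// shape [outer][cols][rows]; the input is left untouched.
std::vector<int> transposeInnermostPair(const std::vector<int> &in,
                                        std::size_t outer, std::size_t rows,
                                        std::size_t cols) {
  assert(in.size() == outer * rows * cols);
  std::vector<int> out(in.size());
  for (std::size_t o = 0; o < outer; ++o)
    for (std::size_t r = 0; r < rows; ++r)
      for (std::size_t c = 0; c < cols; ++c)
        out[(o * cols + c) * rows + r] = in[(o * rows + r) * cols + c];
  return out;
}
```

With outer = 2, rows = 2 and cols = 3, the input 0..11 becomes 0, 3, 1, 4, 2, 5, 6, 9, 7, 10, 8, 11: each 2 x 3 matrix is transposed independently.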

unsigned getMinimumRegroupGrainSize(const poplar::Type &type)

Get the smallest grouping that can be transposed for the given type using fast transposition codelets.

Parameters
  • type – The data type to be transposed.

Returns

The smallest size of grouping that can be efficiently transposed for the given type.

poplar::Tensor regroupTensor(poplar::Graph &graph, const poplar::Tensor &t, poplar::program::Sequence &copies, const poplar::ComputeSet &transposeCS, const poputil::GroupingInfo &from, const poputil::GroupingInfo &to, const poplar::DebugContext &debugContext = {})

Insert copies or other operations into the given programs/compute sets to transform the grouping found on the given tensor from ‘from’ to ‘to’.

This is a no-op for a one-dimensional tensor.

Parameters
  • graph – The graph to add the operation to.

  • t – The tensor to regroup.

  • copies – A poplar sequence to add pre-arranging copies to.

  • transposeCS – A compute set that may or may not have vertices added to it to perform the regrouping operation.

  • from – The grouping currently applied to the given tensor t, which is rearranged from.

  • to – A grouping wanted on the returned tensor.

  • debugContext – Optional debug information.

Returns

A tensor with the contents of t but laid out such that it has the grouping specified in ‘to’.
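
One way to picture a regrouping is as a change of memory layout that leaves the logical contents alone. The host-side sketch below does not use Poplar. It assumes a grouping can be modelled as a (dimension, group size) pair, meaning that many consecutive elements along that dimension are contiguous in memory. The order in which whole groups are laid out, and the names Grouping, offsetOf and regroup, are choices made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumed model of a grouping: (dimension, group size).
using Grouping = std::pair<unsigned, unsigned>;

// Memory offset of logical element (r, c) of a rows x cols tensor in which
// `g.second` consecutive elements along dimension `g.first` are contiguous.
// The ordering of whole groups is an arbitrary choice of this model.
std::size_t offsetOf(std::size_t r, std::size_t c, std::size_t rows,
                     std::size_t cols, Grouping g) {
  const std::size_t n = g.second;
  assert(n > 0);
  if (g.first == 0) {
    assert(rows % n == 0);
    return ((r / n) * cols + c) * n + (r % n);
  }
  assert(g.first == 1 && cols % n == 0);
  return ((c / n) * rows + r) * n + (c % n);
}

// Hypothetical host-side regroup: every logical element keeps its value but
// moves to the offset dictated by the `to` grouping.
std::vector<int> regroup(const std::vector<int> &in, std::size_t rows,
                         std::size_t cols, Grouping from, Grouping to) {
  assert(in.size() == rows * cols);
  std::vector<int> out(in.size());
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < cols; ++c)
      out[offsetOf(r, c, rows, cols, to)] = in[offsetOf(r, c, rows, cols, from)];
  return out;
}
```

In this model, moving a 2 x 4 tensor from grouping (0, 2) to grouping (1, 2) amounts to transposing 2 x 2 blocks, which is why a regrouping may add vertices to a transpose compute set as well as copies.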

poplar::Tensor regroupIfPossible(poplar::Graph &graph, const poplar::Tensor &t, poplar::program::Sequence &prog, const poputil::GroupingInfo &to, const poplar::DebugContext &debugContext = {})

Insert copies or other operations into the given program to transform the grouping found on the given tensor to the grouping ‘to’.

This is a no-op for a one-dimensional tensor, or if regrouping is not possible. Regrouping may be impossible or skipped if a grouping exists only on the requested dimension, or if no suitable grain size is available for efficient regrouping.

Parameters
  • graph – The graph to add the operation to.

  • t – The tensor to regroup.

  • prog – A poplar sequence to add any pre-arranging and other operations to.

  • to – A grouping wanted on the returned tensor.

  • debugContext – Optional debug information.

Returns

A tensor with the contents of t but laid out such that it has the grouping specified in ‘to’.

poplar::Tensor regroupTensor(poplar::Graph &graph, const poplar::Tensor &t, std::vector<poplar::program::Copy> &copies, const poplar::ComputeSet &transposeCS, const poputil::GroupingInfo &from, const poputil::GroupingInfo &to, const poplar::DebugContext &debugContext = {})

Insert copies or other operations into the given programs/compute sets to transform the grouping found on the given tensor from ‘from’ to ‘to’.

This is a no-op for a one-dimensional tensor.

Overload that takes a vector of Copy programs instead of a Sequence.

Parameters
  • graph – The graph to add the operation to.

  • t – The tensor to regroup.

  • copies – A vector to add pre-arranging copies to.

  • transposeCS – A compute set that may or may not have vertices added to it to perform the regrouping operation.

  • from – The grouping currently applied to the given tensor t, which is rearranged from.

  • to – A grouping wanted on the returned tensor.

  • debugContext – Optional debug information.

Returns

A tensor with the contents of t but laid out such that it has the grouping specified in ‘to’.

poplar::Tensor regroupIfBeneficial(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &ref, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {})

If possible and runtime efficient, add an operation to rearrange the given tensor in memory such that the grouping of the resulting tensor matches that of the reference tensor, or a factor of that grouping if it balances memory usage across the target better.

Parameters
  • graph – The graph to add the operation to.

  • in – The tensor to maybe regroup.

  • ref – A reference tensor which will be introspected to find a grouping to apply to the returned tensor.

  • prog – A poplar sequence to add the regrouping operation to.

  • debugContext – Optional debug information.

Returns

A tensor with the contents of the given tensor ‘in’, rearranged in memory to have a grouping matching ‘ref’.

poplar::Tensor regroupIfBeneficial(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &ref, std::vector<poplar::program::Copy> &copies, poplar::ComputeSet transposeCS, const poplar::DebugContext &debugContext = {})

If possible and runtime efficient, add an operation to rearrange the given tensor in memory such that the grouping of the resulting tensor matches that of the reference tensor, or a factor of that grouping if it balances memory usage across the target better.

Overload that takes a vector of Copy programs instead of a Sequence.

Parameters
  • graph – The graph to add the operation to.

  • in – The tensor to maybe regroup.

  • ref – A reference tensor which will be introspected to find a grouping to apply to the returned tensor.

  • copies – A vector to add pre-arranging copies to.

  • transposeCS – A compute set that may or may not have vertices added to it to perform the regrouping operation.

  • debugContext – Optional debug information.

Returns

A tensor with the contents of the given tensor ‘in’, rearranged in memory to have a grouping matching ‘ref’.

poplar::Tensor regroupIfBeneficial(poplar::Graph &graph, const poplar::Tensor &in, std::size_t preferredGrouping, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {})

If possible and runtime efficient, add an operation to rearrange the given tensor in memory such that the resulting tensor has a grouping in the innermost dimension equal to the given preferred grouping, or to a factor of it if that balances memory usage across the target better.

Parameters
  • graph – The graph to add the operation to.

  • in – The tensor to maybe regroup.

  • preferredGrouping – A size of grouping of the innermost dimension of the given tensor to regroup to.

  • prog – A poplar sequence to add the regrouping operation to.

  • debugContext – Optional debug information.

Returns

A tensor with the contents of the given tensor ‘in’, rearranged in memory to have a grouping in the innermost dimension matching ‘preferredGrouping’ or a factor of it.
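
To show what "a factor of the preferred grouping" covers, the sketch below lists the candidate group sizes for a given preferred grouping. groupingFactors is a hypothetical helper, not part of popops. Which candidate the library actually selects, and how it weighs memory balance across the target, is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: every factor of `preferredGrouping`, largest first.
// The preferred grouping itself is always the first candidate.
std::vector<std::size_t> groupingFactors(std::size_t preferredGrouping) {
  assert(preferredGrouping > 0);
  std::vector<std::size_t> factors;
  for (std::size_t g = preferredGrouping; g >= 1; --g) {
    if (preferredGrouping % g == 0)
      factors.push_back(g);
  }
  return factors;
}
```

For a preferred grouping of 16 the candidates are 16, 8, 4, 2 and 1. A caller might further discard candidates below the value returned by getMinimumRegroupGrainSize for the tensor's type, though that filtering step is an assumption rather than documented behaviour.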