Poplar and PopLibs
Support for dynamic sparse matrices.

Classes

class Partitioner
    Class to translate and encode sparsity information for a fully connected layer.
class PlanningCache
    Class used to cache the calculation of plans for dynamically sparse operations.
class SparseTensor
    Representation of a sparse tensor.
struct SparsityDataImpl
    Encoding of sparsity representation.
Enumerations

enum class SparsityType { Element, Block }
    Sparsity type.
enum class SparsityStructure
    Sparsity structure.
Functions

poplar::Tensor createIndicesTensor(poplar::Graph &graph, const FullyConnectedParams &params, std::size_t numIndices, const poplar::OptionFlags &options = {}, const poplar::DebugContext &debugContext = {})
    Create and map a tensor to contain indices for slicing/updating a tensor efficiently.
poplar::Tensor createSliceTensor(poplar::Graph &graph, const poplar::Type &dataType, const FullyConnectedParams &params, std::size_t numIndices, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create and map a tensor to be updated from efficiently.
poplar::Tensor embeddingSlice(poplar::Graph &graph, const SparseTensor &t, const poplar::Tensor &indices, poplar::program::Sequence &prog, const FullyConnectedParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Take multiple slices from a base tensor.
void embeddingUpdateAdd(poplar::Graph &graph, const SparseTensor &t, const poplar::Tensor &slices, const poplar::Tensor &indices, const poplar::Tensor &scale, poplar::program::Sequence &prog, const FullyConnectedParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Update a sparse tensor with a set of slices at the given row indices.
SparseTensor createFullyConnectedWeights(poplar::Graph &graph, const poplar::Type &inputType, const FullyConnectedParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create a sparse tensor that is used as the weights W for a fully connected layer.
poplar::Tensor createFullyConnectedInput(poplar::Graph &graph, const poplar::Type &inputType, const FullyConnectedParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create a dense tensor that is used as the input activations for a fully connected layer.
poplar::Tensor fullyConnectedFwd(poplar::Graph &graph, const SparseTensor &weights, const poplar::Tensor &activations, const FullyConnectedParams &fcParams, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Run a fully connected forward (or inference) pass.
poplar::Tensor fullyConnectedGradA(poplar::Graph &graph, const SparseTensor &weights, const poplar::Tensor &gradients, const FullyConnectedParams &fcParams, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Run a fully connected GradA pass.
poplar::Tensor fullyConnectedSparseGradW(poplar::Graph &graph, const poplar::Tensor sparsityMetaInfo, const poplar::Tensor &gradA, const poplar::Tensor &activations, const FullyConnectedParams &fcParams, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Run a fully connected GradW pass to compute sparse gradients.
std::tuple<unsigned, unsigned, unsigned> fullyConnectedDenseGradWSerialSplits(const poplar::Graph &graph, const poplar::Type &inputType, const FullyConnectedParams &fcParams, const poplar::OptionFlags &options_ = {}, PlanningCache *cache = nullptr)
    Report the serial splitting of a dense gradW output given the memory proportion limit specified in the options.
SparseTensor createSparseDenseMatMulLHS(poplar::Graph &graph, const poplar::Type &inputType, const MatMulParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create a sparse tensor that is used as the left-hand operand in a sparse * dense matrix multiplication.
poplar::Tensor createSparseDenseMatMulRHS(poplar::Graph &graph, const poplar::Type &inputType, const MatMulParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create a dense tensor that is used as the right-hand operand in a sparse * dense matrix multiplication.
poplar::Tensor sparseDenseMatMul(poplar::Graph &graph, const SparseTensor &lhs, const poplar::Tensor &rhs, poplar::program::Sequence &prog, bool transposeLHS = false, bool transposeRHS = false, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Perform a sparse * dense matrix multiplication, yielding a dense result.
Support for dynamic sparse matrices.
poplar::Tensor popsparse::dynamic::createFullyConnectedInput(
        poplar::Graph &graph,
        const poplar::Type &inputType,
        const FullyConnectedParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create a dense tensor that is used as the input activations for a fully connected layer.
The returned tensor is of shape [batchSize, inputChannelsPerGroup].
graph | The Poplar graph. |
inputType | The type for inputs to the operation. |
params | Parameters for the fully connected layer. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
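As a usage sketch (the wrapper function, debug string, and the popsparse/FullyConnected.hpp header name are assumptions of the example, not part of the API described here):

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    // Allocate dense activations whose tile mapping matches the plan chosen
    // for `params`. `graph`, `params` and `cache` are assumed to exist already.
    poplar::Tensor makeDenseInput(poplar::Graph &graph,
                                  const FullyConnectedParams &params,
                                  PlanningCache &cache) {
      return createFullyConnectedInput(graph, poplar::HALF, params,
                                       "fc/input", /*options=*/{}, &cache);
    }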
SparseTensor popsparse::dynamic::createFullyConnectedWeights(
        poplar::Graph &graph,
        const poplar::Type &inputType,
        const FullyConnectedParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create a sparse tensor that is used as the weights W for a fully connected layer.
The following options are available:
availableMemoryProportion
Decimal between 0 and 1 [=0.6]
The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation.
metaInfoBucketOversizeProportion
Decimal between 0 and 1 [=0.3]
This specifies additional elements to allocate in each bucket of meta-information as a proportion of the required size for a perfectly uniformly distributed sparsity pattern.
doGradAPass
(true, false) [=false]
doGradWPass
(true, false) [=false]
Indicate which passes are present for the operation of the layer as a whole. It is assumed that the forward pass is always present.
partialsType
poplar::Type [=poplar::FLOAT]
The type to use for partial results. If the type specified is smaller than the output type then the option is ignored and the output type is used instead.
sharedBuckets
(true, false) [=true]
If set, forces the same buckets to be used for all three passes.
graph | The Poplar graph. |
inputType | The type for inputs to the operation. |
params | Parameters for the fully connected layer. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. |
cache | Optional pointer to planning cache to use. |
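A hedged sketch of creating weights for a layer that also runs the GradA and GradW passes, using the option names listed above; the wrapper function and debug strings are illustrative only, and `graph`, `params` and `cache` are assumed to exist:

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    SparseTensor makeWeights(poplar::Graph &graph,
                             const FullyConnectedParams &params,
                             PlanningCache &cache) {
      poplar::OptionFlags options;
      options.set("availableMemoryProportion", "0.4");
      options.set("doGradAPass", "true");
      options.set("doGradWPass", "true");
      // The returned SparseTensor holds both the sparsity meta-information
      // and the non-zero values.
      return createFullyConnectedWeights(graph, poplar::HALF, params,
                                         "fc/weights", options, &cache);
    }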
poplar::Tensor popsparse::dynamic::createIndicesTensor(
        poplar::Graph &graph,
        const FullyConnectedParams &params,
        std::size_t numIndices,
        const poplar::OptionFlags &options = {},
        const poplar::DebugContext &debugContext = {})
Create and map a tensor to contain indices for slicing/updating a tensor efficiently.
graph | The Poplar graph. |
params | Parameters for the fully connected layer which defines the embedding operation. Used to decide on layout for the indices. |
options | Implementation options for the fully connected layer. |
numIndices | The number of indices this tensor should contain. |
debugContext | Optional debug information. |
Returns a tensor of shape [numIndices]. The element type is always UNSIGNED_INT.
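For illustration, a minimal sketch of allocating the indices tensor; the popsparse/Embedding.hpp header name and the wrapper function are assumptions of the example:

    #include <cstddef>

    #include <popsparse/Embedding.hpp>

    using namespace popsparse::dynamic;

    // Allocate a tensor of `numIndices` row indices for an embedding lookup.
    // `graph` and `params` are assumed to exist already.
    poplar::Tensor makeIndices(poplar::Graph &graph,
                               const FullyConnectedParams &params,
                               std::size_t numIndices) {
      return createIndicesTensor(graph, params, numIndices, /*options=*/{},
                                 "embedding/indices");
    }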
poplar::Tensor popsparse::dynamic::createSliceTensor(
        poplar::Graph &graph,
        const poplar::Type &dataType,
        const FullyConnectedParams &params,
        std::size_t numIndices,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create and map a tensor to be updated from efficiently.
Memory layout is based on the planned split of the sparse tensor.
graph | The Poplar graph. |
dataType | The data type of the returned tensor. |
params | Parameters for the fully connected layer which will provide the planned memory layout for the sparse tensor being updated. |
numIndices | The number of slices this tensor should contain. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. |
cache | Optional pointer to planning cache to use. |
Returns a tensor of shape [numIndices, params.getInputChannels()] with a layout optimised for slicing into/updating from.
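A similar sketch for the slice tensor, again assuming the popsparse/Embedding.hpp header name and pre-built graph, params, and cache objects:

    #include <cstddef>

    #include <popsparse/Embedding.hpp>

    using namespace popsparse::dynamic;

    // Allocate the dense tensor that slices are written to / read from.
    poplar::Tensor makeSlices(poplar::Graph &graph,
                              const FullyConnectedParams &params,
                              std::size_t numIndices,
                              PlanningCache &cache) {
      return createSliceTensor(graph, poplar::HALF, params, numIndices,
                               "embedding/slices", /*options=*/{}, &cache);
    }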
SparseTensor popsparse::dynamic::createSparseDenseMatMulLHS(
        poplar::Graph &graph,
        const poplar::Type &inputType,
        const MatMulParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create a sparse tensor that is used as the left-hand operand in a sparse * dense matrix multiplication.
The following options are available:
availableMemoryProportion
Decimal between 0 and 1 [=0.6]
The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation.
metaInfoBucketOversizeProportion
Decimal between 0 and 1 [=0.3]
This specifies additional elements to allocate in each bucket of meta-information as a proportion of the required size for a perfectly uniformly distributed sparsity pattern.
partialsType
poplar::Type [=poplar::FLOAT]
The type to use for partial results.
sharedBuckets
(true, false) [=true]
If set, forces the same buckets to be used whether or not the sparse (left-hand) operand is transposed. This saves memory at the expense of runtime.
graph | The Poplar graph. |
inputType | The type for inputs to the operation. |
params | Parameters for the matrix multiplication. |
debugContext | Optional debug information. |
options | Implementation options for the matrix multiplication. |
cache | Optional pointer to planning cache to use. |
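A sketch of allocating the sparse left-hand operand; construction of MatMulParams is not covered on this page, so the `params` object, the wrapper, and the popsparse/MatMul.hpp header name are assumptions:

    #include <popsparse/MatMul.hpp>

    using namespace popsparse::dynamic;

    // Allocate the sparse LHS of a sparse * dense matmul, using option names
    // listed above.
    SparseTensor makeLHS(poplar::Graph &graph, const MatMulParams &params,
                         PlanningCache &cache) {
      poplar::OptionFlags options;
      options.set("availableMemoryProportion", "0.4");
      options.set("metaInfoBucketOversizeProportion", "0.2");
      return createSparseDenseMatMulLHS(graph, poplar::HALF, params,
                                        "matmul/lhs", options, &cache);
    }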
poplar::Tensor popsparse::dynamic::createSparseDenseMatMulRHS(
        poplar::Graph &graph,
        const poplar::Type &inputType,
        const MatMulParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create a dense tensor that is used as the right-hand operand in a sparse * dense matrix multiplication.
graph | The Poplar graph. |
inputType | The type for inputs to the operation. |
params | Parameters for the matrix multiplication. |
debugContext | Optional debug information. |
options | Implementation options for the matrix multiplication. |
cache | Optional pointer to planning cache to use. |
poplar::Tensor popsparse::dynamic::embeddingSlice(
        poplar::Graph &graph,
        const SparseTensor &t,
        const poplar::Tensor &indices,
        poplar::program::Sequence &prog,
        const FullyConnectedParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Take multiple slices from a base tensor.
The returned tensor will have dimensions [offsets, k (from params)].
graph | The Poplar graph. |
t | The sparse tensor being sliced. |
indices | The indices of rows of t to be sliced. |
prog | The program to be extended. |
params | Parameters for the fully connected layer which will provide the planned memory layout for the sparse tensor being sliced. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. |
cache | Optional pointer to planning cache to use. |
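An illustrative sketch of a gather over the rows of a sparse tensor; all inputs are assumed to have been created with the allocation functions above, and the header name is an assumption:

    #include <popsparse/Embedding.hpp>

    using namespace popsparse::dynamic;

    // Gather rows of the sparse tensor `t` at positions given by `indices`
    // (created by createIndicesTensor()).
    poplar::Tensor gatherRows(poplar::Graph &graph, const SparseTensor &t,
                              const poplar::Tensor &indices,
                              poplar::program::Sequence &prog,
                              const FullyConnectedParams &params,
                              PlanningCache &cache) {
      return embeddingSlice(graph, t, indices, prog, params,
                            "embedding/slice", /*options=*/{}, &cache);
    }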
void popsparse::dynamic::embeddingUpdateAdd(
        poplar::Graph &graph,
        const SparseTensor &t,
        const poplar::Tensor &slices,
        const poplar::Tensor &indices,
        const poplar::Tensor &scale,
        poplar::program::Sequence &prog,
        const FullyConnectedParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Update a sparse tensor with a set of slices at the given row indices.
graph | The Poplar graph. |
t | The sparse tensor being updated. |
slices | The slices to accumulate. |
indices | The indices of rows of t to accumulate each slice in slices into. |
scale | The scaling to apply to the update. |
prog | The program to be extended. |
params | Parameters for the fully connected layer which will provide the planned memory layout for the sparse tensor being updated. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. |
cache | Optional pointer to planning cache to use. |
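A sketch of the corresponding scaled scatter-add; the single-element scale tensor and the other inputs are assumed to exist already, and the header name is an assumption:

    #include <popsparse/Embedding.hpp>

    using namespace popsparse::dynamic;

    // Accumulate `slices` (scaled by `scale`) into rows of the sparse tensor `t`.
    void scatterAddRows(poplar::Graph &graph, const SparseTensor &t,
                        const poplar::Tensor &slices,
                        const poplar::Tensor &indices,
                        const poplar::Tensor &scale,
                        poplar::program::Sequence &prog,
                        const FullyConnectedParams &params,
                        PlanningCache &cache) {
      embeddingUpdateAdd(graph, t, slices, indices, scale, prog, params,
                         "embedding/updateAdd", /*options=*/{}, &cache);
    }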
std::tuple<unsigned, unsigned, unsigned> popsparse::dynamic::fullyConnectedDenseGradWSerialSplits(
        const poplar::Graph &graph,
        const poplar::Type &inputType,
        const FullyConnectedParams &fcParams,
        const poplar::OptionFlags &options_ = {},
        PlanningCache *cache = nullptr)
Report the serial splitting of a dense gradW output given the memory proportion limit specified in the options.
A dense gradW output is of shape [numGroups][inputSize][outputSize].
graph | The Poplar graph. |
inputType | The type of input. |
fcParams | Fully connected parameters. |
options | The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
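A sketch of querying the reported splits; the variable names assume the tuple follows the [numGroups][inputSize][outputSize] order of the output, which should be checked against the header:

    #include <iostream>
    #include <tuple>

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    // Query how a dense gradW output would be serially split under the
    // current availableMemoryProportion setting.
    void reportGradWSplits(const poplar::Graph &graph,
                           const FullyConnectedParams &fcParams) {
      unsigned groupSplits, inputSplits, outputSplits;
      std::tie(groupSplits, inputSplits, outputSplits) =
          fullyConnectedDenseGradWSerialSplits(graph, poplar::HALF, fcParams);
      std::cout << "gradW serial splits: " << groupSplits << " x "
                << inputSplits << " x " << outputSplits << "\n";
    }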
poplar::Tensor popsparse::dynamic::fullyConnectedFwd(
        poplar::Graph &graph,
        const SparseTensor &weights,
        const poplar::Tensor &activations,
        const FullyConnectedParams &fcParams,
        poplar::program::Sequence &prog,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Run a fully connected forward (or inference) pass.
The sparse weights tensor is made up of meta information for the sparsity and the non-zero values. Performs the Fwd operation described in the Note above, but with the input and output transposed.
The meta information for the sparse weights tensor must be created for the forward (or inference) pass and should be created by use of the createFullyConnectedWeights() function.
graph | The Poplar graph. |
weights | Sparsity information of the weights tensor. |
activations | The dense activations, of shape [batchSize][inputChannelsPerGroup * numGroups]. |
fcParams | Fully connected layer parameters. |
prog | A reference to a program sequence which will be appended with the code to perform the forward operation. |
debugContext | Optional debug information. |
options | The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
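A minimal forward-pass sketch, assuming the weights and activations were allocated with createFullyConnectedWeights() and createFullyConnectedInput(); the wrapper and debug string are illustrative only:

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    poplar::Tensor fwd(poplar::Graph &graph, const SparseTensor &weights,
                       const poplar::Tensor &activations,
                       const FullyConnectedParams &fcParams,
                       poplar::program::Sequence &prog,
                       PlanningCache &cache) {
      // Appends the forward-pass program to `prog` and returns the dense output.
      return fullyConnectedFwd(graph, weights, activations, fcParams, prog,
                               "fc/fwd", /*options=*/{}, &cache);
    }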
poplar::Tensor popsparse::dynamic::fullyConnectedGradA(
        poplar::Graph &graph,
        const SparseTensor &weights,
        const poplar::Tensor &gradients,
        const FullyConnectedParams &fcParams,
        poplar::program::Sequence &prog,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Run a fully connected GradA pass.
The sparse weights tensor is made up of meta information for the sparsity and the non-zero values. Performs the GradA computation described in the Note above, but with the input and output transposed.
The meta information for the sparse weights tensor must be created for the GradA pass and should be created using the createFullyConnectedWeights() function.
graph | The Poplar graph. |
weights | Sparsity information of the weights tensor. |
gradients | The dense loss gradients with respect to the output activations, of shape [batchSize][outputChannelsPerGroup]. |
fcParams | Fully connected layer parameters. |
prog | A reference to a program sequence which will be appended with the code to perform the GradA operation. |
debugContext | Optional debug information. |
options | The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
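The GradA pass follows the same pattern; this sketch assumes the weights were planned with the doGradAPass option enabled, and the wrapper name is hypothetical:

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    // Propagate loss gradients back through the sparse weights.
    // `gradients` has shape [batchSize][outputChannelsPerGroup].
    poplar::Tensor gradA(poplar::Graph &graph, const SparseTensor &weights,
                         const poplar::Tensor &gradients,
                         const FullyConnectedParams &fcParams,
                         poplar::program::Sequence &prog,
                         PlanningCache &cache) {
      return fullyConnectedGradA(graph, weights, gradients, fcParams, prog,
                                 "fc/gradA", /*options=*/{}, &cache);
    }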
poplar::Tensor popsparse::dynamic::fullyConnectedSparseGradW(
        poplar::Graph &graph,
        const poplar::Tensor sparsityMetaInfo,
        const poplar::Tensor &gradA,
        const poplar::Tensor &activations,
        const FullyConnectedParams &fcParams,
        poplar::program::Sequence &prog,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Run a fully connected GradW pass to compute sparse gradients.
The layout of the returned tensor is exactly as that of the representation of the weights NZ values so that any elementwise operation may be done between the two.
The actual implementation differs from that in the Note above in that the transposes of the gradients and activations are supplied as parameters to this function.
graph | The Poplar graph. |
sparsityMetaInfo | Meta information for the sparse weights. See the SparseTensor representation. |
gradA | Dense gradients with respect to the output activations, of shape [batchSize][outputChannelsPerGroup * numGroups]. |
activations | Input activations, of shape [batchSize][inputChannelsPerGroup * numGroups]. |
fcParams | Fully connected layer parameters. |
prog | A reference to a program sequence which will be appended with the code to perform the GradW operation. |
debugContext | Optional debug information. |
options | The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
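A sketch of the GradW pass; weights.getMetaInfoTensor() is assumed here to be the accessor that exposes the meta-information tensor, so check the SparseTensor class for the exact name:

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    // Compute gradients only for the non-zero weight positions. The result has
    // the same layout as the weights' NZ values, so elementwise ops between the
    // two are possible.
    poplar::Tensor sparseGradW(poplar::Graph &graph, const SparseTensor &weights,
                               const poplar::Tensor &gradA,
                               const poplar::Tensor &activations,
                               const FullyConnectedParams &fcParams,
                               poplar::program::Sequence &prog,
                               PlanningCache &cache) {
      return fullyConnectedSparseGradW(graph, weights.getMetaInfoTensor(), gradA,
                                       activations, fcParams, prog,
                                       "fc/gradW", /*options=*/{}, &cache);
    }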
poplar::Tensor popsparse::dynamic::sparseDenseMatMul(
        poplar::Graph &graph,
        const SparseTensor &lhs,
        const poplar::Tensor &rhs,
        poplar::program::Sequence &prog,
        bool transposeLHS = false,
        bool transposeRHS = false,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Perform a sparse * dense matrix multiplication, yielding a dense result.
The sparse left-hand operand tensor is made up of meta information for the sparsity and the non-zero values of the matrix. This sparse tensor must have been created with createSparseDenseMatMulLHS.
If the sparse left-hand operand was created for the sparse equivalent of a dense matrix multiplication:
[groups][m][k] * [groups][k][n] = [groups][m][n]
Then the same sparse left-hand operand can be used to calculate the above as well as:
[groups][k][m] * [groups][m][n] = [groups][k][n]
through the use of the transposeLHS parameter. transposeRHS is also provided for convenience.
graph | The Poplar graph. |
lhs | The sparse left-hand operand to the matrix multiplication. |
rhs | The dense right-hand operand to the matrix multiplication. |
prog | A reference to a program sequence which will be appended with the code to perform the matrix multiplication. |
transposeLHS | Whether or not to transpose the left-hand operand before multiplying. |
transposeRHS | Whether or not to transpose the right-hand operand before multiplying. |
debugContext | Optional debug information. |
options | Implementation options for the matrix multiplication. |
cache | Optional pointer to planning cache to use. |
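Putting the matrix multiplication pieces together, a hedged end-to-end sketch (writing the non-zero values into lhs and the data into rhs is elided, and the wrapper, debug strings, and popsparse/MatMul.hpp header name are assumptions):

    #include <popsparse/MatMul.hpp>

    using namespace popsparse::dynamic;

    // Sparse * dense matmul: both operands are allocated from the same
    // MatMulParams so their layouts match the chosen plan.
    poplar::Tensor sparseTimesDense(poplar::Graph &graph,
                                    const MatMulParams &params,
                                    poplar::program::Sequence &prog,
                                    PlanningCache &cache) {
      SparseTensor lhs = createSparseDenseMatMulLHS(graph, poplar::HALF, params,
                                                    "mm/lhs", {}, &cache);
      poplar::Tensor rhs = createSparseDenseMatMulRHS(graph, poplar::HALF, params,
                                                      "mm/rhs", {}, &cache);
      // ... populate lhs (meta information and NZ values) and rhs here ...
      return sparseDenseMatMul(graph, lhs, rhs, prog,
                               /*transposeLHS=*/false, /*transposeRHS=*/false,
                               "mm/multiply", {}, &cache);
    }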