Poplar and PopLibs
Support for dynamic sparse matrices.

Classes

class Partitioner
    Class to translate and encode sparsity information for a fully connected layer.
class PlanningCache
    Class used to cache the calculation of plans for dynamically sparse operations.
class SparseTensor
    Representation of a sparse tensor.
struct SparsityDataImpl
    Encoding of sparsity representation.
Enumerations

enum class SparsityType { Element, Block }
    Sparsity type.
enum class SparsityStructure
    Sparsity structure.
Functions

poplar::Tensor createIndicesTensor(poplar::Graph &graph, const FullyConnectedParams &params, std::size_t numIndices, const poplar::OptionFlags &options = {}, const poplar::DebugContext &debugContext = {})
    Create and map a tensor to contain indices for slicing/updating a tensor efficiently.
poplar::Tensor createSliceTensor(poplar::Graph &graph, const poplar::Type &dataType, const FullyConnectedParams &params, std::size_t numIndices, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create and map a tensor to be updated from efficiently.
poplar::Tensor embeddingSlice(poplar::Graph &graph, const SparseTensor &t, const poplar::Tensor &indices, poplar::program::Sequence &prog, const FullyConnectedParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Take multiple slices from a base tensor.
void embeddingUpdateAdd(poplar::Graph &graph, const SparseTensor &t, const poplar::Tensor &slices, const poplar::Tensor &indices, const poplar::Tensor &scale, poplar::program::Sequence &prog, const FullyConnectedParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Update a sparse tensor with a set of slices at the given row indices.
SparseTensor createFullyConnectedWeights(poplar::Graph &graph, const poplar::Type &inputType, const FullyConnectedParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create a sparse tensor that is used as the weights W for a fully connected layer.
poplar::Tensor createFullyConnectedInput(poplar::Graph &graph, const poplar::Type &inputType, const FullyConnectedParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create a dense tensor that is used as the input activations for a fully connected layer.
poplar::Tensor fullyConnectedFwd(poplar::Graph &graph, const SparseTensor &weights, const poplar::Tensor &activations, const FullyConnectedParams &fcParams, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Run a fully connected forward (or inference) pass.
poplar::Tensor fullyConnectedGradA(poplar::Graph &graph, const SparseTensor &weights, const poplar::Tensor &gradients, const FullyConnectedParams &fcParams, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Run a fully connected GradA pass.
poplar::Tensor fullyConnectedSparseGradW(poplar::Graph &graph, const poplar::Tensor sparsityMetaInfo, const poplar::Tensor &gradA, const poplar::Tensor &activations, const FullyConnectedParams &fcParams, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Run a fully connected GradW pass to compute sparse gradients.
std::tuple<unsigned, unsigned, unsigned> fullyConnectedDenseGradWSerialSplits(const poplar::Graph &graph, const poplar::Type &inputType, const FullyConnectedParams &fcParams, const poplar::OptionFlags &options_ = {}, PlanningCache *cache = nullptr)
    Report the serial splitting of a dense gradW output given the memory proportion limit specified in the options.
SparseTensor createSparseDenseMatMulLHS(poplar::Graph &graph, const poplar::Type &inputType, const MatMulParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create a sparse tensor that is used as the left-hand operand in a sparse * dense matrix multiplication.
poplar::Tensor createSparseDenseMatMulRHS(poplar::Graph &graph, const poplar::Type &inputType, const MatMulParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Create a dense tensor that is used as the right-hand operand in a sparse * dense matrix multiplication.
poplar::Tensor sparseDenseMatMul(poplar::Graph &graph, const SparseTensor &lhs, const poplar::Tensor &rhs, poplar::program::Sequence &prog, bool transposeLHS = false, bool transposeRHS = false, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
    Perform a sparse * dense matrix multiplication, yielding a dense result.
Support for dynamic sparse matrices.
poplar::Tensor popsparse::dynamic::createFullyConnectedInput(
        poplar::Graph &graph,
        const poplar::Type &inputType,
        const FullyConnectedParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create a dense tensor that is used as the input activations for a fully connected layer.
The returned tensor is of shape [batchSize, inputChannelsPerGroup].
graph | The Poplar graph. |
inputType | The type for inputs to the operation. |
params | Parameters for the fully connected layer. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
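As a usage sketch (the wrapper function, debug string, and the popsparse/FullyConnected.hpp header name are assumptions of the example, not part of the API described here):

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    // Allocate dense activations whose tile mapping matches the plan chosen
    // for `params`. `graph`, `params` and `cache` are assumed to exist already.
    poplar::Tensor makeDenseInput(poplar::Graph &graph,
                                  const FullyConnectedParams &params,
                                  PlanningCache &cache) {
      return createFullyConnectedInput(graph, poplar::HALF, params,
                                       "fc/input", /*options=*/{}, &cache);
    }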
SparseTensor popsparse::dynamic::createFullyConnectedWeights(
        poplar::Graph &graph,
        const poplar::Type &inputType,
        const FullyConnectedParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create a sparse tensor that is used as the weights W for a fully connected layer.
The following options are available:
availableMemoryProportion
Decimal between 0 and 1 [=0.6]
The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation.
metaInfoBucketOversizeProportion
Decimal between 0 and 1 [=0.3]
This specifies additional elements to allocate in each bucket of meta-information as a proportion of the required size for a perfectly uniformly distributed sparsity pattern.
doGradAPass
(true, false) [=false]
doGradWPass
(true, false) [=false]
Indicate which passes are present for the operation of the layer as a whole. It is assumed that the forward pass is always present.
partialsType
poplar::Type [=poplar::FLOAT]
The type to use for partial results. If the type specified is smaller than the output type then the option is ignored and the output type is used instead.
sharedBuckets
(true, false) [=true]
If set, forces the same buckets to be used for all three passes.
graph | The Poplar graph. |
inputType | The type for inputs to the operation. |
params | Parameters for the fully connected layer. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. |
cache | Optional pointer to planning cache to use. |
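A hedged sketch of creating weights for a layer that also runs the GradA and GradW passes, using the option names listed above; the wrapper function and debug strings are illustrative only, and `graph`, `params` and `cache` are assumed to exist:

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    SparseTensor makeWeights(poplar::Graph &graph,
                             const FullyConnectedParams &params,
                             PlanningCache &cache) {
      poplar::OptionFlags options;
      options.set("availableMemoryProportion", "0.4");
      options.set("doGradAPass", "true");
      options.set("doGradWPass", "true");
      // The returned SparseTensor holds both the sparsity meta-information
      // and the non-zero values.
      return createFullyConnectedWeights(graph, poplar::HALF, params,
                                         "fc/weights", options, &cache);
    }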
poplar::Tensor popsparse::dynamic::createIndicesTensor(
        poplar::Graph &graph,
        const FullyConnectedParams &params,
        std::size_t numIndices,
        const poplar::OptionFlags &options = {},
        const poplar::DebugContext &debugContext = {})
Create and map a tensor to contain indices for slicing/updating a tensor efficiently.
graph | The Poplar graph. |
params | Parameters for the fully connected layer which defines the embedding operation. Used to decide on layout for the indices. |
options | Implementation options for the fully connected layer. |
numIndices | The number of indices this tensor should contain. |
debugContext | Optional debug information. |
Returns a tensor of shape [numIndices]. The element type is always UNSIGNED_INT.
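For illustration, a minimal sketch of allocating the indices tensor; the popsparse/Embedding.hpp header name and the wrapper function are assumptions of the example:

    #include <cstddef>

    #include <popsparse/Embedding.hpp>

    using namespace popsparse::dynamic;

    // Allocate a tensor of `numIndices` row indices for an embedding lookup.
    // `graph` and `params` are assumed to exist already.
    poplar::Tensor makeIndices(poplar::Graph &graph,
                               const FullyConnectedParams &params,
                               std::size_t numIndices) {
      return createIndicesTensor(graph, params, numIndices, /*options=*/{},
                                 "embedding/indices");
    }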
poplar::Tensor popsparse::dynamic::createSliceTensor(
        poplar::Graph &graph,
        const poplar::Type &dataType,
        const FullyConnectedParams &params,
        std::size_t numIndices,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create and map a tensor to be updated from efficiently.
Memory layout is based on the planned split of the sparse tensor.
graph | The Poplar graph. |
dataType | The data type of the returned tensor. |
params | Parameters for the fully connected layer which will provide the planned memory layout for the sparse tensor being updated. |
numIndices | The number of slices this tensor should contain. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. |
cache | Optional pointer to planning cache to use. |
Returns a tensor of shape [numIndices, params.getInputChannels()] with a layout optimised for slicing into/updating from.
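A similar sketch for the slice tensor, again assuming the popsparse/Embedding.hpp header name and pre-built graph, params, and cache objects:

    #include <cstddef>

    #include <popsparse/Embedding.hpp>

    using namespace popsparse::dynamic;

    // Allocate the dense tensor that slices are written to / read from.
    poplar::Tensor makeSlices(poplar::Graph &graph,
                              const FullyConnectedParams &params,
                              std::size_t numIndices,
                              PlanningCache &cache) {
      return createSliceTensor(graph, poplar::HALF, params, numIndices,
                               "embedding/slices", /*options=*/{}, &cache);
    }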
SparseTensor popsparse::dynamic::createSparseDenseMatMulLHS(
        poplar::Graph &graph,
        const poplar::Type &inputType,
        const MatMulParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create a sparse tensor that is used as the left-hand operand in a sparse * dense matrix multiplication.
The following options are available:
availableMemoryProportion
Decimal between 0 and 1 [=0.6]
The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation.
metaInfoBucketOversizeProportion
Decimal between 0 and 1 [=0.3]
This specifies additional elements to allocate in each bucket of meta-information as a proportion of the required size for a perfectly uniformly distributed sparsity pattern.
partialsType
poplar::Type [=poplar::FLOAT]
The type to use for partial results.
sharedBuckets
(true, false) [=true]
If set, forces the same buckets to be used whether or not the sparse (left-hand) operand is transposed. This saves memory at the expense of runtime.
graph | The Poplar graph. |
inputType | The type for inputs to the operation. |
params | Parameters for the matrix multiplication. |
debugContext | Optional debug information. |
options | Implementation options for the matrix multiplication. |
cache | Optional pointer to planning cache to use. |
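A sketch of allocating the sparse left-hand operand; construction of MatMulParams is not covered on this page, so the `params` object, the wrapper, and the popsparse/MatMul.hpp header name are assumptions:

    #include <popsparse/MatMul.hpp>

    using namespace popsparse::dynamic;

    // Allocate the sparse LHS of a sparse * dense matmul, using option names
    // listed above.
    SparseTensor makeLHS(poplar::Graph &graph, const MatMulParams &params,
                         PlanningCache &cache) {
      poplar::OptionFlags options;
      options.set("availableMemoryProportion", "0.4");
      options.set("metaInfoBucketOversizeProportion", "0.2");
      return createSparseDenseMatMulLHS(graph, poplar::HALF, params,
                                        "matmul/lhs", options, &cache);
    }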
poplar::Tensor popsparse::dynamic::createSparseDenseMatMulRHS(
        poplar::Graph &graph,
        const poplar::Type &inputType,
        const MatMulParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Create a dense tensor that is used as the right-hand operand in a sparse * dense matrix multiplication.
graph | The Poplar graph. |
inputType | The type for inputs to the operation. |
params | Parameters for the matrix multiplication. |
debugContext | Optional debug information. |
options | Implementation options for the matrix multiplication. |
cache | Optional pointer to planning cache to use. |
poplar::Tensor popsparse::dynamic::embeddingSlice(
        poplar::Graph &graph,
        const SparseTensor &t,
        const poplar::Tensor &indices,
        poplar::program::Sequence &prog,
        const FullyConnectedParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Take multiple slices from a base tensor.
The returned tensor will have dimensions [offsets, k (from params)].
graph | The Poplar graph. |
t | The sparse tensor being sliced. |
indices | The indices of rows of t to be sliced. |
prog | The program to be extended. |
params | Parameters for the fully connected layer which will provide the planned memory layout for the sparse tensor being sliced. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. |
cache | Optional pointer to planning cache to use. |
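An illustrative sketch of a gather over the rows of a sparse tensor; all inputs are assumed to have been created with the allocation functions above, and the header name is an assumption:

    #include <popsparse/Embedding.hpp>

    using namespace popsparse::dynamic;

    // Gather rows of the sparse tensor `t` at positions given by `indices`
    // (created by createIndicesTensor()).
    poplar::Tensor gatherRows(poplar::Graph &graph, const SparseTensor &t,
                              const poplar::Tensor &indices,
                              poplar::program::Sequence &prog,
                              const FullyConnectedParams &params,
                              PlanningCache &cache) {
      return embeddingSlice(graph, t, indices, prog, params,
                            "embedding/slice", /*options=*/{}, &cache);
    }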
void popsparse::dynamic::embeddingUpdateAdd(
        poplar::Graph &graph,
        const SparseTensor &t,
        const poplar::Tensor &slices,
        const poplar::Tensor &indices,
        const poplar::Tensor &scale,
        poplar::program::Sequence &prog,
        const FullyConnectedParams &params,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Update a sparse tensor with a set of slices at the given row indices.
graph | The Poplar graph. |
t | The sparse tensor being updated. |
slices | The slices to accumulate. |
indices | The indices of rows of t to accumulate each slice in slices into. |
scale | The scaling to apply to the update. |
prog | The program to be extended. |
params | Parameters for the fully connected layer which will provide the planned memory layout for the sparse tensor being updated. |
debugContext | Optional debug information. |
options | Implementation options for the fully connected layer. |
cache | Optional pointer to planning cache to use. |
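A sketch of the corresponding scaled scatter-add; the single-element scale tensor and the other inputs are assumed to exist already, and the header name is an assumption:

    #include <popsparse/Embedding.hpp>

    using namespace popsparse::dynamic;

    // Accumulate `slices` (scaled by `scale`) into rows of the sparse tensor `t`.
    void scatterAddRows(poplar::Graph &graph, const SparseTensor &t,
                        const poplar::Tensor &slices,
                        const poplar::Tensor &indices,
                        const poplar::Tensor &scale,
                        poplar::program::Sequence &prog,
                        const FullyConnectedParams &params,
                        PlanningCache &cache) {
      embeddingUpdateAdd(graph, t, slices, indices, scale, prog, params,
                         "embedding/updateAdd", /*options=*/{}, &cache);
    }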
std::tuple<unsigned, unsigned, unsigned> popsparse::dynamic::fullyConnectedDenseGradWSerialSplits(
        const poplar::Graph &graph,
        const poplar::Type &inputType,
        const FullyConnectedParams &fcParams,
        const poplar::OptionFlags &options_ = {},
        PlanningCache *cache = nullptr)
Report the serial splitting of a dense gradW output given the memory proportion limit specified in the options.
A dense gradW output is of shape [numGroups][inputSize][outputSize].
graph | The Poplar graph. |
inputType | The type of input. |
fcParams | Fully connected parameters. |
options | The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
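A sketch of querying the reported splits; the variable names assume the tuple follows the [numGroups][inputSize][outputSize] order of the output, which should be checked against the header:

    #include <iostream>
    #include <tuple>

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    // Query how a dense gradW output would be serially split under the
    // current availableMemoryProportion setting.
    void reportGradWSplits(const poplar::Graph &graph,
                           const FullyConnectedParams &fcParams) {
      unsigned groupSplits, inputSplits, outputSplits;
      std::tie(groupSplits, inputSplits, outputSplits) =
          fullyConnectedDenseGradWSerialSplits(graph, poplar::HALF, fcParams);
      std::cout << "gradW serial splits: " << groupSplits << " x "
                << inputSplits << " x " << outputSplits << "\n";
    }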
poplar::Tensor popsparse::dynamic::fullyConnectedFwd(
        poplar::Graph &graph,
        const SparseTensor &weights,
        const poplar::Tensor &activations,
        const FullyConnectedParams &fcParams,
        poplar::program::Sequence &prog,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Run a fully connected forward (or inference) pass.
The sparse weights tensor is made up of meta information for the sparsity and the non-zero values. Performs the Fwd operation described in the Note above, but with the input and output transposed.
The meta information for the sparse weights tensor must be created for the forward (or inference) pass and should be created by use of the createFullyConnectedWeights() function.
graph | The Poplar graph. |
weights | Sparsity information of the weights tensor. |
activations | The dense activations, of shape [batchSize][inputChannelsPerGroup * numGroups]. |
fcParams | Fully connected layer parameters. |
prog | A reference to a program sequence which will be appended with the code to perform the forward operation. |
debugContext | Optional debug information. |
options | The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
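A minimal forward-pass sketch, assuming the weights and activations were allocated with createFullyConnectedWeights() and createFullyConnectedInput(); the wrapper and debug string are illustrative only:

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    poplar::Tensor fwd(poplar::Graph &graph, const SparseTensor &weights,
                       const poplar::Tensor &activations,
                       const FullyConnectedParams &fcParams,
                       poplar::program::Sequence &prog,
                       PlanningCache &cache) {
      // Appends the forward-pass program to `prog` and returns the dense output.
      return fullyConnectedFwd(graph, weights, activations, fcParams, prog,
                               "fc/fwd", /*options=*/{}, &cache);
    }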
poplar::Tensor popsparse::dynamic::fullyConnectedGradA(
        poplar::Graph &graph,
        const SparseTensor &weights,
        const poplar::Tensor &gradients,
        const FullyConnectedParams &fcParams,
        poplar::program::Sequence &prog,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Run a fully connected GradA pass.
The sparse weights tensor is made up of meta information for the sparsity and the non-zero values. Performs the GradA computation described in the Note above, but with the input and output transposed.
The meta information for the sparse weights tensor must be created for the GradA pass and should be created using the createFullyConnectedWeights() function.
graph | The Poplar graph. |
weights | Sparsity information of the weights tensor. |
gradients | The dense loss gradients with respect to the output activations, of shape [batchSize][outputChannelsPerGroup]. |
fcParams | Fully connected layer parameters. |
prog | A reference to a program sequence which will be appended with the code to perform the GradA operation. |
debugContext | Optional debug information. |
options | The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
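The GradA pass follows the same pattern; this sketch assumes the weights were planned with the doGradAPass option enabled, and the wrapper name is hypothetical:

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    // Propagate loss gradients back through the sparse weights.
    // `gradients` has shape [batchSize][outputChannelsPerGroup].
    poplar::Tensor gradA(poplar::Graph &graph, const SparseTensor &weights,
                         const poplar::Tensor &gradients,
                         const FullyConnectedParams &fcParams,
                         poplar::program::Sequence &prog,
                         PlanningCache &cache) {
      return fullyConnectedGradA(graph, weights, gradients, fcParams, prog,
                                 "fc/gradA", /*options=*/{}, &cache);
    }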
poplar::Tensor popsparse::dynamic::fullyConnectedSparseGradW(
        poplar::Graph &graph,
        const poplar::Tensor sparsityMetaInfo,
        const poplar::Tensor &gradA,
        const poplar::Tensor &activations,
        const FullyConnectedParams &fcParams,
        poplar::program::Sequence &prog,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Run a fully connected GradW pass to compute sparse gradients.
The layout of the returned tensor is exactly as that of the representation of the weights NZ values so that any elementwise operation may be done between the two.
The actual implementation differs from that in the Note above in that the transposes of the gradients and activations are supplied as parameters to this function.
graph | The Poplar graph. |
sparsityMetaInfo | Meta information for the sparse weights. See the SparseTensor representation. |
gradA | Dense gradients with respect to the output activations, of shape [batchSize][outputChannelsPerGroup * numGroups]. |
activations | Input activations, of shape [batchSize][inputChannelsPerGroup * numGroups]. |
fcParams | Fully connected layer parameters. |
prog | A reference to a program sequence which will be appended with the code to perform the GradW operation. |
debugContext | Optional debug information. |
options | The structure describing options on how the operation should be implemented. See createFullyConnectedWeights() for details. |
cache | Optional pointer to planning cache to use. |
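A sketch of the GradW pass; weights.getMetaInfoTensor() is assumed here to be the accessor that exposes the meta-information tensor, so check the SparseTensor class for the exact name:

    #include <popsparse/FullyConnected.hpp>

    using namespace popsparse::dynamic;

    // Compute gradients only for the non-zero weight positions. The result has
    // the same layout as the weights' NZ values, so elementwise ops between the
    // two are possible.
    poplar::Tensor sparseGradW(poplar::Graph &graph, const SparseTensor &weights,
                               const poplar::Tensor &gradA,
                               const poplar::Tensor &activations,
                               const FullyConnectedParams &fcParams,
                               poplar::program::Sequence &prog,
                               PlanningCache &cache) {
      return fullyConnectedSparseGradW(graph, weights.getMetaInfoTensor(), gradA,
                                       activations, fcParams, prog,
                                       "fc/gradW", /*options=*/{}, &cache);
    }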
poplar::Tensor popsparse::dynamic::sparseDenseMatMul(
        poplar::Graph &graph,
        const SparseTensor &lhs,
        const poplar::Tensor &rhs,
        poplar::program::Sequence &prog,
        bool transposeLHS = false,
        bool transposeRHS = false,
        const poplar::DebugContext &debugContext = {},
        const poplar::OptionFlags &options = {},
        PlanningCache *cache = nullptr)
Perform a sparse * dense matrix multiplication, yielding a dense result.
The sparse left-hand operand tensor is made up of meta information for the sparsity and the non-zero values of the matrix. This sparse tensor must have been created with createSparseDenseMatMulLHS.
If the sparse left-hand operand was created for the sparse equivalent of a dense matrix multiplication:
[groups][m][k] * [groups][k][n] = [groups][m][n]
Then the same sparse left-hand operand can be used to calculate the above as well as:
[groups][k][m] * [groups][m][n] = [groups][k][n]
through the use of the transposeLHS parameter. transposeRHS is also provided for convenience.
graph | The Poplar graph. |
lhs | The sparse left-hand operand to the matrix multiplication. |
rhs | The dense right-hand operand to the matrix multiplication. |
prog | A reference to a program sequence which will be appended with the code to perform the matrix multiplication. |
transposeLHS | Whether or not to transpose the left-hand operand before multiplying. |
transposeRHS | Whether or not to transpose the right-hand operand before multiplying. |
debugContext | Optional debug information. |
options | Implementation options for the matrix multiplication. |
cache | Optional pointer to planning cache to use. |
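Putting the matrix multiplication pieces together, a hedged end-to-end sketch (writing the non-zero values into lhs and the data into rhs is elided, and the wrapper, debug strings, and popsparse/MatMul.hpp header name are assumptions):

    #include <popsparse/MatMul.hpp>

    using namespace popsparse::dynamic;

    // Sparse * dense matmul: both operands are allocated from the same
    // MatMulParams so their layouts match the chosen plan.
    poplar::Tensor sparseTimesDense(poplar::Graph &graph,
                                    const MatMulParams &params,
                                    poplar::program::Sequence &prog,
                                    PlanningCache &cache) {
      SparseTensor lhs = createSparseDenseMatMulLHS(graph, poplar::HALF, params,
                                                    "mm/lhs", {}, &cache);
      poplar::Tensor rhs = createSparseDenseMatMulRHS(graph, poplar::HALF, params,
                                                      "mm/rhs", {}, &cache);
      // ... populate lhs (meta information and NZ values) and rhs here ...
      return sparseDenseMatMul(graph, lhs, rhs, prog,
                               /*transposeLHS=*/false, /*transposeRHS=*/false,
                               "mm/multiply", {}, &cache);
    }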