MatMul
#include <poplin/MatMul.hpp>
Functions and data types for performing matrix multiplies on the IPU.
-
namespace poplin
Linear algebra functions.
Decomposition of a matrix into a lower triangular matrix L and an upper triangular matrix U.
Typedefs
-
using MatMulPlanParams = std::tuple<const poplar::Target*, const MatMulParams, const poplar::OptionFlags*>
A tuple containing the required parameters to preplan a matmul:
matmul-specific target for tile / IPU sizing
matmul parameters
implementation options (see matMul())
All entries must have matching machine parameters.
-
using MatMulToConvOptions = std::unordered_map<const poplar::OptionFlags*, poplar::OptionFlags>
Mapping of pointers to matrix multiplication option flags to the corresponding convolution option flags.
Functions
-
poplar::Tensor matMul(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Multiply two matrices.
Calculates C = A * B where A and B are matrices.
Matrix multiply options
availableMemoryProportion – Decimal between 0 and 1 (inclusive) [=0.6]. See createWeights().
fullyConnectedPass – (NONE, INFERENCE_FWD, TRAINING_FWD, TRAINING_BWD, TRAINING_WU) [=NONE]. Optimize the plan for the specified type of pass. Note the abbreviations: FWD (forward), BWD (backward), WU (weight-update).
inputRHSIsPreArranged – (true, false) [=false]. Indicates to matMul functions whether the input data has already been re-arranged (using preArrangeMatMulInputRHS()). This allows data to be re-arranged once then used many times.
use128BitConvUnitLoad – (true, false) [=false]. If true, weights are loaded into the convolution unit 128 bits at a time. Otherwise, they are loaded 64 bits at a time. Not all codelets support 128-bit loads. This option affects memory usage and cycle count.
enableMultiStageReduce – (true, false) [=true]. If true, perform the reduction following the matrix multiplication in multiple stages if it would significantly reduce code size. This comes at the cost of increasing the number of cycles.
enableFastReduce – (true, false) [=false]. If true, use a faster reduction vertex if the data types and widths allow it. This comes at the cost of further constraints on memory allocation.
remapOutputTensor – (true, false) [=true]. If true, the output of the convolution is remapped if the output is detected to have a poor layout.
partialsType – (half, float) [=float]. See createWeights().
- Parameters
graph – The Poplar graph.
A – The left argument to the multiplication. This 2D tensor must be already mapped to tiles.
B – The right argument to the multiplication. This 2D tensor must be already mapped to tiles.
prog – A reference to a program sequence which will be appended with the code to perform the multiplication.
outputType – Optional via overloaded function. Element type of returned tensor. The default is A.elementType() if omitted.
debugContext – Optional debug information.
options – The structure describing options on how the multiplication should be implemented.
cache – Optional pointer to a planning cache to use.
- Returns
The tensor holding the result of the multiplication. This tensor will be created, added to the graph and mapped to tiles.
Matrix multiply with explicitly defined output type.
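As a point of comparison for results read back from the IPU, the calculation C = A * B can be sketched on the host in plain C++. This is a minimal reference sketch, not how poplin implements the operation: `HostMatrix` and `referenceMatMul` are hypothetical names introduced here, the layout is assumed to be row-major, and no Poplar types are used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side matrix type (row-major); not part of Poplar or poplin.
struct HostMatrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<float> data; // rows * cols elements, row-major

  HostMatrix(std::size_t r, std::size_t c)
      : rows(r), cols(c), data(r * c, 0.0f) {}
  float &at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
  float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Reference C = A * B for A of shape [m, k] and B of shape [k, n].
// The result has shape [m, n].
HostMatrix referenceMatMul(const HostMatrix &a, const HostMatrix &b) {
  assert(a.cols == b.rows && "inner dimensions must match");
  HostMatrix c(a.rows, b.cols);
  for (std::size_t i = 0; i < a.rows; ++i) {
    for (std::size_t j = 0; j < b.cols; ++j) {
      float acc = 0.0f;
      for (std::size_t p = 0; p < a.cols; ++p)
        acc += a.at(i, p) * b.at(p, j);
      c.at(i, j) = acc;
    }
  }
  return c;
}
```

Multiplying a [2, 3] matrix by a [3, 2] matrix with this sketch yields a [2, 2] result, matching the shape rule that the output has the rows of A and the columns of B.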
-
poplar::Tensor matMul(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Matrix multiply where the output type is the same as input A.
-
void matMulWithOutput(poplar::Graph &graph, const poplar::Tensor &A_, const poplar::Tensor &B_, poplar::Tensor &out, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options_ = {}, PlanningCache *cache = nullptr)
Matrix multiply with explicitly defined output.
-
void matMulReportPlan(std::ostream &out, const poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Report the convolution plan corresponding to the parameters and options provided.
- Parameters
out – Stream to write report to.
graph – The Poplar graph.
inputType – Element type of the input tensors.
outputType – Element type of the output tensor.
aShape – Shape of input tensor A.
bShape – Shape of input tensor B.
options – The structure describing options on how the multiplication should be implemented.
cache – Optional pointer to a planning cache to use.
-
poplar::Tensor matMulGrouped(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Multiply two grouped matrices.
Calculates C[g] = A[g] * B[g] where A[g] and B[g] are matrices for each element in the group, and g is an element of the set {0, 1, …, G-1}.
The multiplication is done for every element in the group. The first dimension of the matrices is the group dimension with value equal to G.
- Parameters
graph – The Poplar graph.
A – The left argument to the grouped multiplication. This 3D tensor must be already mapped to tiles.
B – The right argument to the grouped multiplication. This 3D tensor must be already mapped to tiles.
prog – A reference to a program sequence which will be appended with the code to perform the multiplication.
outputType – Element type of the returned tensor.
debugContext – Optional debug information.
options – The structure describing options on how the grouped multiplication should be implemented. See matMul().
cache – Optional pointer to a planning cache to use.
- Returns
The tensor holding the result of the grouped multiplication. This tensor will be created, added to the graph and mapped to tiles.
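The grouped calculation C[g] = A[g] * B[g] can be sketched on the host as an independent matrix multiply per group. This is a reference sketch under stated assumptions: `referenceMatMulGrouped` is a hypothetical name, the buffers are flat and row-major with the group as the outermost dimension, and the code does not use or reflect poplin internals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference C[g] = A[g] * B[g] for g in {0, ..., G-1}.
// a has shape [G, M, K], b has shape [G, K, N], the result has shape [G, M, N].
// All buffers are flat and row-major.
std::vector<float> referenceMatMulGrouped(const std::vector<float> &a,
                                          const std::vector<float> &b,
                                          std::size_t G, std::size_t M,
                                          std::size_t K, std::size_t N) {
  assert(a.size() == G * M * K);
  assert(b.size() == G * K * N);
  std::vector<float> c(G * M * N, 0.0f);
  for (std::size_t g = 0; g < G; ++g)
    for (std::size_t i = 0; i < M; ++i)
      for (std::size_t j = 0; j < N; ++j)
        for (std::size_t p = 0; p < K; ++p)
          c[(g * M + i) * N + j] +=
              a[(g * M + i) * K + p] * b[(g * K + p) * N + j];
  return c;
}
```

No data is shared between groups: element g of the result depends only on element g of each input.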
-
void matMulGroupedWithOutput(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::Tensor &out, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options_ = {}, PlanningCache *cache = nullptr)
Grouped matmul with explicit output argument.
-
void matMulGroupedReportPlan(std::ostream &out, const poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Report the convolution plan corresponding to the parameters and options provided.
- Parameters
out – Stream to write report to.
graph – The Poplar graph.
inputType – Element type of the input tensors.
outputType – Element type of the output tensor.
aShape – Shape of input tensor A.
bShape – Shape of input tensor B.
options – The structure describing options on how the multiplication should be implemented.
cache – Optional pointer to a planning cache to use.
-
void matMulAcc(poplar::Graph &graph, const poplar::Tensor &C, float k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Multiply two matrices and add to a third (with a scaling factor).
Calculates C += k * A * B where A, B are matrices and k is a constant scalar.
- Parameters
graph – The Poplar graph.
C – The tensor to add to. This 2D tensor must be already mapped to tiles.
k – The constant or a single element tensor to multiply the result of the multiplication. If k is a tensor, it must be of the same type as A.
A – The left argument to the multiplication. This 2D tensor must be already mapped to tiles.
B – The right argument to the multiplication. This 2D tensor must be already mapped to tiles.
prog – A reference to a program sequence which will be appended with the code to perform the multiplication and add.
debugContext – Optional debug information.
options – The structure describing options on how the multiplication should be implemented. See matMul().
cache – Optional pointer to a planning cache to use.
Matrix multiply and accumulate with a scalar scaling factor.
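The accumulate form C += k * A * B differs from a plain multiply in that C is updated in place rather than created. A minimal host-side sketch, assuming flat row-major buffers and using the hypothetical name `referenceMatMulAcc` (not a poplin function), could look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference C += k * A * B, updating c in place.
// a has shape [M, K], b has shape [K, N], c has shape [M, N]; row-major.
void referenceMatMulAcc(std::vector<float> &c, float k,
                        const std::vector<float> &a,
                        const std::vector<float> &b, std::size_t M,
                        std::size_t K, std::size_t N) {
  assert(a.size() == M * K);
  assert(b.size() == K * N);
  assert(c.size() == M * N);
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (std::size_t p = 0; p < K; ++p)
        acc += a[i * K + p] * b[p * N + j];
      // Scale the product, then add it to the existing contents of C.
      c[i * N + j] += k * acc;
    }
  }
}
```

The existing contents of C are preserved and added to, which is why C must already exist and be mapped to tiles before the library call.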
-
void matMulAcc(poplar::Graph &graph, const poplar::Tensor &C, const poplar::Tensor &k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Matrix multiply and accumulate with a single-element scaling factor.
-
void matMulGroupedAcc(poplar::Graph &graph, const poplar::Tensor &C, float k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Grouped matrix multiply and accumulate.
Multiply two grouped matrices and add to a third (with a scaling factor).
Calculates C[g] += k * A[g] * B[g] where A[g], B[g] are matrices and k is a constant scalar. g is an element of the set {0, 1, …, G-1}.
The multiplication is done for every element in the group. The first dimension of the matrices is the group dimension with value equal to G.
- Parameters
graph – The Poplar graph.
C – The tensor to add to. This 3D tensor must be already mapped to tiles.
k – The constant or a single element tensor to multiply the result of the multiplication. If k is a tensor, it must be of the same type as A.
A – The left argument to the grouped multiplication. This 3D tensor must be already mapped to tiles.
B – The right argument to the multiplication. This 3D tensor must be already mapped to tiles.
prog – A reference to a program sequence which will be appended with the code to perform the grouped multiplication and add.
debugContext – Optional debug information.
options – The structure describing options on how the multiplication should be implemented. See matMul().
cache – Optional pointer to a planning cache to use.
Grouped matrix multiply and accumulate with a scalar scaling factor.
-
void matMulGroupedAcc(poplar::Graph &graph, const poplar::Tensor &C, const poplar::Tensor &k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Grouped matrix multiply and accumulate with a single-element scaling factor.
-
poplar::Tensor createMatMulInputLHS(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a tensor to be used as the left operand of a matrix multiplication.
The types of the input and output tensors are specified separately. This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the left argument efficient.
- Parameters
graph – The Poplar graph.
inputType – Element type of the input tensors.
outputType – Element type of the output tensor.
aShape – The shape of the tensor to be created.
bShape – The shape of the tensor that the created tensor will be multiplied by.
debugContext – Debug information.
options – The implementation options of the multiplication. See matMul().
cache – Optional pointer to a planning cache to use.
- Returns
A tensor of type inputType and shape aShape. The tensor will have been mapped to tiles.
-
poplar::Tensor createMatMulInputLHS(poplar::Graph &graph, const poplar::Type &dataType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a tensor to be used as the left operand of a matrix multiplication.
The type of both input and output tensors is specified by dataType. This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the left argument efficient.
- Parameters
graph – The Poplar graph.
dataType – The element type of both the input and output tensors.
aShape – The shape of the tensor to be created.
bShape – The shape of the tensor that the created tensor will be multiplied by.
debugContext – Debug information.
options – The implementation options of the multiplication. See matMul().
cache – Optional pointer to a planning cache to use.
- Returns
A tensor of type dataType and shape aShape. The tensor will have been mapped to tiles.
-
poplar::Tensor createMatMulGroupedInputLHS(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a tensor to be used as the left operand of a grouped matrix multiplication.
This will create a 3D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a grouped matrix multiplication with this tensor as the left argument efficient.
The first dimension of the created tensor and of the tensor it is multiplied by must be the number of groups.
- Parameters
graph – The Poplar graph.
inputType – Element type of the input tensors.
outputType – Element type of the output tensor.
aShape – The grouped shape [g, r, c] of the created tensor.
bShape – The grouped shape [g, r, c] of the tensor that the created tensor will be multiplied by.
debugContext – Debug information.
options – The implementation options of the multiplication. See matMul().
cache – Optional pointer to a planning cache to use.
- Returns
A tensor of type inputType and grouped shape aShape. The tensor will have been mapped to tiles.
-
poplar::Tensor createMatMulInputRHS(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a tensor to be used as the right operand of a matrix multiplication.
This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the right argument efficient.
- Parameters
graph – The Poplar graph.
inputType – Element type of the input tensors.
outputType – Element type of the output tensor.
aShape – The shape of the tensor that the tensor to be created will be multiplied by.
bShape – The shape of the created tensor.
debugContext – Debug information.
options – The implementation options of the multiplication. See matMul().
cache – Optional pointer to a planning cache to use.
- Returns
A tensor of type inputType and shape bShape. The tensor will have been mapped to tiles.
-
poplar::Tensor createMatMulInputRHS(poplar::Graph &graph, const poplar::Type &dataType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a tensor to be used as the right operand of a matrix multiplication.
Overloaded function for when the input type and output type are the same (represented by the dataType parameter).
-
poplar::Tensor createMatMulOutput(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a tensor to be used as the output operand of a matrix multiplication.
This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the output argument efficient.
- Parameters
graph – The Poplar graph.
inputType – Element type of the input tensor.
outputType – Element type of the output tensor.
aShape – The shape of the left-hand input to the matmul.
bShape – The shape of the right-hand input to the matmul.
debugContext – Debug information.
options – The implementation options of the multiplication. See matMul().
cache – Optional pointer to a planning cache to use.
- Returns
A tensor of type outputType and shape [aShape[0], bShape[1]]. The tensor will have been mapped to tiles.
-
poplar::Tensor createMatMulOutput(poplar::Graph &graph, const poplar::Type &dataType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a tensor to be used as the output operand of a matrix multiplication.
Overloaded function for when the input type and output type are the same (represented by the dataType parameter).
-
poplar::Tensor createMatMulGroupedInputRHS(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a tensor to be used as the right operand of a grouped matrix multiplication.
This will create a 3D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a grouped matrix multiplication with this tensor as the right argument efficient.
The first dimension of the tensor to be created and of the tensor it multiplies must be the number of groups.
- Parameters
graph – The Poplar graph.
inputType – Element type of the input tensor.
outputType – Element type of the output tensor.
aShape – The grouped shape [g, r, c] of the tensor that the tensor to be created will be multiplied by.
bShape – The grouped shape [g, r, c] of the created tensor.
debugContext – Debug information.
options – The implementation options of the multiplication. See matMul().
cache – Optional pointer to planning cache to use.
- Returns
A tensor of type inputType and grouped shape bShape. The tensor will have been mapped to tiles.
-
poplar::Tensor createMatMulGroupedOutput(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a tensor to be used as the output operand of a grouped matrix multiplication (with output).
This will create a 3D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a grouped matrix multiplication with this tensor as the output argument efficient.
The first dimension of the tensor to be created and of both input tensors must be the number of groups.
- Parameters
graph – The Poplar graph.
inputType – Element type of the input tensor.
outputType – Element type of the output tensor.
aShape – The grouped shape [g, r, c] of the left-hand input to the grouped matmul.
bShape – The grouped shape [g, r, c] of the right-hand input to the grouped matmul.
debugContext – Debug information.
options – The implementation options of the multiplication. See matMul().
cache – Optional pointer to planning cache to use.
- Returns
A tensor of type outputType and grouped shape [aShape[g], aShape[r], bShape[c]]. The tensor will have been mapped to tiles.
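Reading each grouped shape as [g, r, c], the returned shape takes the group count and row count from aShape and the column count from bShape. The shape rule can be sketched as a small host-side helper; `groupedMatMulOutputShape` is a hypothetical name used for illustration and is not part of poplin.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Output shape of a grouped matmul:
// aShape = [G, M, K] and bShape = [G, K, N] give [G, M, N].
std::vector<std::size_t>
groupedMatMulOutputShape(const std::vector<std::size_t> &aShape,
                         const std::vector<std::size_t> &bShape) {
  assert(aShape.size() == 3 && bShape.size() == 3);
  assert(aShape[0] == bShape[0]); // same number of groups
  assert(aShape[2] == bShape[1]); // inner dimensions must match
  return {aShape[0], aShape[1], bShape[2]};
}
```

For example, a left-hand shape of [4, 8, 16] and a right-hand shape of [4, 16, 32] give an output shape of [4, 8, 32].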
-
poplar::Tensor preArrangeMatMulInputRHS(poplar::Graph &graph, const std::vector<std::size_t> &aShape, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Pre-arrange a matrix to be used as the right input of a matrix multiplication with explicitly defined output type.
Re-arrange memory for RHS operand to an upcoming matmul operation. This allows the rearrangement of the memory of a tensor that would otherwise be rearranged as part of the matmul operation for efficiency.
Use this function and the matMul*() functions with the inputRHSIsPreArranged option flag to do any re-arrangement necessary once and then re-use that input multiple times.
Only valid for fully connected layers.
- Parameters
graph – The Poplar graph.
aShape – The shape of the left argument to the multiplication.
B – The right argument to the multiplication. This 2D tensor must be already mapped to tiles.
prog – A reference to a program sequence which will be appended with the code to perform the arrangement.
outputType – Optional via overloaded function. Element type of output tensor. The default is B.elementType() if omitted.
debugContext – Optional debug information.
options – Flags describing options for how the multiplication should be implemented. See matMul().
cache – Optional pointer to planning cache to use.
- Returns
New tensor holding the rearranged input. This tensor has the same shape as the given tensor.
-
poplar::Tensor preArrangeMatMulInputRHS(poplar::Graph &graph, const std::vector<std::size_t> &aShape, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Pre-arrange a matrix to be used as the right input of a matrix multiplication, where the output type is the same as B.
-
poplar::Tensor preArrangeMatMulGroupedInputRHS(poplar::Graph &graph, const std::vector<std::size_t> &aShape, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Pre-arrange a grouped matrix to be used as the right input of a grouped matrix multiplication, with explicitly defined output type.
-
poplar::Tensor transposeGroupedMatrix(const poplar::Tensor &A)
Transposes a grouped matrix tensor.
- Parameters
A – Tensor to transpose
- Returns
Transposed tensor
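A grouped transpose can be pictured as swapping the row and column dimensions within each group while leaving the group dimension in place, so a [G, R, C] tensor becomes [G, C, R]. The sketch below shows that interpretation on a flat row-major host buffer. It is an assumption about the semantics rather than a description of the library's implementation, and `referenceTransposeGrouped` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Transpose each [R, C] matrix of a grouped [G, R, C] buffer independently,
// producing a [G, C, R] buffer. Buffers are flat and row-major.
std::vector<float> referenceTransposeGrouped(const std::vector<float> &in,
                                             std::size_t G, std::size_t R,
                                             std::size_t C) {
  assert(in.size() == G * R * C);
  std::vector<float> out(in.size());
  for (std::size_t g = 0; g < G; ++g)
    for (std::size_t r = 0; r < R; ++r)
      for (std::size_t c = 0; c < C; ++c)
        out[(g * C + c) * R + r] = in[(g * R + r) * C + c];
  return out;
}
```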
-
std::set<ConvPlanParams> matMulGetConvPlanParams(const std::set<MatMulPlanParams> &matmuls, MatMulToConvOptions &matmulToConvOpts)
Obtain the set of convolution parameters corresponding to the user supplied set of parameters for matrix multiplication.
- Parameters
matmuls – Set of matrix multiplication parameter tuples.
matmulToConvOpts – Convolution options corresponding to each set of matrix multiplication options.
- Returns
Set of Convolution parameters
-
void preplanMatMuls(const std::set<MatMulPlanParams> &matmuls, matmul::PlanningCache &cache)
- Deprecated:
Use preplan() instead.
Plan the specified matrix multiplications.
- Parameters
matmuls – A set of parameters to preplan matmuls
cache – The planning cache to update
-
void matmulValidateOptions(const poplar::OptionFlags &options)
Provides an interface to validate the matmul options.
An invalid key or an invalid value causes an exception to be thrown.
- Parameters
options – Flags describing options for how the multiplication should be implemented. See matMul().
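The kind of check this performs can be illustrated with a host-side sketch that rejects unknown keys and malformed boolean values. Everything here is hypothetical: `HostOptions` stands in for poplar::OptionFlags, `validateMatMulOptionsSketch` is not a poplin function, the key list is copied from the options documented for matMul(), and the exception type thrown by the real library is not shown.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for poplar::OptionFlags: option name -> value.
using HostOptions = std::map<std::string, std::string>;

// Throws std::invalid_argument for a key that is not documented for matMul(),
// or for a boolean option whose value is neither "true" nor "false".
// Values of the non-boolean options are not checked in this sketch.
void validateMatMulOptionsSketch(const HostOptions &options) {
  static const std::set<std::string> boolKeys = {
      "inputRHSIsPreArranged", "use128BitConvUnitLoad",
      "enableMultiStageReduce", "enableFastReduce", "remapOutputTensor"};
  static const std::set<std::string> otherKeys = {
      "availableMemoryProportion", "fullyConnectedPass", "partialsType"};
  for (const auto &kv : options) {
    if (boolKeys.count(kv.first) != 0) {
      if (kv.second != "true" && kv.second != "false")
        throw std::invalid_argument("invalid value for " + kv.first);
    } else if (otherKeys.count(kv.first) == 0) {
      throw std::invalid_argument("unknown matmul option: " + kv.first);
    }
  }
}
```

Validating options up front like this surfaces a misspelled key before any planning or graph construction is attempted.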
-
struct MatMulParams
- #include <MatMul.hpp>
Parameters to define a Matrix multiplication.
C = A * B
Public Members
Friends
-
friend bool operator<(const MatMulParams &a, const MatMulParams &b)
-
namespace matmul
-
class PlanningCache : public poplin::PlanningCache
- #include <MatMul.hpp>
- Deprecated:
Use poplin::PlanningCache instead.
Public Functions
-
poplin::PlanningCache &getImpl()