MatMul
#include <poplin/MatMul.hpp>
Functions and data types for performing matrix multiplies on the IPU.
- 
namespace poplin
- Linear algebra functions. - Typedefs - 
using MatMulPlanParams = std::tuple<const poplar::Target*, const MatMulParams, const poplar::OptionFlags*>
- A tuple containing the required parameters to preplan a matmul:
- matmul-specific target for tile / IPU sizing
- matmul parameters
- implementation options (see matMul())
 - All entries must have matching machine parameters. 
 - 
using MatMulToConvOptions = std::unordered_map<const poplar::OptionFlags*, poplar::OptionFlags>
- Mapping of pointers to matrix multiplication option flags to the corresponding convolution option flags. 
 - Functions - 
poplar::Tensor matMul(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Multiply two matrices.
- Calculates C = A * B where A and B are matrices.
- Matrix multiply options
- availableMemoryProportion Decimal between 0 and 1 (inclusive) [=0.6] - See createWeights().
- fullyConnectedPass (NONE, INFERENCE_FWD, TRAINING_FWD, TRAINING_BWD, TRAINING_WU) [=NONE] - Optimize the plan for the specified type of pass. Note the abbreviations: FWD (forward), BWD (backward), WU (weight-update).
- inputRHSIsPreArranged (true, false) [=false] - Indicates to matMul functions whether the input data has already been re-arranged (using preArrangeMatMulInputRHS()). This allows data to be re-arranged once then used many times.
- use128BitConvUnitLoad (true, false) [=false] - If true, weights are loaded into the convolution unit 128 bits at a time. Otherwise, they are loaded 64 bits at a time. Not all codelets support 128-bit loads. This option affects memory usage and cycle count.
- enableMultiStageReduce (true, false) [=true] - If true, perform the reduction following the matrix multiplication in multiple stages if it would significantly reduce code size. This comes at the cost of increasing the number of cycles.
- enableFastReduce (true, false) [=false] - If true, use a faster reduction vertex if the data types and widths allow it. This comes at the cost of further constraints on memory allocation.
- remapOutputTensor (true, false) [=true] - If true, the output of the convolution is remapped if the output is detected to have a poor layout.
- partialsType (half, float) [=float] - See createWeights().
 - Parameters
- graph – The Poplar graph. 
- A – The left argument to the multiplication. This 2D tensor must be already mapped to tiles. 
- B – The right argument to the multiplication. This 2D tensor must be already mapped to tiles. 
- prog – A reference to a program sequence which will be appended with the code to perform the multiplication. 
- outputType – Optional via overloaded function. Element type of returned tensor. The default is A.elementType() if omitted.
- debugContext – Optional debug information. 
- options – The structure describing options on how the multiplication should be implemented. 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- The tensor holding the result of the multiplication. This tensor will be created, added to the graph and mapped to tiles.
 - Matrix multiply with explicitly defined output type.
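To make the arithmetic of C = A * B concrete, the following is a minimal host-side reference in plain C++. It is a sketch for checking results, not how matMul() is implemented on the IPU. The Matrix type and the referenceMatMul name are hypothetical helpers introduced only for this illustration, and a row-major layout is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D tensor: a dense row-major matrix.
struct Matrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<float> data; // rows * cols elements, row-major

  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
  float &at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
  float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Reference C = A * B, where A is [m, k], B is [k, n] and C is [m, n].
Matrix referenceMatMul(const Matrix &A, const Matrix &B) {
  assert(A.cols == B.rows && "inner dimensions must match");
  Matrix C(A.rows, B.cols);
  for (std::size_t i = 0; i < A.rows; ++i) {
    for (std::size_t j = 0; j < B.cols; ++j) {
      float acc = 0.0f;
      for (std::size_t p = 0; p < A.cols; ++p) {
        acc += A.at(i, p) * B.at(p, j);
      }
      C.at(i, j) = acc;
    }
  }
  return C;
}
```

A reference like this is mainly useful in tests, where the output tensor read back from the device can be compared against it within a tolerance suited to the chosen partialsType.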
 
 - 
poplar::Tensor matMul(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Matrix multiply where output type is the same as input A.
 - 
void matMulWithOutput(poplar::Graph &graph, const poplar::Tensor &A_, const poplar::Tensor &B_, poplar::Tensor &out, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options_ = {}, PlanningCache *cache = nullptr)
- Matrix multiply with explicitly defined output. 
 - 
void matMulReportPlan(std::ostream &out, const poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Report the convolution plan corresponding to the parameters and options provided. - Parameters
- out – Stream to write report to. 
- graph – The Poplar graph. 
- inputType – Element type of input tensors. 
- outputType – Element type of output tensor. 
- aShape – Shape of input tensor A. 
- bShape – Shape of input tensor B. 
- options – The structure describing options on how the multiplication should be implemented. 
- cache – Optional pointer to a planning cache to use. 
 
 
 - 
poplar::Tensor matMulGrouped(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Multiply two grouped matrices.
- Calculates C[g] = A[g] * B[g] where A[g] and B[g] are matrices for each element in the group, and g is an element of the set {0, 1, …, G-1}.
- The multiplication is done for every element in the group. The first dimension of the matrices is the group dimension with value equal to G.
 - Parameters
- graph – The Poplar graph. 
- A – The left argument to the grouped multiplication. This 3D tensor must be already mapped to tiles. 
- B – The right argument to the grouped multiplication. This 3D tensor must be already mapped to tiles. 
- prog – A reference to a program sequence which will be appended with the code to perform the multiplication. 
- outputType – Data type to be used for the returned tensor. 
- debugContext – Optional debug information. 
- options – The structure describing options on how the grouped multiplication should be implemented. See matMul(). 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- The tensor holding the result of the grouped multiplication. This tensor will be created, added to the graph and mapped to tiles. 
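The grouped form can be read as G independent matrix multiplies that share the same shapes. The sketch below is a plain C++ host-side reference for that arithmetic, assuming row-major storage with the group dimension outermost. The function name and the flat std::vector layout are assumptions made for this illustration and are not part of poplin.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference C[g] = A[g] * B[g] for g in {0, ..., G-1}.
// A is [G, M, K], B is [G, K, N] and the result is [G, M, N], all row-major.
std::vector<float> referenceMatMulGrouped(const std::vector<float> &A,
                                          const std::vector<float> &B,
                                          std::size_t G, std::size_t M,
                                          std::size_t K, std::size_t N) {
  assert(A.size() == G * M * K);
  assert(B.size() == G * K * N);
  std::vector<float> C(G * M * N, 0.0f);
  for (std::size_t g = 0; g < G; ++g) {
    for (std::size_t i = 0; i < M; ++i) {
      for (std::size_t j = 0; j < N; ++j) {
        float acc = 0.0f;
        for (std::size_t p = 0; p < K; ++p) {
          acc += A[(g * M + i) * K + p] * B[(g * K + p) * N + j];
        }
        C[(g * M + i) * N + j] = acc;
      }
    }
  }
  return C;
}
```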
 
 - 
void matMulGroupedWithOutput(poplar::Graph &graph, const poplar::Tensor &A, const poplar::Tensor &B, poplar::Tensor &out, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options_ = {}, PlanningCache *cache = nullptr)
- Grouped matmul with explicit output argument. 
 - 
void matMulGroupedReportPlan(std::ostream &out, const poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Report the convolution plan corresponding to the parameters and options provided.
 - Parameters
- out – Stream to write report to. 
- graph – The Poplar graph. 
- inputType – Element type of input tensors. 
- outputType – Element type of output tensor. 
- aShape – Shape of input tensor A. 
- bShape – Shape of input tensor B. 
- options – The structure describing options on how the multiplication should be implemented. 
- cache – Optional pointer to a planning cache to use. 
 
 
 - 
void matMulAcc(poplar::Graph &graph, const poplar::Tensor &C, float k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Multiply two matrices and add to a third (with a scaling factor).
- Calculates C += k * A * B where A, B are matrices and k is a constant scalar.
 - Parameters
- graph – The Poplar graph. 
- C – The matrix to add to. This 2D tensor must be already mapped to tiles. 
- k – The constant or a single element tensor to multiply the result of the multiplication. If k is a tensor, it must be of the same type as A.
- A – The left argument to the multiplication. This 2D tensor must be already mapped to tiles. 
- B – The right argument to the multiplication. This 2D tensor must be already mapped to tiles. 
- prog – A reference to a program sequence which will be appended with the code to perform the multiplication and add. 
- debugContext – Optional debug information. 
- options – The structure describing options on how the multiplication should be implemented. See matMul(). 
- cache – Optional pointer to a planning cache to use.
 - Matrix multiply and accumulate with a scalar scaling factor.
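The update C += k * A * B can be checked against a simple host-side loop. The following is a sketch in plain C++, assuming row-major storage; referenceMatMulAcc is a hypothetical name used only here, and the code says nothing about how the accumulation is scheduled on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference C += k * A * B, updating C in place.
// A is [M, K], B is [K, N] and C is [M, N], all row-major.
void referenceMatMulAcc(std::vector<float> &C, float k,
                        const std::vector<float> &A,
                        const std::vector<float> &B, std::size_t M,
                        std::size_t K, std::size_t N) {
  assert(A.size() == M * K);
  assert(B.size() == K * N);
  assert(C.size() == M * N);
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (std::size_t p = 0; p < K; ++p) {
        acc += A[i * K + p] * B[p * N + j];
      }
      // Scale the product, then accumulate into the existing value of C.
      C[i * N + j] += k * acc;
    }
  }
}
```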
 
 
 - 
void matMulAcc(poplar::Graph &graph, const poplar::Tensor &C, const poplar::Tensor &k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Matrix multiply and accumulate with a single-element scaling factor. 
 - 
void matMulGroupedAcc(poplar::Graph &graph, const poplar::Tensor &C, float k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Grouped matrix multiply and accumulate.
- Multiply two grouped matrices and add to a third (with a scaling factor).
- Calculates C[g] += k * A[g] * B[g] where A[g], B[g] are matrices and k is a constant scalar. g is an element of the set {0, 1, …, G-1}.
- The multiplication is done for every element in the group. The first dimension of the matrices is the group dimension with value equal to G.
 - Parameters
- graph – The Poplar graph. 
- C – The matrix to add to. This 3D tensor must be already mapped to tiles. 
- k – The constant or a single element tensor to multiply the result of the multiplication. If k is a tensor, it must be of the same type as A.
- A – The left argument to the grouped multiplication. This 3D tensor must be already mapped to tiles. 
- B – The right argument to the grouped multiplication. This 3D tensor must be already mapped to tiles. 
- prog – A reference to a program sequence which will be appended with the code to perform the grouped multiplication and add. 
- debugContext – Optional debug information. 
- options – The structure describing options on how the multiplication should be implemented. See matMul(). 
- cache – Optional pointer to a planning cache to use.
 - Grouped matrix multiply and accumulate with a scalar scaling factor.
 
 
 - 
void matMulGroupedAcc(poplar::Graph &graph, const poplar::Tensor &C, const poplar::Tensor &k, const poplar::Tensor &A, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Grouped matrix multiply and accumulate with a single-element scaling factor. 
 - 
poplar::Tensor createMatMulInputLHS(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Create a tensor that is used as the left operand of matrix multiplication.
- The types of the input and output tensors are specified separately. This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the left argument efficient.
 - Parameters
- graph – The Poplar graph. 
- inputType – The input data type. 
- outputType – The data type of the tensor returned by the multiplication. 
- aShape – The shape of the required matrix. 
- bShape – The shape of the matrix that the required matrix will be multiplied by. 
- debugContext – Debug information. 
- options – The implementation options of the multiplication. See matMul(). 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- A matrix of type inputType and shape aShape. The tensor will have been mapped to tiles.
 
 - 
poplar::Tensor createMatMulInputLHS(poplar::Graph &graph, const poplar::Type &dataType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Create a tensor that is used as the left operand of matrix multiplication.
- The type of both input and output tensors is specified by dataType. This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the left argument efficient.
 - Parameters
- graph – The Poplar graph. 
- dataType – The data type of both the input and output tensors. 
- aShape – The shape of the required matrix. 
- bShape – The shape of the matrix that the required matrix will be multiplied by. 
- debugContext – Debug information. 
- options – The implementation options of the multiplication. See matMul(). 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- A matrix of type dataType and shape aShape. The tensor will have been mapped to tiles.
 
 - 
poplar::Tensor createMatMulGroupedInputLHS(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Create a tensor that is used as the left operand of a grouped matrix multiplication.
- This will create a 3D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a grouped matrix multiplication with this tensor as the left argument efficient.
- The first dimension of the required matrix and the matrix it multiplies by must be the number of groups.
 - Parameters
- graph – The Poplar graph. 
- inputType – The data type of the required matrix. 
- outputType – The data type of the tensor returned by the multiplication. 
- aShape – The grouped shape [g, r, c] of the required matrix. 
- bShape – The grouped shape [g, r, c] of the matrix that the required matrix will be multiplied by. 
- debugContext – Debug information. 
- options – The implementation options of the multiplication. See matMul(). 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- A matrix of type inputType and grouped shape aShape. The tensor will have been mapped to tiles.
 
 - 
poplar::Tensor createMatMulInputRHS(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Create a tensor that is used as the right operand of matrix multiplication.
- This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the right argument efficient.
 - Parameters
- graph – The Poplar graph. 
- inputType – The input data type. 
- outputType – The data type of the tensor returned by the multiplication. 
- aShape – The shape of the matrix that the required matrix will be multiplied by. 
- bShape – The shape of the required matrix. 
- debugContext – Debug information. 
- options – The implementation options of the multiplication. See matMul(). 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- A matrix of type inputType and shape bShape. The tensor will have been mapped to tiles.
 
 - 
poplar::Tensor createMatMulInputRHS(poplar::Graph &graph, const poplar::Type &dataType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Overloaded function for when inputType == outputType (represented by the dataType parameter). 
 - 
poplar::Tensor createMatMulOutput(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Create a tensor that is used as the output operand of matrix multiplication.
- This will create a 2D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a matrix multiplication with this tensor as the output argument efficient.
 - Parameters
- graph – The Poplar graph. 
- inputType – The input data type. 
- outputType – The data type of the returned tensor. 
- aShape – The shape of the left operand of the multiplication. 
- bShape – The shape of the right operand of the multiplication. 
- debugContext – Debug information. 
- options – The implementation options of the multiplication. See matMul(). 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- A matrix of type outputType and shape [aShape[0], bShape[1]]. The tensor will have been mapped to tiles.
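The output shape rule can be written down as a small helper that also checks that the inner dimensions agree. This is an illustrative sketch only: matMulOutputShape is a hypothetical name, and the error handling shown here is an assumption, not a description of what poplin does on a shape mismatch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: shape of C = A * B for aShape = [M, K], bShape = [K, N].
std::vector<std::size_t>
matMulOutputShape(const std::vector<std::size_t> &aShape,
                  const std::vector<std::size_t> &bShape) {
  // Both operands are expected to be 2D.
  assert(aShape.size() == 2 && bShape.size() == 2);
  if (aShape[1] != bShape[0]) {
    throw std::invalid_argument("inner dimensions of A and B differ");
  }
  return {aShape[0], bShape[1]};
}
```

For example, aShape = [4, 3] and bShape = [3, 5] give an output shape of [4, 5].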
 
 - 
poplar::Tensor createMatMulOutput(poplar::Graph &graph, const poplar::Type &dataType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Overloaded function for when inputType == outputType (represented by the dataType parameter). 
 - 
poplar::Tensor createMatMulGroupedInputRHS(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Create a tensor that is used as the right operand of grouped matrix multiplication.
- This will create a 3D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a grouped matrix multiplication with this tensor as the right argument efficient.
- The first dimension of the required matrix and the matrix it multiplies by must be the number of groups.
 - Parameters
- graph – The Poplar graph. 
- inputType – The data type of the required matrix. 
- outputType – The data type of the tensor returned by the multiplication. 
- aShape – The grouped shape [g, r, c] of the matrix that the required matrix will be multiplied by. 
- bShape – The grouped shape [g, r, c] of the required matrix. 
- debugContext – Debug information. 
- options – The implementation options of the multiplication. See matMul(). 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- A matrix of type inputType and grouped shape bShape. The tensor will have been mapped to tiles.
 
 - 
poplar::Tensor createMatMulGroupedOutput(poplar::Graph &graph, const poplar::Type &inputType, const poplar::Type &outputType, const std::vector<std::size_t> &aShape, const std::vector<std::size_t> &bShape, const poplar::DebugContext &debugContext, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Create a tensor that is used as the output operand of grouped matrix multiplication (with output).
- This will create a 3D tensor in the graph. The ordering and tile mapping of the tensor will be set to make a grouped matrix multiplication with this tensor as the output argument efficient.
- The first dimension of both operand shapes must be the number of groups.
 - Parameters
- graph – The Poplar graph. 
- inputType – The data type of the input matrices. 
- outputType – The data type of the required (output) matrix. 
- aShape – The grouped shape [g, r, c] of the left operand of the multiplication. 
- bShape – The grouped shape [g, r, c] of the right operand of the multiplication. 
- debugContext – Debug information. 
- options – The implementation options of the multiplication. See matMul(). 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- A matrix of type outputType and grouped shape [aShape[g], aShape[r], bShape[c]], where g, r and c index the group, row and column dimensions. The tensor will have been mapped to tiles.
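Read with concrete indices, the grouped output shape is [aShape[0], aShape[1], bShape[2]]. The sketch below spells that out and checks that the group counts and inner dimensions agree. The helper name and its error handling are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: shape of C[g] = A[g] * B[g] for grouped shapes
// aShape = [G, M, K] and bShape = [G, K, N]. Returns [G, M, N].
std::vector<std::size_t>
groupedMatMulOutputShape(const std::vector<std::size_t> &aShape,
                         const std::vector<std::size_t> &bShape) {
  // Both operands are expected to be 3D, group dimension first.
  assert(aShape.size() == 3 && bShape.size() == 3);
  if (aShape[0] != bShape[0]) {
    throw std::invalid_argument("A and B have different numbers of groups");
  }
  if (aShape[2] != bShape[1]) {
    throw std::invalid_argument("inner dimensions of A and B differ");
  }
  return {aShape[0], aShape[1], bShape[2]};
}
```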
 
 - 
poplar::Tensor preArrangeMatMulInputRHS(poplar::Graph &graph, const std::vector<std::size_t> &aShape, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Pre-arrange right-hand side input.
- Re-arrange memory for the RHS operand to an upcoming matmul operation. This allows the rearrangement of the memory of a tensor that would otherwise be rearranged as part of the matmul operation for efficiency.
- Use this function and the matMul*() functions with the inputRHSIsPreArranged option flag to do any re-arrangement necessary once and then re-use that input multiple times.
- Only valid for fully connected layers.
 - Parameters
- graph – The Poplar graph. 
- aShape – The shape of the left argument to the multiplication. 
- B – The right argument to the multiplication. This 2D tensor must be already mapped to tiles. 
- prog – A reference to a program sequence which will be appended with the code to perform the arrangement. 
- outputType – Optional via overloaded function. Element type of returned tensor. The default is B.elementType() if omitted.
- debugContext – Optional debug information. 
- options – Flags describing options for how the multiplication should be implemented. See matMul(). 
- cache – Optional pointer to a planning cache to use. 
 
- Returns
- New tensor holding the rearranged input. This tensor has the same shape as the given tensor.
 - Pre-arrange input with explicitly defined output type.
 
 - 
poplar::Tensor preArrangeMatMulInputRHS(poplar::Graph &graph, const std::vector<std::size_t> &aShape, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Pre-arrange input where the output type is the same as B.
 - 
poplar::Tensor preArrangeMatMulGroupedInputRHS(poplar::Graph &graph, const std::vector<std::size_t> &aShape, const poplar::Tensor &B, poplar::program::Sequence &prog, const poplar::Type &outputType, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
- Pre-arrange grouped input with explicitly defined output type. 
 - 
poplar::Tensor transposeGroupedMatrix(const poplar::Tensor &A)
- Transposes a grouped matrix tensor. - Parameters
- A – Tensor to transpose 
- Returns
- Transposed tensor 
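The effect on element positions can be sketched on the host, assuming that the transpose swaps the row and column dimensions within each group and leaves the group dimension in place, so that a [G, R, C] input becomes [G, C, R]. That reading is an assumption based on the function name. The helper below is a hypothetical reference that copies data; it does not describe how the poplin function produces its result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference transpose of each group: input [G, R, C] -> output [G, C, R].
// Both arrays are row-major with the group dimension outermost.
std::vector<float> referenceTransposeGrouped(const std::vector<float> &in,
                                             std::size_t groups,
                                             std::size_t rows,
                                             std::size_t cols) {
  assert(in.size() == groups * rows * cols);
  std::vector<float> out(in.size());
  for (std::size_t g = 0; g < groups; ++g) {
    for (std::size_t r = 0; r < rows; ++r) {
      for (std::size_t c = 0; c < cols; ++c) {
        // Element (g, r, c) of the input lands at (g, c, r) of the output.
        out[(g * cols + c) * rows + r] = in[(g * rows + r) * cols + c];
      }
    }
  }
  return out;
}
```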
 
 - 
std::set<ConvPlanParams> matMulGetConvPlanParams(const std::set<MatMulPlanParams> &matmuls, MatMulToConvOptions &matmulToConvOpts)
- Obtain the set of convolution parameters corresponding to the user supplied set of parameters for matrix multiplication. - Parameters
- matmuls – Set of matrix multiplication parameter tuples. 
- matmulToConvOpts – Convolution options corresponding to each set of matrix multiplication options. 
 
- Returns
- Set of convolution parameters. 
 
 - 
void preplanMatMuls(const std::set<MatMulPlanParams> &matmuls, matmul::PlanningCache &cache)
- Deprecated:
- Use preplan() instead. 
 - Plan the specified matrix multiplications. - Parameters
- matmuls – A set of parameters to preplan matmuls 
- cache – The planning cache to update 
 
 
 - 
void matmulValidateOptions(const poplar::OptionFlags &options)
- Provides an interface to validate the matmul options.
- An invalid key or an invalid value causes an exception to be thrown.
 - Parameters
- options – Flags describing options for how the multiplication should be implemented. See matMul(). 
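As an illustration of what validating these options involves, the sketch below checks a plain string map against the option names and values listed under matMul(). It is not the library's implementation: the Options alias is a stand-in for poplar::OptionFlags, validateMatMulOptionsSketch is a hypothetical name, and the choice of std::invalid_argument is an assumption, since the text above does not say which exception type is thrown.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Stand-in for poplar::OptionFlags (assumption for this sketch).
using Options = std::map<std::string, std::string>;

// Throws std::invalid_argument on an unknown key or a disallowed value.
void validateMatMulOptionsSketch(const Options &options) {
  // Options whose values come from a fixed list, as documented for matMul().
  static const std::map<std::string, std::set<std::string>> enumerated = {
      {"fullyConnectedPass",
       {"NONE", "INFERENCE_FWD", "TRAINING_FWD", "TRAINING_BWD",
        "TRAINING_WU"}},
      {"inputRHSIsPreArranged", {"true", "false"}},
      {"use128BitConvUnitLoad", {"true", "false"}},
      {"enableMultiStageReduce", {"true", "false"}},
      {"enableFastReduce", {"true", "false"}},
      {"remapOutputTensor", {"true", "false"}},
      {"partialsType", {"half", "float"}},
  };
  for (const auto &entry : options) {
    const std::string &key = entry.first;
    const std::string &value = entry.second;
    if (key == "availableMemoryProportion") {
      // Must parse fully as a decimal in the closed range [0, 1].
      std::size_t parsed = 0;
      double proportion = 0.0;
      try {
        proportion = std::stod(value, &parsed);
      } catch (const std::exception &) {
        throw std::invalid_argument("availableMemoryProportion is not a number");
      }
      if (parsed != value.size() || !(proportion >= 0.0 && proportion <= 1.0)) {
        throw std::invalid_argument("availableMemoryProportion out of range");
      }
      continue;
    }
    const auto it = enumerated.find(key);
    if (it == enumerated.end()) {
      throw std::invalid_argument("unknown matmul option: " + key);
    }
    if (it->second.count(value) == 0) {
      throw std::invalid_argument("invalid value for " + key + ": " + value);
    }
  }
}
```

Validating options up front in this way surfaces a misspelled key before any planning work is done, which is the usual reason to call a validation function separately.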
 
 - 
struct MatMulParams
- #include <MatMul.hpp>
- Parameters to define a matrix multiplication, C = A * B.
 Public Members
 Friends - 
friend bool operator<(const MatMulParams &a, const MatMulParams &b)
 
 - 
namespace matmul
- 
class PlanningCache : public poplin::PlanningCache
- #include <MatMul.hpp>
- Deprecated:
- Use poplin::PlanningCache instead. 
 Public Functions - 
poplin::PlanningCache &getImpl()
 
 