Convolution
#include <poplin/Convolution.hpp>
Functions and data types to support performing convolutions.
-
namespace poplin
Linear algebra functions.
Decomposition of a matrix into a lower triangular matrix L and an upper triangular matrix U.
Typedefs
-
using ConvPlanParams = std::tuple<const poplar::Target*, const ConvParams, const poplar::OptionFlags*>
Functions
-
uint64_t getFwdFlops(const ConvParams &params)
Calculate the minimum number of floating point operations required to perform the forward pass convolution given a set of params.
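As a rough illustration of what such a count involves, the following sketch tallies FLOPs for a simple 2D convolution. The struct `SimpleConv2D` and the function `estimateFwdFlops` are hypothetical names for this example only, and the convention that one multiply-accumulate counts as two FLOPs is an assumption; the value returned by getFwdFlops() may be computed differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a simple 2D convolution.
// This is NOT poplin::ConvParams; it only carries what the estimate needs.
struct SimpleConv2D {
  std::uint64_t batchSize;
  std::uint64_t convGroups;
  std::uint64_t inChansPerGroup;
  std::uint64_t outChansPerGroup;
  std::uint64_t outHeight, outWidth;       // output field size
  std::uint64_t kernelHeight, kernelWidth; // kernel size
};

// Assumed convention: one multiply-accumulate counts as 2 FLOPs, and every
// kernel element contributes to every output element (no padding is skipped).
std::uint64_t estimateFwdFlops(const SimpleConv2D &c) {
  // A kernel must have at least one element in each dimension.
  assert(c.kernelHeight > 0 && c.kernelWidth > 0);
  const std::uint64_t macsPerOutput =
      c.inChansPerGroup * c.kernelHeight * c.kernelWidth;
  const std::uint64_t numOutputs = c.batchSize * c.convGroups *
                                   c.outChansPerGroup * c.outHeight *
                                   c.outWidth;
  return 2 * macsPerOutput * numOutputs;
}
```

For a single 3-channel input producing 8 output channels on a 4 x 4 output field with a 3 x 3 kernel, this gives 2 x 27 x 128 = 6912 FLOPs.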
-
uint64_t getBwdFlops(const ConvParams &params)
Calculate the minimum number of floating point operations required to perform the backward pass convolution given a set of params.
-
uint64_t getWuFlops(const ConvParams &params)
Calculate the minimum number of floating point operations required to perform the weight update pass convolution given a set of params.
-
double getFwdPerfectCycleCount(const poplar::Graph &graph, const ConvParams &params)
Calculate the number of cycles to perform the forward pass assuming maximal utilisation of the target hardware, performing the minimum number of floating point operations.
This takes into account the number of tiles available and vectorization support on the target.
This is an optimistic number useful for estimating efficiency:
cycleCount = getFwdFlops() / maximumHardwareVectorization
- Parameters
graph – Provides the target the convolution will run on.
params – Description of convolution.
- Returns
Estimated number of cycles to perform the forward pass.
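The arithmetic behind such an optimistic bound can be sketched as follows. `SimpleTarget`, `perfectCycleCount` and `efficiency` are hypothetical names, and modelling "maximum hardware vectorization" as tiles multiplied by peak FLOPs per tile per cycle is an assumption made for illustration, not a statement of how the library derives its figure.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical target description; these fields are illustrative and are not
// read from poplar::Target.
struct SimpleTarget {
  std::uint64_t numTiles;             // tiles that can work in parallel
  std::uint64_t flopsPerTilePerCycle; // peak vectorised FLOPs per tile per cycle
};

// Optimistic bound: the minimum FLOP count spread perfectly over every tile,
// each running at peak vector throughput.
double perfectCycleCount(std::uint64_t flops, const SimpleTarget &t) {
  assert(t.numTiles > 0 && t.flopsPerTilePerCycle > 0);
  return static_cast<double>(flops) /
         static_cast<double>(t.numTiles * t.flopsPerTilePerCycle);
}

// Efficiency of a measured run relative to the optimistic bound.
double efficiency(double perfectCycles, double measuredCycles) {
  assert(measuredCycles > 0.0);
  return perfectCycles / measuredCycles;
}
```

With 6400 FLOPs on 4 tiles at 16 FLOPs per tile per cycle the bound is 100 cycles, so a run measured at 400 cycles would be 25% efficient.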
-
double getBwdPerfectCycleCount(const poplar::Graph &graph, const ConvParams &params)
Calculate the number of cycles to perform the backward pass assuming maximal utilisation of the target hardware, performing the minimum number of floating point operations.
This takes into account the number of tiles available and vectorization support on the target.
This is an optimistic number useful for estimating efficiency:
cycleCount = getBwdFlops() / maximumHardwareVectorization
- Parameters
graph – Provides the target the convolution will run on.
params – Description of convolution.
- Returns
Estimated number of cycles to perform the backward pass.
-
double getWuPerfectCycleCount(const poplar::Graph &graph, const ConvParams &params)
Calculate the number of cycles to perform the weight update pass assuming maximal utilisation of the target hardware, performing the minimum number of floating point operations.
This takes into account the number of tiles available and vectorization support on the target.
This is an optimistic number useful for estimating efficiency:
cycleCount = getWuFlops() / maximumHardwareVectorization
- Parameters
graph – Provides the target the convolution will run on.
params – Description of convolution.
- Returns
Estimated number of cycles to perform the weight update pass.
-
poplar::Tensor createWeights(poplar::Graph &graph, const ConvParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a weight tensor suitable for use with convolution().
The shape of the tensor will be [convGroups x outChansPerConvGroup x inChansPerConvGroup x H x W].
Convolution options
availableMemoryProportion
Decimal between 0 and 1 (inclusive) [=0.6] The amount of memory allocated for use for temporary data whilst the operation is executing (for example, for intermediate calculated values or temporary values passed between tiles on the IPU). The value is specified as a proportion of available memory on the IPU. So, for example, a value of 0.1 will constrain the library to use 10% of the total memory for temporary data.
The library will try to constrain the use of temporary memory to below this value. An operation that has more temporary memory available to use will run in the same or fewer cycles.
For a specific operation, the minimum amount of temporary memory the library is able to use may be more than the amount specified by this option. In this case, if POPLIBS_LOG_LEVEL=WARN or POPLIBS_POPLIN_LOG_LEVEL=WARN, a warning message will be output, and the amount specified by this option is ignored.
Note: if this value is set to less than 5% of memory (so, a value less than 0.05) then it is often the case that the library will need to create a large amount of code and data structures to keep the temporary memory low, which could have a permanent memory overhead larger than the saving of temporary memory. You should take great care when setting a value this low.
See also
The Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU technical note, for some practical examples of using availableMemoryProportion.
partialsType
(half, float) [=float] Data type used for intermediate calculations. If the type specified is smaller than the output type then the option is ignored and the output type is used instead.
pass
(NONE, INFERENCE_FWD, TRAINING_FWD, TRAINING_BWD, TRAINING_WU, FC_INFERENCE_FWD, FC_TRAINING_FWD, FC_TRAINING_BWD, FC_TRAINING_WU) [=NONE] Optimize the plan for the specified type of pass. Note the abbreviations: FWD (forward), BWD (backward), WU (weight-update), FC (fully-connected).
use128BitConvUnitLoad
(true, false) [=false] If true, convolution weights are loaded 128 bits at a time. Otherwise, they are loaded 64 bits at a time. Not all codelets support 128-bit loads. This option affects memory usage and cycle count.
enableMultiStageReduce
(true, false) [=true] If true, perform the reduction following the convolution in multiple stages if it would significantly reduce code size. This comes at the cost of increasing the number of cycles.
enableFastReduce
(true, false) [=false] If true, use a faster reduction vertex if the data types and widths allow it. This comes at the cost of further constraints on memory allocation.
enableConvDithering
(true, false) [=false] If true, then convolutions with different parameters will be laid out from different tiles in an effort to improve tile balance in models.
- Parameters
graph – The graph that the tensor will be added to.
params – The same parameters as used by the convolution().
debugContext – Debugging name for the tensor.
options – Options controlling the implementation.
cache – Optional pointer to planning cache to use.
- Returns
The weights tensor suitable for use with convolution().
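The documented weight layout can be illustrated with a small shape calculation. `weightShape` is a hypothetical helper for this example; it assumes the total input and output channel counts divide evenly among the convolution groups, and it does not reflect how createWeights() maps the tensor to tiles.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: derive the documented weight layout
// [convGroups, outChansPerConvGroup, inChansPerConvGroup, H, W]
// from total channel counts. Assumes channels divide evenly among groups.
std::vector<std::size_t> weightShape(std::size_t convGroups,
                                     std::size_t inChans,
                                     std::size_t outChans,
                                     std::size_t kernelH,
                                     std::size_t kernelW) {
  assert(convGroups > 0);
  assert(inChans % convGroups == 0 && outChans % convGroups == 0);
  return {convGroups, outChans / convGroups, inChans / convGroups, kernelH,
          kernelW};
}
```

For example, 2 groups with 8 input channels, 16 output channels and a 3 x 3 kernel give the shape [2 x 8 x 4 x 3 x 3].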
-
poplar::Tensor createBiases(poplar::Graph &graph, const poplar::Tensor &activations, const poplar::DebugContext &debugContext = {"biases"})
Create a bias tensor suitable for input to the addBias() function.
The tensor will have the shape [outChans]
- Parameters
graph – The graph that the tensor will be added to.
activations – The activation tensor which is output from the convolution.
debugContext – Debugging name for the tensor.
- Returns
The tensor of biases.
-
poplar::Tensor createBiases(poplar::Graph &graph, const poplar::Tensor &activations, const ConvParams &params, const poplar::DebugContext &debugContext = {"biases"}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create a bias tensor suitable for input to the addBias() function with allocation consistent with plan parameters.
The tensor will have the shape [outChans]
- Parameters
graph – The graph that the tensor will be added to.
activations – The activation tensor which is output from the convolution.
params – Parameters as passed to the target convolution.
debugContext – Debugging name for the tensor.
options – Options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
- Returns
The tensor of biases.
-
poplar::Tensor createInput(poplar::Graph &graph, const ConvParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create an input tensor for a convolution.
Use this when you need to create an input data tensor for a convolution. The same set of parameters which will be passed to the convolution() should also be passed to createInput().
The returned tensor has the shape [B x inChans x H x W].
- Parameters
graph – The tensor will be added to this graph.
params – Parameters as passed to the target convolution.
debugContext – Debugging name for the tensor.
options – Options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
- Returns
The allocated input tensor.
-
poplar::Tensor createConvOutput(poplar::Graph &graph, const ConvParams &params, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Create an output tensor for a convolution.
Use this when you need to create an output data tensor for a convolution. The same set of parameters which will be passed to the convolution() should also be passed to createConvOutput().
The returned tensor has the shape [B x outChans x H x W].
- Parameters
graph – The tensor will be added to this graph.
params – Parameters as passed to the target convolution.
debugContext – Debugging name for the tensor.
options – Options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
- Returns
The allocated output tensor.
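The H and W of the output generally differ from those of the input. A minimal sketch of the standard relationship along one spatial dimension is given below. `outputFieldSize` is a hypothetical helper that covers only padding, stride and kernel dilation; ConvParams may describe further transformations that this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>

// Output size along one spatial dimension for a standard convolution.
// Simplified: explicit lower/upper padding, kernel dilation and stride only.
std::size_t outputFieldSize(std::size_t inSize, std::size_t kernelSize,
                            std::size_t stride, std::size_t padLower,
                            std::size_t padUpper, std::size_t dilation = 1) {
  assert(stride > 0 && kernelSize > 0 && dilation > 0);
  // A dilated kernel spans (kernelSize - 1) * dilation + 1 input elements.
  const std::size_t dilatedKernel = (kernelSize - 1) * dilation + 1;
  const std::size_t paddedIn = inSize + padLower + padUpper;
  assert(paddedIn >= dilatedKernel);
  return (paddedIn - dilatedKernel) / stride + 1;
}
```

A 32-element input with a 3-element kernel and one element of padding on each side keeps its size at stride 1 and halves to 16 at stride 2.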
-
poplar::Tensor convolution(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &weights, const ConvParams &params, bool transposeAndFlipWeights, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Convolve an input with a set of weights.
The input tensor is in the form [B x inChans x H x W], and can be allocated using createInput(). The weights tensor is in the form [convGroups x outChansPerConvGroup x inChansPerConvGroup x H x W], and can be allocated using createWeights().
The returned tensor has the shape [B x outChans x H x W]
Padding and striding are specified in the ConvParams structure.
- Parameters
graph – The graph that the operation will be added to.
in – Input data tensor.
weights – Weights tensor.
params – Parameters for the form of the convolution.
transposeAndFlipWeights – For the weight update pass.
prog – Poplar program sequence to append the operation onto.
debugContext – Optional debug information.
options – Options that control the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
- Returns
The convolved output tensor.
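To make the tensor layouts concrete, here is a naive host-side reference for a grouped 2D convolution over flat arrays, which can be handy for checking results on small inputs. `ConvDims` and `referenceConv` are hypothetical names. The sketch assumes stride 1, no padding, and the cross-correlation form common in deep learning; whether that kernel orientation matches the library's convention is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dimensions for this sketch (not poplin::ConvParams).
struct ConvDims {
  std::size_t batch, groups, inPerGroup, outPerGroup;
  std::size_t inH, inW, kH, kW;
};

// Naive grouped 2D convolution (cross-correlation form), stride 1, no padding.
// in:      [B][groups * inPerGroup][inH][inW]
// weights: [groups][outPerGroup][inPerGroup][kH][kW]
// returns: [B][groups * outPerGroup][outH][outW]
std::vector<float> referenceConv(const std::vector<float> &in,
                                 const std::vector<float> &weights,
                                 const ConvDims &d) {
  assert(d.inH >= d.kH && d.inW >= d.kW);
  const std::size_t inChans = d.groups * d.inPerGroup;
  const std::size_t outChans = d.groups * d.outPerGroup;
  const std::size_t outH = d.inH - d.kH + 1;
  const std::size_t outW = d.inW - d.kW + 1;
  assert(in.size() == d.batch * inChans * d.inH * d.inW);
  assert(weights.size() == outChans * d.inPerGroup * d.kH * d.kW);

  std::vector<float> out(d.batch * outChans * outH * outW, 0.0f);
  for (std::size_t b = 0; b < d.batch; ++b) {
    for (std::size_t g = 0; g < d.groups; ++g) {
      for (std::size_t oc = 0; oc < d.outPerGroup; ++oc) {
        const std::size_t outChan = g * d.outPerGroup + oc;
        for (std::size_t y = 0; y < outH; ++y) {
          for (std::size_t x = 0; x < outW; ++x) {
            float acc = 0.0f;
            // Only input channels in the same group contribute.
            for (std::size_t ic = 0; ic < d.inPerGroup; ++ic) {
              const std::size_t inChan = g * d.inPerGroup + ic;
              for (std::size_t ky = 0; ky < d.kH; ++ky) {
                for (std::size_t kx = 0; kx < d.kW; ++kx) {
                  const std::size_t inIdx =
                      ((b * inChans + inChan) * d.inH + (y + ky)) * d.inW +
                      (x + kx);
                  const std::size_t wIdx =
                      ((outChan * d.inPerGroup + ic) * d.kH + ky) * d.kW + kx;
                  acc += in[inIdx] * weights[wIdx];
                }
              }
            }
            out[((b * outChans + outChan) * outH + y) * outW + x] = acc;
          }
        }
      }
    }
  }
  return out;
}
```

Convolving the 3 x 3 input 1..9 with a 2 x 2 kernel of ones yields the 2 x 2 output {12, 16, 24, 28}, the sums of each 2 x 2 window.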
-
void convolutionWithOutput(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &weights, const poplar::Tensor &out, const ConvParams &params, bool transposeAndFlipWeights, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Convolve an input with a set of weights into a pre-allocated output tensor.
The output tensor is in the form [B x outChans x H x W], and can be allocated using createConvOutput(). The weights tensor is in the form [convGroups x outChansPerConvGroup x inChansPerConvGroup x H x W], and can be allocated using createWeights(). The input tensor is in the form [B x inChans x H x W], and can be allocated using createInput().
Padding and striding are specified in the ConvParams structure.
- Parameters
graph – The graph that the operation will be added to.
in – Input data tensor.
weights – Weights tensor.
out – Pre-allocated output tensor.
params – Parameters for the form of the convolution.
transposeAndFlipWeights – For the weight update pass.
prog – Poplar program sequence to append the operation onto.
debugContext – Optional debug information.
options – Options that control the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
-
void preplanConvolutions(const std::set<ConvPlanParams> &convs, PlanningCache &cache)
- Deprecated:
Use preplan() instead.
Plan the specified convolutions.
All entries must have matching machine parameters.
- Parameters
convs – A set of tuples of: a conv-specific target for tile / IPU sizing, the convolution parameters, and the implementation options (see createWeights()).
cache – The planning cache to update.
-
void preplanConvolutions(poplar::Graph &graph, const std::set<ConvPlanParams> &convs, PlanningCache &cache)
- Deprecated:
Use preplan() instead.
Plan the specified convolutions.
All entries must have matching machine parameters.
- Parameters
graph – The graph the convolutions will belong to.
convs – A set of tuples of: a conv-specific target for tile / IPU sizing, the convolution parameters, and the implementation options (see createWeights()).
cache – The planning cache to update.
-
void weightsTransposeChansFlipXY(poplar::Graph &graph, const poplar::Tensor &weightsIn, const poplar::Tensor &weightsOut, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Copy the weights in weightsIn into weightsOut such that each element of the kernel is transposed with respect to the input and output channels, and each spatial dimension of the kernel is flipped.
See the transposeAndFlipWeights parameter in convolution().
- Parameters
graph – The graph that the operation will be added to.
weightsIn – The input weights tensor.
weightsOut – The output weights tensor.
prog – Poplar program sequence to append the operation onto.
debugContext – Optional debug information.
options – Options controlling the implementation. See createWeights().
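The index mapping described above can be written out on a flat host array as follows. `transposeChansFlipXY` is a hypothetical helper that assumes the documented weight layout [convGroups x outChans x inChans x H x W] stored contiguously; it illustrates the transformation only and says nothing about how the library implements it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// in:  [groups][outC][inC][kH][kW]
// out: [groups][inC][outC][kH][kW], with both spatial dimensions reversed.
std::vector<float> transposeChansFlipXY(const std::vector<float> &in,
                                        std::size_t groups, std::size_t outC,
                                        std::size_t inC, std::size_t kH,
                                        std::size_t kW) {
  assert(in.size() == groups * outC * inC * kH * kW);
  std::vector<float> out(in.size());
  for (std::size_t g = 0; g < groups; ++g)
    for (std::size_t o = 0; o < outC; ++o)
      for (std::size_t i = 0; i < inC; ++i)
        for (std::size_t y = 0; y < kH; ++y)
          for (std::size_t x = 0; x < kW; ++x) {
            const std::size_t src = (((g * outC + o) * inC + i) * kH + y) * kW + x;
            // Swap the channel axes and mirror y and x.
            const std::size_t dst =
                (((g * inC + i) * outC + o) * kH + (kH - 1 - y)) * kW +
                (kW - 1 - x);
            out[dst] = in[src];
          }
  return out;
}
```

With a 1 x 1 kernel the result is a plain channel transpose; with a single channel pair it is a plain spatial flip.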
-
poplar::Tensor calculateWeightDeltas(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &activations, const ConvParams &params, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Append an operation to a poplar::Program to generate the tensor of weight deltas.
- Parameters
graph – The tensor will be added to this graph.
zDeltas – Tensor containing the gradients with respect to the output of the convolution.
activations – Tensor containing the inputs to the convolution in the forward pass.
params – Parameters of the convolution.
prog – Poplar program sequence to append the operation onto.
debugContext – Optional debug information.
options – Options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
- Returns
The weight deltas are the gradients with respect to the weights of the convolution. These are populated when the operation runs.
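For intuition, the gradient with respect to the weights is itself a correlation between the forward-pass inputs and the output gradients. The sketch below shows this for a single-channel 1D convolution with stride 1 and no padding. `weightDeltas1D` is a hypothetical name, and the forward pass is assumed to be out[x] = sum over k of w[k] * activations[x + k].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gradient of a 1D, single-channel, stride-1, unpadded convolution with
// respect to its weights: dW[k] = sum over x of zDeltas[x] * activations[x + k].
std::vector<float> weightDeltas1D(const std::vector<float> &activations,
                                  const std::vector<float> &zDeltas) {
  assert(!zDeltas.empty() && activations.size() >= zDeltas.size());
  // With no padding, outputSize = inputSize - kernelSize + 1.
  const std::size_t kernelSize = activations.size() - zDeltas.size() + 1;
  std::vector<float> deltas(kernelSize, 0.0f);
  for (std::size_t k = 0; k < kernelSize; ++k)
    for (std::size_t x = 0; x < zDeltas.size(); ++x)
      deltas[k] += zDeltas[x] * activations[x + k];
  return deltas;
}
```

With activations {1, 2, 3, 4} and zDeltas {1, 0, 2} the kernel has two elements and the deltas are {7, 10}.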
-
void convolutionWeightUpdate(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &weights, const poplar::Tensor &activations, ConvParams params, const poplar::Tensor &scale, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Append operations to a poplar::Program to generate and apply the weight update.
- Parameters
graph – The graph that the operation will be added to.
zDeltas – Tensor containing the gradients with respect to the output of the convolution.
weights – Weights tensor.
activations – Tensor containing the inputs to the convolution in the forward pass.
params – Parameters of the convolution.
scale – Scale to apply to the zDeltas.
prog – Poplar program sequence to append the operations onto.
debugContext – Optional debug information.
options – Options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
-
void convolutionWeightUpdate(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &weights, const poplar::Tensor &activations, ConvParams params, float scale, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Append operations to a poplar::Program to generate and apply the weight update.
- Parameters
graph – The graph that the operation will be added to.
zDeltas – Tensor containing the gradients with respect to the output of the convolution.
weights – Weights tensor.
activations – Tensor containing the inputs to the convolution in the forward pass.
params – Parameters of the convolution.
scale – Scale to apply to the zDeltas.
prog – Poplar program sequence to append the operations onto.
debugContext – Optional debug information.
options – Options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
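Read as arithmetic, "generate and apply the weight update" can be sketched as a scaled accumulation of the weight deltas into the weights. `applyWeightUpdate` is a hypothetical helper, and the additive sign convention (weights += scale * deltas) is an assumption for illustration; under it, a gradient descent step would pass a negative scale such as minus the learning rate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed update rule: weights[i] += scale * weightDeltas[i].
// Because the deltas are linear in zDeltas, scaling the deltas is equivalent
// to scaling zDeltas before the deltas are computed.
void applyWeightUpdate(std::vector<float> &weights,
                       const std::vector<float> &weightDeltas, float scale) {
  assert(weights.size() == weightDeltas.size());
  for (std::size_t i = 0; i < weights.size(); ++i)
    weights[i] += scale * weightDeltas[i];
}
```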
-
void convolutionBiasUpdate(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &biases, const poplar::Tensor &scale, const poplar::OptionFlags &options, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {})
Add a program to update the biases tensor with the gradients derived from the zDeltas tensor.
- Parameters
graph – The graph that the operation will be added to.
zDeltas – Tensor containing the gradients with respect to the output of the convolution.
biases – Biases tensor to update.
scale – Scale to apply to the zDeltas tensor.
options – Options controlling the implementation. See createWeights().
prog – Poplar program sequence to append the operation onto.
debugContext – Optional debug information.
-
void convolutionBiasUpdate(poplar::Graph &graph, const poplar::Tensor &zDeltas, const poplar::Tensor &biases, float scale, const poplar::OptionFlags &options, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {})
Add a program to update the biases tensor with the gradients derived from the zDeltas tensor.
- Parameters
graph – The graph that the operation will be added to.
zDeltas – Tensor containing the gradients with respect to the output of the convolution.
biases – Biases tensor to update.
scale – Scale to apply to the zDeltas tensor.
options – Options controlling the implementation. See createWeights().
prog – Poplar program sequence to append the operation onto.
debugContext – Optional debug information.
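One plausible reading of the bias update is that each bias accumulates the scaled sum of the output gradients for its channel. The sketch below assumes zDeltas is laid out as [B x outChans x field] in a flat array and that the update is additive; `biasUpdate` is a hypothetical name and both of those points are assumptions rather than documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed rule: biases[c] += scale * (sum of zDeltas over batch and field).
// zDeltas layout: [batch][chans][fieldSize], where fieldSize = H * W.
void biasUpdate(std::vector<float> &biases, const std::vector<float> &zDeltas,
                std::size_t batch, std::size_t fieldSize, float scale) {
  const std::size_t chans = biases.size();
  assert(zDeltas.size() == batch * chans * fieldSize);
  for (std::size_t c = 0; c < chans; ++c) {
    float sum = 0.0f;
    for (std::size_t b = 0; b < batch; ++b)
      for (std::size_t i = 0; i < fieldSize; ++i)
        sum += zDeltas[(b * chans + c) * fieldSize + i];
    biases[c] += scale * sum;
  }
}
```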
-
void addBias(poplar::Graph &graph, const poplar::Tensor &in, const poplar::Tensor &biases, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {})
Adds a program to prog which adds biases to the activations tensor.
- Parameters
graph – The graph that the operation will be added to.
in – Tensor containing the values to which the biases are added.
biases – Biases to add to the in tensor.
prog – Poplar program sequence to append the operation onto.
debugContext – Optional debug information.
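The broadcast involved can be sketched on a flat host array: every element of a channel receives that channel's bias. `addBiasInPlace` is a hypothetical helper; the in-place update is inferred from the void return type and the [B x outChans x field] layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// acts layout: [batch][chans][fieldSize]; biases has one value per channel.
void addBiasInPlace(std::vector<float> &acts, const std::vector<float> &biases,
                    std::size_t batch, std::size_t fieldSize) {
  const std::size_t chans = biases.size();
  assert(acts.size() == batch * chans * fieldSize);
  for (std::size_t b = 0; b < batch; ++b)
    for (std::size_t c = 0; c < chans; ++c)
      for (std::size_t i = 0; i < fieldSize; ++i)
        acts[(b * chans + c) * fieldSize + i] += biases[c];
}
```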
-
void reportPlanInfo(std::ostream &out, const poplar::Graph &graph, const ConvParams &params, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Report the convolution plan corresponding to the params and options provided.
- Parameters
out – Output stream to report the plan to.
graph – The graph that the convolution is planned with.
params – The same parameters as used by the convolution().
options – Options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
-
PlanCosts reportPlanEstimatedCosts(const poplar::Graph &graph, const ConvParams &params, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Report the estimated cycles and memory costs of the convolution plan corresponding to the params and options provided.
- Parameters
graph – The graph that the convolution is planned with.
params – The same parameters as used by the convolution().
options – Options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
- Returns
Cycles and memory cost estimates for the planned convolution.
-
void reportWeightUpdatePlanInfo(std::ostream &out, const poplar::Graph &graph, const ConvParams &fwdParams, const poplar::OptionFlags &fwdOptions = {}, PlanningCache *cache = nullptr)
Report the convolution plan corresponding to the weight update pass given the forward pass params and options.
- Parameters
out – Output stream to report the plan to.
graph – The graph that the convolution is planned with.
fwdParams – Forward pass parameters as used by the convolution().
fwdOptions – Forward pass options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
-
poplar::Tensor fullyConnectedWeightTranspose(poplar::Graph &graph, poplar::Tensor weights, const ConvParams &params, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Arranges the weights (activations) such that they are suited for the backward pass in a fully connected layer.
- Parameters
graph – The graph that the operation will be added to.
weights – Tensor containing the weights to rearrange (the inputs to the convolution).
params – Parameters of the convolution.
prog – Poplar program sequence to append the operation onto.
debugContext – Optional debug information.
options – Options controlling the implementation. See createWeights().
cache – Optional pointer to planning cache to use.
- Returns
A tensor with the weights suitably arranged.
-
void convolutionValidateOptions(const poplar::OptionFlags &options)
Provides an interface to validate the convolution options.
An invalid key or value causes an exception to be thrown.
- Parameters
options – Options controlling the implementation. See createWeights().
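The kind of checking this implies can be sketched with a plain string map standing in for poplar::OptionFlags. `validateConvOptions` is a hypothetical function that covers only a subset of the options listed under createWeights(); the use of std::invalid_argument is an assumption, since the text only says that an exception is thrown.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical validator; a string map stands in for poplar::OptionFlags.
void validateConvOptions(const std::map<std::string, std::string> &options) {
  // Options whose value must be one of a fixed set of strings (subset only).
  static const std::map<std::string, std::set<std::string>> enumOptions = {
      {"partialsType", {"half", "float"}},
      {"use128BitConvUnitLoad", {"true", "false"}},
      {"enableMultiStageReduce", {"true", "false"}},
      {"enableFastReduce", {"true", "false"}},
      {"enableConvDithering", {"true", "false"}},
  };
  for (const auto &kv : options) {
    if (kv.first == "availableMemoryProportion") {
      // Must parse fully as a decimal in the inclusive range [0, 1].
      std::size_t pos = 0;
      double value = 0.0;
      try {
        value = std::stod(kv.second, &pos);
      } catch (const std::exception &) {
        throw std::invalid_argument("not a number: " + kv.second);
      }
      if (pos != kv.second.size() || value < 0.0 || value > 1.0)
        throw std::invalid_argument("out of range: " + kv.second);
      continue;
    }
    const auto it = enumOptions.find(kv.first);
    if (it == enumOptions.end())
      throw std::invalid_argument("unknown option: " + kv.first);
    if (it->second.count(kv.second) == 0)
      throw std::invalid_argument("invalid value for " + kv.first);
  }
}
```

Validating options up front in this way surfaces a misspelt key or value before any planning work is done.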
-
struct PlanCosts
- #include <Convolution.hpp>
Structure for estimated costs returned by reportPlanEstimatedCosts()
-
class PlanningCache
Subclassed by poplin::matmul::PlanningCache
-
namespace internal
Functions
-
std::ostream &operator<<(std::ostream &os, DetailedPlanCosts const &c)
-
std::istream &operator>>(std::istream &is, DetailedPlanCosts &c)
-
DetailedPlanCosts reportDetailedPlanEstimatedCosts(const poplar::Graph &graph, const ConvParams &params, const poplar::OptionFlags &options = {}, PlanningCache *cache = nullptr)
Like reportPlanEstimatedCosts() but return an itemised breakdown of the estimates based on what the planner did.
-
struct DetailedPlanCosts
- #include <Convolution.hpp>
Structure for detailed estimated costs returned by reportDetailedPlanEstimatedCosts().