InstanceNorm
#include <popnn/InstanceNorm.hpp>
Instance normalization operations.
Instance norm uses group norm with number of groups = number of channels.
-
namespace popnn
Functions used in neural networks.
-
namespace in
Functions
-
inline std::pair<poplar::Tensor, poplar::Tensor> instanceNormStatistics(poplar::Graph &graph, const poplar::Tensor acts, float eps, poplar::program::Sequence &prog, bool unbiasedVarEstimate, bool stableAlgo, const poplar::Type &partialsType = poplar::FLOAT, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Estimate mean and inverse of standard deviation of activations.
- Parameters
graph – The graph that the normalisation operation is added to.
acts – The activations for which the mean and variance are estimated.
eps – The epsilon value added to the variance to avoid division by zero.
prog – The program sequence to add the operation to.
unbiasedVarEstimate – If true, an unbiased variance estimate will be computed.
stableAlgo – If true, computes the mean first and subtracts it from the activations before computing the variance. This is slower than the implementation used when the flag is false.
partialsType – Poplar type used for partial results. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead.
debugContext – Optional debug information.
options – Instance normalisation options. See groupNormalise().
- Returns
A pair of tensors containing the mean and the inverse standard deviation.
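The statistics described above can be illustrated with a plain C++ reference sketch. This is not the Poplar implementation: the function name `referenceInstanceStats` is hypothetical, it handles a single (batch, channel) instance held in a `std::vector<double>`, and it always uses the two-pass form (mean first, then squared differences).

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical reference helper, not part of popnn.
// `field` holds the activations of one channel of one batch element.
// Returns {mean, inverse standard deviation}.
std::pair<double, double>
referenceInstanceStats(const std::vector<double> &field, double eps,
                       bool unbiasedVarEstimate) {
  const double n = static_cast<double>(field.size());

  // First pass: mean of the activations.
  double sum = 0.0;
  for (double x : field) sum += x;
  const double mean = sum / n;

  // Second pass: sum of squared differences from the mean.
  double sqDiff = 0.0;
  for (double x : field) sqDiff += (x - mean) * (x - mean);

  // Biased estimate divides by n, unbiased by n - 1 (when n > 1).
  const double denom = (unbiasedVarEstimate && field.size() > 1) ? n - 1.0 : n;
  const double variance = sqDiff / denom;

  // eps is added to the variance to avoid division by zero.
  return {mean, 1.0 / std::sqrt(variance + eps)};
}
```

For an input of shape [B][C][..F..], this computation would be repeated once per (batch, channel) pair, since each instance norm group contains exactly one channel.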
-
inline poplar::Tensor instanceNormWhiten(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Whiten activations given the mean and inverse standard deviation.
- Parameters
graph – The graph that the normalisation operation is added to.
acts – The input activations that will be whitened.
mean – The previously calculated mean to subtract from the activations. Typically calculated using instanceNormStatistics().
invStdDev – The previously calculated inverse standard deviation to multiply the activations by. Typically calculated using instanceNormStatistics().
prog – The program sequence to add the operation to.
debugContext – Optional debug information.
options – Instance normalisation options. See groupNormalise().
- Returns
A new tensor with the whitened activations.
-
inline std::pair<poplar::Tensor, poplar::Tensor> instanceNormalise(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gamma, const poplar::Tensor &beta, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Instance normalise activations given the mean, inverse standard deviation and norm parameters.
As instance normalise uses group normalise, options are passed through. See the groupNormalise() documentation for details of the options.
- Parameters
graph – The graph that the normalisation operation is added to.
acts – The input activations to whiten and normalise, with shape [B][C][..F..] where:
B is the batch size
C is the number of channels
..F.. are the dimensions of an N-dimensional field.
gamma – The gamma weights to multiply by when normalising the whitened activations.
beta – The beta weights to add when normalising the whitened activations.
mean – The mean to subtract when whitening the activations.
invStdDev – The inverse standard deviation to multiply by when whitening the activations.
prog – The program sequence to add the operation to.
debugContext – Optional debug information.
options – Instance normalisation options. See groupNormalise().
- Returns
Two tensors containing:
normalised activations
whitened activations
-
inline std::pair<poplar::Tensor, poplar::Tensor> instanceNormParamGradients(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &iStdDev, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Compute gradients with respect to parameters for parameter update.
- Parameters
graph – The graph that the normalisation operation is added to.
acts – The forward-pass activation inputs to this layer.
gradsIn – The gradient with respect to the output of this layer.
mean – The mean of the acts tensor, typically calculated using instanceNormStatistics().
iStdDev – The inverse standard deviation of the acts tensor, typically calculated using instanceNormStatistics().
prog – The program sequence to add the operation to.
partialsType – The Poplar type to be used for intermediate values. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead.
debugContext – Optional debug information.
options – Instance normalisation options. See groupNormalise().
- Returns
A pair of tensors, gammaDelta and betaDelta, which are the gradients with respect to gamma and beta.
-
inline std::pair<poplar::Tensor, poplar::Tensor> instanceNormParamGradients(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Compute gradients with respect to parameters for parameter update.
- Parameters
graph – The graph that the normalisation operation is added to.
actsWhitened – The forward-pass whitened activation inputs to this layer.
gradsIn – The gradient with respect to the output of this layer.
prog – The program sequence to add the operation to.
partialsType – The Poplar type to be used for intermediate values. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead.
debugContext – Optional debug information.
options – Instance normalisation options. See groupNormalise().
- Returns
A pair of tensors, gammaDelta and betaDelta, which are the gradients with respect to gamma and beta.
-
inline poplar::Tensor instanceNormGradients(poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Compute gradients with respect to input activations for the instance norm layer.
Gradients are propagated through the complete layer including statistics computation.
- Parameters
graph – The graph that the normalisation operation is added to.
acts – The forward-pass activation inputs to this layer.
gradsIn – The gradient with respect to the output of this layer.
mean – The mean of the acts tensor, typically calculated using instanceNormStatistics().
invStdDev – The inverse standard deviation of the acts tensor, typically calculated using instanceNormStatistics().
gamma – The gamma weights to multiply by when normalising the whitened activations.
prog – The program sequence to add the operation to.
partialsType – The Poplar type to be used for intermediate values. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead.
debugContext – Optional debug information.
options – Instance normalisation options. See groupNormalise().
- Returns
A tensor containing the gradients with respect to the input activations for this layer.
-
inline poplar::Tensor instanceNormGradients(poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType = poplar::FLOAT, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Compute gradients with respect to input activations for the instance norm layer.
Gradients are propagated through the complete layer including statistics computation.
- Parameters
graph – The graph that the normalisation operation is added to.
actsWhitened – The forward-pass whitened activation inputs to this layer.
gradsIn – The gradient with respect to the output of this layer.
invStdDev – The inverse standard deviation of the acts tensor, typically calculated using instanceNormStatistics().
gamma – The gamma weights to multiply by when normalising the whitened activations.
prog – The program sequence to add the operation to.
partialsType – The Poplar type to be used for intermediate values. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead.
debugContext – Optional debug information.
options – Instance normalisation options. See groupNormalise().
- Returns
A tensor containing the gradients with respect to the input activations for this layer.
-
inline void instanceNormParamUpdate(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, float scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Update parameters for the instance norm layer.
Gradients are propagated through the complete layer including statistics computation.
The gamma and beta parameters are updated as follows:
gamma += gammaDelta * scale
beta += betaDelta * scale
scale is a float and therefore constant.
- Parameters
graph – The graph that the normalisation operation is added to.
gammaDelta – Value used to update gamma.
betaDelta – Value used to update beta.
scale – Scale factor for gammaDelta and betaDelta.
gamma – The gamma weights to multiply by when normalising the activations.
beta – The beta weights to add when normalising the activations.
prog – The program sequence to add the operation to.
debugContext – Optional debug information.
options – Instance normalisation options. See groupNormalise().
-
inline void instanceNormParamUpdate(poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, const poplar::Tensor &scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Update parameters for the instance norm layer.
Gradients are propagated through the complete layer including statistics computation.
The gamma and beta parameters are updated as follows:
gamma += gammaDelta * scale
beta += betaDelta * scale
scale is a tensor and therefore variable.
- Parameters
graph – The graph that the normalisation operation is added to.
gammaDelta – Value used to update gamma.
betaDelta – Value used to update beta.
scale – Scale factor for gammaDelta and betaDelta.
gamma – The gamma weights to multiply by when normalising the activations.
beta – The beta weights to add when normalising the activations.
prog – The program sequence to add the operation to.
debugContext – Optional debug information.
options – Instance normalisation options. See groupNormalise().
-
uint64_t getFwdFlops(uint64_t numChannels, uint64_t actsPerChannel, bool computeEstimates)
For computing the floating point operations required, the following values are used:
Activations per channel:
for fully-connected layers: the total number of batches.
for convolution layers: the field size per channel * batch size.
Number of channels:
for fully-connected layers: the total number of activations in a batch.
for convolution layers: the total number of channels.
- Parameters
numChannels – The number of channels.
actsPerChannel – The activations per channel.
computeEstimates –
- Returns
Number of floating point operations required.
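For a convolution layer, the two counts described above can be derived from the activation shape. The sketch below is a hypothetical helper (the `NormFlopArgs` struct and `convFlopArgs` function are not part of popnn) that assumes a shape of the form [B][C][..F..] with at least two dimensions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the two counts, not part of popnn.
struct NormFlopArgs {
  std::uint64_t numChannels;
  std::uint64_t actsPerChannel;
};

// Hypothetical helper: derives the counts for a convolution layer from an
// activation shape {B, C, F0, F1, ...}. Assumes shape.size() >= 2.
NormFlopArgs convFlopArgs(const std::vector<std::uint64_t> &shape) {
  // Field size per channel is the product of all field dimensions.
  std::uint64_t fieldSize = 1;
  for (std::size_t i = 2; i < shape.size(); ++i) fieldSize *= shape[i];
  // Activations per channel = field size per channel * batch size.
  return {shape[1], fieldSize * shape[0]};
}
```

For example, a [4][16][8][8] activation tensor gives 16 channels and 8 * 8 * 4 = 256 activations per channel.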
-
uint64_t getBwdFlops(uint64_t numChannels, uint64_t actsPerChannel)
- Parameters
numChannels – The number of channels. See getFwdFlops().
actsPerChannel – The activations per channel. See getFwdFlops().
- Returns
Number of floating point operations required.
-
uint64_t getWuFlops(uint64_t numChannels, uint64_t actsPerChannel)
- Parameters
numChannels – The number of channels. See getFwdFlops().
actsPerChannel – The activations per channel. See getFwdFlops().
- Returns
Number of floating point operations required.
-