Poplar and PopLibs
Group normalization operations.
#include "poplar/Program.hpp"
#include "poplar/Tensor.hpp"
#include <poplar/OptionFlags.hpp>
#include <utility>
Namespaces | |
| namespace | popnn |
| Functions used in neural networks. | |
Functions | |
| std::pair< poplar::Tensor, poplar::Tensor > | popnn::gn::groupNormStatistics (poplar::Graph &graph, const poplar::Tensor acts, float eps, poplar::program::Sequence &prog, unsigned numGroups, bool unbiasedVarEstimate, bool stableAlgo=false, const poplar::Type &partialsType=poplar::FLOAT, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={}) |
| Estimate mean and inverse of standard deviation of activations. | |
| poplar::Tensor | popnn::gn::groupNormWhiten (poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={}) |
| Whiten activations given the mean and inverse standard deviation. | |
| std::pair< poplar::Tensor, poplar::Tensor > | popnn::gn::groupNormalise (poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gamma, const poplar::Tensor &beta, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={}) |
| Group normalise activations given the mean, inverse standard deviation and group norm parameters. | |
| std::pair< poplar::Tensor, poplar::Tensor > | popnn::gn::groupNormParamGradients (poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &iStdDev, poplar::program::Sequence &prog, const poplar::Type &partialsType=poplar::FLOAT, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={}) |
| Compute gradients with respect to parameters for parameter update. | |
| std::pair< poplar::Tensor, poplar::Tensor > | popnn::gn::groupNormParamGradients (poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, poplar::program::Sequence &prog, const poplar::Type &partialsType=poplar::FLOAT, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={}) |
| Compute gradients with respect to parameters for parameter update. | |
| poplar::Tensor | popnn::gn::groupNormGradients (poplar::Graph &graph, const poplar::Tensor &acts, const poplar::Tensor &gradsIn, const poplar::Tensor &mean, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType=poplar::FLOAT, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={}) |
| Compute gradients with respect to input activations for the group norm layer. | |
| poplar::Tensor | popnn::gn::groupNormGradients (poplar::Graph &graph, const poplar::Tensor &actsWhitened, const poplar::Tensor &gradsIn, const poplar::Tensor &invStdDev, const poplar::Tensor &gamma, poplar::program::Sequence &prog, const poplar::Type &partialsType=poplar::FLOAT, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={}) |
| Compute gradients with respect to input activations for the group norm layer. | |
| void | popnn::gn::groupNormParamUpdate (poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, float scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={}) |
| Update parameters for the group norm layer. | |
| void | popnn::gn::groupNormParamUpdate (poplar::Graph &graph, const poplar::Tensor &gammaDelta, const poplar::Tensor &betaDelta, const poplar::Tensor &scale, poplar::Tensor &gamma, poplar::Tensor &beta, poplar::program::Sequence &prog, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={}) |
| Update parameters for the group norm layer. | |
| std::pair< poplar::Tensor, poplar::Tensor > popnn::gn::groupNormalise | ( | poplar::Graph & | graph, |
| const poplar::Tensor & | acts, | ||
| const poplar::Tensor & | gamma, | ||
| const poplar::Tensor & | beta, | ||
| const poplar::Tensor & | mean, | ||
| const poplar::Tensor & | invStdDev, | ||
| poplar::program::Sequence & | prog, | ||
| const poplar::DebugContext & | debugContext = {}, |
||
| const poplar::OptionFlags & | options = {} |
||
| ) |
Group normalise activations given the mean, inverse standard deviation and group norm parameters.
Group normalisation options
groupNormStridedChannelGrouping (true, false) [=true]
Select groups of channels for group normalisation with a stride between channels. This makes the implementation more efficient but is unconventional. Among other things, it means that pre-trained weights can only be used if they were produced with this same strided grouping.
If we have numGroups groups then the channels in the group groups[groupIdx] are given by:
In the case of instanceNormalise() and layerNormalise() (which use group norm in their implementation) this option will have no effect.
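The two grouping schemes selected by groupNormStridedChannelGrouping can be illustrated with a small host-side sketch. It assumes that strided grouping assigns channel `c` to group `c % numGroups` and that the conventional scheme assigns contiguous blocks of channels to each group; the helper name `channelGroups` is hypothetical and is not part of PopLibs.

```cpp
#include <cassert>
#include <vector>

// Returns, for each group, the channel indices assigned to it.
// Assumption: strided grouping interleaves channels across groups, while the
// conventional grouping gives each group a contiguous block of channels.
std::vector<std::vector<unsigned>> channelGroups(unsigned numChannels,
                                                 unsigned numGroups,
                                                 bool strided) {
  assert(numGroups > 0 && numChannels % numGroups == 0);
  const unsigned channelsPerGroup = numChannels / numGroups;
  std::vector<std::vector<unsigned>> groups(numGroups);
  for (unsigned groupIdx = 0; groupIdx < numGroups; ++groupIdx) {
    for (unsigned i = 0; i < channelsPerGroup; ++i) {
      groups[groupIdx].push_back(strided ? i * numGroups + groupIdx
                                         : i + channelsPerGroup * groupIdx);
    }
  }
  return groups;
}
```

Under these assumptions, 6 channels in 2 groups give {0, 2, 4} and {1, 3, 5} with strided grouping, and {0, 1, 2} and {3, 4, 5} without it, which is why weights trained with one scheme do not transfer to the other.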
| graph | The graph that the normalisation operation is added to. |
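| acts | The input activations to whiten and normalise, with shape [B][C][..F..] where B is the batch size, C is the number of channels and ..F.. are the dimensions of an N-dimensional field. |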
| gamma | The gamma weights to multiply by when normalising the whitened activations. |
| beta | The beta weights to add when normalising the whitened activations. |
| mean | The mean to subtract when whitening the activations. |
| invStdDev | The inverse standard deviation to multiply by when whitening the activations. |
| prog | The program sequence to add the operation to. |
| debugContext | Optional debug information. |
| options | Group normalisation options. |
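As a rough illustration of the arithmetic described by the parameters above, the following host-side sketch whitens the activations of a single group and then applies per-channel gamma and beta. It does not use Poplar; the names `NormResult` and `normaliseOneGroup` are hypothetical, and the `[channel][fieldElement]` layout with one scalar mean and invStdDev for the group is a simplifying assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Whitened and normalised activations, both laid out as [channel][fieldElement].
struct NormResult {
  std::vector<std::vector<float>> whitened;
  std::vector<std::vector<float>> normalised;
};

// Reference arithmetic for the channels of one group of one sample.
// Assumption: mean and invStdDev are the scalar statistics of this group.
NormResult normaliseOneGroup(const std::vector<std::vector<float>> &acts,
                             const std::vector<float> &gamma,
                             const std::vector<float> &beta,
                             float mean, float invStdDev) {
  assert(acts.size() == gamma.size() && acts.size() == beta.size());
  NormResult result{acts, acts};
  for (std::size_t c = 0; c < acts.size(); ++c) {
    for (std::size_t f = 0; f < acts[c].size(); ++f) {
      // Whiten: subtract the mean, multiply by the inverse standard deviation.
      const float w = (acts[c][f] - mean) * invStdDev;
      result.whitened[c][f] = w;
      // Normalise: scale by gamma and shift by beta for this channel.
      result.normalised[c][f] = gamma[c] * w + beta[c];
    }
  }
  return result;
}
```

For example, with activations {{1, 3}, {5, 7}}, mean 4, invStdDev 0.5, gamma {2, 1} and beta {0, 10}, the whitened values are {{-1.5, -0.5}, {0.5, 1.5}} and the normalised values are {{-3, -1}, {10.5, 11.5}}.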
| poplar::Tensor popnn::gn::groupNormGradients | ( | poplar::Graph & | graph, |
| const poplar::Tensor & | acts, | ||
| const poplar::Tensor & | gradsIn, | ||
| const poplar::Tensor & | mean, | ||
| const poplar::Tensor & | invStdDev, | ||
| const poplar::Tensor & | gamma, | ||
| poplar::program::Sequence & | prog, | ||
| const poplar::Type & | partialsType = poplar::FLOAT, |
||
| const poplar::DebugContext & | debugContext = {}, |
||
| const poplar::OptionFlags & | options = {} |
||
| ) |
Compute gradients with respect to input activations for the group norm layer.
Gradients are propagated through the complete layer including statistics computation.
| graph | The graph that the normalisation operation is added to. |
| acts | The forward-pass activation inputs to this layer. |
| gradsIn | The gradient with respect to the output of this layer. |
| mean | The mean of the acts tensor, typically calculated using groupNormStatistics(). |
| invStdDev | The inverse standard deviation of the acts tensor, typically calculated using groupNormStatistics(). |
| gamma | The gamma weights to multiply by when normalising the whitened activations. |
| prog | The program sequence to add the operation to. |
| partialsType | Poplar type used for intermediate values. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead. |
| debugContext | Optional debug information. |
| options | Group normalisation options. See groupNormalise(). |
| poplar::Tensor popnn::gn::groupNormGradients | ( | poplar::Graph & | graph, |
| const poplar::Tensor & | actsWhitened, | ||
| const poplar::Tensor & | gradsIn, | ||
| const poplar::Tensor & | invStdDev, | ||
| const poplar::Tensor & | gamma, | ||
| poplar::program::Sequence & | prog, | ||
| const poplar::Type & | partialsType = poplar::FLOAT, |
||
| const poplar::DebugContext & | debugContext = {}, |
||
| const poplar::OptionFlags & | options = {} |
||
| ) |
Compute gradients with respect to input activations for the group norm layer.
Gradients are propagated through the complete layer including statistics computation.
| graph | The graph that the normalisation operation is added to. |
| actsWhitened | The forward-pass whitened activation inputs to this layer. |
| gradsIn | The gradient with respect to the output of this layer. |
| invStdDev | The inverse standard deviation of the acts tensor, typically calculated using groupNormStatistics(). |
| gamma | The gamma weights to multiply by when normalising the whitened activations. |
| prog | The program sequence to add the operation to. |
| partialsType | Poplar type used for intermediate values. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead. |
| debugContext | Optional debug information. |
| options | Group normalisation options. See groupNormalise(). |
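To make "propagated through the complete layer including statistics computation" concrete, here is a host-side sketch of the standard normalisation backward pass for one group of one sample. It assumes a biased variance estimate, passes gamma per element for simplicity, and uses the hypothetical name `groupNormGradReference`; it is an illustration of the mathematics, not the PopLibs implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Input gradient for the elements of one group, given whitened activations.
// Assumptions: biased variance estimate, one shared invStdDev for the group,
// and gamma already expanded to one value per element.
std::vector<float> groupNormGradReference(const std::vector<float> &actsWhitened,
                                          const std::vector<float> &gradsIn,
                                          const std::vector<float> &gamma,
                                          float invStdDev) {
  const std::size_t n = actsWhitened.size();
  assert(n > 0 && gradsIn.size() == n && gamma.size() == n);
  // Gradient with respect to the whitened activations.
  std::vector<float> gradWhitened(n);
  float meanGrad = 0.0f;
  float meanGradTimesAct = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    gradWhitened[i] = gradsIn[i] * gamma[i];
    meanGrad += gradWhitened[i];
    meanGradTimesAct += gradWhitened[i] * actsWhitened[i];
  }
  meanGrad /= static_cast<float>(n);
  meanGradTimesAct /= static_cast<float>(n);
  // The two subtracted terms account for the dependence of the mean and the
  // standard deviation on every input element.
  std::vector<float> gradsOut(n);
  for (std::size_t i = 0; i < n; ++i) {
    gradsOut[i] = invStdDev * (gradWhitened[i] - meanGrad -
                               actsWhitened[i] * meanGradTimesAct);
  }
  return gradsOut;
}
```

One consequence worth noting: a gradient that is constant across the group (with uniform gamma) produces a zero input gradient, because adding a constant to every activation in a group does not change the whitened output.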
| std::pair< poplar::Tensor, poplar::Tensor > popnn::gn::groupNormParamGradients | ( | poplar::Graph & | graph, |
| const poplar::Tensor & | acts, | ||
| const poplar::Tensor & | gradsIn, | ||
| const poplar::Tensor & | mean, | ||
| const poplar::Tensor & | iStdDev, | ||
| poplar::program::Sequence & | prog, | ||
| const poplar::Type & | partialsType = poplar::FLOAT, |
||
| const poplar::DebugContext & | debugContext = {}, |
||
| const poplar::OptionFlags & | options = {} |
||
| ) |
Compute gradients with respect to parameters for parameter update.
| graph | The graph that the normalisation operation is added to. |
| acts | The forward-pass activation inputs to this layer. |
| gradsIn | The gradient with respect to the output of this layer. |
| mean | The mean of the acts tensor, typically calculated using groupNormStatistics(). |
| iStdDev | The inverse standard deviation of the acts tensor, typically calculated using groupNormStatistics(). |
| prog | The program sequence to add the operation to. |
| partialsType | Poplar type used for intermediate values. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead. |
| debugContext | Optional debug information. |
| options | Group normalisation options. See groupNormalise(). |
Returns a pair of tensors, gammaDelta and betaDelta, which are the gradients with respect to gamma and beta.
| std::pair< poplar::Tensor, poplar::Tensor > popnn::gn::groupNormParamGradients | ( | poplar::Graph & | graph, |
| const poplar::Tensor & | actsWhitened, | ||
| const poplar::Tensor & | gradsIn, | ||
| poplar::program::Sequence & | prog, | ||
| const poplar::Type & | partialsType = poplar::FLOAT, |
||
| const poplar::DebugContext & | debugContext = {}, |
||
| const poplar::OptionFlags & | options = {} |
||
| ) |
Compute gradients with respect to parameters for parameter update.
| graph | The graph that the normalisation operation is added to. |
| actsWhitened | The forward-pass whitened activation inputs to this layer. |
| gradsIn | The gradient with respect to the output of this layer. |
| prog | The program sequence to add the operation to. |
| partialsType | Poplar type used for intermediate values. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead. |
| debugContext | Optional debug information. |
| options | Group normalisation options. See groupNormalise(). |
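The parameter gradients can be sketched on the host as per-channel reductions over the whitened activations and the incoming gradients. The name `paramGradReference` and the `[channel][element]` layout (with the element index running over batch and field dimensions together) are assumptions made for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns {gammaDelta, betaDelta}, one value per channel.
// Assumption: inputs are laid out as [channel][element], where the element
// index covers the batch and field dimensions together.
std::pair<std::vector<float>, std::vector<float>>
paramGradReference(const std::vector<std::vector<float>> &actsWhitened,
                   const std::vector<std::vector<float>> &gradsIn) {
  assert(actsWhitened.size() == gradsIn.size());
  const std::size_t numChannels = actsWhitened.size();
  std::vector<float> gammaDelta(numChannels, 0.0f);
  std::vector<float> betaDelta(numChannels, 0.0f);
  for (std::size_t c = 0; c < numChannels; ++c) {
    assert(actsWhitened[c].size() == gradsIn[c].size());
    for (std::size_t e = 0; e < gradsIn[c].size(); ++e) {
      // gamma multiplies the whitened activation, so its gradient is the
      // sum of gradsIn * actsWhitened; beta is added, so its gradient is
      // the plain sum of gradsIn.
      gammaDelta[c] += gradsIn[c][e] * actsWhitened[c][e];
      betaDelta[c] += gradsIn[c][e];
    }
  }
  return {gammaDelta, betaDelta};
}
```

With actsWhitened {{1, -1}, {0.5, 2}} and gradsIn {{2, 4}, {6, 1}}, this gives gammaDelta {-2, 5} and betaDelta {6, 7}.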
Returns a pair of tensors, gammaDelta and betaDelta, which are the gradients with respect to gamma and beta.
| void popnn::gn::groupNormParamUpdate | ( | poplar::Graph & | graph, |
| const poplar::Tensor & | gammaDelta, | ||
| const poplar::Tensor & | betaDelta, | ||
| const poplar::Tensor & | scale, | ||
| poplar::Tensor & | gamma, | ||
| poplar::Tensor & | beta, | ||
| poplar::program::Sequence & | prog, | ||
| const poplar::DebugContext & | debugContext = {}, |
||
| const poplar::OptionFlags & | options = {} |
||
| ) |
Update parameters for the group norm layer.
Gradients are propagated through the complete layer including statistics computation.
The gamma and beta parameters are updated as follows:
gamma += gammaDelta * scale
beta += betaDelta * scale
scale is a tensor and therefore variable.
| graph | The graph that the normalisation operation is added to. |
| gammaDelta | Value used to update gamma. |
| betaDelta | Value used to update beta. |
| scale | Scale factor for gammaDelta and betaDelta. |
| gamma | The gamma weights to multiply by when normalising the activations. |
| beta | The beta weights to add when normalising the activations. |
| prog | The program sequence to add the operation to. |
| debugContext | Optional debug information. |
| options | Group normalisation options. See groupNormalise(). |
| void popnn::gn::groupNormParamUpdate | ( | poplar::Graph & | graph, |
| const poplar::Tensor & | gammaDelta, | ||
| const poplar::Tensor & | betaDelta, | ||
| float | scale, | ||
| poplar::Tensor & | gamma, | ||
| poplar::Tensor & | beta, | ||
| poplar::program::Sequence & | prog, | ||
| const poplar::DebugContext & | debugContext = {}, |
||
| const poplar::OptionFlags & | options = {} |
||
| ) |
Update parameters for the group norm layer.
Gradients are propagated through the complete layer including statistics computation.
The gamma and beta parameters are updated as follows:
gamma += gammaDelta * scale
beta += betaDelta * scale
scale is a float and therefore constant.
| graph | The graph that the normalisation operation is added to. |
| gammaDelta | Value used to update gamma. |
| betaDelta | Value used to update beta. |
| scale | Scale factor for gammaDelta and betaDelta. |
| gamma | The gamma weights to multiply by when normalising the activations. |
| beta | The beta weights to add when normalising the activations. |
| prog | The program sequence to add the operation to. |
| debugContext | Optional debug information. |
| options | Group normalisation options. See groupNormalise(). |
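The update rule above is simple enough to state as a host-side sketch. The helper name `paramUpdateReference` is hypothetical, and the code only mirrors the documented arithmetic on plain vectors rather than on Poplar tensors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies gamma += gammaDelta * scale and beta += betaDelta * scale in place.
void paramUpdateReference(const std::vector<float> &gammaDelta,
                          const std::vector<float> &betaDelta, float scale,
                          std::vector<float> &gamma,
                          std::vector<float> &beta) {
  assert(gammaDelta.size() == gamma.size() && betaDelta.size() == beta.size());
  for (std::size_t i = 0; i < gamma.size(); ++i) {
    gamma[i] += gammaDelta[i] * scale;
  }
  for (std::size_t i = 0; i < beta.size(); ++i) {
    beta[i] += betaDelta[i] * scale;
  }
}
```

Because the deltas are added, a gradient-descent step would presumably pass a negative scale (for example, minus the learning rate). With gamma {1, 1}, beta {0, 0}, gammaDelta {2, -4}, betaDelta {1, 3} and scale -0.5, the result is gamma {0, 3} and beta {-0.5, -1.5}.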
| std::pair< poplar::Tensor, poplar::Tensor > popnn::gn::groupNormStatistics | ( | poplar::Graph & | graph, |
| const poplar::Tensor | acts, | ||
| float | eps, | ||
| poplar::program::Sequence & | prog, | ||
| unsigned | numGroups, | ||
| bool | unbiasedVarEstimate, | ||
| bool | stableAlgo = false, |
||
| const poplar::Type & | partialsType = poplar::FLOAT, |
||
| const poplar::DebugContext & | debugContext = {}, |
||
| const poplar::OptionFlags & | options = {} |
||
| ) |
Estimate mean and inverse of standard deviation of activations.
| graph | The graph that the normalisation operation is added to. |
| acts | The activations for which the mean and variance are estimated. |
| eps | The epsilon value added to the variance to avoid division by zero. |
| prog | The program sequence to add the operation to. |
| numGroups | The number of groups to split the channel dimension into when calculating group norm statistics. The groupNormStridedChannelGrouping option defines how the split is made. |
| unbiasedVarEstimate | If true, an unbiased variance estimate will be computed. |
| stableAlgo | If true, the mean is computed first and subtracted from the activations before the variance is computed. The implementation with this flag set to true is slower than when set to false. |
| partialsType | Poplar type used for intermediate values. If the type specified is smaller than the input/output type then partialsType is ignored and the input/output type is used instead. |
| debugContext | Optional debug information. |
| options | Group normalisation options. See groupNormalise(). |
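The roles of eps, unbiasedVarEstimate and stableAlgo can be illustrated with a host-side sketch that computes the statistics for the values of one group of one sample. The name `groupStatsReference` is hypothetical, and the details (scaling the variance by n / (n - 1) for the unbiased estimate, adding eps before the square root) are assumptions about the arithmetic, not a description of the PopLibs code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Returns {mean, invStdDev} for the values of one group of one sample.
std::pair<float, float> groupStatsReference(const std::vector<float> &x,
                                            float eps,
                                            bool unbiasedVarEstimate,
                                            bool stableAlgo) {
  assert(x.size() > 1);
  const float n = static_cast<float>(x.size());
  float sum = 0.0f;
  for (float v : x) sum += v;
  const float mean = sum / n;

  float var = 0.0f;
  if (stableAlgo) {
    // Two passes: centre the data on the mean, then average the squares.
    for (float v : x) var += (v - mean) * (v - mean);
    var /= n;
  } else {
    // One pass over the squares: E[x^2] - E[x]^2. Cheaper, but it can lose
    // precision when the mean is large relative to the spread.
    float sumSq = 0.0f;
    for (float v : x) sumSq += v * v;
    var = sumSq / n - mean * mean;
  }
  // Assumed form of the unbiased estimate: Bessel's correction.
  if (unbiasedVarEstimate) var *= n / (n - 1.0f);
  // eps keeps the result finite when the variance is zero.
  return {mean, 1.0f / std::sqrt(var + eps)};
}
```

For the values {2, 4, 4, 4, 5, 5, 7, 9} with eps 0 and a biased estimate, both code paths give a mean of 5 and an inverse standard deviation of 0.5 (variance 4).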
| poplar::Tensor popnn::gn::groupNormWhiten | ( | poplar::Graph & | graph, |
| const poplar::Tensor & | acts, | ||
| const poplar::Tensor & | mean, | ||
| const poplar::Tensor & | invStdDev, | ||
| poplar::program::Sequence & | prog, | ||
| const poplar::DebugContext & | debugContext = {}, |
||
| const poplar::OptionFlags & | options = {} |
||
| ) |
Whiten activations given the mean and inverse standard deviation.
| graph | The graph that the normalisation operation is added to. |
| acts | The input activations that will be whitened. |
| mean | The previously calculated mean to subtract from the activations. Typically calculated using groupNormStatistics(). |
| invStdDev | The previously calculated inverse standard deviation to multiply the activations by. Typically calculated using groupNormStatistics(). |
| prog | The program sequence to add the operation to. |
| debugContext | Optional debug information. |
| options | Group normalisation options. See groupNormalise(). |
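Whitening on its own is the subtract-and-scale step described by the mean and invStdDev parameters. The sketch below applies it to the values of a single group on the host; `whitenReference` is a hypothetical name and the flat-vector layout is a simplification.

```cpp
#include <cassert>
#include <vector>

// Whitens the values of one group: (x - mean) * invStdDev.
std::vector<float> whitenReference(const std::vector<float> &acts, float mean,
                                   float invStdDev) {
  assert(invStdDev > 0.0f);
  std::vector<float> out;
  out.reserve(acts.size());
  for (float v : acts) out.push_back((v - mean) * invStdDev);
  return out;
}
```

Using statistics that match the data, for instance {2, 4, 4, 4, 5, 5, 7, 9} with mean 5 and invStdDev 0.5, the output is {-1.5, -0.5, -0.5, -0.5, 0, 0, 1, 2}, which has zero mean and unit (biased) variance.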