CTCLoss
#include <popnn/CTCLoss.hpp>
Support for Connectionist Temporal Classification (CTC) Loss.
-
namespace popnn
Functions used in neural networks.
-
namespace ctc
Functions
-
Plan plan(const poplar::Graph &graph, const poplar::Type &inType, const poplar::Type &outType, unsigned batchSize, unsigned maxTime, unsigned maxLabelLength, unsigned numClasses, const poplar::OptionFlags &options = {})
Create a plan for implementing the CTC Loss (gradient) function.
CTC Loss options
partialsType
poplar::Type [=poplar::FLOAT]
The type to use for partial results.
availableMemoryProportion
Decimal between 0 and 1 (inclusive) [=0.6]
The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation.
See also the technical note Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for practical examples of using availableMemoryProportion.
- Parameters
graph – The graph the operation will be added to
inType – The data type of the probability data input
outType – The data type of the gradient output
batchSize – The size of the batch to be processed at once
maxTime – The maximum time of any data input to be planned for
maxLabelLength – The maximum length of any label to be planned for
numClasses – The number of symbols/classes in the “alphabet”, including the blankClass
options – Any implementation/debug options for the operation
- Returns
The plan produced, which will specify how the operation is to be implemented
-
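As a sketch of how a plan might be created (the dimension values and the pre-existing `graph` below are illustrative assumptions, not part of this API):

```cpp
#include <poplar/Graph.hpp>
#include <popnn/CTCLoss.hpp>

// Illustrative dimensions only; choose values to match your model.
const unsigned batchSize = 4;        // sequences processed at once
const unsigned maxTime = 100;        // longest input sequence planned for
const unsigned maxLabelLength = 20;  // longest label planned for
const unsigned numClasses = 29;      // alphabet size, including the blank

// `graph` is assumed to be an already-constructed poplar::Graph.
popnn::ctc::Plan plan = popnn::ctc::plan(
    graph, poplar::HALF, poplar::FLOAT, batchSize, maxTime, maxLabelLength,
    numClasses, {{"availableMemoryProportion", "0.6"}});
```

The same plan is then passed to the create and calculate functions below, so the input tensors are mapped the way the chosen implementation expects.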
poplar::Tensor createDataInput(poplar::Graph &graph, const poplar::Type &type, const std::size_t batchSize, const std::size_t maxTime, const std::size_t numClasses, const Plan &plan, const poplar::DebugContext &debugContext = {})
Create and map a data input [maxTime, batchSize, numClasses] tensor which the gradient function will use.
Mapping is according to the plan provided.
- Parameters
graph – The graph the data tensor will be added to
type – The data type of the tensor to be added to the graph
batchSize – The size of the batch to be processed at once
maxTime – The time dimension of the tensor to be created
numClasses – The number of symbols/classes in the “alphabet”, including the blankClass
plan – The plan which will specify how the tensor is to be mapped
debugContext – Optional debug information
- Returns
The data input [maxTime, batchSize, numClasses] tensor
-
poplar::Tensor createLabelsInput(poplar::Graph &graph, const poplar::Type &type, const std::size_t batchSize, const std::size_t maxLabelLength, const Plan &plan, const poplar::DebugContext &debugContext = {})
Create and map a labels input [batchSize, maxLabelLength] tensor which the gradient function will use.
Mapping is according to the plan provided.
- Parameters
graph – The graph the labels tensor will be added to
type – The data type of the tensor to be added to the graph
batchSize – The size of the batch to be processed at once
maxLabelLength – The maximum length of any label
plan – The plan which will specify how the tensor is to be mapped
debugContext – Optional debug information
- Returns
The labels input [batchSize, maxLabelLength] tensor
-
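A sketch of allocating both inputs with these create functions, so they are laid out as the plan expects (`graph` and `plan` are assumed to come from a matching popnn::ctc::plan call, and the dimensions must match those the plan was created with):

```cpp
#include <popnn/CTCLoss.hpp>

poplar::Tensor data = popnn::ctc::createDataInput(
    graph, poplar::HALF, batchSize, maxTime, numClasses, plan, {"data"});
poplar::Tensor labels = popnn::ctc::createLabelsInput(
    graph, poplar::UNSIGNED_INT, batchSize, maxLabelLength, plan, {"labels"});
// data   has shape [maxTime, batchSize, numClasses]
// labels has shape [batchSize, maxLabelLength]
```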
std::pair<poplar::Tensor, poplar::Tensor> calcLossAndGradientLogProbabilities(poplar::Graph &graph, const poplar::Type &outType, const poplar::Tensor &logProbs, const poplar::Tensor &labels, const poplar::Tensor &dataLengths, const poplar::Tensor &labelLengths, poplar::program::Sequence &prog, const unsigned blankClass, const Plan &plan, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Calculate the CTC loss & gradient, creating and mapping the result tensor according to the plan provided.
calcLossAndGradientLogProbabilities options
includeSoftmaxGradient
(true, false) [=true]
Whether or not to include the LogSoftmax gradient in the gradient calculation. Including it is recommended to avoid numerical issues, but care must be taken not to also include the gradient of a LogSoftmax created external to this function call, which would count it twice.
zeroInfinity
(true, false) [=false]
Whether to zero infinite losses and the associated gradients. Infinite losses mainly occur when a data sequence is too short to be aligned with its label.
- Parameters
graph – The graph the operation will be added to
outType – The data type of the gradient output
logProbs – The data input [maxTime, batchSize, numClasses] tensor
labels – The labels input [batchSize, maxLabelLength] tensor
dataLengths – A tensor of shape [batchSize] containing the number of valid timesteps in each data[] batch entry
labelLengths – A tensor of shape [batchSize] containing the number of valid labels in each labels[] batch entry
prog – A program sequence to append the operation to
blankClass – The value associated with the blankClass
plan – The plan which will specify how the output tensor is to be mapped and how the operation is to be carried out
debugContext – Optional debug information
options – Any implementation/debug options for the operation
- Returns
The loss[batchSize] (negative log probability), and gradient [maxTime, batchSize, numClasses] tensor
-
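A sketch of a loss-and-gradient call (the tensor names and the blank index are assumptions; `logProbs` is expected to already contain log probabilities, for example the output of a log softmax applied elsewhere in the model):

```cpp
poplar::program::Sequence prog;
const unsigned blankClass = 0;  // assumed position of the blank symbol

// dataLengths / labelLengths: [batchSize] tensors holding the number of
// valid timesteps / labels per batch entry.
auto [loss, gradient] = popnn::ctc::calcLossAndGradientLogProbabilities(
    graph, poplar::FLOAT, logProbs, labels, dataLengths, labelLengths, prog,
    blankClass, plan, {"ctcLossGrad"}, {{"includeSoftmaxGradient", "true"}});
// loss     : [batchSize] negative log probabilities
// gradient : [maxTime, batchSize, numClasses]
```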
std::pair<poplar::Tensor, poplar::Tensor> calcLossAndGradientLogits(poplar::Graph &graph, const poplar::Type &outType, const poplar::Tensor &logits, const poplar::Tensor &labels, const poplar::Tensor &dataLengths, const poplar::Tensor &labelLengths, poplar::program::Sequence &prog, const unsigned blankClass, const Plan &plan, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Calculate the CTC loss & gradient, creating and mapping the result tensor according to the plan provided.
Prior to performing the gradient calculation, log softmax is applied to the logits input.
- Parameters
graph – The graph the operation will be added to
outType – The data type of the gradient output
logits – The data input [maxTime, batchSize, numClasses] tensor
labels – The labels input [batchSize, maxLabelLength] tensor
dataLengths – A tensor of shape [batchSize] containing the number of valid timesteps in each data[] batch entry
labelLengths – A tensor of shape [batchSize] containing the number of valid labels in each labels[] batch entry
prog – A program sequence to append the operation to
blankClass – The value associated with the blankClass
plan – The plan which will specify how the output tensor is to be mapped and how the operation is to be carried out
debugContext – Optional debug information
options – Any implementation/debug options for the operation
- Returns
The loss[batchSize] (negative log probability), and gradient [maxTime, batchSize, numClasses] tensor
-
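The logits variant has the same call shape; a sketch under the same assumed setup. Because log softmax is applied internally, the model should feed raw (pre-softmax) outputs into this call rather than applying it twice:

```cpp
// `logits` holds raw network outputs of shape [maxTime, batchSize, numClasses].
auto [loss, gradient] = popnn::ctc::calcLossAndGradientLogits(
    graph, poplar::FLOAT, logits, labels, dataLengths, labelLengths, prog,
    blankClass, plan, {"ctcLossGradLogits"});
```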
poplar::Tensor calcCTCLossLogProbabilities(poplar::Graph &graph, const poplar::Type &outType, const poplar::Tensor &logProbs, const poplar::Tensor &labels, const poplar::Tensor &dataLengths, const poplar::Tensor &labelLengths, poplar::program::Sequence &prog, const unsigned blankClass, const Plan &plan, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Calculate the CTC loss, creating and mapping the result tensor according to the plan provided.
- Parameters
graph – The graph the operation will be added to
outType – The data type of the output
logProbs – The data input [maxTime, batchSize, numClasses] tensor
labels – The labels input [batchSize, maxLabelLength] tensor
dataLengths – A tensor of shape [batchSize] containing the number of valid timesteps in each data[] batch entry
labelLengths – A tensor of shape [batchSize] containing the number of valid labels in each labels[] batch entry
prog – A program sequence to append the operation to
blankClass – The value associated with the blankClass
plan – The plan which will specify how the output tensor is to be mapped and how the operation is to be carried out
debugContext – Optional debug information
options – Any implementation/debug options for the operation
- Returns
The loss[batchSize] (negative log probability)
-
poplar::Tensor calcCTCLossLogits(poplar::Graph &graph, const poplar::Type &outType, const poplar::Tensor &logits, const poplar::Tensor &labels, const poplar::Tensor &dataLengths, const poplar::Tensor &labelLengths, poplar::program::Sequence &prog, const unsigned blankClass, const Plan &plan, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})
Calculate the CTC loss, creating and mapping the result tensor according to the plan provided.
Applies log softmax to the logits input.
- Parameters
graph – The graph the operation will be added to
outType – The data type of the output
logits – The data input [maxTime, batchSize, numClasses] tensor
labels – The labels input [batchSize, maxLabelLength] tensor
dataLengths – A tensor of shape [batchSize] containing the number of valid timesteps in each data[] batch entry
labelLengths – A tensor of shape [batchSize] containing the number of valid labels in each labels[] batch entry
prog – A program sequence to append the operation to
blankClass – The value associated with the blankClass
plan – The plan which will specify how the output tensor is to be mapped and how the operation is to be carried out
debugContext – Optional debug information
options – Any implementation/debug options for the operation
- Returns
The loss[batchSize] (negative log probability)
-
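For evaluation, where no gradient is needed, this loss-only entry point can be used; a sketch under the same assumed setup (`graph`, `plan` and the input tensors created as described above):

```cpp
poplar::program::Sequence prog;
poplar::Tensor loss = popnn::ctc::calcCTCLossLogits(
    graph, poplar::FLOAT, logits, labels, dataLengths, labelLengths, prog,
    blankClass, plan, {"ctcLoss"});
// loss : [batchSize] negative log probability of each labelled sequence
```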