Poplar and PopLibs
CTCLoss.hpp File Reference

Support for Connectionist Temporal Classification (CTC) Loss.

#include "CTCPlan.hpp"
#include <poplar/Graph.hpp>
#include <poplar/OptionFlags.hpp>
#include <poplar/Program.hpp>

Namespaces

namespace  popnn
 Functions used in neural networks.
 

Functions

Plan popnn::ctc::plan (const poplar::Graph &graph, const poplar::Type &inType, const poplar::Type &outType, unsigned batchSize, unsigned maxTime, unsigned maxLabelLength, unsigned numClasses, const poplar::OptionFlags &options={})
 Create a plan for implementing the CTC Loss (gradient) function.

poplar::Tensor popnn::ctc::createDataInput (poplar::Graph &graph, const poplar::Type &type, const std::size_t batchSize, const std::size_t maxTime, const std::size_t numClasses, const Plan &plan, const poplar::DebugContext &debugContext={})
 Create and map a data input [maxTime, batchSize, numClasses] tensor which the gradient function will use.

poplar::Tensor popnn::ctc::createLabelsInput (poplar::Graph &graph, const poplar::Type &type, const std::size_t batchSize, const std::size_t maxLabelLength, const Plan &plan, const poplar::DebugContext &debugContext={})
 Create and map a labels input [batchSize, maxLabelLength] tensor which the gradient function will use.

std::pair< poplar::Tensor, poplar::Tensor > popnn::ctc::calcLossAndGradientLogProbabilities (poplar::Graph &graph, const poplar::Type &outType, const poplar::Tensor &logProbs, const poplar::Tensor &labels, const poplar::Tensor &dataLengths, const poplar::Tensor &labelLengths, poplar::program::Sequence &prog, const unsigned blankClass, const Plan &plan, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={})
 Calculate the CTC loss and gradient, creating and mapping the result tensor according to the plan provided.

std::pair< poplar::Tensor, poplar::Tensor > popnn::ctc::calcLossAndGradientLogits (poplar::Graph &graph, const poplar::Type &outType, const poplar::Tensor &logits, const poplar::Tensor &labels, const poplar::Tensor &dataLengths, const poplar::Tensor &labelLengths, poplar::program::Sequence &prog, const unsigned blankClass, const Plan &plan, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={})
 Calculate the CTC loss and gradient, creating and mapping the result tensor according to the plan provided.

poplar::Tensor popnn::ctc::calcCTCLossLogProbabilities (poplar::Graph &graph, const poplar::Type &outType, const poplar::Tensor &logProbs, const poplar::Tensor &labels, const poplar::Tensor &dataLengths, const poplar::Tensor &labelLengths, poplar::program::Sequence &prog, const unsigned blankClass, const Plan &plan, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={})
 Calculate the CTC loss, creating and mapping the result tensor according to the plan provided.

poplar::Tensor popnn::ctc::calcCTCLossLogits (poplar::Graph &graph, const poplar::Type &outType, const poplar::Tensor &logits, const poplar::Tensor &labels, const poplar::Tensor &dataLengths, const poplar::Tensor &labelLengths, poplar::program::Sequence &prog, const unsigned blankClass, const Plan &plan, const poplar::DebugContext &debugContext={}, const poplar::OptionFlags &options={})
 Calculate the CTC loss, creating and mapping the result tensor according to the plan provided.
 

Detailed Description

Support for Connectionist Temporal Classification (CTC) Loss.

Function Documentation

◆ calcCTCLossLogits()

poplar::Tensor popnn::ctc::calcCTCLossLogits ( poplar::Graph &graph,
const poplar::Type &outType,
const poplar::Tensor &logits,
const poplar::Tensor &labels,
const poplar::Tensor &dataLengths,
const poplar::Tensor &labelLengths,
poplar::program::Sequence &prog,
const unsigned blankClass,
const Plan &plan,
const poplar::DebugContext &debugContext = {},
const poplar::OptionFlags &options = {}
)

Calculate the CTC loss, creating and mapping the result tensor according to the plan provided.

Applies log softmax to logits input.
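
As a rough illustration of what "applies log softmax" means for the class scores of a single timestep, the following host-side sketch computes a numerically stable log softmax. The function name logSoftmax is hypothetical and the code is a plain C++ reference, not the PopLibs implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Numerically stable log softmax over the class scores of one timestep.
// Illustrative host-side reference only; not the PopLibs implementation.
std::vector<double> logSoftmax(const std::vector<double> &logits) {
  // Subtract the maximum so that exp() cannot overflow.
  const double maxLogit = *std::max_element(logits.begin(), logits.end());
  double sumExp = 0.0;
  for (double x : logits) sumExp += std::exp(x - maxLogit);
  const double logSumExp = maxLogit + std::log(sumExp);

  std::vector<double> logProbs;
  logProbs.reserve(logits.size());
  for (double x : logits) logProbs.push_back(x - logSumExp);
  return logProbs;
}
```

Exponentiating the returned values gives probabilities that sum to one, which is the form of input that calcCTCLossLogProbabilities() expects.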

Parameters
  graph         The graph the operation will be added to
  outType       The data type of the output
  logits        The data input [maxTime, batchSize, numClasses] tensor
  labels        The labels input [batchSize, maxLabelLength] tensor
  dataLengths   A tensor of shape [batchSize] containing the number of valid timesteps in each batch entry of the data input
  labelLengths  A tensor of shape [batchSize] containing the number of valid labels in each batch entry of the labels input
  prog          A program sequence to append the operation to
  blankClass    The class index used for the blank symbol
  plan          The plan which specifies how the output tensor is to be mapped and how the operation is to be carried out
  debugContext  Optional debug information
  options       Any implementation/debug options for the operation
Returns
  The loss tensor of shape [batchSize] (negative log probability)

◆ calcCTCLossLogProbabilities()

poplar::Tensor popnn::ctc::calcCTCLossLogProbabilities ( poplar::Graph &graph,
const poplar::Type &outType,
const poplar::Tensor &logProbs,
const poplar::Tensor &labels,
const poplar::Tensor &dataLengths,
const poplar::Tensor &labelLengths,
poplar::program::Sequence &prog,
const unsigned blankClass,
const Plan &plan,
const poplar::DebugContext &debugContext = {},
const poplar::OptionFlags &options = {}
)

Calculate the CTC loss, creating and mapping the result tensor according to the plan provided.

Parameters
  graph         The graph the operation will be added to
  outType       The data type of the output
  logProbs      The data input [maxTime, batchSize, numClasses] tensor
  labels        The labels input [batchSize, maxLabelLength] tensor
  dataLengths   A tensor of shape [batchSize] containing the number of valid timesteps in each batch entry of the data input
  labelLengths  A tensor of shape [batchSize] containing the number of valid labels in each batch entry of the labels input
  prog          A program sequence to append the operation to
  blankClass    The class index used for the blank symbol
  plan          The plan which specifies how the output tensor is to be mapped and how the operation is to be carried out
  debugContext  Optional debug information
  options       Any implementation/debug options for the operation
Returns
  The loss tensor of shape [batchSize] (negative log probability)
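
To make the returned quantity concrete, the sketch below computes the negative log probability for a single batch entry with the standard CTC forward recursion in log space. It is a host-side reference under stated assumptions: the names logAdd and ctcLossReference are hypothetical, the caller passes only the valid timesteps and valid labels of the entry, and nothing here reflects how PopLibs schedules the work on the IPU.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// log(exp(a) + exp(b)) without overflow; -infinity stands for probability 0.
double logAdd(double a, double b) {
  const double negInf = -std::numeric_limits<double>::infinity();
  if (a == negInf) return b;
  if (b == negInf) return a;
  const double hi = std::max(a, b);
  return hi + std::log1p(std::exp(std::min(a, b) - hi));
}

// Reference CTC loss for ONE batch entry (hypothetical host-side helper).
// logProbs[t][c] is the log probability of class c at timestep t.
double ctcLossReference(const std::vector<std::vector<double>> &logProbs,
                        const std::vector<unsigned> &labels,
                        unsigned blankClass) {
  const double negInf = -std::numeric_limits<double>::infinity();
  const std::size_t T = logProbs.size();
  if (T == 0) return labels.empty() ? 0.0 : -negInf;

  // Extended label sequence: blank, l0, blank, l1, ..., blank.
  std::vector<unsigned> ext(2 * labels.size() + 1, blankClass);
  for (std::size_t i = 0; i < labels.size(); ++i) ext[2 * i + 1] = labels[i];
  const std::size_t S = ext.size();

  // alpha[s]: log probability of all partial alignments ending at ext[s].
  std::vector<double> alpha(S, negInf);
  alpha[0] = logProbs[0][ext[0]];
  if (S > 1) alpha[1] = logProbs[0][ext[1]];

  for (std::size_t t = 1; t < T; ++t) {
    std::vector<double> next(S, negInf);
    for (std::size_t s = 0; s < S; ++s) {
      double sum = alpha[s];                        // stay on the same symbol
      if (s >= 1) sum = logAdd(sum, alpha[s - 1]);  // advance by one symbol
      // Skip over a blank, allowed only between two different labels.
      if (s >= 2 && ext[s] != blankClass && ext[s] != ext[s - 2])
        sum = logAdd(sum, alpha[s - 2]);
      next[s] = sum + logProbs[t][ext[s]];
    }
    alpha.swap(next);
  }

  // Valid alignments end on the last label or on the trailing blank.
  double total = alpha[S - 1];
  if (S > 1) total = logAdd(total, alpha[S - 2]);
  return -total;  // negative log probability
}
```

With two timesteps, two equally likely classes and the single label 1 (blank being class 0), three of the four possible paths collapse to the label, so the reference returns -log(0.75).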

◆ calcLossAndGradientLogits()

std::pair< poplar::Tensor, poplar::Tensor > popnn::ctc::calcLossAndGradientLogits ( poplar::Graph &graph,
const poplar::Type &outType,
const poplar::Tensor &logits,
const poplar::Tensor &labels,
const poplar::Tensor &dataLengths,
const poplar::Tensor &labelLengths,
poplar::program::Sequence &prog,
const unsigned blankClass,
const Plan &plan,
const poplar::DebugContext &debugContext = {},
const poplar::OptionFlags &options = {}
)

Calculate the CTC loss & gradient, creating and mapping the result tensor according to the plan provided.

Prior to performing the gradient calculation, applies log softmax to logits input.
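
Because log softmax sits between the logits and the log probabilities, a gradient with respect to the log probabilities relates to a gradient with respect to the logits through the log softmax Jacobian. The following host-side sketch shows that chain-rule step for one timestep. The function name logSoftmaxBackward is hypothetical, and the code only illustrates the mathematics; it is not a description of how PopLibs computes the gradient internally.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Back-propagate through y = logSoftmax(x) for one timestep.
// gradLogProbs[i] is dLoss/dy[i]; the result holds dLoss/dx[j], where
//   dLoss/dx[j] = gradLogProbs[j] - softmax(x)[j] * sum_i gradLogProbs[i]
// Illustrative host-side reference only; not the PopLibs implementation.
std::vector<double> logSoftmaxBackward(const std::vector<double> &logits,
                                       const std::vector<double> &gradLogProbs) {
  // Stable softmax denominator: shift by the maximum before exponentiating.
  const double maxLogit = *std::max_element(logits.begin(), logits.end());
  double sumExp = 0.0;
  for (double x : logits) sumExp += std::exp(x - maxLogit);

  double gradSum = 0.0;
  for (double g : gradLogProbs) gradSum += g;

  std::vector<double> gradLogits(logits.size());
  for (std::size_t j = 0; j < logits.size(); ++j) {
    const double softmaxJ = std::exp(logits[j] - maxLogit) / sumExp;
    gradLogits[j] = gradLogProbs[j] - softmaxJ * gradSum;
  }
  return gradLogits;
}
```

The resulting logit gradients always sum to zero, since adding a constant to every logit leaves the log softmax output unchanged.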

Parameters
  graph         The graph the operation will be added to
  outType       The data type of the gradient output
  logits        The data input [maxTime, batchSize, numClasses] tensor
  labels        The labels input [batchSize, maxLabelLength] tensor
  dataLengths   A tensor of shape [batchSize] containing the number of valid timesteps in each batch entry of the data input
  labelLengths  A tensor of shape [batchSize] containing the number of valid labels in each batch entry of the labels input
  prog          A program sequence to append the operation to
  blankClass    The class index used for the blank symbol
  plan          The plan which specifies how the output tensor is to be mapped and how the operation is to be carried out
  debugContext  Optional debug information
  options       Any implementation/debug options for the operation
Returns
  A pair of the loss tensor of shape [batchSize] (negative log probability) and the gradient tensor of shape [maxTime, batchSize, numClasses]

◆ calcLossAndGradientLogProbabilities()

std::pair< poplar::Tensor, poplar::Tensor > popnn::ctc::calcLossAndGradientLogProbabilities ( poplar::Graph &graph,
const poplar::Type &outType,
const poplar::Tensor &logProbs,
const poplar::Tensor &labels,
const poplar::Tensor &dataLengths,
const poplar::Tensor &labelLengths,
poplar::program::Sequence &prog,
const unsigned blankClass,
const Plan &plan,
const poplar::DebugContext &debugContext = {},
const poplar::OptionFlags &options = {}
)

Calculate the CTC loss & gradient, creating and mapping the result tensor according to the plan provided.

calcLossAndGradientLogProbabilities options

  • includeSoftmaxGradient (true, false) [=true]

    Whether to include the LogSoftmax gradient in the gradient calculation. Including it is recommended to avoid numerical issues, but take care not to apply the gradient of the LogSoftmax (created outside this function call) a second time.

  • zeroInfinity (true, false) [=false]

    Whether to zero infinite losses and the associated gradients. Infinite losses mainly occur when the data in a batch entry is too short to be aligned to its labels.
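
The zeroInfinity case can be pictured on the host as follows. A label sequence needs at least one timestep per label, plus one separating blank between each pair of adjacent identical labels; with fewer valid timesteps no alignment exists and the loss is infinite. The helper names minTimestepsForLabels and zeroInfiniteLoss are hypothetical, and the second function is only an analogue of what the option describes, not the PopLibs code path.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Fewest timesteps that can be aligned to a label sequence: one per label,
// plus one separating blank between each pair of adjacent identical labels.
std::size_t minTimestepsForLabels(const std::vector<unsigned> &labels) {
  std::size_t required = labels.size();
  for (std::size_t i = 1; i < labels.size(); ++i)
    if (labels[i] == labels[i - 1]) ++required;
  return required;
}

// Hypothetical host-side analogue of zeroInfinity: replace an infinite loss
// with zero and clear the gradient values of that batch entry.
void zeroInfiniteLoss(double &loss, std::vector<double> &gradient) {
  if (std::isinf(loss)) {
    loss = 0.0;
    std::fill(gradient.begin(), gradient.end(), 0.0);
  }
}
```

For example, the labels {1, 1, 2} need at least four timesteps, so a batch entry with a data length of three would produce an infinite loss unless zeroInfinity is enabled.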

Parameters
  graph         The graph the operation will be added to
  outType       The data type of the gradient output
  logProbs      The data input [maxTime, batchSize, numClasses] tensor
  labels        The labels input [batchSize, maxLabelLength] tensor
  dataLengths   A tensor of shape [batchSize] containing the number of valid timesteps in each batch entry of the data input
  labelLengths  A tensor of shape [batchSize] containing the number of valid labels in each batch entry of the labels input
  prog          A program sequence to append the operation to
  blankClass    The class index used for the blank symbol
  plan          The plan which specifies how the output tensor is to be mapped and how the operation is to be carried out
  debugContext  Optional debug information
  options       Any implementation/debug options for the operation
Returns
  A pair of the loss tensor of shape [batchSize] (negative log probability) and the gradient tensor of shape [maxTime, batchSize, numClasses]

◆ createDataInput()

poplar::Tensor popnn::ctc::createDataInput ( poplar::Graph &graph,
const poplar::Type &type,
const std::size_t batchSize,
const std::size_t maxTime,
const std::size_t numClasses,
const Plan &plan,
const poplar::DebugContext &debugContext = {}
)

Create and map a data input [maxTime, batchSize, numClasses] tensor which the gradient function will use.

Mapping is according to the plan provided.

Parameters
  graph         The graph the data tensor will be added to
  type          The data type of the tensor to be added to the graph
  batchSize     The size of the batch to be processed at once
  maxTime       The time dimension of the tensor to be created
  numClasses    The number of symbols/classes in the "alphabet", including the blankClass
  plan          The plan which specifies how the tensor is to be mapped
  debugContext  Optional debug information
Returns
  The data input [maxTime, batchSize, numClasses] tensor

◆ createLabelsInput()

poplar::Tensor popnn::ctc::createLabelsInput ( poplar::Graph &graph,
const poplar::Type &type,
const std::size_t batchSize,
const std::size_t maxLabelLength,
const Plan &plan,
const poplar::DebugContext &debugContext = {}
)

Create and map a labels input [batchSize, maxLabelLength] tensor which the gradient function will use.

Mapping is according to the plan provided.

Parameters
  graph           The graph the labels tensor will be added to
  type            The data type of the tensor to be added to the graph
  batchSize       The size of the batch to be processed at once
  maxLabelLength  The maximum length of any label
  plan            The plan which specifies how the tensor is to be mapped
  debugContext    Optional debug information
Returns
  The labels input [batchSize, maxLabelLength] tensor
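
Since the labels tensor has a fixed [batchSize, maxLabelLength] shape while real label sequences vary in length, host code typically needs to pad each sequence and record its true length for labelLengths. The sketch below shows one way to prepare such buffers. The PaddedLabels type, the padLabels function, the row-major host layout, the unsigned element type and the pad value of 0 are all assumptions made for illustration; they are not part of the PopLibs API.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Host-side buffers matching the labels [batchSize, maxLabelLength] and
// labelLengths [batchSize] shapes (hypothetical helper type).
struct PaddedLabels {
  std::vector<unsigned> labels;        // row-major, batchSize * maxLabelLength
  std::vector<unsigned> labelLengths;  // number of valid labels per entry
};

// Pad every sequence to maxLabelLength; the pad value is an assumption and
// is never read as a label because labelLengths marks the valid prefix.
PaddedLabels padLabels(const std::vector<std::vector<unsigned>> &sequences,
                       std::size_t maxLabelLength, unsigned padValue = 0) {
  PaddedLabels out;
  out.labels.assign(sequences.size() * maxLabelLength, padValue);
  out.labelLengths.reserve(sequences.size());
  for (std::size_t b = 0; b < sequences.size(); ++b) {
    const std::vector<unsigned> &seq = sequences[b];
    if (seq.size() > maxLabelLength)
      throw std::length_error("label sequence longer than maxLabelLength");
    // Copy the valid labels to the start of row b.
    std::copy(seq.begin(), seq.end(),
              out.labels.begin() + b * maxLabelLength);
    out.labelLengths.push_back(static_cast<unsigned>(seq.size()));
  }
  return out;
}
```

The two vectors can then be copied into the tensors created for the labels and labelLengths inputs by whatever host-to-device transfer the application already uses.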

◆ plan()

Plan popnn::ctc::plan ( const poplar::Graph &graph,
const poplar::Type &inType,
const poplar::Type &outType,
unsigned batchSize,
unsigned maxTime,
unsigned maxLabelLength,
unsigned numClasses,
const poplar::OptionFlags &options = {}
)

Create a plan for implementing the CTC Loss (gradient) function.

CTC Loss options

  • partialsType poplar::Type [=poplar::FLOAT]

    The type to use for partial results.

  • availableMemoryProportion Decimal between 0 and 1 (inclusive) [=0.6]

    The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation.

    See also the technical note Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU (https://docs.graphcore.ai/projects/available-memory/) for some practical examples of using availableMemoryProportion.

Parameters
  graph           The graph the operation will be added to
  inType          The data type of the probability data input
  outType         The data type of the gradient output
  batchSize       The size of the batch to be processed at once
  maxTime         The maximum time of any data input to be planned for
  maxLabelLength  The maximum length of any label to be planned for
  numClasses      The number of symbols/classes in the "alphabet", including the blankClass
  options         Any implementation/debug options for the operation
Returns
  The plan produced, which specifies how the operation is to be implemented