CTCInference

#include <popnn/CTCInference.hpp>

Support for the Connectionist Temporal Classification (CTC) beam search decoder.

namespace popnn

Functions used in neural networks.

namespace ctc_infer

Functions

ctc::Plan plan(const poplar::Graph &graph, const poplar::Type &inType, unsigned batchSize, unsigned maxTime, unsigned numClasses, unsigned beamwidth, const poplar::OptionFlags &options = {})

Create a plan for implementing the CTC Beam search inference function.

CTC Beam search inference options

  • partialsType poplar::Type [=poplar::FLOAT]

    The type to use for partial results.

  • availableMemoryProportion Decimal between 0 and 1 (inclusive) [=0.6]

    The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation.

Parameters
  • graph – The graph the operation will be added to

  • inType – The data type of the probability data input

  • batchSize – The size of the batch to be processed at once

  • maxTime – The maximum number of timesteps in any input sequence

  • numClasses – The number of symbols/classes in the “alphabet”, including the blankClass

  • beamwidth – The number of beams to maintain during beamsearch

  • options – Any implementation/debug options for the operation

Returns

The plan produced, which specifies how the operation is to be implemented

poplar::Tensor createDataInput(poplar::Graph &graph, const poplar::Type &type, const std::size_t batchSize, const std::size_t maxTime, const std::size_t numClasses, const ctc::Plan &plan, const poplar::DebugContext &debugContext = {})

Create and map a data input [maxTime, batchSize, numClasses] tensor which the beam search function will use.

Mapping is according to the plan provided.

Parameters
  • graph – The graph the data tensor will be added to

  • type – The data type of the tensor to be added to the graph

  • batchSize – The size of the batch to be processed at once

  • maxTime – The time dimension of the tensor to be created

  • numClasses – The number of symbols/classes in the “alphabet”, including the blankClass

  • plan – The plan which will specify how the tensor is to be mapped

  • debugContext – Optional debug information

Returns

The data input [maxTime, batchSize, numClasses] tensor
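When preparing host data for a tensor of shape [maxTime, batchSize, numClasses], it can help to see how one element is addressed in a flat buffer. The following is a minimal sketch, assuming the host buffer is laid out row-major in the same dimension order as the tensor shape. The helpers dataInputOffset and packSequence are hypothetical names used for illustration only; they are not part of popnn.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of popnn): offset of element [t, b, c] in a
// row-major host buffer of shape [maxTime, batchSize, numClasses].
inline std::size_t dataInputOffset(std::size_t t, std::size_t b, std::size_t c,
                                   std::size_t batchSize,
                                   std::size_t numClasses) {
  assert(b < batchSize && c < numClasses);
  return (t * batchSize + b) * numClasses + c;
}

// Hypothetical helper (not part of popnn): copy one sequence, stored as
// [numTimesteps, numClasses], into batch slot b of the host buffer.
inline void packSequence(std::vector<float> &hostBuffer,
                         const std::vector<float> &sequence,
                         std::size_t numTimesteps, std::size_t b,
                         std::size_t batchSize, std::size_t numClasses) {
  assert(sequence.size() == numTimesteps * numClasses);
  assert(hostBuffer.size() >= numTimesteps * batchSize * numClasses);
  for (std::size_t t = 0; t < numTimesteps; ++t) {
    for (std::size_t c = 0; c < numClasses; ++c) {
      hostBuffer[dataInputOffset(t, b, c, batchSize, numClasses)] =
          sequence[t * numClasses + c];
    }
  }
}
```

Because time is the outermost dimension, the values for consecutive timesteps of one batch entry are batchSize * numClasses elements apart, so sequences shorter than maxTime leave the trailing timesteps of their slot untouched.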

std::tuple<poplar::Tensor, poplar::Tensor, poplar::Tensor> beamSearchDecoderLogProbabilities(poplar::Graph &graph, const poplar::Tensor &logProbs, const poplar::Tensor &dataLengths, poplar::program::Sequence &prog, unsigned blankClass, unsigned beamwidth, unsigned topPaths, const ctc::Plan &plan, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})

Calculate the most likely topPaths labels and their probabilities given the input logProbs with lengths dataLengths, creating and mapping the result tensors according to the plan provided.

Parameters
  • graph – The graph the operation will be added to

  • logProbs – The data input [maxTime, batchSize, numClasses] tensor

  • dataLengths – A tensor of shape [batchSize] containing the number of valid timesteps in each logProbs batch entry

  • prog – A program sequence to append the operation to

  • blankClass – The class index used to represent the blank symbol

  • beamwidth – The number of beams to use when decoding

  • topPaths – The number of most likely decoded paths to return, must be less than or equal to beamwidth

  • plan – The plan which will specify how the output tensor is to be mapped and how the operation is to be carried out

  • debugContext – Optional debug information

  • options – Any implementation/debug options for the operation

Returns

The labelProbs [batchSize, topPaths] tensor (negative log probabilities, with the same type as logProbs), the labelLengths [batchSize, topPaths] tensor and the decodedLabels [batchSize, topPaths, maxTime] tensor
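Once the three result tensors have been copied back to the host, each decoded path has to be trimmed to its valid length, and each labelProbs entry converted if a plain probability is wanted. The sketch below illustrates this under several assumptions: the host buffers are flat and row-major, the label and length elements are read as unsigned, and the logarithm is natural. The helpers extractPath and pathProbability are hypothetical and not part of popnn.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of popnn): labelProbs holds negative log
// probabilities, so the probability is exp(-value), assuming a natural log.
inline double pathProbability(double negLogProb) {
  return std::exp(-negLogProb);
}

// Hypothetical helper (not part of popnn): return the valid labels of path p
// for batch entry b. decodedLabels is a flat [batchSize, topPaths, maxTime]
// buffer and labelLengths a flat [batchSize, topPaths] buffer.
inline std::vector<unsigned>
extractPath(const std::vector<unsigned> &decodedLabels,
            const std::vector<unsigned> &labelLengths, std::size_t b,
            std::size_t p, std::size_t topPaths, std::size_t maxTime) {
  const std::size_t pathIndex = b * topPaths + p;
  assert(p < topPaths && pathIndex < labelLengths.size());
  const std::size_t length = labelLengths[pathIndex];
  assert(length <= maxTime);
  const std::size_t start = pathIndex * maxTime;
  assert(start + length <= decodedLabels.size());
  const auto first =
      decodedLabels.begin() + static_cast<std::ptrdiff_t>(start);
  return std::vector<unsigned>(first,
                               first + static_cast<std::ptrdiff_t>(length));
}
```

Only the first labelLengths[b, p] entries of each decodedLabels row are meaningful here; the remaining entries up to maxTime are treated as padding and ignored.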

std::tuple<poplar::Tensor, poplar::Tensor, poplar::Tensor> beamSearchDecoderLogits(poplar::Graph &graph, const poplar::Tensor &logits, const poplar::Tensor &dataLengths, poplar::program::Sequence &prog, unsigned blankClass, unsigned beamwidth, unsigned topPaths, const ctc::Plan &plan, const poplar::DebugContext &debugContext = {}, const poplar::OptionFlags &options = {})

Calculate the most likely topPaths labels and their probabilities given the input logits with lengths dataLengths, creating and mapping the result tensors according to the plan provided.

A log softmax is applied to the logits input before the beam search is performed.

Parameters
  • graph – The graph the operation will be added to

  • logits – The data input [maxTime, batchSize, numClasses] tensor

  • dataLengths – A tensor of shape [batchSize] containing the number of valid timesteps in each logits batch entry

  • prog – A program sequence to append the operation to

  • blankClass – The class index used to represent the blank symbol

  • beamwidth – The number of beams to use when decoding

  • topPaths – The number of most likely decoded paths to return, must be less than or equal to beamwidth

  • plan – The plan which will specify how the output tensor is to be mapped and how the operation is to be carried out

  • debugContext – Optional debug information

  • options – Any implementation/debug options for the operation

Returns

The labelProbs [batchSize, topPaths] tensor (negative log probabilities, with the same type as logits), the labelLengths [batchSize, topPaths] tensor and the decodedLabels [batchSize, topPaths, maxTime] tensor
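The log softmax step that beamSearchDecoderLogits applies to its input can be illustrated on the host for a single class vector. This is a minimal sketch of the standard numerically stable formulation, assuming normalisation over the numClasses dimension; it is not the library's implementation, and logSoftmax is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not part of popnn): log softmax of one class vector.
// Computes x_i - log(sum_j exp(x_j)), subtracting the maximum first so that
// exp() cannot overflow for large logits.
inline std::vector<double> logSoftmax(const std::vector<double> &logits) {
  assert(!logits.empty());
  const double maxLogit = *std::max_element(logits.begin(), logits.end());
  double sumExp = 0.0;
  for (double v : logits) {
    sumExp += std::exp(v - maxLogit);
  }
  const double logSumExp = maxLogit + std::log(sumExp);
  std::vector<double> result;
  result.reserve(logits.size());
  for (double v : logits) {
    result.push_back(v - logSumExp);
  }
  return result;
}
```

Data that has already been normalised this way corresponds to the logProbs input of beamSearchDecoderLogProbabilities, which skips this step.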