14. PopART C++ API

This chapter describes the PopART C++ API.

14.1. Sessions

#include <popart/session.hpp>
class Session

Session is a runtime instance that provides an interface for executing ONNX graphs on IPU hardware.

Subclassed by popart::InferenceSession, popart::TrainingSession

Public Functions

virtual ~Session() = 0

Destructor for the Session class.

std::vector<uint32_t> getRNGState()

Get state of the random number generator.

void setRNGState(const std::vector<uint32_t>)

Set state of the random number generator.

void setRandomSeed(uint64_t seedValue)

Set the value of the random number generator seed.

This method explicitly seeds all random operations. Additionally, this method derives a new state for the random number generator (RNG) from the seed and sets it on the device. This RNG state is used to resolve stochastic rounding. Note that to deterministically store and restore the combined random state for a session, do the following:

C++:

// Store random state (session s0).
auto seed = s0.getRandomSeed();
auto rngState = s0.getRNGState();

// Restore random state (session s1).
s1.setRandomSeed(seed);   // <-- affects RNG state, order important
s1.setRNGState(rngState);

Python:

# Store random state (session s0).
seed = s0.getRandomSeed()
rngState = s0.getRNGState()

# Restore random state (session s1).
s1.setRandomSeed(seed)   # <-- affects RNG state, order important
s1.setRNGState(rngState)

Parameters

seedValue – The value of the seed.

uint64_t getRandomSeed()

Get the value of the random number generator seed.

Calling setRandomSeed() with this value (at a later stage) reinstates the random state that seeds random operations.

Returns

The value used to seed current random operations.

void compileAndExport(const std::string &filename)

Compile the graph and export it to a file.

This method will first create a poplar::Graph and compile the poplar::Executable. Next, it will export the executable and PopART metadata to the file. The exported file will be in the PopEF format. This means that the file can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

This method automatically creates folders as needed if filename is located in a folder which does not exist.

Parameters

filename – The name of the file where the compiled executable and metadata will be saved.
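
For example, a minimal sketch (the file name is illustrative; s0 and s1 are assumed to be sessions created from the same model):

// Compile the graph and export it (session s0).
s0.compileAndExport("model.popef");

// A second session can then skip compilation (session s1),
// typically before calling prepareDevice().
s1.loadExecutableFromFile("model.popef");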

void compileAndExport(std::ostream &out)

Compile the graph and export it to a stream.

This method will first create a poplar::Graph and compile the poplar::Executable. Next, it will export the executable and PopART metadata to the stream. The data will be streamed in the PopEF format. This means that the data can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

Parameters

out – The stream that the compiled executable and metadata will be written to.

void saveExecutableToFile(const std::string &filename)

Save a compiled graph to a file.

The file will be in the PopEF format. This means that the file can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

This method automatically creates folders as needed if filename is located in a folder which does not exist.

Parameters

filename – The name of the file where the compiled executable and metadata will be saved.

Pre

prepareDevice() must have been called.

void saveExecutableToStream(std::ostream &out)

Save a compiled graph to a stream.

The data will be streamed in the PopEF format. This means that the data can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

Parameters

out – The stream that the compiled executable and metadata will be written to.

Pre

prepareDevice() must have been called.

void saveExecutable(const std::string &path, bool savePopartMetadata = true, bool saveVariables = true)

Save a compiled graph with additional data to a file.

PopART can save its state after model compilation is complete, so that it can be restored at a later time. To make this possible, the following elements must be saved:

  • a serialised Poplar executable,

  • its associated metadata,

  • tensor data blobs if model parameters have not been frozen (refer to SessionOptions::constantWeights for more information),

  • a PopART-specific opaque blob to store information only relevant to PopART. This is needed to restore PopART state.

The file will be in the PopEF format. This means that the file can be used to restore the state of the PopART program without recompiling the graph, or to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information. If you want to analyze the structure of a file saved by this function, refer to the PopEF dump tool.

Parameters
  • path – The name of the file or directory where the compiled executable, metadata and variables will be saved. If you specify a directory, the function will write the data to the file “<path>/executable.popef”. If the file exists, the function will overwrite the existing data.

  • savePopartMetadata – If you do not need the option to restore the PopART state later, you can set this flag to false to reduce the disk space taken up by the file.

  • saveVariables – If you do not need to save the state of the variables (tensors), for example because you want to save them later or to a different location, you can set this flag to false. The function will save data consistent with the variables contained within the model.

Pre

prepareDevice() must have been called.
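
For example, a minimal sketch (the directory name is illustrative; session is assumed to be a Session on which prepareDevice() has been called):

// A directory path is given, so this writes "checkpoint/executable.popef".
session.saveExecutable("checkpoint", /*savePopartMetadata=*/true, /*saveVariables=*/true);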

void saveVariables(const std::string &path)

Save all variables to a file.

The function will save data consistent with the variables contained within the model.

The file will be in the PopEF format. If you want to analyze the tensors saved by this function, refer to the PopEF dump tool.

Parameters

path – The name of the file or directory where the variables will be saved. If you specify a directory, the function will write the data to the file: “<path>/variables.popef”. If the file exists, the function will overwrite the existing data.

Pre

prepareDevice() must have been called.

void checkInplacingAmbiguity() const

Check for potential inplacing ambiguities.

This method creates an AliasModel object for each graph and runs the Poprithms ambiguity checker on it.

Throws an error if the graph has an inplacing ambiguity and will prompt the user to check the inplacing.

See poprithms::memory::inplace::Graph::AmbiguityStatus on the Poprithms GitHub repo for more on what constitutes an ambiguity.

void loadExecutableFromFile(const std::string &filename)

Load the compiled executable and metadata from a file.

The file must have been created with compileAndExport(const std::string).

Parameters

filename – The name of the file to load the executable and metadata from.

void loadExecutableFromStream(std::shared_ptr<std::istream> in)

Load the compiled executable and metadata from a stream.

The stream must have been created with compileAndExport(std::ostream).

Parameters

in – The shared pointer to the stream to load the executable from.

void prepareDevice(bool loadEngine = true)

Prepare the network for execution.

This will create the poplar::Graph and poplar::Engine.

Parameters

loadEngine – If true, load the engine and connect the streams once the device is ready.

void loadEngineAndConnectStreams()

Load the engine on the device and connect the streams.

This will set up the poplar::Streams.

Note: This call is optional. The engine will implicitly be loaded on the device when required.

void weightsFromHost()

Copy weights from the host to the device.

void buffersFromHost()

Copy buffers from the host to the device.

void weightsToHost()

Copy the weights from the device to the host stream memory.

uint64_t getCycleCount(std::string id = "")

Copy the cycle count tensor from the device to the host.

Parameters

id – The identifier of the cycle count tensor.

void connectStreamToCallback(const std::string &streamHandle, std::function<void(void*)> callback, unsigned index = 0)

Connect a Poplar stream with a callback.

The callback will be called whenever the stream is to be read or has been written to by the device. The memory location will only be valid for reading or writing for the duration of the callback.

Parameters
  • streamHandle – The name of the stream to connect to.

  • callback – The callback to be called whenever the stream is to be read or has been written to by the device.

  • index – The replica index to connect to, when using replicated graphs. Default=0.
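
For example, a minimal sketch (the stream handle "d2h_anchor" and the buffer size are hypothetical):

std::vector<float> host(16);
session.connectStreamToCallback("d2h_anchor", [&host](void *ptr) {
  // `ptr` is only valid for the duration of the callback.
  std::memcpy(host.data(), ptr, host.size() * sizeof(float));
});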

void connectStream(const std::string &streamHandle, void *buffer)

Connect a Poplar stream with a fixed location in memory.

Each time data is copied to the stream, this location will be read and each time data is copied from the stream, this location will be written.

Parameters
  • streamHandle – The handle of the stream to connect to.

  • buffer – The pointer to the memory location.

void connectHostFunction(const std::string &functionHandle, std::function<void(const void*const*, size_t, void*const*, size_t)> callback, unsigned index = 0)

Connect a host function to a callback.

The callback takes four arguments: an array of pointers to the memory locations of the function’s inputs, the number of inputs, an array of pointers to the memory locations of the outputs, and the number of outputs. During a host function call, first the device transfers the input data to the host, then the callback is invoked, and finally the output data is copied back to the device. The memory pointed to by the callback arguments must only be accessed during the duration of the callback.

Parameters
  • functionHandle – The name of the host function.

  • callback – The function to be called whenever new input data is available.

  • index – The replica index to connect to, when using replicated graphs. Default=0.
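
A minimal sketch (the handle "my_host_fn" is hypothetical, and both arguments are assumed to be single floats):

session.connectHostFunction(
    "my_host_fn",
    [](const void *const *ins, size_t numIns, void *const *outs, size_t numOuts) {
      // The pointers are only valid while the callback runs.
      const float *in = static_cast<const float *>(ins[0]);
      float *out = static_cast<float *>(outs[0]);
      out[0] = in[0] * 2.0f;
    });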

void run(IStepIO &stepIO, std::string debugName = "")

Run one step.

Read input data from the addresses in stepIO.in.

Write the output data to addresses in stepIO.out.

Parameters
  • stepIO – The input and output data.

  • debugName – A debug string to identify this run in logs.
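
A minimal sketch of one step (the tensor names "input" and "output" and the 2x2 shapes are illustrative; StepIO and NDArrayWrapper are defined in popart/stepio.hpp and popart/ndarraywrapper.hpp):

// Wrap host buffers for the graph inputs and the anchored outputs.
std::vector<float> inData{1, 2, 3, 4};
popart::NDArrayWrapper<float> inWrap(inData.data(), {2, 2});
std::map<popart::TensorId, popart::IArray &> inputs = {{"input", inWrap}};

std::vector<float> outData(4);
popart::NDArrayWrapper<float> outWrap(outData.data(), {2, 2});
std::map<popart::TensorId, popart::IArray &> anchors = {{"output", outWrap}};

popart::StepIO stepio(inputs, anchors);
session.run(stepio, "step0");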

void run(std::string programHandle, IStepIO &stepIO, std::string debugName = "")

Run one step of a custom program.

Read input data from the addresses in stepIO.in.

Write the output data to addresses in stepIO.out.

Parameters
  • programHandle – The handle of the custom program to run.

  • stepIO – The input and output data.

  • debugName – A debug string to identify this run in logs.

void updateExternallySavedTensorLocations(const std::string &fromLocation, const std::string &toLocation)

Update the tensor locations of tensors in the session’s ONNX model.

A new file will be created at this point, and written to when the ONNX model is saved with a subsequent call to modelToHost().

Parameters
  • fromLocation – All externally saved tensors with location fromLocation will have their location updated to toLocation.

  • toLocation – The updated tensor location. A file must not already exist at this location.

void modelToHost(const std::string &fn)

Write the current model to an ONNX file.

Parameters

fn – The path to file. The path can be absolute or relative. If you plan to run your program in multiple processes simultaneously, you should avoid possible race conditions by writing to different files, for example by using temporary files.
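
A minimal sketch (the file names are illustrative):

// Retarget externally saved tensors, then write the updated model.
session.updateExternallySavedTensorLocations("weights_0.bin", "weights_1.bin");
session.modelToHost("model_updated.onnx");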

TensorInfo getInfo(TensorId) const

Get the tensor information for a tensor.

Parameters

TensorId – The identifier of the tensor to get the tensor information for.

Returns

The tensor information for the tensor.

bool hasInfo(TensorId) const

Check whether a tensor has information.

Parameters

TensorId – The identifier of the tensor to check for tensor information.

Returns

true if the tensor with identifier TensorId has tensor information and false if not.

std::set<TensorId> getAllTensorIds() const

Returns the ids of all tensors in the model.

Pre

prepareDevice() must have been called.

std::string getSummaryReport(bool resetProfile = true) const

Retrieve the summary report from the poplar::Engine.

The options which were passed to the Session constructor will influence the information in the report.

This method may only be called after prepareDevice() has been called.

Parameters

resetProfile – If true, resets the execution profile. Default = true.

Returns

A string containing the report.

std::string getSerializedGraph() const

Retrieve the serialized graph from the poplar::Engine.

A JSON format report is produced.

This method may only be called after prepareDevice() has been called.

Returns

A string containing the serialized graph.

pva::Report getReport() const

Retrieve the graph report from the poplar::Engine.

The options which were passed to the Session constructor will influence the information in the report.

This method may only be called after prepareDevice() has been called.

Returns

The PopVision Analysis report object.
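
A minimal sketch of retrieving both reports (prepareDevice() must have been called):

std::string summary = session.getSummaryReport();
pva::Report report = session.getReport();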

void resetHostWeights(const std::string &model, const bool ignoreWeightsInModelWithoutCorrespondingHostWeight = false)

Reset weights with weights in an ONNX model.

Note that the only differences between the ONNX model and the current model must be the weights. No other differences are allowed.

This method only updates the weights on the host. weightsFromHost() must be called after this method to update the weights on the device.

Parameters
  • model – An ONNX model protobuf, or the name of a file containing an ONNX model protobuf.

  • ignoreWeightsInModelWithoutCorrespondingHostWeight – If true, do not throw an error if there are initializers in the ONNX model without corresponding initializer tensor(s) in the session’s IR.
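
A minimal sketch (the file name is illustrative):

// Replace the host-side weights, then push them to the device.
session.resetHostWeights("updated_model.onnx");
session.weightsFromHost();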

void readWeights(const IWeightsIO &weightsIo)

Read the weights from the host stream memory and write to the host.

This method may only be called after weightsToHost() has been called.

Parameters

weightsIo – The weight data that is read from the host stream memory is written to the addresses in weightsIo.out.
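
A minimal sketch of reading a weight tensor back (the tensor name "w" and its FLOAT {2, 2} shape are illustrative):

// Fetch the weights into host stream memory, then copy them out.
session.weightsToHost();

std::vector<float> w(4);
popart::WeightsIO weightsRead;
weightsRead.insert("w", {w.data(), popart::TensorInfo("FLOAT", {2, 2})});
session.readWeights(weightsRead);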

void writeWeights(const IWeightsIO &weightsIo)

Write the weights from the host to the IR tensor memory.

This method may only be called after weightsFromHost() has been called.

Parameters

weightsIo – The weight data is written to the addresses in weightsIo.out.

std::string serializeIr(IrSerializationFormat format)

Serialize the IR graph to a string.

Parameters

format – The format to use for serializing.

Returns

A string containing the serialized IR graph.
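
For example:

std::string irJson = session.serializeIr(popart::IrSerializationFormat::JSON);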

inline const Ir &getIr() const

Get the IR associated with the Session.

inline const popx::Devicex &getDevice() const

Get the device associated with the Session.

inline popx::Devicex &getDevice()

Get the device associated with the Session.

inline const popx::IrLowering &getIrLowering() const

Get the IR lowering associated with the Session.

inline const popx::Executablex &getExecutable() const

Get the executable associated with the Session.

void broadcastWeights(int rootRank = 0)

Broadcasts the weights from the PopRun instance with index rootRank to all other instances.

Parameters

rootRank – The index of the PopRun instance from which the weights should be broadcast.

void updateEngineCache()

Update cacheEntries from the engine cache directory and update ir::hashMatched_ with the updated cacheEntries.

void setDeviceInfo(std::shared_ptr<DeviceInfo> deviceInfo)

Set the DeviceInfo of the Session.

14.1.1. Training session

#include <popart/session.hpp>
class TrainingSession : public popart::Session

TrainingSession is a runtime instance that provides an interface for executing ONNX graphs on IPU hardware with training provided by optimizing a loss tensor using an optimizer and automatic differentiation (backpropagation).

Public Functions

~TrainingSession() override

Destructor for the TrainingSession class.

void updateOptimizerFromHost(const Optimizer *optimizer)

Update the optimizer from the host.

This method updates the optimizer and the associated hyperparameters but not the optimizer state tensors.

NOTE: The optimizer parameter has to be compatible with the optimizer passed to the TrainingSession constructor. For example, you cannot call this function with an SGD1 optimizer if you created the session with an SGD0 optimizer. This is because it is not possible to change the IR after a session has been constructed.

Parameters

optimizer – A pointer to a popart::Optimizer.

void copyFromRemoteBuffer(const std::string &buffer, void *w, int repeat_index, unsigned replication_index = 0)

Copy from a remote buffer into a user buffer.

This can be useful when we run larger models with host side reductions since HEXOPT is currently limited to 128 MB.

Parameters
  • buffer – The name of the remote buffer to copy from.

  • w – Pointer to a user buffer to copy to.

  • repeat_index – The index in the remote buffer to copy from.

  • replication_index – The replicated graph index when using replicated graphs. Default=0.

void copyToRemoteBuffer(void *w, const std::string &buffer, int repeat_index, unsigned replication_index = 0)

Copy from a user buffer to a remote buffer.

This can be useful when we run larger models with host side reductions since HEXOPT is currently limited to 128 MB.

Parameters
  • w – Pointer to a user buffer to copy from.

  • buffer – The remote buffer to copy to.

  • repeat_index – The index in the remote buffer to copy to.

  • replication_index – The replicated graph index when using replicated graphs. Default=0.

Public Static Functions

static std::unique_ptr<TrainingSession> createFromIr(std::shared_ptr<Ir> ir, std::shared_ptr<DeviceInfo> deviceInfo, const std::string name = DefaultTrainingSessionName)

Create a session for training from an IR.

Parameters
  • ir – The IR to create the session from.

  • deviceInfo – The type of device that this session uses.

  • name – The name of this training session. Default: “training”.

static std::unique_ptr<TrainingSession> createFromOnnxModel(const std::string &model, const DataFlow &dataFlow, const TensorId &loss, const Optimizer &optimizer, std::shared_ptr<DeviceInfo> deviceInfo, const InputShapeInfo &inputShapeInfo = InputShapeInfo(), const SessionOptions &userOptions = SessionOptions(), const Patterns &patterns = Patterns(), const std::string name = DefaultTrainingSessionName)

Create a session for training from an ONNX model.

Parameters
  • model – An ONNX model protobuf, or the name of a file containing an ONNX model protobuf.

  • dataFlow – Configuration for the data feeds and fetches.

  • loss – The identifier of the final scalar loss tensor for training.

  • optimizer – The optimizer to use when training.

  • deviceInfo – The type of device that this session uses.

  • inputShapeInfo – (Optional) The sizes and dtypes of the input tensors. This is used to specify the sizes of the input tensors in the case that the ONNX model does not include this information. The Poplar graph programming framework uses statically allocated memory buffers and so it needs to know the size of tensors before the compilation. Default: InputShapeInfo().

  • userOptions – (Optional) The user configuration options for the Session class. Default: SessionOptions().

  • patterns – (Optional) A user-selected set of graph transformation patterns which will be applied to the graph. If this is not specified, a default set of optimisation transformations will be applied. Default: Patterns().

  • name – (Optional) The name of this training session. Default: “training”.
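
A minimal sketch (the model file, the loss tensor name and the learning rate are illustrative):

auto dataFlow = popart::DataFlow(1, {{"loss", popart::AnchorReturnType("All")}});
auto optimizer = popart::ConstSGD(0.01f);
auto device =
    popart::DeviceManager::createDeviceManager().acquireAvailableDevice(1);

auto session = popart::TrainingSession::createFromOnnxModel(
    "model.onnx", dataFlow, "loss", optimizer, device);
session->prepareDevice();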

14.1.2. Inference session

#include <popart/session.hpp>
class InferenceSession : public popart::Session

InferenceSession is a runtime instance that provides an interface for executing ONNX graphs on IPU hardware, without any automatic differentiation (backpropagation) or optimization.

Public Functions

~InferenceSession() override

Destructor for the InferenceSession class.

void popxlSetEngineIsLoaded(bool isLoaded)

Set whether the engine is loaded. Used by PopXL.

Public Static Functions

static std::unique_ptr<InferenceSession> createFromIr(std::shared_ptr<Ir> ir, std::shared_ptr<DeviceInfo> deviceInfo, const std::string name = DefaultInferenceSessionName)

Create a session for inference from an IR.

Parameters
  • ir – The IR to create the session from.

  • deviceInfo – The type of device that this session uses.

  • name – The name of this inference session. Default: “inference”.

static std::unique_ptr<InferenceSession> createFromOnnxModel(const std::string &model, const DataFlow &dataFlow, std::shared_ptr<DeviceInfo> deviceInfo, const InputShapeInfo &inputShapeInfo = InputShapeInfo(), const SessionOptions &userOptions = SessionOptions(), const Patterns &patterns = Patterns(), const std::string name = DefaultInferenceSessionName)

Create a session for inference from an ONNX model.

Parameters
  • model – An ONNX model protobuf, or the name of a file containing an ONNX model protobuf.

  • dataFlow – Configuration for the data feeds and fetches.

  • deviceInfo – The type of device that this session uses.

  • inputShapeInfo – (Optional) The sizes and dtypes of the input tensors. This is used to specify the sizes of the input tensors in the case that the ONNX model does not include this information. The Poplar graph programming framework uses statically allocated memory buffers and so it needs to know the size of tensors before the compilation. Default: InputShapeInfo().

  • userOptions – (Optional) The user configuration options for the Session class. Default: SessionOptions().

  • patterns – (Optional) A user-selected set of graph transformation patterns which will be applied to the graph. If this is not specified, a default set of optimisation transformations will be applied. Default: Patterns().

  • name – (Optional) The name of this inference session. Default: “inference”.
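
A minimal sketch (the model file and the anchored output name are illustrative):

auto dataFlow = popart::DataFlow(1, {{"output", popart::AnchorReturnType("All")}});
auto device = popart::DeviceManager::createDeviceManager().createCpuDevice();

auto session =
    popart::InferenceSession::createFromOnnxModel("model.onnx", dataFlow, device);
session->prepareDevice();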

14.1.3. Session options

#include <popart/sessionoptions.hpp>
enum class popart::AccumulateOuterFragmentSchedule

Enum type that determines how the operations in the accumulate outer fragment will be scheduled across virtual graphs (only relevant to pipelined modes).

Values:

enumerator Scheduler = 0

Don’t add additional constraints and let the scheduler work it out.

enumerator Serial

Add constraints that ensure ops are executed in virtual graph ID order.

enumerator OverlapCycleOptimized

Try and parallelise ops with different virtual graph IDs as much as possible.

enumerator OverlapMemoryOptimized

Try and parallelise ops with different virtual graph IDs but avoid certain steps that are costly in terms of memory usage.

enum class popart::AutodiffStitchStrategy

Enum type representing a strategy to ensure a backward graph’s inputs are either inputs of the forward graph, outputs of the forward graph or gradients of outputs of the forward graph.

Strategies may expose tensors that would otherwise have been internal to the forward graph as outputs of this forward graph.

Values:

enumerator RecomputeMinimal = 0

Recompute any backward graph inputs associated with non-gradient forward graph tensors that are neither inputs nor outputs in the forward graph.

enumerator RecomputeAllNonInputs

Recompute any backward graph inputs associated with non-gradient forward graph tensors that are not inputs in the forward graph.

enumerator AddFwdOutputs

For backward graph inputs associated with non-gradient forward graph tensors that are neither inputs nor outputs in the forward graph, add them as outputs to the forward graph.

Note

This strategy is not guaranteed to work for all circumstances. In particular, it is unable to deal with subgraphs of IfOp. Using this setting may therefore result in subsequent exceptions in the Autodiff transform and it is therefore inadvisable to use this as an Autodiff default.

enumerator SafeAddFwdOutputs

Like AutodiffStitchStrategy::AddFwdOutputs except that those backward graph inputs that can’t be stitched with AutodiffStitchStrategy::AddFwdOutputs (that is, by adding outputs to the forward graph) are stitched using the AutodiffStitchStrategy::RecomputeMinimal strategy instead.

This means that this is a safe strategy to use as an Autodiff default.

enumerator N

Number of AutodiffStitchStrategy values.

enum class popart::BatchSerializationBatchSchedule

Enum type that describes how to change the batch serialisation subgraph schedule before outlining.

Note

This setting is experimental and may change.

Values:

enumerator Scheduler = 0

Don’t encourage any particular scheduling for ops within batch subgraphs (leave it to the scheduler) but tell the scheduler to schedule subgraphs in sequence.

enumerator Isomorphic

Encourage all ops within batch subgraphs to be scheduled identically and for each subgraph to be scheduled in sequence (good for outlineability).

enumerator OverlapOnIo

Attempt to put the remote load op for batch N+1 right after the compute phase of batch N.

enumerator OverlapOnCompute

Attempt to put the remote load op for batch N+1 right before the compute phase of batch N.

enumerator N

The number of BatchSerializationBatchSchedule values.

enum class popart::BatchSerializationMethod

Enum type that describes how to apply the batch serialization.

Note

This setting is experimental and may change.

Values:

enumerator UnrollDynamic = 0

Unroll the batch with dynamic slicing.

enumerator UnrollStatic

Unroll the batch with static slicing.

enumerator Loop

Loop over the batch dimension.

enumerator N

The number of BatchSerializationMethod values.

enum class popart::BatchSerializationTransformContext

Enum type that describes when to apply batch serialization.

Note

This setting is experimental and may change.

Values:

enumerator Fwd = 0

Apply batch serialization before growing the backward pass.

enumerator Bwd

Apply batch serialization after growing the backward pass.

enumerator N

The number of BatchSerializationTransformContext values.

enum class popart::ExecutionPhaseIOSchedule

Enum type to specify when to load tensors.

Values:

enumerator Preload = 0

Preload tensors in previous phase for use in current phase.

enumerator OnDemand

Load tensors just before they are required.

enumerator N

The number of ExecutionPhaseIOSchedule values.

enum class popart::ExecutionPhaseSchedule

Enum type to specify the order of processing optimizer operations for different weights of the same execution phase.

The steps for phased execution are:

  1. Copy to IO tiles if necessary.

  2. Run collective operations if necessary.

  3. Load optimizer state.

  4. Update optimizer state.

  5. Apply optimizer.

  6. Store updated tensor if necessary.

Values:

enumerator Interleaving = 0

Process above steps for one weight at a time (for example: 123456, 123456, 123456).

The scheduler may interleave these steps.

enumerator Batch

Process above steps for all weights together, in a way that maximises overlap potential between compute and exchange (for example: 333, 111, 222, 444, 555, 666).

enumerator BatchClusteredIO

Process above steps for all weights together, in a way that maximises overlap potential between compute and exchange, and maximises stream copy merges by keeping RemoteLoad/RemoteStore operations clustered (for example: 333, 111, 222, 444, 555, 666).

enumerator N

The number of ExecutionPhaseSchedule values.

enum class popart::GradientTensorTrackingMethod

Enum type to specify the method for selecting gradient tensors whose statistics are to be tracked for the AutomaticLossScale transform.

Values:

enumerator AllNonViewChangingGradientTensors = 0

Track all gradients of non-view-changing gradient tensors.

enumerator ConvAndMatmulGradients

Track all gradients of inputs to MatMul and Convolution ops.

enumerator GradientsOfUserSpecifiedTensors

Track gradients of user-specified tensors.

enumerator N

The number of GradientTensorTrackingMethod values.

enum class popart::Instrumentation

Enum type used to specify an instrumentation type.

Values:

enumerator Outer = 0

Outer loop instrumentation, graph over all IPUs.

enumerator Inner

Inner loop instrumentation, graph per IPU.

enumerator N

The number of Instrumentation values.

enum class popart::IrSerializationFormat

Enum type used to specify a serialization format.

Values:

enumerator JSON

JavaScript Object Notation (JSON).

enum class popart::MeanReductionStrategy

Enum type that specifies when to divide by a mean reduction factor, when doing mean reduction over a sequence of tensors \(t_1, t_2, ..., t_k\).

Values:

enumerator Running = 0

Keep the reduction buffer as the mean of the tensors accumulated so far.

If \(t_1, ..., t_f\) have just been processed, the current accumulator \(s\) is the mean of these values, and the next accumulator update is \(s = \frac{f}{f+1} * s + \frac{1}{f+1} * t_{f+1}\) to keep \(s\) a running mean.

This strategy guarantees \(s \le \max(t_1, ..., t_k)\) throughout the accumulation, therefore it will not overflow, but it is generally slower than MeanReductionStrategy::Post.

enumerator Post

Keep the accumulation factor as the running sum, and divide once by \(k\) at the end of the accumulation.

This strategy will generally be faster than MeanReductionStrategy::Running, but is prone to overflow (especially when using fp16).

enumerator N

The number of MeanReductionStrategy values.

enum class popart::MergeVarUpdateType

Enum type used to specify which VarUpdateOp ops to merge.

Values:

enumerator None = 0

Do not merge VarUpdateOp ops.

enumerator All

Merge all VarUpdateOp ops into as few groups as possible.

This is a good choice when memory is not a constraint.

enumerator AutoLoose

Merge into groups while attempting not to increase maximum variable liveness, and also not to slice tensor variables in a way that requires them to be processed by different VarUpdateOp ops.

enumerator AutoTight

Merge into groups, so that VarUpdateOp ops process tensors of exactly SessionOptions::mergeVarUpdateMemThreshold in size.

enumerator N

The number of MergeVarUpdateType values.

enum class popart::RecomputationType

Enum type to specify which ops to recompute in the backward pass when doing auto-recomputation.

Values:

enumerator None = 0

No ops are recomputed (Default).

enumerator Standard

Recompute using an algorithm that picks checkpoints to try to minimise maximum liveness.

enumerator NormOnly

Only Norm ops (+ non-linearities, if following) are recomputed.

enumerator Pipeline

Recompute all forward pipeline stages.

enumerator RecomputeAll

Recompute all ops.

enumerator N

The number of RecomputationType values.

enum class popart::SubgraphCopyingStrategy

Enum type that describes how copies for inputs and outputs for subgraphs are lowered.

Currently this only affects subgraphs associated with CallOp ops.

Values:

enumerator OnEnterAndExit = 0

Copy all inputs before the start of the subgraph, copy all outputs after all ops in the subgraph.

With this strategy, subgraphs will always map to a single Poplar function.

enumerator JustInTime

Copy inputs just before they are consumed and copy outputs as soon as they are produced.

With this strategy, subgraphs may be lowered into multiple Poplar functions.

enumerator N

The number of SubgraphCopyingStrategy values.

enum class popart::SyntheticDataMode

Enum type used to specify the data source for input tensors.

Values:

enumerator Off = 0

Use real data.

enumerator Zeros

Input tensors are initialised to all zeros.

enumerator RandomNormal

Input tensors are initialised with a random normal distribution ~N(0,1).

enumerator RandomUniform

Input tensors are initialised with a uniform distribution.

enumerator N

The number of SyntheticDataMode values.

enum class popart::VirtualGraphMode

Enum type used to specify a virtual graph mode.

Values:

enumerator Off = 0

Virtual graphs are not enabled.

enumerator Manual

User must set the popart::Op::virtualGraph attribute on all ops.

enumerator Auto

Use the AutoVirtualGraph transform.

enumerator ExecutionPhases

Virtual graphs are tied to execution phases.

enumerator N

The number of VirtualGraphMode values.

struct AccumulateOuterFragmentSettings

A structure containing accumulate outer fragment settings.

Public Functions

AccumulateOuterFragmentSettings() = default
inline AccumulateOuterFragmentSettings(AccumulateOuterFragmentSchedule schedule_, const std::vector<int> &excludedVirtualGraphs_)

Constructor for AccumulateOuterFragmentSettings.

Parameters
  • schedule_ – Indicate how to schedule the accumulate outer fragment. This setting is experimental and may change. Default: AccumulateOuterFragmentSchedule::Serial

  • excludedVirtualGraphs_ – The virtual graph IDs to explicitly exclude from parallelisation. This setting is experimental and may change.

Public Members

AccumulateOuterFragmentSchedule schedule = AccumulateOuterFragmentSchedule::Serial

Indicate how to schedule the accumulate outer fragment.

Note

This setting is experimental and may change.

std::vector<int> excludedVirtualGraphs = {}

The virtual graph IDs to explicitly exclude from parallelisation.

Note

This setting is experimental and may change.

struct AutodiffSettings

The settings for the Autodiff transform.

Public Functions

AutodiffSettings() = default

Default constructor for the AutodiffSettings struct.

inline AutodiffSettings(AutodiffStitchStrategy stitchStrategy_)

Constructor for the AutodiffSettings struct.

Parameters

stitchStrategy_ – The strategy to ensure a backward graph’s inputs are either inputs of the forward graph, outputs of the forward graph or gradients of outputs of the forward graph. Default: AutodiffStitchStrategy::RecomputeAllNonInputs.

Public Members

AutodiffStitchStrategy stitchStrategy = AutodiffStitchStrategy::RecomputeAllNonInputs

The strategy PopART should use to ensure that all graph inputs of a backward graph are available as either inputs or outputs of the forward graph or gradients of outputs of the forward graph.

Note

This is an experimental option and may change.

struct AutomaticLossScalingSettings

A structure containing user configuration for automatic loss scaling settings.

Note

Automatic loss scaling is in preview. It is well tested and enabled in some of our example applications, but may not behave as expected in all models. Recommendation: if your model with automatic loss scaling enabled does not converge or triggers a compilation error, then you will need to set the loss scale manually.

Public Functions

AutomaticLossScalingSettings() = default

Default constructor for AutomaticLossScalingSettings.

AutomaticLossScalingSettings(bool enabled_, const nonstd::optional<std::vector<TensorId>> &toTrackTensors_, float binEdgeLocation_, float thresholdUpperCountProportion_, int updatePeriod_, GradientTensorTrackingMethod gradientTensorTrackingMethod_)

Constructor for AutomaticLossScalingSettings.

Parameters
  • enabled_ – Indicate whether to keep track (true) or not (false) of the distribution of gradient tensor elements over the floating point range. Default: false.

  • toTrackTensors_ – An optional list of model tensor names, for which gradient statistics will be collected. If not set, the gradients of all tensors produced by default operations (matmul, conv) will be used.

  • binEdgeLocation_ – The location of the bin edge as a proportion of the absolute numerical range of the tracked gradient tensor elements, in the range [0, 1]. 0 represents the smallest representable value, and 1 the maximum. This is the single bin edge of the histogram that is an input to the loss scale updater algorithm. Default: 0.125.

  • thresholdUpperCountProportion_ – The proportion of the elements in the upper bin above which the loss scale is increased, and below which the loss scale is decreased. Should be in the range [0, 1]. Default: 1e-7.

  • updatePeriod_ – Indicate how often the loss scale update factor should be updated with respect to optimizer steps. Default: 1

  • gradientTensorTrackingMethod_ – The method for selecting gradient tensors whose statistics are to be tracked. Default: GradientTensorTrackingMethod::AllNonViewChangingGradientTensors.

std::size_t hash() const

Public Members

bool enabled = false
float binEdgeLocation = 0.125f
float thresholdUpperCountProportion = 1e-7
nonstd::optional<std::vector<TensorId>> toTrackTensors
int updatePeriod = 1
GradientTensorTrackingMethod gradientTensorTrackingMethod = GradientTensorTrackingMethod::AllNonViewChangingGradientTensors
struct BatchSerializationSettings

A structure containing batch serialization settings.

Public Functions

BatchSerializationSettings() = default

Default constructor for BatchSerializationSettings.

BatchSerializationSettings(int factor_, bool concatOnVirtualGraphChange_, bool concatOnExecutionPhaseChange_, bool concatOnPipelineStageChange_, BatchSerializationTransformContext transformContext_ = BatchSerializationTransformContext::Fwd, BatchSerializationMethod method_ = BatchSerializationMethod::UnrollDynamic, BatchSerializationBatchSchedule batchSchedule_ = BatchSerializationBatchSchedule::Isomorphic)

Constructor for BatchSerializationSettings.

Parameters
  • factor_ – The number of compute batches to split operations into. Default: 0.

  • concatOnVirtualGraphChange_ – Indicate to break batch serialization chains (true) when the virtual graph changes (by concatenating the compute batches to the local batch). Default: true.

  • concatOnExecutionPhaseChange_ – Indicate to break batch serialization chains (true) when the execution phase changes (by concatenating the compute batches to the local batch). Default: true.

  • concatOnPipelineStageChange_ – Indicate to break batch serialization chains (true) when the pipeline stage changes (by concatenating the compute batches to the local batch). Default: true.

  • transformContext_ – An experimental value to control when batch serialization is applied. Default: BatchSerializationTransformContext::Fwd.

  • method_ – An experimental value to control how batch serialization is applied. Default: BatchSerializationMethod::UnrollDynamic.

  • batchSchedule_ – An experimental value that changes how operations are scheduled. Default: BatchSerializationBatchSchedule::Isomorphic.

Public Members

int factor = 0

The number of compute batches to split operations into.

bool concatOnVirtualGraphChange = true

Break batch serialization chains when the virtual graph changes (by concatenating the compute batches to the local batch).

bool concatOnExecutionPhaseChange = true

Break batch serialization chains when the execution phase changes (by concatenating the compute batches to the local batch).

bool concatOnPipelineStageChange = true

Break batch serialization chains when the pipeline stage changes (by concatenating the compute batches to the local batch).

BatchSerializationTransformContext transformContext = BatchSerializationTransformContext::Fwd

Experimental value to control when batch serialization is applied.

BatchSerializationMethod method = BatchSerializationMethod::UnrollDynamic

Experimental value to control how batch serialization is applied.

BatchSerializationBatchSchedule batchSchedule = BatchSerializationBatchSchedule::Isomorphic

Experimental value that changes how operations are scheduled.

struct ExecutionPhaseSettings

A structure containing ExecutionPhase settings.

Public Functions

ExecutionPhaseSettings() = default

Default constructor for ExecutionPhaseSettings.

inline ExecutionPhaseSettings(int phases_, bool stages_, ExecutionPhaseIOSchedule weightIOSchedule_, ExecutionPhaseIOSchedule activationIOSchedule_, ExecutionPhaseIOSchedule optimizerStateIOSchedule_, ExecutionPhaseIOSchedule accumulatorIOSchedule_, ExecutionPhaseSchedule schedule_)

Constructor for ExecutionPhaseSettings.

Parameters
  • phases_ – The number of execution phases for the whole model. Default=0.

  • stages_ – The number of overlapping stages:

    • 1: Parallel streaming memory, default for 1 IPU per replica.

    • 2: PingPong between 2 IPUs, default for 2 or more IPUs per replica (Default).

  • weightIOSchedule_ – The execution phase IO schedule for weight tensors. Default: ExecutionPhaseIOSchedule::Preload.

  • activationIOSchedule_ – The execution phase IO schedule for activation and gradient tensors. Default: ExecutionPhaseIOSchedule::Preload.

  • optimizerStateIOSchedule_ – The execution phase IO schedule for optimizer state tensors. Default: ExecutionPhaseIOSchedule::OnDemand.

  • accumulatorIOSchedule_ – The execution phase IO schedule for accumulator tensors. Default: ExecutionPhaseIOSchedule::Preload.

  • schedule_ – An experimental value that changes how operations are scheduled. Default: ExecutionPhaseSchedule::Interleaving.

Public Members

int phases = 0

Number of ExecutionPhases for the whole model.

int stages = 2

Number of overlapping stages.

  • 1: Parallel streaming memory, default for 1 IPU per replica.

  • 2: PingPong between 2 IPUs, default for 2 or more IPUs per replica.

ExecutionPhaseIOSchedule weightIOSchedule = ExecutionPhaseIOSchedule::Preload

The execution phase IO schedule for weight tensors.

ExecutionPhaseIOSchedule activationIOSchedule = ExecutionPhaseIOSchedule::Preload

The execution phase IO schedule for activation and gradient tensors.

ExecutionPhaseIOSchedule optimizerStateIOSchedule = ExecutionPhaseIOSchedule::OnDemand

The execution phase IO schedule for optimizer state tensors.

ExecutionPhaseIOSchedule accumulatorIOSchedule = ExecutionPhaseIOSchedule::Preload

The execution phase IO schedule for accumulator tensors.

ExecutionPhaseSchedule schedule = ExecutionPhaseSchedule::Interleaving

An experimental value that changes how operations are scheduled.
struct ReplicatedCollectivesSettings

A structure containing settings for replicated collective operations.

Public Functions

ReplicatedCollectivesSettings(bool prepareScheduleForMergingCollectives = false, bool mergeAllReduceCollectives = false, bool mergeReduceScatterCollectives = false, bool mergeAllGatherCollectives = false)

Constructor for the ReplicatedCollectivesSettings struct.

Parameters
  • prepareScheduleForMergingCollectives – Insert constraints into the schedule such that collectives which can be merged occur one right after the other. true to insert constraints, false otherwise. Default: false.

  • mergeAllReduceCollectives – Identify allreduce operations which can be scheduled at the same time, and perform them as one larger operation to better utilize the bandwidth between replicas. true to identify operations, false otherwise. Default: false.

  • mergeReduceScatterCollectives – Identify reduce-scatter operations which can be scheduled at the same time, and perform them as one larger operation to better utilize the bandwidth between replicas. Default: false.

  • mergeAllGatherCollectives – Identify allgather operations which can be scheduled at the same time, and perform them as one larger operation to better utilize the bandwidth between replicas. Default: false.

std::size_t hash() const

Public Members

bool prepareScheduleForMergingCollectives = false

Insert constraints into the schedule such that collectives which can be merged occur one right after the other.

bool mergeAllReduceCollectives = false

Identifies allreduce operations which can be scheduled at the same time, and performs them as one larger operation so as to better utilize the bandwidth between replicas.

bool mergeReduceScatterCollectives = false

Identifies reduce-scatter operations which can be scheduled at the same time, and performs them as one larger operation so as to better utilize the bandwidth between replicas.

bool mergeAllGatherCollectives = false

Identifies allgather operations which can be scheduled at the same time, and performs them as one larger operation so as to better utilize the bandwidth between replicas.

struct SessionOptions

A structure containing user configuration options for the Session class.

Public Functions

inline bool explicitPipeliningEnabled() const

Enable explicit pipelining.

Determined from values for enablePipelining, useHostCopyOps and enableExplicitMainLoops.

inline bool implicitPipeliningEnabled() const

Enable implicit pipelining.

Determined from values for enablePipelining, useHostCopyOps and enableExplicitMainLoops.

inline void enableExplicitIR(bool enable)

Enable explicit representations in the IR (code paths).

Enabled if true, otherwise not.

bool shouldDelayVarUpdates() const
int64_t getGlobalReplicationFactor() const

Get the global replication factor.

Returns

  • If enableDistributedReplicatedGraphs is true, then return globalReplicationFactor.

  • If enableReplicatedGraphs is true, then return replicatedGraphCount.

  • otherwise return 1.

unsigned getAccumulationFactor() const

Get the gradient accumulation factor.

Throws an error if gradient accumulation is not enabled (enableGradientAccumulation is false) and the factor (accumulationFactor) is set to >1.

Returns

The accumulation factor.

unsigned getBufferingDepth(const TensorId &id, bool rearrangedOnHost)
bool autoRecomputationEnabled() const

Returns true if auto-recomputation is enabled, false otherwise.

inline SessionOptions()

Constructor for SessionOptions.
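
A minimal sketch of configuring a few of the options documented below before constructing a session:

popart::SessionOptions opts;
opts.enableGradientAccumulation = true;
opts.accumulationFactor = 4; // number of micro-batches to accumulate
opts.virtualGraphMode = popart::VirtualGraphMode::Auto;
opts.enablePipelining = true;
// Pass `opts` as the userOptions argument when creating the session, for
// example to TrainingSession::createFromOnnxModel().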

Public Members

std::string logDir

A directory for log traces to be written into.

std::set<std::string> dotChecks = {}

When to write .dot files during IR construction.

int firstDotOp = 0

The ops written to the .dot file will be a part of the schedule, controlled by firstDotOp and finalDotOp.

In particular, it will be [max(0, firstDotOp), min(N ops in IR, finalDotOp)).

int finalDotOp = 10000

See firstDotOp.

bool dotOpNames = false

Enable inclusion of the op name in the .dot file (the op type is always exported).

Enabled when true. Default: false.

bool exportPoplarComputationGraph = false

Enable export of Poplar computational graph.

Enabled when true. Default: false.

bool exportPoplarVertexGraph = false

Enable export of Poplar vertex graph.

Enabled when true. Default: false.

bool separateCallOpPdfs = true

Enable creation of separate PDFs for each subgraph when generating PDFs of IR graphs.

Enabled when true. Default: true.

bool enableOutlining = true

Enable outlining.

This identifies and extracts repeated parts of computational graph into subgraphs. Enabled when true. Default: true.

bool enableOutliningCopyCostPruning = true

Enable inclusion of the cost of copying cached sections in the outlining cost model.

Enabled when true. Default: true.

float outlineThreshold = 1.0f

Specify the incremental value that a sub-graph requires, relative to its nested sub-graphs (if any), to be eligible for outlining.

A high threshold results in fewer sub-graphs being outlined, a negative value results in all being outlined. The gross value of a sub-graph is the sum of its constituent ops’ Op::getSubgraphValue() values. To disable outlining, it is better to set enableOutlining to false than to set this value to infinity. The default value of 1.0f results in all high value operations such as convolution being cached, but standalone low value operations such as ReLU will not be.

Default: 1.0f.

float outlineSequenceBreakCost = 10000.0f

Specify the penalty applied to outlining potential sub-graphs if the sub-graph to be created breaks up a sequence of operations that are more efficient (for example for overlapping compute and exchange) when outlined together.

Default: 10000.0f.

SubgraphCopyingStrategy subgraphCopyingStrategy = SubgraphCopyingStrategy::OnEnterAndExit

Specify how copies for inputs and outputs for subgraphs are lowered.

Setting this value to SubgraphCopyingStrategy::JustInTime may save memory at the cost of fragmenting subgraphs into multiple Poplar functions. This may be particularly useful when a number of weight updates are outlined in one subgraph, as it may prevent multiple weight tensors from being live at the same time inside the subgraph.

Default: SubgraphCopyingStrategy::OnEnterAndExit.

RecomputationType autoRecomputation = RecomputationType::None

Enable recomputation of operations in the graph in the backward pass.

This will reduce model size at the cost of computation cycles.

Default: RecomputationType::None (no recomputation).

MergeVarUpdateType mergeVarUpdate = MergeVarUpdateType::None

Enable merging of VarUpdates into groups of VarUpdates, by flattening and concatenating variable tensors and updating tensors.

Default: MergeVarUpdateType::None (no merging).

int64_t mergeVarUpdateMemThreshold = 1000000

Specify the memory threshold for VarUpdateOp merging algorithms.

The MergeVarUpdateType::AutoLoose and MergeVarUpdateType::AutoTight VarUpdateOp merging algorithms have a threshold on the total memory of variable tensors to merge for updating. Defined as total memory in bytes.

Default: 1000000.

int64_t looseThresholdAtPeak = 8000

Specify the threshold at peak used in the calculation of the absolute threshold in the MergeVarUpdateType::AutoLoose VarUpdateOp merging algorithm.

min(mergeVarUpdateMemThreshold, liveAtPeak - liveCurrently + looseThresholdAtPeak)

where:

  • liveAtPeak is an estimate of the maximum live memory of the computation; and

  • liveCurrently is an estimate of the live memory where the threshold is being used to determine whether to schedule or postpone a VarUpdateOp.

Default: 8000.

bool rearrangeAnchorsOnHost = true

Enable rearrangement (in memory) of anchor tensors to be done on the host.

Before anchor tensors are streamed from device to host, they are not necessarily arranged in memory as required when they are to be copied from host stream to host. This can be done on the device or on the host.

Default: true (Rearrangement done on host to save memory, but often at the expense of cycles, especially for larger anchor tensors.).

bool rearrangeStreamsOnHost = false

Enable rearrangement (in memory) of stream tensors to be done on the host.

Before stream tensors are streamed from host to device, they are not necessarily arranged in memory as required when they are to be copied from host stream to device. This can be done on the device or on the host.

Default: false (Rearrangement done on device).

bool enablePrefetchDatastreams = true

Enable prefetching for input data streams.

Poplar will speculatively read data for a stream before it is required in order to allow the ‘preparation’ of the data to occur in parallel with compute. Enabled when true. Default: true.

unsigned defaultBufferingDepth = 1

Specify the default buffering depth value used for streams that are not re-arranged on the host.

For tensors that are rearranged on the host, a buffering depth of 1 will always be used. This default value can be overridden via bufferingDepthMap.

unsigned defaultPrefetchBufferingDepth = initialDefaultPrefetchBufferingDepthValue

Deprecated:

This session option name has been deprecated and will be removed in a future release.

Please use the alias defaultBufferingDepth instead.

std::map<TensorId, unsigned> bufferingDepthMap

This mapping can be used to set stream-specific buffering depths.

The buffering depth could be thought of as being the size of a circular buffer that feeds data to and from Poplar. A buffering depth greater than 1 may improve the performance due to increased parallelisation but comes at the cost of increasing the memory footprint. Streams for tensors that have no entry in this map will default to 1 (if a tensor is rearranged on host) or defaultBufferingDepth (if a tensor is not rearranged on host). Specifying a tensor that gets rearranged on host in this map will throw an error.

std::map<TensorId, unsigned> prefetchBufferingDepthMap

Deprecated:

This session option name has been deprecated and will be removed in a future release.

Please use the alias bufferingDepthMap instead.

bool enableNonStableSoftmax = false

Enable the non-stable softmax Poplar function.

By default, the stable softmax Poplar function is used. The input tensor to softmax, \(x\), is preprocessed by subtracting \(max(x)\) from each element before computing the exponentials, ensuring numerical stability. If the inputs to the softmax operations are small enough to not cause overflow when computing the exponential, then the non-stable version can be enabled instead, to increase the speed.

Default: false (not enabled).

bool enableReplicatedGraphs = false

Enable replication of graphs. Default: false (not enabled).

bool enableGradientAccumulation = false

Enable gradient accumulation. Default: false (not enabled).

ReductionType accumulationAndReplicationReductionType = ReductionType::Sum

Specify how gradients are reduced when using gradient accumulation and graph replication.

Default: ReductionType::Sum.

MeanReductionStrategy meanAccumulationAndReplicationReductionStrategy = MeanReductionStrategy::Post

Specify when to divide by a mean reduction factor when accumulationAndReplicationReductionType is set to ReductionType::Mean.

Default: MeanReductionStrategy::Post.

int64_t replicatedGraphCount = 1

Specify the number of model replications.

If enableReplicatedGraphs is true, replicatedGraphCount will set the number of model replications. For example, if the model uses 1 IPU, a replicatedGraphCount of 2 will use 2 IPUs. If the model is pipelined across 4 IPUs, a replicatedGraphCount of 4 will use 16 IPUs in total. Therefore, the number of IPUs requested must be a multiple of replicatedGraphCount. If the training is done across multiple instances of the program then the replicatedGraphCount is the number of replicas for this instance.

int64_t accumulationFactor = 1

Specify the number of micro-batches to accumulate before applying the varUpdate.

VirtualGraphMode virtualGraphMode = VirtualGraphMode::Off

Specify how to place ops on virtual graphs to achieve model parallelism, either manually using model annotations, or automatically.

Default: VirtualGraphMode::Off.

std::vector<float> virtualGraphSplitRatios

Specify split ratios when VirtualGraphMode::Auto is enabled.

These values represent the split ratio on each device, and each value is in the range (0, 1).

For example, to uniformly split the whole graph on 4 IPUs, the value should be [0.25, 0.25, 0.25, 0.25].

bool enablePipelining = false

Enable pipelining of virtual graphs. Default: false (not enabled).

SyntheticDataMode syntheticDataMode = SyntheticDataMode::Off

Specify whether to use real or synthetic data to initialize input tensors.

Streaming to/from the host is only enabled for SyntheticDataMode::Off which indicates that real data is being used.

Default: SyntheticDataMode::Off.

bool instrumentWithHardwareCycleCounter = false

Add instrumentation to the program to count the number of device cycles (of a single tile, on a single IPU) that the main program takes to execute.

Expect this to have a small detrimental impact on performance.

std::set<Instrumentation> hardwareInstrumentations = {Instrumentation::Outer}
bool disableGradAccumulationTensorStreams = false

Disable saving of weight gradient tensors off the device.

If true, the weight gradient tensors are not saved off the device when devicex.weightsFromHost() is called.

Note

This option is overridden if syntheticDataMode is not SyntheticDataMode::Off.

Note

Weight gradient tensors that are also optimiser tensors will only be disabled if both disableGradAccumulationTensorStreams and disableOptimizerStateTensorStreams are true.

bool disableOptimizerStateTensorStreams = false

Disable streaming of optimizer tensors.

If true, streaming of optimizer tensors is disabled. This setting can be used to conserve memory if you are not interested in checkpointing the optimizer state.

Note

Weight gradient tensors that are also optimiser tensors will only be disabled if both disableGradAccumulationTensorStreams and disableOptimizerStateTensorStreams are true.

bool compileEngine = true

Setting to only build the Poplar graph but not compile it.

If false, the backend will build the Poplar graph but not compile it into an Engine. In this case, no execution can be performed, and nothing can be transferred to the device. API calls which retrieve information from the graph building stage, such as tile mapping introspection, can still be used.

bool constantWeights = true

Specify an optimization for an inference session to have constant weights.

Set this option to false in order to change the weights with a call to Session::resetHostWeights() after the session has been prepared. This option has no effect on a training session.

Default: true.

bool enableEngineCaching = false

Enable Poplar executable caching.

The file is saved to the location defined with cachePath. The file will be in the PopEF format. This means that it can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

Default: false (not enabled).

bool enableVariablesCaching = true

Enable variable caching.

This means that the caching process will save variables as additional PopEF blobs to the file location defined with cachePath. If PopART requires data for variables (during the cache reading process), they will be automatically read from the cache file.

Note that turning this off allows a PopART Session to optimise the host memory it consumes during model runtime. Specifically, weightsToHost() can write directly to the IR tensor data buffers. If the option were on, this would not be safe and the session would have to create separate buffers to write the fetched data to.

Default: true (enabled).

std::string cachePath = "session_cache"

Folder to save the poplar::Executable to.

bool enableFloatingPointChecks = false

Enable throwing exceptions when floating point errors occur.

Default: false (not enabled).

bool enableStochasticRounding = false

Enable stochastic rounding.

PopART will set the Poplar engine option target.deterministicWorkers to true if this option is set and to false if it is not set. Adding a value for “target.deterministicWorkers” to SessionOptions::engineOptions overrides this behaviour.

Default: false (not enabled).
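For example, a sketch of enabling stochastic rounding while overriding the implicit deterministic-workers behaviour described above:

popart::SessionOptions opts;
opts.enableStochasticRounding = true;
// Without this line, PopART would set target.deterministicWorkers to true.
opts.engineOptions["target.deterministicWorkers"] = "false";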

bool _enableRngStateManagement = false
ExecutionPhaseSettings executionPhaseSettings

Configuration settings for execution phases.

AccumulateOuterFragmentSettings accumulateOuterFragmentSettings

Configuration setting for operations in the accumulate outer fragment.

bool explicitRecomputation = false

Enable explicit recomputation.

Default: false (not enabled).

NumIOTiles numIOTiles

Number of IPU tiles dedicated to IO.

bool aliasZeroCopy = false

Enable zero-copy for subgraphs.

BatchSerializationSettings batchSerializationSettings

Configuration setting for batch serialization.

AutodiffSettings autodiffSettings

Configuration settings for the autodiff transform.

bool delayVarUpdates = true

Option to delay variable updates as much as possible.

bool scheduleNonWeightUpdateGradientConsumersEarly = false
bool enableFullyConnectedPass = true

Enable the global fullyConnectedPass option for matmuls.

See also

poplin::matMul(poplar::Graph, poplar::Tensor, poplar::Tensor, poplar::program::Sequence, poplar::Type, poplar::DebugContext, poplar::OptionFlags, matmul::PlanningCache).

bool enableSerializedMatmuls = true

Enable/disable the serializing of matmuls.

std::string partialsTypeMatMuls

Set the partials type globally for matmuls.

Can be overridden individually with Builder.setPartialsType(). Valid values are "float" and "half". By default, this is not set, so no global partials type is imposed.
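For example, a sketch of imposing half-precision partials for all matmuls:

popart::SessionOptions opts;
opts.partialsTypeMatMuls = "half"; // or "float"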

bool enableStableNorm = false

If true, computes the mean first and subtracts the activations from it before computing the variance.

The implementation with this flag set to true is slower than when set to false. The stable version requires the first order moment to be estimated and applied to the sample set before the second order central moment is calculated.

std::map<std::string, std::string> engineOptions

Poplar engine options.

std::map<std::string, std::string> convolutionOptions

Poplar convolution options.

std::map<std::string, std::string> lstmOptions

Poplar LSTM options.

std::map<std::string, std::string> matmulOptions

Poplar matmul options.

std::map<std::string, std::string> reportOptions

Poplar reporting options.

std::map<std::string, std::string> gclOptions

GCL options.

ExperimentalSettings experimentalSettings

Configuration setting for custom transform applier.

std::vector<std::string> customCodelets

List of codelet files (with file extension) to be added to the Poplar graph.

See the Poplar documentation for poplar::Graph for more information.

std::vector<TensorId> updatableNamedBuffers

List of model named buffers that can be updated with a call to copyNamedBuffersToDevice().

This allows updating just a subset of the model weights, instead of all of them as happens with a copyWeightsToDevice() call.

std::string customCodeletCompileFlags

Compile flags for the custom codelets.

For example -g to generate debug info. See the Poplar documentation for poplar::Engine for more information.

double timeLimitScheduler = 1e9

The maximum allowed time (in seconds) that can be spent searching for a good graph schedule before a solution must be returned.

int64_t swapLimitScheduler = static_cast<int64_t>(1e9)

The maximum number of improving steps allowed by the scheduling algorithm before a solution must be returned.

std::string serializedPoprithmsShiftGraphsDir = {}

The directory to serialize Poprithms graphs to.

PopART uses Poprithms for scheduling PopART graphs. The Poprithms graphs created for scheduling can be optionally serialised (written to file). If serializedPoprithmsShiftGraphsDir is empty, then the graphs will not be serialised. The names of serialization files will be poprithms_shift_graph_i.json for the lowest non-existing values of i. The directory must already exist, PopART will not create it.

std::string kahnTieBreaker = "greedy"

Specify which method is used to control how ops are scheduled.

The initial scheduling is done with Kahn’s algorithm. When several ops are free to be scheduled, this controls which method is used.

Options are described in the Poprithms KahnTieBreaker enum.

size_t transitiveClosureOptimizationThreshold = {100000}

Specify the transitive closure optimization threshold.

The transitive closure optimization pass can significantly accelerate the scheduler. It does not, in general, affect the final schedule returned. It is run between initialization with Kahn’s algorithms and the shifting swaps. The transitive closure optimization pass is O(nOps^2) and so should not be used for extremely large graphs. If a graph is above this threshold, the transitive closure optimization pass is not run.

bool decomposeGradSum = false

Enable replacement of single sums of partial gradients with a tree of additions.

This can reduce max liveness at the cost of extra cycles. A typical use case for this would be if a large weight tensor is used as an input to many operations.

Default: false (not enabled).

ReplicatedCollectivesSettings replicatedCollectivesSettings

Control the behavior of different collective operations.

bool enableDistributedReplicatedGraphs = false

Enable training with Poplar replicated graphs across multiple PopART instances.

Default: false (not enabled).

int64_t globalReplicationFactor = 1

The total number of replicas in a multi-instance, replicated-graph training session (this should be left as the default value (1) if distributed replicated graphs are disabled).

This value includes local replication.

int64_t globalReplicaOffset = 0

The first replica index that this PopART instance is running.

bool groupHostSync = false

Specify to group the streams from the host to the device at the beginning of the schedule, and the streams from the device to the host at the end of the schedule.

This trades off memory usage for speed.

When true, tensors will stay live for longer.

Default: false (not enabled).

Note

This setting has no effect when useHostCopyOps is enabled (true).

bool strictOpVersions = true

Enable strict op version checks.

Strict op version checks will throw an error if the exact version of an op required for the model opset is not supported. Turning this check off will cause PopART to fall back to the latest implementation of the op that is supported.

Default: true (enabled).

Warning

Turning off these checks may cause undefined behaviour.

bool opxAliasChecking = false

Enable running Opx checks to verify that IR tensor aliasing information corresponds to the lowered Poplar tensor aliasing.

Default: false (not enabled).

bool opxModifyChecking = false

Enable running Opx checks to verify that IR tensor modification information corresponds to the lowered Poplar tensor modifications.

Default: false (not enabled).

bool useHostCopyOps = false

Enable use of IR graph operations for data and anchor streams.

Default: false (not enabled).

bool enableEfficientOverlapIOTopoCons = false

Enable simplified and equivalent overlapIO constraints.

Suppose we have N bins in each of three stages (8 before the loop, 7 inside the loop, 6 after the loop), and L ops in each bin. The vanilla implementation of overlapIO creates topological constraints with complexity O(N*N*L*L).

To make sure the InitOps in each step are scheduled before the HostLoadOps, we only need to keep the topological constraints within each bin and ensure that the last op of each bin is scheduled before the first op of the next bin. The total complexity is then reduced from O(N*N*L*L) to O(N*L).

Default: false (not enabled).

bool enableLoadAndOffloadRNGState = false

Enable load and offload of device RNG state from host.

Default: false (not enabled).

TensorLocationSettings activationTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}

Tensor location settings for activation/gradient tensors.

TensorLocationSettings weightTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}

Tensor location for weight tensors.

TensorLocationSettings optimizerStateTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}

Tensor location for optimizer state tensors.

TensorLocationSettings accumulatorTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}

Tensor location for gradient accumulator tensors.

std::map<TensorId, TensorLocation> tensorLocationSettingsOverride

Override tensor location for specific tensors by setting tensor locations for specific tensor ID values.
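For example, a sketch of keeping one particular tensor in on-chip memory regardless of the global settings ("w0" is a hypothetical tensor ID):

popart::SessionOptions opts;
opts.tensorLocationSettingsOverride["w0"] =
    popart::TensorLocation(popart::TensorStorage::OnChip);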

AutomaticLossScalingSettings automaticLossScalingSettings

Settings to enable and configure the automatic loss scaling behaviour when training.

Note

Automatic loss scaling is in preview. It is well tested and enabled in some of our example applications, but may not behave as expected in all models. Recommendation: if your model with automatic loss scaling enabled does not converge or triggers a compilation error, then you will need to set the loss scale manually.

DeveloperSettings developerSettings

Settings for developers to configure testing and benchmarking.

bool enableSupportedDataTypeCasting = true

Enable casting to supported data types.

If enabled (true), casts any tensor of unsupported data types to supported data types when lowering to Poplar. Currently, this implies casting:

  • INT64 -> INT32

  • UINT64 -> UINT32

The cast will throw an error for incompatible data types and over/underflows, and will warn about narrowing casts.

Default: true (enabled).

bool enableExplicitMainLoops = false

Enable explicit main loop transformation, and disable implicit training loops.

Note

This will be deprecated and enabled by default.

bool groupNormStridedChannelGrouping = false

Enable fast math mode for group norms.

Group norms have a fast math mode which changes the implementation to run faster on the IPU but, as a consequence, is incompatible with other implementations (for example, when running trained weights on the host). The default (false) is to use the correct, but slightly slower, mode.

std::function<void(int, int)> compilationProgressLogger

Callback function used to indicate PopART compilation progress.

The function should not block. All calls to the callback function will be made from the main thread so blocking in the callback will block compilation from progressing.

If this logger is not set then compilation progress will be printed on the info channel.

Param int

The progress value.

Param int

The maximum value for the progress.

int compilationProgressTotal = 100

Total progress ticks until compilation complete.

bool enableMergeExchange = true

Enable merging remote and host IO operations to facilitate IO overlap.

true to enable, otherwise false.

Default=true.

bool ensureFp32LossScaleTensor = false

Ensure that the loss scale tensor is fp32 and that this is combined with fp16 activations as late as possible to produce the first fp16 activation gradients.

This makes it possible to choose a loss scale value greater than max(fp16). This is also recommended when automatic loss scaling is enabled. Only compatible with models that have an fp16 loss scale tensor. true ensures that the loss scale tensor is fp32.

Default: false.

bool enableInplaceAmbiguityChecking = false

Enable creation of an AliasModel object for each graph and run the Poprithms ambiguity checker on it.

This throws an error if the graph has a potential inplacing ambiguity.

See poprithms::memory::inplace::Graph::AmbiguityStatus for more info on what constitutes an ambiguity.

If set to true, an AliasModel object is created for each graph and the Poprithms ambiguity checker is run on it. No ambiguity checking is performed if this option is set to false (default). However, inplace fallbacks will occur if necessary.

bool createImplicitPipeliningFwdOnlyProgram = false

Deprecated:

Create a custom program containing the forward pipeline only.

bool throwIfLog2ScaleTensorNotInRange = true

If set to true, throw a Poplar error if any fused ops that consume a log2 scale tensor receive a log2 scale tensor value not in the integer range [-32, 32).

If set to false, no error is thrown. However, note that this may lead to undefined behaviour if the value of the log2 scale is outside the range.

bool enableConstantFoldingOfMultipleConsumers = true

If set to false, disable constant folding on ops if any input have multiple consumers.

Default=true.

bool useLoopCandidateCreator = false

Use loop candidate creator for a constant if one exists.

Default=false.

bool stashAllTensorsInferencePipeline = false

Stash all tensors when using an inference pipeline.

Default=false.

struct ExperimentalSettings

Public Members

std::map<std::string, std::vector<std::string>> customTransformApplierSettings

Custom transform applier settings.

Enables inserting a custom transform sequence at a predefined checkpoint. Multiple checkpoint names and transform names can be passed for different model configurations.

The predefined checkpoint names are:

  • FWD0: Initial IR immediately after lowering from ONNX to the IR.

  • FWD1: After the pre-alias patterns have been applied to FWD0.

  • BWD0: After growing the backward pass (including the optimiser step). Note this happens before optimiser decomposition, so the optimiser will appear as a single special op rather than the many ops that implement it.

  • PREALIAS: After pre-alias transforms have been applied to BWD0.

  • MAINLOOPS: After the MainLoops transform has been applied. This transform adds explicit loop ops to the IR for device iterations (batches per step) and gradient accumulation.

  • FINAL: The final IR after preparation.

The transform names are defined by PopART and users.

For example, to execute ‘Transform A’ and ‘Transform B’ at the ‘Fwd0’ checkpoint and execute ‘Transform C’ at the ‘Fwd1’ checkpoint:

{ “Fwd0”: [ “Transform A”, “Transform B” ], “Fwd1”: [ “Transform C” ] }
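In C++, a sketch of the equivalent setting on SessionOptions (the transform names are placeholders):

popart::SessionOptions opts;
opts.experimentalSettings.customTransformApplierSettings = {
    {"Fwd0", {"Transform A", "Transform B"}},
    {"Fwd1", {"Transform C"}}};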

Note

This setting is experimental for inference and may change.

bool createHostTransferableTensorWithOffset = false

Accumulate the bytes of created tensors and rotate the start tile of the next tensor, to balance the tile mapping.

Especially when there are many small input tensors, enabling this avoids mapping them all on tile 0.

Default=false.

class NumIOTiles

A wrapper class for the SessionOptions::numIOTiles option that permits any int value and has an ‘unassigned’ state.

Public Functions

NumIOTiles()

Constructor.

NumIOTiles(int numIOTiles)

Constructor.

Parameters

numIOTiles – The number of IPU tiles dedicated to IO.

bool operator==(const int &rhs) const

Compare with int.

operator int() const

Auto convert to int.

NumIOTiles &operator=(const int &x)

Assign value using int.
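For example, a sketch of reserving IO tiles via the int assignment operator:

popart::SessionOptions opts;
opts.numIOTiles = 32; // uses NumIOTiles::operator=(const int&)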

struct TensorLocationSettings

A structure containing user configuration for cache/offloading settings.

Public Functions

TensorLocationSettings() = default

Constructor.

TensorLocationSettings(TensorLocation location_, int minElementsForOffChip_ = 2, int minElementsForReplicatedTensorSharding_ = 8192)

Constructor.

Parameters
  • location_ – The tensor location information.

  • minElementsForOffChip_ – The minimum number of elements below which offloading won’t be considered.

  • minElementsForReplicatedTensorSharding_ – The minimum number of elements necessary for replicated tensor sharding.

TensorLocationSettings(TensorStorage storage_, int minElementsForOffChip_ = 2, int minElementsForReplicatedTensorSharding_ = 8192)

Constructor.

Parameters
  • storage_ – The tensor storage information.

  • minElementsForOffChip_ – The minimum number of elements below which offloading won’t be considered.

  • minElementsForReplicatedTensorSharding_ – The minimum number of elements necessary for replicated tensor sharding.

Public Members

TensorLocation location = TensorLocation()

The default tensor location for this tensor type.

int minElementsForOffChip = 2

The minimum number of elements below which offloading won’t be considered.

int minElementsForReplicatedTensorSharding = 8192

A minimum number of elements below which replicated tensor sharding won’t be considered.
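For example, a sketch of moving optimizer state to streaming memory using the TensorStorage constructor above (the thresholds shown are the defaults):

popart::SessionOptions opts;
opts.optimizerStateTensorLocationSettings = popart::TensorLocationSettings(
    popart::TensorStorage::OffChip,
    /*minElementsForOffChip_=*/2,
    /*minElementsForReplicatedTensorSharding_=*/8192);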

#include <popart/variablesettings.hpp>
class VariableSettings

A class to dictate behaviour of variables and reductions of such across multiple graphs.

Public Functions

void verify()

Runs a test to see if the VariableSettings are invalid, and throws an error if so.

const CommGroup getSharedVariableDomain() const
Returns

the CommGroup sharedVariableDomain of this VariableSettings.

ReplicaGrouping getReplicaGrouping(unsigned numReplicas) const
Parameters

numReplicas – The number of replicas in the IR this is used in.

Returns

the ReplicaGrouping domain of this VariableSettings.

bool isUsingCommGroup() const
Returns

whether the VariableSettings were initialised using a CommGroup or a stride.

CommGroupType getCommGroupType() const
Returns

the CommGroupType. The value of this is invalid if VariableSettings::isUsingCommGroup returns false.

unsigned getStride() const
Returns

the stride. The value of this is invalid if VariableSettings::isUsingCommGroup returns true.

unsigned getGroupSize() const
Returns

the replica group size.

inline VariableRetrievalMode getRetrievalMode() const
Returns

the VariableRetrievalMode retrievalMode of this VariableSettings.

VariableSettings()

“Default” constructor, defaults CommGroup to [All, 0] and retrievalMode to OnePerGroup.

VariableSettings(CommGroup sharedVariableDomain_)

Defaults VariableRetrievalMode to OnePerGroup.

VariableSettings(VariableRetrievalMode retrievalMode_)

Defaults CommGroup to [All, 0].

VariableSettings(CommGroup sharedVariableDomain_, VariableRetrievalMode retrievalMode_)

Entirely custom VariableSettings.

VariableSettings(unsigned stride, unsigned groupSize)
VariableSettings(unsigned stride, unsigned groupSize, VariableRetrievalMode retrievalMode)
unsigned numReplicasReturningVariable(unsigned replicaCount) const

Calculate the number of replicas that will return this variable.

Parameters

replicaCount – Number of global replicas.

Returns

Number of variables returned.

unsigned getGroupCount(unsigned replicaCount) const
Parameters

replicaCount – The replicationFactor of the graph.

Returns

The number of groups given the replicaFactor and the VariableSettings.

unsigned getStride(unsigned replicaCount) const
Parameters

replicaCount – The replicationFactor of the graph.

Returns

The stride between each member of a group.

unsigned getRealGroupSize(unsigned replicaCount) const

Because CommGroups don’t have a defined group size if the type is All or None, this function will return a group size that is always accurate, based on the number of replicas.

Parameters

replicaCount – The replication factor

Returns

The actual number of replicas in a group

unsigned getGroupRepresentative(unsigned group) const

Get the default first member of a group.

Parameters

group – The group to return the representative for.

Returns

The representative replica of this group.

Shape shapeOnReplica(Shape full_shape, unsigned replicaCount, const TensorId name) const

The shape Onnx reads holds an extra outer dimension in certain cases, where the outer dimension represents the number of returning replica variables.

This function takes an Onnx full-shape and removes the outer dimension safely (that is, it checks whether the outer dimension matches an expected outer dimension). This is a convenience function to avoid duplicate code.

Parameters
  • full_shape – The shape as presented by Onnx.

  • replicaCount – The local replication factor, used to calculate the return factor.

  • name – The TensorId of the function, used to give good error feedback.

Returns

The shape of the data on the replica.

Shape shapeOnHost(Shape replica_shape, unsigned replicaCount) const

Takes the shape of a tensor on a replica and returns its full ONNX shape.

This is the inverse operation to shapeOnReplica.

Parameters
  • replica_shape – The shape of the data on a replica.

  • replicaCount – The local replication factor, used to calculate the return factor.

Returns

The shape as presented by Onnx.

std::vector<std::vector<std::int64_t>> groups(unsigned replicaCount) const

This function returns a set of vectors where each vector contains all the replica IDs of the replicas that share a variable domain, given the VariableSettings and the replicaCount.

Parameters

replicaCount – The local replication factor

Returns

A set of sets, such that set.at(a).at(b) is member number b of group a, set.size() is the number of groups, and set.at(a).size() is the size of that group.

bool operator==(const VariableSettings &other) const

Compare two variable-settings.

Parameters

other – VariableSettings to compare these settings to.

Returns

True if all internal elements are the same

bool operator!=(const VariableSettings &other) const

Compare two variable-settings.

Parameters

other – VariableSettings to compare these settings to.

Returns

False if all internal elements are the same

enum class popart::VariableRetrievalMode

Enum type that describes how to retrieve variables from the replicas.

Each replica is in a group defined by the VariableSettings::sharedVariableDomain. Replicas within a group have variables initialized with the same values.

Values:

enumerator OnePerGroup = 0

Returns one variable per group (defined by the VariableSettings::sharedVariableDomain CommGroup), automatically returns the first replica of each group, where first means the one with the lowest replica ID.

enumerator AllReduceReplicas

As OnePerGroup, but performs an AllReduce among the replicas in the same group according to VariableSettings::sharedVariableDomain !!! CURRENTLY UNSUPPORTED.

enumerator AllReplicas

Returns all replica Weights.

#include <popart/commgroup.hpp>
class CommGroup

Class to specify sub-groups of replicas.

Examples of derived sub-groups:

  • IPU-link domain sub-rack:

type == Consecutive && replicaGroupSize == 64/replica-size/N

where N is a power of two and replicaGroupSize > 1.

  • Complete IPU-link domain / full rack:

type == Consecutive && replicaGroupSize == 64/replica-size

  • Using GW-links only:

type == Orthogonal && replicaGroupSize == numberOfIpuLinkDomains

Public Functions

CommGroup()

Default CommGroup constructor.

Sets type to CommGroupType::All and replicaGroupSize to 0.

inline CommGroup(CommGroupType type, unsigned groupSize)

Construct CommGroup.

Parameters
  • groupType – The replica group type.

  • groupSize – The replica group size.

explicit CommGroup(const ReplicaGrouping &grouping)

Construct CommGroup from a ReplicaGrouping.

Parameters

grouping – The replica grouping.

ReplicaGrouping toReplicaGrouping(unsigned numReplicas) const

Convert this CommGroup to a ReplicaGrouping.

Parameters

numReplicas – The number of replicas to pass to create the replica grouping with.

Returns

The replica grouping.

bool operator==(const CommGroup &other) const
bool operator!=(const CommGroup &other) const

Public Members

CommGroupType type = CommGroupType::All

Replica group type.

unsigned replicaGroupSize = 0

Replica group size.

enum class popart::CommGroupType

PopART equivalent of GCL CommGroupType.

Each of these enumeration constants has a corresponding GCL CommGroupType value.

Values:

enumerator All = 0

All replicas viewed as one group, replica group size is ignored.

enumerator Consecutive

Groups are consecutive in replicas.

If there are N replicas denoted {0, ... N-1} and the group size is k, then there are N/k groups of size k as {0, 1, ... k-1}, {k, ... 2k-1} ... {N-k, ... N-1}.

enumerator Orthogonal

Groups are sliced orthogonal to the replica ordering.

If there are N replicas denoted {0, ... N-1} and the group size is k, then there are m = N/k groups of size k as {0, m, 2m, ...}, {1, m+1, 2m+1, ...} ... {m-1, 2m-1, ... N-1}.

enumerator None

Each replica is in its own group; the replica group size is ignored.

enumerator N

Number of values.

14.2. Data input and output (IStepIO)

#include <popart/istepio.hpp>
class IStepIO

An abstract base class through which input and output data is passed to a Session (see Session::run).

Data is passed via buffers. In the case of buffers returned by IStepIO::in, PopART reads from these buffers. In the case of IStepIO::out, PopART writes to these buffers. The IStepIO::inComplete() and IStepIO::outComplete() functions are called by PopART to signal it is done with an input or output buffer.

An IStepIO implementation should conceptually implement a rolling queue of active buffers for each input and output tensor. Every successful call to IStepIO::in should yield a new data buffer for PopART to read from and add it to the head of the conceptual queue. Conversely, every call to IStepIO::inComplete() should be taken to mean that the buffer at the tail-end of the queue is no longer being used by PopART. This buffer is removed from the conceptual queue.

Note that an IStepIO::in call with the prefetch flag set is only considered successful when it returns data.

Output works analogously to input.

The expected total number of input (or output) buffers that are ‘completed’ for a tensor in one Session::run call is bps \(\times\) SessionOptions::accumulationFactor \(\times\) SessionOptions::replicatedGraphCount, where bps is the number of batches per call to Session::run (this is a value captured by the DataFlow instance passed to the Session instance).

Note, however, that there may be additional ‘incomplete’ calls to IStepIO::in and IStepIO::out.

Furthermore, the number of input (or output) buffers that may be ‘incomplete’ at a given time for a given tensor should not normally be more than SessionOptions::bufferingDepth \(\times\) SessionOptions::replicatedGraphCount, but this bound is not guaranteed.

EXAMPLE: Suppose a session is configured such that the total expected number of input buffers is 6 and these are input buffers for a tensor with ID t with 100 elements. The associated input calls in IStepIO may look like this if SessionOptions::bufferingDepth is 3:

in("t", 100, false) -> Give buffer[0] to PopART.
in("t", 100, true) -> Give buffer[1] to PopART.
in("t", 100, true) -> Give buffer[2] to PopART.
inComplete("t", 100) -> buffer[0] is no longer required and can be reused.
in("t", 100, true) -> Give buffer[3] to PopART.
inComplete("t", 100) -> buffer[1] is no longer required and can be reused.
in("t", 100, true) -> Give buffer[4] to PopART.
inComplete("t", 100) -> buffer[2] is no longer required and can be reused.
in("t", 100, true) -> Give buffer[5] to PopART.
inComplete("t", 100) -> buffer[3] is no longer required and can be reused.
in("t", 100, true) -> No data available, return nullptr.
inComplete("t", 100) -> buffer[4] is no longer required and can be reused.
inComplete("t", 100) -> buffer[5] is no longer required and can be reused.

Subclassed by popart::StepIOCallback, popart::StepIOGeneric< ARRAY_TYPE, ACCESSOR_TYPE, ArrayInfoT >, popart::StepIOGeneric< IArray, StepIONS::IArrayAccessor, IArray & >

Public Functions

virtual ~IStepIO() = default

Destructor for IStepIO.

virtual ConstVoidData in(TensorId id, int64_t numElements, bool prefetch, const bool isBroadcast = false) = 0

Request a new input data buffer.

The memory in this buffer is available for use in PopART until the corresponding inComplete() call.

Note

Failing to provide a valid data buffer will result in a runtime failure if prefetch is set to false.

Parameters
  • id – The ID of the tensor to return data for.

  • numElements – The number of elements in the tensor.

  • prefetch – If set to true the inability to provide data is not considered an error. If false, it is considered an error if no data can be provided.

Returns

The input buffer for this tensor (or nullptr on failure) returned as a ConstVoidData object.

virtual void inComplete(TensorId id, int64_t numElements, const bool isBroadcast = false) = 0

Notify the user (running a PopART program) that a previously retrieved input data buffer is no longer used by PopART.

Parameters
  • id – The ID of the tensor to return data for.

  • numElements – The number of elements in the tensor.

virtual MutableVoidData out(TensorId id, int64_t numElements) = 0

Request a new output data buffer.

The memory in this buffer is available for use in PopART until the corresponding outComplete() call and will be modified in-place.

Note

Failing to provide a valid data buffer will result in a runtime failure.

Parameters
  • id – The ID of the tensor to return data for.

  • numElements – The number of elements in the tensor.

Returns

The output buffer for this tensor returned as a MutableVoidData object.

inline virtual void outComplete(TensorId)

Notify the user (running a PopART program) that a previously retrieved output data buffer is no longer used by PopART.

Parameters
  • id – The ID of the tensor the output buffer was retrieved for.

inline void enableRuntimeAsserts(bool b)

Enable or disable runtime asserts.

If runtime asserts are enabled, then a check that the input and output buffers have the correct number of elements is performed. As Session.run() is called multiple times during a user’s session, the check is only performed in the first call to Session.run(), under the assumption that the user is unlikely to change the size of buffers between runs.

Parameters

b – The setting to enable runtime asserts (true) or disable runtime asserts (false).

inline bool runtimeAssertsEnabled() const

Check if runtime asserts are enabled.

Returns

true if runtime asserts are enabled, otherwise false.

virtual void assertNumElements(const popx::Executablex&) const = 0

Check number of elements.

This check is performed when runtimeAssertsEnabled() is true.

Parameters

Executablex – The executable used to check that the input and output buffers have the correct number of elements.

#include <popart/stepio.hpp>
class StepIO : public popart::StepIOGeneric<IArray, StepIONS::IArrayAccessor, IArray&>

Class to provide a Session object with input and output data.

Public Functions

inline StepIO(std::map<TensorId, IArray&> inputs, std::map<TensorId, IArray&> outputs)

Constructor for StepIO.

Parameters
  • inputs – The input data.

  • outputs – The output data.
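For example, a sketch of constructing a StepIO from user-owned buffers (popart::NDArrayWrapper from popart/ndarraywrapper.hpp is one IArray implementation; the tensor IDs and shapes are hypothetical):

std::vector<float> inData(4), outData(2);
popart::NDArrayWrapper<float> inWrap(inData.data(), {1, 4});
popart::NDArrayWrapper<float> outWrap(outData.data(), {1, 2});

std::map<popart::TensorId, popart::IArray&> inputs = {{"input", inWrap}};
std::map<popart::TensorId, popart::IArray&> anchors = {{"output", outWrap}};

popart::StepIO stepio(inputs, anchors);
// session->run(stepio);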

class StepIOCallback : public popart::IStepIO

Class that implements the IStepIO interface using user-provided callback functions.

The IStepIO interface contains a number of pure virtual member functions through which PopART receives buffers to read data from and buffers to write data to. StepIOCallback inherits from IStepIO and implements those member functions by delegating the logic to the callback functions passed in the constructor. This gives the user full control as to how data buffers are provisioned.

See IStepIO for more details on the expected behaviour of the callbacks.

Public Types

using InputCallback = std::function<ConstVoidData(TensorId, bool)>

Callable object that implements IStepIO::in().

using InputCompleteCallback = std::function<void(TensorId)>

Callable object that implements IStepIO::inComplete().

using OutputCallback = std::function<MutableVoidData(TensorId)>

Callable object that implements IStepIO::out().

using OutputCompleteCallback = std::function<void(TensorId)>

Callable object that implements IStepIO::outComplete().

Public Functions

inline StepIOCallback(InputCallback inputCallback, InputCompleteCallback inputCompleteCallback, OutputCallback outputCallback, OutputCompleteCallback outputCompleteCallback)

Construct a StepIOCallback object.

Parameters
  • inputCallback – The callback function that implements IStepIO::in().

  • inputCompleteCallback – The callback function that implements IStepIO::inComplete().

  • outputCallback – The callback function that implements IStepIO::out().

  • outputCompleteCallback – The callback function that implements IStepIO::outComplete().
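For example, a sketch of wiring the four callbacks to user-managed buffers (the buffers, shape and tensor handling are hypothetical; see IStepIO for the required queue semantics):

std::vector<float> inBuf(4), outBuf(4);
popart::TensorInfo info(popart::DataType::FLOAT, popart::Shape{4});

popart::StepIOCallback::InputCallback inputCb =
    [&](popart::TensorId, bool prefetch) -> popart::ConstVoidData {
      // Returning valid data; when prefetch is true, returning an empty
      // ConstVoidData (no data) is also legal.
      popart::ConstVoidData d;
      d.data = inBuf.data();
      d.info = info;
      return d;
    };
popart::StepIOCallback::InputCompleteCallback inputCompleteCb =
    [](popart::TensorId) { /* inBuf may now be reused */ };
popart::StepIOCallback::OutputCallback outputCb =
    [&](popart::TensorId) -> popart::MutableVoidData {
      popart::MutableVoidData d;
      d.data = outBuf.data();
      d.info = info;
      return d;
    };
popart::StepIOCallback::OutputCompleteCallback outputCompleteCb =
    [](popart::TensorId) { /* outBuf now holds the output */ };

popart::StepIOCallback stepio(inputCb, inputCompleteCb,
                              outputCb, outputCompleteCb);
// session->run(stepio);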
inline virtual void assertNumElements(const popx::Executablex&) const

Check number of elements.

This check is performed when IStepIO::runtimeAssertsEnabled() is true.

Parameters

Executablex – The executable used to check that the input and output buffers have the correct number of elements.

virtual ConstVoidData in(TensorId id, int64_t numElements, bool prefetch, bool) final

This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the inputCallback parameter passed to the constructor.

This function should not be called directly.

virtual void inComplete(TensorId id, int64_t numElements, bool) final

This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the inputCompleteCallback parameter passed to the constructor.

This function should not be called directly.

virtual MutableVoidData out(TensorId id, int64_t numElements) final

This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the outputCallback parameter passed to the constructor.

This function should not be called directly.

virtual void outComplete(TensorId id) final

This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the outputCompleteCallback parameter passed to the constructor.

This function should not be called directly.

class IWeightsIO

A virtual class for accessing pointers to the data required to perform a training step.

Subclassed by popart::WeightsIO

Public Functions

virtual ~IWeightsIO() = default

Destructor for IWeightsIO.

virtual bool contains(TensorId) const = 0

Check if the WeightsIO instance contains the weights for a specific tensor.

Parameters

TensorId – The ID of the tensor to look for weights for.

Returns

true if the WeightsIO instance contains weights for the tensor, false otherwise.

virtual MutableVoidData weight(TensorId) const = 0

Retrieve weights for a specific tensor.

Parameters

TensorId – The ID of the tensor to retrieve weights for.

Returns

The weights.

class WeightsIO : public popart::IWeightsIO

Class representing weights.

Public Functions

~WeightsIO() override = default

Destructor for WeightsIO.

virtual bool contains(TensorId) const final

Check if the WeightsIO instance contains the weights for a specific tensor.

Parameters

TensorId – The ID of the tensor to look for weights for.

Returns

true if the WeightsIO instance contains weights for the tensor, false otherwise.

virtual MutableVoidData weight(TensorId) const final

Retrieve weights for a specific tensor from the WeightsIO object.

Parameters

TensorId – The ID of the tensor to retrieve weights for.

Returns

The weights.

void insert(TensorId, MutableVoidData)

Insert weights for a specific tensor into the WeightsIO object.

Parameters
  • TensorId – The ID of the tensor to insert weights for.

  • MutableVoidData – The weights to insert.
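For example, a sketch of reading weights back from a session (the tensor ID and TensorInfo are hypothetical; weightsToHost() and readWeights() are Session methods):

popart::TensorInfo wInfo(popart::DataType::FLOAT, popart::Shape{2, 2});
std::vector<float> wHost(wInfo.nelms());

popart::MutableVoidData wData;
wData.data = wHost.data();
wData.info = wInfo;

popart::WeightsIO weightsRead;
weightsRead.insert("w0", wData);

// session->weightsToHost();
// session->readWeights(weightsRead); // wHost now holds the weights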

struct IArrayAccessor

Structure to help with accessing the data in IArray objects.

Public Static Functions

static inline void *getDataPointer(IArray &array)

Get pointer to the data.

Parameters

array – The IArray object.

Returns

A pointer to the data contained in the IArray object.

static inline size_t getArraySize(const IArray &array)

Get the number of data elements.

Parameters

array – The IArray object.

Returns

The number of data elements.

static inline DataType getArrayDataType(IArray &array)

Get the data type of the data.

Parameters

array – The IArray object.

Returns

The data type of the data.

static inline size_t getArrayRank(IArray &array)

Get the rank of the data array.

Parameters

array – The IArray object.

Returns

The rank of the data array.

static inline int64_t getArrayDim(IArray &array, size_t index)

Get the size of the data at a specific location.

Parameters
  • array – The IArray object.

  • index – The index of the data element in the IArray object.

Returns

The size of the data at the specific location.

#include <popart/stepio_generic.hpp>
template<typename ARRAY_TYPE, typename ACCESSOR_TYPE, typename ArrayInfoT>
class StepIOGeneric : public popart::IStepIO

Subclassed by popart::StepIO

Public Functions

inline void assertNumElements(const popx::Executablex &exe) const final
inline TensorInfo getTensorInfo(ARRAY_TYPE &array) const
template<typename T>
inline T get(TensorId id, std::map<TensorId, ArrayInfo> &M, int64_t numElements, bool advance_, std::string mapName)
template<typename T>
inline void advance(TensorId id, std::map<TensorId, ArrayInfo> &M, int64_t numElements, std::string mapName)
inline ConstVoidData in(TensorId id, int64_t numElements, bool, bool) final
inline void inComplete(TensorId id, int64_t numElements, bool) final
inline MutableVoidData out(TensorId id, int64_t numElements) final
struct ArrayInfo

Public Members

ArrayInfoT array
int64_t offset
#include <popart/iarray.hpp>
class IArray

Subclassed by popart::NDArrayWrapper< T >

Public Functions

inline virtual ~IArray()
virtual void *data() = 0
virtual DataType dataType() const = 0
virtual std::size_t rank() const = 0
virtual int64_t dim(size_t index) const = 0
virtual std::size_t nelms() const = 0
virtual const Shape shape() const = 0

14.3. Tensors

#include <popart/tensor.hpp>
class Tensor : public popart::Vertex

Public Functions

Tensor(TensorId, TensorType, Graph&, const DebugContext& = {})
Tensor(TensorId, VariableSettings, Graph&, const DebugContext& = {})
Tensor(TensorId, TensorType, VariableSettings, Graph&, const DebugContext& = {})
inline std::string str() const final
virtual std::unique_ptr<Tensor> clone(Graph &graph_) const
TensorType tensorType() const
std::string tensor_type() const
void setTensorType(TensorType)
inline ReplicatedStreamMode getReplicatedStreamMode() const
inline void setReplicatedStreamMode(const ReplicatedStreamMode &mode)
void setTensorLocationInfo(TensorLocation&, std::pair<RemoteBufferId, RemoteBufferIndex> &remoteBufferInfo)
std::set<PipelineStage> getPipelineStages() const
Op *getProducerUnsafe() const
Op *getProducer() const
void setProducer(Op*)
void resetProducer(Op*)
bool hasProducer() const
bool isGraphInput() const
InIndex getGraphInputIndex() const
bool isGraphOutput() const
OutIndex getGraphOutputIndex() const
bool isLoopInput() const
bool isImplicitLoopInput() const
bool isExplicitLoopInput() const
bool isLoopTripCounter() const
bool isUnmodifiable() const
bool isCheckpointTensor() const
bool isImplicitRecomputeTensor() const
bool isRestoreInplaceTensor() const
bool idIncludesPrefix(const std::vector<std::string>&) const
bool isOptimizerTensor() const
bool isRemoteArgTensor() const
bool isRandomSeedTensor() const
bool isOptimizerStateTensor() const
bool isAccumulatorTensor() const
bool isHostLoadTensor() const

Is this tensor produced by a HostLoad Op or MultiExchangeOp with HostLoad descriptor?

Returns

true if the producer is a HostLoad Op or MultiExchangeOp with HostLoad descriptor, false otherwise.

bool isWeightTensor() const
bool isAnchored() const
bool isRootAnchor() const
bool hasTensorData() const
TensorData *tensorData()
const TensorData *tensorData() const
bool anyAlias(std::function<bool(Tensor*)> predicate) const
bool anyAliasFor(std::function<bool(Tensor*)> predicate, const AliasModel &popMem) const
void setTensorDataFromCopyOf(const void *src, std::size_t size)
void setTensorDataFromViewOf(void *src, std::size_t size)
void setTensorDataByEmplaceOf(std::vector<char> &&data)
void setTensorData(const TensorData &td)
void setTensorData(TensorData &&td)
std::vector<Op*> associatedOps() const
inline Graph &getGraph()
inline const Graph &getGraph() const
Ir &getIr()
const Ir &getIr() const
bool hasVirtualGraphId() const
VGraphId getVirtualGraphId() const
VGraphId getVirtualGraphIdUnsafe() const
VGraphIdAndTileSet getVirtualGraphIdAndTileSet(std::set<OpId> &visited) const
VGraphIdAndTileSet getVirtualGraphIdAndTileSetUnsafe() const
VGraphIdAndTileSet getVirtualGraphIdAndTileSetUnsafe(std::set<OpId> &visited) const
int getBatchAxis() const
bool consumersAllPreLoss() const
bool isModified(bool considerLoopInput = true) const

Check if any of the consumers modify this tensor.

Parameters

considerLoopInput – If explicit loop inputs should be considered as being modified. If false, only operations modifying the tensor inplace will be considered.

Returns

True if the tensor is modified, otherwise false.

bool isAliased() const

Check if any of the consumers alias this tensor.

Returns

True if the tensor is aliased to any output, otherwise false.

view::Regions modifiedRegionsByOps(std::vector<Op*> ops, Aliases &aliases) const
view::Regions modifiedRegionsByOps(std::vector<OpId> opIds, Aliases &aliases) const
std::set<Op*, POpCmp> getInplaceModifiers() const

Find operations that modify a tensor.

Returns

All operations that (directly and indirectly) modify this tensor.

std::set<Op*, POpCmp> getInplaceModifiersFor(const AliasModel *popMem) const

Find operations that modify a tensor with the given poprithm graph.

Returns

All operations that (directly and indirectly) modify this tensor.

std::vector<char> getDataViaGraphTraversal() const
inline const popart::DebugInfo &getDebugInfo() const
inline void setVariableUpdateType(VariableUpdateType type)

Members of the old subclass VariableTensor (previously declared as class VariableTensor : public Tensor).

inline VariableUpdateType getVariableUpdateType() const
inline void setCopyFromTensor(TensorId value)
inline TensorId getCopyFromTensor()
inline VariableSettings getVariableSettings() const
Returns

The VariableSettings of this Variable

std::vector<int64_t> returnedShape(unsigned replicationFactor)

Returns the shape necessitated by IO.

Parameters

replicationFactor – The replication factor

Returns

the shape of the tensor, considering replica groups

void verifyMutableVoidInfo(const TensorInfo mutableVoidInfo, unsigned replicationFactor)

Check that the info of a mutableVoidData object matches the expectations set by the TensorInfo and VariableSettings.

Throws an error if there is a mismatch.

Parameters
  • mutableVoidInfo – The info of the MutableVoidData object with the same id as this tensor

  • replicationFactor – The replicationFactor of this instance

void setPreparedVGraphIdAndTileSet()

Set the preparedVGraphIdAndTileSet.

Public Members

TensorId id
Consumers consumers
TensorInfo info
TensorLocationInfo tensorLocationInfo
InputSettings inputSettings
enum class popart::TensorType

Values:

enumerator ActGrad = 0
enumerator Const
enumerator Stream
enumerator Unknown
enumerator Variable
enumerator N
enum class popart::VariableUpdateType

Values:

enumerator None = 0
enumerator Gradient
enumerator Copy
#include <popart/tensorinfo.hpp>
enum class popart::DataType

There is a one-to-one correspondence between popart::DataTypes and ONNX_NAMESPACE::TensorProto_DataTypes, which is equivalent to decltype(ONNX_NAMESPACE::TensorProto().data_type()).

Values:

enumerator UINT8 = 0
enumerator INT8
enumerator FLOAT8_143
enumerator FLOAT8_152
enumerator UINT16
enumerator INT16
enumerator INT32
enumerator INT64
enumerator UINT32
enumerator UINT64
enumerator BOOL
enumerator FLOAT
enumerator FLOAT16
enumerator BFLOAT16
enumerator DOUBLE
enumerator COMPLEX64
enumerator COMPLEX128
enumerator STRING
enumerator UNDEFINED
class DataTypeInfo

Public Functions

DataTypeInfo(DataType type__, int nbytes__, bool isFixedPoint__, std::string name__, std::string lcasename__)
DataType type() const
const int &nbytes() const
const std::string &name() const
const std::string &lcasename() const
bool isFixedPoint() const
class TensorInfo

Public Functions

TensorInfo(DataType, const Shape&)

Create TensorInformation based on data type and shape.

Parameters
  • data_type – The data type.

  • shape – The actual shape of the tensor.

TensorInfo(DataType data_type, const Shape &shape, const Shape &meta_shape)

Create TensorInformation based on data type, shape and meta shape.

Parameters
  • data_type – The data type.

  • shape – The actual shape of the tensor.

  • meta_shape – The meta shape of the tensor, which can for example be used to store the original tensor shape before replicated tensor sharding was applied.
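For example, a sketch of constructing and querying a TensorInfo:

popart::TensorInfo info(popart::DataType::FLOAT, popart::Shape{2, 3});
// info.rank()   == 2
// info.nelms()  == 6
// info.nbytes() == 24 (6 elements of 4 bytes each)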

TensorInfo(std::string data_type, std::string shape)
TensorInfo(std::string data_type, const Shape&)
explicit TensorInfo(const ONNX_NAMESPACE::TensorProto&)
explicit TensorInfo(const ONNX_NAMESPACE::TypeProto&)
void set(const ONNX_NAMESPACE::TensorProto&)
void set(const ONNX_NAMESPACE::TypeProto&)
TensorInfo() = default
void set(DataType)
void set(DataType, const Shape&)
void set(DataType, const Shape&, const Shape&)
const Shape &shape() const
const Shape &metaShape() const
std::vector<size_t> shape_szt() const
inline Rank rank() const
inline int64_t nelms() const
int64_t nbytes() const
inline int64_t dim(int i) const
inline std::vector<int> strides(const std::vector<long> &shape)

Get the strides of the tensor, that is the number of bytes to step in each dimension when traversing an array in memory.

See https://numpy.org/doc/stable/reference/generated/numpy.ndarray.strides.html

Parameters

shape – The on-host ONNX shape of a tensor. This is different from this->shape(), which gives the on-replica shape of a tensor

Returns

std::vector<int> The strides vector.

DataType dataType() const
const std::string &data_type() const
const std::string &data_type_lcase() const
void append(std::ostream&) const
bool isSet() const
bool operator==(const TensorInfo&) const
bool operator!=(const TensorInfo&) const
Shape shapeFromString(const std::string &s) const
ONNX_NAMESPACE::TypeProto getOnnxTypeProto() const
const DataTypeInfo *getDataTypeInfo() const

Public Static Functions

static std::string npOutDataTypeExceptionMessage(const TensorInfo &i0, const TensorInfo &i1, const std::string &debugName)
#include <popart/tensorindex.hpp>
class TensorIndexMap

Public Functions

TensorIndexMap() = default
~TensorIndexMap()
void insert(int, Tensor*)
void reset(int, Tensor*)
void erase(int)
void clear()
bool contains(Tensor*) const
Tensor *tensor(int)
const Tensor *tensor(int) const
TensorId id(int) const
bool hasIndex(int) const
const std::vector<int> &indices(Tensor*) const
const std::map<Tensor*, std::vector<int>, PTensorCmp> &indicesMap() const
const std::map<int, Tensor*> &tensorMap() const
const std::vector<Tensor*> tensors() const
std::map<int, TensorId> tensorIdMap() const
std::map<TensorId, int> idMap() const
int n() const
void append(std::stringstream&, std::string prefix, int max_id_length) const
void setInfoIfIndex(const TensorInfo&, int index)
std::vector<TensorId> getSerialised() const
int maxIdLength() const
std::map<int, Shape> getIndexShapeMap()
int minIndex() const
int maxIndex() const
#include <popart/tensorlocation.hpp>
enum class popart::ReplicatedTensorSharding

Enum type to specify whether to shard tensors over replicas.

Values:

enumerator Off = 0

Don’t shard tensors over replicas.

enumerator On = 1

Do shard tensors over replicas.

enumerator N = 2

Number of values.

class TensorLocation

Class that describes the memory characteristics of one or multiple tensors.

See also: SessionOptions.

Public Functions

TensorLocation()

Equivalent to calling TensorLocation(TensorStorage::Undefined, TileSet::Compute, TileSet::Compute, ReplicatedTensorSharding::Off)

TensorLocation(TensorStorage storage)

Equivalent to calling TensorLocation(storage, TileSet::Compute, TileSet::Compute, ReplicatedTensorSharding::Off)

TensorLocation(TensorStorage storage, ReplicatedTensorSharding replicatedTensorSharding)

Equivalent to calling TensorLocation(storage, TileSet::Compute, TileSet::Compute, replicatedTensorSharding)

TensorLocation(TensorStorage storage, ReplicatedTensorSharding replicatedTensorSharding, CommGroup shardingDomain)

Equivalent to calling TensorLocation(storage, TileSet::Compute, TileSet::Compute, replicatedTensorSharding, shardingDomain)

TensorLocation(TensorStorage storage, TileSet loadTileSet, TileSet storageTileSet, ReplicatedTensorSharding replicatedTensorSharding)

Construct a TensorLocation from parameters.

Parameters
  • storage – The memory location of the tensor(s).

  • loadTileSet – The tiles through which the tensor(s) are loaded onto the chip.

  • storageTileSet – The tiles on which the tensor(s) are stored.

  • replicatedTensorSharding – Whether to apply replicated tensor sharding.

TensorLocation(TensorStorage storage, TileSet loadTileSet, TileSet storageTileSet, ReplicatedTensorSharding replicatedTensorSharding, CommGroup shardingDomain)

Construct a TensorLocation from parameters.

Parameters
  • storage – The memory location of the tensor(s).

  • loadTileSet – The tiles through which the tensor(s) are loaded onto the chip.

  • storageTileSet – The tiles on which the tensor(s) are stored.

  • replicatedTensorSharding – Whether to apply replicated tensor sharding.

  • shardingDomain – GCL communication group across which to shard the tensor. Perpendicular replicas will not shard, and reduce gradients normally (via AllReduce). Defaults to sharding across all replicas.

TensorLocation(std::vector<int64_t> serialized)
bool operator==(const TensorLocation &rhs) const
bool operator!=(const TensorLocation &rhs) const
std::vector<int64_t> serialize() const
bool isRemote() const

Public Members

TensorStorage storage

The memory location of the tensor(s).

TileSet loadTileSet

The tiles through which the tensor(s) are loaded onto the chip.

TileSet storageTileSet

The tiles on which the tensor(s) are stored.

ReplicatedTensorSharding replicatedTensorSharding

Whether to apply replicated tensor sharding (RTS) or not.

CommGroup shardingDomain

The GCL comm groups across which to shard the tensor.

enum class popart::TensorStorage

Enum type that determines where a tensor is stored.

Values:

enumerator OnChip = 0

Store the tensor in on-chip memory.

enumerator OffChip = 1

Store the tensor in streaming memory.

enumerator N = 2

Number of values.

enum class popart::TileSet

Enum type to specify a set of tiles.

Values:

enumerator Compute = 0

The set of tiles designated for compute operations.

enumerator IO = 1

The set of tiles designated for IO operations.

enumerator Undefined = 2

Undefined (no) tile set.

enumerator N = 3

Number of values.

14.4. Optimizers

#include <popart/optimizer.hpp>
class Optimizer

Interface for describing an Optimizer and, internally, how to grow the optimiser step for each weight.

  • The end-user facing interface constructed by the user to describe what kind of optimiser to use.

  • Then also used internally by the Ir to grow the optimiser step for each weight.

  • Stores OptimizerValues for optimizer parameters like learning rate, loss scaling, etc.

    See also

    OptimiserValue.

  • Optimizer stores the values for each weight - they can have different values. There is a “default” for all weights, then you can specify specific values for specific weights. This is encapsulated by an OptimizerValueMap, which is a sparse map from weight to value, with unspecified values implying the default.

    See also

    OptimizerValueMap.

  • At runtime, the user can dynamically update the Optimizer, e.g. by setting new OptimizerValues. validReplacement determines whether the new Optimizer is interchangeable with the one the Ir was built for. For example, trying to replace an SGD Optimizer with an Adam Optimizer would throw.

Subclassed by popart::Adam, popart::Adaptive, popart::SGD

Public Functions

virtual ~Optimizer() = default

  • Optimizer class has a two-part initialisation: the constructor, used by the end-user, and setFactorsFromOptions, called by the Ir to finish initialisation once all the relevant information is available during Ir preparation.

  • Some key methods used by the Ir to grow optimiser step for each weight are createOp, getInputIds, optimizerInputs.

  • If the OptimizerValue is const, no Ir tensor for that value is created and the VarUpdateOp created for that weight will not have the optional input for that tensor. The Opx of the VarUpdateOp will emit poplar code that uses the provided value directly.

    If the OptimizerValue is not const, an Ir tensor for that value is created and the VarUpdateOp created for that weight will have the optional input for that tensor. The tensor will be a stream tensor, so that it can be updated later from the host. The tensor will be streamed an initial value equal to the OptimizerValue’s value.

  • It is common for Optimizer implementations to make use of “compound scalars”. Take for example the SGD0 weight update equation:

    w <- w * (1 - lr * (1 - dm) * wd) - g * (lr * (1 - dm) / ls)

    where w is the weights and g is the grads. lr, dm, wd and ls are all the “atomic scalars”. These are the scalars/hyperparameters of the Optimizer that the user can set using OptimizerValues, as described above.

    Multiple atomic scalars appear in expressions together, and will be operated on together before being used by an Op that also consumes a tensor (in this case the weights or grads). For SGD0, they can be grouped as follows:

    w <- w * {1 -  lr * (1 - dm) * wd} -  g * { lr * (1 - dm) / ls }
             ^^^^^^^^^^^^^^^^^^^^^^^^^        ~~~~~~~~~~~~~~~~~~~~~~
                        |                               |
       weight decay scale factor 0                      |
                                               scaled learning rate 0
    

    We call wdsf0 and slr0 the “compound scalars”.

    We can statically precompute the OptimizerValues for these compound scalars using the OptimizerValues of the atomic scalars. This makes the Ir simpler, as we now have only:

    w <- w * wdsf0 - g * slr0
    

    The CompoundScalarHelpers are used to precompute the compound scalar values.

    If any of the composite atomic scalars are non-const, the compound scalar is non-const.

    See also

    compoundscalarhelper.hpp

Optimizer(OptimizerValue lossScaling, const std::vector<ClipNormSettings> &clipNormSettings, const DebugContext &debugContext)
Optimizer(const Optimizer&) = default
virtual void validReplacement(const Optimizer &other) const
virtual OptimizerType type() const = 0
virtual std::string type_s() const = 0
virtual std::unique_ptr<Optimizer> clone() const = 0
virtual void resetTensorData(Tensor&) const = 0
virtual void setTensorData(Tensor&) const = 0
virtual std::unique_ptr<Op> createOp(const Tensor &weight, Graph&) const = 0
virtual std::vector<TensorId> getInputIds(const Tensor &weight) const = 0

Returns the TensorIds of the input tensors to the VarUpdateOp this optimiser will create for the given weight.

Specifically, the TensorId at index i will be the id of the input tensor at InIndex i of the VarUpdateOp. If the input is an OptimizerValue and it is const, then “” will be returned; otherwise the relevant reserved prefix for that OptimizerValue will be used, followed by the weight id. The prefixes are defined in tensornames.hpp, for example reservedDefaultWeightDecayScaleFactor0Prefix or reservedSpecificScaledLearningRate1Prefix (note there are different prefixes depending on whether the weight has a specific or default value for that OptimizerValue).

virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const = 0
inline const OptimizerValue &lossScaling() const
inline float getLossScalingVal() const
float getFinalLossScalingVal() const
virtual TensorId getInverseLossScalingTensorId(const Tensor &weight) const = 0
virtual void setFactorsFromOptions(const SessionOptions&)
bool gradientAccumulationEnabled() const
bool meanReductionEnabled() const
bool postMeanAccumulationEnabled() const
bool postMeanReplicationEnabled() const
int64_t getReplicatedGraphCount() const
int64_t getAccumulationFactor() const
bool meanGradientAccumulationEnabled() const
inline const std::vector<ClipNormSettings> &getClipNormSettings() const
virtual bool hasSpecific(const Tensor &w) const = 0
virtual bool hasSpecific() const = 0
virtual size_t hash() const
inline DebugContext getDebugContext() const

Public Static Functions

static TensorId getLossScalingTensorId(DataType)
enum class popart::OptimizerType

Types of optimizers.

Values:

enumerator SGD = 0
enumerator Adam
enumerator Adaptive
enumerator NTYPES
enum class popart::OptimizerReductionType

Reduction mode when doing data-parallel training over replicated graphs.

Depending on the optimizer used and its configuration, this option describes how the reduction of gradients over replicas will occur. For example, directly on the gradient, on the gradient accumulator, or on the momentum. See the documentation of individual optimizers for more information.

Values:

enumerator None = 0

No replicated graph reduction.

enumerator GradReduce

Gradient reduction (every iteration, after a weight’s gradient is produced)

enumerator AcclReduce

Momentum reduction (SGD1, after the gradient accumulation loop, if applicable)

enumerator AccumReduce

Accumulator reduction (Adam/SGD2 + gradient accumulation, after the gradient accumulation loop)

enum class popart::WeightDecayMode

Values:

enumerator Decay

Weight decay (e.g. AdamW)

enumerator L2Regularization

L2 regularization (e.g. PyTorch-like Adam)

#include <popart/optimizervalue.hpp>
class OptimizerValue

A class used to represent values of hyper parameters.

Public Functions

OptimizerValue() = default

Equivalent to OptimizerValue(0, false).

inline OptimizerValue(float v)

Equivalent to OptimizerValue(v, true).

inline OptimizerValue(float v, bool c)

Constructor.

Parameters
  • v – The current value of the hyper parameter.

  • c – A boolean flag to indicate whether the parameter will remain at this value forever (true) or may change over time (false).

inline OptimizerValue(std::pair<float, bool> x)
inline float val() const
inline bool isConst() const
void validReplacement(const OptimizerValue &rhs) const
bool operator==(const OptimizerValue &rhs) const
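
For example, a minimal usage sketch of the two constructor forms:

// A constant hyper parameter: fixed for the lifetime of the session.
popart::OptimizerValue constMomentum(0.9f);      // same as {0.9f, true}

// A non-const hyper parameter that the user may update later.
popart::OptimizerValue mutableLr(0.02f, false);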
#include <popart/optimizervaluemap.hpp>
class OptimizerValueMap

Public Functions

inline OptimizerValueMap(OptimizerValue g)
OptimizerValue get(const TensorId &id) const
void insertSpecific(const TensorId&, OptimizerValue)
inline bool hasSpecific(const TensorId &id) const
inline bool hasSpecific() const
inline OptimizerValue getDefault() const
void validReplacement(const OptimizerValueMap &rhs) const
inline const std::map<TensorId, OptimizerValue> &getSpecifics() const
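
For example, a minimal usage sketch (the weight id is hypothetical):

// Default learning rate for all weights, non-const so it can change.
popart::OptimizerValueMap lrs(popart::OptimizerValue(0.1f, false));

// Override the learning rate for one specific weight.
lrs.insertSpecific("model/weight0", popart::OptimizerValue(0.01f, true));

popart::OptimizerValue v = lrs.get("model/weight0"); // the specific value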

14.4.1. Stochastic Gradient Descent (SGD)

#include <popart/clipnormsettings.hpp>
class ClipNormSettings

A data structure used to represent a maximum value constraint on one or more weights.

This is passed to the optimizer on construction.

Public Types

enum class Mode

Values:

enumerator ClipSpecifiedWeights
enumerator ClipAllWeights

Public Functions

ClipNormSettings(const std::vector<TensorId> &weightIds_, float maxNorm_)

DEPRECATED: This will be removed in a future release.

Constructor.

Parameters
  • weightIds_ – The weight tensor IDs that this constraint applies to.

  • maxNorm_ – The maximum permissible value.

const std::vector<TensorId> &getWeightIds() const
float getMaxNorm() const
Mode getMode() const
bool operator==(const ClipNormSettings&) const
bool operator!=(const ClipNormSettings &other) const

Public Members

std::vector<TensorId> weightIds
float maxNorm

Public Static Functions

static ClipNormSettings clipWeights(const std::vector<TensorId> &weightIds_, float maxNorm_)
static ClipNormSettings clipAllWeights(float maxNorm_)
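
For example, a minimal sketch using the non-deprecated static constructors (the weight ids are hypothetical):

// Clip the combined gradient norm of all weights to at most 1.0.
auto clipAll = popart::ClipNormSettings::clipAllWeights(1.0f);

// Or clip only a named group of weights to at most 0.5.
auto clipSome = popart::ClipNormSettings::clipWeights({"w0", "w1"}, 0.5f);

// Passed to an optimizer on construction.
popart::SGD opt({{"defaultLearningRate", {0.02f, false}}}, {clipAll});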
#include <popart/sgd.hpp>
class SGD : public popart::Optimizer

Stochastic Gradient Descent (SGD) optimizer.

As with any optimizer implementation, this class is responsible for updating each weight tensor ( \(w\)) in the model using the gradient ( \(g\)) of the loss function with respect to the weight, as calculated during the backwards pass.

The SGD optimizer has the following state for each weight:

  • velocity ( \(v\))

The SGD optimizer has the following hyper parameters:

  • learning rate ( \(\text{lr}\))

  • momentum ( \(\text{mm}\))

  • weight decay ( \(\text{wd}\))

  • dampening ( \(\text{dm}\))

  • velocity scaling ( \(\text{vs}\))

  • loss scaling ( \(\text{ls}\))

  • nesterov

  • clip norm settings

The values of these parameters can be shared between all weights but some can be overridden with weight-specific values (see SGD::insertSpecific). Hyper parameters are captured using OptimizerValue objects and therefore can be either a constant value or a non-constant value that can be adjusted by the user.

In the following we will describe how this optimizer updates a weight using a gradient. In the context of this description, the gradient is the value of the gradient after any gradient accumulation has been performed and after the application of a loss scaling factor to the gradient has been corrected for.

When the optimizer needs to update a weight, \(w\), using a gradient, \(g\), it first updates the optimizer state as follows:

\[ v' := v * \text{mm} + (1 - \text{dm}) * (g + \text{wd} * w) \text{ \ . } \]

Following the update of the optimizer state the optimizer uses said state to update the weight:

If nesterov is enabled:

\[ g' := g + \text{wd} * w + \text{mm} * v' \text{ \ . } \]
\[ w' := w - \text{lr} * g' \text{ \ . } \]

Otherwise:

\[ w' := w - \text{lr} * v' \text{ \ . } \]

In addition to the above, the velocity scaling hyper parameter is a scaling factor that can provide improved numerical stability by ensuring the values stored in the optimizer state, \(v\), are scaled by this value. When using this parameter, PopART will automatically deal with the artificially scaled velocity value during the weight update, and other hyper parameters do not need to be adjusted.

In addition, the loss scaling hyper parameter is similar in nature to the velocity scaling parameter. It is a scaling value that is applied to the loss gradient at the start of the backwards pass and, at the end of the backwards pass, this scaling is reversed by multiplying the gradients for each weight with the inverse of the loss scaling value prior to updating the optimizer state. Using loss scaling can also improve numerical stability in some cases.

Finally, it is possible to add clip norm settings for this optimizer. These clip norms compute the L2 norm for a group of weights and add a scalar term to the weight update that effectively divides it by the norm (or by a constant value that is provided as part of the clip norm, whichever is greater).

See the SGD notes in optimizer.hpp for a more detailed and comprehensive derivation of the SGD optimizer step in PopART.
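
As a minimal numerical sketch of the non-Nesterov update above (plain C++ scalars; velocity scaling and loss scaling are left at 1 so they drop out):

// One step of the SGD-with-momentum update for a single scalar weight.
float w = 1.0f, v = 0.0f, g = 0.5f;
const float lr = 0.1f, mm = 0.9f, dm = 0.0f, wd = 0.0f;

v = v * mm + (1.0f - dm) * (g + wd * w); // v' = 0.5
w = w - lr * v;                          // w' = 1.0 - 0.05 = 0.95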

Subclassed by popart::ConstSGD

Public Functions

SGD(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultMomentum, OptimizerValue defaultDampening, OptimizerValue defaultVelocityScaling, OptimizerValue lossScaling, OptimizerValue nesterov, const std::vector<ClipNormSettings> &clipNormSettings = {}, SGDAccumulatorAndMomentum sgdAccMm = SGDAccumulatorAndMomentum::Combined, DataType accumType = DataType::UNDEFINED, DataType accl1Type = DataType::UNDEFINED, const DebugContext &debugContext = {})

Constructor.

See also

SGDAccumulatorAndMomentum. Defaults to SGDAccumulatorAndMomentum::Combined.

Parameters
  • defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultMomentum – The momentum value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultDampening – The dampening value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultVelocityScaling – The velocity scaling value to use for weights for which no weight-specific hyper parameter have been inserted.

  • lossScaling – The loss scaling value to use.

  • nesterov – Option to enable Nesterov momentum. Defaults to false.

  • clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).

  • sgdAccMm – The implementation strategy to use when gradient accumulation and/or momentum are used, otherwise ignored.

  • accumType – The DataType of the accum tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.

  • accl1Type – The DataType of the accl1 tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.

  • debugContext – Optional debug context.

SGD(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultMomentum, OptimizerValue defaultDampening, OptimizerValue defaultVelocityScaling, OptimizerValue lossScaling, const std::vector<ClipNormSettings> &clipNormSettings = {}, SGDAccumulatorAndMomentum sgdAccMm = SGDAccumulatorAndMomentum::Combined, DataType accumType = DataType::UNDEFINED, DataType accl1Type = DataType::UNDEFINED, const DebugContext &debugContext = {})

Constructor.

See also

SGDAccumulatorAndMomentum. Defaults to SGDAccumulatorAndMomentum::Combined.

Parameters
  • defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultMomentum – The momentum value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultDampening – The dampening value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultVelocityScaling – The velocity scaling value to use for weights for which no weight-specific hyper parameter have been inserted.

  • lossScaling – The loss scaling value to use.

  • clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).

  • sgdAccMm – The implementation strategy to use when gradient accumulation and/or momentum are used, otherwise ignored.

  • accumType – The DataType of the accum tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.

  • accl1Type – The DataType of the accl1 tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.

  • debugContext – Optional debug context.

SGD(const std::map<std::string, std::pair<float, bool>> &params, const std::vector<ClipNormSettings> &clipNormSettings = {}, SGDAccumulatorAndMomentum sgdAccMm = SGDAccumulatorAndMomentum::Combined, DataType accumType = DataType::UNDEFINED, DataType accl1Type = DataType::UNDEFINED, const DebugContext &debugContext = {})

Constructor.

EXAMPLE:

SGD({{"defaultLearningRate", {0.02, false}},
    {"defaultMomentum", {0.6, true}}});

See also

SGDAccumulatorAndMomentum. Defaults to SGDAccumulatorAndMomentum::Combined.

This will create an SGD Optimizer which has a constant momentum of 0.6 and a changeable learning rate with an initial value of 0.02. All OptimizerValues not present in the map will take values from the getUnset* functions.

Parameters
  • params – A parameter map where the keys are one or more of "defaultLearningRate", "defaultWeightDecay", "defaultMomentum", "defaultDampening", "defaultVelocityScaling", "lossScaling" or "nesterov". The map’s values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter because default values will be used where parameters are missing.

  • clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).

  • sgdAccMm – The implementation strategy to use when gradient accumulation and/or momentum are used, otherwise ignored.

  • accumType – The DataType of the accum tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.

  • accl1Type – The DataType of the accl1 tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.

  • debugContext – Optional debug context.

inline SGD()

Default constructor. Creates SGD with default scalars (equivalent to the getUnset<scalar>() methods) and the other default parameters of the main constructor.

SGD(const SGD&) = default

Copy constructor.

~SGD() = default
inline virtual OptimizerType type() const final
inline virtual std::string type_s() const final
inline SGDAccumulatorAndMomentum getSGDAccumulatorAndMomentum() const
virtual std::unique_ptr<Optimizer> clone() const final
virtual std::unique_ptr<Op> createOp(const Tensor &weight, Graph&) const final

Returns the VarUpdateOp for the given weight.

If there is no gradient accumulation or momentum, this will be an SGD0VarUpdateOp. Otherwise, if getSGDAccumulatorAndMomentum() == SGDAccumulatorAndMomentum::Combined, this will be an SGD1ComboOp; if getSGDAccumulatorAndMomentum() == SGDAccumulatorAndMomentum::Separate, an SGD2ComboOp.

The required compound scalar OptimizerValues for the VarUpdateOp will be computed and passed to the Op. See the SGD notes above this class for how they are derived. Recall that if non-const, the VarUpdateOp will take an input Tensor for the compound scalar.

See also

Optimizer::createOp

The OptimizerReductionType of the Op is derived as follows:

  • No replication => None

  • Replication, no gradient accumulation => GradReduce

  • Replication, gradient accumulation, SGD1 => AcclReduce

  • Replication, gradient accumulation, SGD2 => AccumReduce

See the SGD notes above this class for why this is.

If SGD2, the DataType of the accum and accl1 tensors passed to the SGD2ComboOp will be as set in the SGD constructor. Recall DataType::UNDEFINED means use the same as the weight.

An SGD1ComboOp will later be decomposed by the SGD1Decompose pattern into a series of Ops and Tensors that implement the SGD1 optimizer step.

An SGD2ComboOp will later be decomposed by the SGD2Decompose pattern into a series of Ops and Tensors that implement the SGD2 optimizer step.

See also

SGD1Decompose

See also

SGD2Decompose

virtual std::vector<TensorId> getInputIds(const Tensor &weight) const final

virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const final

smm1 and wdsf0 have the same data type as the weight.

virtual void validReplacement(const Optimizer &other) const final
virtual void resetTensorData(Tensor&) const final
virtual void setTensorData(Tensor&) const final
float getStoredValue(const TensorId &optId) const

The tensor with id optId matches a compound scalar which this object can compute from the atomic scalars; this returns the value stored for it.

void insertSpecific(const TensorId &weight, OptimizerValue learningRate, OptimizerValue weightDecay, OptimizerValue momentum, OptimizerValue dampening, OptimizerValue velocityScaling, OptimizerValue nesterov)

Insert a weight-specific set of hyper parameters.

Parameters
  • weight – The TensorId of the weight.

  • learningRate – The learning rate value to use for this specific weight.

  • weightDecay – The weight decay value to use for this specific weight.

  • momentum – The momentum value to use for this specific weight.

  • dampening – The dampening value to use for this specific weight.

  • velocityScaling – The velocity scaling value to use for this specific weight.

  • nesterov – Option to enable Nesterov momentum. Defaults to false.

void insertSpecific(const TensorId &weight, const std::map<std::string, std::pair<float, bool>> &params)

Insert a weight-specific set of hyper parameters.

Parameters
  • weight – The TensorId of the weight.

  • params – A parameter map where keys are one of "learningRate", "weightDecay", "momentum", "dampening", or "velocityScaling" and the map’s values pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.
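
For example, a minimal sketch of the map-based overload (the weight id is hypothetical):

popart::SGD opt({{"defaultLearningRate", {0.1f, false}}});

// Give one weight its own learning rate and a constant momentum.
opt.insertSpecific("embedding/weight",
                   {{"learningRate", {0.01f, false}},
                    {"momentum", {0.95f, true}}});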

virtual bool hasSpecific(const Tensor &w) const final
virtual bool hasSpecific() const final
virtual TensorId getInverseLossScalingTensorId(const Tensor &weight) const
inline const OptimizerValueMap &learningRates() const
inline const OptimizerValueMap &weightDecays() const
inline const OptimizerValueMap &momentums() const
inline const OptimizerValueMap &dampenings() const
inline const OptimizerValueMap &velocityScalings() const
inline const OptimizerValueMap &nesterov() const
virtual size_t hash() const

Public Static Functions

static inline OptimizerValue getUnsetLearningRate()

Default learning rate value.

static inline OptimizerValue getUnsetWeightDecay()

Default weight decay value.

static inline OptimizerValue getUnsetMomentum()

Default momentum value.

static inline OptimizerValue getUnsetDampening()

Default dampening value.

static inline OptimizerValue getUnsetVelocityScaling()

Default velocity scaling value.

static inline OptimizerValue getUnsetLossScaling()

Default loss scaling value.

static inline OptimizerValue getUnsetNesterov()

Default nesterov.

static SGD fromDefaultMap(const std::map<std::string, OptimizerValue>&, const DebugContext &debugContext = {})
class ConstSGD : public popart::SGD

Stochastic Gradient Descent (SGD) optimizer with constant learning rate, weight decay, loss scaling and clip norm settings (and default values for momentum, dampening or velocity scaling).

NOTE: See SGD for the detailed meaning of these parameters.

NOTE: This class exists for backwards compatibility with the Python API and may be removed at some point in the future.

Public Functions

inline ConstSGD(float learningRate, float weightDecay = 0, float lossScaling = 1, const std::vector<ClipNormSettings> &clipNormSettings = {})

Constructor.

Parameters
  • learningRate – A constant learning rate.

  • weightDecay – A constant weight decay value.

  • lossScaling – A constant loss scaling value.

  • clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
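
For example, a minimal usage sketch:

// All values are constants baked in at compile time.
popart::ConstSGD opt(0.001f /*learningRate*/,
                     0.0f   /*weightDecay*/,
                     1.0f   /*lossScaling*/);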

enum class popart::SGDAccumulatorAndMomentum

Strategy for implementing SGD with momentum and/or gradient accumulation.

Values:

enumerator Combined = 0

Implement SGD using a single tensor that serves as both the gradient accumulator (accum) and the momentum (accl) tensor.

enumerator Separate

Implement SGD using separate tensors for the gradient accumulator (accum) and momentum (accl) tensors.
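
For example, a sketch choosing the Separate strategy with FLOAT state tensors (this assumes gradient accumulation is enabled in the SessionOptions; the hyper parameter values are illustrative):

popart::SGD opt({{"defaultLearningRate", {0.02f, false}},
                 {"defaultMomentum", {0.9f, true}}},
                {},                                          // no clip norms
                popart::SGDAccumulatorAndMomentum::Separate,
                popart::DataType::FLOAT,                     // accumType
                popart::DataType::FLOAT);                    // accl1Type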

14.4.2. Adam, AdaMax & Lamb

#include <popart/adam.hpp>
enum class popart::AdamMode

Enum type describing the mode of an Adam optimizer instance.

Values:

enumerator Adam = 0

Adam or AdamW mode, depending on weight decay setting (see Kingma & Ba, 2015 and Loshchilov & Hutter, 2018).

enumerator AdamNoBias

Like Adam but without bias correction.

enumerator AdaMax

Adamax mode.

enumerator Lamb

Lamb mode (see You et al., 2020).

enumerator LambNoBias

Like Lamb but without bias correction.

class Adam : public popart::Optimizer

AdamW, Lamb and AdaMax optimizer implementation.

As with any optimizer implementation, this class is responsible for updating each weight tensor ( \(w\)) in the model using the gradient ( \(g\)) of the loss function with respect to the weight, as calculated during the backwards pass.

The optimizer has the following state for each weight:

  • first-order momentum ( \(m\))

  • second-order momentum ( \(v\))

  • time step ( \(t\))

The optimizer has the following hyper parameters:

  • learning rate ( \(\text{lr}\))

  • weight decay ( \(\text{wd}\))

  • beta1 ( \(\beta_1\))

  • beta2 ( \(\beta_2\))

  • epsilon ( \(\epsilon\))

  • loss scaling ( \(\text{ls}\))

  • maximum weight norm ( \(\text{mwn}\))

The values of these parameters can be shared between all weights but some can be overridden with weight-specific values (see Adam::insertSpecific). Hyper parameters are captured using OptimizerValue objects and therefore can be either a constant value or a non-constant value that can be adjusted by the user.

The values of AdamMode and WeightDecayMode passed to the constructor determine how weights are updated (see below).

In the following we will describe how this optimizer updates a weight using a gradient. In the context of this description, the gradient is the value of the gradient after any gradient accumulation has been performed and after the application of a loss scaling factor to the gradient has been corrected for.

When the optimizer needs to update a weight, \(w\), using a gradient, \(g\), it first computes a term \(g_\text{tmp}\), which is effectively \(g\) with L2 regularization applied if the WeightDecayMode is set to WeightDecayMode::L2Regularization, as follows:

\[\begin{split} g_\text{tmp} := \left\{\begin{aligned} g & \text{ \; (Decay) } \\ (g + \text{wd} * w) & \text{ \; (L2Regularization) \; . } \\ \end{aligned}\right.\\ \end{split}\]

Secondly, the optimizer updates the optimizer state as follows:

\[\begin{split} m' &:= \beta_1 * m + (1 - \beta_1) * g_\text{tmp} \\ v' &:= \left\{\begin{aligned} \beta_2 * v + (1 - \beta_2) * g_\text{tmp}^2 & \text{ \; (Adam/AdamNoBias) } \\ \beta_2 * v + (1 - \beta_2) * g_\text{tmp}^2 & \text{ \; (Lamb/LambNoBias) } \\ \text{max}(\beta_2 * v, |g_\text{tmp}|) & \text{ \; (AdaMax) } \\ \end{aligned}\right.\\ t' &:= t + 1 \\ \end{split}\]

Next, it computes the following terms:

\[\begin{split} m_\text{tmp} &:= \left\{\begin{aligned} m' & \text{ \; (AdamNoBias/LambNoBias) } \\ \frac{m'}{(1 - \beta_1^{t'})} & \text{ \; (Adam/Lamb/AdaMax) } \\ \end{aligned}\right.\\ v_\text{tmp} &:= \left\{\begin{aligned} v' & \text{ \; (AdamNoBias/LambNoBias) } \\ \frac{v'}{(1 - \beta_2^{t'})} & \text{ \; (Adam/Lamb/AdaMax) } \\ \end{aligned}\right.\\ u_\text{tmp} &:= \left\{\begin{aligned} \frac{m_\text{tmp}}{(\sqrt{v_\text{tmp}} + \epsilon)} + \text{wd} * w &\text{ \; (Decay) } \\ \frac{m_\text{tmp}}{(\sqrt{v_\text{tmp}} + \epsilon)} &\text{ \; (L2Regularization) } \\ \end{aligned}\right. \end{split}\]

Finally, the optimizer updates the weight as follows:

\[\begin{split} w' := \left\{\begin{aligned} w - \text{lr} * u_\text{tmp} &\text{ \; (Adam/AdamNoBias/AdaMax) } \\ w - \biggl(\frac{\text{min}(\lVert{w}\rVert, \text{mwn})}{\lVert{u_\text{tmp}}\rVert}\biggr) * \text{lr} * u_\text{tmp} &\text{ \; (Lamb/LambNoBias) } \\ \end{aligned}\right. \end{split}\]

In addition to the above, the loss scaling hyper parameter is similar in nature to the velocity scaling parameter. It is a scaling value that is applied to the loss gradient at the start of the backwards pass and, at the end of the backwards pass, this scaling is reversed by multiplying the gradients for each weight with the inverse of the loss scaling value prior to updating the optimizer state. Using loss scaling can also improve numerical stability of the gradient calculations. If scaledOptimizerState is enabled then the lossScaling will not be removed before updating the optimizer state. This can improve the numerical stability when accl1_type is set to FLOAT16.

NOTE: The maximum weight norm is referred to as \(\phi\) in You et al., 2020.

Public Functions

virtual bool hasSpecific(const Tensor &w) const final
virtual bool hasSpecific() const final
virtual TensorId getInverseLossScalingTensorId(const Tensor &weight) const final
Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, OptimizerValue maxWeightNorm, AdamMode adamMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})

Constructor.

Parameters
  • defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameters have been inserted.

  • defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameters have been inserted.

  • defaultBeta1 – The beta1 value to use for weights for which no weight-specific hyper parameters have been inserted.

  • defaultBeta2 – The beta2 value value to use for weights for which no weight-specific hyper parameters have been inserted.

  • defaultEps – The epsilon value to use for weights for which no weight-specific hyper parameters have been inserted.

  • lossScaling – The loss scaling value to use.

  • maxWeightNorm – The maxWeightNorm value to use.

  • adamMode – The AdamMode value to use.

  • weightDecayMode – The WeightDecayMode value to use.

  • accumType – Data type to use for gradient accumulation.

  • accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.

  • accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.

  • clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).

  • scaledOptimizerState – Experimental Option. Does not remove lossScaling before updating the optimizer state. This should have no effect on the update equation. However, it does ensure a more numerically stable implementation when accl1_type is set to DataType::FLOAT16. Note: When loading a model that includes initialised optimizer state, ensure that accl1 and accl2 are scaled by lossScaling and lossScaling^2 respectively.

  • debugContext – Optional debug context.

Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, AdamMode adamMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})
Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, OptimizerValue maxWeightNorm, AdamMode adamMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})
Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, AdamMode adamMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})
Adam(const std::map<std::string, std::pair<float, bool>> &params, AdamMode adamMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})

Constructor.

EXAMPLE:

Adam({{"defaultLearningRate", {0.02, False}},
      {"defaultBeta1", {0.9, True}},
      {"defaultBeta2":{0.999, True}}},
      AdamMode::Adam,
      WeightDecayMode::Decay,
      DataType::FLOAT,
      DataType::FLOAT,
      DataType::FLOAT);

Parameters
  • params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultBeta1", "defaultBeta2", "defaultEps", "lossScaling" or "maxWeightNorm", and the map’s values pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.

  • adamMode – The AdamMode value to use.

  • weightDecayMode – The WeightDecayMode value to use.

  • accumType – Data type to use for gradient accumulation.

  • accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.

  • accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.

  • clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).

  • scaledOptimizerState – Experimental Option. Does not remove lossScaling before updating the optimizer state. This should have no effect on the update equation. However, it does ensure a more numerically stable implementation when accl1_type is set to DataType::FLOAT16. Note: When loading a model that includes initialised optimizer state, ensure that accl1 and accl2 are scaled by lossScaling and lossScaling^2 respectively.

  • debugContext – Optional debug context.

Adam(const Adam&) = default
~Adam() = default
inline virtual OptimizerType type() const final
inline virtual std::string type_s() const final
virtual std::unique_ptr<Optimizer> clone() const final
virtual std::unique_ptr<Op> createOp(const Tensor &weight, Graph&) const final
virtual std::vector<TensorId> getInputIds(const Tensor &weight) const final

The names of the inputs for the VarUpdateOp for the Variable Tensor “weight”.

In the returned vector, an empty string (“”) is used as a placeholder for constant inputs.

virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const final

The names and infos of the optimizer tensors.

virtual void validReplacement(const Optimizer &other) const final
virtual void resetTensorData(Tensor&) const final
virtual void setTensorData(Tensor&) const final
float getStoredValue(const TensorId &optId) const

The tensor with id optId matches a compound scalar which this object can compute from the atomic scalars; this returns the value stored for it.

void insertSpecific(const TensorId &weight, OptimizerValue learningRate, OptimizerValue weightDecay, OptimizerValue beta1, OptimizerValue beta2, OptimizerValue eps, OptimizerValue mwn)

Insert a weight-specific set of hyper parameters.

Parameters
  • weight – The TensorId of the weight.

  • learningRate – The learning rate value to use for this specific weight.

  • weightDecay – The weight decay value to use for this specific weight.

  • beta1 – The beta1 value to use for this specific weight.

  • beta2 – The beta2 value to use for this specific weight.

  • eps – The epsilon value to use for this specific weight.

  • mwn – The max weight norm value to use for this specific weight.

void setStep(int64_t step)
void setStep(const TensorId&, int64_t step)
void setStep(std::map<TensorId, int64_t> steps)
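
For example, when resuming training from a checkpoint, the time step can be restored so that bias correction continues from the correct \(t\) (a sketch; the adam instance, weight id and step counts are assumptions):

adam.setStep(10000);                  // set the step for all weights
adam.setStep("model/weight0", 500);   // set the step for one weight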
void insertSpecific(const TensorId &weight, const std::map<std::string, std::pair<float, bool>> &params)

Insert a weight-specific set of hyper parameters.

Parameters
  • weight – The TensorId of the weight.

  • params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultBeta1", "defaultBeta2", "defaultEps", "lossScaling" or "maxWeightNorm" and the map’s values pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.

inline const OptimizerValueMap &learningRates() const
inline const OptimizerValueMap &weightDecays() const
inline const OptimizerValueMap &beta1s() const
inline const OptimizerValueMap &beta2s() const
inline const OptimizerValueMap &epss() const
inline const OptimizerValueMap &maxWeightNorms() const
inline const WeightDecayMode &getWeightDecayMode() const
inline bool useScaledOptimizerState() const
virtual size_t hash() const final
virtual void setFactorsFromOptions(const SessionOptions&) final

Public Static Functions

static inline OptimizerValue getUnsetLearningRate()

Default learning rate value.

static inline OptimizerValue getUnsetWeightDecay()

Default weight decay value.

static inline OptimizerValue getUnsetBeta1()

Default beta1 value.

static inline OptimizerValue getUnsetBeta2()

Default beta2 value.

static inline OptimizerValue getUnsetEps()

Default epsilon value.

static inline OptimizerValue getUnsetLossScaling()

Default loss scaling value.

static inline OptimizerValue getUnsetMaxWeightNorm()

Default maximum weight norm value.

static Adam fromDefaultMap(const std::map<std::string, OptimizerValue>&, AdamMode adamMode_, WeightDecayMode decayMode_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, const DebugContext &debugContext = {})

14.4.3. AdaDelta, RMSProp & AdaGrad

#include <popart/adaptive.hpp>
enum class popart::AdaptiveMode

Enum class representing a type of adaptive optimizer.

Values:

enumerator AdaGrad = 0

AdaGrad optimizer.

enumerator RMSProp

RMSProp optimizer.

enumerator CenteredRMSProp

CenteredRMSProp optimizer.

enumerator AdaDelta

AdaDelta optimizer.

class Adaptive : public popart::Optimizer

AdaDelta, RMSProp and AdaGrad optimizer implementation.

As with any optimizer implementation, this class is responsible for updating each weight tensor ( \(w\)) in the model using the gradient ( \(g\)) of the loss function with respect to the weight, as calculated during the backwards pass.

The optimizer has the following state for each weight:

  • first-order momentum ( \(v_1\))

  • second-order momentum ( \(v_2\)) (only for CenteredRMSProp/AdaDelta)

  • third-order momentum ( \(v_3\))

The optimizer has the following hyper parameters:

  • learning rate ( \(\text{lr}\))

  • weight decay ( \(\text{wd}\))

  • alpha ( \(\alpha\))

  • momentum ( \(\text{m}\))

  • epsilon ( \(\epsilon\))

  • loss scaling ( \(\text{ls}\))

The values of these parameters can be shared between all weights but some can be overridden with weight-specific values (see Adaptive::insertSpecific). Hyper parameters are captured using OptimizerValue objects and therefore can be either a constant value or a non-constant value that can be adjusted by the user.

The values of AdaptiveMode and WeightDecayMode passed to the constructor determine how weights are updated (see below).

In the following we will describe how this optimizer updates a weight using a gradient. In the context of this description, the gradient is the value of the gradient after any gradient accumulation has been performed and after the application of a loss scaling factor to the gradient has been corrected for.

When the optimizer needs to update a weight, \(w\), using a gradient, \(g\), it first computes a term \(g_\text{tmp}\), which is effectively \(g\) with L2 regularization applied if the WeightDecayMode is set to WeightDecayMode::L2Regularization, as follows:

\[\begin{split} g_\text{tmp} := \left\{\begin{aligned} g & \text{ \; (Decay) } \\ (g + \text{wd} * w) & \text{ \; (L2Regularization) \; . } \\ \end{aligned}\right.\\ \end{split}\]

Secondly, the optimizer updates the optimizer state \(v_1\) as follows:

\[\begin{split} v_1' &:= \left\{\begin{aligned} \alpha * v_1 + (1 - \alpha) * g_\text{tmp}^2 & \text{ \; (RMSProp/AdaDelta) } \\ \alpha * v_1 + (1 - \alpha) * g_\text{tmp}^2 & \text{ \; (CenteredRMSProp) } \\ v_1 + g_\text{tmp}^2 & \text{ \; (AdaGrad) } \\ \end{aligned}\right.\\ \end{split}\]

Next, \(v_2\) is updated, but only for CenteredRMSProp:

\[\begin{split} v_2' &:= \alpha * v_2 + (1 - \alpha) * g_\text{tmp} \text{ \; (CenteredRMSProp) } \\ \end{split}\]

Next, it computes the update term \(u_\text{tmp}\):

\[\begin{split} u_\text{tmp} &:= \left\{\begin{aligned} \frac{g_\text{tmp}}{\sqrt{v_1'} + \epsilon} & \text{ \; (AdaGrad/RMSProp) } \\ \frac{g_\text{tmp}}{\sqrt{v_1' - v_2'^2} + \epsilon} & \text{ \; (CenteredRMSProp) } \\ \frac{g_\text{tmp} * \sqrt{v_2 + \epsilon}}{\sqrt{v_1' + \epsilon}} & \text{ \; (AdaDelta) } \\ \end{aligned}\right. \end{split}\]

Next, \(v_2\) is updated, but only for AdaDelta:

\[\begin{split} v_2' := \alpha * v_2 + (1 - \alpha) * u_\text{tmp}^2 \text{ \; (AdaDelta) } \\ \end{split}\]

Next, the third momentum is updated for all modes:

\[ v_3' := m * v_3 + u_\text{tmp} \]

Finally, the optimizer updates the weight as follows:

\[\begin{split} w' := \left\{\begin{aligned} w - \text{lr} * (v_3' + \text{wd} * w) &\text{ \; (Decay) } \\ w - \text{lr} * v_3' &\text{ \; (L2Regularization) } \\ \end{aligned}\right. \end{split}\]

In addition to the above, the loss scaling hyper parameter is similar in nature to the velocity scaling parameter. It is a scaling value that is applied to the loss gradient at the start of the backwards pass and, at the end of the backwards pass, this scaling is reversed by multiplying the gradients for each weight with the inverse of the loss scaling value prior to updating the optimizer state. Using loss scaling can also improve numerical stability in some cases.

Public Functions

virtual bool hasSpecific(const Tensor &w) const
virtual bool hasSpecific() const
virtual TensorId getInverseLossScalingTensorId(const Tensor &weight) const
Adaptive(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultAlpha, OptimizerValue defaultMomentum, OptimizerValue defaultEps, OptimizerValue lossScaling, AdaptiveMode adaptiveMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, DataType accl3Type, bool rmspropTFVariant = false, const DebugContext &debugContext = {})

Constructor.

Parameters
  • defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultAlpha – The alpha value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultMomentum – The momentum value to use for weights for which no weight-specific hyper parameter have been inserted.

  • defaultEps – The epsilon value to use for weights for which no weight-specific hyper parameter have been inserted.

  • lossScaling – The loss scaling value to use.

  • adaptiveMode – The AdaptiveMode value to use.

  • weightDecayMode – The WeightDecayMode value to use.

  • accumType – Data type to use for gradient accumulation.

  • accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.

  • accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.

  • accl3Type – Data type to use for tensor that stores third-order momentum optimizer state.

  • debugContext – Optional debug context.

Adaptive(const std::map<std::string, std::pair<float, bool>> &params, AdaptiveMode adaptiveMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, DataType accl3Type, bool rmspropTFVariant = false, const DebugContext &debugContext = {})

Constructor.

EXAMPLE:

Adaptive({{"defaultLearningRate", {0.02, false}},
          {"defaultAlpha", {0.99, true}}},
         AdaptiveMode::RMSProp,
         WeightDecayMode::Decay,
         DataType::FLOAT,
         DataType::FLOAT,
         DataType::FLOAT,
         DataType::FLOAT);

Parameters
  • params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultAlpha", "defaultMomentum", "defaultEps" or "lossScaling", and the map’s values pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.

  • adaptiveMode – The AdaptiveMode value to use.

  • weightDecayMode – The WeightDecayMode value to use.

  • accumType – Data type to use for gradient accumulation.

  • accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.

  • accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.

  • accl3Type – Data type to use for tensor that stores third-order momentum optimizer state.

  • debugContext – Optional debug context.

Adaptive(const Adaptive&) = default
~Adaptive() = default
inline virtual OptimizerType type() const final
inline virtual std::string type_s() const final
virtual std::unique_ptr<Optimizer> clone() const final
virtual std::unique_ptr<Op> createOp(const Tensor &weight, Graph&) const final
virtual std::vector<TensorId> getInputIds(const Tensor &weight) const final

The names of the inputs for the VarUpdateOp for the Variable Tensor “weight”.

In the returned vector, an empty string (“”) is used as a placeholder for constant inputs.

virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const final

The names and infos of the optimizer tensors.

virtual void validReplacement(const Optimizer &other) const final
virtual void resetTensorData(Tensor&) const final
virtual void setTensorData(Tensor&) const final
float getStoredValue(const TensorId &optId) const

The tensor with id optId matches a compound scalar which this object can compute from the atomic scalars; this returns the value stored for it.

void insertSpecific(const TensorId &weight, OptimizerValue learningRate, OptimizerValue weightDecay, OptimizerValue alpha, OptimizerValue momentum, OptimizerValue eps)

Insert a weight-specific set of hyper parameters.

Parameters
  • weight – The TensorId of the weight.

  • learningRate – The learning rate value to use for this specific weight.

  • weightDecay – The weight decay value to use for this specific weight.

  • alpha – The alpha value to use for this specific weight.

  • momentum – The momentum value to use for this specific weight.

  • eps – The epsilon value to use for this specific weight.

void setStep(int64_t step)
void setStep(const TensorId&, int64_t step)
void setStep(std::map<TensorId, int64_t> steps)
void insertSpecific(const TensorId &weight, const std::map<std::string, std::pair<float, bool>> &params)

Insert a weight-specific set of hyper parameters.

Parameters
  • weight – The TensorId of the weight.

  • params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultAlpha", "defaultMomentum", "defaultEps" or "lossScaling" and the map’s values pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.

inline const OptimizerValueMap &learningRates() const
inline const OptimizerValueMap &weightDecays() const
inline const OptimizerValueMap &alphas() const
inline const OptimizerValueMap &momentums() const
inline const OptimizerValueMap &epss() const
virtual size_t hash() const

Public Static Functions

static inline OptimizerValue getUnsetLearningRate()

Default learning rate value.

static inline OptimizerValue getUnsetWeightDecay()

Default weight decay value.

static inline OptimizerValue getUnsetAlpha()

Default alpha value.

static inline OptimizerValue getUnsetMomentum()

Default momentum value.

static inline OptimizerValue getUnsetEps()

Default epsilon value.

static inline OptimizerValue getUnsetLossScaling()

Default loss scaling value.

static Adaptive fromDefaultMap(const std::map<std::string, OptimizerValue>&, AdaptiveMode adaptiveMode_, WeightDecayMode decayMode_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, DataType accl3Type_, const DebugContext &debugContext = {})

14.5. Builder

#include <popart/builder.hpp>
class Builder

An interface for a Builder, used for creating ONNX graphs.

ONNX defines a specification for describing graphs and serialising them as protobuf files. This class provides a builder interface for creating such a graph.

Note that, in ONNX, all Ops belong to an “Opset”. The Builder itself does not have methods for creating Ops in the ONNX graph, but instead has accessors to Opsets, such as AiGraphcoreOpset1, which contain the methods for creating Ops in the graph.
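
For example, a minimal sketch of building a one-op graph (this assumes the static Builder::create() factory and the relu method of the ai.onnx opset builder, which are not listed in this excerpt):

#include <popart/builder.hpp>

auto builder = popart::Builder::create();
auto aiOnnx  = builder->aiOnnxOpset11();

// Declare a 2x2 float input, apply a relu, and mark the result as an output.
popart::TensorInfo info("FLOAT", popart::Shape{2, 2});
popart::TensorId x = builder->addInputTensor(info);
popart::TensorId y = aiOnnx.relu({x});
builder->addOutputTensor(y);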

Public Functions

Builder &createSubgraphBuilder()

Create a builder for a graph which is nested inside this builder’s graph.

~Builder()

Destructor for the Builder class.

TensorId addInputTensor(const TensorInfo &tensorInfo, const popart::DebugContext &debugContext = {})

Add a new input tensor to the model.

Parameters
  • tensorInfo – The shape and data type of the input tensor.

  • debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addInputTensor(const std::string &dataType, const Shape &shape, const popart::DebugContext &debugContext = {})

Add a new input tensor to the model.

Parameters
  • dataType – The data type of the input tensor.

  • shape – The shape of the input tensor.

  • debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addInputTensor(const TensorInfo &tensorInfo, const InputSettings &settings, const popart::DebugContext &debugContext = {})

Add a new input tensor to the model.

Parameters
  • tensorInfo – The shape and data type of the input tensor.

  • settings – Settings for TileSet and ExchangeStrategy.

  • debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addInputTensor(const std::string &dataType, const Shape &shape, const InputSettings &settings, const popart::DebugContext &debugContext = {})

Add a new input tensor to the model.

Parameters
  • dataType – The data type of the input tensor.

  • shape – The shape of the input tensor.

  • settings – Settings for TileSet and ExchangeStrategy.

  • debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addUntypedInputTensor(const popart::DebugContext &debugContext = {})

Add a new input tensor without a type or shape to the model.

Parameters

debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

void addInputTensorFromParentGraph(const TensorId &tensorId)

Add a new named input tensor (from the parent graph) to the model.

Parameters

tensorId – The identifier string of the input tensor. This identifier must already exist in the name scope of the parent GraphProto and must appear topologically before this sub-graph.

TensorId addInitializedInputTensor(const ConstVoidData &initData, const popart::DebugContext &debugContext = {})

Add a new pre-initialized input tensor to the model.

Parameters
  • initData – The initial data of the input tensor.

  • debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addInitializedInputTensor(const ConstVoidData &initData, const VariableSettings &variableSettings, const popart::DebugContext &debugContext = {})

Add a new pre-initialized input tensor to the model.

Parameters
  • initData – The initial data of the input tensor.

  • variableSettings – The settings that determine how variables are retrieved from replicas.

  • debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

void addOutputTensor(const TensorId &arg0)

Add an output tensor from a node in the graph into the list of output tensors.

Parameters

arg0 – The tensor id of the output tensor to be added.

inline AiOnnxOpset6 aiOnnxOpset6()

Return the builder interface for ai.onnx opset 6.

inline AiOnnxOpset7 aiOnnxOpset7()

Return the builder interface for ai.onnx opset 7.

inline AiOnnxOpset8 aiOnnxOpset8()

Return the builder interface for ai.onnx opset 8.

inline AiOnnxOpset9 aiOnnxOpset9()

Return the builder interface for ai.onnx opset 9.

inline AiOnnxOpset10 aiOnnxOpset10()

Return the builder interface for ai.onnx opset 10.

inline AiOnnxOpset11 aiOnnxOpset11()

Return the builder interface for ai.onnx opset 11.

inline AiOnnxMlOpset1 aiOnnxMlOpset1()

Return the builder interface for ai.onnx.ml opset 1.

inline AiGraphcoreOpset1 aiGraphcoreOpset1()

Return the builder interface for ai.graphcore opset 1.

std::vector<TensorId> customOp(const OperatorIdentifier &opid, int opsetVersion, const std::vector<TensorId> &inputs, const unsigned numOutputs, const std::map<std::string, popart::any> &attributes, const DebugContext &debugContext = {})

Return the output tensors from a custom op added to the model.

Parameters
  • opid – The id of the operator.

  • opsetVersion – The version of the opset.

  • inputs – A vector of the tensor ids of the input tensors.

  • numOutputs – The number of output tensors.

  • attributes – The map of attributes and their values to be added.

  • debugContext – Optional debug information.

Returns

The output tensors.
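
For example, a minimal sketch (the domain, op name, attribute and input tensor id are hypothetical, and the OperatorIdentifier construction shown is an assumption):

// An op registered under a custom domain with one float attribute.
popart::OperatorIdentifier opid("com.example", "MyCustomOp", 1);

std::vector<popart::TensorId> outs =
    builder->customOp(opid, 1, {x}, /*numOutputs=*/1,
                      {{"alpha", 0.5f}});
popart::TensorId out = outs.at(0);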

void customOp(const OperatorIdentifier &opid, int opsetVersion, const std::vector<TensorId> &inputs, const std::vector<TensorId> &outputs, const std::map<std::string, popart::any> &attributes, const DebugContext &debugContext = {})

Add a custom op to the model.

Parameters
  • opid – The id of the operator.

  • opsetVersion – The version of the opset.

  • inputs – A vector of the tensor ids of the input tensors.

  • outputs – The tensor ids of the output tensors.

  • attributes – The map of attributes and their values to be added.

  • debugContext – Optional debug information.

template<class T>
inline TensorId reshape_const(T &t, const std::vector<TensorId> &args, const std::vector<int64_t> &shape, const std::string &name = {})

Add a constant and reshape a tensor using the provided domain.

Parameters
  • t – The builder interface.

  • args – The tensor ids of the tensors to be updated.

  • shape – The shape information to be used.

  • name – (Optional) The name of the updated tensor. Default: None.

Returns

The tensor id of the updated tensor.

inline void outputTensorLocation(const TensorId &nodeOutputName, TensorLocation value)

Set a value for the output tensor location attribute.

Parameters
  • nodeOutputName – The tensor id of the output tensor of the ONNX node.

  • value – The location of the tensor.

inline void recomputeOutput(const TensorId &nodeOutputName, RecomputeType value)

Enable recomputation of the output of the node in the backward pass.

Parameters
  • nodeOutputName – The tensor id of the output tensor of the ONNX node.

  • value – (Optional) The type of the recompute.

inline void recomputeOutputInBackwardPass(const TensorId &nodeOutputName, RecomputeType value = RecomputeType::Recompute)

Enable or disable recomputation of the output of the node in the backward pass.

Parameters
  • nodeOutputName – The tensor id of the output tensor of the ONNX node.

  • value – (Optional) The type of the recompute. Default: RecomputeType::Recompute.

inline void recomputeOutputInBackwardPass(const std::set<TensorId> &nodeOutputNames, RecomputeType value = RecomputeType::Recompute)

Enable or disable recomputation of the output of the node in the backward pass.

Parameters
  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node.

  • value – (Optional) The type of the recompute. Default: RecomputeType::Recompute.

inline bool getRecomputeOutputInBackwardPass(const TensorId &nodeOutputName)

Check if a node will have its output recomputed in the backward pass.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node used to find the node in the ONNX model.

Returns

true if the output will be recomputed; false otherwise.

inline bool getRecomputeOutputInBackwardPass(const std::set<TensorId> &nodeOutputNames)

Check if a node will have its output recomputed in the backward pass.

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

true if the output will be recomputed; false otherwise.

std::vector<TensorId> checkpointOutput(const std::vector<TensorId> &nodeOutputNames)

Add checkpoint operations to the model.

This is the same as an identity op but RecomputeType is Checkpoint by default. Use this to checkpoint a subset of an operation’s output tensors.

Parameters

nodeOutputNames – The tensors to checkpoint.

Returns

The checkpointed tensors.

inline void virtualGraph(const TensorId &nodeOutputName, int64_t value = 0)

Set the virtual graph that computes the given node.

Applies when creating a graph for a multi-IPU configuration.

Parameters
  • nodeOutputName – Name of the output tensor of the ONNX node.

  • value – The index of the virtual graph that computes this node. Default=0.

inline void executionPhase(const TensorId &nodeOutputName, int64_t value = 0)

Set the execution phase that computes the given node.

Applies when creating a graph for a multi-IPU configuration.

Parameters
  • nodeOutputName – The tensor id of the output tensor of the ONNX node.

  • value – The index of the virtual graph that computes this node. Default=0.

inline void pipelineStage(const TensorId &nodeOutputName, int64_t value)

Set the value on the pipeline stage attribute.

Parameters
  • nodeOutputName – The tensor id of the output tensor of the ONNX node.

  • value – The value to be set.

inline void pipelineStage(const std::set<TensorId> &nodeOutputNames, int64_t value)

Set the value on the pipeline stage attribute.

Parameters
  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node.

  • value – The value to be set.

inline void excludePatterns(const TensorId &nodeOutputName, const std::vector<std::string> &patternNames)

Set the patterns to be excluded.

Parameters
  • nodeOutputName – The tensor id of the output tensor of the ONNX node.

  • patternNames – The vector of pattern names to be excluded.

inline void excludePatterns(const std::set<TensorId> &nodeOutputNames, const std::vector<std::string> &patternNames)

Set the patterns to be excluded.

Parameters
  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node.

  • patternNames – The vector of pattern names to be excluded.

inline void setSerializeMatMul(const std::set<TensorId> &nodeOutputNames, std::string mode, int64_t factor, bool keep_precision)

Set the settings for matmuls that should be serialized.

This option will split a matmul into separate smaller matmuls that will be executed in series. This will also serialize the grad operations during training.

Parameters
  • nodeOutputNames – The tensor ids of the output matmul tensors of the ONNX node.

  • mode – The dimension of the matmul to serialize on. Options are: ‘input_channels’, ‘output_channels’, ‘reducing_dim’, ‘none’.

  • factor – The number of serialised matmuls. This must be a factor of the dimensions to serialise on.

void setPartialsType(const TensorId &nodeOutputName, const std::string partialsType)

Set the partials type for the given node.

This is used in the convolution op.

Parameters
  • nodeOutputName – Name of the output tensor of the ONNX node.

  • partialsType – The type for the partials. Options are: FLOAT or HALF.

void setEnableConvDithering(const TensorId &nodeOutputName, int64_t value)

Enable convolution dithering.

Parameters
  • nodeOutputName – The tensor id of the output tensor of the ONNX node.

  • value – The value for convolution dithering. This should be 1 to enable convolution dithering and 0 otherwise.

std::string getPartialsType(const TensorId &nodeOutputName)

Get the partials type for the given node.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node.

Returns

The partials type.

inline void setInplacePreferences(const TensorId &nodeOutputName, const std::map<OpType, float> &prefs)
void setAvailableMemoryProportion(const TensorId &nodeOutputName, const float availableMemoryProportion)

Set the available memory proportion for the given node.

This is used in the convolution op.

See also

Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for some practical examples of using availableMemoryProportion.

Parameters
  • nodeOutputName – Name of the output tensor of the ONNX node.

  • availableMemoryProportion – The available memory proportion [0, 1).

void setAvailableMemoryProportion(const std::set<TensorId> &nodeOutputNames, const float availableMemoryProportion)

Set the available memory proportion for the given node.

This is used in the convolution op.

See also

Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for some practical examples of using availableMemoryProportion

Parameters
  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

  • availableMemoryProportion – The available memory proportion [0, 1).

void setAttribute(const std::string &attribute, popart::any value)

Set the value of an attribute that will be set on all subsequent operations.

Parameters
  • attribute – The name of the attribute to set.

  • value – The value to set on the attribute.
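
For example, a minimal sketch that applies an attribute to every operation added between the two calls. The attribute name is purely illustrative:

// All ops added after this call receive the attribute.
builder->setAttribute("myAttribute", 1);
// ... add operations here ...
// Stop applying the attribute to newly added ops.
builder->clearAttribute("myAttribute");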

popart::any getAttribute(const std::string attribute) const

Get an attribute that has been set for all subsequent operations.

Parameters

attribute – The name of the attribute to get.

Returns

The attribute.

bool hasAttribute(const std::string &attribute) const

Check if an attribute exists.

Parameters

attribute – The name of the attribute to check.

Returns

true if the attribute exists; false otherwise.

void clearAttribute(const std::string &attribute)

Unset an attribute that will be set on all subsequent operations.

Parameters

attribute – The name of the attribute to unset.

bool hasAttribute(const std::string &attribute)

Check if an attribute is set.

Parameters

attribute – The name of the attribute to check.

Returns

true if the attribute is set; false otherwise.

popart::any getAttribute(const std::string &attribute)

Get the attribute value.

Parameters

attribute – The name of the attribute.

Returns

The value of the attribute.

int64_t getPipelineStage() const

Get the pipeline stage attribute.

Returns

The pipeline stage.

int64_t getExecutionPhase() const

Get the execution phase attribute.

Returns

The execution phase.

int64_t getVirtualGraph() const

Get the virtual graph attribute.

Returns

The virtual graph.

inline void virtualGraph(const std::set<TensorId> &nodeOutputNames, int64_t value = 0)

Set the virtual graph that computes the given node.

Applies when creating a graph for a multi-IPU configuration.

Parameters
  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

  • value – The index of the virtual graph that computes this node.

inline void executionPhase(const std::set<TensorId> &nodeOutputNames, int64_t value = 0)

Set the execution phase.

Applies when creating a graph for a multi-IPU configuration.

Parameters
  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

  • value – The execution phase to set for this node.

void addNodeAttribute(const std::string &attributeName, const int64_t &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters
  • attributeName – The name of the attribute to add.

  • attributeValue – An int64_t value of the attribute to add.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
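
For example, a minimal sketch that adds an int64_t attribute and reads it back. Here out is an assumed output tensor id of the target node:

builder->addNodeAttribute("myAttr", int64_t(42), {out});
if (builder->nodeHasAttribute("myAttr", {out})) {
  int64_t v = builder->getInt64NodeAttribute("myAttr", {out});
}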

void addNodeAttribute(const std::string &attributeName, const std::vector<int64_t> &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters
  • attributeName – The name of the attribute to add.

  • attributeValue – A std::vector<int64_t> value of the attribute to add.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const float &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters
  • attributeName – The name of the attribute to add.

  • attributeValue – A float value of the attribute to add.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const std::vector<float> &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters
  • attributeName – The name of the attribute to add.

  • attributeValue – The std::vector<float> value of the attribute to add.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const std::string &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters
  • attributeName – The name of the attribute to add.

  • attributeValue – A std::string value of the attribute to add.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const char *attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters
  • attributeName – The name of the attribute to add.

  • attributeValue – A C string (char*) value of the attribute to add.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const std::vector<std::string> &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters
  • attributeName – The name of the attribute to add.

  • attributeValue – A std::vector<std::string> value of the attribute to add.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const bool attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters
  • attributeName – The name of the attribute to add.

  • attributeValue – A bool value of the attribute to add.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const ConstVoidData &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters
  • attributeName – The name of the attribute to add.

  • attributeValue – A constant tensor initializer.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

bool nodeHasAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Check whether the ONNX node has an attribute set.

This function will throw an exception if it cannot find the unique node.

Parameters
  • attributeName – The name of the attribute to find.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

true if the node has an attribute set; false otherwise.

int64_t getInt64NodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is an int64_t.

This function will throw an exception if it cannot find the unique node, if the attribute does not exist, or if the attribute is not of type int64_t.

Parameters
  • attributeName – The name of the attribute to find.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

std::vector<int64_t> getInt64VectorNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a std::vector<int64_t>.

This function will throw an exception if it cannot find the unique node, if the attribute does not exist, or if the attribute is not of type std::vector<int64_t>.

Parameters
  • attributeName – The name of the attribute to find.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

float getFloatNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a float.

This function will throw an exception if it cannot find the unique node, if the attribute does not exist, or if the attribute is not of type float.

Parameters
  • attributeName – The name of the attribute to find.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

std::vector<float> getFloatVectorNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a std::vector<float>.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist.

Parameters
  • attributeName – The name of the attribute to find.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

std::string getStringNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a string.

This function will throw an exception if it cannot find the unique node, if the attribute does not exist, or if the attribute is not of type std::string.

Parameters
  • attributeName – The name of the attribute for which the value is required.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

std::vector<std::string> getStringVectorNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a vector of strings.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist.

Parameters
  • attributeName – The name of the attribute for which the value is required.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

bool getBoolNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a boolean.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist.

Parameters
  • attributeName – The name of the attribute for which the value is required.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

void removeNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Remove an attribute from the ONNX node.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist.

Parameters
  • attributeName – The name of the attribute to find.

  • nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

std::vector<std::string> getAllNodeAttributeNames(const std::set<TensorId> &nodeOutputNames)

Get all the attribute names from the ONNX node.

This function will throw an exception if it cannot find the unique node.

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

The attribute names associated with the ONNX node.

inline int64_t getVirtualGraph(const TensorId &nodeOutputName)

Get the index of the virtual graph that computes this node.

This applies in a multi-IPU system.

This function will throw an exception if the virtual graph has not been set in the current scope.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node used to find the node in the ONNX model.

Returns

The virtual graph associated with the ONNX node.

inline int64_t getVirtualGraph(const std::set<TensorId> &nodeOutputNames)

Get the index of the virtual graph that computes this node based on multiple output tensors.

This applies in a multi-IPU system.

This function will throw an exception if the virtual graph has not been set in the current scope.

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

The virtual graph associated with the ONNX node.

inline int64_t getExecutionPhase(const TensorId &nodeOutputName)

Get the execution phase for a single output tensor.

This only applies to a multi-IPU system.

This function will throw an exception if the execution phase has not been set in the current scope.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node used to find the node in the ONNX model.

Returns

The execution phase associated with the ONNX node.

inline int64_t getExecutionPhase(const std::set<TensorId> &nodeOutputNames)

Get the execution phase for a set of output tensors.

This only applies to a multi-IPU system.

This function will throw an exception if the execution phase has not been set in the current scope.

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

The execution phase associated with the ONNX node.

std::string getModelProto(bool humanReadable = false) const

Retrieve the ONNX serialized ModelProto.

Parameters

humanReadable – If true, return a human readable text representation of the model, otherwise use a binary format.

Returns

A serialized ONNX ModelProto.

void saveModelProto(const std::string &fn)

Save the builder’s ONNX ModelProto to a file.

Parameters

fn – The name of the file to save the ONNX model protobuf to.

void saveInitializersExternally(const std::vector<TensorId> &ids, const std::string &fn)

Save tensor data externally.

The model data cannot exceed 2 GB, the maximum size of a Protobuf message. To avoid this limit, the tensor data of large ONNX models can be saved separately.

Parameters
  • ids – The names of tensors for which data is to be saved externally.

  • fn – The name of a file containing the binary tensor data. This can be an absolute or relative path. If a relative path, when the ONNX model is saved, external tensor data will be written to a path relative to the current working directory.
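
For example, a minimal sketch that stores weight data outside the protobuf before saving the model. Here w0 and w1 are assumed initializer tensor ids:

builder->saveInitializersExternally({w0, w1}, "model_weights.bin");
builder->saveModelProto("model.onnx");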

std::vector<TensorId> getInputTensorIds() const

Return a list of ONNX graph input tensor ids.

Returns

A vector of input tensor ids.

std::vector<TensorId> getOutputTensorIds() const

Return a list of ONNX graph output tensor ids.

Returns

A vector of output tensor ids.

std::vector<TensorId> getValueTensorIds() const

Return a list of ONNX graph value tensor ids.

These tensors are stored in the value_info section of the ONNX GraphProto structure.

Returns

A vector of value tensor names.

std::vector<TensorId> getTrainableTensorIds() const

Return a list of ONNX graph initializer tensor ids.

These tensors are stored in the initializer section of the ONNX GraphProto structure.

Returns

A vector of names of initialized tensors.

bool hasValueInfo(const TensorId &id) const

Check if a tensor has value info.

A tensor may not have value info if the value info does not exist in the model or if shape inference has failed.

Returns

True if the tensor has value info; false otherwise.

std::vector<int64_t> getTensorShape(const TensorId id)

Return an ONNX graph tensor shape, from either the input, output, or value_info lists in GraphProto.

Parameters

id – The id of the tensor for which dimensions are required.

Returns

A vector of the tensor dimensions.

bool isInitializer(const TensorId id) const

Check if the ONNX tensor is in the initializer list of GraphProto.

Parameters

id – A tensor id.

Returns

True if the tensor is in the initializer list; false otherwise.

std::string getTensorDtypeString(const TensorId id)

Return an ONNX graph tensor type as a lower case string, from either the input, output, or value_info lists in GraphProto.

Parameters

id – The id of the tensor for which the type is required.

Returns

A lower case string of the tensor data type.

DataType getTensorDataType(const TensorId id)

Return a tensor type from either the input, output, or value_info lists in GraphProto.

Parameters

id – The id of the tensor for which the type is required.

Returns

The data type of the tensor.

void pushNameScope(const std::string &name)

Push a name onto the name scope stack.

The names of tensors and nodes added to the ONNX graph will be prefixed with a concatenation of the names in the name scope stack.

Parameters

name – The tensor name to be pushed onto the name scope stack.

void popNameScope()

Remove the last entry in the name scope stack.

std::string getNameScope(const std::string &name = "") const

Get the current name scope stack using the default delimiter.

Parameters

name – (Optional) A string to concatenate to the end of the stack.

Returns

A string of the concatenated name scope stack.
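
For example, a minimal sketch of scoped naming (assuming the default delimiter separates the scope name from the tensor name):

builder->pushNameScope("encoder");
// Tensors and nodes added here get ids prefixed with "encoder".
builder->popNameScope();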

void setGraphName(const std::string &name)

Set a graph name.

Parameters

name – The string to name the graph.

void setParent(Builder *parent)

Set the parent graph of this builder.

Parameters

parent – The builder to set as the parent of this builder.

Builder *getParent() const

Return the parent graph of this builder or null if there is no parent.

inline bool hasParent() const

Check if this builder represents a subgraph.

Returns

If true then the builder represents a subgraph. If false then the builder does not represent a subgraph.

void embedReplicationFactor(int replicationFactor)

Embed the value of replicationFactor into the OnnxModel.

Should be interpreted as 1 if not present in the model.

Parameters

replicationFactor – The replication factor.

Public Static Functions

static std::unique_ptr<Builder> create()

Create a builder for an ONNX model.

static std::unique_ptr<Builder> createFromOnnxModel(const std::string &modelProtoOrFilename)

Create a builder which loads a serialized ONNX ModelProto into the builder and validates it.

Parameters

modelProtoOrFilename – Either an ONNX model protobuf, or the name of a file containing an ONNX model protobuf.
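
For example, a minimal sketch of both creation paths:

// Start an empty model.
auto builder = popart::Builder::create();

// Or load and validate an existing ONNX model from a file.
auto loaded = popart::Builder::createFromOnnxModel("model.onnx");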

class Ir

Public Types

enum class ExecutionMode

Values:

enumerator Inference
enumerator Training
enum class SerialiseFormat

Values:

enumerator JSON

Public Functions

poprithms::logging::TimePartitionLogger &timePartitionLogger() const

Returns

An object used to track and summarize where wall clock time is spent in PopART compilation. This object is used to partition time into different components (scheduling, outlining, poplar Graph construction, etc.). It can be used as follows:

void foo() {
  auto timer = timePartitionLogger().scopedStopwatch("In foo");
  if (cond0()) {
    return;
  }
  bar();
  return;
}

When the method timePartitionLoggerStr() (see below) is called, there will be a line with “In foo” summarizing the time between the construction and destruction of timer, above. Something like:

In foo      : 0.03 [s] : 30 %
In bar      : 0.02 [s] : 10 %
unaccounted : 0.05 [s] : 50 %
total       : 0.10 [s] : 100 %

In the case where there are multiple timers which exist concurrently, only the most recently constructed one will accumulate time. This means that the most nested scope is the one which will accumulate time.

For more information, see the poprithms SwitchingTimePartitionLogger class.

std::string timePartitionLoggerStr() const
Ir()
~Ir()
Ir(Ir&&) = delete
Ir &operator=(Ir&&) = delete
Ir(const Ir&) = delete
Ir &operator=(const Ir&) = delete
inline uint64_t getId() const
void setOnnxModel(const ONNX_NAMESPACE::ModelProto &model)
inline bool hasOnnxModel() const

Check if there’s an ONNX model in the IR.

This is true if the IR has been created from an ONNX model or using the Builder.

Returns

true If there is an onnx model, false otherwise.

void setDataFlow(const DataFlow &df)
void setUserOptions(const SessionOptions &flags)
void setInputShapeInfo(const InputShapeInfo &info)
inline const InputShapeInfo &getInputShapeInfo() const
void setOptimizer(const Optimizer&)
void ensureOptimizerTensorCreated(const TensorId &optId, const TensorInfo &info, const DebugContext &debugContext = {})
inline const Optimizer &getOptimizer() const
void setDeviceInfo(DeviceInfo&)
const DeviceInfo *getDeviceInfo() const
void setPatterns(const Patterns &p)
inline const Patterns &getPatterns() const
std::string getPatternLevelStr(const Patterns &p)
bool isPatternsLevel(const Patterns &p, PatternsLevel level)
void removeIsolatedTensors(bool retainUsedIOTensors = false, bool retainAllIOTensors = false, bool retainVarTensors = false, bool retainConstTensors = false)
void removeIsolatedGraphs()
void setExecutionMode(const ExecutionMode &mode)
inline bool isTraining() const
inline bool isTesting() const
void logIr() const
void compareWithSavedHash(const HashesMap &cacheEntries)
void prepare(const IrBundle &bundle, const HashesMap &cacheEntries = {}, size_t hashSeed = 0u)

Prepare the IR based on the IrBundle configuration.

If engine caching is enabled then the IR hash which is based on the IrBundle and the forward graph will be compared to a saved file. If the hash matches then the rest of the Ir preparation will be skipped.

Parameters
  • bundle – The bundle to prepare.

  • cacheEntries – The engine cache.

  • hashSeed – The seed to initiate the IR hash with; this hash should incorporate non-IR factors that could affect the compilation, such as engine options and session options.

void prepareCache(const HashesMap &cacheEntries, size_t hashSeed)
void finalizeOpDebugInfo()
inline bool isPrepared() const
inline bool hashMatched() const
void updateOptimizer(const Optimizer&)
ONNX_NAMESPACE::ModelProto step(int n)
void addAdditionalModelProtoTensor(const TensorId&)
void addAdditionalModelProtoTensor(Tensor*)
void addAdditionalModelProtoTensors()
inline bool additionalModelProtoTensorsHaveBeenAdded() const
inline const std::set<Tensor*, PTensorCmp> &getAdditionalModelProtoTensors() const
inline std::set<Tensor*, PTensorCmp> &getAdditionalModelProtoTensors()
bool isAnchored(const TensorId&) const
bool isRootAnchor(const TensorId&) const
std::set<TensorId> getAnchors() const
std::set<TensorId> getRootAnchors() const
void remapAnchor(const TensorId &from, const TensorId &to)
void addAnchor(const TensorId &t)
const BiMap<TensorId, TensorId> &getAnchorRemap() const
bool streamingIsDisabledForTensor(const Tensor*) const
bool streamingIsDisabledForTensor(const TensorId&) const
bool storingIsDisabledForTensor(const Tensor*) const
bool storingIsDisabledForTensor(const TensorId&) const
void append(std::stringstream&) const
void serialise(SerialiseFormat format, std::stringstream &ss, bool useScheduler = true) const
std::vector<Tensor*> optimizerTensors() const
std::vector<Tensor*> optimizerStateTensors() const
std::map<TensorId, std::vector<Tensor*>> getHostLoadTensors() const

The original input tensor ID (used to identify streams) and the tensors produced by associated HostLoadOp.

std::map<TensorId, std::vector<Tensor*>> getHostStoreTensors() const

The original anchor tensor ID (used to identify streams) and the tensors consumed by associated HostStoreOp.

std::vector<Tensor*> dataStreamTensors() const
std::vector<Op*> opsOfType(const OperatorIdentifier &opid) const
bool isConsumedByOpOfType(TensorId tid, const OperatorIdentifier &opid)
std::vector<const Graph*> getGraphSchedule() const
std::vector<const Graph*> getGraphSchedule(GraphId root) const
std::vector<Op*> getOpSchedule(const OpsBeforeKey&, RequireOptimalSchedule ros) const
bool isSchedulable(const OpsBeforeKey&) const
bool virtualGraphsEnabled() const
SyntheticDataMode syntheticDataMode() const
bool useSyntheticData() const
OpId getOpsCounter() const
OpId getAndIncrOpsCounter()
TensorId getFinalLossId() const
OpId getFinalLossOpId() const
void dotCheckpoint(const Ir &ir, std::string check) const
const ONNX_NAMESPACE::ModelProto &getModel() const
Throws

error – if there is no Onnx model.

Returns

const reference to the Onnx model.

std::vector<TensorId> getModelInputIds() const
Returns

the id of every input tensor of the Onnx model. If there is no Onnx model, returns empty.

void setExternalTensorDataInfo(TensorId, const ONNX_NAMESPACE::TensorProto&)

Set the Onnx TensorProto of the given tensor in the Onnx ModelProto.

Throws

error – if this Ir has no Onnx model.

inline const SessionOptions &getSessionOptions() const
inline SessionOptions &getSessionOptions()
inline void setSessionName(const std::string name)
inline const std::string getSessionName() const
std::set<TensorId> getAllTensorIds() const
std::vector<TensorId> getTensorIds(TensorType) const
Tensor *getTensor(const TensorId&) const
bool containsTensor(const TensorId&) const
std::vector<TensorId> getGraphInputIds() const
std::vector<TensorId> getGraphOutputIds() const
const Graph &getMainGraph() const
Graph &getMainGraph()
std::vector<const Graph*> getAllGraphs() const
Graph &getGraph(const GraphId&) const
bool hasGraph(const GraphId&) const
Graph &createGraph(const GraphId&)
void removeGraph(const GraphId&)
std::map<OpId, std::unique_ptr<Op>> &getMainGraphOps()
const std::map<OpId, std::unique_ptr<Op>> &getMainGraphOps() const
std::vector<Op*> getAllOps() const
Op *getOp(OpId opId) const

Returns the Op if it exists in any graph.

Throws an error if the Op could not be found.

Parameters

opId – The unique ID of the Op to find

Returns

The Op pointer if found

Tensors &getMainGraphTensors()
const Tensors &getMainGraphTensors() const
inline const DataFlow &getDataFlow() const
void applyTransform(std::size_t transformId, Graph &graph)
void validateAnchors() const
ExecutionMode getExecutionMode() const
bool canInfer() const
bool canTrain() const
bool hasConstructedBackwards() const
bool hasDecomposedOptimizers() const
bool containsInitialisers() const
bool tensorExistsInInitialisers(TensorId) const
void constructForwards()
Graph &constructFromOnnxGraph(const ONNX_NAMESPACE::GraphProto &graph, const Scope &scope)
void foldConstants(Graph&)
void constructBackwards()
void registerInputTensors()
void updateVertices()
void unsetAllVirtualGraphIds()
void applyPreAliasPatterns(Graph&)
void applyUpdateInplacePrioritiesForIpu()
void applyInplacePattern(Graph&)
void confirmConstIds() const
void confirmNoReservedIds() const
void setFinalLoss(const TensorId &loss)
int getDefaultOpsetVersion(const std::string &domain) const
unsigned getNumVirtualGraphIds() const
int getOpSetVersionFromModel(const std::string &domain) const
inline bool autoRecomputationEnabled() const
bool hasReplicatedTensorSharding() const
bool hasOverlappedIO() const
inline void setRequiresRandomSeed()
inline bool getRequiresRandomSeed() const
RandomReferenceId getAndIncrementRandomReferenceId()
TensorId getOrSetRandomReferenceTensor(RandomReferenceId, TensorId)
void mergeRandomReferenceIds(std::set<RandomReferenceId>&)
void setRemoteBufferInfo(RemoteBufferId, RemoteBufferInfo)
const RemoteBufferInfo getRemoteBufferInfo(RemoteBufferId) const
const std::map<RemoteBufferId, RemoteBufferInfo> getAllRemoteBufferInfos() const
inline void setExecutionPhasesReady()
inline bool getExecutionPhasesReady() const
PipelineStage getNumPipelineStages() const
PipelineInfo pipelineInfo() const
void setMainGraphPathFromLoss()
void verifyTensorInfos() const

Verifies that all tensors have valid TensorInfos.

void setIsPrepared()

Marks the Ir as “prepared”.

This means the Ir is now ready to be lowered. Failing to do this before lowering the Ir will result in an error. The schedule of all graphs will be fixed by calling this. Modifying the graphs after the IR is prepared will result in an error.

PipelineStage getFinalLossPipelineStage() const

Get pipeline stage containing the final loss (the last forward pipeline stage)

Returns

pipeline stage containing the final loss

PipelineStage getMaxPipelineStage() const

Get the max pipeline stage that will exist after the backward pass has been added to the graph.

Returns

max pipeline stage of the graph

Op &getSubgraphAnchorPlaceholder()
inline const decltype(graphs) &getGraphs() const
TensorId createIntermediateTensorId(const TensorId &base_id)
TensorId createSliceTensorId(TensorId base_id, unsigned s, unsigned e)
TensorId createConcatTensorId(TensorId base_id)
GraphId createUniqueSubgraphId(GraphId base_id)
std::vector<std::vector<Op*>> getAccumulateOuterFragmentBinConstraints(const Graph &graph) const
size_t getHash() const
void computeHash(size_t hashSeed)
size_t getIrBundleHash() const
void setIrBundleHash(size_t)
ClonedGraphMaps cloneGraph(GraphId originalGraphId, GraphId newGraphId)

Clone a graph.

The OpIds and TensorIds will differ between the original and the cloned graph, so maps between the original and cloned OpIds and TensorIds are returned. The new graph can be obtained with ir.getGraph(newGraphId).

Warning

Does not support cloning of the main graph.

Parameters
  • originalGraphId – The id of the graph to clone

  • newGraphId – The id of the cloned graph

Returns

A struct of maps between the OpIds and TensorIds in the original and new graphs
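
For example, a minimal sketch cloning a subgraph and retrieving the clone. The graph ids are illustrative:

auto maps = ir.cloneGraph(popart::GraphId("sub"), popart::GraphId("subClone"));
popart::Graph &clone = ir.getGraph(popart::GraphId("subClone"));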

bool applyPreAliasPattern(const PreAliasPattern*, Graph&)

Public Static Functions

static bool usingEngineCache(const SessionOptions&, const DeviceInfo*)
using popart::HashesMap = std::map<size_t, std::string>
enum class popart::RequireOptimalSchedule

Values:

enumerator Yes = true
enumerator No = false
class Graph

Public Types

enum class CopyInputMarkings

Values:

enumerator Yes = 1
enumerator No = 0
enum class CopyOutputMarkings

Values:

enumerator Yes = 1
enumerator No = 0

Public Functions

Graph(Ir&, const GraphId&)
~Graph()
Graph() = delete
Graph(const Graph&) = delete
const std::map<OpId, std::unique_ptr<Op>> &getOps() const
std::map<OpId, std::unique_ptr<Op>> &getOps()
std::vector<OpId> getOpIds() const
const std::set<int64_t> getAllVirtualGraphIds(bool includeInvalid) const
const std::map<int64_t, int> getVirtualGraphCounts() const
Op *getOp(OpId opId) const

Return a pointer to the Op if it exists.

Throws an error if the Op could not be found.

See also

getOpUnsafe

Parameters

opId – The unique ID of the Op to find

Returns

The Op pointer if found

Op *getOpUnsafe(OpId opId) const

Returns a pointer to the Op if it exists, or nullptr otherwise.

See also

getOp

Parameters

opId – The unique ID of the Op to find

Returns

The Op pointer if found, or nullptr otherwise
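
For example, a minimal sketch showing when to prefer each accessor:

// Throws if opId is not present in this graph.
popart::Op *op = graph.getOp(opId);

// Returns nullptr instead of throwing, so absence can be handled locally.
if (popart::Op *maybeOp = graph.getOpUnsafe(opId)) {
  // Use maybeOp.
}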

const Tensors &getTensors() const
Tensors &getTensors()
Tensor *getTensor(const TensorId&)
void addActGrad(const TensorId&)
void addVarInit(const TensorId &name, const TensorInfo &info, const void *src, const DebugContext &debugContext)

Add a variable to this graph with the provided properties.

Parameters
  • name – The name of the variable.

  • info – The tensor info to create the variable with, including shape and data type.

  • src – The data to initialise the tensor with.

  • debugContext – The debug context to assist with debugging.
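
For example, a minimal sketch adding a 2x2 float variable to a graph. The names are illustrative:

std::vector<float> data{1.0f, 2.0f, 3.0f, 4.0f};
popart::TensorInfo info(popart::DataType::FLOAT, {2, 2});
graph.addVarInit("w0", info, data.data(), popart::DebugContext("w0"));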

void addVarInit(const TensorId &name, const TensorInfo &info, const void *src, const VariableSettings &vs, const DebugContext &debugContext)

As per addVarInit, but passing a VariableSettings object to allow for grouped replicas.

See also

addVarInit(const TensorId &, const TensorInfo &, const void *, const DebugContext &)

Parameters
  • name – The name of the variable.

  • info – The tensor info to create the variable with, including shape and data type.

  • src – The data to initialise the tensor with.

  • vs – The variablesettings to use.

  • debugContext – The debug context to assist with debugging.

void addConstInit(const TensorId&, const TensorInfo&, const void*, const DebugContext&)
void addStream(const TensorId&, const TensorInfo&, const DebugContext&)
inline const Ir &getIr() const
inline Ir &getIr()
inline const TensorId &getLoss() const
inline void setLoss(const TensorId &loss_)
void constructFromOnnxGraph(const ONNX_NAMESPACE::GraphProto &onnx_graph)
Op *growFromNode(const Node &node)
OpId moveIntoGraph(std::unique_ptr<Op> op)
template<typename OP, typename ...Args>
OP *createOp(Args&&... args)
template<typename OP, typename ...Args>
OP *createConnectedOp(const std::map<InIndex, TensorId> &in, const std::map<OutIndex, TensorId> &out, Args&&... args)
std::vector<const Graph*> getCalledGraphs() const
template<typename T>
void connectInputs(const T &inContainer, OpId opId)
template<typename T>
void connectOutputs(const T &outContainer, OpId opId)
void connectInputsFromInputMapWrapper(const InputMapWrapper &in, OpId id)
void connectOutputsFromOutputMapWrapper(const OutputMapWrapper&, OpId opId)
std::map<int, std::unique_ptr<popart::Op>>::iterator eraseOp(OpId id)
void setVarUpdateConstraints()
void setConvFlipWeightConstraints()
std::vector<Op*> getOpSchedule(const OpsBeforeKey&, RequireOptimalSchedule requireOptimalSchedule) const
void freezeSchedule(const OpsBeforeKey &gCons)
bool isSchedulable(const OpsBeforeKey&, bool respectExecutionPhases = false) const
bool hasUserRecomputeOps() const
std::vector<OpSet> getLiveSets(const std::vector<Op*> &topoOps) const
inline const std::vector<TensorId> &getInputIds() const
InIndex getInputIndex(TensorId id) const

Get the index of the graph input with a specific id.

If the id is not a valid input id then an error will be raised.

Parameters

id – Tensor name to find the index for.

Returns

The input index for the specified id, if it exists.

void addInput(const InIndex &index, const TensorId &id, const TensorInfo &info, bool overwrite)

Add a graph input at a specific index in the list.

Parameters
  • index – Force the input to be at the specified index in the graph.

  • id – Tensor name to create and connect

  • info – Tensor info

  • overwrite – Overwrites any existing input at the index if true, otherwise, moves all other inputs by one position

void addInput(const TensorId &id, const TensorInfo &info)

Add a graph input to the end of the list.

Parameters
  • id – Tensor name to create and connect

  • info – Tensor info

void markAsInput(const TensorId&)
TensorId addInput(const TensorInfo&)
Tensor *getInputTensor(InIndex idx) const
inline TensorId getInputId(InIndex idx) const
bool hasInputId(const TensorId &id) const
void removeInput(const TensorId&)
void removeInput(const InIndex&)
inline const std::vector<TensorId> &getOutputIds() const
OutIndex getOutputIndex(TensorId id) const
void markAsOutput(const OutIndex &index, const TensorId &id, bool overwrite)

Mark a graph tensor as graph output at a specific index in the list.

Parameters
  • index – Force the output to be at the specified index in the graph. Overwrites any existing output at the index.

  • id – Tensor in the graph to mark as output

  • overwrite – Overwrites any existing output at the index if true, otherwise, moves all other outputs by one position

void markAsOutput(const TensorId &id)

Mark a graph tensor as graph output at the end of the list.

Parameters

id – Tensor in the graph to mark as output

void removeOutput(const TensorId&)
void removeOutput(const OutIndex&)
inline TensorId getOutputId(OutIndex idx) const
bool hasOutputId(const TensorId &id) const
Tensor *getOutputTensor(OutIndex idx) const
Scope getScope() const
void replaceTensor(const TensorId &oldId, const TensorId &newId)

Replace oldId with newId on any consumers.

Both tensors need to exist.

Parameters
  • oldId – Tensor to disconnect from consumers and graph outputs.

  • newId – Tensor to connect to consumers and graph outputs.

std::vector<Op*> getCallSiteOps() const
std::vector<Op*> getCallSiteOps(size_t num) const
std::map<OpId, std::unordered_set<OpId>> getEdgeMap() const
inline const std::string &getGraphId() const
std::string getGraphString() const
void copyFrom(const Graph &other, CopyInputMarkings copyInputMarkings = CopyInputMarkings::Yes, CopyOutputMarkings copyOutputMarkings = CopyOutputMarkings::Yes)
std::pair<bool, std::vector<Op*>> getDirectViewChain(Tensor *from, Tensor *to)

Find a chain of view-changing ops in the graph from “from” to “to” (if one exists) and return a vector of ops such that op1(op2(…opN(in))) = out for {op1, op2, …, opN}.

If no such chain exists, returns {false, {}};

Parameters
  • from – The tensor to start at

  • to – The tensor to finish at

Returns

std::pair<bool, std::vector<Op *>>: the first element of the pair is a bool indicating whether the chain exists; the second is the vector of ops in order from ‘from’ to ‘to’. Given that the ops are 1-in-1-out, this will also be in schedule order.

void setOnnxToOnnx(std::unique_ptr<onnxpasses::IOnnxToOnnx>)

Set the object which will perform the ONNX -> ONNX transformation, which happens early on in the Graph constructor.

The default object, which is used if this method is not called, is an instance of the onnxpasses::Canonnxalizer class, which performs a set of required transformations, such as decomposing ASinh into more basic Nodes.

void finalizeSchedule()

Finalizes the graph schedule.

The schedule cannot change after this is called. Calling finalizeSchedule() multiple times results in an error.

inline void removeIsolatedTensors(bool retainUsedIOTensors = false, bool retainAllIOTensors = false, bool retainVarTensors = false, bool retainConstTensors = false)
inline bool canBeRecursivelyAutodiffed() const

If this graph X is called in graph Y, when applying autodiff to Y, is it safe to autodiff X?

inline void setCanBeRecursivelyAutodiffed(bool value)

Public Members

std::unique_ptr<TopoCons> topoCons
const GraphId id

Public Static Attributes

static const int64_t NoVGraph
class AiOnnxMlOpset1 : public popart::DomainOpSet

Class that represents the AI ONNX ML opset.

Public Functions

inline AiOnnxMlOpset1(std::unique_ptr<BuilderImpl> &impl_)

Constructor for the AiOnnxMlOpset1 class.

Parameters

impl_ – A pointer to an implementation of the Builder class.

class AiGraphcoreOpset1 : public popart::DomainOpSet

Class that represents the AI Graphcore opset.

Public Functions

inline AiGraphcoreOpset1(std::unique_ptr<BuilderImpl> &impl_)

Constructor for the AiGraphcoreOpset1 class.

Parameters

impl_ – A pointer to an implementation of the Builder class.

TensorId copyvarupdate(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Copies a tensor to an initialised tensor (variable).

This is used to update an initialised tensor (a variable created using addInitializedInputTensor()) which retains its value between iterations, by setting its value to the value of another tensor (the updater). The purpose is to manually update the tensor in use cases for variables other than trained parameters (weights) or tensors used by other ops.

Parameters
  • args – A vector of input tensor ids containing the tensor to be updated, tensor, and the tensor containing the values for the update, updater, as [tensor, updater].

  • debugContext – Optional debug information.

Returns

An alias to the updated variable: to ensure correct ordering of the updated variable, you should use this variable for any op which should operate on the updated variable.
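
For example, a minimal sketch that overwrites a variable with another tensor’s value. Here counter is an assumed variable tensor id and newValue an assumed updater tensor id:

auto gc = builder->aiGraphcoreOpset1();
auto updated = gc.copyvarupdate({counter, newValue});
// Use `updated` (not `counter`) in any op that must see the new value.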

std::vector<TensorId> batchnormalization(const std::vector<TensorId> &args, unsigned num_outputs, float epsilon = 1e-05f, float momentum = 0.9f, const popart::DebugContext &debugContext = {})

Add a batch normalization operation to the model.

This version uses N-1 as the population size for calculating the running variance (as PyTorch BatchNorm1d does), whereas the ONNX version uses N.

Parameters
  • args – List of input tensor ids

  • num_outputs – The number of output tensor ids

  • epsilon – The ‘epsilon’ attribute

  • momentum – The ‘momentum’ attribute

  • debugContext – Optional debug information.

Returns

A list of normalized output tensors

std::vector<TensorId> groupnormalization(const std::vector<TensorId> &args, int64_t num_groups, float epsilon = 1e-05f, const DebugContext &debugContext = {})

Add a group normalization operation to the model.

This is a Poplar extension.

The group will be created from a strided input.

Parameters
  • args – A vector of input tensor ids for input data x, scale scale, and bias bias as [x, scale, bias].

  • num_groups – The number of groups to separate the channels into.

  • epsilon – The epsilon value to use to avoid division by zero.

  • debugContext – Optional debug information.

Returns

A vector of output tensor ids for output data y, the mean mean and the variance var as [y, mean, var].

std::vector<TensorId> multiconv(const MultiConvInputs &tensors, const MultiConvDilations &dilations = {}, const MultiConvDilations &inDilations = {}, const MultiConvPads &pads = {}, const MultiConvPads &outPads = {}, const MultiConvStrides &strides = {}, const std::vector<float> &availableMemoryProportions = {}, const std::vector<std::string> &partialsTypes = {}, const nonstd::optional<std::string> planType = nonstd::nullopt, const nonstd::optional<int> perConvReservedTiles = nonstd::nullopt, const nonstd::optional<float> cycleBackOff = nonstd::nullopt, const std::vector<int64_t> enableConvDithering = {}, const DebugContext &debugContext = {})

Add a multi-convolution operation to the model.

Using this multi-convolution API ensures that the convolutions are executed in parallel on the device.

Functionally, a multi-convolution is equivalent to a series of single convolutions. Using this multi-convolution API is always equivalent to calling the single-convolution API (conv) once for each argument.

For example, calling:

A0 = conv({X0, W0, B0})
A1 = conv({X1, W1})

is functionally equivalent to calling:

{A0, A1} = multiconv({{X0, W0, B0}, {X1, W1}}).

It is possible that any two convolutions cannot be executed in parallel due to topological constraints. For example, the following:

B = conv({A, W0});
C = B + A
D = conv({C, W1});

cannot be converted to:

{B, D} = multiconv({{A, W0}, {C, W1}}).

Note that it is not possible to create such a cycle by adding a multi-convolution with this API.

Calls to multiconv() are mapped to poplin::multiconv::convolution().

All input vectors must be either empty, or equal in length to the number of convolutions. Note that groups for each convolution are automatically inferred from the shapes of the data and weight inputs.

See also

Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for some practical examples of using availableMemoryProportion.

Parameters
  • tensors – List of tensor ids for input tensors for data, weights and biases as [data, weight,bias] for each convolution. bias is optional.

  • dilations – The dilations attributes for each convolution.

  • inDilations – The input dilations attributes for each convolution.

  • pads – The pads for each convolution.

  • outPads – The output padding for each convolution.

  • strides – The strides for each convolution.

  • availableMemoryProportions – The available memory proportions per convolution, each [0, 1).

  • partialsTypes – The partials type per convolution.

  • planType – Run convolutions in parallel or series.

  • perConvReservedTiles – The number of tiles to reserve per convolution when planning.

  • cycleBackOff – Cycle back-off proportion, [0, 1).

  • enableConvDithering – Enable convolution dithering per convolution. If true, then convolutions with different parameters will be laid out from different tiles in an effort to improve tile balance in models.

  • debugContext – Optional debug information.

Returns

A vector of tensor ids of the output tensor from each convolution.

TensorId subsample(const std::vector<TensorId> &args, const std::vector<int64_t> &strides, const DebugContext &debugContext = {})

Add a sub-sample operation to the model.

This is a Poplar extension.

If multiple tensors are provided, the strides will be applied to them all.

Parameters
  • args – A vector of tensor ids to sub-sample.

  • strides – The strides to use.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId printtensor(const std::vector<TensorId> &args, int64_t print_gradient = 1, const DebugContext &debugContext = {}, const std::string &title = {}, const int summariseThreshold = 1000, const int edgeItems = 3, const int maxLineWidth = 75, const int digits = 8, const int floatFormat = 0, const char separator = ' ', const char openBracket = '[', const char closeBracket = ']')

Add a print tensor operation to the model.

This is a Poplar extension.

Parameters
  • args – A vector of tensor ids to print.

  • print_gradient – Indicates whether the gradient tensor(s) associated with the input tensor(s) are also printed. If 1, the gradient tensor(s) are also printed, otherwise the gradient tensor(s) are not printed.

  • debugContext – Optional debug information.

  • title – An optional title to print.

  • summariseThreshold – (default 1000) If the number of elements of the tensor exceeds this threshold the output will be summarised. Only the edge elements will be displayed with an ellipsis indicating skipped elements. A value of 0 will disable summarisation.

  • edgeItems – (default 3) The number of edge elements to include at the beginning and end when summarisation is enabled.

  • maxLineWidth – (default 75) Lines longer than this limit will be split across multiple lines. A value of 0 will disable line splitting.

  • digits – (default 8) The number of digits to display. For integers this limit can be exceeded if any number is large enough. For floating points this does not include the exponent. The number of digits is used, in conjunction with an analysis of the tensor, to determine the width of each element so that all elements are aligned when printed. A value of 0 disables this analysis and each element will be printed in an unaligned format.

  • floatFormat – (default 0=auto) Determines the floating point format to use: 0=auto, 1=fixed, 2=scientific, 3=none. Automatic mode determines the appropriate format based on the data. If digits==0 this option is disregarded and the floatFormat is set to none.

  • separator – (default space) The character used to delineate values.

  • openBracket – (default square bracket) The character used to open a tensor.

  • closeBracket – (default square bracket) The character used to close a tensor.

Returns

The tensor id of the result tensor.

TensorId nop(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a no-op operation to the model.

Parameters
  • args – A vector of input tensor ids.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId normalize_image(const std::vector<TensorId> &args, float scale, const DebugContext &debugContext = {})

Normalize image and pad it from 3 channels to 4 channels.

The input channel must be in the last dimension.

Parameters
  • args – A vector of input tensor ids containing the image input and the offsets and scales tensors, as required by PopLibs.

  • scale – The scale to apply.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId scale(const std::vector<TensorId> &args, float scale, const DebugContext &debugContext = {})

Add a scale operation to the model.

This is a Poplar extension.

Parameters
  • args – A vector of input tensor ids.

  • scale – The scale to apply.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId scaledadd(const std::vector<TensorId> &args, float scale0, float scale1, const DebugContext &debugContext = {})

Add a scaled add operation to the model.

The scaled add operation takes the form:

X = scale0 * T0 + scale1 * T1

where scale0 is the scale factor to be applied to tensor T0 and scale1 is the scale factor to be applied to tensor T1.

Parameters
  • args – A vector of input tensor ids: [T0, T1, scale0, scale1].

  • scale0 – The scale to apply (if no scale0 tensor is supplied).

  • scale1 – The scale to apply (if no scale1 tensor is supplied).

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
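
For example, a minimal sketch computing X = 0.5 * T0 + 2.0 * T1, where t0 and t1 are assumed tensor ids and gc an AiGraphcoreOpset1 obtained from the builder:

auto x = gc.scaledadd({t0, t1}, 0.5f, 2.0f);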

std::vector<TensorId> lstm(const std::vector<TensorId> &args, int64_t outputFullSequence, const DebugContext &debugContext = {})
TensorId gelu(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a GELU operation to the model.

This is a Poplar extension.

Parameters
  • args – A vector of input tensor ids.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId geluerf(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an accurate GELU (ERF instead of TANH) operation to the model.

Parameters
  • args – A vector of input tensor IDs.

  • debugContext – Optional debug information.

Returns

The tensor ID of the result tensor.

TensorId detach(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a detach operation to the model.

Parameters
  • args – A vector of input tensor ids.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId depthtospace(const std::vector<TensorId> &args, int64_t blocksize, const std::string &mode = "DCR", const DebugContext &debugContext = {})

Add a depth-to-space operation to the model.

This allows DepthToSpace_11 to be targeted from earlier opsets.

The purpose of a depth-to-space operation, also known as pixel shuffling, is to rearrange data from the depth (channels) dimension into the spatial (width and height) dimensions. It is an efficient means of learning upsampling alongside mixing convolution with bilinear interpolation and using transpose convolution.

Parameters
  • args – A vector containing a single tensor id of the input tensor of shape [N,C,H,W], where N is the batch axis, C is the channel or depth, H is the height and W is the width.

  • blocksize – The size of the blocks to be moved. If the input is [N, C, H, W] and the blocksize is B, the output will be [N, C/(B*B), H*B, W*B].

  • mode – Specifies how the data is rearranged:

    • ”DCR” (Default): depth-column-row order

    • ”CRD”: column-row-depth order

  • debugContext – Optional debug information.

Returns

A tensor which is a rearrangement of the input tensor.

TensorId round(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a rounding operation to the model.

This allows Round_11 to be targeted from earlier opsets.

Parameters
  • args – A vector of input tensor ids.

  • debugContext – Optional debug information.

Returns

The normalized output tensor ids.

TensorId init(Attributes::Ints shape, Attributes::Int data_type, Attributes::Int init_type, Attributes::Int batch_axis, const DebugContext &debugContext = {})

Add an init operation to the model.

Parameters
  • shape – The shape of the tensor to initialise.

  • data_type – The data type to initialise tensor with. The value is the integer attribute taken from the DataType enum.

  • init_type – The mode of the tensor initialisation. The value is the integer attribute taken from the InitType enum.

  • batch_axis – Batch axis specifies the axis that the batches are split along and is a literal integer.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId init(Attributes::Ints shape, Attributes::Int data_type, Attributes::Int init_type, const DebugContext &debugContext = {})

Add an init operation to the model.

Parameters
  • shape – The shape of the tensor to initialise.

  • data_type – The data type to initialise tensor with. The value is the integer attribute taken from the DataType enum.

  • init_type – The mode of the tensor initialisation. The value is the integer attribute taken from the InitType enum.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId dynamicslice(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, Attributes::Int noOverlap, const DebugContext &debugContext = {})

Add a dynamic slice operation to the model.

Creates a new slice tensor, slice, at offset position, offset, in a tensor, tensor. For example:

slice = tensor[offset]

Parameters
  • args – A vector of input tensor ids: [tensor, offset].

  • axes – The axes along which to slice.

  • sizes – The size of the slice along each axis.

  • noOverlap – Indicates whether the slice regions overlap or not. If 1, slice regions do not overlap, otherwise they do overlap.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
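
For example, a minimal sketch taking one row along axis 0. Here t is an assumed tensor id of shape [N, 4], off an assumed offset tensor id, and gc an AiGraphcoreOpset1 obtained from the builder:

// axes = {0}, sizes = {1}: slice one element along the outermost axis.
auto row = gc.dynamicslice({t, off}, {0}, {1}, /*noOverlap=*/1);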

TensorId dynamicupdate(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, Attributes::Int noOverlap, const DebugContext &debugContext = {})

Add a dynamic update operation to the model.

Creates a copy of a tensor, tensor, and updates the elements of the copied tensor at offset position, offset, with the elements contained in the slice tensor, slice. For example:

out = tensor
out[offset] = slice

Parameters
  • args – A vector of input tensor ids: [tensor, offset, slice].

  • axes – The axes along which to update.

  • sizes – The size of the slice along each axis.

  • noOverlap – Indicates whether the updates overlap or not. If 1, the updates do not overlap, otherwise they do overlap.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId dynamiczero(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, const DebugContext &debugContext = {})

Add a dynamic zero operation to the model.

Creates a copy of a tensor, tensor, with a slice tensor at offset position, offset, set to zero. For example:

out = tensor
out[offset] = 0.0

Parameters
  • args – A vector of input tensor ids: [tensor, offset].

  • axes – The axes along which to zero elements.

  • sizes – The size of the slice along each axis.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId dynamicadd(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, const DebugContext &debugContext = {})

Add a dynamic add operation to the model.

Creates a copy of a tensor, tensor, with a slice tensor, slice, added at an offset position, offset. For example:

out = tensor
out[offset] += slice

Parameters
  • args – A vector of input tensor ids: [tensor, offset, slice].

  • axes – The axes along which to add the slice.

  • sizes – The size of the slice along each axis.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId sequenceslice(const std::vector<TensorId> &args, Attributes::Int zeroUnused, const DebugContext &debugContext = {})

Slice a 2D tensor based on offsets.

The outermost dimension is sliced. For the following:

  • source is the source tensor.

  • destination is the destination tensor.

  • N is the number of elements to copy.

  • sourceOffset is the first element read from the source tensor.

  • destinationOffset is the first element written to in the destination tensor.

Then, for each entry in N, sourceOffset and destinationOffset:

    destination[destinationOffset:destinationOffset+N][...] =
    source[sourceOffset:sourceOffset+N][...]


Entries after the first N==0 may be ignored. Unreferenced elements of destination are zeroed if zeroUnused is set. The same output element should not be written by multiple inputs.

source and destination must have rank greater than or equal to 2. The outer dimension is sliced; the product of the inner dimensions must match. sourceOffset, destinationOffset and N must be 1-dimensional and of the same size. For example:

N = [1, 1, 1]
sourceOffset = [0, 2, 4]
destinationOffset = [0, 1, 2]

Parameters
  • args – A vector of input tensor ids for the following tensors [source, destination, N, sourceOffset, destinationOffset].

  • zeroUnused – Determines whether to zero unreferenced destination elements. If 1, the unreferenced elements are zeroed, otherwise they are not zeroed.

  • debugContext – Optional debug information.
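
Returns

The tensor id of the result tensor.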

std::vector<TensorId> call(const std::vector<TensorId> &args, unsigned num_outputs, const Builder &callee, const DebugContext &debugContext = {})

Add a call operation to the model.

This is a Poplar extension, to expose manual code re-use to the builder.

Parameters
  • args – A vector of input tensor ids.

  • num_outputs – The number of output tensors of the subgraph.

  • callee – The subgraph to call into.

  • debugContext – Optional debug information.

Returns

A vector of tensors; the subgraph outputs.
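
As a hedged sketch (assuming builder is a popart::Builder and a and b are existing tensor ids), a subgraph can be constructed with a subgraph builder and then invoked with call:

// Build a small "add" subgraph once, so it can be re-used.
popart::Builder &sub = builder->createSubgraphBuilder();
auto x = sub.addUntypedInputTensor();
auto y = sub.addUntypedInputTensor();
auto z = sub.aiOnnxOpset9().add({x, y});
sub.addOutputTensor(z);

// Invoke the subgraph on tensors a and b, expecting one output.
auto outs = builder->aiGraphcoreOpset1().call({a, b}, /*num_outputs=*/1, sub);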

TensorId replicatedallreduce(const std::vector<TensorId> &args, const nonstd::optional<std::vector<int64_t>> &commGroup = nonstd::nullopt, const DebugContext &debugContext = {})

DEPRECATED: Add a replicated allreduce operation to the model.

This is a Poplar extension.

Parameters
  • args – A vector of input tensor ids to reduce across.

  • commGroup – GCL CommGroup parameter.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId replicatedallreduce(const std::vector<TensorId> &args, const nonstd::optional<CollectiveOperator> &collectiveOperator = nonstd::nullopt, const nonstd::optional<CommGroup> &commGroup = nonstd::nullopt, const DebugContext &debugContext = {})

Add a replicated allreduce operation to the model.

This is a Poplar extension.

Parameters
  • args – A vector of input tensor ids to reduce across.

  • collectiveOperator – A Graphcore Communication Library (GCL) collective operator.

  • commGroup – A GCL CommGroup parameter.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
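
As a minimal sketch (assuming builder is a popart::Builder, grad is an existing tensor id, and that a CommGroup of type All with group size 0 denotes all replicas), a sum across all replicas might be added as:

auto reduced = builder->aiGraphcoreOpset1().replicatedallreduce(
    {grad},
    popart::CollectiveOperator::Add,
    popart::CommGroup(popart::CommGroupType::All, 0));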

TensorId replicatedreducescatter(const std::vector<TensorId> &args, const nonstd::optional<CollectiveOperator> &collectiveOperator = nonstd::nullopt, const nonstd::optional<CommGroup> &commGroup = nonstd::nullopt, const DebugContext &debugContext = {})

Add a replicated reduce-scatter operation to the model.

This is a Poplar extension.

Parameters
  • args – A vector of input tensor ids to reduce across.

  • collectiveOperator – A Graphcore Communication Library (GCL) collective operator.

  • commGroup – A GCL CommGroup parameter.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId l1loss(const std::vector<TensorId> &args, const float lambda, const ReductionType reduction = ReductionType::Mean, const DebugContext &debugContext = {})

Add an l1 loss operation to the model.

Calculates the mean absolute error between each element in the input and a target of zero.

Parameters
  • args – A vector of input tensor ids.

  • lambda – The scale factor of the L1 loss.

  • reduction – The type of reduction to perform on the individual losses.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId nllloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const nonstd::optional<int> ignoreIndex = nonstd::nullopt, bool inputIsLogProbability = false, const DebugContext &debugContext = {})

Add a negative log-likelihood loss operation to the model.

Calculates the negative log likelihood (NLL) loss given a probability tensor over classes, and a target tensor containing class labels.

Parameters
  • args – A vector of input tensor ids: the probability tensor and the label tensor.

  • reduction – The type of reduction to perform on the individual losses.

  • ignoreIndex – Optional class index to ignore in loss calculation.

  • inputIsLogProbability – If true, the input tensor contains log-probabilities; otherwise, raw probabilities. Default = false.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
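
For example, a minimal sketch (assuming builder is a popart::Builder, logProbs holds log-probabilities of shape [batchSize, numClasses] and labels holds class indices of shape [batchSize]):

auto loss = builder->aiGraphcoreOpset1().nllloss(
    {logProbs, labels},
    popart::ReductionType::Mean,
    nonstd::nullopt, // ignoreIndex: no class is ignored
    true);           // inputIsLogProbability: inputs are log-probabilities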

TensorId identityloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const DebugContext &debugContext = {})

Add an identity loss operation to the model.

Calculates the loss using the identity operator.

Parameters
  • args – A vector of input tensor ids.

  • reduction – The type of reduction to perform on the individual losses.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId tensorremap(const std::vector<TensorId> &args, Attributes::Int remap_type, const DebugContext &debugContext = {})

Add a tensor remap operation to the model.

Changes the tensor layout to conform to the downstream consumers, which means the consumers can read the tensor without having to rearrange it.

Parameters
  • args – The tensor id of the tensor to remap. This is a single tensor that should be copied to a new tensor with a tensor layout conforming to the downstream consumer.

  • remap_type – The type of remap to perform on the forward/backward pass. Backward pass remapping requires the op to exist in the IR before autodiff. The value is the integer attribute value of the enum TensorRemapType.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId ctcloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const unsigned blank = 0, const std::string &outDataType = "UNDEFINED", const bool zeroInfinity = false, const DebugContext &debugContext = {})

Add a connectionist temporal classification (CTC) loss operation to the model.

With maximum input length T, batch size N, number of classes C and maximum target length S, this op calculates the CTC loss for a logarithmised probabilities tensor with shape [T, N, C], a class target tensor with shape [N, S], an input lengths tensor [N] and a target lengths tensor [N].

Note that C includes a blank class (default=0). The probabilities tensor is padded as required. Target sequences are also padded and are populated with class values less than C, not including the blank class, up to their respective target lengths. Note that target lengths cannot exceed input lengths.

Parameters
  • args – A vector of input tensor ids: [log_probs, targets, input_lengths, target_lengths].

  • reduction – The type of reduction to perform on the individual losses.

  • blank – The integer representing the blank class.

  • outDataType – The data type of the output tensors. Default = UNDEFINED.

  • zeroInfinity – If true, infinite losses and the associated gradients are zeroed out. Default = false.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
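
For example, a minimal sketch (assuming builder is a popart::Builder and the four input tensor ids already exist with the documented shapes, for example logProbs of shape [T, N, C], targets of shape [N, S], and the two length tensors of shape [N]):

auto ctc = builder->aiGraphcoreOpset1().ctcloss(
    {logProbs, targets, inputLengths, targetLengths},
    popart::ReductionType::Mean,
    /*blank=*/0); // class 0 is the blank class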

std::vector<TensorId> _ctcloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const unsigned blank = 0, const std::string &outDataType = "UNDEFINED", const bool zeroInfinity = false, const DebugContext &debugContext = {})
std::vector<TensorId> ctcbeamsearchdecoder(const std::vector<TensorId> &args, unsigned blank = 0, unsigned beamWidth = 100, unsigned topPaths = 1, const DebugContext &debugContext = {})

Add a connectionist temporal classification (CTC) beam search decoder operation to the model.

Calculate the most likely topPaths labels and their probabilities given the input logProbs with lengths dataLengths.

Parameters
  • args – A vector of input tensor ids. These are [logProbs, dataLengths], where logProbs is of shape [maxTime, batchSize, numClasses], and dataLengths is of shape [batchSize].

  • blank – The integer representing the blank class.

  • beamWidth – The number of beams to use when decoding.

  • topPaths – The number of most likely decoded paths to return, must be less than or equal to beamWidth.

  • debugContext – Optional debug information.

Returns

The names of the result tensors. These are [labelProbs, labelLengths, decodedLabels], where labelProbs is of shape [batchSize, topPaths], labelLengths is of shape [batchSize, topPaths], and decodedLabels is of shape [batchSize, topPaths, maxTime].

TensorId shapeddropout(const std::vector<TensorId> &args, const std::vector<int64_t> &shape, float ratio = 0.5f, const DebugContext &debugContext = {})

Add a shaped dropout operation to the model.

Applies a shaped dropout to the input tensor. This operator requires a shape parameter that is used to define the shape of the dropout mask so that strongly correlated features in the input tensor can be preserved. The provided shape must be broadcastable to the input tensor. Note that this operation targets the poprand library function of the same name.

Parameters
  • args – A vector of input tensor ids.

  • shape – The shape of dropout mask. This must be broadcastable to the input.

  • ratio – The probability of dropping an input feature. Default = 0.5.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
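
For example, a minimal sketch (assuming builder is a popart::Builder and acts is an existing tensor id of shape [N, 64, H, W]); a [1, 64, 1, 1] mask drops whole channels at once:

auto dropped = builder->aiGraphcoreOpset1().shapeddropout(
    {acts},
    /*shape=*/{1, 64, 1, 1}, // broadcastable to the input shape
    /*ratio=*/0.1f);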

TensorId atan2(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an atan2 operation to the model.

Returns the element-wise angle \( \theta \) as a tensor, with \( -\pi < \theta \le \pi \), such that for two input tensors \(x\) and \(y\), and given \( r \ne 0 \), \( x = r \cos\theta \) and \( y = r \sin\theta \) hold element-wise.

In the case of \( x > 0 \), \( \theta = \arctan(y/x) \).

Parameters
  • args – A vector of input tensor ids: [y, x].

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId expm1(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an expm1 operation to the model.

This calculates the element-wise exponential of the input tensor and subtracts one: \( exp(x) - 1 \).

Parameters
  • args – A vector of input tensor ids.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId log1p(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a log1p operation to the model.

This calculates the element-wise logarithm of the input tensor plus one: \( log(x + 1) \).

Parameters
  • args – A vector of input tensor ids.

  • name – Optional identifier for operation.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId reshape(const TensorId &arg, const Attributes::Ints &shape, const DebugContext &debugContext = {})

Add a reshape operation to the model.

This reshapes an input tensor. Unlike the ONNX reshape op, this reshape takes the target shape as an attribute rather than as a tensor input.

Parameters
  • arg – The tensor id of the input tensor.

  • shape – The shape of the output tensor. The output tensor must contain the same number of elements as the input tensor.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId fmod(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an fmod operation to the model.

This is equivalent to the C fmod function. The result has the same sign as the dividend.

Parameters
  • args – A vector of input tensor ids.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor, containing the element-wise remainder of division. The remainder has the same sign as the dividend.

TensorId remainder(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a remainder operation to the model.

This is equivalent to Python’s modulo operator %. The result has the same sign as the divisor.

Parameters
  • args – A vector of input tensor ids.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor, containing the element-wise remainder of division. The remainder has the same sign as the divisor.
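
The sign difference is the only distinction between fmod and remainder. As an illustrative sketch (assuming builder is a popart::Builder and a and b are existing tensor ids), for element values -7 and 3:

auto r1 = builder->aiGraphcoreOpset1().fmod({a, b});      // -7 fmod 3 == -1 (sign of dividend)
auto r2 = builder->aiGraphcoreOpset1().remainder({a, b}); // -7 mod 3  ==  2 (sign of divisor)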

TensorId reverse(const std::vector<TensorId> &args, const std::vector<int64_t> &dimensions, const DebugContext &debugContext = {})

Add a reverse operator to the model.

This reverses or flips the tensor along the specified dimensions.

Parameters
  • args – A vector of input tensor ids.

  • dimensions – The dimensions along which to reverse the tensor. If this is empty then this is equivalent to the identity operator.

  • debugContext – Optional debug information.

Returns

The tensor id of the reversed tensor.

TensorId slice(const std::vector<TensorId> &args, const std::vector<int64_t> &ends, const std::vector<int64_t> &starts, const std::vector<int64_t> &axes = std::vector<int64_t>(), const popart::DebugContext &debugContext = {})

Add a slice to the model.

This version of slice uses the starts, ends and axes attributes rather than tensor inputs. This reduces the number of ops as constant tensors are treated as ops while attributes are not.

Parameters
  • args – A vector of input tensor ids.

  • ends – The ends attribute.

  • starts – The starts attribute.

  • axes – The axes attribute.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
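
For example, a minimal sketch (assuming builder is a popart::Builder and t is an existing tensor id) selecting rows 1 and 2 along axis 0; note that ends precedes starts in this overload:

auto s = builder->aiGraphcoreOpset1().slice(
    {t}, /*ends=*/{3}, /*starts=*/{1}, /*axes=*/{0});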

TensorId packedDataBlock(const std::vector<TensorId> &args, const std::vector<int64_t> &maxSequenceLengths, int64_t resultSize, int64_t callbackBatchSize, const Builder &callback, const DebugContext &debugContext = {})

Add a packedDataBlock operator to the model.

Unpack packed sequences of data and call the callback function on the unpacked sequences.

Parameters
  • args – A vector of input tensor ids.

  • maxSequenceLengths – The maximum length of a sequence in each of the data inputs.

  • resultSize – The size of the first dimension of the result tensor.

  • callbackBatchSize – The number of batches to pass to the callback.

  • callback – The callback function.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

void abort(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an abort operation to the model.

The operation can be conditional or unconditional.

Parameters
  • args – A vector of input tensor ids.

  • debugContext – Optional debug information.

TensorId bitwisenot(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise NOT operation to the model.

The operation computes the bitwise NOT of an integer tensor.

Parameters
  • args – An input tensor of type integer.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId bitwiseand(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise AND operation to the model.

The operation computes the bitwise AND of two integer tensors.

Parameters
  • args – Two broadcastable input tensors of type integer.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId bitwiseor(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise OR operation to the model.

The operation computes the bitwise OR of two integer tensors.

Parameters
  • args – Two broadcastable input tensors of type integer.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId bitwisexor(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise XOR operation to the model.

The operation computes the bitwise XOR of two integer tensors.

Parameters
  • args – Two broadcastable input tensors of type integer.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId bitwisexnor(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise XNOR operation to the model.

The operation computes the bitwise XNOR of two integer tensors.

Parameters
  • args – Two broadcastable input tensors of type integer.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

std::vector<TensorId> reducemedian(const std::vector<TensorId> &args, const nonstd::optional<std::vector<int64_t>> &axes = nonstd::nullopt, int64_t keepdims = 1, const DebugContext &debugContext = {})

Add reducemedian operation to the model.

This method computes the median values along the specified axes. In the case of an even number of elements, the lower of the two medians is selected. By default, the input tensor is reduced over all axes. The operation also returns the indices of the median values found in the reduction axis. If the reduction is performed over multiple axes, the indices are “flattened” over the reduced axes, similar to numpy.ndarray.flat. The index may not be that of the first occurrence of the median value in the input tensor.

Parameters
  • args – A vector with a single input tensor id.

  • axes – The axes over which the reduction is performed.

  • keepdims – If 1, the result tensors have the same rank as the input, but with the reduction axes reduced to size 1. Otherwise, the reduction axes are squeezed and the result tensors have fewer dimensions than the input. Default = 1.

  • debugContext – Optional debug information.

Returns

The names of the two result tensors, one for median values and one for indices.
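
For example, a minimal sketch (assuming builder is a popart::Builder and t is an existing tensor id of shape [4, 5]):

auto outs = builder->aiGraphcoreOpset1().reducemedian(
    {t}, std::vector<int64_t>{1}, /*keepdims=*/1);
auto medianValues  = outs[0]; // shape [4, 1]
auto medianIndices = outs[1]; // shape [4, 1], indices along axis 1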

TensorId groupedgather(const std::vector<TensorId> &args, Attributes::Int axis = 0, Attributes::Int group_size = 1, const DebugContext &debugContext = {})
TensorId groupedscatterreduce(const std::vector<TensorId> &args, Attributes::Int axis_size, Attributes::Int axis = -1, ScatterReduction reduction = ScatterReduction::Sum, Attributes::Int group_size = 1, Attributes::Int enable_index_broadcast = 1, const DebugContext &debugContext = {})

Add a grouped scatterreduce operation to the model.

Reduces all the values from the source tensor src at the indices specified along the given axis by index for each group. In some frameworks this is also known as a split-apply-combine operation as well as a reduce or aggregate by key. In this analogy the src input is the data we are splitting and the indices define the groups for the reduction operation.

In pseudocode the operator can be expressed as:

for g in range(group_size):
    for i in range(axis_size):
        output[g][i] = reduce(src[g][index == i])

where the looping over output indices is implicitly handled by poplar.

Parameters
  • args – A vector of tensor ids as [src, index, initial_values]. initial_values is optional and if omitted the output will be initialised based on the selected reduction type. For example, a tensor of zeros is used to initialise the output tensor for ScatterReduction::Sum.

  • axis_size – The size of the reduced axis.

  • axis – The axis to reduce along. Default = -1.

  • reduction – The type of reduction to apply. Default = ScatterReduction::Sum.

  • group_size – The number of groups to reduce. Default = 1.

  • enable_index_broadcast – If 1, index will be broadcast to match the size of the data tensor; otherwise (0) its size will remain unchanged. Default = 1.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId scatterreduce(const std::vector<TensorId> &args, Attributes::Int axis_size, Attributes::Int axis = -1, ScatterReduction reduction = ScatterReduction::Sum, Attributes::Int enable_index_broadcast = 1, const DebugContext &debugContext = {})

Add a scatterreduce operation to the model.

Reduces all the values from the source tensor src at the indices specified along the given axis by index. In some frameworks this is also known as a split-apply-combine operation as well as a reduce or aggregate by key. In this analogy the src input is the data we are splitting and the indices define the groups for the reduction operation.

In pseudocode the operator can be expressed as:

for i in range(axis_size):
    output[i] = reduce(src[index == i])

where the looping over output indices is implicitly handled by poplar.

Parameters
  • args – A vector of tensor ids as [src, index, initial_values]. initial_values is optional and if omitted the output will be initialised based on the selected reduction type. For example, a tensor of zeros is used to initialise the output tensor for ScatterReduction::Sum.

  • axis_size – The size of the reduced axis.

  • axis – The axis to reduce along. Default = -1.

  • reduction – The type of reduction to apply. Default = ScatterReduction::Sum.

  • enable_index_broadcast – If 1, index will be broadcast to match the size of the data tensor; otherwise (0) its size will remain unchanged. Default = 1.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
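
For example, a minimal sketch (assuming builder is a popart::Builder, src is an existing [6, 8] tensor id and index is an existing [6] tensor id): with index values [0, 0, 1, 2, 3, 3], output row i is the sum of the src rows whose index equals i, giving an output of shape [4, 8]:

auto out = builder->aiGraphcoreOpset1().scatterreduce(
    {src, index},
    /*axis_size=*/4,
    /*axis=*/0,
    popart::ScatterReduction::Sum);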

TensorId swish(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a swish operation to the model.

The operation computes the swish activation function, also known as the SiLU activation.

Parameters
  • args – A vector with a single input tensor id.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId incrementmod(const std::vector<TensorId> &args, Attributes::Float increment, Attributes::Float modulus, const DebugContext &debugContext = {})

Add an incrementmod operation to the model.

The operation is of the form y = (x + increment) % modulus.

Parameters
  • args – A vector with a single input tensor id.

  • increment – A scalar increment.

  • modulus – A scalar modulus.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
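
For example, a minimal sketch of a wrapping counter, y = (x + 1) % 16 (assuming builder is a popart::Builder and x is an existing scalar tensor id):

auto y = builder->aiGraphcoreOpset1().incrementmod(
    {x}, /*increment=*/1.0f, /*modulus=*/16.0f);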

TensorId bucketize(const std::vector<TensorId> &args, Attributes::Int right = 0, const DebugContext &debugContext = {})

Add a bucketize operation to the model.

The operation returns the indices of the buckets to which each value in the input tensor belongs. The ranges of each bucket are defined by the boundaries tensor. The returned index satisfies the following rules:

right == 1: boundaries[i-1] <= input[m][n]…[l][x] < boundaries[i]
right == 0: boundaries[i-1] < input[m][n]…[l][x] <= boundaries[i]

Parameters
  • args – A vector of tensor IDs containing [input, boundaries], where:

    • input is an N-D tensor or a scalar containing the search values.

    • boundaries is a 1-D tensor defining the ranges of the buckets. This must contain a monotonically increasing sequence.

  • right – If 1 then the left boundary is closed; if 0 (default) then the right boundary is closed.

  • debugContext – Optional debug information.

Returns

The tensor ID of the result tensor. The result tensor has the same size and shape as the input tensor.
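
For example, a minimal sketch (assuming builder is a popart::Builder, input is an existing tensor id and boundaries is an existing 1-D tensor id holding a monotonically increasing sequence such as [0, 10, 20]):

auto bucketIdx = builder->aiGraphcoreOpset1().bucketize(
    {input, boundaries}, /*right=*/0);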

std::vector<TensorId> sort(const std::vector<TensorId> &args, Attributes::Int axis = -1, Attributes::Int descending = 0, Attributes::Int stable = 0, const popart::DebugContext &debugContext = {})

Add a sort operation to the model.

Parameters
  • args – A vector with a single input tensor id.

  • axis – The dimension to sort along.

  • descending – If 1 then the elements are sorted in descending order by value.

  • stable – If 1 then the sorting routine becomes stable, preserving the order of equivalent elements.

  • debugContext – Optional debug information.

Returns

A vector of (values, indices) is returned, where the values are the sorted values and indices are the indices of the elements in the original input tensor.
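
For example, a minimal sketch of a stable descending sort along the last axis (assuming builder is a popart::Builder and t is an existing tensor id):

auto outs = builder->aiGraphcoreOpset1().sort(
    {t}, /*axis=*/-1, /*descending=*/1, /*stable=*/1);
auto values  = outs[0]; // the sorted values
auto indices = outs[1]; // positions of the sorted elements in the input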

TensorId nearbyint(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a nearby int rounding operation to the model.

Rounds the floating-point argument to an integer value in floating-point format.

Parameters
  • args – A vector of input tensor ids.

  • debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

std::vector<TensorId> splinebasis(const std::vector<TensorId> &args, Attributes::Int degree = 1, const DebugContext &debugContext = {})

Add a splinebasis operation to the model.

The operation returns two outputs: coefficients for the B-spline basis functions and weight indices for each spline coefficient.

Parameters
  • args – A vector of tensor IDs containing [pseudo, kernel_size, is_open_spline], where:

    • pseudo is a 2-D tensor with pseudo coordinates, of shape [numEdges * numDims].

    • kernel_size is a 1-D tensor containing the kernel size at each dimension of the edge pseudo coordinates.

    • is_open_spline is a 1-D tensor that for each dimension encodes whether an open or a closed B-spline basis function must be used.

  • degree – The degree of the B-spline basis function.

Returns

The basis and weightIndex tensors, both of shape [numEdges * numSplines]. basis contains the coefficients for the B-spline basis functions. weightIndex contains weight indices for each spline.

TensorId splineweighting(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a splineweighting operation to the model.

The operation returns features weighted by a continuous B-spline kernel function.

Parameters

args – A vector of tensor IDs containing [input, weight, basis, weightIndex], where:

  • input is a 2-D tensor (size: [numEdges * numInputChannels]) with input features.

  • weight is a 3-D tensor (size: [numEdges * numInputChannels * numOutputChannels]) containing weights for B-Spline functions.

  • basis is a 2-D tensor (size: [numEdges * numSplines]) of the coefficients for the B-spline basis functions and is produced by the splinebasis op.

  • weightIndex is a 2-D tensor (size: [numEdges * numSplines]) of the weight indices produced by the splinebasis op.

Returns

A tensor of shape [numEdges * numOutputChannels] containing features weighted by a continuous B-spline kernel function.
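
For example, a minimal sketch chaining the two spline ops (assuming builder is a popart::Builder and that pseudo, kernelSize, isOpenSpline, input and weight are existing tensor ids with the documented shapes):

auto bw = builder->aiGraphcoreOpset1().splinebasis(
    {pseudo, kernelSize, isOpenSpline}, /*degree=*/1);
auto weighted = builder->aiGraphcoreOpset1().splineweighting(
    {input, weight, bw[0], bw[1]}); // bw[0] = basis, bw[1] = weightIndex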

#include <popart/scope.hpp>
class Scope

Public Functions

inline bool empty() const
void pop()
Scope getCommonParent(const Scope&) const
inline size_t depth() const
bool operator==(const Scope&) const
bool operator!=(const Scope&) const
std::string str() const
Scope operator/(const std::string &name) const
inline operator std::string()
bool isSubscope(const Scope&) const
const std::vector<std::string> getScopeNames() const

Public Static Functions

static inline std::string delimiter()
static Scope getCommonParent(const std::vector<Op*>&)

14.6. Data flow

#include <popart/dataflow.hpp>
enum class popart::AnchorReturnTypeId

Class that defines the identifiers for the return type of the anchor tensors.

An anchor tensor is a tensor that the user wants returned after a call to Session::run(). Each call to Session::run() results in batchesPerStep x accumulationFactor x replicationFactor micro batches of anchor tensors being computed. The samples associated with each computation are called a micro batch. The dimensions are user-specified by the batchesPerStep, accumulationFactor and replicationFactor session parameters.

This enum type describes the strategy with which the micro batch values for anchor tensors (or their summaries) are written to the IStepIO instance passed to Session::run.

NOTE: Anchors are essentially what TensorFlow calls “fetches”.

See also

AnchorReturnType.

Values:

enumerator Final = 0

Only return the tensor value for the last micro batch of the Session::run call for each replica.

The buffer shape required for this anchor in IStepIO is [replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).

enumerator EveryN

Return the tensor value for every N-th global batch for each replica and for all accumulation steps in that global batch.

Note that the value of N is captured by AnchorReturnType.

The buffer shape required for this anchor in IStepIO is [batchesPerStep / N, accumulationFactor, replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).

enumerator All

Return the tensor value for all micro batches for each replica.

The buffer shape required for this anchor in IStepIO is [batchesPerStep, accumulationFactor, replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).

enumerator Sum

Return one tensor value for each replica, doing a sum reduction over the batchesPerStep and accumulationFactor dimensions.

The buffer shape required for this anchor in IStepIO is [replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).

enum class popart::ExchangeStrategy

Enum type to specify an exchange strategy.

JustInTime:

.- outer loop -------------.
|.- inner loop -----------.|
|| load - compute - store ||
|'-------------------------'|
'---------------------------'

OverlapInnerLoop:

  • Boxes denote subgraphs / subgraph Ops / loops

  • Inputs/outputs are loop carried in order

.- outer loop ----------------------------------------.
|                  .- inner loop -.                    |
| load - compute - | - store      |                    |
|           load - | - compute ---|- store             |
|                  |   load ------|- compute - store   |
|                  '--------------'                    |
'------------------------------------------------------'
         ^^^^^^^        ^^^^^^^        ^^^^^^^
         overlap        overlap        overlap

OverlapLoops:

  • Boxes denote subgraphs / subgraph Ops / loops

  • Numbers on boxes are matching subgraph/loop inputs and outputs

  • Overlap indicators indicate compute & load/store pairs overlapping in time

             load
               |
            compute   load            load         < overlap
               |        |               |
               1        2               |
           .-- inner loop --.           |
           |   |        |   |           |
           | store  compute |           |          < overlap
           | load       |   |           |          < overlap
           |   |        |   |           |
           '----------------'           |
               2        1      load compute        < overlap
               |        |        |      |
               1        2        3      4
    
(The accompanying outer-loop diagram follows the same notation: within the outer loop, store/compute and load/compute pairs overlap across consecutive iterations of both inner loops.)

OverlapStep: Not supported yet

Values:

enumerator JustInTime = 0

Copy tensor when required.

enumerator OverlapInnerLoop = 1

Preload values in previous inner loop iteration for the next iteration.

enumerator OverlapLoops = 2

Preload values in the previous loop iteration for the next iteration (implies OverlapInnerLoop)

enumerator OverlapStep = 3

Preload values in the previous host training step for next step (implies OverlapLoops) - not supported yet.

enumerator N = 4

Number of values.

class AnchorReturnType

Class that captures an AnchorReturnTypeId value.

When the value is AnchorReturnTypeId::EVERYN, this class also captures the associated N value. The constructor takes std::string values and converts them as appropriate.

Public Functions

AnchorReturnType()

Default constructor for the AnchorReturnType class.

AnchorReturnType(std::string artString, TileSet tileSet = TileSet::Compute, ExchangeStrategy exchangeStrategy = ExchangeStrategy::JustInTime)

Constructor for the AnchorReturnType class.

NOTE: Attempting to construct an AnchorReturnType for AnchorReturnTypeId::EVERYN using this constructor will result in an error. Use AnchorReturnType(std::string,int,TileSet,ExchangeStrategy) which also specifies the return period.

Parameters
  • artString – The string to convert to an AnchorReturnTypeId value. The following values are acceptable (case insensitive):

    • "final" = AnchorReturnTypeId::FINAL

    • "all" = AnchorReturnTypeId::ALL

    • "sum" = AnchorReturnTypeId::SUM

  • tileSet – (Optional) The type of the tile set. Default: TileSet::Compute.

  • exchangeStrategy – (Optional) The overlap strategy (between IO and compute) for anchor tensors. Default: ExchangeStrategy::JustInTime.

AnchorReturnType(std::string artString, int returnPeriod, TileSet tileSet = TileSet::Compute, ExchangeStrategy exchangeStrategy = ExchangeStrategy::JustInTime)

Constructor for the AnchorReturnType class.

Parameters
  • artString – The string to convert to an AnchorReturnTypeId value. The following values are acceptable (case insensitive):

    • "final" = AnchorReturnTypeId::FINAL

    • "all" = AnchorReturnTypeId::ALL

    • "sum" = AnchorReturnTypeId::SUM

    • "everyn" = AnchorReturnTypeId::EVERYN

  • returnPeriod – The value of N in the case of AnchorReturnTypeId::EVERYN.

  • tileSet – (Optional) The type of the tile set. Default: TileSet::Compute.

  • exchangeStrategy – (Optional) The overlap strategy (between IO and compute) for anchor tensors. Default: ExchangeStrategy::JustInTime.
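
For example, a minimal sketch requesting anchor values for every 4th global batch (the choice of IO tiles and overlapped IO here is illustrative):

popart::AnchorReturnType art("EveryN", /*returnPeriod=*/4,
                             popart::TileSet::IO,
                             popart::ExchangeStrategy::OverlapInnerLoop);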

inline const std::string &str() const

Get the string representation of the AnchorReturnTypeId value.

inline const TileSet &tileSet() const

Get the type of the tile set.

inline const ExchangeStrategy &exchangeStrategy() const

Get the type of overlap strategy.

class DataFlow

This class specifies parameters for host-device data streams.

The parameters are used to control the amount of input data processed in each step, that is, in each Session::run call. The parameters also determine how data is returned to the user.

See also

AnchorReturnType, AnchorReturnTypeId.

Public Functions

DataFlow()

Default constructor.

This constructor sets batchesPerStep to 0 and does not have any anchor tensors.

DataFlow(int batchesPerStep)

Construct a DataFlow instance without anchor tensors.

Parameters

batchesPerStep – The number of global batches to run in the inference or training session for each call to Session::run before returning control to the caller.

DataFlow(int batchesPerStep, const AnchorReturnTypeMap &anchorMap)

Construct a DataFlow instance with anchor tensors.

Parameters
  • batchesPerStep – The number of global batches to run in the inference or training session for each call to Session::run before returning control to the caller.

  • anchorMap – A mapping from output tensor TensorId to AnchorReturnType indicating the strategy with which to write the anchor tensor values to the IStepIO object provided to Session::run.

DataFlow(int batchesPerStep, const std::vector<TensorId> anchorTensorIds, const AnchorReturnType &anchorReturnType = AnchorReturnType("All"))

Construct a DataFlow instance with anchor tensors.

Parameters
  • batchesPerStep – The number of global batches to run in the inference or training session for each call to Session::run before returning control to the caller.

  • anchorTensorIds – The tensor IDs of the anchor tensors.

  • anchorReturnType – The strategy with which to write anchor tensor values to the IStepIO object provided to Session::run.

DataFlow(const DataFlow &rhs) = default
inline void setBatchesPerStep(const int batchesPerStep)

Set the value for batchesPerStep.
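
For example, a minimal sketch (assuming out and loss are existing anchor tensor ids) that returns every micro batch value of out and a per-step sum of loss:

popart::DataFlow dataFlow(
    /*batchesPerStep=*/100,
    {{out, popart::AnchorReturnType("All")},
     {loss, popart::AnchorReturnType("Sum")}});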

class InputSettings

Class that describes the TileSet, ExchangeStrategy, and ReplicatedStreamMode used for an input tensor.

Public Functions

InputSettings()

Constructor for the InputSettings class.

InputSettings(TileSet tileSet, ExchangeStrategy exchangeStrategy)

Constructor for the InputSettings class.

Parameters
  • tileSet – The type of the tile set.

  • exchangeStrategy – The overlap strategy (between IO and compute) for anchor tensors.

InputSettings(ReplicatedStreamMode replicatedStreamMode)

Constructor for the InputSettings class.

Parameters

replicatedStreamMode – The mode used for the replicated stream.

inline const TileSet &tileSet() const

Get the type of the tile set.

inline const ExchangeStrategy &exchangeStrategy() const

Get the type of overlap strategy.

inline ReplicatedStreamMode replicatedStreamMode() const

Get the mode of the replicated stream.

inline void setTileSet(TileSet tileSet)

Set the type of the tile set.

Parameters

tileSet – The type of the tile set.

inline void setExchangeStrategy(ExchangeStrategy exchangeStrategy)

Set the overlap strategy (between IO and compute).

Parameters

exchangeStrategy – The overlap strategy.

inline void setReplicatedStreamMode(ReplicatedStreamMode streamMode)

Set the mode used for the replicated stream.

Parameters

streamMode – The mode used for the replicated stream.

using popart::AnchorReturnTypeMap = std::map<TensorId, AnchorReturnType>
#include <popart/replicatedstreammode.hpp>
enum class popart::ReplicatedStreamMode

Values:

enumerator Broadcast
enumerator Replicate

14.7. Device manager

#include <popart/devicemanager.hpp>
enum class popart::DeviceType

Defines the type of device to use for graph compilation and execution.

Values:

enumerator IpuModel = 0

Use the Poplar IPU Model for graph compilation and execution.

The IPU Model will simulate the behaviour of the IPU hardware. It will not completely implement every aspect of a real IPU. (Default).

enumerator Cpu

Use CPU for graph compilation and execution.

enumerator Ipu

Use IPU for graph execution.

enumerator OfflineIpu

Compile graph for later execution.

This can be done even if IPUs are not present. Offline graph compilation is also useful for verifying memory constraints.

enumerator Sim

[For Graphcore internal use only] Use a simulator for graph compilation and execution.

enum class popart::DeviceConnectionType

Controls when to connect to the IPU (if at all).

Values:

enumerator Always = 0

Attach to the IPU from the start (Default).

enumerator OnDemand

Wait until the compilation is complete and the executable is ready to be run before attaching to the IPU.

enumerator Never

Never try to attach to an IPU.

This is useful for offline compilation (DeviceType::OfflineIpu). Trying to run an executable will throw an error.

enum class popart::SyncPattern

Controls synchronisation in multi-IPU systems.

Values:

enumerator Full = 0

Require all IPUs to synchronise on every communication between IPUs or between IPUs and host (Default).

enumerator SinglePipeline

Allow IPUs to synchronise with the host independently, without having to synchronise with each other.

This permits any one IPU to perform host IO while other IPUs are processing data.

enumerator ReplicaAndLadder

Allow an IPU group to communicate with the host without requiring synchronisation between groups.

This permits multiple IPU groups to alternate between performing host IO and computation.

class DeviceInfo

Represents a specific device.

Subclassed by popart::popx::DevicexInfo, popart::popx::DevicexOfflineIpuInfo

Public Functions

DeviceInfo(DeviceType _type, DeviceConnectionType _connectionType, const poplar::OptionFlags &_flags)

Constructor for the DeviceInfo class.

Parameters
  • _type – The type of the device.

  • _connectionType – The setting for when to connect to the device, if at all.

  • _flags – A set of Poplar option/value string flags.

virtual ~DeviceInfo()

Destructor for DeviceInfo.

virtual bool attach() = 0

Attach to the device.

Returns

true if successfully attached to the device, false otherwise.

virtual void detach() = 0

Detach from the device.

virtual bool isAttached() const = 0

Check if attached to the device.

Returns

true if attached to the device, false otherwise.

inline DeviceType getType() const

Get the type of the device.

Returns

The type of the device.

inline DeviceConnectionType getConnectionType() const

Get the setting for when to connect to the device.

Returns

The setting for when to connect to the device.

std::string toString() const

Return a description of the device.

virtual int getId() const = 0

Get the device id.

virtual std::vector<int> getChildIds() const = 0

Get the child device IDs.

The value returned by getId() for a multi-IPU device is a ‘parent ID’ and does not relate to the IDs of the devices it comprises. This function, in the case of real devices, uses the Poplar API to work out which single-IPU device IDs it relates to. In the case of replication, a device includes all IPUs involved, so a 2-IPU model with 2x replication would expect to have 4 child IDs returned here.

virtual std::string getVersion() const = 0

Get the version of the software on the IPU.

virtual int getNumIpus() const = 0

Get the number of IPUs in the device.

virtual int getTilesPerIPU() const = 0

Get the number of tiles per IPU.

virtual int getNumWorkerContexts() const = 0

Get the number of worker contexts per tile.

virtual std::string getIpuVersion() const = 0

Get the IPU version.

virtual std::vector<unsigned> getDriverIds() const = 0

Get the version of the drivers on the IPU.

virtual const poplar::Target &getTarget() const = 0

Get the Poplar target.

inline virtual bool canCompileOffline() const

Get whether the device supports offline compilation.

Returns

true if the device supports offline compilation, false otherwise.

const poplar::OptionFlags &getOptionFlags() const
void setOnDemandAttachTimeout(const unsigned seconds)

Set timeout (in seconds) for trying to attach to a device.

If unable to attach to a device on the first try, the DeviceManager instance will periodically try to attach to the device until successfully attached or this timeout is reached.

Note

This only applies when trying to attach with DeviceConnectionType::OnDemand.

Parameters

seconds – The timeout (in seconds) for trying to attach to the device.

inline const unsigned &getOnDemandAttachTimeout() const

Get timeout (in seconds) for trying to attach to a device.

Returns

The timeout (in seconds) for trying to attach to the device.

bool tryAttachUntilTimeout()

Periodically try to attach to the device until either the attach timeout is reached or successfully attached.

bool isHwCompatible() const
void writeToDeviceAccessLog(const std::string &event, const std::map<std::string, std::string> &auxKeyVals = {})

Log an event for device debugging purposes.

This event will get logged to the file location defined by the environment variable POPART_LOG_DEVICE_ACCESS_IN_TESTS, if it is set.

Parameters
  • event – A text description of the event to be written to the log.

  • auxKeyVals – Optional additional parameters to log.

class DevicexInfo : public popart::DeviceInfo

Subclassed by popart::popx::DevicexCpuInfo, popart::popx::DevicexIpuInfo, popart::popx::DevicexIpuModelInfo, popart::popx::DevicexSimInfo

Public Functions

inline DevicexInfo(popart::DeviceType _type, popart::DeviceConnectionType _connectionType, poplar::Device &_device, const poplar::OptionFlags &_flags)
~DevicexInfo() override
bool attach() override
void detach() override
inline int getNumIpus() const override
inline int getTilesPerIPU() const override
inline int getNumWorkerContexts() const override
inline std::vector<unsigned> getDriverIds() const override
inline const poplar::Device &getDevice() const
inline const poplar::Target &getTarget() const override
inline std::string getIpuVersion() const override
inline bool isAttached() const override
virtual void setMostRecentlyLoaded(Devicex *devicex)

Mark devicex as the last one that was loaded.

virtual bool isMostRecentlyLoaded(const Devicex *devicex) const

Check if Devicex was the last one that was loaded.

class DevicexCpuInfo : public popart::popx::DevicexInfo

Public Functions

inline DevicexCpuInfo(poplar::Device &_device)
inline int getId() const override
inline std::vector<int> getChildIds() const override
inline std::string getVersion() const override
class DevicexIpuInfo : public popart::popx::DevicexInfo

Public Functions

inline DevicexIpuInfo(popart::DeviceConnectionType _dct, int _id, poplar::Device &_device, const poplar::OptionFlags &_flags)
inline int getId() const override
std::vector<int> getChildIds() const override
std::string getVersion() const override
inline bool canCompileOffline() const override
class DevicexIpuModelInfo : public popart::popx::DevicexInfo

Public Functions

inline DevicexIpuModelInfo(poplar::Device &_device, const std::string _ipuVersion)
inline int getId() const override
inline std::vector<int> getChildIds() const override
inline std::string getVersion() const override
class DevicexSimInfo : public popart::popx::DevicexInfo

Public Functions

inline DevicexSimInfo(poplar::Device &_device)
inline int getId() const override
inline std::vector<int> getChildIds() const override
inline std::string getVersion() const override
class DevicexOfflineIpuInfo : public popart::DeviceInfo

Public Functions

inline DevicexOfflineIpuInfo(poplar::Target &_target, const poplar::OptionFlags &_flags)
inline bool attach() override
inline void detach() override
inline int getId() const override
inline std::vector<int> getChildIds() const override
inline std::string getVersion() const override
inline int getNumIpus() const override
inline int getTilesPerIPU() const override
inline int getNumWorkerContexts() const override
inline std::string getIpuVersion() const override
inline std::vector<unsigned> getDriverIds() const override
inline const poplar::Target &getTarget() const override
inline bool canCompileOffline() const override
inline bool isAttached() const override
class DeviceManager

A class to manage devices.

Public Functions

DeviceManager(const DeviceManager&) = default
~DeviceManager() = default
void registerDeviceProvider(DeviceProvider *provider)

Register a device provider.

Parameters

provider – The device provider to be registered with the device manager.

virtual void enumerate(std::vector<std::shared_ptr<popart::DeviceInfo>> &devices, unsigned requiredNumIPUs, SyncPattern syncPattern, DeviceType type, DeviceConnectionType connectionType, uint32_t requiredTilesPerIPU)

Get the list of all devices that satisfy the specified criteria.

Parameters
  • devices – The list of devices.

  • requiredNumIPUs – The number of IPUs required.

  • syncPattern – The setting for when to synchronise in a multi-IPU system.

  • type – The type of the device to use for compilation and execution.

  • connectionType – The setting for when to connect to the device.

  • requiredTilesPerIPU – The number of tiles per IPU required.

std::vector<std::shared_ptr<DeviceInfo>> enumerateDevices(SyncPattern pattern = SyncPattern::Full, int numIpus = 1, DeviceType deviceType = DeviceType::Ipu, DeviceConnectionType connectionType = DeviceConnectionType::Always, int tilesPerIPU = 0)

Get the list of all devices with the required criteria.

Parameters
  • pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).

  • numIpus – The number of IPUs required. (Default: 1).

  • deviceType – The type of the device required. (Default: DeviceType::Ipu).

  • connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).

  • tilesPerIPU – The number of tiles per IPU required. (Default: 0).

Returns

The list of devices with the required criteria.

std::shared_ptr<DeviceInfo> getDevice(SyncPattern syncPattern = SyncPattern::Full, uint32_t deviceManagerId = 0, DeviceConnectionType connectionType = DeviceConnectionType::Always)

Get a device with the required criteria.

Parameters
  • syncPattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).

  • deviceManagerId – The ID of the requested device. (Default: 0).

  • connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).

Returns

A device, which can be used with a session. If no device is acquired, a nullptr is returned.

std::shared_ptr<DeviceInfo> tryAcquireAvailableDevice(int numIpus = 1, int tilesPerIPU = 0, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always, DeviceSelectionCriterion selectionCriterion = DeviceSelectionCriterion::First)

Finds an available hardware device, with the specified number of IPUs.

This method will attach to the device if connectionType is equal to DeviceConnectionType::Always. This method is suitable when polling for an available device when resources are constrained.

Parameters
  • numIpus – The number of IPUs on the device (Default: 1).

  • tilesPerIPU – The number of tiles per IPU. An input of 0 will match any number. (Default: 0).

  • pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).

  • connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).

  • selectionCriterion – The method for selecting a device from the list of valid selections. (Default: DeviceSelectionCriterion::First).

Returns

A device, which can be used with a session. If no device is acquired, a nullptr is returned.
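
For example, a minimal sketch that polls for a free 2-IPU device and falls back to the IPU Model if none is acquired:

auto &dm = popart::DeviceManager::createDeviceManager();
auto device = dm.tryAcquireAvailableDevice(/*numIpus=*/2);
if (!device) {
  std::map<std::string, std::string> opts{{"numIPUs", "2"}};
  device = dm.createIpuModelDevice(opts);
}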

std::shared_ptr<DeviceInfo> acquireAvailableDevice(int numIpus = 1, int tilesPerIPU = 0, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always, DeviceSelectionCriterion selectionCriterion = DeviceSelectionCriterion::First)

Finds an available hardware device, with a certain number of IPUs.

This method will attach to the device if connectionType is equal to DeviceConnectionType::Always. Throws an error if there are fewer than numIpus IPUs available.

Parameters
  • numIpus – The number of IPUs on the device. (Default: 1).

  • tilesPerIPU – The number of tiles per IPU. An input of 0 will match any number. (Default: 0).

  • pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).

  • connectionType – The connection type, for deciding when to attach to the device.

  • selectionCriterion – How to select a device from the list of valid selections.

Returns

A device, which can be used with a session.

std::shared_ptr<DeviceInfo> tryAcquireDeviceById(int id, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always)

Allocates the hardware device by ID.

This ID can be found by running gc-info -l. This method will try to attach to the device if connectionType is equal to DeviceConnectionType::Always. This method is suitable when polling for an available device when resources are constrained.

Parameters
  • id – The ID of the IPU to be used.

  • pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).

  • connectionType – The connection type, for deciding when to attach to the device. (Default: DeviceConnectionType::Always).

Returns

A device, which can be used with a session. If no device is acquired, a nullptr is returned.

std::shared_ptr<DeviceInfo> acquireDeviceById(int id, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always)

Allocates the hardware device by ID.

This ID can be found by running gc-info -l. This method will attach to the device if connectionType is equal to DeviceConnectionType::Always.

Parameters
  • id – The ID of the IPU to be used.

  • pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).

  • connectionType – The connection type, for deciding when to attach to the device. (Default: DeviceConnectionType::Always).

Returns

A device, which can be used with a session.

std::shared_ptr<DeviceInfo> createHostDevice(DeviceType type, const std::map<std::string, std::string> &options)

Create a simulated device on the host for testing purposes.

Parameters
  • type – The type of device to simulate.

  • options – The configuration settings for the host device.

Returns

The requested device for testing purposes.

std::shared_ptr<DeviceInfo> createCpuDevice()

Create a simulated CPU device for testing purposes.

Returns

A simulated CPU device.

std::shared_ptr<DeviceInfo> createIpuModelDevice(const std::map<std::string, std::string> &options)

Create a simulated IpuModel device for testing purposes.

The following options are supported:

  • numIPUs: The number of IPUs to simulate (Default: 1).

  • tilesPerIPU: The number of tiles per IPU (Default: defaultFewTiles).

  • compileIPUCode: Indicate whether or not to compile real IPU code for modelling.

Parameters

options – Configuration settings for the IPU Model.

Returns

A device.

std::shared_ptr<DeviceInfo> createSimDevice(const std::map<std::string, std::string> &options)
std::shared_ptr<DeviceInfo> createOfflineIPUDevice(const std::map<std::string, std::string> &options)

Create a simulated OfflineIpu device for testing purposes.

This resembles an IPU and is used for offline compilation.

The following options are supported:

  • numIPUs: The number of IPUs to compile for.

  • tilesPerIPU: The number of tiles per IPU (Default: defaultManyTiles).

  • ipuVersion: The IPU architecture (Default: “ipu2”).

  • syncPattern: The setting for synchronisation in a multi-IPU system.

Parameters

options – Configuration settings for the OfflineIpu device.

Returns

A simulated OfflineIpu device.

std::shared_ptr<DeviceInfo> createOfflineIpuFromDeviceInfo(const DeviceInfo &deviceInfo)

Create a simulated OfflineIpu device from the description of another device.

Parameters

deviceInfo – The device to create an OfflineIpu version of.

Returns

An OfflineIpu device.

std::shared_ptr<DeviceInfo> createOfflineIpuFromSystemString(const std::string &system, uint32_t numIpus)

Create a simulated OfflineIpu device from the name of a system.

Parameters
  • system – The name of the system to create an OfflineIpu version of.

  • numIpus – The number of IPUs. Providing 0 corresponds to all IPUs in the system.

Returns

An OfflineIpu device.

void setOnDemandAttachTimeout(const unsigned seconds)

Set the timeout (in seconds) for trying to attach to a device. If unable to attach to a device on the first try, the DeviceManager will periodically retry until successfully attached or until this timeout is reached.

Note: this only takes effect when trying to attach with DeviceConnectionType::OnDemand.

Parameters

seconds – The attach timeout in seconds.

Public Static Functions

static DeviceManager &createDeviceManager()

Accessor for the device manager.

Returns

A reference to the DeviceManager instance.

class DeviceProvider

The interface for device providers which are registered with the device manager.

Subclassed by popart::popx::DevicexManager

Public Functions

inline virtual ~DeviceProvider()

Destructor for DeviceProvider.

virtual std::shared_ptr<DeviceInfo> getDevice(SyncPattern syncPattern, unsigned deviceManagerId, DeviceConnectionType connectionType) = 0

Get a device that satisfies the specified criteria.

Throws an error if the connection type is DeviceConnectionType::Never.

Parameters
  • syncPattern – The setting for synchronisation on multi-IPU systems.

  • deviceManagerId – The ID of the requested device.

  • connectionType – The setting for when to connect to the device.

Returns

The device that satisfies the specified criteria.

virtual void enumerate(std::vector<std::shared_ptr<DeviceInfo>> &devices, uint32_t requiredNumIPUs, SyncPattern syncPattern, DeviceType type, DeviceConnectionType connectionType, uint32_t requiredTilesPerIPU) = 0

Get the list of all devices that satisfy the specified criteria.

Parameters
  • devices – The list of devices.

  • requiredNumIPUs – The number of IPUs required.

  • syncPattern – The setting for when to synchronise in a multi-IPU system.

  • type – The type of the device to use for compilation and execution.

  • connectionType – The setting for when to connect to the device.

  • requiredTilesPerIPU – The number of tiles per IPU required.

virtual std::shared_ptr<DeviceInfo> createHostDevice(DeviceType type, const std::map<std::string, std::string> &options, SyncPattern syncPattern = SyncPattern::Full) = 0

Create a host device for testing.

Parameters
  • type – The type of the device to use for compilation and execution.

  • options – The configuration for the created device. See createCpuDevice(), createIpuModelDevice(), createOfflineIPUDevice() and createSimDevice() for more information about options.

  • syncPattern – The setting for when to synchronise in a multi-IPU system.

Returns

The device for use in testing.

virtual std::shared_ptr<DeviceInfo> createOfflineIpuFromDeviceInfo(const DeviceInfo &deviceInfo) = 0
virtual std::shared_ptr<DeviceInfo> createOfflineIpuFromSystemString(const std::string &system, uint32_t numIpus) = 0
class DevicexManager : public popart::DeviceProvider

Public Functions

DevicexManager()
std::shared_ptr<DeviceInfo> getDevice(SyncPattern syncPattern, uint32_t deviceManagerId, DeviceConnectionType connectionType) override
void enumerate(std::vector<std::shared_ptr<popart::DeviceInfo>> &devices, unsigned requiredNumIPUs, SyncPattern syncPattern, DeviceType type, DeviceConnectionType connectionType, uint32_t requiredTilesPerIPU) override
std::shared_ptr<popart::DeviceInfo> createHostDevice(popart::DeviceType type, const std::map<std::string, std::string> &options, SyncPattern syncPattern = SyncPattern::Full) override
std::shared_ptr<DeviceInfo> createOfflineIpuFromDeviceInfo(const DeviceInfo &deviceInfo) override
std::shared_ptr<DeviceInfo> createOfflineIpuFromSystemString(const std::string &system, uint32_t numIpus) override
#include <popart/popx/devicex.hpp>
class Devicex

Public Functions

const Ir &ir() const
const IrLowering &lowering() const
IrLowering &lowering()
Devicex(Executablex &exe, std::shared_ptr<DeviceInfo> deviceInfo)
~Devicex()
void prepare()
void weightsFromHost()
void buffersFromHost()
void remoteBufferWeightsFromHost(const bool isUpdate = false)
void optimizerFromHost()
void setRandomSeedFromHost()
uint64_t getRandomSeedToHost()
void setRngStateFromHost()
std::vector<uint32_t> getRngStateToHost()
void setRngStateValue(const std::vector<uint32_t>)
std::map<std::string, std::vector<uint64_t>> cycleCountTensorToHost()
void run(IStepIO&, std::string debugName = "")
void run(std::string programHandle, IStepIO&, std::string debugName = "")
void weightsToHost()
void remoteBufferWeightsToHost()
void weightsToHost(const std::map<TensorId, MutableVoidData>&)
void popxlWeightsToTensorData()

Copy data from the device to the host buffers, and then from the host buffers to the tensor.tensorData() buffers.

This will not run a WeightsToHost program if the weights are already in sync with the IPU. After running WeightsToHost, the weights are marked as in sync with the IPU.

void popxlMarkHostWeightsOutOfSync()

Mark the d2hWeightBuffers as out of sync with the IPU.

void popxlMarkHostWeightsInSync()

Mark the d2hWeightBuffers as in sync with the IPU.

bool popxlAreHostWeightsInSync()

Check whether all the weights are in sync with the IPU.
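
A sketch of the intended call order for these methods (this is an internal interface; dx is assumed to be a prepared Devicex):

// Refresh the host copies only if the device holds newer weights.
if (!dx.popxlAreHostWeightsInSync()) {
  dx.popxlWeightsToTensorData(); // runs WeightsToHost and marks the weights in sync
}

// After running a program that updates weights on the device, mark the
// host copies as stale so that the next read refreshes them.
dx.popxlMarkHostWeightsOutOfSync();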

void readWeights(const IWeightsIO &dst)
void writeWeights(const IWeightsIO &src)
std::string getSummaryReport(bool resetProfile = true) const
std::string getSerializedGraph() const
pva::Report getReport() const
bool isEngineLoaded() const
void setEngineIsLoaded(bool isLoaded)
void connectRandomSeedStream()
void connectRngStateStream()
void connectStreamToCallback(const std::string &streamHandle, std::function<void(void*)> callback, unsigned index)
void connectStream(const std::string &streamHandle, void *host_buffer)
void connectHostFunction(const std::string &functionHandle, std::function<void(const void*const*, size_t, void*const*, size_t)> callback, unsigned index)
void copyFromRemoteBuffer(const PopStreamId buffer, void *w, int repeat_index, unsigned replication_index = 0)
void copyToRemoteBuffer(void *w, const PopStreamId buffer, int repeat_index, unsigned replication_index = 0)
unsigned getReplicationFactor() const
unsigned getAccumulationFactor() const
unsigned getGlobalReplicaOffset() const
unsigned getGlobalReplicationFactor() const
bool isReplicatedGraph() const
inline const DeviceInfo *getDeviceInfo() const
inline DeviceInfo *getDeviceInfo()
inline void setDeviceInfo(std::shared_ptr<DeviceInfo> deviceInfo_)
std::set<TensorId> getLinearlyCreatedInputTensors() const
std::set<TensorId> getEfficientlyCreatedInputTensors() const
inline bool prepareHasBeenCalled() const
void loadEngineAndConnectStreams()
void serializeExecutable(std::ostream &out, bool serializePopartMetadata, bool serializeTensorData)
void serializeExecutable(const std::string &path, bool serializePopartMetadata, bool serializeTensorData)
void serializeTensorData(const std::string &path)

Public Members

poplin::PlanningCache convCache
poplin::matmul::PlanningCache matmulCache
bool prePlanConvolutions = true
bool prePlanMatMuls = true

Friends

friend class serialization::WriterImpl
typedef std::string popart::popx::PopStreamId
class Executablex

Public Functions

Executablex(IrLowering &ir_lowering_)
Executablex(IrLowering &ir_lowering_, std::unordered_map<TensorId, std::unique_ptr<Tensor>> &&tensorMap, std::map<TensorId, CollectiveBalancedReorderId> &&cbrIdMap, std::map<CollectiveBalancedReorderId, gcl::CollectiveBalancedHostRearrangement> &&cbrMap)
IrLowering &lowering()
const IrLowering &lowering() const
const Ir &ir() const
inline bool isDeserialized() const
bool shouldSerialize()
bool containsTensor(const TensorId &id) const
Tensor *getTensor(const TensorId&)
const Tensor *getTensor(const TensorId&) const
std::set<TensorId> getAllTensorIds()
std::vector<TensorId> getTensorIds(TensorType)
void setRandomSeedValue(uint64_t value)
void resetWeights(const ONNX_NAMESPACE::ModelProto &modelProto, const bool ignoreWeightsInModelWithoutCorrespondingIrWeight = false)
inline const SessionOptions &getSessionOptions() const
inline std::vector<Tensor*> &getWeightTensors()
inline const std::vector<Tensor*> &getWeightTensors() const
inline const std::vector<Tensor*> &getAnchorTensors() const
inline const std::vector<Tensor*> &getOptimizerTensors() const
inline const std::vector<Tensor*> &getDataStreamTensors() const
inline const Tensor *getSeedTensor() const
const gcl::CollectiveBalancedHostRearrangement &getCollectiveBalancedHostRearrangement(const TensorId &id) const
const std::map<CollectiveBalancedReorderId, gcl::CollectiveBalancedHostRearrangement> getCollectiveBalancedHostRearrangements() const
const std::map<TensorId, CollectiveBalancedReorderId> getCollectiveBalancedHostRearrangementIds() const
std::string getCachePath(const std::string &cacheDir) const
void updateOptimizerTensors()

Public Static Functions

static std::unique_ptr<Executablex> createFromLoweredIr(IrLowering &ir_lowering_)
static std::unique_ptr<Executablex> createFromStream(IrLowering &ir_lowering_, std::unordered_map<TensorId, std::unique_ptr<Tensor>> &&tensorMap, std::map<TensorId, CollectiveBalancedReorderId> &&cbrIdMap, std::map<CollectiveBalancedReorderId, gcl::CollectiveBalancedHostRearrangement> &&cbrMap)
#include <popart/popx/irlowering.hpp>
class IrLowering

Public Types

using FunctionBuffers = std::vector<std::pair<const poplar::Function, poplar::FunctionBuffer>>

Public Functions

IrLowering(const Ir&, std::shared_ptr<DeviceInfo> deviceInfo, bool prepareGraphHasBeenCalled = false)
virtual ~IrLowering()
inline const Ir &ir() const
void growOpx(Opx*, SequenceMap::SequenceInterval seqInterval)
void growOpxCall(Opx*, SequenceMap::SequenceInterval seqInterval)
inline void setDevicex(Devicex *d)
std::set<TensorId> getLinearlyCreatedInputTensors() const
inline void setLinearlyCreatedInputTensors(const std::set<TensorId> &s)
inline void addLinearlyCreatedInputTensors(TensorId id)
std::set<TensorId> getEfficientlyCreatedInputTensors() const
inline void setEfficientlyCreatedInputTensors(const std::set<TensorId> &s)
inline void addEfficientlyCreatedInputTensors(TensorId id)
bool tryInitTensorByPostIRAliasing(TensorId dstId, RequireParallelWritable requireParallelWritable, const ViewChangers &viewChangers)
inline const std::vector<std::string> &getCycleCountIds() const
inline void setCycleCountIds(const std::vector<std::string> &ids)
inline const PopTensors &tensors() const
inline PopTensors &tensors()
inline const PopPrograms &progs() const
inline PopPrograms &progs()
void instrumentWithHardwareCycleCounter(poplar::program::Sequence&, int64_t tileId = 0, std::string id = "")
inline poplar::Graph &graph()
inline const poplar::Graph &graph() const
void prepareGraph()
void loadPoplarExecutable(serialization::Reader &reader)
poplar::Executable getExecutable(const ProfileCacher &ProfileCacher)
std::string getPoplarGraphDebugName()
std::string getSerializedGraph() const
poplar::Graph &getVirtualGraph(VGraphId virtualGraphIndex, TileSet tileSet = TileSet::Compute)
PriTaskDependency taskWhichCreates(TensorId) const
TaskId taskWhichPopulates(TensorId) const
PriTask getDependencyFreeInitTensorCreatorTask(const TensorId&)
unsigned getReplicationFactor() const
unsigned getAccumulationFactor() const
unsigned getGlobalReplicaOffset() const
unsigned getGlobalReplicationFactor() const
bool isReplicatedGraph() const
bool doRearrangeOnHost(Tensor *tensor) const
int getNumFragments(const Graph &graph) const
bool containsFragments(const Graph &graph) const
bool containsFragment(const Graph &graph, SubgraphPartIndex subgraphPart) const
void createFragment(const Graph &graph, SubgraphPartIndex subgraphPart)
std::vector<poplar::Function> &getFragmentFunctions(const Graph &graph)
poplar::Function &getFragmentFunction(const Graph &graph, SubgraphPartIndex subgraphPart)
void addFunctionBuffers(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Add a vector of {function, buffer} pairs for a given (graph ID, FunctionBufferMappingType) pair.

This is enough for an [Internal|External]CodeCopy op to move code from the buffer into the function. Note that the subgraph partitioner may have split the graph into multiple functions, so a vector of these is required for each graph.

Parameters
  • gid – The graph id to add the functions and buffers for.

  • fbmt – The FunctionBufferMappingType to add the vector for.

inline FunctionBuffers getFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Get the Function Buffers for the given GraphId and FunctionBufferMappingType.

This is a wrapper around the corresponding PopPrograms function.

Parameters
  • gid – The GraphId to lookup.

  • fbmt – The FunctionBufferMappingType to lookup.

Returns

FunctionBuffers the vector of functions and buffers.

inline bool hasFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Check whether a function buffer vector exists for the given graph ID and FunctionBufferMappingType.

This is a wrapper around the corresponding PopPrograms function.

Parameters
  • gid – The graph id to lookup.

  • fbmt – The FunctionBufferMappingType to lookup.

Returns

true if pairs exist, false otherwise.

std::vector<ICreatorCandidatePtr> getCreatorEndpoints(const Tensor *tensor, bool excludeEndpointsFromPath = true, bool includeDeadends = false) const
std::vector<ICreatorCandidatePtr> getTensorCreators(const Tensor *tensor, bool dependencyFree) const
poplar::Tensor getConst(poplar::Graph &graph, const poplar::Type &type, const std::vector<size_t> &shape, double val, const poplar::DebugContext &dc = {})
inline const ReplicatedTensorShardingBundle &getReplicatedTensorShardingBundle() const
inline ReplicatedTensorShardingBundle &getReplicatedTensorShardingBundle()
poplar::Tensor getScalarVariable(poplar::Graph &graph, const poplar::Type &type, const poplar::DebugContext &dc = {})
inline LinearMapper &getLinearMapper()
inline InitTensorOffsetMap &getInitTensorOffsetMap()
inline const liveness::LivenessAnalyzer *getLivenessAnalyzer() const
inline const liveness::SubgraphPartitioner *getSubgraphPartitioner() const
inline liveness::AliasZeroCopy *getAliasZeroCopy() const
inline const DeviceInfo *getDeviceInfo() const
inline void setDeviceInfo(std::shared_ptr<DeviceInfo> deviceInfo_)
std::unique_ptr<Opx> createOpx(Op*)
inline Opx *getOpx(OpId id)
inline const Opx *getOpx(OpId id) const
const std::vector<Op*> &getMainGraphOpSeries() const
std::map<Op*, int, POpCmp> getMainGraphOpSeriesNums() const
std::map<Op*, int, POpCmp> getMainGraphOpCounts() const
std::string getContextOpString(ExecutionContext context, const std::vector<TaskId> &taskOrder) const
inline bool prepareGraphHasBeenCalled() const
inline bool getOuterLoopFragEmpty() const
inline bool usingCachedExecutable() const
poplar::DataStream &insertGradientStoreStream(TensorId, TensorInfo, poplar::Graph&)
poplar::DataStream &insertGradientLoadStream(TensorId, TensorInfo, poplar::Graph&)
poplar::DataStream &insertWeightLoadStream(TensorId, TensorInfo, poplar::Graph&)
inline void addPipelineIndexTensor(const poplar::Tensor &tensor)
inline ExchangeBundle &getExchangeBundle()

Get the exchange bundle containing stream and remote buffer data structures.

Returns

Exchange bundle

inline const ExchangeBundle &getExchangeBundle() const

Get the exchange bundle containing stream and remote buffer data structures.

Returns

Exchange bundle

inline const std::vector<poplar::Tensor> getPipelineIndexTensors()
inline const std::map<TensorId, poplar::DataStream> &getFromHostStreams() const
inline const std::map<TensorId, poplar::DataStream> &getToHostAnchorStreams() const
inline const std::map<TensorId, poplar::DataStream> &getToHostWeightStreams() const
template<class T>
inline T *getOpxState(OpId opid)
inline void setProgramHandleIndexMap(const std::map<std::string, unsigned> &programHandleIndexMap_)
inline const std::map<std::string, unsigned> &getProgramHandleIndexMap() const

Public Members

poplar::OptionFlags pooling_options
poplar::OptionFlags lstmOptions
poplar::OptionFlags matmulOptions
poplar::OptionFlags gclOptions
poplar::OptionFlags engineOptions
poplar::OptionFlags reportOptions
std::map<OpId, std::unique_ptr<Opx>> opxs

Public Static Functions

static std::string cycleCountStreamId(std::string id)
static void removeNonDependencyFreeCreators(std::vector<ICreatorCandidatePtr> &candidates)
static PopStreamId h2dId(TensorId)
static PopStreamId d2hId(TensorId, bool isAnchorStream)
static PopStreamId gradientStoreStreamId(TensorId id)
static PopStreamId gradientLoadStreamId(TensorId id)
static PopStreamId weightLoadStreamId(TensorId id)
#include <popart/popx/poptensors.hpp>
class PopTensors

Public Functions

PopTensors(const Ir&)
void insert(TensorId, const poplar::Tensor&)
void insertAliased(TensorId to, TensorId from)
void insertUnsafe(TensorId id, const poplar::Tensor &pt)
const poplar::Tensor &get(TensorId) const
const poplar::Tensor &getView(TensorId) const
bool hasViewChangers(TensorId) const
const ViewChangers &getViewChangers(TensorId)
void setViewChangers(TensorId, const ViewChangers &viewChangers)
bool contains(TensorId) const
const std::map<TensorId, std::shared_ptr<poplar::Tensor>> &getTensors() const
bool canAlias(TensorId, RequireParallelWritable requireParallelWritable) const
#include <popart/popx/popprograms.hpp>
class PopPrograms

Class for managing the complete set of programs that a Devicex can run.

A program in this context is the instance of the poplar::Program class which represents a control program that executes operations on the graph.

The state std::vector<poplar::program::Sequence> seqs contains all these programs, and is populated during IrLowering. The programs are passed to poplar::compileGraph to construct the executable (see IrLowering::getExecutable()).

Public Types

enum ProgramIndex

Values:

enumerator WeightsFromHost = 0
enumerator OptimizerFromHost
enumerator RandomSeedFromHost
enumerator RandomSeedToHost
enumerator RngStateFromHost
enumerator Program
enumerator RngStateToHost
enumerator WeightsToHost
enumerator CycleCountTensorToHost
enumerator CustomProgramsStart
enumerator N
enum class ProgramFragmentIndex

Values:

enumerator StreamWeightsFromHost = 0
enumerator StreamOptimizerFromHost
enumerator RandomSeedFromHost
enumerator RandomSeedToHost
enumerator RngStateFromHost
enumerator Init
enumerator PreForward
enumerator Forward
enumerator Backward
enumerator VarUpdateFromAccumulator
enumerator RngStateToHost
enumerator WeightsToHost
enumerator ToHostFinalCopy
enumerator CycleCountTensorToHost
enumerator N
enum class PipelineFragmentId

Values:

enumerator ToDeviceStream = 0
enumerator Main
enumerator ToHostStream
using FunctionBuffers = std::vector<std::pair<const poplar::Function, poplar::FunctionBuffer>>

Public Functions

PopPrograms(IrLowering *ir_lowering_p_)
const poplar::program::Sequence &streamWeightsFromHostFragment() const
poplar::program::Sequence &streamWeightsFromHostFragment()
const poplar::program::Sequence &streamOptimizerFromHostFragment() const
poplar::program::Sequence &streamOptimizerFromHostFragment()
const poplar::program::Sequence &randomSeedFromHostFragment() const
poplar::program::Sequence &randomSeedFromHostFragment()
const poplar::program::Sequence &randomSeedToHostFragment() const
poplar::program::Sequence &randomSeedToHostFragment()
const poplar::program::Sequence &cycleCountTensorToHostFragment() const
poplar::program::Sequence &rngStateFromHostFragment()
const poplar::program::Sequence &rngStateFromHostFragment() const
poplar::program::Sequence &rngStateToHostFragment()
const poplar::program::Sequence &rngStateToHostFragment() const
poplar::program::Sequence &cycleCountTensorToHostFragment()
const poplar::program::Sequence &toHostFinalCopyFragment() const
poplar::program::Sequence &toHostFinalCopyFragment()
const poplar::program::Sequence &initFragment() const
poplar::program::Sequence &initFragment()
const poplar::program::Sequence &preForwardFragment() const
poplar::program::Sequence &preForwardFragment()
const poplar::program::Sequence &forwardFragment() const
poplar::program::Sequence &forwardFragment()
const poplar::program::Sequence &backwardFragment() const
poplar::program::Sequence &backwardFragment()
const poplar::program::Sequence &accumulateOuterFragment() const
poplar::program::Sequence &accumulateOuterFragment()
const poplar::program::Sequence &weightsToHostFragment() const
poplar::program::Sequence &weightsToHostFragment()
poplar::program::Sequence &forwardOrBackwardFragment(ScheduledPreLoss)
const std::vector<poplar::program::Program> progs() const
poplar::program::Sequence &programFragment(PopPrograms::ProgramFragmentIndex)
int getNumFragments(const Graph &graph) const
std::vector<poplar::program::Sequence> &scopeFragments(const Graph&)
poplar::program::Sequence &scopeFragment(const Graph&, SubgraphPartIndex subgraphPart)
bool containsFragments(const Graph &graph) const
bool containsFragment(const Graph &graph, SubgraphPartIndex subgraphPart) const
void createFragment(const Graph &graph, SubgraphPartIndex subgraphPart)
std::vector<poplar::Function> &getFragmentFunctions(const Graph &graph, poplar::Graph &poplarGrpah)
poplar::Function &getFragmentFunction(const Graph &graph, SubgraphPartIndex subgraphPart, poplar::Graph &poplarGraph)
std::vector<poplar::program::Sequence>::iterator recomputeFragment(OpId)
SequenceMap::SequenceInterval createRecomputeFragment(OpId)
bool hasBeenRecomputed(OpId, ExecutionPhase) const
void recordRecomputed(OpId, ExecutionPhase)
std::string getStrFromPipelineFragmentId(PipelineFragmentId) const
poplar::program::Sequence &pipelineFragment(PipelineStage, PipelineFragmentId, const std::string &desc)
poplar::program::Sequence &pipelineToDeviceStreamFragment(PipelineStage pipelineStage, const std::string &desc)
poplar::program::Sequence &pipelineMainFragment(PipelineStage, const std::string &desc)
poplar::program::Sequence &pipelineToHostStreamFragment(PipelineStage, const std::string &desc)
poplar::program::Sequence &pipelineIpuCopyFragment(const std::string &desc)
poplar::program::Sequence &namedBuffersCopyFragment()
void addPipelineCycle(PipelineInfo pInfo, PipelineCycle pCycle, poplar::program::Sequence &sq, std::ostringstream &ss) const
void addFunctionBuffers(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Add a vector of {function, buffer} pairs for a given (graph ID, FunctionBufferMappingType) pair.

This is enough for an [Internal|External]CodeCopy op to move code from the buffer into the function. Note that the subgraph partitioner may have split the graph into multiple functions, so a vector of these is required for each graph.

Parameters
  • gid – The graph id to add the functions and buffers for.

  • fbmt – The FunctionBufferMappingType to add the vector for.

inline FunctionBuffers getFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Get the Function Buffers for the given GraphId and FunctionBufferMappingType.

Parameters
  • gid – The GraphId to lookup.

  • fbmt – The FunctionBufferMappingType to lookup.

Returns

FunctionBuffers the vector of functions and buffers.

inline bool hasFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Returns true if a functionBuffer vector exists for the given graphId and FunctionBufferMappingType.

Parameters
  • gid – The graph id to lookup.

  • fbmt – The FunctionBufferMappingType to lookup.

Returns

true if pairs exist, false otherwise.

unsigned addCustomProgram(const poplar::program::Program &program)

Add a custom program.

Parameters

program – The program to add.

Returns

The index of the added PopART/Poplar program.

void createPipelineFunctions()

Turn pipeline sequences into callable pipeline functions.

poplar::program::Sequence getFullProgramFromPipelineFragments(bool fwdOnly) const

Return the program based on the pipeline fragments.

See docs/notes/transforms/pipelining.md#assemble-from-fragments for a detailed explanation.

Returns

The program based on the pipeline fragments

Public Members

IrLowering *ir_lowering_p

Public Static Attributes

static const std::unordered_map<popef::ProgramFlow::ProgramIndexType, std::string> commonPrograms
#include <popart/popx/inittensor.hpp>
class ICreatorCandidate

Subclassed by popart::popx::InputCreatorCandidate, popart::popx::InputMultiCreatorCandidate

Public Functions

ICreatorCandidate()
virtual ~ICreatorCandidate() = default
virtual std::pair<poplar::Tensor, ViewChangers> createInput(const poplar::DebugNameAndId &dnai) = 0
virtual DnfTensorIds mustExistBeforeCreate() = 0
virtual double getMaxCreatorPriority() const = 0
virtual int64_t getNumElems() const = 0
virtual std::vector<std::vector<OpxInAndOutIndex>> getPathsFromInput() = 0
virtual std::string str() = 0
virtual std::pair<poplar::Tensor, ViewChangers> unwind(poplar::Tensor) = 0
virtual std::vector<popart::view::Region> unwind(popart::view::Region) = 0
virtual std::vector<popart::view::Region> unwind() = 0
virtual int64_t getScheduleIndex() const = 0

Public Static Functions

static bool greaterThan(ICreatorCandidatePtr, ICreatorCandidatePtr)
#include <popart/popx/replicatedtensorshardingbundle.hpp>
class ReplicatedTensorShardingBundle

Helper class to bundle all replicated tensor sharding related lowering information together.

Public Functions

ReplicatedTensorShardingBundle(const Ir &ir)

Construct an empty replicated tensor sharding bundle. This creates the ReplicatedTensorShardingTracer with the IR object.

Parameters

ir – The IR to create the ReplicatedTensorShardingTracer with.

bool hasCollectiveBalancedReorder(const TensorId &tensorId) const

Check whether a tensor has an associated CollectiveBalancedReorder.

Parameters

tensorId – TensorId to check

Returns

True if the tensor has an associated CollectiveBalancedReorder

std::shared_ptr<gcl::CollectiveBalancedReorder> getCollectiveBalancedReorder(const TensorId &tensorId) const

Get the associated CollectiveBalancedReorder of a tensor.

Throws an error if the tensor does not have one.

Parameters

tensorId – TensorId to return the CollectiveBalancedReorder for

Returns

Shared pointer to the associated CollectiveBalancedReorder

const gcl::CollectiveBalancedHostRearrangement &getCollectiveBalancedHostRearrangement(const TensorId &tensorId) const

Get the host rearrangement method of a tensor.

This can be applied to the host-side tensor data to rearrange the data before upload to, or after download from, the IPU.

Parameters

tensorId – TensorId to return the CBR host rearrangement for

Returns

CBR host rearrangement method

void setCollectiveBalancedReorder(const TensorId &tensorId, CollectiveBalancedReorderId cbrId)

Associate an existing CollectiveBalancedReorder with a tensor.

Parameters
  • tensorId – TensorId to associate the CollectiveBalancedReorder with

  • cbrId – Identifier of an existing, registered CollectiveBalancedReorder obtained by registerCollectiveBalancedReorder

CollectiveBalancedReorderId registerCollectiveBalancedReorder(std::shared_ptr<gcl::CollectiveBalancedReorder> cbr)

Register a new collective balanced reorder method.

Parameters

cbr – The GCL CollectiveBalancedReorder to register.

Returns

The registered ID for the CollectiveBalancedReorder.

inline const std::map<CollectiveBalancedReorderId, std::shared_ptr<gcl::CollectiveBalancedReorder>> &getCollectiveReorders() const
Returns

The map of all registered CollectiveBalancedReorders, keyed by CollectiveBalancedReorderId.

inline const ReplicatedTensorShardingTracer &getReplicatedTensorShardingTracer() const
Returns

Tracer to resolve replicated tensor sharding groups

inline ReplicatedTensorShardingTracer &getReplicatedTensorShardingTracer()
Returns

Tracer to resolve replicated tensor sharding groups

inline const std::map<TensorId, CollectiveBalancedReorderId> &getCollectiveReorderIds() const

Get mapping to resolve which CollectiveBalancedReorder has to be applied to a tensor to restore the original data order.

Returns

Mapping of all tensors and their associated CollectiveBalancedReorderId

#include <popart/popx/linearmapper.hpp>
class LinearMapper

Public Functions

void mapTensor(poplar::Graph &graph, poplar::Tensor &tensor)

14.8. Ops

14.8.1. Op definition for PopART IR

#include <popart/op.hpp>
class Op : public popart::Vertex

Parent class for the concrete Op implementations.

The Poplar implementation that the op represents can be found in the corresponding popx::Opx class, which is lowered to Poplar.

Subclassed by popart::AbortOp, popart::AbsGradOp, popart::AdaDeltaUpdaterOp, popart::AdamUpdaterOp, popart::AddBiasOp, popart::AllReduceOp, popart::ArgExtremaOp, popart::AveragePoolGradOp, popart::BaseOnnxRNNGradOp, popart::BaseOnnxRNNOp, popart::BasePadOp, popart::BaseSliceOp, popart::BaseSortOp, popart::BatchNormGradOp, popart::BatchNormOp, popart::BinaryComparisonOp, popart::BoundaryOp, popart::BucketizeOp, popart::CastOp, popart::CastThenPow2ScaleOp, popart::CollectivesBaseOp, popart::ConcatGradOp, popart::ConcatOp, popart::ConvFlipWeightsOp, popart::ConvTransposeOp, popart::CoshOp, popart::CtcBeamSearchDecoderOp, popart::CtcGradOp, popart::CumSumGradOp, popart::CumSumOp, popart::DynamicBaseOp, popart::ElementWiseBinaryBaseOp, popart::ElementWiseBinaryGradOp, popart::ElementWiseNonLinearUnaryGradOp, popart::ElementWiseUnaryBooleanOp, popart::ElementWiseUnaryOp, popart::ExchangeBaseOp, popart::ExpandGradOp, popart::ExpandOp, popart::ExpGradOp, popart::Expm1GradOp, popart::GatherGradOp, popart::GatherOp, popart::GetRandomSeedOp, popart::GlobalAveragePoolGradOp, popart::GlobalAveragePoolOp, popart::GlobalMaxPoolGradOp, popart::GlobalMaxPoolOp, popart::GroupNormGradOp, popart::GroupNormOp, popart::HasReceptiveFieldOp, popart::HistogramOp, popart::IdentityLossGradOp, popart::IfOp, popart::InitOp, popart::InstanceNormGradOp, popart::InstanceNormOp, popart::InternalCodeCopyOp, popart::IoTileCopyOp, popart::IpuCopyOp, popart::L1GradOp, popart::LambSquareOp, popart::LeakyReluGradOp, popart::LogSoftmaxGradOp, popart::LossOp, popart::LossScaleUpdateOp, popart::LRNGradOp, popart::LRNOp, popart::MatMulBaseOp, popart::MaxPoolGradOp, popart::ModifyRandomSeedOp, popart::MultiConvBaseOp, popart::MultiConvDataGradBaseOp, popart::MultiConvWeightsGradBaseOp, popart::NllGradOp, popart::NlllWithSoftmaxGradDirectOp, popart::NormalizeImageOp, popart::OnehotGradOp, popart::OnehotOp, popart::PackedDataBlockOp, popart::ParameterizedOp< TDerivedOp, TOpParams >, popart::PlaceholderOp, popart::PopartLSTMGradOp, popart::PopartLSTMOp, popart::Pow2ScaleThenCastOp, popart::ReduceGradOp, popart::ReduceOp, popart::ReluGradOp, popart::ReshapeBaseOp, popart::ResizeOp, popart::RestoreOp, popart::ReverseBaseOp, popart::RMSPropUpdaterOp, popart::RoiAlignGradOp, popart::RoiAlignOp, popart::ScaledAddOp, popart::ScatterDataGradOp, popart::ScatterReduceGradOp, popart::ScatterReduceOp, popart::ScatterUpdateGradOp, popart::SequenceSliceOp, popart::SGD1NesterovOp, popart::ShapeOrLikeOp, popart::SigmoidGradOp, popart::SoftmaxGradDirectOp, popart::SoftmaxGradOp, popart::SplineBasisOp, popart::SplineWeightingOp, popart::SplitGradOp, popart::SplitOp, popart::SqrtGradOp, popart::StashOp, popart::SubgraphOp, popart::SubsampleBaseOp, popart::SubsampleGradOp, popart::SyncOp, popart::TanhGradOp, popart::TensorRemapOp, popart::TileOp, popart::TopKGradOp, popart::TransposeBaseOp, popart::UpsampleOp, popart::VariadicGradOp, popart::VariadicOp, popart::VarUpdateOp, popart::WhereOp, popart::WhereXGradOp, popart::WhereYGradOp

Public Types

using SubgraphInSig = std::tuple<Op*, fwtools::subgraph::OutIndex, std::string>

The functionality required for sub-graph matching.

Public Functions

inline Settings &getSettings()

Get the settings associated with the op.

Returns

The op settings.

inline const Settings &getSettings() const

Get the settings associated with the op.

Returns

The op settings.

virtual Settings getInSettings(InIndex) const

Return suitable settings for an op inserted before the input to an existing op.

Parameters

InIndex – The input index before which the op is inserted.

Returns

The settings for the op inserted before the input index.

virtual Settings getOutSettings(OutIndex) const

Return suitable settings for an op inserted after the output of an existing op.

Parameters

OutIndex – The output index after which the op is inserted.

Returns

The settings for the op inserted after the output index.

Settings adjustInSettings(InIndex, Op::Settings) const

Adjust the settings to be suitable as input at the input index.

Parameters
  • InIndex – The input index where the settings are to be applied.

  • Settings – The settings to be adjusted.

Returns

Adjusted settings suitable for input at the input index.

Settings adjustOutSettings(OutIndex, Op::Settings) const

Adjust the settings to be suitable as output at an output index.

Parameters
  • OutIndex – The output index where the settings are to be applied.

  • Settings – The settings to be adjusted.

Returns

Adjusted settings suitable for output at the output index.

const OptionalVGraphId getOptionalVGraphId() const

Get the ID of the optional virtual graph.

Returns

The ID of the optional virtual graph.

VGraphId getVirtualGraphId() const

Get the ID of the virtual graph.

Returns

The ID of the virtual graph.

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex) const

Get virtual graph ID and tile set associated with an input index.

Parameters

InIndex – The input index.

Returns

The virtual graph ID and tile set at the input index.

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex) const

Get virtual graph ID and tile set associated with an output index.

Parameters

OutIndex – The output index.

Returns

The virtual graph ID and tile set at the output index.

virtual VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const

Get virtual graph ID and tile set associated with an input index.

Parameters
  • InIndex – The input index.

  • visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.

Returns

The virtual graph ID and tile set at the input index.

virtual VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const

Get virtual graph ID and tile set associated with an output index.

Parameters
  • OutIndex – The output index.

  • visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.

Returns

The virtual graph ID and tile set at the output index.

void setVirtualGraphId(const OptionalVGraphId)

Set a virtual graph ID for the op.

Parameters

OptionalVGraphId – The ID of the virtual graph to set on this op.

bool hasVirtualGraphId() const

Check if the op has a virtual graph ID set.

Returns

true if the op has a virtual graph ID set, false otherwise.
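
For example, a transform might assign a default virtual graph ID only when none is set (a sketch; op is assumed to be an Op*):

// Assign virtual graph 0 if the op does not have a virtual graph ID yet.
if (!op->hasVirtualGraphId()) {
  op->setVirtualGraphId(0);
}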

void setPipelineStage(OptionalPipelineStage)

Set a pipeline stage for the op.

Parameters

OptionalPipelineStage – The pipeline stage to be set for the op.

bool hasPipelineStage() const

Check if the op has a pipeline stage set.

Returns

true if the op has a pipeline stage set, false otherwise.

PipelineStage getPipelineStage() const

Get the pipeline stage that has been set for the op.

Returns

The pipeline stage that has been set for the op.

OptionalPipelineStage getOptionalPipelineStage() const

Get the optional pipeline stage.

Returns

The optional pipeline stage that has been set for the op.

const OptionalExecutionPhase getOptionalExecutionPhase() const

Get the optional execution phase.

Returns

The optional execution phase that has been set for the op.

virtual ExecutionPhase getExecutionPhase() const

Get the execution phase that has been set for the op.

Returns

The execution phase that has been set for the op.

void setExecutionPhase(const OptionalExecutionPhase)

Set the execution phase for the op.

Parameters

OptionalExecutionPhase – The execution phase to be set for the op.

bool hasExecutionPhase() const

Check if the op has an execution phase set.

Returns

true if the op has an execution phase set, false otherwise.

const OptionalBatchSerializedPhase getOptionalBatchSerializedPhase() const

Get the optional batch serialized phase.

Returns

The optional batch serialized phase that has been set for the op.

virtual BatchSerializedPhase getBatchSerializedPhase() const

Get the batch serialized phase.

Returns

The batch serialized phase that has been set for the op.

void setBatchSerializedPhase(const OptionalBatchSerializedPhase)

Set the batch serialized phase.

Parameters

OptionalBatchSerializedPhase – The batch serialized phase to be set for the op.

bool hasBatchSerializedPhase() const

Check if the op has a batch serialization phase set.

Returns

true if the op has a batch serialization phase set, otherwise false.

const OptionalStochasticRoundingMethod getOptionalStochasticRoundingMethod() const

Get the optional stochastic rounding method.

Returns

The optional stochastic rounding method that has been set for the op.

virtual StochasticRoundingMethod getStochasticRoundingMethod() const

Get the stochastic rounding method.

Returns

The stochastic rounding method that has been set for the op.

void setStochasticRoundingMethod(const OptionalStochasticRoundingMethod)

Set the optional stochastic rounding method.

Parameters

OptionalStochasticRoundingMethod – The optional stochastic rounding method to be set for the op.

bool hasStochasticRoundingMethod() const

Check if the op has a stochastic rounding method set.

Returns

true if the op has a stochastic rounding method set, otherwise false.

bool isExcludedFromPattern(const Pattern*) const

Check if the op is excluded from a pattern.

Returns

true if the op is excluded from a pattern, false otherwise.

inline virtual int getInBatchAxis(InIndex) const

Get the batch axis for the input index.

Returns

The batch axis for the input index.

inline virtual int getOutBatchAxis(OutIndex) const

Get the batch axis for the output index.

Returns

The batch axis for the output index.

void inheritPlacementAttributes(bool inheritSerializations, AliasModel &aliasModel)

Helper function to set an op’s placement attributes by inheriting them from other ops in the graph.

The attributes that are set include:

  • Execution context.

  • Pipeline stage.

  • Execution phase.

  • Virtual graph ID.

  • Batch serial phase (optional).

Parameters
  • inheritSerializations – The indicator to enable or disable the batch serialization phase. true enables the batch serialization phase and false disables it.

  • aliasModel – An AliasModel object containing alias info for this op’s graph.

Ir &getIr()

Get the IR associated with the op.

Returns

The IR associated with the op.

const Ir &getIr() const

Get the IR associated with the op.

Returns

The IR associated with the op.

inline Graph &getGraph()

Get the graph associated with the op.

Returns

The graph associated with the op.

inline const Graph &getGraph() const

Get the graph associated with the op.

Returns

The graph associated with the op.

inline const Scope &getScope() const

Get the scope associated with the op.

Returns

The scope associated with the op.

inline void setScope(const Scope &scope)

Set the scope associated with the op.

Parameters

scope – The scope to be set.

inline const std::string &getName() const

Get the name of the op.

Returns

The name of the op.

inline void setName(const std::string &name)

Set the name of the op.

Parameters

name – The name to be set.

inline const OpDebugInfo &getDebugInfo() const

Get the debug info of the op.

Returns

The debug info for the op.

virtual bool isNorm() const

Checks if the op is a norm op.

Returns

true if the op is a norm op, false otherwise.

bool isElementWiseUnary() const

Checks if the op is an element-wise unary op.

Returns

true if the op is an element-wise unary op, false otherwise.

virtual bool canBeReplacedByIdentity() const

Check if the op can be replaced by the identity op.

Returns

true if the op can be replaced by the identity op, false otherwise.

Op(const OperatorIdentifier &_opid, const Op::Settings &settings)

Constructor of the Op class.

Parameters
  • _opid – The operator identifier specifying domain:type:version, minimum and maximum number of input tensors and number of output tensors.

  • settings – The general op settings such as graph, name and scope.
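
To illustrate the constructor together with the core virtual methods, here is a minimal sketch of a concrete op (the class name is hypothetical; a real op would normally also register an op definition with the op manager):

#include <popart/op.hpp>

class MyIdentityOp : public popart::Op {
public:
  MyIdentityOp(const popart::OperatorIdentifier &opid,
               const popart::Op::Settings &settings)
      : popart::Op(opid, settings) {}

  // Every concrete op must be clonable.
  std::unique_ptr<popart::Op> clone() const final {
    return std::make_unique<MyIdentityOp>(*this);
  }

  // setup() must set type and shape information for all outputs.
  void setup() final { outInfo(0) = inInfo(0); }

  // Relative cost estimate used when outlining subgraphs.
  float getSubgraphValue() const final { return getLowSubgraphValue(); }
};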

Op(const Op&)

Copy constructor.

Note

This does NOT copy input and output.

Op &operator=(const Op&) = delete
virtual ~Op()

Destructor.

std::string str() const final

Return the op ID.

std::string debugName() const

Return the op name that is used for debug and profiling.

void createAndConnectOutTensor(OutIndex, TensorId)

Create an ActGrad (output) tensor and connect it to this op’s output.

Parameters
  • OutIndex – The output index that the output tensor should be connected to.

  • TensorId – The tensor ID of the tensor to be converted to an output tensor.

void append(std::stringstream &ss) const

Append this op to a stream.

Parameters

ss – The stream to append the op to.

void toJSON(std::stringstream &ss) const

Convert this op to JSON format and append it to a stream.

Parameters

ss – The stream to append the JSON-serialised op to.

int64_t memOfOutputs() const

Return the total memory used by all output tensors.

inline virtual std::set<InIndex> optionalInputs() const

Return the input indices of all optional inputs to the op.

void defaultConnectInTensor(InIndex, TensorId)

Connect a tensor to an input index.

This method updates the input and updates consumers of the tensor with the tensor ID.

Parameters
  • InIndex – The input index to connect the tensor to.

  • TensorId – The tensor ID of the tensor to connect.

virtual void connectInTensor(InIndex index, TensorId tensorId)

Connect existing tensor to input index.

Parameters
  • index – The input index at which to connect the tensor.

  • tensorId – The ID of the existing tensor.

virtual void connectInTensor(InIndex inIndex, TensorId tensorId, VGraphId vgid)

Connect an existing tensor to an index with the source virtual graph.

Parameters
  • inIndex – The input index at which to connect the tensor.

  • tensorId – The ID of the existing tensor.

  • vgid – The virtual graph on which the existing tensor resides.

void connectInTensorDispatch(InIndex inIndex, TensorId tensorId)

Connect an existing tensor at an index with the source virtual graph.

Dispatcher to resolve issues with templated inheritance overloads. This will automatically derive the virtual graph ID of the input when required.

Parameters
  • inIndex – The input index at which to connect the tensor.

  • tensorId – The ID of the existing tensor.

void connectInTensorLike(const Op *other, InIndex index, TensorId tenId)

Connects the input tensor analogously to another op.

This is useful when cloning graphs or ops, because it avoids having to check if the op requires special considerations when connecting inputs.

IpuCopyOp is currently the only op where this applies, since connecting its inputs otherwise requires a source virtual graph to be specified:

void connectInTensor(InIndex, TensorId, uint64_t sourceIpu);

Parameters
  • other – An op of the same type as the current op, from which to copy how the tensor at the corresponding index should be connected.

  • index – The input index to connect.

  • tenId – The ID of the tensor to connect.

void connectOutTensor(OutIndex, TensorId)

Connect existing tensor to output index.

Parameters
  • index – The output index at which to connect the tensor.

  • tensorId – The ID of the existing tensor.

void disconnectInTensor(Tensor *tensor)

Disconnect an input tensor from the op.

Parameters

tensor – The tensor to disconnect.

virtual void disconnectInTensor(InIndex, Tensor *tensor)

Disconnect an input tensor from the op at a specific input index.

Parameters
  • tensor – The tensor to disconnect.

  • InIndex – The index of the input tensor in the op.

void disconnectInTensor(InIndex)

Disconnect an input tensor from the input index.

Parameters

InIndex – The input index to disconnect the tensor from.

void disconnectOutTensor(Tensor *tensor)

Disconnect an output tensor from the op.

Parameters

tensor – The tensor to disconnect.

void disconnectAllInputs()

Disconnect all input tensors from the op.

void disconnectAllOutputs()

Disconnect all output tensors from the op.

const std::string &name() const

Return the op name.

virtual void setup()

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

void finalizeDebugInfo()

Finalize DebugInfo.

This method is called once after Ir::prepare() has completed.

virtual void setCalledSubgraphGradInfo(const FwdGraphToBwdGraphInfo &calledGraphsGradInfo)

Set information about the gradient graphs for this op’s called subgraphs.

If the op has called subgraphs, then this method will get called prior to getGradOps() to provide the op with the information it needs to call the grad version of the called subgraphs.

Parameters

calledGraphsGradInfo – The mapping between the forward graph and information on the gradient graph.

virtual std::vector<std::unique_ptr<Op>> getGradOps()

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
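
A typical override creates and returns the matching grad op (a sketch; MyIdentityGradOp is a hypothetical grad op for the MyIdentityOp sketch above):

std::vector<std::unique_ptr<popart::Op>> MyIdentityOp::getGradOps() {
  std::vector<std::unique_ptr<popart::Op>> result;
  result.emplace_back(std::make_unique<MyIdentityGradOp>(*this));
  return result;
}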

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several variants, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters

OperatorIdentifier – The operator identifier of the op to be instantiated.

Returns

An instance of the required op.
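
A sketch of the usual pattern for these two methods (the inplace identifier and class are hypothetical):

std::vector<std::tuple<popart::OperatorIdentifier, float>>
MyIdentityOp::inplacePriorityDefault() const {
  // A single inplace variant, offered with priority 10.
  return {{myIdentityInplaceId, 10.0f}};
}

std::unique_ptr<popart::Op>
MyIdentityOp::getInplaceVariant(const popart::OperatorIdentifier &oid) const {
  if (oid == myIdentityInplaceId) {
    return std::make_unique<MyIdentityInplaceOp>(*this);
  }
  // Defer to the base class for any other identifier.
  return popart::Op::getInplaceVariant(oid);
}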

virtual void growAliasModel(AliasModel &aliasModel) const

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

Pre

All input tensors of this op must have mappings in aliasModel before this method is called.

Post

All output tensors of this op will have mappings in aliasModel after this method is called.

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel &aliasModel, OperatorIdentifier) const

Translate a PopART inplacing proposal.

This translates a proposal to replace an outplace op with an inplace op of type inplaceId into an AliasModel equivalent.

Parameters
  • aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

  • OperatorIdentifier – The operator identifier of the inplace variant to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

virtual view::Regions modifies(InIndex) const

Return the input region which this op modifies (for inplace ops).

Parameters

InIndex – The input index.

Returns

The regions which this op modifies.

virtual view::Regions uses(InIndex) const

Return the input region which this op uses.

Parameters

InIndex – The input index.

Returns

The regions which this op uses.

virtual view::Regions aliases(InIndex, OutIndex) const

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters
  • InIndex – The input index.

  • OutIndex – The output index.

Returns

The regions which the output will alias.

virtual view::RegMap fwdRegMap(InIndex, OutIndex) const

Map regions of the input tensor at the input index to the regions of the output tensor at the output index that these input regions alias.

Parameters
  • InIndex – The op input index.

  • OutIndex – The op output index.

virtual view::RegMap bwdRegMap(InIndex, OutIndex) const

Map regions of the output tensor at the output index to the regions of the input tensor at the input index that these output regions alias.

Parameters
  • InIndex – The op input index.

  • OutIndex – The op output index.

virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const

Determine whether output tensors are guaranteed to have an equal value across all replicas.

This means that they are “replica equal”. The check is based on information about the replica equal status of input tensors (and the same for any inputs that are modified by the op).

The default implementation sets each output tensor as being replica-equal if and only if all tensor inputs are replica-equal. For modified inputs, the default is to assume it is replica-equal only if there is an output that is deemed replica-equal that fully aliases all elements of the input. This default implementation is not correct for all ops. Ops that need a specialized implementation should override this virtual function.

Parameters
  • aliasModel – An alias model object.

  • inputMap – A map that stores, for each input, whether the inputs are data-equivalent over all replicas.

  • proxy – A helper object passed in by the replica-equal analysis.

Returns

A tuple comprising of:

  1. a mapping from output index to a replica-equal status with an entry for each output tensor.

  2. a vector of input indices for inputs that were modified by the op to a value that is not replica-equal.

bool doesAlias() const

Check if any input tensor aliases any output tensor.

Returns

true if any input tensor aliases any output tensor, otherwise false.

inline bool isOutplace() const

Check if this is an outplace op.

This means that no input tensor aliases any output tensor.

Returns

true if this is an outplace op, otherwise false.

bool doesAlias(InIndex inIndex, OutIndex outIndex) const

Check whether the input tensor at an input index aliases the output tensor at an output index.

Returns

true if the input tensor at inIndex aliases the output tensor at outIndex, false otherwise.

bool modifies() const

Check if op modifies a tensor at any index.

Returns

true if the op modifies a tensor at any index, otherwise false.

bool modifiesIndex(InIndex in) const

Check if an op modifies a tensor at a specific index.

Parameters

in – The input index to check.

Returns

true if the op modifies the tensor, false otherwise.

bool overwritesTensor(Tensor *t) const

Check if an op overwrites a tensor.

Parameters

t – The tensor to check.

Returns

true if it overwrites the tensor, false otherwise.

bool modifiesTensor(Tensor *t) const

Check if an op modifies a tensor.

Parameters

t – The tensor to check.

Returns

true if it modifies the tensor, false otherwise.

inline virtual bool isInplaceViewChange() const

Check if this is an inplace op that changes a view.

Examples of inplace ops that change views are:

  • ReshapeInplaceOp

  • IdentityInplaceOp

  • TransposeInplaceOp.

    See also

    For more information on views, refer to the IPU Programmer’s Guide.

Returns

true if this is a view changing inplace op, false otherwise.

inline virtual bool isOutplaceViewChange() const

Check if this is an outplace op that changes a view.

Examples of outplace ops that change views are the outplace counterparts of the ops listed for isInplaceViewChange(), for example ReshapeOp and TransposeOp.

Returns

true if this is a view changing outplace op, otherwise false.

virtual int getNonGradInIndex(int gradOpOutIndex) const

Return the index in the non-grad op which has an output edge-gradient tensor in the matching grad op.

This method throws an error if the op this is called on is not a grad op.

Parameters

gradOpOutIndex – The index at which the grad op has an output of an edge-gradient tensor.

Returns

The index in the non-grad op containing the input tensor corresponding to the edge-gradient tensor in the grad op output.

virtual const std::vector<GradInOutMapper> &gradInputInfo() const

Get the mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

Returns

The mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

virtual const std::map<int, int> &gradOutToNonGradIn() const

Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.
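
For a grad op, these mappings are typically returned as static data. A sketch for the hypothetical MyIdentityGradOp, whose input 0 is the gradient of the forward op’s output 0:

const std::vector<popart::GradInOutMapper> &
MyIdentityGradOp::gradInputInfo() const {
  // Grad op input 0 is the gradient of forward output 0.
  static const std::vector<popart::GradInOutMapper> inInfo = {
      {0, 0, popart::GradOpInType::GradOut}};
  return inInfo;
}

const std::map<int, int> &MyIdentityGradOp::gradOutToNonGradIn() const {
  // Grad op output 0 is the gradient of forward input 0.
  static const std::map<int, int> outInfo = {{0, 0}};
  return outInfo;
}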

virtual std::unique_ptr<Op> clone() const = 0

Return a copy of the op.

This method must be implemented by all concrete ops; since clone() is pure virtual, compilation fails if it is not.

template<typename T>
inline bool isConvertibleTo() const
virtual bool isLossOp() const

Check if this is a LossOp op, for example NllOp or L1Op.

Note

The op SumOp which adds the losses together is not a LossOp.

Returns

true if this is a LossOp op, false otherwise.

virtual bool isIpuCopyOp() const

Check if this is an IpuCopyOp op.

Returns

true if this is an IpuCopyOp op, false otherwise.

virtual bool copiesOptimizerTensors() const

Check if this copies only optimizer tensors from one IPU to another.

Returns

true if this op copies only optimizer tensors from one IPU to another, false otherwise.

virtual bool isOptimizerOp() const

Check if op is part of the optimizer.

bool isGradientClippingOp() const

Check if op is a part of gradient clipping.

virtual bool requiresRandomSeed() const

Check if the op requires a random seed.

This is set to false by default and should be overridden to return true if an IPU random seed tensor is required by the op. If so, the seed tensor will be connected to inTensor(getSeedInIndex()) by the IR process.

Returns

true if the op requires a random seed, false otherwise.

virtual InIndex getSeedInIndex() const

Get the input index at which the random seed tensor is connected, for ops that require a random seed.

Returns

The input index of the random seed tensor.

bool hasInput(InIndex index) const

Check if the op has an input at the input index.

Returns

true if the op has an input at the input index, otherwise false.

bool hasOutput(OutIndex index) const

Check if the op has an output at the output index.

Returns

true if the op has an output at the output index, otherwise false.

Tensor *inTensor(InIndex index)

Get the input tensor at the input index.

Parameters

index – The input index.

Returns

The tensor at the input index.

const Tensor *inTensor(InIndex index) const

Get the input tensor at the input index.

Parameters

index – The input index.

Returns

The tensor at the input index.

Tensor *outTensor(OutIndex index)

Get the output tensor at the output index.

Parameters

index – The output index.

Returns

The tensor at the output index.

const Tensor *outTensor(OutIndex index) const

Get the output tensor at the output index.

Parameters

index – The output index.

Returns

The tensor at the output index.

TensorId inId(InIndex index)

Get the ID of the input tensor at the input index.

Parameters

index – The input index.

Returns

The tensor ID of the tensor at the input index.

const TensorId inId(InIndex index) const

Get the ID of the input tensor at the input index.

Parameters

index – The input index.

Returns

The tensor ID of the tensor at the input index.

TensorId outId(OutIndex index)

Get the ID of the output tensor at the output index.

Parameters

index – The output index.

Returns

The tensor ID of the tensor at the output index.

const TensorId outId(OutIndex index) const

Get the ID of the output tensor at the output index.

Parameters

index – The output index.

Returns

The tensor ID of the tensor at the output index.

TensorInfo &inInfo(InIndex index)

Get the info of the input tensor at the input index.

Parameters

index – The input index.

Returns

The tensor info of the tensor at the input index.

const TensorInfo &inInfo(InIndex index) const

Get the info of the input tensor at the input index.

Parameters

index – The input index.

Returns

The tensor info of the tensor at the input index.

TensorInfo &outInfo(OutIndex index)

Get the info of the output tensor at the output index.

Parameters

index – The output index.

Returns

The tensor info of the tensor at the output index.

const TensorInfo &outInfo(OutIndex index) const

Get the info of the output tensor at the output index.

Parameters

index – The output index.

Returns

The tensor info of the tensor at the output index.

const Shape &inShape(InIndex index) const

Get the shape info of the input tensor at the input index.

Parameters

index – The input index.

Returns

The shape info of the tensor at the input index.

const Shape &outShape(OutIndex index) const

Get the shape info of the output tensor at the output index.

Parameters

index – The output index.

Returns

The shape info of the tensor at the output index.

size_t inTensorCount() const

Get the number of input tensors of this op.

Returns

The number of input tensors this op has.

size_t outTensorCount() const

Get the number of output tensors of this op.

Returns

The number of output tensors this op has.

Rank inRank(InIndex index) const

Get the rank of the input tensor at the input index.

Parameters

index – The input index.

Returns

The rank of the tensor at the input index.

Rank outRank(OutIndex index) const

Get the rank of the output tensor at the output index.

Parameters

index – The output index.

Returns

The rank of the tensor at the output index.

InIndex inIndex(Tensor*) const

Get the input index of the tensor.

Parameters

Tensor – The input tensor.

Returns

The input index of the tensor in the op.

OutIndex outIndex(Tensor*) const

Get the output index of the tensor.

Parameters

Tensor – The output tensor.

Returns

The output index of the tensor in the op.

virtual void appendAttributes(OpSerialiserBase&) const

Append attributes when serialising the op to a stream.

This is used for debugging and also to generate the PopART IR hash. This hash is used to determine whether a Poplar cache can be reused so it is important that op attributes which may alter the Poplar compilation are appended to this stream. If this method is overridden, then it must also call the base class method.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

virtual void appendOutlineAttributes(OpSerialiserBase&) const

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

virtual void appendMore(OpSerialiserBase&) const

Append additional attributes to the stream.

This method should be overridden if the derived class has additional attributes.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.
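For example, a derived op with an extra attribute that affects outlining might override this method as follows (a minimal sketch; MyOp and its alpha member are hypothetical):

void MyOp::appendOutlineAttributes(popart::OpSerialiserBase &os) const {
  // Call the base class method first, as with appendAttributes().
  Op::appendOutlineAttributes(os);
  // Append the attribute that distinguishes instances of MyOp (hypothetical).
  os.appendAttribute("alpha", alpha);
}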

Shape prettyNpOut(const Shape &s0, const Shape &s1) const

Calculate the NumPy broadcast shape for two shapes.

This will throw an error if the broadcast is not aligned. The error will have operator context. Note: If the replicated tensor sharding meta-shape is required, use prettyNpOut with TensorInfo instead.

Parameters
  • s0 – The first shape.

  • s1 – The second shape.

Returns

The NumPy-like broadcasted output shape.
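For example, a rank-3 shape can be broadcast with a rank-2 shape following the NumPy rules. A minimal sketch, assuming op points to a valid popart::Op:

popart::Shape s0 = {4, 1, 3};
popart::Shape s1 = {2, 3};
popart::Shape out = op->prettyNpOut(s0, s1); // out == {4, 2, 3}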

TensorInfo prettyNpOut(const TensorInfo &i0, const TensorInfo &i1, bool checkDataType = true) const

Calculate the NumPy broadcast shape for two shapes.

This will throw an error if the broadcast is not aligned. The error will have operator context.

Parameters
  • i0 – The info for the first tensor containing shape and meta-shape.

  • i1 – The info for the second tensor containing shape and meta-shape.

  • checkDataType – Check that the data types are identical. If true, check that the data types are identical and throw an error if they are not. If false, do not check that data types are identical.

Returns

The NumPy-like broadcast output info containing the correct shape and meta-shape. The data type is taken from i0.

virtual std::vector<const Graph*> getCalledGraphs() const

Get all graphs that this op may call during its execution.

Returns

A vector of all graphs that this op may call during its execution.

std::vector<GraphId> getCalledGraphIds() const

Get the IDs of all graphs that this op may call during its execution.

Returns

A vector of IDs of all graphs that this op may call during its execution.

SubgraphIndex getCalledGraphIndex(const GraphId &id) const

Get the index in the op where the graph is called.

Parameters

id – The ID of the called graph.

Returns

The index at which the graph is called.

virtual InIndex opInToSubgraphInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const

Get the input index for the subgraph corresponding to the op input index.

Parameters
  • subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).

  • inIndex – The input index in the op.

Returns

The input index in the subgraph that corresponds to the input index in the op, or -1 if the op input index is not used by the subgraph.

virtual InIndex subgraphInToOpInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const

Get the input index for the op corresponding to the subgraph input index.

Parameters
  • subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).

  • inIndex – The input index in the subgraph.

Returns

The input index in the op that corresponds to the input index in the subgraph, or -1 if the subgraph input index is not used by the op.

virtual OutIndex opOutToSubgraphOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const

Get the output index for the subgraph corresponding to the op output index.

Parameters
  • subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).

  • outIndex – The output index in the op.

Returns

The output index in the subgraph that corresponds to the output index in the op, or -1 if the op output index is not used by the subgraph.

virtual OutIndex subgraphOutToOpOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const

Get the output index for the op corresponding to the subgraph output index.

Parameters
  • subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).

  • outIndex – The output index in the subgraph.

Returns

The output index in the op that corresponds to the output index in the subgraph, or -1 if the subgraph output index is not used by the op.
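These four methods translate between op-level and subgraph-level indices. A minimal sketch, assuming op calls at least one subgraph:

// Map op input 1 to the corresponding input of subgraph 0.
popart::InIndex sgIn = op->opInToSubgraphInIndex(0, 1);
if (sgIn == -1) {
  // Op input 1 is not used by the subgraph.
}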

virtual std::set<OutIndex> opInToOpOutIndex(InIndex in) const

Get the set of outputs to visit based on the input index (for graph traversal).

Parameters

in – The input index used to determine the set of outputs to visit.

Returns

The set of outputs to visit based on the input index.

virtual std::set<InIndex> opOutToOpInIndex(OutIndex out) const

Get the set of inputs to visit based on the output index (for graph traversal).

Parameters

out – The output index used to determine the set of inputs to visit.

Returns

The set of inputs to visit based on the output index.

std::string getSubgraphEquivId(const std::map<std::string, popart::any> &externalAttrs = {}) const

Get a string that represents the equivalence class that this op belongs to.

This is used, for example by transforms, to determine whether two ops are the same. Two ops can be considered members of the same equivalence class if and only if they return the same equivalence ID.

Parameters

externalAttrs – Additional attributes by which to distinguish this op. The value types must be one of: float, double, int, int64_t, uint32_t, uint64_t, std::string, std::vector<float>, std::vector<double>, std::vector<int64_t>, popart::Scope, bool, nonstd::optional<int64_t>, nonstd::optional<float>, nonstd::optional<double> or std::map<TensorId, uint64_t>. This is used to add, for example, replica-equalness properties to the equivalence ID; such properties are calculated on the fly rather than stored in the op.

Returns

The equivalence ID.
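A minimal sketch of comparing two ops by equivalence class, assuming op0 and op1 point to valid popart::Op objects; the attribute name and value here are illustrative:

std::map<std::string, popart::any> extraAttrs;
extraAttrs["replicaEqual"] = int64_t(1);
bool sameClass = op0->getSubgraphEquivId(extraAttrs) ==
                 op1->getSubgraphEquivId(extraAttrs);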

std::map<fwtools::subgraph::InIndex, SubgraphInSig> getSubgraphInputs() const

Get all the producer ops of the tensors consumed at each input index.

Returns

A map from input index to the producer ops of the tensors consumed at that index.

std::map<fwtools::subgraph::OutIndex, OpSet> getSubgraphOutputs() const

Get all the consumer ops of the tensors produced at each output index.

Returns

A map from output index to the consumer ops of the tensors produced at that index.

virtual float getSubgraphValue() const = 0

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are returned by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.

inline float getHighSubgraphValue() const

Return the high subgraph value.

inline float getLowSubgraphValue() const

Return the low subgraph value.

virtual float calcAutoVirtualGraphCost(std::set<int> &inputs_seen)

Get approximate cost of activations between forward and backward graphs.

virtual bool isOutlineable() const

Check if op can be outlined.

If this method returns false, it will mean that any possible subgraph that this op is part of will not be cached.

Returns

true if the op can be outlined, false otherwise. Default: true.

virtual bool hasSideEffect() const

Check if the op has any effect that is not captured by the (modification of) input or output tensors, such as modifying the state of the IPU or host system.

Returns

true if the op has side effects, false otherwise. Default: false.

virtual bool canRecompute() const

Check if the op can be recomputed.

Recomputing an op means cloning it to produce the same output. This method checks whether recomputation is safe in the context of explicit recomputation; it may still be unsafe for implicit recomputation.

Returns

true if the op can be recomputed, false otherwise. Default: hasSideEffect().

bool inputsUnmodifiable() const

Check if any input indices are unmodifiable or alias an unmodifiable tensor.

Returns

true if any connected variable tensor for all input indices has a non-empty alias chain and is unmodifiable, false otherwise.

bool consumesGraphOutput() const

Check if op consumes the outputs of the graph.

Returns

true if op consumes graph outputs, false otherwise.

bool producesGraphOutput() const

Check if op produces the outputs of the graph.

Returns

true if op produces graph outputs, false otherwise.

bool inputUnmodifiable(InIndex in) const

Check if the input index is unmodifiable or aliases an unmodifiable tensor.

Parameters

in – The input index to check.

Returns

true if any connected variable tensor has a non-empty alias chain and is unmodifiable, false otherwise.

bool inputUnmodifiableFor(InIndex in, const AliasModel *popMem) const

Check if the input index is unmodifiable or aliases an unmodifiable tensor, using the given poprithms graph.

Parameters

  • in – The input index to check.

  • popMem – The AliasModel (poprithms graph) containing the aliasing information to use for the check.

Returns

true if any connected variable tensor has a non-empty alias chain and is unmodifiable, false otherwise.

bool hasAliasedModifiers(OutIndex out) const

Check if output is modified by any consumer.

Parameters

out – The output index to check.

Returns

true if any consumer of any aliased tensor downstream modifies a non-empty region, false otherwise.

bool hasAliasedModifiersFor(OutIndex out, const AliasModel *popMem) const

Check if the output is modified by any consumer, using the given poprithms graph.

Parameters

  • out – The output index to check.

  • popMem – The AliasModel (poprithms graph) containing the aliasing information to use for the check.

Returns

true if any consumer of any aliased tensor downstream modifies a non-empty region, false otherwise.

bool isParentOf(const Op*) const

Check if this op is a parent of the given op.

An op is a parent of another op if and only if the other op is a child of it.

Parameters

Op – The op that is being checked.

Returns

true if this op is a parent of the given op, false otherwise.

bool isChildOf(const Op*) const

Check if this op is a child of the given op.

An op is a direct child of another op if it consumes any of the tensors the other op produces.

Parameters

Op – The op that is being checked.

Returns

true if this op is a child of the given op, false otherwise.

virtual bool canShard() const

Check if the operation can be sharded into multiple operations.

Returns

true if the operation can be sharded, false otherwise.

virtual ReductionType getShardReductionType(OutIndex index) const

Get the reduction type to apply after sharding, if the output shape does not change.

Parameters

index – The output index at which to determine the reduction type.

Returns

The reduction type.

inline virtual float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const

Get the scale factor to apply after sharding, if required.

Parameters
  • shardedOp – The sharded op.

  • index – The output index at which to determine the scale factor.

Returns

The scale factor. Default: 1.0.

std::map<TensorId, std::vector<TensorId>> shard(const std::map<TensorId, std::vector<TensorId>> &inputs)

Shard an operation into multiple operations according to the new, already sharded input tensors.

Parameters

inputs – The sharded input tensors.

Returns

The sharded output tensors.

ShardingPlan shard(const ShardingPlan plan)

Create an output sharding plan from sharding an op.

The sharding plan also contains the individual input/output shards of an operation. When sharding an operation, the new plan is updated with the resulting sharded tensors.

Parameters

plan – The input sharding.

Returns

The plan after sharding the operation containing the resulting sharded tensors.

virtual void configureShardedOp(Op *const shardedOp, const Settings *const settings_) const

Configure a sharded op.

Parameters
  • shardedOp – The sharded op to be configured.

  • settings_ – The settings to apply to the sharded op.

virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const

Return which inputs and outputs are replicated tensor sharding pairs.

virtual void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, CommGroup shardingDomain)

Configure the op for replicated tensor sharding at specific indices.

Parameters
  • indices – The indices at which to configure the op for replicated tensor sharding.

  • shardingDomain – The type and size of the replica group specified by a CommGroup object.

virtual void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, const ReplicaGrouping &grouping)

Configure the op for replicated tensor sharding at specific indices.

Parameters
  • indices – The indices at which to configure the op for replicated tensor sharding.

  • grouping – The stride and size of the replica group specified by a ReplicaGrouping object.

void transferBaseProperties(Op *to)

Transfer the base properties of this op to another op.

Parameters

to – The op to transfer the base properties to.

Op *getPrecedingOp(InIndex inIndex)

Get the producer op of the input tensor at the input index.

Parameters

inIndex – The index at which the input tensor is produced.

Returns

The op which produces the input tensor at the input index.

Op *getFollowingOp(OutIndex outIndex = 0)

Get the op that consumes an output tensor at an output index.

This will throw an error if there is more than one consumer op.

Parameters

outIndex – The index at which the output tensor is consumed.

Returns

The op which consumes the output tensor at the output index.

std::vector<Op*> getFollowingOps(OutIndex outIndex = 0)

Get all ops that consume an output tensor at an output index.

Parameters

outIndex – The index at which the output tensor is consumed.

Returns

A vector of ops which consume the output tensor at the output index.

template<typename T>
inline T *getPrecedingOp(InIndex inIndex)

Get the producer op of the input tensor at the input index.

This will throw an error if the producer op cannot be converted to type T.

Parameters

inIndex – The index at which the input tensor is produced.

Returns

The op, converted to type T, which produces the input tensor at the input index.

template<typename T>
inline T *getFollowingOp(OutIndex outIndex = 0)

Get the op that consumes an output tensor at an output index.

This will throw an error if there is more than one consumer op, or if the consumer op cannot be converted to type T.

Parameters

outIndex – The index at which the output tensor is consumed.

Returns

The op, converted to type T, which consumes the output tensor at the output index.

template<typename T>
inline std::vector<T*> getFollowingOps(OutIndex outIndex = 0)

Get all ops that consume an output tensor at an output index.

This will throw an error if not all of the consumer ops can be converted to type T.

Parameters

outIndex – The index at which the output tensor is consumed.

Returns

A vector of ops, converted to type T, which consume the output tensor at the output index.
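A minimal sketch of typed graph traversal, assuming the op's first output is consumed by a single popart::ReluOp:

#include <popart/op/relu.hpp>

popart::Op *producer = op->getPrecedingOp(0);
// Throws if there are multiple consumers or the consumer is not a ReluOp.
popart::ReluOp *relu = op->getFollowingOp<popart::ReluOp>(0);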

bool isPipelineIpuCopyOp() const

Check if the op is an IpuCopyOp that copies between pipeline stages.

Returns

true if the op is an IpuCopyOp that copies between pipeline stages, false otherwise.

Public Members

std::unique_ptr<TensorIndexMap> input
std::unique_ptr<TensorIndexMap> output
OpId id = {-1}
OperatorIdentifier opid
bool pruneable = true
Settings settings
OpDebugInfo debugInfo
struct Settings

Structure to capture the settings for the op.

Public Functions

inline Settings(Graph &graph_, const std::string &name_)

Constructor for the Settings structure.

Parameters
  • graph_ – The graph the op belongs to.

  • name_ – The name of the op.

inline Settings(Graph &graph_, const std::string &name_, const Scope &scope_)

Constructor for the Settings structure.

Parameters
  • graph_ – The graph the op belongs to.

  • name_ – The name of the op.

  • scope_ – The scope of the op.

inline Settings(Graph &graph_, const std::string &name_, const Scope &scope_, const uint64_t parentId_)

Constructor for the Settings structure.

Parameters
  • graph_ – The graph the op belongs to.

  • name_ – The name of the op.

  • scope_ – The scope of the op.

  • parentId_ – The ID of the debug info.

inline Settings(Graph &graph_, const std::string &name_, const uint64_t parentId_)

Constructor for the Settings structure.

Parameters
  • graph_ – The main graph.

  • name_ – The name of the op.

  • parentId_ – The ID of the debug info.

virtual ~Settings() = default

Destructor for the Settings structure.

Settings(const Settings&) = default
inline Settings copy(const std::string &new_name)

Create a copy of the current settings with a new name.

Parameters

new_name – The name of the new settings.

Returns

A copy of the current settings with the new name.
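A minimal sketch, assuming graph is a valid popart::Graph:

popart::Op::Settings settings(graph, "myOp");
// Reuse the same graph, scope and other properties under a new name.
popart::Op::Settings cloned = settings.copy("myOpCopy");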

virtual void setFromAttributes(const Attributes &attributes)

Append optional attributes to the Settings structure, depending on whether each attribute has been set in the ONNX model.

Parameters

attributes – The attributes to be added to the Settings structure.

Ir &getIr() const

Get the IR associated with the main graph.

Returns

The IR associated with the main graph.

Public Members

std::reference_wrapper<Graph> graph
std::string name = ""
Scope scope
RecomputeType recomputeType = RecomputeType::Undefined
OptionalTensorLocation tensorLocation
std::vector<std::tuple<std::string, float>> inplacePriorityVeto
std::unordered_set<std::string> excludePatterns
OptionalVGraphId vgraphId
OptionalPipelineStage pipelineStage
OptionalExecutionPhase executionPhase
OptionalBatchSerializedPhase batchSerializedPhase
OptionalStochasticRoundingMethod stochasticRoundingMethod
TileSet tileSet = {TileSet::Compute}
ExecutionContext executionContext = {ExecutionContext::Normal}
std::map<InIndex, InIndex> inferTensorMappingToFrom
double schedulePriority = {0.0}
std::map<std::string, std::string> extraOutlineAttributes
uint64_t debugInfoId = {0}
bool optimizerOp = {false}
bool gradientClippingOp = {false}
class GradInOutMapper

Class that represents the mapping between the indices of the input tensors to the gradient operation and the indices of these same tensors in the non-gradient operation.

Public Functions

GradInOutMapper(InIndex iGrad_, int iNonGrad_, GradOpInType)

Constructor for the GradInOutMapper class.

Parameters
  • iGrad_ – The index of the input tensor to the gradient operation.

  • iNonGrad_ – The index of the gradient operation input tensor as it is indexed in the non-gradient operation.

  • GradOpInType – The type of the input tensor to the gradient operation.
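For example, to declare that input 0 of the gradient op is the gradient of output 0 of the non-gradient op (a minimal sketch):

popart::GradInOutMapper mapping(0, 0, popart::GradOpInType::GradOut);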

bool operator==(const GradInOutMapper &rhs) const

Check if the current GradInOutMapper object is equal to another GradInOutMapper object.

Parameters

rhs – A GradInOutMapper object to be compared to the current object.

Returns

true if objects are equal, false otherwise.

Public Members

InIndex iGrad
int iNonGrad
GradOpInType type
enum class popart::ReductionType

Define the reduction operation to use over a sequence of tensors.

The two use-cases for this enum type are:

  • denoting how to reduce individual losses produced by a LossOp over a minibatch (specified by the LossOp reduction parameter)

  • denoting how to reduce weight gradients over a number of replicas when gradient accumulation is enabled (specified by the global session option SessionOptions::accumulationAndReplicationReductionType).
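The second of these use cases is configured through the session options. A minimal sketch, assuming opts is a popart::SessionOptions object:

opts.accumulationAndReplicationReductionType = popart::ReductionType::Mean;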

Values:

enumerator Sum = 0

Sum the input values and do not scale the output (Default).

enumerator Mean

Take the mean of the input values.

enumerator NoReduction

Do not reduce the input values.

Keep them stacked into a single tensor. So values t_1, ..., t_k get collected into a tensor [t_1, ..., t_k].

enumerator N

The number of ReductionType values.

#include <popart/operatoridentifier.hpp>
struct OperatorIdentifier

Subclassed by popart::AiGraphcoreOpIdV1

Public Functions

inline OperatorIdentifier(const OpDomain &_domain, const OpType &_type, OpVersion _version, NumInputs inputs = {}, int outputs = 0)
inline bool operator==(const OperatorIdentifier &rhs) const
inline bool operator!=(const OperatorIdentifier &rhs) const
inline bool operator<(const OperatorIdentifier &rhs) const

Public Members

OpDomain domain
OpType type
OpVersion version
NumInputs numInputs
int numOutputs
struct NumInputs

Public Functions

inline NumInputs()
inline NumInputs(int f)
inline NumInputs(int _min, int _max)

Public Members

int min
int max
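A minimal sketch of constructing an identifier for a custom op; the domain, type and version shown are illustrative:

popart::OperatorIdentifier myOpId("custom.ops", "MyOp", 1);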
#include <popart/tensorlocation.hpp>
using popart::VGraphIdAndTileSet = std::pair<VGraphId, TileSet>
#include <popart/basicoptionals.hpp>
using popart::OptionalTensorLocation = BasicOptional<TensorLocation, 9>
using popart::OptionalVGraphId = BasicOptional<VGraphId, 2>
using popart::OptionalPipelineStage = BasicOptional<PipelineStage, 3>
using popart::OptionalExecutionPhase = BasicOptional<ExecutionPhase, 5>
using popart::OptionalBatchSerializedPhase = BasicOptional<BatchSerializedPhase, 7>
using popart::OptionalStochasticRoundingMethod = BasicOptional<StochasticRoundingMethod, 10>
using popart::OptionalDataType = BasicOptional<DataType, 0>
#include <popart/opmanager.hpp>
class OpDefinition

Public Types

using DataTypes = std::vector<DataType>
using Inputs = std::vector<Input>
using Outputs = std::vector<Output>
using Attributes = std::map<std::string, Attribute>

Public Functions

inline OpDefinition()
inline OpDefinition(Inputs i, Outputs o, Attributes a)

Public Members

Inputs inputs
Outputs outputs
Attributes attributes
struct Attribute

Public Functions

inline Attribute(std::string regex)

Public Members

std::string supportedValuesRegex
struct Input

Public Functions

inline Input(std::string n, std::vector<DataType> t, bool _constant = false)

Public Members

std::string name
std::vector<DataType> supportedTensors
bool constant
struct Output

Public Functions

inline Output(std::string n, std::vector<DataType> t)

Public Members

std::string name
std::vector<DataType> supportedTensors
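A minimal sketch of an OpDefinition for a unary float op, in the style used when registering custom ops; all names are illustrative:

static popart::OpDefinition::DataTypes T = {popart::DataType::FLOAT16,
                                            popart::DataType::FLOAT};

static popart::OpDefinition myOpDef(
    {popart::OpDefinition::Inputs({{"input", T}}),
     popart::OpDefinition::Outputs({{"output", T}}),
     popart::OpDefinition::Attributes({})});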
class OpCreatorInfo

Public Functions

inline OpCreatorInfo(const OperatorIdentifier &_opid, const Op::Settings &_settings, const Attributes &_attributes, const std::vector<TensorId> &_inputIds, const std::vector<TensorId> &_outputIds)
inline bool hasInputIds() const
inline bool hasOutputIds() const
const std::vector<TensorId> &getInputIds() const
const std::vector<TensorId> &getOutputIds() const
Tensor *getInputTensor(int index) const
TensorData *getInputTensorData(int index) const
TensorInfo &getInputTensorInfo(int index) const
bool hasInputTensor(int index) const
std::string debugName() const
template<typename T>
inline std::vector<T> getInputData(int index, const std::set<DataType> &acceptedTypes) const
template<typename T>
inline std::vector<T> getInputData(int index) const
template<typename T>
inline T getInputScalarValue(int index) const
template<typename T>
inline T getInputScalarValue(int index, T defaultValue) const

Public Members

const OperatorIdentifier &opid
const Op::Settings &settings
const Attributes &attributes
class OpManager

Public Types

using OpFactoryFunc = std::function<std::unique_ptr<Op>(const OpCreatorInfo&)>
using ComplexOpFactoryFunc = std::function<Op*(const OpCreatorInfo&, Graph &graph)>

Public Functions

OpManager() = default

Public Static Functions

static void registerOp(const OpInfo &opInfo)
static Attributes getAttributesFromAnyMap(std::map<std::string, popart::any> attributes)
static std::unique_ptr<Op> createOp(const OpDomain &domain, const OpType &type, const int opsetVersion, Graph &graph, const std::string &name = "", const Scope &scope = {}, const Attributes &_attr = {}, const std::vector<TensorId> &inputIds = {}, const std::vector<TensorId> &outputIds = {})
static std::unique_ptr<Op> createOp(const OperatorIdentifier &opid, Graph &graph, const std::string &name = "", const Attributes &_attr = {})
static std::unique_ptr<Op> createOpWithInputs(const OperatorIdentifier &opid, Graph &graph, const std::string &name, const Attributes &_attr, const std::vector<TensorId> &inIds)
static Op *createOpInGraph(const Node &node, Graph &graph)
static const std::vector<OperatorIdentifier> getSupportedOperations(bool includePrivate)
static const std::vector<OperatorIdentifier> getUnsupportedOperations(int opsetVersion)
static const OpDefinitions getSupportedOperationsDefinition(bool includePrivate)
static OpVersion getOpVersionFromOpSet(const OpDomain &opDomain, const OpType &type, const int opsetVersion)
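A minimal sketch of creating an op through the factory, assuming graph is a valid popart::Graph and a Relu op is registered for the given domain and opset version:

std::unique_ptr<popart::Op> relu =
    popart::OpManager::createOp("ai.onnx", "Relu", 9, graph, "myRelu");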
class OpInfo

Public Functions

inline OpInfo(const OperatorIdentifier &_id, bool _isPublic, const OpDefinition &_details, OpFactoryFunc _f1)
inline OpInfo(const OperatorIdentifier &_id, bool _isPublic, const OpDefinition &_details, ComplexOpFactoryFunc _f2)
OpFactoryFunc &getSimpleFactory()
ComplexOpFactoryFunc &getComplexFactory()
bool hasComplexFactory()

Public Members

bool isPublic
const OperatorIdentifier id
OpDefinition details
enum class popart::RecomputeType

Define the type of recomputation.

Values:

enumerator Undefined = 0

Default value if RecomputeType has not been set.

enumerator Checkpoint

Do not recompute. Outputs from the op are kept from the forward pass.

enumerator Recompute

Recompute operation.

enumerator Recomputed

For explicit recomputation, this marks a cloned operation that had RecomputeType::Recompute set.

After cloning, the original op is changed to RecomputeType::Checkpoint, and the cloned op is changed to Recomputed.
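For example, an op can be pinned as a checkpoint through its settings. A minimal sketch, assuming op points to a valid popart::Op:

op->settings.recomputeType = popart::RecomputeType::Checkpoint;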

enum class popart::ExecutionContext

Define the type of the execution context.

Values:

enumerator Normal = 0

Run the forward and backward passes (Default).

enumerator AccumulateOuterFragment

Used to run the AccumulateOps after the gradient accumulation loop completes.

enumerator WeightsFromHostFragment

Used to transfer weights from host to device.

enumerator WeightsToHostFragment

Used to download weights from the device to the host.

enumerator OptimizerFromHostFragment

Used to stream the optimizer state from the host.

enumerator Subgraph

Program fragment used for subgraph-specific operations.

enum class popart::GradOpInType

Define the relationship between the input tensors of a gradient operation and the corresponding non-gradient operation.

Values:

enumerator In = 0

Indicates that the input tensor to the gradient operation is an input tensor of the non-gradient operation (Default).

enumerator Out

Indicates that the input tensor to the gradient operation is an output tensor of the non-gradient operation.

enumerator GradOut

Indicates that the input tensor to the gradient operation is an output gradient tensor of the non-gradient operation.

#include <popart/op/varupdate.hpp>
class VarUpdateOp : public popart::Op

Base class used to define PopART ops that update variable tensors.

Subclassed by popart::AccumulatorScaleOp, popart::VarUpdateWithUpdaterOp

Public Functions

VarUpdateOp(const OperatorIdentifier&, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const override = 0

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual view::Regions aliases(InIndex in, OutIndex) const override

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters
  • InIndex – The input index.

  • OutIndex – The output index.

Returns

The regions which the output will alias.

virtual view::Regions modifies(InIndex) const override

Return the input region which this op modifies (for inplace ops).

Parameters

InIndex – The input index.

Returns

The regions which this op modifies.

virtual std::map<InIndex, TensorId> optimizerInputs() const = 0
inline virtual bool isOptimizerOp() const override

Check if op is part of the optimizer.

virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

Return which inputs and outputs are replicated tensor sharding pairs.

virtual void growAliasModel(AliasModel&) const override

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

Pre

All input tensors of this op have mappings in aliasModel before the call to growAliasModel.

Post

All output tensors of this op have mappings in aliasModel after the call to growAliasModel.

Public Static Functions

static inline InIndex getVarToUpdateInIndex()
static inline OutIndex getUpdatedVarOutIndex()
class AccumulatorScaleOp : public popart::VarUpdateOp

Inplace multiplies a tensor by an OptimizerValue factor.

As with other ops that consume OptimizerValues, this op will only have an input tensor for the value if the OptimizerValue is not const.

It will directly zero the input tensor if the factor is const and 0.

Subclassed by popart::AccumulatorZeroOp

Public Functions

AccumulatorScaleOp(const OptimizerValue factor_, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::map<InIndex, TensorId> optimizerInputs() const override
virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline const OptimizerValue &getFactor() const
inline virtual float getSubgraphValue() const override

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are returned by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.

virtual view::Regions modifies(InIndex) const override

Return the input region which this op modifies (for inplace ops).

Parameters

InIndex – The input index.

Returns

The regions which this op modifies.

Public Static Functions

static inline InIndex getFactorInIndex()
class AccumulatorZeroOp : public popart::AccumulatorScaleOp

An AccumulatorScaleOp with a factor of 0, so zeroes the input tensor.

Public Functions

inline AccumulatorZeroOp(const Op::Settings &settings)
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

class VarUpdateWithUpdaterOp : public popart::VarUpdateOp

Subclassed by popart::AccumulateBaseOp, popart::AdamComboOp, popart::AdamVarUpdateOp, popart::AdaptiveComboOp, popart::CopyVarUpdateOp, popart::ScaledVarUpdateOp, popart::SGD0ComboOp, popart::SGD0VarUpdateOpBase, popart::SGD1AcclUpdateOp, popart::SGD1VarUpdateOp, popart::SGDMComboBaseOp

Public Functions

VarUpdateWithUpdaterOp(const OperatorIdentifier &opid, const Op::Settings &settings_)
virtual std::unique_ptr<Op> clone() const override = 0
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

Public Static Functions

static inline InIndex getUpdaterInIndex()
class AccumulateBaseOp : public popart::VarUpdateWithUpdaterOp

Subclassed by popart::AccumulateOp, popart::RescaleAccumulateOp, popart::SparseAccumulateOp

Public Functions

AccumulateBaseOp(const OperatorIdentifier &opid, AccumulationType type_, OptimizerValue factor_, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const override = 0
std::map<InIndex, TensorId> optimizerInputs() const override
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const override
inline AccumulationType getAccumulationType() const
inline const OptimizerValue &getFactor() const

Public Static Functions

static inline constexpr InIndex getFactorInIndex()
class AccumulateOp : public popart::AccumulateBaseOp

Public Functions

AccumulateOp(AccumulationType type, OptimizerValue factor, const Op::Settings&)
std::unique_ptr<Op> clone() const override
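A minimal sketch of constructing an AccumulateOp that adds the updater into the accumulator, assuming settings is a valid Op::Settings and that AccumulationType::Add is available:

popart::AccumulateOp accumulate(popart::AccumulationType::Add,
                                popart::OptimizerValue(1.0f),
                                settings);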
class RescaleAccumulateOp : public popart::AccumulateBaseOp

The same as AccumulateOp however it also includes a rescale factor that allows for the accumulator to be rescaled at the same time.

Public Functions

RescaleAccumulateOp(AccumulationType type_, OptimizerValue factor_, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::map<InIndex, TensorId> optimizerInputs() const final

Public Static Functions

static inline InIndex getRescaleRatioInIndex()
class SparseAccumulateOp : public popart::AccumulateBaseOp

Say you have: w -> Gather -> x.

In the backward pass you have:

dW <- GatherGrad <- x

and when the optimiser step is grown:

dW <- GatherGrad <- x
 \
  Accumulate -> accum’
 /
accum

GatherGrad is essentially a scatter. Then we Accumulate the resultant dW on accum. This involves creating an extra dW tensor, so instead we can do:

          x
          |
          V
accum -> SparseAccumulate -> accum’

Here, SparseAccumulate can, in one operation and without extra space, accumulate the slices of x into accum as required.

The input tensor at getOriginalVarToUpdateInIndex() is an optional input. This can be used when two different views of the weight are consumed in the forward pass (by ops that will be autodiffed), and one of those ops is a Gather, thus requiring a SparseAccumulate in the weight update step.

We connect getOriginalVarToUpdateInIndex() to the view of the weight other than the one this SparseAccumulate is for. Then, SparseAccumulateOpx will clone that tensor (and its layout) when creating accum.

You probably do not need this outside of the TiedGatherPattern.

See also

SparseAccumulateOpx::createInputTensor for further motivation of why it does this.

Public Functions

SparseAccumulateOp(AccumulationType type, const OptimizerValue &factor, unsigned axis, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual std::set<InIndex> optionalInputs() const override

Return the input indices of all optional inputs to the op.

unsigned getAxis() const

Public Static Functions

static inline constexpr InIndex getIndicesInIndex()
static inline constexpr InIndex getOriginalVarToUpdateInIndex()
static bool supportsAccumulationType(AccumulationType type)
class AdamComboOp : public popart::VarUpdateWithUpdaterOp

Public Functions

AdamComboOp(OptimizerValue initialLr, OptimizerValue initialWd, OptimizerValue initialB1, OptimizerValue initialB2, OptimizerValue initialEps, OptimizerValue initialLs, OptimizerValue mwn, OptimizerValue initialGs, AdamMode mode_, WeightDecayMode decayMode_, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, bool scaledOptimizerState_, const Op::Settings&)
std::unique_ptr<Op> clone() const final
std::map<InIndex, TensorId> optimizerInputs() const final
void appendOutlineAttributes(OpSerialiserBase&) const final
std::set<InIndex> optionalInputs() const final
inline float getSubgraphValue() const final

Public Members

const OptimizerValue initLr
const OptimizerValue initWd
const OptimizerValue initB1
const OptimizerValue initB2
const OptimizerValue initEps
const OptimizerValue initLs
const OptimizerValue initMwn
const OptimizerValue initGs
const AdamMode mode
const WeightDecayMode decayMode
const bool withGradAccum
const OptimizerReductionType reductionType
DataType accumType
DataType accl1Type
DataType accl2Type
const bool scaledOptimizerState

Public Static Functions

static inline InIndex getLrInIndex()
static inline InIndex getWdInIndex()
static inline InIndex getBeta1InIndex()
static inline InIndex getBeta2InIndex()
static inline InIndex getEpsInIndex()
static inline InIndex getLsInIndex()
static inline InIndex getMwnInIndex()
static inline InIndex getGsInIndex()
class AdamVarUpdateOp : public popart::VarUpdateWithUpdaterOp

Public Functions

AdamVarUpdateOp(OptimizerValue initLr, OptimizerValue mwn, const Op::Settings&)
std::unique_ptr<Op> clone() const final
std::map<InIndex, TensorId> optimizerInputs() const final
void appendOutlineAttributes(OpSerialiserBase&) const final
inline float getSubgraphValue() const final

Public Members

const OptimizerValue initLr
const OptimizerValue initMwn

Public Static Functions

static inline InIndex getLambR1SqInIndex()
static inline InIndex getLambR2SqInIndex()
static inline InIndex getLrInIndex()
static inline InIndex getMwnInIndex()
class AdaptiveComboOp : public popart::VarUpdateWithUpdaterOp

Public Functions

AdaptiveComboOp(OptimizerValue initialLr, OptimizerValue initialWd, OptimizerValue initialA, OptimizerValue initialM, OptimizerValue initialEps, OptimizerValue initialLs, OptimizerValue initialGs, AdaptiveMode mode_, WeightDecayMode decayMode_, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, DataType accl3Type_, bool rmspropTFVariant_, const Op::Settings&)
std::unique_ptr<Op> clone() const final
std::map<InIndex, TensorId> optimizerInputs() const final
void appendOutlineAttributes(OpSerialiserBase&) const final
std::set<InIndex> optionalInputs() const final
inline float getSubgraphValue() const final

Public Members

const OptimizerValue initLr
const OptimizerValue initWd
const OptimizerValue initA
const OptimizerValue initM
const OptimizerValue initEps
const OptimizerValue initLs
const OptimizerValue initGs
const AdaptiveMode mode
const WeightDecayMode decayMode
const bool withGradAccum
const OptimizerReductionType reductionType
DataType accumType
DataType accl1Type
DataType accl2Type
DataType accl3Type
const bool rmspropTFVariant

Public Static Functions

static inline InIndex getLrInIndex()
static inline InIndex getWdInIndex()
static inline InIndex getAlphaInIndex()
static inline InIndex getMomentumInIndex()
static inline InIndex getEpsInIndex()
static inline InIndex getLsInIndex()
static inline InIndex getGsInIndex()
class CopyVarUpdateOp : public popart::VarUpdateWithUpdaterOp

Public Functions

CopyVarUpdateOp(const Op::Settings&)
CopyVarUpdateOp(const OperatorIdentifier&, const Op::Settings&)
std::unique_ptr<Op> clone() const final
inline std::map<InIndex, TensorId> optimizerInputs() const final
inline float getSubgraphValue() const final
view::Regions modifies(InIndex) const override
class SGD0ComboOp : public popart::VarUpdateWithUpdaterOp

A single Op that encapsulates all the information needed to describe an SGD0 optimiser step.

The “0” in the name signifies that there is no optimizer state (note that a gradient accumulation tensor may still be required).

The “Combo” in the name signifies that this op will later be decomposed into many ops and tensors that actually implement the optimiser step; in this case, by the SGD0Decompose pattern.

See also

SGD for the definition of what SGD0 is.

See also

SGD0Decompose for the definition of this decomposition.

Public Functions

SGD0ComboOp(OptimizerValue initialSwd, OptimizerValue initialSlr, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::set<InIndex> optionalInputs() const override

Return the input indices of all optional inputs to the op.

virtual std::map<InIndex, TensorId> optimizerInputs() const override
virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual float getSubgraphValue() const override

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are returned by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.

Public Members

OptimizerValue initSlr0
OptimizerValue initWdsf0
const bool withGradAccum
const OptimizerReductionType reductionType
const DataType accumType

Public Static Functions

static inline InIndex getSlr0InIndex()
static inline InIndex getWdsf0InIndex()
class SGD0VarUpdateOpBase : public popart::VarUpdateWithUpdaterOp

Subclassed by popart::SGD0VarUpdateOp

Public Functions

SGD0VarUpdateOpBase(const OperatorIdentifier &_opid, OptimizerValue initialSlr0, OptimizerValue initialWdsf0, const Op::Settings &settings_)
virtual std::unique_ptr<Op> clone() const override = 0
std::map<InIndex, TensorId> optimizerInputs() const final
void appendOutlineAttributes(OpSerialiserBase&) const final
std::set<InIndex> optionalInputs() const final

Public Members

const OptimizerValue initSlr0
const OptimizerValue initWdsf0

Public Static Functions

static inline InIndex getSlr0InIndex()
static inline InIndex getWdsf0InIndex()
class SGD0VarUpdateOp : public popart::SGD0VarUpdateOpBase

Public Functions

SGD0VarUpdateOp(OptimizerValue initialSlr0, OptimizerValue initialWdsf0, const Op::Settings&)
std::unique_ptr<Op> clone() const final
float getSubgraphValue() const final
class SGD1AcclUpdateOp : public popart::VarUpdateWithUpdaterOp

Performs the part of the SGD1 velocity update equation that is pre-computed for the next time step after the weight update of the current time step.

Let:

v be the input at getVarToUpdateInIndex()
g be the input at getUpdaterInIndex()

Then this op performs:

v <- v * smm1 + swd1 * g

See also

SGD for how this is derived and the definitions of smm1 and swd1.

Subclassed by popart::SGD2PartialAcclUpdateOp

Public Functions

SGD1AcclUpdateOp(OptimizerValue initSmm1, OptimizerValue initSwd1, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::map<InIndex, TensorId> optimizerInputs() const override
virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are returned by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.

Public Members

const OptimizerValue initSmm1
const OptimizerValue initSwd1

Public Static Functions

static inline InIndex getSmm1InIndex()
static inline InIndex getSwd1InIndex()
class SGD2PartialAcclUpdateOp : public popart::SGD1AcclUpdateOp

This Op is by design exactly equivalent to an SGD1AcclUpdateOp.

Any logic based on an SGD1AcclUpdateOp, like transform code or lowering into Opx, can be applied to an SGD2PartialAcclUpdateOp. This includes the OperatorIdentifier being Onnx::CustomOperators::SGD1AcclUpdateOp.

For SGD2, the entire v update equation could be done in one op (see the equation derivation in optimizer.hpp); however, we reuse the SGD1AcclUpdateOp and AccumulateOp to implement the equation in two steps.

Public Functions

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

SGD1AcclUpdateOp(OptimizerValue initSmm1, OptimizerValue initSwd1, const Op::Settings&)
SGD1AcclUpdateOp(OptimizerValue initSmm1, OptimizerValue initSwd1, OperatorIdentifier opid, const Op::Settings&)
class SGD1VarUpdateOp : public popart::VarUpdateWithUpdaterOp

Performs the SGD1 weight update equation.

Let:

w be the input at getVarToUpdateInIndex()
g be the input at getUpdaterInIndex()

Then this op performs:

w <- w - slr1 * g

See also

SGD for how this is derived and the definition of slr1.

Subclassed by popart::SGD2VarUpdateOp

Public Functions

SGD1VarUpdateOp(OptimizerValue initSlr1, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::map<InIndex, TensorId> optimizerInputs() const final
virtual void appendOutlineAttributes(OpSerialiserBase&) const final

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are returned by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.

Public Members

const OptimizerValue initSlr1

Public Static Functions

static inline InIndex getSlr1InIndex()
class SGD2VarUpdateOp : public popart::SGD1VarUpdateOp

This Op is by design exactly equivalent to an SGD1VarUpdateOp.

Any logic based on an SGD1VarUpdateOp, like transform code or lowering into Opx, can be applied to an SGD2VarUpdateOp. This includes the OperatorIdentifier being Onnx::CustomOperators::SGD1VarUpdate.

Public Functions

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

SGD1VarUpdateOp(OptimizerValue initSlr1, const Op::Settings&)
class SGDMComboBaseOp : public popart::VarUpdateWithUpdaterOp

Subclassed by popart::SGD1ComboOp, popart::SGD2ComboOp

Public Functions

SGDMComboBaseOp(const OperatorIdentifier &opid, OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerReductionType reductionType_, const Op::Settings&)
SGDMComboBaseOp(const OperatorIdentifier &opid, OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerValue initialMm, OptimizerValue initialWd, OptimizerValue initialNgsf, OptimizerValue initialNdsf, OptimizerReductionType reductionType_, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const override = 0
std::map<InIndex, TensorId> optimizerInputs() const override
void appendOutlineAttributes(OpSerialiserBase&) const override
std::set<InIndex> optionalInputs() const override
inline float getSubgraphValue() const override

Public Members

const OptimizerValue initSmm1
const OptimizerValue initDpsf1
const OptimizerValue initSwd1
const OptimizerValue initSlr1
OptimizerValue initMm
OptimizerValue initWd
OptimizerValue initNgsf
OptimizerValue initNdsf
const OptimizerReductionType reductionType
bool nesterov

Public Static Functions

static inline InIndex getSmm1InIndex()
static inline InIndex getDpsf1InIndex()
static inline InIndex getSwd1InIndex()
static inline InIndex getSlr1InIndex()
static inline InIndex getMmInIndex()
static inline InIndex getWdInIndex()
static inline InIndex getNgsfInIndex()
static inline InIndex getNdsfInIndex()
class SGD1ComboOp : public popart::SGDMComboBaseOp

A single Op that encapsulates all the information needed to describe an SGD1 optimiser step.

The “1” in the name signifies that only one extra optimiser tensor (the accl tensor) is required.

The “Combo” in the name signifies that this op will later be decomposed into many ops and tensors that actually implement the optimiser step; in this case, by the SGD1Decompose pattern.

See also

SGD for the definition of what SGD1 is.

See also

SGD1Decompose for the definition of this decomposition.

Public Functions

SGD1ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerReductionType reductionType_, const Op::Settings&)
SGD1ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerValue initialMm, OptimizerValue initialWd, OptimizerValue initialNgsf1, OptimizerValue initialNdsf1, OptimizerReductionType reductionType_, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

class SGD2ComboOp : public popart::SGDMComboBaseOp

A single Op that encapsulates all the information needed to describe an SGD2 optimiser step.

The “2” in the name signifies that two extra optimiser tensors (the accum and accl1 tensors) may be required.

The “Combo” in the name signifies that this op will later be decomposed into many ops and tensors that actually implement the optimiser step; in this case, by the SGD2Decompose pattern.

See also

SGD for the definition of what SGD2 is.

See also

SGD2Decompose for the definition of this decomposition.

Public Functions

SGD2ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, const Op::Settings&)
SGD2ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerValue initialMm, OptimizerValue initialWd, OptimizerValue initialNgsf2, OptimizerValue initialNdsf2, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

Public Members

const bool withGradAccum
const DataType accumType
const DataType accl1Type
class ScaledVarUpdateOp : public popart::VarUpdateWithUpdaterOp

Public Functions

ScaledVarUpdateOp(OptimizerValue initLr, OptimizerValue initWd, bool lrInUpdater, const Op::Settings&)
std::unique_ptr<Op> clone() const final
std::map<InIndex, TensorId> optimizerInputs() const final
void appendOutlineAttributes(OpSerialiserBase&) const final
inline float getSubgraphValue() const final

Public Members

const OptimizerValue initLr
const OptimizerValue initWd
const bool lrInUpdater

Public Static Functions

static inline InIndex getLrInIndex()
static inline InIndex getWdInIndex()
#include <popart/alias/aliasmodel.hpp>
class AliasModel

A container for the poprithms::memory::inplace::Graph which corresponds to a PopART Graph.

It contains the poprithms Graph, and mappings between PopART Tensors and Ops, and their poprithms equivalents.

Public Types

using PoprithmsTensorId = poprithms::memory::inplace::TensorId
using PoprithmsOpId = poprithms::memory::inplace::OpId

Public Functions

AliasModel()
~AliasModel() = default
void setGraph(const popart::Graph *graph)

Set PopART graph.

void insertTensor(const PoprithmsTensorId &poprithmsTensor, const Tensor &popartTensor)

Register that a poprithms Tensor and a popart Tensor correspond to each other.

In addition to registering the Tensor correspondence, the Ops which produce the respective Tensors are registered to be corresponding.

Parameters
  • poprithmsTensor – The Tensor in the poprithms Graph.

  • popartTensor – The Tensor in the PopART Graph.

void insertOp(PoprithmsOpId, OpId)

Register that a poprithms Op and a popart Op correspond.

Note that multiple poprithms Ops can correspond to a single popart Op.

void insertUnaryModifier0(const Op &op)

This method performs the following steps:

(1) inserts an aliasGate which is open at index 0,

(2) appends a modify to the output of the aliasGate created in (1),

(3) registers that op.output(0) matches the output of (2),

(4) registers that the poprithms ops created at (1) and (2) correspond to #op.

Parameters

op – A PopART Op, which might have multiple inputs, and whose output is a modifying alias of its input at index 0.
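A minimal sketch of how an op whose output modifies and aliases input 0 might implement growAliasModel; MyInplaceOp is hypothetical:

void MyInplaceOp::growAliasModel(popart::AliasModel &m) const {
  // Output 0 is a modifying alias of input 0.
  m.insertUnaryModifier0(*this);
}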

void insertUnaryModifier(const Op&, InIndex)

As per insertUnaryModifier0, but the input index may be different from 0.

void insertBinaryModifier(const Op &op)

This method performs the following steps:

(1) inserts an aliasGate whose inputs are the 2 poprithms Tensors corresponding to the 2 inputs of #op. The alias gate is open at the index which #op aliases through, if any.

(2) appends a modify to the output of the aliasGate created at (1)

(3) registers that the poprithms ops (1) and (2) correspond to #op.

Diagrammatically, for the PopART Op:

input0 … input1
     \    /
       op
       |
    output0

This method creates the following poprithms subgraph:

input0 … input1
     \    /
    aliasGate
       |
     modify
       |
    output0

Parameters

op – A PopART Op with 2 inputs.

void insertNG2aryModifier(const Op &op, unsigned int numInputs)

This method is the same as insertBinaryModifier, except that it allows more than 2 inputs.

Parameters
  • op – A PopART Op with 2 or more inputs.

  • numInputs – The number of inputs.

void insertViewChange(PoprithmsTensorId viewChangeOut, const Tensor &t, bool isOutplace)

This method performs the following steps:

(1) adds an aliasGate whose (unique) input is viewChangeOut,

(2) registers that the output of the aliasGate corresponds to the PopART Tensor #t.

(3) registers that the creator of t (if there is any) corresponds to 2 poprithms ops: the creator of viewChangeOut and the aliasGate created at (1).

Parameters
  • viewChangeOut – This is a Tensor which is the output of a view changing Op, such as reshape and dimShuffle.

  • t – This PopART Tensor is the output of the corresponding PopART view changing Op.

  • isOutplace – This boolean determines if the AliasGate created at (1) should be open or closed. If isOutplace is true, then the AliasGate will be closed.

void update(OpId oldId, OpId newId)

Replace all appearances of #oldId in all maps between PopART and poprithms, with #newId.

This is useful when, for example, an Op is replaced in the PopART Graph during the inplacing transformation.

TensorId getTensorId(const PoprithmsTensorId &id) const
Returns

The TensorId corresponding to a poprithms TensorId.

bool contains(const PoprithmsTensorId&) const
PoprithmsTensorId getPoprithmsTensorId(const TensorId &id) const
Returns

The poprithms TensorId corresponding to a TensorId.

bool contains(const TensorId&) const
OpId getOpId(PoprithmsOpId) const
Returns

The OpId corresponding to a poprithms OpId.

bool contains(PoprithmsOpId) const
PoprithmsOpId getGate(OpId opId) const
Returns

The ID of the AliasGate in the poprithms Graph, which corresponds to the PopART Op #opId. If no such AliasGate exists, an error is thrown.

std::vector<PoprithmsOpId> getAll(OpId) const
Returns

The poprithms OpIds which correspond to a PopART OpId. It is possible for 1 PopART Op to correspond to multiple poprithms Ops.

bool contains(OpId) const
std::vector<Tensor*> allAliases(const Tensor &t) const

Get all aliases for a tensor for this given model.

Returned tensors include the argument #t, if it is non-empty.

bool contains(const Tensor &super, const Tensor &sub) const
Returns

true if all of the ‘allocation’ elements of sub are also in super, false otherwise.

Public Members

poprithms::memory::inplace::Graph g

The poprithms Graph.

popart::Graph *thisGraph = nullptr

The PopART graph reference.

Public Static Attributes

static constexpr int loadFactor = 0.5

The load factor used for hash map containers.

#include <popart/op/ipucopy.hpp>
class IpuCopyOp : public popart::Op

Public Functions

IpuCopyOp(const OperatorIdentifier &_opid, VGraphId _destIpu, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final
inline VGraphId getDestIpu() const
const SourceIpuMap &getSourceIpus() const
const SourceTensorMap &getSourceTensors() const
VGraphId getSourceIpu(const TensorId &tenId) const
VGraphId getSourceIpu() const
VGraphId getMinSourceIpu() const
VGraphId getMaxSourceIpu() const
void setSourceIpus(const SourceIpuMap sourceIpus)
void setSourceTensors(const SourceTensorMap sourceTensors)
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
bool isOutlineable() const override
bool isIpuCopyOp() const final
bool copiesOptimizerTensors() const final
void connectInTensor(InIndex, TensorId, VGraphId sourceIpu) override
std::string getFromToStr() const
void disconnectInTensor(InIndex, Tensor*) override
inline bool canShard() const override
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex index, std::set<OpId> &visited) const override
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex index, std::set<OpId> &visited) const override
using popart::SourceIpuMap = std::map<TensorId, VGraphId>
using popart::SourceTensorMap = std::map<VGraphId, std::vector<TensorId>>

14.8.2. Op definition for Poplar implementation

#include <popart/popx/opx.hpp>
class Opx

Subclassed by popart::popx::AbortOpx, popart::popx::AdaDeltaUpdaterOpx, popart::popx::AdamUpdaterOpx, popart::popx::AddBiasDataGradOpx, popart::popx::AddBiasOpx, popart::popx::AllReduceOpx, popart::popx::ArgExtremaOpx, popart::popx::AsinGradOpx, popart::popx::AtanGradOpx, popart::popx::BaseConcatOpx, popart::popx::BaseExpandOpx, popart::popx::BasePadOpx, popart::popx::BaseSliceOpx, popart::popx::BaseSortOpx, popart::popx::BaseWhereOpx, popart::popx::BinaryComparisonOpx, popart::popx::Bucketizex, popart::popx::CastOpx, popart::popx::CastThenPow2ScaleOpx, popart::popx::ClipGradOpx, popart::popx::CollectivesBaseOpx, popart::popx::ConcatGradOpx, popart::popx::ConvFlipWeightsGradOpx, popart::popx::CtcBeamSearchDecoderOpx, popart::popx::CtcGradOpx, popart::popx::CtcOpx, popart::popx::CumSumGradOpx, popart::popx::CumSumOpx, popart::popx::DynamicSliceOpx, popart::popx::DynamicUpdateOpx, popart::popx::DynamicZeroOpx, popart::popx::ElementWiseBinaryOpx, popart::popx::ElementWiseUnaryOpx, popart::popx::EluGradOpx, popart::popx::ExchangeBaseOpx, popart::popx::ExpandGradOpx, popart::popx::GatherBaseOpx, popart::popx::GatherGradOpx, popart::popx::GeluErfGradOpx, popart::popx::GeluGradOpx, popart::popx::GetRandomSeedOpx, popart::popx::GRUGradOpx, popart::popx::GRUOpx, popart::popx::HardSigmoidGradOpx, popart::popx::HistogramOpx, popart::popx::IdentityInplaceOpx, popart::popx::IdentityLossGradOpx, popart::popx::IdentityLossOpx, popart::popx::IfOpx, popart::popx::InitOpx, popart::popx::IoTileCopyOpx, popart::popx::IpuCopyOpx, popart::popx::L1GradOpx, popart::popx::L1Opx, popart::popx::LambSquareOpx, popart::popx::LeakyReluGradOpx, popart::popx::LossScaleUpdateOpx, popart::popx::LRNGradOpx, popart::popx::LRNOpx, popart::popx::LSTMGradOpx, popart::popx::LSTMOpx, popart::popx::MatMulOpx, popart::popx::MaxArgGradOpx, popart::popx::MaxOpx, popart::popx::MeanArgGradOpx, popart::popx::MinArgGradOpx, popart::popx::MinOpx, popart::popx::ModifyRandomSeedOpx, popart::popx::MultiConvBaseOpx, popart::popx::MultiConvWeightsGradBaseOpx, popart::popx::NllGradOpx, popart::popx::NlllWithSoftmaxGradDirectOpx, popart::popx::NllOpx, popart::popx::NopOpx, popart::popx::NormalizeImageOpx, popart::popx::NormOpx, popart::popx::OnehotGradOpx, popart::popx::OnehotOpx, popart::popx::PopartLSTMOpxBase< LSTMOP >, popart::popx::Pow2ScaleThenCastOpx, popart::popx::PrintTensorOpx, popart::popx::RandomNormalOpx, popart::popx::RandomUniformOpx, popart::popx::ReduceL1GradOpx, popart::popx::ReduceL1Opx, popart::popx::ReduceL2GradOpx, popart::popx::ReduceL2Opx, popart::popx::ReduceLogSumExpGradOpx, popart::popx::ReduceLogSumExpOpx, popart::popx::ReduceLogSumGradOpx, popart::popx::ReduceLogSumOpx, popart::popx::ReduceMaxGradOpx, popart::popx::ReduceMaxOpx, popart::popx::ReduceMeanGradOpx, popart::popx::ReduceMeanOpx, popart::popx::ReduceMedianGradOpx, popart::popx::ReduceMedianOpx, popart::popx::ReduceMinGradOpx, popart::popx::ReduceMinOpx, popart::popx::ReduceProdGradOpx, popart::popx::ReduceProdOpx, popart::popx::ReduceSumGradOpx, popart::popx::ReduceSumOpx, popart::popx::ReduceSumSquareGradOpx, popart::popx::ReduceSumSquareOpx, popart::popx::ReluGradOpx, popart::popx::ReshapeBaseOpx, popart::popx::ResizeGradOpx, popart::popx::ResizeOpx, popart::popx::RestoreBaseOpx< Derived >, popart::popx::ReverseBaseOpx, popart::popx::RMSPropUpdaterOpx, popart::popx::RNNGradOpx, popart::popx::RNNOpx, popart::popx::RoiAlignGradOpx, popart::popx::RoiAlignOpx, popart::popx::ScaledAddOpx, popart::popx::ScatterDataGradOpx, popart::popx::ScatterReduceGradOpx, 
popart::popx::ScatterReduceOpx, popart::popx::ScatterUpdateGradOpx, popart::popx::SeluGradOpx, popart::popx::SequenceSliceInplaceOpx, popart::popx::SequenceSliceOpx, popart::popx::SGD1NesterovOpx, popart::popx::ShapedDropoutOpx, popart::popx::ShrinkGradOpx, popart::popx::SinhGradOpx, popart::popx::SoftmaxGradDirectOpx, popart::popx::SoftPlusGradOpx, popart::popx::SoftSignGradOpx, popart::popx::SplineBasisx, popart::popx::SplineWeightingx, popart::popx::SplitOpx, popart::popx::StashOpx, popart::popx::SubgraphOpx, popart::popx::SubsampleGradOpx, popart::popx::SubsampleInplaceOpx, popart::popx::SubsampleOpx, popart::popx::SumArgGradOpx, popart::popx::SumOpx, popart::popx::SwishGradOpx, popart::popx::SyncOpx, popart::popx::TanhGradOpx, popart::popx::TanhOpx, popart::popx::TensorRemapOpx, popart::popx::ThresholdedReluGradOpx, popart::popx::TileGradOpx, popart::popx::TileOpx, popart::popx::TopKGradOpx, popart::popx::TransposeInplaceOpx, popart::popx::TransposeOpx, popart::popx::VarUpdateOpx, popart::popx::WhereXGradOpx, popart::popx::WhereYGradOpx, popart::popx::ZerosOpx, popart::popx::PopartLSTMOpxBase< PopartLSTMGradOp >, popart::popx::PopartLSTMOpxBase< PopartLSTMOp >, popart::popx::RestoreBaseOpx< RestoreInplaceOpx >, popart::popx::RestoreBaseOpx< RestoreOpx >

Public Functions

Opx(Op*, Devicex*)
virtual ~Opx()
virtual poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const
virtual poplar::Tensor createInputTensor(popart::InIndex index, const poplar::DebugNameAndId &dnai) const
virtual InputCreatorType getInputCreatorType(InIndex index) const
virtual bool canUnwind(InIndex, OutIndex) const
virtual view::RegMap unwindRegion(InIndex, OutIndex) const
virtual poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const
virtual bool createsEquiv(int index0, const Opx *opx1, int index1) const
virtual bool outputCreatedExternally(OutIndex index) const
virtual std::set<TensorId> mustExistBeforeCreate(int index0) const
virtual DnfTensorIds mustExistBeforeCreateDNF(int index0) const
poplar::Tensor cloneNcopy(poplar::program::Sequence&, TensorId) const
poplar::Tensor cloneNcopy(poplar::program::Sequence&, const poplar::Tensor&, const std::string name = "") const
poplar::Tensor broadcast(const std::vector<int64_t>&, TensorId) const
poplar::Tensor broadcast(const std::vector<int64_t>&, poplar::Tensor) const
const Devicex *getDevicex() const
int64_t getVirtualGraphId() const
poplar::Graph &graph() const
poplar::Graph &topLevelGraph() const
virtual poplar::Graph &srcGraph(InIndex) const
virtual poplar::Graph &dstGraph(OutIndex) const
const poplar::Tensor &get(TensorId) const
const poplar::Tensor &getView(TensorId) const
void insert(TensorId, const poplar::Tensor&) const
Tensor *inTensor(InIndex) const
Tensor *outTensor(OutIndex) const
const poplar::Tensor &getInTensor(InIndex index) const
const poplar::Tensor &getOutTensor(OutIndex index) const
const poplar::Tensor &getInView(InIndex index) const
const poplar::Tensor &getOutView(OutIndex index) const
bool hasInViewChangers(InIndex index) const
const ViewChangers &getInViewChangers(InIndex index) const
void setOutViewChangers(OutIndex index, const ViewChangers &changers) const
const TensorInfo &inInfo(InIndex) const
const Shape &inShape(InIndex) const
const TensorInfo &outInfo(OutIndex) const
const Shape &outShape(OutIndex) const
template<class OP>
inline OP &getOp() const
template<class OP>
inline void verifyOp(Op *op, const OperatorIdentifier &opid)
template<class OP>
inline void verifyOp(Op *op, std::vector<OperatorIdentifier> opids)
template<class OP>
inline void verifyOp(Op *op)
bool hasInput(InIndex) const
bool hasOutput(OutIndex) const
void setOutTensor(OutIndex index, const poplar::Tensor &tensor) const
TensorId inId(InIndex index) const
TensorId outId(OutIndex index) const
poplar::Tensor getConst(const poplar::Type &type, const std::vector<size_t> &shape, double val, const std::string &name) const
poplar::Tensor getScalarVariable(const poplar::Type &type, const std::string &name) const
poplar::Tensor getZerosTensor(std::vector<std::size_t>, poplar::Type, std::string) const
poplar::Graph &inGraph(InIndex in) const

Return the virtual graph associated with input at index in.

Parameters

in – The input index.

Returns

The corresponding poplar virtual graph.

virtual std::set<OpxGrowPartId> getInGrowPartIds(Tensor *inTensor) const
virtual OpxGrowPartId getOutGrowPartId(Tensor *outTensor) const
virtual bool hasCreatorViewChangers(InIndex index) const
virtual ViewChangers getCreatorViewChangers(InIndex index) const
virtual void growPart(OpxGrowPartId id) const
virtual void grow(poplar::program::Sequence&) const
virtual void grow(std::vector<poplar::program::Sequence>&) const
const popart::DebugInfo &getDebugInfo() const
const poplar::DebugNameAndId getDebugNameAndId(const std::string name = "", poplar::SourceLocation loc = poplar::SourceLocation::Current()) const
poplar::DebugContext debugContext(const std::string name = "", poplar::SourceLocation loc = poplar::SourceLocation::Current()) const
virtual PreparedTensorInfos getOutputsToPrepare() const
virtual PreparedTensorInfos getInputsToPrepare() const
poplar::Graph &outGraph(OutIndex out) const

Return the virtual graph associated with output at index out.

Parameters

out – The output index.

Returns

The corresponding poplar virtual graph.

const std::vector<size_t> inShapeSzt(InIndex) const
poplar::Tensor mapMaybeInPlace(popops::expr::BinaryOpType, poplar::Tensor&, poplar::Tensor&, poplar::program::Sequence&, const poplar::DebugContext&, const poplar::OptionFlags&, const std::string&)

Public Members

double inputCreatorPriority = {0.0}
Op *op_p
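To add a Poplar implementation for an op, subclass Opx and override grow() to emit the Poplar program for the op. A minimal sketch, assuming a hypothetical MyScaleOp (the op class and the scaling behaviour are illustrative, not part of the PopART API):

#include <popart/popx/opx.hpp>
#include <popops/ElementWise.hpp>

class MyScaleOpx : public popart::popx::Opx {
public:
  MyScaleOpx(popart::Op *op, popart::popx::Devicex *devicex)
      : popart::popx::Opx(op, devicex) {
    // MyScaleOp is a hypothetical popart::Op subclass; verifyOp checks
    // that op is of the expected type.
    verifyOp<MyScaleOp>(op);
  }

  void grow(poplar::program::Sequence &prog) const final {
    // Multiply the op's single input by a constant 2.0 and register the
    // result as output 0.
    poplar::Tensor in  = getInTensor(0);
    poplar::Tensor two = getConst(in.elementType(), {}, 2.0, "two");
    poplar::Tensor out =
        popops::mul(graph(), in, two, prog, debugContext("scale"));
    setOutTensor(0, out);
  }
};

The subclass is then typically registered against the op’s OperatorIdentifier with a static popart::popx::OpxCreator<MyScaleOpx> instance.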
class RoiAlignGradOpx : public popart::popx::Opx

Public Functions

RoiAlignGradOpx(Op*, Devicex*)
~RoiAlignGradOpx() override = default
virtual void grow(poplar::program::Sequence&) const final
class RoiAlignOpx : public popart::popx::Opx

Public Functions

RoiAlignOpx(Op*, Devicex*)
~RoiAlignOpx() override = default
virtual void grow(poplar::program::Sequence&) const final

14.8.3. Available Ops (Op class)

struct AiGraphcoreOpIdV1 : public popart::OperatorIdentifier

Public Functions

inline AiGraphcoreOpIdV1(const OpType &_type, NumInputs inputs = {}, int outputs = 0)
class AbortOp : public popart::Op

Public Functions

AbortOp(const OperatorIdentifier&, const Op::Settings&)
std::unique_ptr<Op> clone() const override
void setup() final
inline float getSubgraphValue() const final
inline bool hasSideEffect() const override

Public Static Functions

static inline InIndex getInIndex()
class AbsGradOp : public popart::Op

Public Functions

AbsGradOp(const AbsOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
inline virtual float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getFwdArgInIndex()
static inline OutIndex getOutIndex()
class AbsOp : public popart::ElementWiseUnaryOp

Public Functions

AbsOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class AdaDeltaUpdaterOp : public popart::Op

Public Functions

AdaDeltaUpdaterOp(OptimizerValue eps, const Op::Settings&)
std::unique_ptr<Op> clone() const final
void setup() final
void appendOutlineAttributes(OpSerialiserBase&) const final
inline float getSubgraphValue() const final
inline bool isOptimizerOp() const override
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final

Public Members

const OptimizerValue initEps

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getAccl1InIndex()
static inline InIndex getAccl2InIndex()
static inline InIndex getEpsInIndex()
static inline OutIndex getUpdaterOutIndex()
class AdamUpdaterOp : public popart::Op

Public Functions

AdamUpdaterOp(AdamMode mode_, OptimizerValue wd, OptimizerValue b1, OptimizerValue b2, OptimizerValue eps, const Op::Settings&)
std::unique_ptr<Op> clone() const final
void setup() final
void appendOutlineAttributes(OpSerialiserBase&) const final
inline float getSubgraphValue() const final
inline bool isOptimizerOp() const override
view::Regions modifies(InIndex) const final
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final

Public Members

AdamMode mode
const OptimizerValue initWd
const OptimizerValue initB1
const OptimizerValue initB2
const OptimizerValue initEps

Public Static Functions

static inline InIndex getVarInIndex()
static inline InIndex getAccl1InIndex()
static inline InIndex getAccl2InIndex()
static inline InIndex getStepInIndex()
static inline InIndex getWdInIndex()
static inline InIndex getBeta1InIndex()
static inline InIndex getBeta2InIndex()
static inline InIndex getEpsInIndex()
static inline OutIndex getUpdaterOutIndex()
class AddArg0GradOp : public popart::ReduceSumOp

Public Functions

AddArg0GradOp(const Op&, const std::vector<int64_t> &axes)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
class AddArg1GradOp : public popart::ReduceSumOp

Public Functions

AddArg1GradOp(const Op&, const std::vector<int64_t> &axes)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
class AddBiasBiasGradOp : public popart::ReduceSumOp

Public Functions

AddBiasBiasGradOp(const AddBiasOp&, const std::vector<int64_t> &axes)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class AddBiasDataGradOp : public popart::IdentityOp

Public Functions

AddBiasDataGradOp(const AddBiasOp&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class AddBiasInplaceOp : public popart::AddBiasOp

Public Functions

AddBiasInplaceOp(const AddBiasOp&)
std::unique_ptr<Op> clone() const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
view::Regions modifies(InIndex) const override
view::Regions aliases(InIndex, OutIndex) const override
class AddBiasOp : public popart::Op

Subclassed by popart::AddBiasInplaceOp

Public Functions

AddBiasOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getSubgraphValue() const final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
view::RegMap fwdRegMap(InIndex, OutIndex) const override
view::RegMap bwdRegMap(InIndex, OutIndex) const override
void growAliasModel(AliasModel&) const override
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Public Static Functions

static inline InIndex getDataInIndex()
static inline InIndex getBiasInIndex()
static inline OutIndex getOutIndex()
class AddLhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp

Public Functions

inline AddLhsInplaceOp(const OperatorIdentifier &_, const Op::Settings &_settings)
inline AddLhsInplaceOp(const Op::Settings &_settings)
std::unique_ptr<Op> clone() const final
class AddRhsInplaceOp : public popart::ElementWiseBinaryInplaceRhsOp

Public Functions

inline AddRhsInplaceOp(const Op::Settings &_settings)
std::unique_ptr<Op> clone() const final
class AllReduceGradOp : public popart::AllReduceOp

Public Functions

AllReduceGradOp(CollectiveOperator op_, std::vector<int64_t> ipus_, const bool identicalInputs_, const bool identicalGradInputs_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class AllReduceOp : public popart::Op

Subclassed by popart::AllReduceGradOp

Public Functions

AllReduceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, std::vector<int64_t> ipus_, const Op::Settings &settings_)
AllReduceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, std::vector<int64_t> ipus_, const bool identicalInputs_, const bool identicalGradInputs_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() override
void setup() final
void appendOutlineAttributes(OpSerialiserBase&) const override
bool canBeReplacedByIdentity() const override
inline float getSubgraphValue() const override
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex index, std::set<OpId> &visited) const override
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex index, std::set<OpId> &visited) const override
inline CollectiveOperator getReduceOp() const
inline bool getIdenticalInputs() const
inline std::vector<int64_t> getIpus() const

Public Static Functions

static inline InIndex getInStartIndex()
static inline OutIndex getOutStartIndex()
class AndOp : public popart::BinaryComparisonOp

Public Functions

AndOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class ArgExtremaOp : public popart::Op

Subclassed by popart::ArgMaxOp, popart::ArgMinOp

Public Functions

ArgExtremaOp(const OperatorIdentifier &_opid, int64_t axis, int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
void setup() final
int64_t getKeepDims() const
int64_t getAxis() const
void appendOutlineAttributes(OpSerialiserBase&) const final
inline float getSubgraphValue() const final
inline bool canShard() const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class ArgMaxOp : public popart::ArgExtremaOp

Public Functions

std::unique_ptr<Op> clone() const final
class ArgMinOp : public popart::ArgExtremaOp

Public Functions

std::unique_ptr<Op> clone() const final
class AsinGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

AsinGradOp(const AsinOp&)
std::unique_ptr<Op> clone() const final
class AsinInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

AsinInplaceOp(const AsinOp&)
std::unique_ptr<Op> clone() const final
class AsinOp : public popart::ElementWiseUnaryOp

Public Functions

AsinOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class Atan2Arg0GradOp : public popart::ElementWiseBinaryArg0GradOp

Public Functions

Atan2Arg0GradOp(const Op&, const std::vector<int64_t> &reduction_axes)
std::unique_ptr<Op> clone() const final
class Atan2Arg1GradOp : public popart::ElementWiseBinaryArg1GradOp

Public Functions

Atan2Arg1GradOp(const Op&, const std::vector<int64_t> &reduction_axes)
std::unique_ptr<Op> clone() const final
class Atan2LhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp

Public Functions

inline Atan2LhsInplaceOp(const Op::Settings &_settings)
std::unique_ptr<Op> clone() const final
class AtanGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

AtanGradOp(const AtanOp&)
std::unique_ptr<Op> clone() const final
class AtanInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

AtanInplaceOp(const AtanOp&)
std::unique_ptr<Op> clone() const final
class AtanOp : public popart::ElementWiseUnaryOp

Public Functions

AtanOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class AutoLossScaleProxyGradOp : public popart::AutoLossScaleProxyOp

Public Functions

AutoLossScaleProxyGradOp(const AutoLossScaleProxyOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class AutoLossScaleProxyOp : public popart::ElementWiseUnaryOp

Subclassed by popart::AutoLossScaleProxyGradOp

Public Functions

AutoLossScaleProxyOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class AveragePoolGradOp : public popart::Op

Public Functions

AveragePoolGradOp(const AveragePoolOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
inline float getSubgraphValue() const final
void appendOutlineAttributes(OpSerialiserBase&) const override

Public Members

const Shape creatorSpatialK
const Shape creatorStrides
const Shape creatorLowerPads
const Shape creatorUpperPads

Public Static Functions

static inline InIndex getPrePooledInIndex()
static inline InIndex getPooledInIndex()
static inline InIndex getGradPooledInIndex()
static inline OutIndex getOutIndex()
class AveragePoolOp : public popart::HasReceptiveFieldOp

Public Functions

AveragePoolOp(const OperatorIdentifier &_opid, int64_t _countIncludePad, const std::vector<int64_t> &_kernelShape, const HasReceptiveFieldOp::ReceptiveOpAttributes &attributes, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
int64_t getNOutChans() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
bool canBeReplacedByIdentity() const override
Shape getSpatialK() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class BaseOnnxRNNGradOp : public popart::Op

Subclassed by popart::GRUGradOp, popart::LSTMGradOp, popart::RNNGradOp

Public Functions

BaseOnnxRNNGradOp(const OperatorIdentifier &_opid, const BaseOnnxRNNOp &fwd_op)
virtual std::unique_ptr<Op> clone() const override = 0
void setup() override
const std::vector<GradInOutMapper> &gradInputInfo() const override
const std::map<int, int> &gradOutToNonGradIn() const override
bool hasLastHiddenStateGradInput() const
bool hasFullHiddenStateGradInput() const
inline float getSubgraphValue() const final

Public Members

const bool hasBiasesInput
const bool hasInitialHInput
const unsigned batch_size
const unsigned input_size
const unsigned max_seq_length
const unsigned hidden_size
const unsigned num_directions = 1

Public Static Functions

static inline InIndex getInputInIndex()
static inline InIndex getInputWeightsInIndex()
static inline InIndex getRecurrenceWeightsInIndex()
static inline InIndex getBiasesInIndex()
static inline InIndex getInitialHInIndex()
static inline InIndex getFullHiddenStateInIndex()
static inline InIndex getLastHiddenStateGradInIndex()
static inline InIndex getFullHiddenStateGradInIndex()
static inline InIndex getSequenceLensInIndex()
static inline OutIndex getInputOutIndex()
static inline OutIndex getInputWeightsOutIndex()
static inline OutIndex getRecurrenceWeightsOutIndex()
static inline OutIndex getBiasesOutIndex()
static inline OutIndex getInitialHOutIndex()
class BaseOnnxRNNOp : public popart::Op

Subclassed by popart::GRUOp, popart::LSTMOp, popart::RNNOp

Public Functions

BaseOnnxRNNOp(const OperatorIdentifier &_opid, nonstd::optional<int64_t> hidden_size, const Op::Settings &settings_)
virtual std::unique_ptr<Op> clone() const override = 0
int64_t getMaxSeqLength() const
int64_t getBatchSize() const
int64_t getInputSize() const
int64_t getHiddenSize() const
virtual int64_t getNumDirections() const
void checkHiddenSize() const
bool hasBiasesInput() const
bool hasInitialHInput() const
bool hasSeqLenInput() const
std::set<InIndex> optionalInputs() const override
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
inline virtual std::string getName() const
inline nonstd::optional<int64_t> getHiddenSizeAttribute() const

Public Static Functions

static inline InIndex getInputInIndex()
static inline InIndex getInputWeightsInIndex()
static inline InIndex getRecurrenceWeightsInIndex()
static inline InIndex getBiasesInIndex()
static inline InIndex getSequenceLensInIndex()
static inline InIndex getInitialHInIndex()
static inline OutIndex getFullHiddenStateOutIndex()
static inline OutIndex getLastHiddenStateOutIndex()
class BasePadOp : public popart::Op

Subclassed by popart::BasePadOutplaceOp, popart::PadInplaceOp

Public Functions

BasePadOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_pads, const std::vector<unsigned> &_flips, float value_, const std::string &_mode, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
bool padSizeZero() const
inline float getSubgraphValue() const final
view::Region valueRegion() const
std::vector<int64_t> padDimensions() const
inline int64_t getLowerPadding(size_t dim) const
inline int64_t getUpperPadding(size_t dim) const
inline const std::string &getMode() const
inline float getPadValue() const
void appendOutlineAttributes(OpSerialiserBase&) const override
void setup() final
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
inline int64_t getRank() const
std::vector<Slice> getSlices() const
inline std::vector<std::ptrdiff_t> getLowerPadding() const
inline std::vector<std::ptrdiff_t> getUpperPadding() const
inline const std::vector<int64_t> &getPads() const
inline const std::vector<unsigned> &getFlips() const
void growAliasModel(AliasModel&) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class BasePadOutplaceOp : public popart::BasePadOp

Subclassed by popart::PadOp, popart::SliceGradOp

Public Functions

BasePadOutplaceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_pads, const std::vector<unsigned> &_flips, float value_, const std::string &_mode, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
inline bool canBeReplacedByIdentity() const override
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
class BaseSliceOp : public popart::Op

Subclassed by popart::SliceInplaceOp, popart::SliceOp

Public Functions

BaseSliceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const std::vector<int64_t> &steps_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
void growAliasModel(AliasModel&) const override
void setup() final
virtual void connectInTensor(InIndex, TensorId) final
void appendOutlineAttributes(OpSerialiserBase&) const override
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
view::Regions uses(InIndex) const final
view::Region createSlicedRegion(const Shape &toBeSliced) const
view::Region getFullInRegion() const
view::Region getFullOutRegion() const
inline const std::vector<int64_t> &getStarts() const
inline const std::vector<int64_t> &getEnds() const
inline const std::vector<int64_t> &getAxes() const
inline const std::vector<int64_t> &getSteps() const
inline void setStarts(const std::vector<int64_t> &x)
inline void setEnds(const std::vector<int64_t> &x)
inline void setAxes(const std::vector<int64_t> &x)
inline void setSteps(const std::vector<int64_t> &x)
std::array<std::vector<int64_t>, 2> getLowerUpper() const
std::vector<Slice> getSlices(std::vector<int64_t> input_shape) const
std::vector<Slice> getSlices() const
std::vector<int64_t> getPads() const
std::vector<unsigned> getFlips() const
inline float getSubgraphValue() const final
inline bool canShard() const override

Public Members

int unwindConcatDim = 0

Public Static Functions

static inline InIndex getInIndex()
static inline InIndex getStartsInIndex()
static inline InIndex getEndsInIndex()
static inline InIndex getAxesInIndex()
static inline InIndex getStepsInIndex()
static inline OutIndex getOutIndex()
class BaseSortOp : public popart::Op

Subclassed by popart::TopKOp

Public Functions

BaseSortOp(const OperatorIdentifier &_opid, int64_t axis, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
int64_t getAxis() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Static Functions

static inline int getInIndex()
class BatchNormGradOp : public popart::Op

Public Functions

BatchNormGradOp(const BatchNormOp&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
inline float getEpsilon() const
inline int64_t getSpatial() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getXInIndex()
static inline InIndex getScaleInIndex()
static inline InIndex getMeanInIndex()
static inline InIndex getVarInIndex()
static inline InIndex getYGradInIndex()
static inline OutIndex getXOutIndex()
static inline OutIndex getScaleOutIndex()
static inline OutIndex getBOutIndex()
class BatchNormOp : public popart::Op

Public Functions

BatchNormOp(const OperatorIdentifier &_opid, float _epsilon, float _momentum, int64_t _spatial, bool _unbiased_variance, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getSubgraphValue() const final
inline float getEpsilon() const
inline float getMomentum() const
inline int64_t getSpatial() const
inline bool useUnbiasedVariance() const
inline bool isTraining() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline bool isNorm() const override

Public Static Functions

static inline InIndex getXInIndex()
static inline InIndex getScaleInIndex()
static inline InIndex getBInIndex()
static inline InIndex getMeanInIndex()
static inline InIndex getVarInIndex()
static inline OutIndex getYOutIndex()
static inline OutIndex getMeanOutIndex()
static inline OutIndex getVarOutIndex()
static inline OutIndex getSavedMeanOutIndex()
static inline OutIndex getSavedVarOutIndex()
class BinaryComparisonOp : public popart::Op

Subclassed by popart::AndOp, popart::EqualOp, popart::GreaterOp, popart::LessOp, popart::OrOp

Public Functions

BinaryComparisonOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
void setup() final
inline float getSubgraphValue() const final
inline bool canShard() const override

Public Static Functions

static inline InIndex getArg0InIndex()
static inline InIndex getArg1InIndex()
static inline OutIndex getOutIndex()
class BinaryConstScalarOp : public popart::ElementWiseUnaryOp

A unary Op which performs a binary operation (Mul, Div, etc.) between its single input tensor and a scalar whose value is stored as an Op attribute.

The input indices (0 or 1) of the tensor and the scalar are controlled by the scalarInIndex attribute.

Some examples. Let T be the input tensor of this Op.

[value = 2, opType = “Div”, scalarInIndex = 1]: T / 2.0

[value = 4, opType = “Pow”, scalarInIndex = 0]: 2.0 ** T

[value = 0.2, opType = “Add”, scalarInIndex = 0]: 0.2 + T

[value = 100, opType = “Sub”, scalarInIndex = 1]: T - 100.
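The semantics can be sketched elementwise in plain C++ (this helper is illustrative, not the PopART implementation):

#include <cmath>

// One output element for input element t, given the op's attributes:
// the scalar value, the operation type, and scalarInIndex.
float binaryConstScalar(float t, float value, char opType, int scalarInIndex) {
  // scalarInIndex selects whether the scalar is the left or right operand.
  const float lhs = (scalarInIndex == 0) ? value : t;
  const float rhs = (scalarInIndex == 0) ? t : value;
  switch (opType) {
  case '+': return lhs + rhs;           // Add
  case '-': return lhs - rhs;           // Sub
  case '*': return lhs * rhs;           // Mul
  case '/': return lhs / rhs;           // Div
  default:  return std::pow(lhs, rhs);  // Pow
  }
}

For example, binaryConstScalar(t, 2.0f, '/', 1) computes T / 2.0, matching the first example above.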

Public Types

enum class Type

Values:

enumerator Add = 0
enumerator Sub
enumerator Mul
enumerator Div
enumerator Pow
enumerator N

Public Functions

inline BinaryConstScalarOp(const OperatorIdentifier &x, float value, Type t, int64_t index, const Op::Settings &settings)
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
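To illustrate the mapping described above, a minimal sketch of the two methods for a hypothetical grad op of a unary op (class names are illustrative):

const std::vector<popart::GradInOutMapper> &MyGradOp::gradInputInfo() const {
  // Input 0 of the grad op is the gradient of the forward op's output 0.
  static const std::vector<popart::GradInOutMapper> inInfo = {
      {0, 0, popart::GradOpInType::GradOut}};
  return inInfo;
}

const std::map<int, int> &MyGradOp::gradOutToNonGradIn() const {
  // Output 0 of the grad op is the gradient of the forward op's input 0.
  static const std::map<int, int> outInfo = {{0, 0}};
  return outInfo;
}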

inline float value() const
inline Type opType() const
inline int64_t scalarInIndex() const
class BitwiseBinaryOp : public popart::ElementWiseBinaryOp

Public Functions

BitwiseBinaryOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class BitwiseNotOp : public popart::ElementWiseUnaryOp

Public Functions

BitwiseNotOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class BoundaryOp : public popart::Op

Public Functions

inline BoundaryOp(const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
inline void setup() final
inline float getSubgraphValue() const final
inline bool isOutlineable() const override
inline bool hasSideEffect() const override
class BucketizeOp : public popart::Op

Public Functions

BucketizeOp(const OperatorIdentifier &opid, bool right, const Op::Settings &settings)
void setup() override
std::unique_ptr<Op> clone() const override
float getSubgraphValue() const override
void appendOutlineAttributes(OpSerialiserBase&) const override
bool isRight() const noexcept

Public Static Functions

static inline InIndex inIndex()
static inline InIndex boundariesInIndex()
static inline OutIndex outIndex()
class CallGradOp : public popart::CallOp

Public Functions

CallGradOp(CallOp &fwdOp, Graph &bwdGraph, const std::vector<GradInOutMapper> &gradInInfo_, const std::map<int, int> &gradOutInfo_)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class CallOp : public popart::SubgraphOp

Subclassed by popart::CallGradOp

Public Functions

CallOp(const OperatorIdentifier&, Graph &callee, const Op::Settings &settings)
CallOp(const OperatorIdentifier&, Graph &callee, const std::vector<int> &modifiedInputsViaAttrs, const Op::Settings &settings)
void setup() final
std::unique_ptr<Op> clone() const final
Graph &getCalledGraph() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<TensorId> getGradOpInputIds(const Graph &gradGraph)
void appendOutlineAttributes(OpSerialiserBase &os) const override
inline float getSubgraphValue() const final
std::vector<const Graph*> getCalledGraphs() const override
void setCalledGraph(Graph&) override
inline InIndex subgraphInToOpInIndex(InIndex index) const override
inline InIndex opInToSubgraphInIndex(InIndex index) const override
inline OutIndex subgraphOutToOpOutIndex(OutIndex index) const override
inline OutIndex opOutToSubgraphOutIndex(OutIndex index) const override
inline std::set<OutIndex> opInToOpOutIndex(InIndex in) const override
inline std::set<InIndex> opOutToOpInIndex(OutIndex out) const override
inline void growAliasModel(AliasModel &m) const override
void connectInTensor(InIndex inIndex, TensorId tenId) override
class CastGradOp : public popart::CastOp

Public Functions

CastGradOp(const CastOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class CastOp : public popart::Op

Subclassed by popart::CastGradOp

Public Functions

CastOp(const OperatorIdentifier &_opid, DataType _to, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() override
inline DataType toDataType() const
inline float getSubgraphValue() const final
inline bool canShard() const override
inline ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
bool canBeReplacedByIdentity() const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class CeilInplaceOp : public popart::OneWayUnaryInPlaceOp

Public Functions

CeilInplaceOp(const CeilOp&)
std::unique_ptr<Op> clone() const final
class CeilOp : public popart::OneWayUnaryOp

Public Functions

CeilOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class ClipGradOp : public popart::ClipOp

Public Functions

ClipGradOp(const ClipOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

Public Static Functions

static inline InIndex getClippedInIndex()
static inline InIndex getGradClippedInIndex()
class ClipInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

ClipInplaceOp(const ClipOp&)
std::unique_ptr<Op> clone() const final
float getClipMin() const
float getClipMax() const
void appendOutlineAttributes(OpSerialiserBase&) const override
class ClipOp : public popart::ElementWiseUnaryOp

Subclassed by popart::ClipGradOp

Public Functions

ClipOp(const OperatorIdentifier &_opid, float min_, float max_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
inline void setClipMin(float value)
float getClipMin() const
inline void setClipMax(float value)
float getClipMax() const
void appendOutlineAttributes(OpSerialiserBase&) const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
bool canBeReplacedByIdentity() const override

Public Static Functions

static inline InIndex clip11MinInputIndex()
static inline InIndex clip11MaxInputIndex()
class CollectivesBaseOp : public popart::Op

Subclassed by popart::MultiCollectiveBaseOp, popart::ReplicatedAllGatherOp, popart::ReplicatedAllReduceOp, popart::ReplicatedReduceScatterOp

Public Functions

CollectivesBaseOp(const OperatorIdentifier &_opid, CommGroup group, const Op::Settings &settings_)
CollectivesBaseOp(const OperatorIdentifier &_opid, const ReplicaGrouping &grouping, const Op::Settings &settings_)
virtual std::unique_ptr<Op> clone() const override = 0
virtual bool hasCorrespondingLinkedIndexTensor(Tensor *t)
inline bool hasCorrespondingLinkedIndexTensor(InIndex in)
virtual Tensor *getCorrespondingLinkedIndexTensor(Tensor *t)
inline Tensor *getCorrespondingLinkedIndexTensor(InIndex in)
virtual bool isCollectiveLinkedIndexTensor(InIndex in) const
virtual bool isCollectiveLinkedIndexTensor(Tensor *t) const
inline void setGCLCommGroup(CommGroup group)
inline CommGroup getGCLCommGroup() const
void setReplicaGrouping(const ReplicaGrouping &grouping)
const ReplicaGrouping &getReplicaGrouping() const
virtual int64_t getCommSize() const

Number of replicas the collective communicates across.

This will be used to create a CollectiveBalanceReorder in lowering to improve the tile mapping when using RTS.

void appendOutlineAttributes(OpSerialiserBase &os) const override
inline virtual bool isConfigureOutputForReplicatedTensorSharding() const

Check replicated tensor sharding (RTS) mode. Collective operations set up for RTS are allowed to scramble the data element order of the input (AllGather) / output (ReduceScatter) tensor, such that the tensor layouts minimize inter-tile exchanges.

As a consequence, the RTS sharded tensor does not follow the original data order and can only be used in elementwise, RTS-enabled operations, such as optimizers, where all inputs consumed are rearranged in the same way.

Returns

True if this operation is configured for replicated tensor sharding.

Public Static Functions

static inline InIndex getInIndex()
static inline InIndex getCollectiveLinkedIndex()
static inline OutIndex getOutIndex()
static inline ReplicatedTensorShardingIndicesIndex getDefaultTensorShardingGroupIndex()
class ConcatGradOp : public popart::Op

Public Functions

ConcatGradOp(const ConcatOp &op, InIndex input)
ConcatGradOp(const ConcatInplaceOp &op, InIndex input)
std::unique_ptr<Op> clone() const override
void setup() override
void appendOutlineAttributes(OpSerialiserBase&) const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
int64_t getAxis() const
int64_t getStart() const
int64_t getEnd() const
inline float getSubgraphValue() const final
inline bool canShard() const override
inline ReductionType getShardReductionType(OutIndex index) const override
void configureShardedOp(Op *const shardedOp, const Settings *const settings_) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class ConcatInplaceOp : public popart::ConcatOp

Public Functions

ConcatInplaceOp(int64_t axis_, const Op::Settings &settings)
ConcatInplaceOp(const ConcatOp &concatOp, int64_t axis_)
std::unique_ptr<Op> clone() const override
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
inline view::Regions aliases(InIndex in, OutIndex) const final
class ConcatOp : public popart::Op

Subclassed by popart::ConcatInplaceOp

Public Functions

ConcatOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
void setup() final
std::vector<std::unique_ptr<Op>> getGradOps() final
int64_t getAxis() const
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
void appendOutlineAttributes(OpSerialiserBase&) const override
bool canBeReplacedByIdentity() const override
inline float getSubgraphValue() const final
inline bool canShard() const override
void growAliasModel(AliasModel&) const override
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Public Static Functions

static inline InIndex getInIndex(InIndex index)
static inline OutIndex getOutIndex()
static Shape getOutputShape(int64_t axis, const std::vector<const Shape*> inputs)
class ConvDataGradOp : public popart::MultiConvDataGradBaseOp

Public Functions

ConvDataGradOp(const ConvOp&)
std::unique_ptr<Op> clone() const final
inline int numConvs() const override
inline const ConvParameters &getParameters() const

Public Static Functions

static inline InIndex getWeightsInIndex()
static inline InIndex getGradConvolvedInIndex()
static inline OutIndex getOutIndex()
class ConvFlipWeightsGradOp : public popart::ConvFlipWeightsOp

Public Functions

ConvFlipWeightsGradOp(const ConvFlipWeightsGradOp&) = default
ConvFlipWeightsGradOp(const ConvFlipWeightsOp &convFlipWeightsOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class ConvFlipWeightsOp : public popart::Op

Subclassed by popart::ConvFlipWeightsGradOp

Public Functions

ConvFlipWeightsOp(const ConvFlipWeightsOp&) = default
ConvFlipWeightsOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
~ConvFlipWeightsOp() override
std::unique_ptr<Op> clone() const override
void setup() final
std::vector<std::unique_ptr<Op>> getGradOps() final
inline const ConvParameters &getParameters() const
inline void setParameters(const ConvParameters &p)
inline bool getGroupReshape() const
inline void setGroupReshape(bool reshape)
inline float getSubgraphValue() const final
void appendOutlineAttributes(OpSerialiserBase &os) const final
inline void setConvOptions(const MultiConvOptions &opts)
inline const MultiConvOptions &getMultiConvOptions() const
inline std::map<std::string, std::string> getConvOptions() const

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class ConvOp : public popart::MultiConvBaseOp

Public Functions

ConvOp(const OperatorIdentifier &_opid, const Settings &settings_, std::vector<int64_t> strides, std::vector<int64_t> pads, std::vector<int64_t> dilations, int64_t group, const AutoPad &padType, const MultiConvOptions &convOpts)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline int numConvs() const final
inline int64_t getGroups() const
inline void setGroup()
inline int64_t getNInChans() const
inline int64_t getNOutChans() const
inline ConvParameters getParameters() const
void restoreAttributesFromParams(const std::vector<ConvParameters>&) override
bool isPow2ScaledConv() const

Returns true if and only if the inputs to the op constitute a valid set of inputs for a fused (float8) convolution.

inline std::set<InIndex> optionalInputs() const override

Public Static Functions

static inline InIndex getDataInIndex()
static inline InIndex getWeightsInIndex()
static inline InIndex getLog2ScaleInIndex()
static inline OutIndex getOutIndex()
class ConvTransposeOp : public popart::Op

Public Functions

ConvTransposeOp(const OperatorIdentifier &_opid, const Settings &settings_, std::vector<int64_t> strides, std::vector<int64_t> pads, std::vector<int64_t> dilations, int64_t group, const AutoPad &padType, std::vector<int64_t> outputPadding, Shape outputShape, const MultiConvOptions &convOpts)
std::unique_ptr<Op> clone() const override
void setup() final
inline float getSubgraphValue() const final
bool isPow2ScaledConvTranspose() const
inline std::set<InIndex> optionalInputs() const override

Public Members

std::vector<int64_t> strides
std::vector<int64_t> dilations
int64_t group
const AutoPad padType
const MultiConvOptions convOpts
ConvParameters params

Public Static Functions

static inline InIndex getInIndex()
static inline InIndex getWeightsInIndex()
static inline InIndex getLog2ScaleInIndex()
static inline OutIndex getOutIndex()
class ConvWeightsGradOp : public popart::MultiConvWeightsGradBaseOp

Public Functions

ConvWeightsGradOp(const ConvOp&)
std::unique_ptr<Op> clone() const final
ConvWeightsGradOp(const ConvWeightsGradOp&) = default
inline int numConvs() const final
inline const ConvParameters &getParameters() const

Public Static Functions

static inline InIndex getGradConvolvedInIndex()
static inline InIndex getPreConvolvedInIndex()
static inline OutIndex getOutIndex()
class CosGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

CosGradOp(const CosOp &fwdOp)
std::unique_ptr<Op> clone() const final
class CosOp : public popart::ElementWiseUnaryOp

Public Functions

CosOp(const OperatorIdentifier &_opid, const Op::Settings&)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)
class CoshOp : public popart::Op

Public Functions

CoshOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class CtcBeamSearchDecoderOp : public popart::Op

Public Functions

CtcBeamSearchDecoderOp(const popart::OperatorIdentifier &_opid, unsigned _blankClass, unsigned _beamWidth, unsigned _topPaths, const popart::Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final
void appendAttributes(popart::OpSerialiserBase &os) const override
void appendOutlineAttributes(popart::OpSerialiserBase &os) const override
std::vector<std::unique_ptr<Op>> getGradOps() final
float getSubgraphValue() const final
bool requiresRandomSeed() const override
inline unsigned getBlankClass() const
inline unsigned getBeamWidth() const
inline unsigned getTopPaths() const
inline unsigned getMaxTime() const
inline unsigned getBatchSize() const
inline unsigned getNumClasses() const

Public Static Functions

static inline InIndex getLogProbsInIndex()
static inline InIndex getDataLengthsInIndex()
static inline OutIndex getLabelProbsOutIndex()
static inline OutIndex getLabelLengthsOutIndex()
static inline OutIndex getDecodedLabelsOutIndex()
class CtcGradOp : public popart::Op

Public Functions

CtcGradOp(const CtcOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
inline float getSubgraphValue() const final
inline ReductionType getReductionType() const
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
inline bool canShard() const override
inline bool getEnableReducedClassesInLabel() const

Public Static Functions

static inline InIndex getLogProbsGradientWrtCtcLossInIndex()
static inline InIndex getTargetLengthsInIndex()
static inline InIndex getCtcLossGradientInIndex()
static inline OutIndex getLogProbsGradientOutIndex()
class CtcOp : public popart::LossOp

Public Functions

CtcOp(const OperatorIdentifier &_opid, const ReductionType reduction, const unsigned blank, const bool zeroInfinity, const Op::Settings &settings_, const bool enableReducedClassesInLabel, const DataType outDataType = DataType::UNDEFINED)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getSubgraphValue() const final
inline unsigned getBlank() const
inline bool getZeroInfinity() const
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
unsigned getBatchSize() const
unsigned getMaxInputLength() const
unsigned getMaxTargetLength() const
unsigned getNumClasses() const
inline bool canShard() const override
inline bool getEnableReducedClassesInLabel() const

Public Static Functions

static inline InIndex getLogProbsInIndex()
static inline InIndex getTargetsInIndex()
static inline InIndex getInputLengthsInIndex()
static inline InIndex getTargetLengthsInIndex()
static inline OutIndex getCtcLossOutIndex()
static inline OutIndex getLogProbsGradientWrtCtcLossOutIndex()
class CumSumGradOp : public popart::Op

Public Functions

CumSumGradOp(const CumSumOp &op, bool exclusive, bool reverse, int64_t axis)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
bool getExclusive() const
bool getReverse() const
int64_t getAxis() const
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex outGradXInIndex()
static inline InIndex fwdXInIndex()
static inline OutIndex outIndex()
class CumSumOp : public popart::Op

Public Functions

CumSumOp(const OperatorIdentifier &_opid, bool exclusive_, bool reverse_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() override
void setup() final
bool getExclusive() const
bool getReverse() const
int64_t getAxis() const
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex xInIndex()
static inline InIndex axisInIndex()
static inline OutIndex outIndex()
class DetachInplaceOp : public popart::DetachOp

Public Functions

DetachInplaceOp(const DetachOp &detachOp)
DetachInplaceOp(const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
inline view::Regions aliases(InIndex in, OutIndex) const final
class DetachOp : public popart::ElementWiseUnaryOp

Subclassed by popart::DetachInplaceOp

Public Functions

DetachOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
inline std::vector<std::unique_ptr<Op>> getGradOps() final
std::unique_ptr<Op> clone() const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
inline bool isIdentity() const final
inline bool isOutplaceViewChange() const override
class DivArg0GradOp : public popart::ElementWiseBinaryArg0GradOp

Public Functions

DivArg0GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)
std::unique_ptr<Op> clone() const final
class DivArg1GradOp : public popart::ElementWiseBinaryArg1GradOp

Public Functions

DivArg1GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)
std::unique_ptr<Op> clone() const final
class DropoutBaseOp : public popart::RandomBaseOp

Subclassed by popart::DropoutOp, popart::ShapedDropoutOp

Public Functions

DropoutBaseOp(const OperatorIdentifier &_opid, float ratio_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
bool canBeReplacedByIdentity() const override
inline float getRatio() const
inline void setRatio(float r)
inline InIndex getSeedInIndex() const override
inline bool canShard() const override
void configureShardedOp(Op *const shardedOp, const Settings *const settings_) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
static float validateRatioAttribute(const OpCreatorInfo &info)
class DropoutOp : public popart::DropoutBaseOp

Subclassed by popart::DropoutGradOp

Public Functions

DropoutOp(const OperatorIdentifier &_opid, float ratio_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() override
void setup() override
bool canBeReplacedByIdentity() const override
void appendAttributes(OpSerialiserBase &os) const override
inline void setOutputMask(bool v)
inline bool getOutputMask() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline void setReferenceId(RandomReferenceId id)
inline RandomReferenceId getReferenceId() const
TensorId getReferenceTensorId()

Public Static Functions

static inline OutIndex getMaskOutIndex()
class DropoutGradOp : public popart::DropoutOp

Public Functions

DropoutGradOp(const DropoutOp &fwdOp)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const override
const std::map<int, int> &gradOutToNonGradIn() const override

Public Static Functions

static inline InIndex getGradInIndex()
static inline OutIndex getOutIndex()
class DynamicAddInplaceOp : public popart::DynamicTernaryBaseInplaceOp

Public Functions

DynamicAddInplaceOp(const DynamicAddOp &dynamicAddOp)
DynamicAddInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
std::unique_ptr<Op> clone() const final
class DynamicAddOp : public popart::DynamicTernaryBaseOp

Public Functions

DynamicAddOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
class DynamicBaseOp : public popart::Op

Dynamic Base Op.

Base class for operators acting on a run-time selectable slice of a tensor.

The word “dynamic” refers to the fact that the index can be specified at runtime, where index is the second tensor argument of this operator, as specified in graphcoreoperators.hpp. The axes attribute specifies along which axes the tensor should be sliced. The sizes attribute specifies the size of the slices.

A slice along an axis can be defined by the tuple (start, stop, step), where:

  • start is equal to the index for the respective axis

  • stop is equal to index + size for the respective axis

  • step is equal to 1

See also

graphcoreoperators.hpp

Limitations: Assuming we would like to slice a tensor A with dimensions (4, 3):

  • Step other than 1 is not supported (i.e. A[::2,:] is not supported)

  • Negative slicing is not supported (i.e. A[:-1,:] is not supported)

  • stop greater than the size of the axis is not supported (i.e. A[:5,:] is not supported)

Example: Given a tensor A with shape (3, 2, 4, 5):

  • If we specify axes = {1, 3} (that is, we slice axes 1 and 3, counting from 0), the operator will operate on A[:, index[0]:(index[0]+size[0]), :, index[1]:(index[1]+size[1])].

  • If we instead specify axes = {0, 1, 3}, the operator will operate on A[index[0]:(index[0]+size[0]), index[1]:(index[1]+size[1]), :, index[2]:(index[2]+size[2])].
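The slicing rules above can be summarised in a few lines of plain C++. The following is a standalone sketch of the semantics only, not part of the PopART API: it computes the (start, stop) bounds that a dynamic op addresses on each axis, given the axes and sizes attributes and a runtime index.

C++:

// Standalone illustration of the DynamicBaseOp slicing semantics.
// Not part of the PopART API.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <utility>
#include <vector>

std::vector<std::pair<int64_t, int64_t>>
sliceBounds(const std::vector<int64_t> &shape,
            const std::vector<int64_t> &axes,
            const std::vector<int64_t> &sizes,
            const std::vector<int64_t> &index) {
  assert(axes.size() == sizes.size() && axes.size() == index.size());
  // Default: the full extent of every axis (step is always 1).
  std::vector<std::pair<int64_t, int64_t>> bounds;
  for (auto dim : shape) {
    bounds.emplace_back(0, dim);
  }
  // Sliced axes: start = index[i], stop = index[i] + size[i].
  for (std::size_t i = 0; i < axes.size(); ++i) {
    const int64_t start = index[i];
    const int64_t stop  = start + sizes[i];
    assert(stop <= shape[axes[i]]); // stop beyond the axis is unsupported
    bounds[axes[i]] = {start, stop};
  }
  return bounds;
}

int main() {
  // Tensor A with shape (3, 2, 4, 5), axes = {1, 3}, sizes = {1, 2} and
  // runtime index = {0, 1} addresses A[:, 0:1, :, 1:3].
  for (const auto &b : sliceBounds({3, 2, 4, 5}, {1, 3}, {1, 2}, {0, 1})) {
    std::cout << b.first << ":" << b.second << "\n";
  }
  return 0;
}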

Subclassed by popart::DynamicBinaryBaseOp, popart::DynamicSliceBaseOp, popart::DynamicSlicePadGradOp

Public Functions

DynamicBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() override

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.

inline const std::vector<int64_t> &getAxes() const
inline void setAxes(const std::vector<int64_t> &x)
inline const std::vector<int64_t> &getSizes() const
inline void setSizes(const std::vector<int64_t> &x)
inline bool isNotOverlapping() const
TensorInfo createOutInfo() const
virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

Public Static Functions

static inline InIndex getIndexInIndex()
static inline OutIndex getOutIndex()
class DynamicBinaryBaseInplaceOp : public popart::DynamicBinaryBaseOp

Subclassed by popart::DynamicZeroInplaceOp

Public Functions

DynamicBinaryBaseInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
std::unique_ptr<Op> clone() const override
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
view::Regions aliases(InIndex, OutIndex) const final
view::Regions modifies(InIndex) const final
class DynamicBinaryBaseOp : public popart::DynamicBaseOp

Dynamic Binary Base Op.

Base class for operators acting on a run-time selectable slice of a tensor. The word “binary” refers to the fact that the operator takes two tensors as input.

See also

DynamicBaseOp for details

Subclassed by popart::DynamicBinaryBaseInplaceOp, popart::DynamicTernaryBaseOp, popart::DynamicUpdateToUpdateGradOp, popart::DynamicZeroGradOp, popart::DynamicZeroOp

Public Functions

DynamicBinaryBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

inline const TensorInfo &getUpdateTensorInfo() const
virtual void growAliasModel(AliasModel &m) const final

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

Pre

All input tensors of this op have mappings in aliasModel before the call to growAliasModel.

Post

All output tensors of this op have mappings in aliasModel after the call to growAliasModel.

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Translate a PopART inplacing proposal.

This translates the proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.

This method is defined as a void method which sets a value passed by reference, as opposed to a getter method, so that no Poprithms headers need to be included in this file.

Parameters
  • aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

  • inplaceId – The operator identifier to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

Public Static Functions

static inline InIndex getUpdateInIndex()
static inline InIndex getIndexInIndex()
static inline OutIndex getOutIndex()
class DynamicSliceBaseOp : public popart::DynamicBaseOp

Subclassed by popart::DynamicSliceOp, popart::DynamicUpdateUpdaterGradOp

Public Functions

DynamicSliceBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
std::unique_ptr<Op> clone() const override
void setup() final
TensorInfo createOutInfo() const

Public Static Functions

static inline InIndex getInIndex()
class DynamicSliceInplaceOp : public popart::DynamicSliceOp

Dynamic Slice Inplace Op.

This op takes two or three TensorIds as input (as indicated in graphcoreoperators.hpp):

  1. The TensorId of the tensor to slice from.

  2. The (optional) TensorId of the index of the starting point of the slice (see DynamicBaseOp for an explanation).

  3. The TensorId of the tensor to write the slice into (not used in the outplace variant).

The output is the TensorId of the sliced tensor, which aliases the input.

Public Functions

DynamicSliceInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
DynamicSliceInplaceOp(const DynamicSliceOp&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters

OperatorIdentifier – The operator identifier of the op to be instantiated.

Returns

An instance of the required op.

virtual view::RegMap fwdRegMap(InIndex, OutIndex) const final

Map regions of the input tensor at the input index to the regions of the output tensor at the output index that these input regions alias.

Parameters
  • InIndex – The op input index.

  • OutIndex – The op output index.

virtual view::RegMap bwdRegMap(InIndex, OutIndex) const final

Map regions of the output tensor at the output index to the regions of the input tensor at the input index that these output regions alias.

Parameters
  • InIndex – The op input index.

  • OutIndex – The op output index.

virtual view::Regions modifies(InIndex) const override

Return the input region which this op modifies (for inplace ops).

Parameters

InIndex – The input index.

Returns

The regions which this op modifies.

virtual view::Regions aliases(InIndex, OutIndex) const override

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters
  • InIndex – The input index.

  • OutIndex – The output index.

Returns

The regions which the output will alias.

class DynamicSliceOp : public popart::DynamicSliceBaseOp

Dynamic Slice Op.

This op takes two or three TensorIds as input (as indicated in graphcoreoperators.hpp):

  1. The TensorId of the tensor to slice from.

  2. The (optional) TensorId of the index of the starting point of the slice (see DynamicBaseOp for an explanation).

  3. The TensorId of the tensor to write the slice into (not used in the outplace variant).

The output is the TensorId of the sliced tensor.
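As an illustration, a DynamicSliceOp can be added to a model through the Builder. The following is a minimal sketch that assumes the AiGraphcoreOpset1 builder method dynamicslice({data, index}, axes, sizes, noOverlap) and a UINT32 index tensor; check the Builder reference for the authoritative signature.

C++:

// Minimal sketch: slice one row of a (4, 3) tensor at a runtime index.
// The dynamicslice signature and the UINT32 index type are assumptions;
// see the Builder reference for the authoritative API.
#include <popart/builder.hpp>

int main() {
  auto builder     = popart::Builder::create();
  auto aiGraphcore = builder->aiGraphcoreOpset1();

  popart::TensorInfo dataInfo{"FLOAT", std::vector<int64_t>{4, 3}};
  popart::TensorInfo indexInfo{"UINT32", std::vector<int64_t>{1}};
  auto data  = builder->addInputTensor(dataInfo);
  auto index = builder->addInputTensor(indexInfo);

  // axes = {0}, sizes = {1}, noOverlap = 1: out = data[index:index+1, :].
  auto out = aiGraphcore.dynamicslice({data, index}, {0}, {1}, 1);

  builder->addOutputTensor(out);
  return 0;
}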

Subclassed by popart::DynamicSliceInplaceOp

Public Functions

DynamicSliceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters

OperatorIdentifier – The operator identifier of the op to be instantiated.

Returns

An instance of the required op.

virtual void growAliasModel(AliasModel&) const override

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

Pre

All input tensors of this op have mappings in aliasModel before the call to growAliasModel.

Post

All output tensors of this op have mappings in aliasModel after the call to growAliasModel.

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Translate a PopART inplacing proposal.

This translates the proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.

This method is defined as a void method which sets a value passed by reference, as opposed to a getter method, so that no Poprithms headers need to be included in this file.

Parameters
  • aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

  • inplaceId – The operator identifier to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

Public Static Functions

static inline InIndex getSliceInIndex()
class DynamicSlicePadGradOp : public popart::DynamicBaseOp

Public Functions

DynamicSlicePadGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
void setup() final
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const override
inline std::set<InIndex> optionalInputs() const override

Public Static Functions

static inline InIndex getInIndex()
class DynamicTernaryBaseInplaceOp : public popart::DynamicTernaryBaseOp

Subclassed by popart::DynamicAddInplaceOp, popart::DynamicUpdateInplaceOp

Public Functions

DynamicTernaryBaseInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
std::unique_ptr<Op> clone() const override
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
view::Regions aliases(InIndex, OutIndex) const final
view::Regions modifies(InIndex) const final
class DynamicTernaryBaseOp : public popart::DynamicBinaryBaseOp

Dynamic Ternary Base Op.

Base class for operators acting on a run-time selectable slice of a tensor. The word “ternary” refers to the fact that the operator takes three tensors as input.

See also

DynamicBaseOp for details

Subclassed by popart::DynamicAddOp, popart::DynamicTernaryBaseInplaceOp, popart::DynamicUpdateOp

Public Functions

DynamicTernaryBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

Public Static Functions

static inline InIndex getUpdateInIndex()
static inline InIndex getInIndex()
class DynamicUpdateInplaceOp : public popart::DynamicTernaryBaseInplaceOp

Public Functions

DynamicUpdateInplaceOp(const DynamicUpdateOp &dynamicUpdateOp)
DynamicUpdateInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
std::unique_ptr<Op> clone() const final
class DynamicUpdateOp : public popart::DynamicTernaryBaseOp

Dynamic Update Op.

This op takes three TensorIds as input (as indicated in graphcoreoperators.hpp):

  1. The TensorId of the tensor to be updated.

  2. The TensorId of the index of the starting point of the slice (see DynamicBaseOp for an explanation).

  3. The TensorId of the tensor to update with (its dimensions must match those implied by index, axes and sizes).

The output is the TensorId of the updated tensor.

See also

DynamicTernaryBaseOp for details.
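The update semantics can be illustrated with plain C++ on a 1-D tensor. This is a standalone sketch of the behaviour only, not part of the PopART API: it writes an updater tensor into the slice of the update tensor starting at a runtime index.

C++:

// Standalone illustration of the DynamicUpdateOp semantics in 1-D:
// data[index : index + updater.size()] = updater. Not part of PopART.
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

void dynamicUpdate1D(std::vector<float> &data,
                     const std::vector<float> &updater,
                     std::size_t index) {
  assert(index + updater.size() <= data.size());
  for (std::size_t i = 0; i < updater.size(); ++i) {
    data[index + i] = updater[i];
  }
}

int main() {
  std::vector<float> data(8, 0.0f);
  dynamicUpdate1D(data, {1.0f, 2.0f}, /*index=*/3);
  for (float v : data) {
    std::cout << v << " "; // prints: 0 0 0 1 2 0 0 0
  }
  std::cout << "\n";
  return 0;
}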

Public Functions

DynamicUpdateOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters

OperatorIdentifier – The operator identifier of the op to be instantiated.

Returns

An instance of the required op.

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

class DynamicUpdateToUpdateGradOp : public popart::DynamicBinaryBaseOp

Public Functions

DynamicUpdateToUpdateGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class DynamicUpdateUpdaterGradOp : public popart::DynamicSliceBaseOp

Public Functions

DynamicUpdateUpdaterGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class DynamicZeroGradOp : public popart::DynamicBinaryBaseOp

Public Functions

DynamicZeroGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class DynamicZeroInplaceOp : public popart::DynamicBinaryBaseInplaceOp

Public Functions

DynamicZeroInplaceOp(const DynamicZeroOp &dynamicZeroOp)
DynamicZeroInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&, TensorInfo updateInInfo_ = TensorInfo())
std::unique_ptr<Op> clone() const final
class DynamicZeroOp : public popart::DynamicBinaryBaseOp

Public Functions

DynamicZeroOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
class ElementWiseBinaryArg0GradOp : public popart::ElementWiseBinaryGradOp

Subclassed by popart::Atan2Arg0GradOp, popart::DivArg0GradOp, popart::FmodArg0GradOp, popart::MulArg0GradOp, popart::PowArg0GradOp

Public Functions

inline ElementWiseBinaryArg0GradOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_reduction_axes, const TensorInfo &_forward_op_arg_info, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
class ElementWiseBinaryArg1GradOp : public popart::ElementWiseBinaryGradOp

Subclassed by popart::Atan2Arg1GradOp, popart::DivArg1GradOp, popart::MulArg1GradOp, popart::PowArg1GradOp, popart::SubtractArg1GradOp

Public Functions

inline ElementWiseBinaryArg1GradOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_reduction_axes, const TensorInfo &_forward_op_arg_info, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
class ElementWiseBinaryBaseOp : public popart::Op

Subclassed by popart::ElementWiseBinaryInplaceLhsOp, popart::ElementWiseBinaryInplaceRhsOp, popart::ElementWiseBinaryOp

Public Functions

ElementWiseBinaryBaseOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
void setup() override
inline float getSubgraphValue() const final
inline bool canShard() const override
void growAliasModel(AliasModel&) const override
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
inline view::RegMap fwdRegMap(InIndex argIndex, OutIndex) const final
inline view::RegMap bwdRegMap(InIndex argIndex, OutIndex) const final

Public Static Functions

static inline InIndex getArg0InIndex()
static inline InIndex getArg1InIndex()
static inline OutIndex getOutIndex()
class ElementWiseBinaryGradOp : public popart::Op

Subclassed by popart::ElementWiseBinaryArg0GradOp, popart::ElementWiseBinaryArg1GradOp

Public Functions

ElementWiseBinaryGradOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_reduction_axes, const TensorInfo &_forward_op_arg_info, const Op::Settings &_settings)
virtual std::unique_ptr<Op> clone() const override = 0
void setup() final
inline const std::vector<int64_t> &getReductionAxes() const
inline float getSubgraphValue() const final
inline const std::map<int, int> &gradOutToNonGradIn() const final
inline virtual const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getFwdArg0InIndex()
static inline InIndex getFwdArg1InIndex()
static inline InIndex getFwdOutIndex()
static inline OutIndex getOutIndex()
class ElementWiseBinaryInplaceLhsOp : public popart::ElementWiseBinaryBaseOp

Subclassed by popart::AddLhsInplaceOp, popart::Atan2LhsInplaceOp, popart::MulLhsInplaceOp, popart::PowLhsInplaceOp

Public Functions

inline ElementWiseBinaryInplaceLhsOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
inline view::Regions modifies(InIndex index) const final
inline view::Regions aliases(InIndex index, OutIndex) const final
class ElementWiseBinaryInplaceRhsOp : public popart::ElementWiseBinaryBaseOp

Subclassed by popart::AddRhsInplaceOp, popart::MulRhsInplaceOp

Public Functions

inline ElementWiseBinaryInplaceRhsOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
inline view::Regions modifies(InIndex index) const final
inline view::Regions aliases(InIndex index, OutIndex) const final
class ElementWiseBinaryOp : public popart::ElementWiseBinaryBaseOp

Subclassed by popart::ElementWiseNpBroadcastableBinaryWithGradOp< AddArg0GradOp, AddArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< Atan2Arg0GradOp, Atan2Arg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< DivArg0GradOp, DivArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< MulArg0GradOp, MulArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< PowArg0GradOp, PowArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< SubtractArg0GradOp, SubtractArg1GradOp >, popart::BitwiseBinaryOp, popart::ElementWiseNpBroadcastableBinaryWithGradOp< Arg0GradOp, Arg1GradOp >, popart::FmodOp, popart::PReluOp

Public Functions

ElementWiseBinaryOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
void setInplacePriority(const OperatorIdentifier&, float)
float getInplacePriority(const OperatorIdentifier&) const
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
class ElementWiseInplaceUnaryOp : public popart::ElementWiseUnaryOp

Subclassed by popart::AsinInplaceOp, popart::AtanInplaceOp, popart::ClipInplaceOp, popart::EluInplaceOp, popart::ExpInplaceOp, popart::Expm1InplaceOp, popart::GeluErfInplaceOp, popart::GeluInplaceOp, popart::HardSigmoidInplaceOp, popart::IncrementModInplaceOp, popart::LeakyReluInplaceOp, popart::Log1pInplaceOp, popart::LogSoftmaxInplaceOp, popart::OneWayUnaryInPlaceOp, popart::ReluInplaceOp, popart::ScaleInplaceOp, popart::SeluInplaceOp, popart::ShrinkInplaceOp, popart::SigmoidInplaceOp, popart::SinhInplaceOp, popart::SoftmaxInplaceOp, popart::SoftPlusInplaceOp, popart::SoftSignInplaceOp, popart::SwishInplaceOp, popart::ThresholdedReluInplaceOp

Public Functions

inline ElementWiseInplaceUnaryOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
inline view::Regions modifies(InIndex index) const final
inline view::Regions aliases(InIndex in, OutIndex) const final
class ElementWiseNonLinearUnaryGradOp : public popart::Op

Subclassed by popart::AsinGradOp, popart::AtanGradOp, popart::CosGradOp, popart::EluGradOp, popart::ErfGradOp, popart::GeluErfGradOp, popart::GeluGradOp, popart::HardSigmoidGradOp, popart::Log1pGradOp, popart::LogGradOp, popart::ReciprocalGradOp, popart::SeluGradOp, popart::ShrinkGradOp, popart::SinGradOp, popart::SinhGradOp, popart::SoftPlusGradOp, popart::SoftSignGradOp, popart::SwishGradOp, popart::ThresholdedReluGradOp

Public Functions

ElementWiseNonLinearUnaryGradOp(const OperatorIdentifier &_opid, const ElementWiseUnaryOp &fwdOp)
std::unique_ptr<Op> clone() const override
void setup() final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline float getSubgraphValue() const final
inline bool canShard() const override

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getFwdArgInIndex()
static inline OutIndex getOutIndex()
template<class Arg0GradOp, class Arg1GradOp>
class ElementWiseNpBroadcastableBinaryWithGradOp : public popart::ElementWiseBinaryOp

Subclassed by popart::AddOp, popart::Atan2Op, popart::DivOp, popart::MulOp, popart::PowOp, popart::SubtractOp

Public Functions

inline ElementWiseNpBroadcastableBinaryWithGradOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
inline std::unique_ptr<Op> clone() const override
inline virtual std::vector<std::unique_ptr<Op>> getGradOps() final
class ElementWiseUnaryBooleanOp : public popart::Op

Subclassed by popart::IsInf, popart::IsNaN

Public Functions

ElementWiseUnaryBooleanOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
void setup() final
inline float getSubgraphValue() const override
inline bool canShard() const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class ElementWiseUnaryOp : public popart::Op

Subclassed by popart::AbsOp, popart::AsinOp, popart::AtanOp, popart::AutoLossScaleProxyOp, popart::BinaryConstScalarOp, popart::BitwiseNotOp, popart::ClipOp, popart::CosOp, popart::DetachOp, popart::ElementWiseInplaceUnaryOp, popart::EluOp, popart::ErfOp, popart::Expm1Op, popart::ExpOp, popart::GeluErfOp, popart::GeluOp, popart::HardSigmoidOp, popart::IdentityOp, popart::IncrementModOp, popart::LeakyReluOp, popart::Log1pOp, popart::LogOp, popart::LogSoftmaxOp, popart::NegateOp, popart::NopOp, popart::NotOp, popart::OneWayUnaryOp, popart::PrintTensorOp, popart::ReciprocalOp, popart::ReluOp, popart::ScaleOp, popart::SeluOp, popart::ShrinkOp, popart::SigmoidOp, popart::SinhOp, popart::SinOp, popart::SoftmaxOp, popart::SoftPlusOp, popart::SoftSignOp, popart::SqrtOp, popart::SquareOp, popart::SwishOp, popart::TanhOp, popart::ThresholdedReluOp

Public Functions

ElementWiseUnaryOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
std::unique_ptr<Op> clone() const override
void setup() final
inline float getSubgraphValue() const override
inline bool canShard() const override
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
void growAliasModel(AliasModel&) const override
inline virtual bool isIdentity() const

Returns

true, if and only if (iff) this Op is mathematically equivalent to f(x) = x. This is slightly different to canBeReplacedByIdentity; for example Detach and Identity have isIdentity overridden to return true, but still return false for canBeReplacedByIdentity.
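For an op whose forward computation is f(x) = x, the override is a one-liner. The following is a sketch of a hypothetical subclass (not an op that exists in PopART) which reports isIdentity() as true while leaving canBeReplacedByIdentity() at its default, as Detach does.

C++:

// Hypothetical pass-through op (not in PopART), for illustration only.
#include <memory>
#include <popart/op/elementwise.hpp>

class PassThroughLikeOp : public popart::ElementWiseUnaryOp {
public:
  // Reuse the (opid, settings) constructor of the base class.
  using popart::ElementWiseUnaryOp::ElementWiseUnaryOp;

  std::unique_ptr<popart::Op> clone() const final {
    return std::make_unique<PassThroughLikeOp>(*this);
  }

  // Mathematically f(x) = x, so report identity; the default
  // canBeReplacedByIdentity() is left untouched.
  bool isIdentity() const final { return true; }
};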

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class EluGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

EluGradOp(const EluOp&)
std::unique_ptr<Op> clone() const final
void appendAttributes(OpSerialiserBase&) const final
inline float alpha() const
class EluInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

EluInplaceOp(const EluOp&)
std::unique_ptr<Op> clone() const final
void appendAttributes(OpSerialiserBase&) const final
inline float alpha() const
class EluOp : public popart::ElementWiseUnaryOp

Public Functions

EluOp(const OperatorIdentifier &opid, float alpha, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
void appendAttributes(OpSerialiserBase&) const final
inline float alpha() const
class EqualOp : public popart::BinaryComparisonOp

Public Functions

EqualOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class ErfGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

ErfGradOp(const ErfOp &fwdOp)
std::unique_ptr<Op> clone() const final
class ErfOp : public popart::ElementWiseUnaryOp

Public Functions

ErfOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class ExchangeBaseOp : public popart::Op

Subclassed by popart::HostBaseOp, popart::MultiExchangeOp, popart::RemoteBaseOp, popart::RemoteCodeLoadOp

Public Functions

inline ExchangeBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
virtual std::unique_ptr<Op> clone() const override = 0
inline virtual int getNumExchanges() const
virtual ExchangeDescriptor getExchangeDescriptor(int index) const = 0

Return the exchange descriptor at the given index. A MultiExchangeOp can contain multiple descriptors, while RemoteLoad/Store and HostLoad/Store ops contain one each.

Parameters

index – Index of the exchange descriptor to return.

Returns

ExchangeDescriptor for the exchange.

inline float getSubgraphValue() const final
inline bool isOutlineable() const final
virtual std::pair<int, int> inIndexToDescriptorIndex(InIndex index) const

Get the descriptor index associated with the input index.

Parameters

index – The input index.

Returns

A pair of the descriptor index and the input index relative to that descriptor.

virtual std::pair<int, int> outIndexToDescriptorIndex(OutIndex index) const

Get the descriptor index associated with the output index.

Parameters

index – The output index.

Returns

A pair of the descriptor index and the output index relative to that descriptor.

virtual std::vector<InIndex> descriptorIndexToInIndices(int index) const

Get the input indices associated with the descriptor index.

Parameters

index – The exchange descriptor index.

Returns

The input indices associated with the descriptor index.

virtual std::vector<OutIndex> descriptorIndexToOutIndices(int index) const

Get the output indices associated with the descriptor index.

Parameters

index – The exchange descriptor index.

Returns

The output indices associated with the descriptor index.
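The relationship between flat input/output indices and per-descriptor indices can be sketched as follows. This is an illustration with hypothetical per-descriptor input counts, not the PopART implementation: flat indices are assigned to descriptors in order, and the second element of the returned pair is the index relative to that descriptor.

C++:

// Sketch of the inIndexToDescriptorIndex mapping. The per-descriptor
// input counts are hypothetical; not the PopART implementation.
#include <iostream>
#include <utility>
#include <vector>

std::pair<int, int>
inIndexToDescriptorIndex(const std::vector<int> &numInputsPerDescriptor,
                         int inIndex) {
  int descriptor = 0;
  for (int n : numInputsPerDescriptor) {
    if (inIndex < n) {
      return {descriptor, inIndex}; // relative index within descriptor
    }
    inIndex -= n;
    ++descriptor;
  }
  return {-1, -1}; // out of range
}

int main() {
  // A multi-exchange whose three descriptors take 2, 1 and 3 inputs:
  // flat input index 4 falls in descriptor 2, at relative index 1.
  auto p = inIndexToDescriptorIndex({2, 1, 3}, 4);
  std::cout << p.first << ", " << p.second << "\n"; // prints: 2, 1
  return 0;
}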

class ExpGradOp : public popart::Op

Public Functions

ExpGradOp(const ExpOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getFwdOutInIndex()
static inline OutIndex getOutIndex()
class ExpInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

ExpInplaceOp(const ExpOp&)
ExpInplaceOp(const Op::Settings &opSettings)
std::unique_ptr<Op> clone() const final
class ExpOp : public popart::ElementWiseUnaryOp

Public Functions

ExpOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class ExpandGradOp : public popart::Op

Public Functions

ExpandGradOp(const ExpandOp &op)
ExpandGradOp(const ExpandInplaceOp &op)
std::unique_ptr<Op> clone() const override
void setup() override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline std::vector<size_t> getXShape()
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getDYIndex()
static inline OutIndex getOutIndex()
class ExpandInplaceOp : public popart::ExpandOp

Public Functions

ExpandInplaceOp(const OperatorIdentifier &_opid, const Shape&, const Op::Settings &settings_)
ExpandInplaceOp(const ExpandOp &expandOp)
std::unique_ptr<Op> clone() const override
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
inline view::Regions aliases(InIndex in, OutIndex) const final
class ExpandOp : public popart::Op

Subclassed by popart::ExpandInplaceOp

Public Functions

inline ExpandOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
ExpandOp(const OperatorIdentifier &_opid, const Shape &_outShape, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
void setup() final
std::vector<std::unique_ptr<Op>> getGradOps() final
inline Shape getOutShape() const
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
inline bool canBeReplacedByIdentity() const override
void growAliasModel(AliasModel&) const override
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
inline float getSubgraphValue() const final
void connectInTensor(InIndex inIndex, TensorId tenId) final

Public Static Functions

static inline InIndex getInTensorIndex()
static inline InIndex getInShapeIndex()
static inline OutIndex getOutIndex()
class Expm1GradOp : public popart::Op

Public Functions

Expm1GradOp(const Expm1Op &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getFwdOutInIndex()
static inline OutIndex getOutIndex()
class Expm1InplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

Expm1InplaceOp(const Expm1Op&)
std::unique_ptr<Op> clone() const final
class Expm1Op : public popart::ElementWiseUnaryOp

Public Functions

Expm1Op(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class FloorInplaceOp : public popart::OneWayUnaryInPlaceOp

Public Functions

FloorInplaceOp(const FloorOp&)
std::unique_ptr<Op> clone() const final
class FloorOp : public popart::OneWayUnaryOp

Public Functions

FloorOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class FmodArg0GradOp : public popart::ElementWiseBinaryArg0GradOp

Public Functions

FmodArg0GradOp(const FmodOp &op, const std::vector<int64_t> &reductionAxes)
std::unique_ptr<Op> clone() const final
class FmodOp : public popart::ElementWiseBinaryOp

Public Functions

FmodOp(const OperatorIdentifier &opId, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class GRUGradOp : public popart::BaseOnnxRNNGradOp

Gradient operator for GRUOp.

Public Functions

GRUGradOp(const GRUOp&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::set<InIndex> optionalInputs() const final

Return the input indices of all optional inputs to the op.

Public Members

const unsigned linear_before_reset_attribute

Public Static Functions

static inline InIndex getIntermediatesInIndex()
class GRUOp : public popart::BaseOnnxRNNOp

This op applies a single-layer GRU with a non-linearity to a batch of input sequences.

The op follows the ONNX specification described in https://github.com/onnx/onnx/blob/main/docs/Operators.md#GRU

Public Functions

GRUOp(const OperatorIdentifier &_opid, nonstd::optional<int64_t> hidden_size, const std::string direction, bool linear_before_reset, const Op::Settings &settings_)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

unsigned getNumChannels() const
int64_t getNumDirections() const override
bool hasOutput(OutIndex) const
virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

bool isTraining() const
inline virtual bool isOutlineable() const override

Check if op can be outlined.

If this method returns false, it will mean that any possible subgraph that this op is part of will not be cached.

Returns

true if the op can be outlined, false otherwise. Default: true.

inline std::string getDirectionAttribute() const
inline int getLinearBeforeResetAttribute() const

Public Static Functions

static inline OutIndex getInitialHPassThroughIndex()
static inline OutIndex getIntermediatesPassThroughIndex()
static inline OutIndex getInputWeightsPassThroughIndex()
static inline OutIndex getRecurrenceWeightsPassThroughIndex()
static inline OutIndex getBiasesPassThroughIndex()
class GatherGradOp : public popart::Op

Subclassed by popart::TiedGatherGradOp

Public Functions

GatherGradOp(const GatherOp &op, int64_t axis, int64_t group_size)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
int64_t getAxis() const
int64_t getGroupSize() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
inline int getInBatchAxis(InIndex i) const override
inline int getOutBatchAxis(OutIndex) const override
inline bool canShard() const override
inline nonstd::optional<float> getAvailableMemoryProportion() const
inline void setAvailableMemoryProportion(const nonstd::optional<float> v)

Public Static Functions

static inline InIndex gradInIndex()
static inline InIndex indicesInIndex()
static inline OutIndex gradOutIndex()
class GatherOp : public popart::Op

Subclassed by popart::TiedGatherOp

Public Functions

GatherOp(const OperatorIdentifier &_opid, int64_t axis_, int64_t group_size_, const Op::Settings &settings_, const nonstd::optional<float> &available_memory_proportion_ = nonstd::nullopt, bool zeroOutOfRangeIndices_ = false)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() override
void setup() final
int64_t getAxis() const
int64_t getGroupSize() const
void appendOutlineAttributes(OpSerialiserBase&) const override
bool canBeReplacedByIdentity() const override
inline float getSubgraphValue() const override
inline bool canShard() const override
inline nonstd::optional<float> getAvailableMemoryProportion() const
inline void setAvailableMemoryProportion(const nonstd::optional<float> v)
inline bool zeroOutOfRangeIndices() const

Public Static Functions

static inline InIndex dataInIndex()
static inline InIndex indicesInIndex()
static inline OutIndex outIndex()
class GeluGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

GeluGradOp(const GeluOp&)
std::unique_ptr<Op> clone() const final
class GeluInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

GeluInplaceOp(const GeluOp&)
GeluInplaceOp(const Op::Settings &opSettings)
std::unique_ptr<Op> clone() const final
class GeluOp : public popart::ElementWiseUnaryOp

Public Functions

GeluOp(const OperatorIdentifier &opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class GeluErfGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

GeluErfGradOp(const GeluErfOp&)
std::unique_ptr<Op> clone() const final
class GeluErfInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

GeluErfInplaceOp(const GeluErfOp&)
GeluErfInplaceOp(const Op::Settings &opSettings)
std::unique_ptr<Op> clone() const final
class GeluErfOp : public popart::ElementWiseUnaryOp

Public Functions

GeluErfOp(const OperatorIdentifier &opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class GetRandomSeedOp : public popart::Op

Public Functions

GetRandomSeedOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final
inline InIndex getSeedInIndex() const override
inline int getOutBatchAxis(OutIndex) const override
inline float getSubgraphValue() const final
inline bool isOutlineable() const final
view::Regions aliases(InIndex, OutIndex) const final
view::Regions modifies(InIndex) const final
inline void growAliasModel(AliasModel &m) const override

Public Static Functions

static inline OutIndex getUpdatedSeedOutIndex()
static inline TensorId getStreamedSeedTensorId()
static inline TensorId getUpdatedSeedTensorId()
class GlobalAveragePoolGradOp : public popart::Op

Public Functions

GlobalAveragePoolGradOp(const GlobalAveragePoolOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Members

const Shape creatorSpatialK
const Shape creatorStrides
const Shape creatorLowerPads
const Shape creatorUpperPads

Public Static Functions

static inline InIndex getPrePooledInIndex()
static inline InIndex getPooledInIndex()
static inline InIndex getGradPooledInIndex()
static inline OutIndex getOutIndex()
class GlobalAveragePoolOp : public popart::Op

Public Functions

GlobalAveragePoolOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() override
std::vector<std::unique_ptr<Op>> getGradOps() final
inline Shape getSpatialK() const
Shape getStrides() const
Shape getLowerPads() const
Shape getUpperPads() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class GlobalMaxPoolGradOp : public popart::Op

Public Functions

GlobalMaxPoolGradOp(const GlobalMaxPoolOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Members

const Shape creatorSpatialK
const Shape creatorStrides
const Shape creatorLowerPads
const Shape creatorUpperPads

Public Static Functions

static inline InIndex getPrePooledInIndex()
static inline InIndex getPooledInIndex()
static inline InIndex getGradPooledInIndex()
static inline OutIndex getOutIndex()
class GlobalMaxPoolOp : public popart::Op

Public Functions

GlobalMaxPoolOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
void setup() override
std::vector<std::unique_ptr<Op>> getGradOps() final
inline Shape getSpatialK() const
Shape getStrides() const
Shape getLowerPads() const
Shape getUpperPads() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class GreaterOp : public popart::BinaryComparisonOp

Public Functions

GreaterOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class GroupNormGradOp : public popart::Op

Public Functions

GroupNormGradOp(const GroupNormOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
inline float getEpsilon() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
inline bool canShard() const override

Public Static Functions

static inline InIndex getXInIndex()
static inline InIndex getScaleInIndex()
static inline InIndex getMeanInIndex()
static inline InIndex getInvStdDevInIndex()
static inline InIndex getYGradInIndex()
static inline OutIndex getXGradOutIndex()
static inline OutIndex getScaleOutIndex()
static inline OutIndex getBOutIndex()
class GroupNormOp : public popart::Op

Public Functions

GroupNormOp(const OperatorIdentifier &opid_, int64_t num_groups_, float epsilon_, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getEpsilon() const
inline int64_t getNumGroups() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline bool isNorm() const override
inline float getSubgraphValue() const final
inline bool canShard() const override
bool canBeReplacedByIdentity() const final

Public Static Functions

static inline InIndex getXInIndex()
static inline InIndex getScaleInIndex()
static inline InIndex getBInIndex()
static inline OutIndex getYOutIndex()
static inline OutIndex getMeanOutIndex()
static inline OutIndex getInvStdDevOutIndex()
class HardSigmoidGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

HardSigmoidGradOp(const HardSigmoidOp&)
std::unique_ptr<Op> clone() const final
void appendAttributes(OpSerialiserBase&) const override
inline float getAlpha() const
inline float getBeta() const
class HardSigmoidInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

HardSigmoidInplaceOp(const HardSigmoidOp&)
std::unique_ptr<Op> clone() const final
void appendAttributes(OpSerialiserBase&) const override
inline float getAlpha() const
inline float getBeta() const
class HardSigmoidOp : public popart::ElementWiseUnaryOp

Public Functions

HardSigmoidOp(const OperatorIdentifier &opid, float _alpha, float _beta, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
void appendAttributes(OpSerialiserBase&) const override
inline float getAlpha() const
inline float getBeta() const
class HasReceptiveFieldOp : public popart::Op

Subclassed by popart::AveragePoolOp, popart::MaxPoolOp

Public Functions

HasReceptiveFieldOp(const OperatorIdentifier &_opid, const HasReceptiveFieldOp::ReceptiveOpAttributes &attributes, const Op::Settings &settings)
virtual std::unique_ptr<Op> clone() const override = 0
int getNSpatialDims() const
int64_t getBatchSize() const
int64_t getNInChans() const
virtual Shape getSpatialK() const = 0
Shape getStrides() const
inline Shape getLowerPads() const
inline Shape getUpperPads() const
inline Shape getLowerOutPads() const
inline Shape getUpperOutPads() const
Shape getPads() const
Shape getOutPads() const
Shape getDilations() const
Shape getInDilations() const
std::string getAutoPadStr(const AutoPad &x) const
std::vector<int64_t> getSpatialD() const
std::vector<int64_t> getSpatialO() const
void setup() override
virtual int64_t getNOutChans() const = 0
std::vector<int64_t> lowerPads() const
std::vector<int64_t> upperPads() const
std::vector<int64_t> lowerOutPads() const
std::vector<int64_t> upperOutPads() const
std::vector<size_t> spatialD_szt() const
std::vector<size_t> spatialK_szt() const
std::vector<uint32_t> lowerPads_u32() const
std::vector<uint32_t> upperPads_u32() const
std::vector<int> lowerPads_i32() const
std::vector<int> upperPads_i32() const
std::vector<uint32_t> dilations_u32() const
std::vector<uint32_t> strides_u32() const
void appendOutlineAttributes(OpSerialiserBase&) const override

Public Members

const std::vector<int64_t> basePads
const std::vector<int64_t> baseOutPads
const std::vector<int64_t> baseStrides
const std::vector<int64_t> baseDilations
const std::vector<int64_t> baseInDilations
const AutoPad padType
const bool ceilMode

Public Static Functions

static AutoPad getAutoPad(const std::string &autoPadStr)
static void alterPads(Shape &pads_, Shape OutShape_, Shape spatialD_, Shape spatialK_, std::vector<int64_t> strides_)
static std::vector<int64_t> lowerPads(Shape pads, int nSpatialDims, AutoPad padType)
static std::vector<int64_t> upperPads(Shape pads, int nSpatialDims, AutoPad padType)
static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
static Shape getSpatialOutShape(Shape spatialD_, Shape spatialK_, std::vector<int64_t> pads_, std::vector<int64_t> outPads_, std::vector<int64_t> strides_, std::vector<int64_t> dilations_, std::vector<int64_t> inDilations_, AutoPad auto_pad_, bool ceil_mode_ = false)
struct ReceptiveOpAttributes

Public Functions

void setFromAttributes(const Attributes &attributes)

Public Members

std::vector<int64_t> pads
std::vector<int64_t> outPads
std::vector<int64_t> strides
std::vector<int64_t> dilations
std::vector<int64_t> inDilations
std::string auto_pad
int64_t ceil_mode = 0
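For reference, the output spatial extent produced by getSpatialOutShape() follows the usual receptive-field arithmetic. The following is a standalone sketch of that arithmetic for a single spatial dimension, under the assumption that PopART uses the standard formula; it is not the PopART implementation itself.

C++:

// out = round((D + padLower + padUpper - dilation*(K - 1) - 1) / stride) + 1,
// rounding down normally and up when ceil_mode is set. Standalone sketch,
// not the PopART implementation.
#include <cstdint>
#include <iostream>

int64_t spatialOutDim(int64_t D, int64_t K, int64_t padLower,
                      int64_t padUpper, int64_t stride, int64_t dilation,
                      bool ceilMode) {
  const int64_t numerator = D + padLower + padUpper - dilation * (K - 1) - 1;
  return (ceilMode ? (numerator + stride - 1) / stride : numerator / stride) + 1;
}

int main() {
  // A 2x2 max pool with stride 2 over a 5-wide input, no padding:
  std::cout << spatialOutDim(5, 2, 0, 0, 2, 1, false) << "\n"; // 2
  std::cout << spatialOutDim(5, 2, 0, 0, 2, 1, true) << "\n";  // 3
  return 0;
}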
class HistogramOp : public popart::Op

Public Functions

HistogramOp(const OperatorIdentifier &_opid, const std::vector<float> &levels_, const bool absoluteOfInput_, const Op::Settings &settings_)
void setup() final
inline float getSubgraphValue() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
std::unique_ptr<Op> clone() const override
inline std::vector<float> getLevels() const
inline bool getAbsoluteOfInput() const

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class HostBaseOp : public popart::ExchangeBaseOp

Subclassed by popart::HostLoadOp, popart::HostStoreOp

Public Functions

inline HostBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, TensorId sid_)
void appendOutlineAttributes(OpSerialiserBase&) const override
inline bool canShard() const final
virtual std::unique_ptr<Op> clone() const override = 0
inline bool hasSideEffect() const override
inline void setHostStreamTensorId(TensorId stream_id_)
inline TensorId getHostStreamTensorId() const

Public Static Functions

static inline InIndex getLocalTensorInIndex()
class HostLoadInplaceOp : public popart::HostLoadOp

Public Functions

HostLoadInplaceOp(const OperatorIdentifier&, const Op::Settings&, TensorId sid_)
HostLoadInplaceOp(const HostLoadOp&)
std::unique_ptr<Op> clone() const override
void setup() final
view::Regions modifies(InIndex) const override
view::Regions aliases(InIndex, OutIndex) const override
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
ExchangeDescriptor getExchangeDescriptor(int index) const final
class HostLoadOp : public popart::HostBaseOp

Host Load Op: an op to represent the transfer of data from the host to the device.

It uses the existing host to device transfers created when building the IR, but defers the actual poplar::Copy until the op itself runs. This allows the copy to be scheduled as part of the normal op scheduling.

There is a stage in the IR which adds the following ops:

Device :: InitOp -> input_prehostload -> HostLoadOp -> input -> etc…
Host :: data -> stream

Subclassed by popart::HostLoadInplaceOp

Public Functions

HostLoadOp(const OperatorIdentifier&, const Op::Settings&, TensorId sid_)
virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() override

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

Determine whether output tensors are guaranteed to have an equal value across all replicas.

This means that they are “replica equal”. The check is based on information about the replica equal status of input tensors (and the same for any inputs that are modified by the op).

The default implementation sets each output tensor as being replica-equal if and only if all tensor inputs are replica-equal. For modified inputs, the default is to assume it is replica-equal only if there is an output that is deemed replica-equal that fully aliases all elements of the input. This default implementation is not correct for all ops. Ops that need a specialized implementation should override this virtual function.

Parameters
  • aliasModel – An alias model object.

  • inputMap – A map that stores, for each input, whether the inputs are data-equivalent over all replicas.

  • proxy – A helper object passed in by the replica-equal analysis.

Returns

A tuple comprising of:

  1. a mapping from output index to a replica-equal status with an entry for each output tensor.

  2. a vector of input indices for inputs that were modified by the op to a value that is not replica-equal.
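A minimal sketch of the default rule described above, covering the first element of the returned tuple and using std::map<InIndex, bool> as a stand-in for PopART's internal ReplEq* map types (an assumption for illustration):

C++:

#include <map>

using InIndex = int; // stand-in for popart::InIndex

// Default propagation rule (sketch): every output is marked replica-equal
// if and only if all inputs are replica-equal.
bool allInputsReplicaEqual(const std::map<InIndex, bool> &inputMap) {
  for (const auto &entry : inputMap) {
    if (!entry.second) {
      return false;
    }
  }
  return true;
}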

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function does not check for anchor violations or topological order violations. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters

OperatorIdentifier – The operator identifier of the op to be instantiated.

Returns

An instance of the required op.

virtual void growAliasModel(AliasModel &m) const final

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

Pre

All input tensors of this op have mappings in aliasModel before this method is called.

Post

All output tensors of this op have mappings in aliasModel after this method is called.

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const final

Translate a PopART inplacing proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.

Parameters
  • aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

  • inplaceId – The operator identifier to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

ExchangeDescriptor getExchangeDescriptor(int index) const override

Public Static Functions

static inline OutIndex getLocalTensorOutIndex()
class HostStoreOp : public popart::HostBaseOp

Public Functions

HostStoreOp(const OperatorIdentifier&, const Op::Settings&, TensorId sid_)
std::unique_ptr<Op> clone() const override
void setup() final
ExchangeDescriptor getExchangeDescriptor(int index) const final
class IdentityGradOp : public popart::IdentityOp

Public Functions

IdentityGradOp(const IdentityOp &fwdOp)
IdentityGradOp(const Settings &settings_)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class IdentityInplaceOp : public popart::IdentityOp

Public Functions

IdentityInplaceOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
IdentityInplaceOp(const IdentityOp &concatOp)
std::unique_ptr<Op> clone() const override
inline view::Regions aliases(InIndex in, OutIndex) const final
inline bool isInplaceViewChange() const override
class IdentityLossGradOp : public popart::Op

Public Functions

IdentityLossGradOp(const IdentityLossOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
bool canBeReplacedByIdentity() const override
inline ReductionType getReductionType() const
inline float getSubgraphValue() const final
inline bool canShard() const override
float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class IdentityLossOp : public popart::LossOp

Public Functions

IdentityLossOp(const OperatorIdentifier &_opid, const ReductionType &reduction, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
bool canBeReplacedByIdentity() const override
inline float getSubgraphValue() const final
inline bool canShard() const override
inline ReductionType getShardReductionType(OutIndex index) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class IdentityOp : public popart::ElementWiseUnaryOp

Subclassed by popart::AddBiasDataGradOp, popart::IdentityGradOp, popart::IdentityInplaceOp, popart::IfConditionGradOp

Public Functions

IdentityOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
inline bool isIdentity() const final
inline bool isOutplaceViewChange() const override
class IfConditionGradOp : public popart::IdentityOp

Public Functions

IfConditionGradOp(const IfOp&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class IfGradOp : public popart::IfOp

Public Functions

IfGradOp(const IfOp&, const std::vector<GradInOutMapper> &gradInInfo, const BranchInfo &thenBranchInfo, const BranchInfo &elseBranchInfo)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class IfOp : public popart::Op

Subclassed by popart::IfGradOp

Public Functions

IfOp(const OperatorIdentifier&, const BranchInfo &thenBranchInfo, const BranchInfo &elseBranchInfo, const Op::Settings&)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
Graph &getThenGraph() const
Graph &getElseGraph() const
const std::map<InIndex, InIndex> &getBranchInIndicesMap(const Graph&) const
const std::map<OutIndex, OutIndex> &getBranchOutIndicesMap(const Graph&) const
inline float getSubgraphValue() const final
std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
std::vector<const Graph*> getCalledGraphs() const override
virtual InIndex opInToSubgraphInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override
virtual InIndex subgraphInToOpInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override
virtual OutIndex opOutToSubgraphOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override
virtual OutIndex subgraphOutToOpOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override
std::set<OutIndex> opInToOpOutIndex(InIndex in) const override
std::set<InIndex> opOutToOpInIndex(OutIndex out) const override
float calcAutoVirtualGraphCost(std::set<int> &inputs_seen) override
virtual void setCalledSubgraphGradInfo(const FwdGraphToBwdGraphInfo &calledGraphsGradInfo) override

Public Static Functions

static inline InIndex getConditionInIndex()
class IncrementModInplaceOp : public popart::ElementWiseInplaceUnaryOp

Increment Modulo Op.

This Op takes one Tensor as input: the Tensor to increment (modulo). It has two attributes:

  1. increment - how much to increment the input tensor by (const scalar)

  2. modulus - the modulo operand (const scalar)

The output is the tensor x = (x + increment) % modulus.

Inplace - the result is mapped back to the input Tensor.

See also

graphcoreoperators.hpp

Public Functions

IncrementModInplaceOp(double increment_, double modulus_, const Op::Settings &settings)
IncrementModInplaceOp(const IncrementModOp&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline double getIncrement() const
inline double getModulus() const
class IncrementModOp : public popart::ElementWiseUnaryOp

Increment Modulo Op.

This Op takes one Tensor as input: the Tensor to increment (modulo). It has two attributes:

  1. increment - how much to increment the input tensor by (const scalar)

  2. modulus - the modulo operand (const scalar)

The output is the tensor y = (x + increment) % modulus.

See also

graphcoreoperators.hpp
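A host-side sketch of the element-wise semantics described above (illustrative only; the real op runs on the IPU and its exact treatment of signs may differ from std::fmod):

C++:

#include <cmath>
#include <cstddef>
#include <vector>

// y[i] = (x[i] + increment) % modulus, element-wise.
std::vector<float> incrementMod(const std::vector<float> &x, double increment,
                                double modulus) {
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = static_cast<float>(std::fmod(x[i] + increment, modulus));
  }
  return y;
}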

Public Functions

IncrementModOp(const OperatorIdentifier &opId, double increment_, double modulus_, const Op::Settings &settings)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline double getIncrement() const
inline double getModulus() const
class InitOp : public popart::Op

Public Functions

InitOp(const OperatorIdentifier&, const TensorInfo&, const TensorType&, const InitType&, const Op::Settings&, const int = -1)
std::unique_ptr<Op> clone() const final
void setup() final
inline TensorInfo getTensorInfo() const
inline TensorType getTensorType() const
inline InitType getInitType() const
inline float getSubgraphValue() const final
inline bool isOutlineable() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline int getOutBatchAxis(OutIndex) const override
inline bool canShard() const override

Public Static Functions

static inline OutIndex getOutIndex()
class InstanceNormGradOp : public popart::Op

Public Functions

InstanceNormGradOp(const InstanceNormOp &fwd_op)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInputInIndex()
static inline InIndex getScaleInIndex()
static inline InIndex getOutGradInIndex()
static inline InIndex getMeanInIndex()
static inline InIndex getInvStdDevInIndex()
static inline OutIndex getInputOutIndex()
static inline OutIndex getScaleOutIndex()
static inline OutIndex getBOutIndex()
class InstanceNormOp : public popart::Op

Public Functions

InstanceNormOp(const OperatorIdentifier &_opid, float _epsilon, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getEpsilon() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline bool isNorm() const override
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInputInIndex()
static inline InIndex getScaleInIndex()
static inline InIndex getBInIndex()
static inline OutIndex getOutIndex()
static inline OutIndex getMeanOutIndex()
static inline OutIndex getInvStdDevOutIndex()
class IoTileCopyOp : public popart::Op

Public Functions

IoTileCopyOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final
inline float getSubgraphValue() const final
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const final
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const final
inline bool canShard() const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class IsInf : public popart::ElementWiseUnaryBooleanOp

Public Functions

IsInf(const OperatorIdentifier &_opid, const Op::Settings&)
std::unique_ptr<Op> clone() const override

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)
class IsNaN : public popart::ElementWiseUnaryBooleanOp

Public Functions

IsNaN(const OperatorIdentifier &_opid, const Op::Settings&)
std::unique_ptr<Op> clone() const override

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)
class L1GradOp : public popart::Op

Public Functions

L1GradOp(const L1Op&)
L1GradOp(const float lambda_, const ReductionType reduction_, const Op::Settings &settings_)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
inline float getSubgraphValue() const final
inline float getLambda() const
inline ReductionType getReductionType() const
inline bool canShard() const override
float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const override

Public Static Functions

static inline InIndex getFwdActInIndex()
static inline InIndex getGradInIndex()
static inline OutIndex getOutIndex()
class L1Op : public popart::LossOp

Public Functions

L1Op(const OperatorIdentifier &_opid, const float lambda_, const ReductionType reduction_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getSubgraphValue() const final
inline float getLambda() const
inline bool canShard() const override
inline ReductionType getShardReductionType(OutIndex index) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class LRNGradOp : public popart::Op

Public Functions

LRNGradOp(const LRNOp&)
std::unique_ptr<Op> clone() const final
void setup() final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline float getAlpha() const
inline float getBeta() const
inline float getBias() const
inline int64_t getSize() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()
static inline InIndex getFwdInInIndex()
static inline OutIndex getOutIndex()
class LRNOp : public popart::Op

Public Functions

LRNOp(const OperatorIdentifier &_opid, float _alpha, float _beta, float _bias, int64_t _size, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getSubgraphValue() const final
inline float getAlpha() const
inline float getBeta() const
inline float getBias() const
inline int64_t getSize() const
void appendOutlineAttributes(OpSerialiserBase&) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class LSTMGradOp : public popart::BaseOnnxRNNGradOp

Gradient operator for LSTM op.

Public Functions

LSTMGradOp(const LSTMOp&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual const std::map<int, int> &gradOutToNonGradIn() const final

Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

bool hasLastCellStateGradInput() const
virtual std::set<InIndex> optionalInputs() const final

Return the input indices of all optional inputs to the op.

Public Members

const bool hasInitialCInput
const std::string fwd_debug_name
const ActivationFunction activation
const ActivationFunction recurrent_activation

Public Static Functions

static inline InIndex getInitialCInIndex()
static inline InIndex getIntermediatesInIndex()
static inline InIndex getLastCellStateGradInIndex()
static inline OutIndex getInitialCOutIndex()
class LSTMOp : public popart::BaseOnnxRNNOp

This op applies a single-layer LSTM with a non-linearity to a batch of input sequences.

The op follows the ONNX specification described in https://github.com/onnx/onnx/blob/main/docs/Operators.md#LSTM
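For reference, a summary of the ONNX LSTM recurrence this op follows (peepholes omitted and the two bias terms combined into b; here f is the recurrent_activation applied to the gates, and g and h the activation, matching the constructor arguments):

i_t = f(X_t W_i^T + H_{t-1} R_i^T + b_i)
f_t = f(X_t W_f^T + H_{t-1} R_f^T + b_f)
c_t = g(X_t W_c^T + H_{t-1} R_c^T + b_c)
C_t = f_t (.) C_{t-1} + i_t (.) c_t
o_t = f(X_t W_o^T + H_{t-1} R_o^T + b_o)
H_t = o_t (.) h(C_t)

where (.) denotes element-wise multiplication.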

Public Functions

LSTMOp(const OperatorIdentifier &_opid, nonstd::optional<int64_t> hidden_size, ActivationFunction activation, ActivationFunction recurrent_activation, const Op::Settings &settings_, const nonstd::optional<float> available_memory_proportion_)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

unsigned getNumChannels() const
nonstd::optional<float> getAvailableMemoryProportion() const
bool hasInitialCInput() const
bool hasOutput(OutIndex) const
virtual std::set<InIndex> optionalInputs() const final

Return the input indices of all optional inputs to the op.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

bool isTraining() const
inline virtual bool isOutlineable() const override

Check if op can be outlined.

If this method returns false, it will mean that any possible subgraph that this op is part of will not be cached.

Returns

true if the op can be outlined, false otherwise. Default: true.

virtual int getInBatchAxis(InIndex) const override

Get the batch axis for the input index.

Returns

The batch axis for the input index.

virtual int getOutBatchAxis(OutIndex) const override

Get the batch axis for the output index.

Returns

The batch axis for the output index.

inline ActivationFunction getActivation() const
inline ActivationFunction getRecurrentActivation() const

Public Static Functions

static inline InIndex getInitialCInIndex()
static inline InIndex getPeepholeInIndex()
static inline OutIndex getLastCellStateOutIndex()
static inline OutIndex getInitialHPassThroughIndex()
static inline OutIndex getInitialCPassThroughIndex()
static inline OutIndex getIntermediatesPassThroughIndex()
static inline OutIndex getInputWeightsPassThroughIndex()
static inline OutIndex getRecurrenceWeightsPassThroughIndex()
static inline OutIndex getBiasesPassThroughIndex()
class LambSquareOp : public popart::Op

Public Functions

LambSquareOp(const Op::Settings&)
std::unique_ptr<Op> clone() const final
void setup() final
inline float getSubgraphValue() const final
inline bool isOptimizerOp() const override
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final
void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, CommGroup shardingDomain) final
void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, const ReplicaGrouping &grouping) final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class LeakyReluGradOp : public popart::Op, public popart::LeakyReluOpBaseAttributes

Public Functions

LeakyReluGradOp(const LeakyReluOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
void appendAttributes(popart::OpSerialiserBase &os) const override
void appendOutlineAttributes(popart::OpSerialiserBase &os) const override
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getLeakyReluInIndex()
static inline InIndex getGradLeakyReluInIndex()
static inline OutIndex getOutIndex()
class LeakyReluInplaceOp : public popart::ElementWiseInplaceUnaryOp, public popart::LeakyReluOpBaseAttributes

Public Functions

LeakyReluInplaceOp(const LeakyReluOp&)
std::unique_ptr<Op> clone() const final
void appendAttributes(popart::OpSerialiserBase &os) const override
void appendOutlineAttributes(popart::OpSerialiserBase &os) const override
class LeakyReluOp : public popart::ElementWiseUnaryOp, public popart::LeakyReluOpBaseAttributes

Public Functions

LeakyReluOp(const OperatorIdentifier &_opid, float _alpha, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
void appendAttributes(popart::OpSerialiserBase &os) const override
void appendOutlineAttributes(popart::OpSerialiserBase &os) const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class LessOp : public popart::BinaryComparisonOp

Public Functions

LessOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class LinearVariadicGradOp : public popart::VariadicGradOp

Subclassed by popart::MeanArgGradOp, popart::SumArgGradOp

Public Functions

LinearVariadicGradOp(const OperatorIdentifier &_opid, const VariadicOp&, InIndex)
std::unique_ptr<Op> clone() const override
inline virtual bool hasScale() const
inline virtual float getScale() const
class Log1pGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

Log1pGradOp(const Log1pOp&)
std::unique_ptr<Op> clone() const final
class Log1pInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

Log1pInplaceOp(const Log1pOp&)
std::unique_ptr<Op> clone() const final
class Log1pOp : public popart::ElementWiseUnaryOp

Public Functions

Log1pOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class LogGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

LogGradOp(const LogOp &fwdOp)
std::unique_ptr<Op> clone() const final
class LogOp : public popart::ElementWiseUnaryOp

Public Functions

LogOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class LogSoftmaxGradOp : public popart::Op

Public Functions

LogSoftmaxGradOp(const LogSoftmaxOp&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
inline int64_t getAxis() const
void appendOutlineAttributes(OpSerialiserBase&) const final
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradProbsInIndex()
static inline InIndex getActsInIndex()
static inline OutIndex getOutIndex()
class LogSoftmaxInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

LogSoftmaxInplaceOp(const LogSoftmaxOp&)
std::unique_ptr<Op> clone() const final
inline int64_t getAxis() const
void appendOutlineAttributes(OpSerialiserBase&) const override
class LogSoftmaxOp : public popart::ElementWiseUnaryOp

Public Functions

LogSoftmaxOp(const OperatorIdentifier &_opid, int64_t axis, const Op::Settings &settings_)
std::vector<std::unique_ptr<Op>> getGradOps() final
std::unique_ptr<Op> clone() const final
int64_t getAxis() const
void appendOutlineAttributes(OpSerialiserBase&) const final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class LoopOp : public popart::SubgraphOp

Public Functions

LoopOp(const OperatorIdentifier&, const Op::Settings&, Graph &callee_)
LoopOp(const OperatorIdentifier&, const Op::Settings&, Graph &callee_, int numImplicitScanOutputs_)
void setup() final
void appendOutlineAttributes(OpSerialiserBase&) const override
void connectInTensor(InIndex inIndex, TensorId tensorId) final
inline float getSubgraphValue() const final
std::unique_ptr<Op> clone() const override
std::vector<const Graph*> getCalledGraphs() const final
std::vector<TensorId> implicitInputTensors() const
Graph &getCalledGraph() const override
void setCalledGraph(Graph&) override
inline int getTripCountValue() const
inline void setTripCountValue(int value)
int getNumExplicitInputs() const
int getNumImplicitInputs() const
inline int getNumImplicitScanOutputs()
inline void setNumImplicitScanOutputs(int numOutputs)
InIndex subgraphInToOpInIndex(InIndex index) const override
InIndex opInToSubgraphInIndex(InIndex index) const override
OutIndex subgraphOutToOpOutIndex(OutIndex index) const override
OutIndex opOutToSubgraphOutIndex(OutIndex index) const override
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const override
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const override
void addLoopInput(InIndex index, TensorId tensorId, TensorId subgraphTensorId, bool overwrite)

Add a variadic input to the loop operator.

Parameters
  • index – The position at which a Tensor is consumed by the Op.

  • tensorId – The id of the tensor to add as an input.

  • subgraphTensorId – The id of the tensor that will be created in the subgraph.

  • overwrite – If true, the original tensor at index will be replaced.
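A hypothetical usage sketch (loopOp and the tensor ids are assumptions for illustration):

C++:

// Add a user-defined input at the first user input slot, creating the
// matching tensor in the loop body graph.
InIndex index = LoopOp::getFirstInputInIndex();
loopOp.addLoopInput(index,
                    "data",          // tensor id in the calling graph
                    "loopBody/data", // tensor id to create in the subgraph
                    /*overwrite=*/false);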

void addLoopOutput(OutIndex index, TensorId tensorId, TensorId subgraphTensorId, bool overwrite)
void removeLoopInput(InIndex index)
void removeLoopOutput(OutIndex index)
inline void growAliasModel(AliasModel &m) const override
std::set<OutIndex> opInToOpOutIndex(InIndex in) const override
std::set<InIndex> opOutToOpInIndex(OutIndex out) const override

Public Static Functions

static inline InIndex getMaximumTripCountInIndex()

Indexing on the LoopOp.

Returns

The LoopOp input index for the maximum number of loop iterations

static inline InIndex getTerminationConditionInIndex()

Indexing on the LoopOp.

Returns

The LoopOp input index specifying the termination condition status

static inline InIndex getFirstInputInIndex()

Indexing on the LoopOp.

Returns

The first regular, user-defined LoopOp input index

static inline OutIndex getFirstOutputOutIndex()

Indexing on the LoopOp.

Returns

The first regular, user-defined LoopOp output index

static inline InIndex getLoopGraphIterationInIndex()

Indexing on the body graph.

Returns

The loop body graph input index specifying the current loop iteration

static inline InIndex getLoopGraphTerminationConditionInIndex()

Indexing on the body graph.

Returns

The loop body graph input index specifying the current termination condition status

static inline InIndex getLoopGraphFirstInputInIndex()

Indexing on the body graph.

Returns

The first regular, user-defined loop body graph input index

static inline OutIndex getLoopGraphTerminationConditionOutIndex()

Indexing on the body graph.

Returns

The loop body graph output index for the termination condition status after the loop body graph has run

static inline OutIndex getLoopGraphFirstOutputOutIndex()

Indexing on the body graph.

Returns

The first regular, user-defined loop body graph output index
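To make the two indexing schemes concrete, a short hypothetical sketch using only the accessors above (loopOp is an assumption for illustration):

C++:

// Op-side indices:
InIndex tripCountIn = LoopOp::getMaximumTripCountInIndex();
InIndex condIn      = LoopOp::getTerminationConditionInIndex();
InIndex firstUserIn = LoopOp::getFirstInputInIndex();

// A user-defined op input maps to a body graph input, and back:
InIndex bodyIn = loopOp.opInToSubgraphInIndex(firstUserIn);
InIndex opIn   = loopOp.subgraphInToOpInIndex(bodyIn); // == firstUserIn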

class LossOp : public popart::Op

Subclassed by popart::CtcOp, popart::IdentityLossOp, popart::L1Op, popart::NllOp

Public Functions

LossOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const ReductionType reduction_)
virtual std::unique_ptr<Op> clone() const override = 0
bool isLossOp() const override
inline ReductionType getReductionType() const

Public Static Functions

static std::string reductionTypeToString(ReductionType reduction)
static ReductionType reductionTypeFromString(std::string reduction)
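A small usage sketch of the string helpers (ReductionType::Mean assumes PopART's ReductionType enum):

C++:

std::string s   = LossOp::reductionTypeToString(ReductionType::Mean);
ReductionType r = LossOp::reductionTypeFromString(s); // round-trips to Mean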
class LossScaleUpdateOp : public popart::Op

Public Functions

inline LossScaleUpdateOp(const OperatorIdentifier &_opid, const DataType &updateFactorDType_, const Op::Settings &settings_)
void setup() final
inline float getSubgraphValue() const final
std::unique_ptr<Op> clone() const override
inline DataType getUpdateFactorDType() const
view::Regions aliases(InIndex in, OutIndex) const override
view::Regions modifies(InIndex) const override
void growAliasModel(AliasModel &m) const override

Public Static Functions

static inline InIndex getLossScaleUpdateFactorInIndex()
static inline InIndex getStatisticsTensorInIndex()
static inline OutIndex getUpdatedLossScaleUpdateFactorOutIndex()
class MatMulBaseGradOp : public popart::MatMulBaseOp

Subclassed by popart::MatMulLhsGradOp, popart::MatMulRhsGradOp

Public Functions

MatMulBaseGradOp(const OperatorIdentifier &_opid, const MatMulOp &fwdOp, Phase phase)
MatMulBaseGradOp(const MatMulBaseGradOp&) = default
~MatMulBaseGradOp() override = default
virtual std::unique_ptr<Op> clone() const override = 0
const MatMulOp *getCloneOfCreator() const
inline float getSubgraphValue() const override
class MatMulBaseOp : public popart::Op

The matmul op supports inputs of IR datatype FLOAT8_143 and FLOAT8_152.

Such inputs are a special case because they require an additional scalar INT32 tensor input known as the log2Scale. This argument may be used if and only if both matmul operands are one of the FLOAT8_* types.

If the matmul has valid FLOAT8 and log2Scale inputs, then the matmul is considered a ‘pow2 scaled matmul’: an operation of the form result := A @ B * 2^(log2Scale), where @ is the matrix multiply op. In this case, the output and partials type must be FLOAT16. Note that the multiplication by 2^(log2Scale) is handled by Poplar and is not listed as an Op in the IR.
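A host-side reference for the pow2 scaled matmul semantics above (a sketch only; on the IPU the operands are FLOAT8, the output FLOAT16, and the scaling happens inside Poplar):

C++:

#include <cmath>
#include <vector>

// result = (A @ B) * 2^log2Scale for row-major A (m x k) and B (k x n).
std::vector<float> pow2ScaledMatMul(const std::vector<float> &A,
                                    const std::vector<float> &B, int m, int k,
                                    int n, int log2Scale) {
  std::vector<float> C(m * n, 0.0f);
  const float scale = std::ldexp(1.0f, log2Scale); // 2^log2Scale
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (int p = 0; p < k; ++p) {
        acc += A[i * k + p] * B[p * n + j];
      }
      C[i * n + j] = acc * scale;
    }
  }
  return C;
}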

Subclassed by popart::MatMulBaseGradOp, popart::MatMulOp

Public Types

enum class Phase

Values:

enumerator Fwd
enumerator BwdLHS
enumerator BwdRHS

Public Functions

MatMulBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const Phase phase_, const nonstd::optional<float> availableMemoryProportion_, const SerialiseSettings &serialization_, const OptionalDataType outputType_, const MatMulPartialsType partialsType_, const bool enableFullyConnectedPass_ = true)
MatMulBaseOp(const MatMulBaseOp&) = default
~MatMulBaseOp() override = default
virtual std::unique_ptr<Op> clone() const override = 0

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual Shape getExpandedLhsShape() const = 0
virtual Shape getExpandedRhsShape() const = 0
bool useFullyConnectedPass() const
inline void setUseFullyConnectedPass(bool b)
inline nonstd::optional<float> getAvailableMemoryProportion() const
inline void setAvailableMemoryProportion(const nonstd::optional<float> v)
inline const SerialiseSettings &getSerialiseSettings() const
inline SerialiseSettings &getSerialiseSettings()
inline OptionalDataType getOutputType() const
inline Phase getPhase() const
inline void setPhase(Phase p)
virtual void appendOutlineAttributes(OpSerialiserBase &os) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

virtual void appendMore(OpSerialiserBase &os) const override

Append additional attributes to the stream.

This method should be overridden if the derived class has additional attributes.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline MatMulPartialsType getPartialsType() const
inline void setPartialsType(const MatMulPartialsType &pt)
inline virtual bool canShard() const override

Check if the operation can be sharded into multiple operations.

Returns

true if the operation can be sharded, false otherwise.

struct SerialiseSettings

Public Types

enum class Mode

Values:

enumerator None
enumerator InputChannels
enumerator ReducingDim
enumerator OutputChannels

Public Members

Mode mode = Mode::None
uint32_t factor = 0
bool keep_precision = false
class MatMulLhsGradOp : public popart::MatMulBaseGradOp

Public Functions

MatMulLhsGradOp(const MatMulOp &op_)
MatMulLhsGradOp(const MatMulLhsGradOp&) = default
MatMulLhsGradOp &operator=(const MatMulLhsGradOp&) = delete
~MatMulLhsGradOp() override = default
void setup() final
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline Shape getExpandedLhsShape() const override
inline Shape getExpandedRhsShape() const override
Shape getGradInputShape() const
Shape getRhsInputShape() const
Shape getOutputShape() const

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getRhsInIndex()
static inline OutIndex getOutIndex()
class MatMulOp : public popart::MatMulBaseOp

Public Functions

MatMulOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const nonstd::optional<float> &availableMemoryProportion, const SerialiseSettings &serialization_, const OptionalDataType &outputType, const MatMulPartialsType &partialsType_ = MatMulPartialsType::FLOAT)
MatMulOp(const MatMulOp&) = default
MatMulOp &operator=(const MatMulOp&) = delete
~MatMulOp() override = default
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
std::unique_ptr<Op> clone() const final
const Tensor *lhsIn() const
const Tensor *rhsIn() const
const Tensor *log2ScaleIn() const
const Tensor *out() const
inline Shape getExpandedLhsShape() const override
inline Shape getExpandedRhsShape() const override
inline Shape getExpandedOutShape() const
inline void setCanCreateInputs(bool value)
inline bool getCanCreateInputs() const
inline float getSubgraphValue() const final
Shape npMatMulOut(Shape lhs, Shape rhs)
bool isPow2ScaledMatMul() const
inline std::set<InIndex> optionalInputs() const override

Public Static Functions

static inline InIndex getLhsInIndex()
static inline InIndex getRhsInIndex()
static inline InIndex getLog2ScaleInIndex()
static inline OutIndex getOutIndex()
class MatMulRhsGradOp : public popart::MatMulBaseGradOp

Public Functions

MatMulRhsGradOp(const MatMulOp &op_)
MatMulRhsGradOp(const MatMulRhsGradOp&) = default
MatMulRhsGradOp &operator=(const MatMulRhsGradOp&) = delete
~MatMulRhsGradOp() override = default
void setup() final
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline Shape getExpandedLhsShape() const override
inline Shape getExpandedRhsShape() const override
Shape getLhsInputShape() const
Shape getGradInputShape() const
Shape getOutputShape() const

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getLhsInIndex()
static inline OutIndex getOutIndex()
class MaxArgGradOp : public popart::NonLinearVariadicGradOp

Public Functions

MaxArgGradOp(const MaxOp&, InIndex)
std::unique_ptr<Op> clone() const final
class MaxOp : public popart::VariadicOp

Public Functions

MaxOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
class MaxPoolGradOp : public popart::Op

Public Functions

MaxPoolGradOp(const MaxPoolOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Members

const Shape creatorSpatialK
const Shape creatorStrides
const Shape creatorLowerPads
const Shape creatorUpperPads

Public Static Functions

static inline InIndex getPrePooledInIndex()
static inline InIndex getPooledInIndex()
static inline InIndex getGradPooledInIndex()
static inline OutIndex getOutIndex()
class MaxPoolOp : public popart::HasReceptiveFieldOp

Public Functions

MaxPoolOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &kernelShape_, int64_t storageOrder, const HasReceptiveFieldOp::ReceptiveOpAttributes &attributes, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
int64_t getNOutChans() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
bool canBeReplacedByIdentity() const override
Shape getSpatialK() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class MeanArgGradOp : public popart::LinearVariadicGradOp

Public Functions

MeanArgGradOp(const MeanOp&, InIndex inIndex)
const std::vector<GradInOutMapper> &gradInputInfo() const final
std::unique_ptr<Op> clone() const final
inline bool hasScale() const final
inline float getScale() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
class MeanOp : public popart::VariadicOp

Public Functions

MeanOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
class MinArgGradOp : public popart::NonLinearVariadicGradOp

Public Functions

MinArgGradOp(const MinOp&, InIndex)
std::unique_ptr<Op> clone() const final
class MinOp : public popart::VariadicOp

Public Functions

MinOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
class ModifyRandomSeedOp : public popart::Op

Public Functions

ModifyRandomSeedOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final
inline InIndex getSeedInIndex() const override
inline int getOutBatchAxis(OutIndex) const override
inline float getSubgraphValue() const final
inline bool isOutlineable() const final

Public Static Functions

static inline InIndex getSeedModifierInIndex()
static inline OutIndex getModifiedSeedOutIndex()
static inline TensorId getSeedInTensorId()
static inline TensorId getSeedModifierTensorId(const uint32_t modifier)
static inline TensorId getModifiedSeedTensorId(const uint32_t modifier)
class MulArg0GradOp : public popart::ElementWiseBinaryArg0GradOp

Public Functions

MulArg0GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)
std::unique_ptr<Op> clone() const final
class MulArg1GradOp : public popart::ElementWiseBinaryArg1GradOp

Public Functions

MulArg1GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)
std::unique_ptr<Op> clone() const final
class MulLhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp

Public Functions

inline MulLhsInplaceOp(const Op::Settings &_settings)
std::unique_ptr<Op> clone() const final
void setup() final
class MulRhsInplaceOp : public popart::ElementWiseBinaryInplaceRhsOp

Public Functions

inline MulRhsInplaceOp(const Op::Settings &_settings)
std::unique_ptr<Op> clone() const final
void setup() final
class MultiCollectiveBaseOp : public popart::CollectivesBaseOp

The base class for a multi-collective which performs all-gather, all-reduce, or reduce-scatter operations on lists of tensors by first merging them into a larger tensor.

This improves bandwidth utilization and decreases the number of syncs needed.

Subclassed by popart::MultiReplicatedAllGatherOp, popart::MultiReplicatedAllReduceOp, popart::MultiReplicatedReduceScatterOp

Public Functions

MultiCollectiveBaseOp(const OperatorIdentifier &operatorIdentifier, CommGroup commGroup, const Op::Settings &settings, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet)

Constructor for the MultiCollectiveBaseOp.

Parameters
  • operatorIdentifier – the identifier for the constructed op

  • commGroup – all of the inputs will be reduce-scattered across the same communications group

  • settings – the settings of the op are shared across all inputs

  • outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedReduceScatterOp for that tensor

  • inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph

  • outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph

MultiCollectiveBaseOp(const OperatorIdentifier &operatorIdentifier, const ReplicaGrouping &grouping, const Op::Settings &settings, const std::vector<TensorInfo> &outInfoFromBaseOps, const std::vector<VGraphIdAndTileSet> &inputVirtualGraphIdAndTileSet, const std::vector<VGraphIdAndTileSet> &outputVirtualGraphIdAndTileSet)

Constructor for the MultiCollectiveBaseOp.

Parameters
  • operatorIdentifier – the identifier for the constructed op

  • grouping – all of the inputs will be reduce-scattered across the same communications group

  • settings – the settings of the op are shared across all inputs

  • outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedReduceScatterOp for that tensor

  • inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph

  • outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph

virtual std::unique_ptr<Op> clone() const override = 0

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() override

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex in) const

Get virtual graph ID and tile set associated with an input index.

Parameters

InIndex – The input index.

Returns

The virtual graph ID and tile set at the input index.

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex out) const

Get virtual graph ID and tile set associated with an output index.

Parameters

OutIndex – The output index.

Returns

The virtual graph ID and tile set at the output index.

virtual VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex in, std::set<OpId> &visited) const override

Get virtual graph ID and tile set associated with an input index.

Parameters
  • InIndex – The input index.

  • visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.

Returns

The virtual graph ID and tile set at the input index.

virtual VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex out, std::set<OpId> &visited) const override

Get virtual graph ID and tile set associated with an output index.

Parameters
  • OutIndex – The output index.

  • visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.

Returns

The virtual graph ID and tile set at the output index.

bool hasCorrespondingLinkedIndexTensor(Tensor *t) override
Tensor *getCorrespondingLinkedIndexTensor(Tensor *t) override
bool isCollectiveLinkedIndexTensor(InIndex in) const override
bool isCollectiveLinkedIndexTensor(Tensor *t) const override
virtual void growAliasModel(AliasModel &m) const override

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

Pre

All input tensors of this op have mappings in aliasModel before this method is called.

Post

All output tensors of this op have mappings in aliasModel after this method is called.

class MultiConvBaseOp : public popart::Op

Subclassed by popart::ConvOp, popart::MultiConvOp

Public Functions

MultiConvBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, std::vector<int64_t> flatStrides_, std::vector<int64_t> flatPads_, std::vector<int64_t> flatDilations_, const AutoPad &padType_, const MultiConvOptions &convOpts_)
std::unique_ptr<Op> clone() const override
void setup() override
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
inline virtual int numConvs() const
inline int64_t getNSpatialDims(int convIndex) const
inline Shape getSpatialD(int convIndex) const
inline Shape getSpatialK(int convIndex) const
inline int64_t getGroups(int convIndex) const
inline int64_t getNOutChans(int convIndex) const
inline int64_t getNInChans(int convIndex) const
inline Shape lowerPads(int convIndex) const
inline Shape upperPads(int convIndex) const
inline Shape lowerOutPads(int convIndex) const
inline Shape upperOutPads(int convIndex) const
Shape getOutShape(int convIndex, const ConvPads &pads) const
ConvParameters getParameters(int convIndex) const
virtual void setParamsFromDataGradOp(const Op *dataGradOp)
virtual void restoreAttributesFromParams(const std::vector<ConvParameters>&)
inline const MultiConvOptions &getConvOptions() const
inline void setConvOptions(const MultiConvOptions &opts)
int64_t getCumulativeSpatialDims(int64_t i) const
ConvStrides getStrides(int64_t convIndex) const
ConvPads getPads(int64_t convIndex) const
ConvPads getOutPads(int64_t convIndex) const
ConvDilations getDilations(int64_t convIndex) const
ConvDilations getInDilations(int64_t convIndex) const
Shape lowerKernTruncs(int64_t convIndex) const
Shape upperKernTruncs(int64_t convIndex) const
Shape lowerInTruncs(int64_t convIndex) const
Shape upperInTruncs(int64_t convIndex) const
Shape lowerOutTruncs(int64_t convIndex) const
Shape upperOutTruncs(int64_t convIndex) const

Public Static Functions

static void appendConvParameterAttributes(const ConvParameters&, const std::string&, OpSerialiserBase&)
static inline InIndex getDataInIndex(int convIndex)
static inline InIndex getWeightsInIndex(int convIndex)
static inline OutIndex getOutIndex(int convIndex)
static inline int getConvIndexFromInIndex(InIndex index)
class MultiConvDataGradBaseOp : public popart::Op

Subclassed by popart::ConvDataGradOp, popart::MultiConvDataGradOp

Public Functions

MultiConvDataGradBaseOp(const MultiConvBaseOp&, const OperatorIdentifier&)
std::unique_ptr<Op> clone() const override
void setup() final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
inline const std::vector<GradInOutMapper> &gradInputInfo() const final
inline const std::map<int, int> &gradOutToNonGradIn() const final
inline const ConvParameters &getParameters(int convIndex) const
inline virtual int numConvs() const
inline const MultiConvOptions &getConvOptions() const
inline void setConvOptions(const MultiConvOptions &opts)
inline TensorInfo getDataInfo(int convIndex) const

Public Static Functions

static inline InIndex getWeightsInIndex(int convIndex)
static inline InIndex getGradConvolvedInIndex(int convIndex)
static inline OutIndex getOutIndex(int convIndex)
class MultiConvDataGradOp : public popart::MultiConvDataGradBaseOp

Public Functions

MultiConvDataGradOp(const MultiConvOp&)
std::unique_ptr<Op> clone() const final
void appendOutlineAttributes(OpSerialiserBase&) const final
class MultiConvOp : public popart::MultiConvBaseOp

Public Functions

MultiConvOp(const OperatorIdentifier &_opid, const Settings &settings_, const std::vector<int64_t> &flatStrides_, const std::vector<int64_t> &flatPads_, const std::vector<int64_t> &flatDilations_, const MultiConvOptions &mcOpts_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void appendOutlineAttributes(OpSerialiserBase&) const final
class MultiConvWeightsGradBaseOp : public popart::Op

Subclassed by popart::ConvWeightsGradOp, popart::MultiConvWeightsGradOp

Public Functions

MultiConvWeightsGradBaseOp(const MultiConvBaseOp&, const OperatorIdentifier&)
std::unique_ptr<Op> clone() const override
void setup() final
inline const std::vector<GradInOutMapper> &gradInputInfo() const final
inline const std::map<int, int> &gradOutToNonGradIn() const final
inline float getSubgraphValue() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline virtual int numConvs() const
inline const ConvParameters &getParameters(int convIndex) const
inline const MultiConvOptions &getConvOptions() const

Public Static Functions

static inline InIndex getGradConvolvedInIndex(int convIndex)
static inline InIndex getPreConvolvedInIndex(int convIndex)
static inline OutIndex getOutIndex(int convIndex)
class MultiConvWeightsGradOp : public popart::MultiConvWeightsGradBaseOp

Public Functions

MultiConvWeightsGradOp(const MultiConvOp&)
std::unique_ptr<Op> clone() const final
void appendOutlineAttributes(OpSerialiserBase&) const final
class MultiExchangeOp : public popart::ExchangeBaseOp

Public Functions

MultiExchangeOp(const OperatorIdentifier&, const Op::Settings&, const std::vector<ExchangeDescriptor>)
std::unique_ptr<Op> clone() const final
void setup() final
view::Regions modifies(InIndex) const final
view::Regions aliases(InIndex, OutIndex) const final
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
void appendOutlineAttributes(OpSerialiserBase&) const override
int numLoads() const
int numStores() const
inline bool isRemote(int index)
inline void setRemoteBufferId(int index, RemoteBufferId remotebuffer_id)
inline RemoteBufferId getRemoteBufferId(int index) const
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const final
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const final
inline void growAliasModel(AliasModel &m) const override
inline bool canShard() const final
bool hasSideEffect() const final
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
inline int getNumExchanges() const final
ExchangeDescriptor getExchangeDescriptor(int index) const final
std::pair<int, int> inIndexToDescriptorIndex(InIndex index) const override

Map input index to a tuple of integers (a,b) that corresponds to the input associated with index.

That is, the bth input of getExchangeDescriptor(a) corresponds to the input at index.

Parameters

index – the input index to look up.

Returns

a pair of integers comprising the index of the descriptor and the index of the input associated with the input within the descriptor.
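For example, a hypothetical lookup (op and index are assumptions for illustration):

C++:

// Which descriptor does this op input belong to, and at which position?
std::pair<int, int> loc = op.inIndexToDescriptorIndex(index);
ExchangeDescriptor d    = op.getExchangeDescriptor(loc.first);
// loc.second is the position of the input within descriptor loc.first.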

std::pair<int, int> outIndexToDescriptorIndex(OutIndex index) const override

Map output index to a tuple of integers (a,b) that corresponds to the output associated with index.

That is, the bth output of getExchangeDescriptor(a) corresponds to the output at index.

Parameters

index – the output index to look up.

Returns

a pair of integers comprising the index of the descriptor and the index of the output associated with the output within the descriptor.

std::vector<InIndex> descriptorIndexToInIndices(int index) const override
std::vector<OutIndex> descriptorIndexToOutIndices(int index) const override
class MultiReplicatedAllReduceOp : public popart::MultiCollectiveBaseOp

A multi-collective class for performing an all-reduce operation on a list of tensors.

The tensors will be merged into a single large tensor and reduced as one, leading to better bandwidth utilization and fewer syncs between replicas than performing the all-reduce on a per-tensor basis. The class supports mixing in-place and out-of-place all-reduce operations, but requires that all tensors use the same collective group, that is, the reduction is over the same replicas. This op is usually constructed in the MergeCollectivesTransform.

Public Functions

MultiReplicatedAllReduceOp(CollectiveOperator collectiveOperator, CommGroup commGroup, const Settings &settings, std::vector<bool> modifiesIndexInplace, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet)

Constructor for the MultiReplicatedAllReduceOp.

Parameters
  • collectiveOperator – the collective operator is the same for all input tensors

  • commGroup – all of the inputs will be reduced across the same communications group

  • settings – the settings of the op are shared across all inputs

  • modifiesIndexInplace – for each of the inputs, specify whether it should be modified in place

  • outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedAllReduceOp for that tensor

  • inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph

  • outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph

MultiReplicatedAllReduceOp(CollectiveOperator collectiveOperator, const ReplicaGrouping &grouping, const Settings &settings, const std::vector<bool> &modifiesIndexInplace, const std::vector<TensorInfo> &outInfoFromBaseOps, const std::vector<VGraphIdAndTileSet> &inputVirtualGraphIdAndTileSet, const std::vector<VGraphIdAndTileSet> &outputVirtualGraphIdAndTileSet)

Constructor for the MultiReplicatedAllReduceOp.

Parameters
  • collectiveOperator – the collective operator is the same for all input tensors

  • grouping – all of the inputs will be reduced across the same communications group

  • settings – the settings of the op are shared across all inputs

  • modifiesIndexInplace – for each of the inputs, specify whether it should be modified in place

  • outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedAllReduceOp for that tensor

  • inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph

  • outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.
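
As a sketch, an op implementation typically returns one of the two bounding values; the class name MyOp here is hypothetical:

C++:

// Inexpensive, Relu-like op: return the low bounding value.
float MyOp::getSubgraphValue() const { return getLowSubgraphValue(); }
// An expensive, Conv-like op would return getHighSubgraphValue() instead.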

inline CollectiveOperator getCollectiveOp() const

Return the type of the collective operation used in the all-reduce, for example, addition.

The same collective operator is used across all of the inputs to be reduced.

bool hasCorrespondingLinkedIndexTensor(Tensor *t) override
Tensor *getCorrespondingLinkedIndexTensor(Tensor *t) override
bool isCollectiveLinkedIndexTensor(InIndex in) const override
bool isCollectiveLinkedIndexTensor(Tensor *t) const override
virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

Return which inputs and outputs are replicated tensor sharding pairs.

virtual view::Regions modifies(InIndex index) const override

Return the input region which this op modifies (for inplace ops).

Parameters

InIndex – The input index.

Returns

The regions which this op modifies.

virtual view::Regions aliases(InIndex in, OutIndex out) const override

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters
  • InIndex – The input index.

  • OutIndex – The output index.

Returns

The regions which the output will alias.
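
As an illustrative sketch, an in-place unary op (the class name MyInplaceOp is hypothetical) that modifies and aliases its whole input would implement this pair of methods as follows:

C++:

// The op modifies the full region of its input ...
popart::view::Regions MyInplaceOp::modifies(popart::InIndex index) const {
  return {popart::view::Region::getFull(inShape(index))};
}

// ... and the output aliases that same full input region.
popart::view::Regions MyInplaceOp::aliases(popart::InIndex in,
                                           popart::OutIndex) const {
  return {popart::view::Region::getFull(inShape(in))};
}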

virtual void growAliasModel(AliasModel &m) const override

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

Pre

All input tensors of this op have mappings in aliasModel before the call to growAliasModel.

Post

All output tensors of this op have mappings in aliasModel after the call to growAliasModel.

virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

Determine whether output tensors are guaranteed to have an equal value across all replicas.

This means that they are “replica equal”. The check is based on information about the replica equal status of input tensors (and the same for any inputs that are modified by the op).

The default implementation sets each output tensor as being replica-equal if and only if all tensor inputs are replica-equal. For modified inputs, the default is to assume it is replica-equal only if there is an output that is deemed replica-equal that fully aliases all elements of the input. This default implementation is not correct for all ops. Ops that need a specialized implementation should override this virtual function.

Parameters
  • aliasModel – An alias model object.

  • inputMap – A map that stores, for each input, whether the inputs are data-equivalent over all replicas.

  • proxy – A helper object passed in by the replica-equal analysis.

Returns

A tuple comprising:

  1. a mapping from output index to a replica-equal status with an entry for each output tensor.

  2. a vector of input indices for inputs that were modified by the op to a value that is not replica-equal.
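
The default rule can be pictured with plain standard types. This is an illustrative reduction of the behaviour, not the PopART implementation (ReplEqInputMap and related types are PopART typedefs):

C++:

#include <map>

// Under the default rule, an output is replica-equal only if every input is.
bool defaultOutputIsReplicaEqual(const std::map<int, bool> &inputIsReplicaEqual) {
  for (const auto &entry : inputIsReplicaEqual) {
    if (!entry.second) {
      return false;
    }
  }
  return true;
}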

class NearbyIntOp : public popart::OneWayUnaryOp

Public Functions

std::unique_ptr<Op> clone() const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class NegateGradOp : public popart::NegateOp

Public Functions

NegateGradOp(const NegateOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class NegateOp : public popart::ElementWiseUnaryOp

Subclassed by popart::NegateGradOp

Public Functions

NegateOp(const OperatorIdentifier &_opid, const Op::Settings&)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class NllGradOp : public popart::Op

Public Functions

NllGradOp(const NllOp&)
NllGradOp(const TensorId &lossId, const nonstd::optional<int> ignoreIndex, const ReductionType reduction, const bool inputIsLogProbability, const Op::Settings &settings)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
inline float getSubgraphValue() const final
inline ReductionType getReductionType() const
inline bool hasIgnoreIndex() const
inline nonstd::optional<int> getOptionalIgnoreIndex() const
int getIgnoreIndex() const
inline bool inputIsLogProbability() const
inline TensorId getLossTensorId() const
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
inline bool canShard() const override
float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const override

Public Static Functions

static inline InIndex getProbsInIndex()
static inline InIndex getLabelInIndex()
static inline InIndex getGradInIndex()
static inline OutIndex getOutIndex()
class NllOp : public popart::LossOp

Public Functions

NllOp(const OperatorIdentifier &_opid, const nonstd::optional<int> ignoreIndex, const ReductionType reduction, bool inputIsLogProbability, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getSubgraphValue() const final
inline bool hasIgnoreIndex() const
inline nonstd::optional<int> getOptionalIgnoreIndex() const
int getIgnoreIndex() const
inline bool inputIsLogProbability() const
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
inline bool canShard() const override
inline ReductionType getShardReductionType(OutIndex index) const override

Public Static Functions

static inline InIndex getProbsInIndex()
static inline InIndex getLabelInIndex()
static inline OutIndex getOutIndex()
class NlllWithSoftmaxGradDirectOp : public popart::Op

Public Functions

NlllWithSoftmaxGradDirectOp(const nonstd::optional<int> ignoreIndex, const ReductionType reduction, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
void setup() final
Op *nlllFwdOp() const
inline float getSubgraphValue() const final
inline ReductionType getReductionType() const
inline bool hasIgnoreIndex() const
inline int getIgnoreIndex() const
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
inline bool canShard() const override
inline ReductionType getShardReductionType(OutIndex index) const override
float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const override

Public Static Functions

static inline InIndex getProbsInIndex()
static inline InIndex getLabelInIndex()
static inline InIndex getGradProbsInIndex()
static inline OutIndex getLossOutIndex()
static inline OutIndex getGradOutIndex()
class NonLinearVariadicGradOp : public popart::VariadicGradOp

Subclassed by popart::MaxArgGradOp, popart::MinArgGradOp

Public Functions

NonLinearVariadicGradOp(const OperatorIdentifier &_opid, const VariadicOp&, InIndex)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInIndex()
static inline InIndex getFwdOutInIndex()
class NopOp : public popart::ElementWiseUnaryOp

Public Functions

NopOp(const OperatorIdentifier&, const Op::Settings&)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
inline bool isOutplaceViewChange() const override
class NotOp : public popart::ElementWiseUnaryOp

Public Functions

NotOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class NormalizeImageOp : public popart::Op

Public Functions

NormalizeImageOp(const popart::OperatorIdentifier &_opid, const popart::Op::Settings &settings_, float _scale)
std::unique_ptr<Op> clone() const override
void setup() override
inline float getSubgraphValue() const final
inline bool canShard() const override
const popart::Tensor *imgIn() const
const popart::Tensor *offsetsIn() const
const popart::Tensor *scalesIn() const
inline float getScale() const
bool canBeReplacedByIdentity() const override
bool verifyInputShapes(popart::Shape imgShape)
popart::Shape paddedShape(popart::Shape imgShape, popart::Shape offsetsShape, popart::Shape scalesShape)

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)
static inline popart::InIndex getImageInIndex()
static inline popart::InIndex getOffsetsInIndex()
static inline popart::InIndex getScalesInIndex()
static inline popart::OutIndex getOutIndex()
static inline std::string opName()
class OneWayUnaryInPlaceOp : public popart::ElementWiseInplaceUnaryOp

Subclassed by popart::CeilInplaceOp, popart::FloorInplaceOp, popart::NearbyIntInplaceOp, popart::RoundInplaceOp, popart::SignInplaceOp

Public Functions

OneWayUnaryInPlaceOp(const OperatorIdentifier&, const Op::Settings&)
std::unique_ptr<Op> clone() const override
class OneWayUnaryOp : public popart::ElementWiseUnaryOp

Subclassed by popart::CeilOp, popart::FloorOp, popart::NearbyIntOp, popart::RoundOp, popart::SignOp

Public Functions

OneWayUnaryOp(const OperatorIdentifier&, const Op::Settings&)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class OnehotGradOp : public popart::Op

Public Functions

OnehotGradOp(const OnehotOp &fwdOp_)
std::unique_ptr<Op> clone() const final
void setup() override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline const Shape &getOutputShape() const
inline int64_t getAxis() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getIndicesInIndex()
static inline OutIndex getOutIndex()
class OnehotOp : public popart::Op

Public Functions

OnehotOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() override
void appendOutlineAttributes(OpSerialiserBase&) const override
void connectInTensor(InIndex inIndex, TensorId tenId) override
inline int64_t getAxis() const
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getIndicesInIndex()
static inline InIndex getValuesInIndex()
static inline OutIndex getOutIndex()
class OrOp : public popart::BinaryComparisonOp

Public Functions

OrOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class PReluOp : public popart::ElementWiseBinaryOp

Public Functions

PReluOp(const OperatorIdentifier &opid_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
class PackedDataBlockOp : public popart::Op

Public Functions

PackedDataBlockOp(const OperatorIdentifier&, const std::vector<int64_t> &maxSequenceLengths, int64_t resultSize, int64_t callbackBatchSize, Graph &callback, const Op::Settings&)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
void appendOutlineAttributes(OpSerialiserBase&) const final
inline float getSubgraphValue() const final
Graph &getCalledGraph() const
void setCalledGraph(Graph&)
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const override
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const override
int64_t numCallbackInputs() const
int64_t numDataInputs() const
int64_t getCallbackIterations() const
std::vector<PackedSequences> getPackedInputs()
PackedSequences getPackedOutput()
inline InIndex dataIndex(InIndex i)
inline InIndex offsetsIndex(InIndex i)
inline InIndex lengthsIndex(InIndex i)
inline int64_t getCallbackBatchSize()
inline std::vector<int64_t> getMaxSequenceLengths()
inline int64_t getMaxSequenceLength(int64_t dataIndex)
std::vector<TensorInfo> callbackSequenceInInfos()
class PadGradOp : public popart::SliceOp

Public Functions

PadGradOp(const PadOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class PadInplaceOp : public popart::BasePadOp

Public Functions

PadInplaceOp(const BasePadOutplaceOp&)
std::unique_ptr<Op> clone() const final
inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
view::Regions aliases(InIndex, OutIndex) const override
view::Regions uses(InIndex index) const override
class PadOp : public popart::BasePadOutplaceOp

Public Functions

PadOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_pads, const std::vector<unsigned> &_flips, float value_, const std::string &_mode, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void connectInTensor(InIndex inIndex, TensorId tenId) final
template<typename TDerivedOp, typename TOpParams>
class ParameterizedOp : public popart::Op

Generic base class for simple ops with parameterized attributes.

The aim of this class is to regroup all the common logic in the implementation of custom ops. In particular, it forces gathering all parameters/attributes into a proper data structure, which helps generalize the rest of the code.

Template Parameters
  • TDerivedOp – CRTP template type (the derived op class).

  • TOpParams – Structure containing the op parameters.

Public Types

using ParamsType = TOpParams

Public Functions

inline ParameterizedOp(const popart::OperatorIdentifier &_opid, const ParamsType &_params, const popart::Op::Settings &_settings)

Construct a custom op.

Parameters
  • _opid – Operator id (default one if not provided).

  • _params – Operation parameters.

  • _settings – Settings.

inline ParameterizedOp(const ParamsType &_params, const popart::Op::Settings &_settings)
template<typename T>
inline ParameterizedOp(const popart::OperatorIdentifier &_opid, const ParameterizedOp<T, TOpParams> &_op)

Construct a custom op from another op with same parameters.

Typically, this constructor builds a grad op from a fwd op.

Template Parameters

T – The input op type.

Parameters
  • _opid – Operator identifier (default one if not provided).

  • _op – Operation to extract setting and parameters from.

template<typename T>
inline ParameterizedOp(const ParameterizedOp<T, TOpParams> &_op)
inline virtual std::unique_ptr<Op> clone() const override

Clone the operator.

Note: this uses the CRTP trick to provide a generic implementation.

Returns

A unique pointer to the cloned op.

inline virtual void appendAttributes(popart::OpSerialiserBase &os) const override

Append attributes when serialising the op to a stream.

This is used for debugging and also to generate the PopART IR hash. This hash is used to determine whether a Poplar cache can be reused so it is important that op attributes which may alter the Poplar compilation are appended to this stream. If this method is overridden, then it must also call the base class method.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.
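
A minimal sketch of such an override; MyOp and its alpha attribute are hypothetical:

C++:

void MyOp::appendAttributes(popart::OpSerialiserBase &os) const {
  // The base class method must also be called.
  Op::appendAttributes(os);
  // Append any attribute that may alter the Poplar compilation.
  os.appendAttribute("alpha", alpha);
}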

inline virtual void appendOutlineAttributes(popart::OpSerialiserBase &os) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual float getSubgraphValue() const override

Get the subgraph value.

This is used by outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.

inline virtual bool requiresRandomSeed() const override

Check if the op requires a random seed.

This is set to false by default and should be overridden and set to true if an IPU random seed tensor is required by the op. If so, it will be connected to inTensor(getSeedInIndex()) by the IR process.

Returns

true if the op requires a random seed, false otherwise.
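
For instance, an op consuming an IPU random seed tensor would simply override this method to return true (sketch; MyRandomOp is hypothetical):

C++:

bool MyRandomOp::requiresRandomSeed() const { return true; }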

inline const TOpParams &params() const
Returns

Custom op parameters.

Public Static Functions

static inline std::unique_ptr<TDerivedOp> createOpFromCreatorInfo(const popart::OpCreatorInfo &info)

Build the op from a PopART OpCreatorInfo data structure.

Parameters

info – The OpCreatorInfo to use.

Returns

A unique pointer to the op created.

static inline TDerivedOp *createOpInGraph(popart::Graph &graph, const std::map<popart::InIndex, popart::TensorId> &in, const std::map<popart::OutIndex, popart::TensorId> &out, const popart::OperatorIdentifier &opid, const TOpParams &params, const popart::Op::Settings &settings)

Create the custom op connected in a graph.

Parameters
  • graph – Graph where to create and connect the op.

  • in – Map of input tensor ids (i.e. names).

  • out – Map of output tensor ids (i.e. names).

  • opid – PopART operator identifier (default one if not provided).

  • params – Custom op parameters.

  • settings – Custom op settings.

Returns

Pointer to the custom op created (owned by the graph).

static inline TDerivedOp *createOpInGraph(popart::Graph &graph, const std::map<popart::InIndex, popart::TensorId> &in, const std::map<popart::OutIndex, popart::TensorId> &out, const TOpParams &params, const popart::Op::Settings &settings)
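
Putting these pieces together, a minimal sketch of a custom op built on ParameterizedOp might look as follows. MyOp, MyParams, the domain string and the tensor ids are all hypothetical; it is also assumed, by convention, that the derived op supplies a static defaultOperatorId(), and the parameter structure may need further helpers (such as attribute (de)serialisation) that are omitted here.

C++:

struct MyParams {
  float alpha;
};

class MyOp : public popart::ParameterizedOp<MyOp, MyParams> {
public:
  // Inherit the ParameterizedOp constructors.
  using ParameterizedOp::ParameterizedOp;

  // Default operator identifier, used when none is provided explicitly.
  static popart::OperatorIdentifier defaultOperatorId() {
    return popart::OperatorIdentifier{"custom.ops", "MyOp", 1};
  }
};

// Creating the op in a graph, connected to existing tensors:
//
//   MyOp *op = MyOp::createOpInGraph(graph,
//                                    {{0, "input"}},  // InIndex -> TensorId
//                                    {{0, "output"}}, // OutIndex -> TensorId
//                                    MyParams{0.5f},
//                                    settings);
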
class PlaceholderOp : public popart::Op

Public Functions

PlaceholderOp(const OperatorIdentifier&, const Op::Settings&)
std::unique_ptr<Op> clone() const override
inline float getSubgraphValue() const final
class PopartLSTMGradOp : public popart::Op

Gradient operator for PopartLSTMOp.

Public Functions

PopartLSTMGradOp(const PopartLSTMOp&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual const std::vector<GradInOutMapper> &gradInputInfo() const final

Get the mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

Returns

The mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

virtual const std::map<int, int> &gradOutToNonGradIn() const final

Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.

virtual std::set<InIndex> optionalInputs() const final

Return the input indices of all optional inputs to the op.

int64_t getInputSize() const
int64_t getMaxSeqLength() const
int64_t getBatchSize() const
int64_t getHiddenSize() const
inline ActivationFunction getActivation() const
inline ActivationFunction getRecurrentActivation() const

Public Members

const bool outputFullSequence

Public Static Functions

static inline InIndex getInitialStateInIndex()
static inline InIndex getIntermediatesInIndex()
static inline InIndex getWeightsInIndex()
static inline InIndex getBiasesInIndex()
static inline InIndex getSequenceLensInIndex()
static inline InIndex getInputInIndex()
static inline InIndex getFwdOutputInIndex()
static inline InIndex getFwdOutputGradInIndex()
static inline InIndex getFwdCellStateGradInIndex()
static inline OutIndex getInputOutIndex()
static inline OutIndex getWeightsOutIndex()
static inline OutIndex getBiasesOutIndex()
static inline OutIndex getInitialStateOutIndex()
class PopartLSTMOp : public popart::Op

Public Functions

PopartLSTMOp(const OperatorIdentifier&, bool outputFullSequence_, const Op::Settings&, const nonstd::optional<float> available_memory_proportion_ = nonstd::nullopt)
PopartLSTMOp(const OperatorIdentifier&, bool outputFullSequence_, ActivationFunction activation, ActivationFunction recurrent_activation, const Op::Settings&, const nonstd::optional<float> available_memory_proportion_ = nonstd::nullopt)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getSubgraphValue() const final
bool hasBiasesInput() const
std::set<InIndex> optionalInputs() const final
bool hasSeqLenInput() const
int64_t getMaxSeqLength() const
int64_t getBatchSize() const
int64_t getInputSize() const
int64_t getHiddenSize() const
int64_t getNumIntermediates() const
nonstd::optional<float> getAvailableMemoryProportion() const
int getInBatchAxis(InIndex) const override
int getOutBatchAxis(OutIndex) const override
inline ActivationFunction getActivation() const
inline ActivationFunction getRecurrentActivation() const

Public Members

const bool outputFullSequence

Public Static Functions

static inline InIndex getInputInIndex()
static inline InIndex getWeightsInIndex()
static inline InIndex getBiasesInIndex()
static inline InIndex getInitialStateInIndex()
static inline InIndex getSequenceLensInIndex()
static inline OutIndex getOutputOutIndex()
static inline OutIndex getCellStateOutIndex()
static inline OutIndex getIntermediatesOutIndex()
class PowArg0GradOp : public popart::ElementWiseBinaryArg0GradOp

Public Functions

PowArg0GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)
std::unique_ptr<Op> clone() const final
class PowArg1GradOp : public popart::ElementWiseBinaryArg1GradOp

Public Functions

PowArg1GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)
std::unique_ptr<Op> clone() const final
class PowLhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp

Public Functions

inline PowLhsInplaceOp(const Op::Settings &_settings)
std::unique_ptr<Op> clone() const final
class PrintTensorOp : public popart::ElementWiseUnaryOp

Public Functions

PrintTensorOp(const OperatorIdentifier&, bool printSelf, bool printGradient, const std::string &title, const Op::Settings&)
PrintTensorOp(const OperatorIdentifier&, bool printSelf, bool printGradient, const std::string &title, const PrintTensorFmt &fmt, const Op::Settings&)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void appendOutlineAttributes(OpSerialiserBase &os) const final
inline bool canBeReplacedByIdentity() const final
inline bool hasSideEffect() const override
inline bool shouldPrint() const
inline const std::string &getTitle() const
inline void setTitle(std::string title_)
inline const PrintTensorFmt &getFmt() const
class RMSPropUpdaterOp : public popart::Op

Public Functions

RMSPropUpdaterOp(OptimizerValue eps, bool TFVariant, const Op::Settings&)
std::unique_ptr<Op> clone() const final
void setup() final
void appendOutlineAttributes(OpSerialiserBase&) const final
inline float getSubgraphValue() const final
inline bool isOptimizerOp() const override
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final

Public Members

const OptimizerValue initEps
const bool TFVariant

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getAccl1InIndex()
static inline InIndex getAccl2InIndex()
static inline InIndex getEpsInIndex()
static inline OutIndex getUpdaterOutIndex()
class RNNGradOp : public popart::BaseOnnxRNNGradOp

Gradient operator for RNNOp.

Public Functions

RNNGradOp(const RNNOp&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::set<InIndex> optionalInputs() const final

Return the input indices of all optional inputs to the op.

Public Members

const ActivationFunction activation_attribute
class RNNOp : public popart::BaseOnnxRNNOp

This op applies a single-layer Elman RNN with a non-linearity to a batch of input sequences.

The op follows the ONNX specification described in https://github.com/onnx/onnx/blob/main/docs/Operators.md#RNN

For each batch element, the following output is computed:

\[ h_t = f(W x_t + b_x + R h_{t-1} + b_h) \]
where:
  • \(f\) is a supported nonlinearity function

  • \(W\) is the input weight

  • \(x_t\) is the t’th element of the input sequence

  • \(R\) is the recurrence weight matrix

  • \(h_{t-1}\) is the previous output sequence element. \(h_0\) can be provided by the user

  • \(b_x\) and \(b_h\) are the input and recurrence biases respectively

The op outputs the full sequence \(h_1, h_2, ...\), as well as the last element of the sequence.

If the biases or \(h_0\) are not set, they are considered to be 0 and are not trained (that is, they are treated as constant zeros in the model).
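
For example, unrolling the recurrence for the first two sequence elements (with \(h_0\) either user-provided or zero) gives:

\[ h_1 = f(W x_1 + b_x + R h_0 + b_h), \qquad h_2 = f(W x_2 + b_x + R h_1 + b_h) \]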

Public Functions

RNNOp(const OperatorIdentifier &_opid, ActivationFunction activation, nonstd::optional<int64_t> hidden_size, const Op::Settings &settings_)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
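
As an illustrative sketch, a grad op for a simple unary op might declare its mappings as follows; MyGradOp is hypothetical and the exact indices depend on the op:

C++:

// Input 0 of MyGradOp is the gradient of output 0 of the forward op.
const std::vector<popart::GradInOutMapper> &MyGradOp::gradInputInfo() const {
  static const std::vector<popart::GradInOutMapper> inInfo = {
      {0, 0, popart::GradOpInType::GradOut}};
  return inInfo;
}

// Output 0 of MyGradOp is the gradient of input 0 of the forward op.
const std::map<int, int> &MyGradOp::gradOutToNonGradIn() const {
  static const std::map<int, int> outInfo = {{0, 0}};
  return outInfo;
}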

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

virtual int getInBatchAxis(InIndex) const override

Get the batch axis for the input index.

Returns

The batch axis for the input index.

virtual int getOutBatchAxis(OutIndex) const override

Get the batch axis for the output index.

Returns

The batch axis for the output index.

inline virtual bool isOutlineable() const override

Check if op can be outlined.

If this method returns false, it will mean that any possible subgraph that this op is part of will not be cached.

Returns

true if the op can be outlined, false otherwise. Default: true.

inline virtual std::string getName() const final

Public Members

const ActivationFunction activation_attribute
class RandomBaseOp : public popart::ShapeOrLikeOp

Subclassed by popart::DropoutBaseOp, popart::RandomNormalBaseOp, popart::RandomUniformBaseOp

Public Functions

RandomBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
inline bool requiresRandomSeed() const final
inline std::vector<DataType> getSupportedDataTypes() const override

Public Static Functions

static std::vector<DataType> supportedDataTypes()
static void errorIfSeedIsSet(const Attributes &attr, OperatorIdentifier opid)
class RandomNormalBaseOp : public popart::RandomBaseOp

Subclassed by popart::RandomNormalLikeOp, popart::RandomNormalOp

Public Functions

RandomNormalBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float mean_, float scale_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getMean() const
inline float getScale() const
class RandomNormalLikeOp : public popart::RandomNormalBaseOp

Public Functions

RandomNormalLikeOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float mean_, float scale_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final
std::vector<std::unique_ptr<Op>> getGradOps() final
inline InIndex getSeedInIndex() const final
std::unique_ptr<RandomNormalOp> foldInputTensor(const Op::Settings&) const

Public Static Functions

static inline InIndex getInIndex()
class RandomNormalOp : public popart::RandomNormalBaseOp

Public Functions

RandomNormalOp(const OperatorIdentifier &opid_, const Shape &shape_, const OptionalDataType &dataType_, float mean_, float scale_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final
inline InIndex getSeedInIndex() const final
class RandomUniformBaseOp : public popart::RandomBaseOp

Subclassed by popart::RandomUniformLikeOp, popart::RandomUniformOp

Public Functions

RandomUniformBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float high_, float low_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getHigh() const
inline float getLow() const
class RandomUniformLikeOp : public popart::RandomUniformBaseOp

Public Functions

RandomUniformLikeOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float high_, float low_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final
std::vector<std::unique_ptr<Op>> getGradOps() final
inline InIndex getSeedInIndex() const final
std::unique_ptr<RandomUniformOp> foldInputTensor(const Op::Settings&) const

Public Static Functions

static inline InIndex getInIndex()
class RandomUniformOp : public popart::RandomUniformBaseOp

Public Functions

RandomUniformOp(const OperatorIdentifier &opid_, const Shape &shape_, const OptionalDataType &dataType_, float high_, float low_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final
inline InIndex getSeedInIndex() const final
class ReciprocalGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

ReciprocalGradOp(const ReciprocalOp&)
std::unique_ptr<Op> clone() const final
class ReciprocalOp : public popart::ElementWiseUnaryOp

Public Functions

ReciprocalOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceGradOp : public popart::Op

Subclassed by popart::ReduceL1GradOp, popart::ReduceL2GradOp, popart::ReduceLogSumExpGradOp, popart::ReduceLogSumGradOp, popart::ReduceMaxGradOp, popart::ReduceMeanGradOp, popart::ReduceMedianGradOp, popart::ReduceMinGradOp, popart::ReduceProdGradOp, popart::ReduceSumGradOp, popart::ReduceSumSquareGradOp

Public Functions

ReduceGradOp(const AiGraphcoreOpIdV1 &opid, const ReduceOp &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const override
void setup() override
const std::vector<int64_t> &getAxes() const
const std::vector<GradInOutMapper> &gradInputInfo() const override
const std::map<int, int> &gradOutToNonGradIn() const final
const Shape &backwardShape() const
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class ReduceL1GradOp : public popart::ReduceGradOp

Public Functions

ReduceL1GradOp(const ReduceL1Op &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()
class ReduceL1Op : public popart::ReduceOp

Public Functions

ReduceL1Op(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceL2GradOp : public popart::ReduceGradOp

Public Functions

ReduceL2GradOp(const ReduceL2Op &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()
static inline InIndex getFwdOutInIndex()
class ReduceL2Op : public popart::ReduceOp

Public Functions

ReduceL2Op(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceLogSumExpGradOp : public popart::ReduceGradOp

Public Functions

ReduceLogSumExpGradOp(const ReduceLogSumExpOp &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()
static inline InIndex getFwdOutInIndex()
class ReduceLogSumExpOp : public popart::ReduceOp

Public Functions

ReduceLogSumExpOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceLogSumGradOp : public popart::ReduceGradOp

Public Functions

ReduceLogSumGradOp(const ReduceLogSumOp &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdOutInIndex()
class ReduceLogSumOp : public popart::ReduceOp

Public Functions

ReduceLogSumOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceMaxGradOp : public popart::ReduceGradOp

Public Functions

ReduceMaxGradOp(const ReduceMaxOp &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()
static inline InIndex getFwdOutInIndex()
class ReduceMaxOp : public popart::ReduceOp

Public Functions

ReduceMaxOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceMeanGradOp : public popart::ReduceGradOp

Public Functions

ReduceMeanGradOp(const ReduceMeanOp &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
class ReduceMeanOp : public popart::ReduceOp

Public Functions

ReduceMeanOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceMedianGradOp : public popart::ReduceGradOp

Public Functions

ReduceMedianGradOp(const ReduceMedianOp &fwd_op, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const override

Public Static Functions

static inline InIndex getIndicesInIndex()
class ReduceMedianOp : public popart::ReduceOp

Public Functions

ReduceMedianOp(const OperatorIdentifier &opid, const nonstd::optional<std::vector<int64_t>> &axes, int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
void setup() override
std::vector<std::unique_ptr<Op>> getGradOps() final
bool canBeReplacedByIdentity() const override

Public Static Functions

static inline OutIndex getIndicesOutIndex()
class ReduceMinGradOp : public popart::ReduceGradOp

Public Functions

ReduceMinGradOp(const ReduceMinOp &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()
static inline InIndex getFwdOutInIndex()
class ReduceMinOp : public popart::ReduceOp

Public Functions

ReduceMinOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceOp : public popart::Op

Subclassed by popart::ReduceL1Op, popart::ReduceL2Op, popart::ReduceLogSumExpOp, popart::ReduceLogSumOp, popart::ReduceMaxOp, popart::ReduceMeanOp, popart::ReduceMedianOp, popart::ReduceMinOp, popart::ReduceProdOp, popart::ReduceSumOp, popart::ReduceSumSquareOp

Public Functions

ReduceOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
void setup() override
const std::vector<int64_t> &getAxes() const
bool getKeepDims() const
void setAxes(std::vector<int64_t> value)
void setKeepDims(int64_t value)
void appendOutlineAttributes(OpSerialiserBase&) const override
bool canBeReplacedByIdentity() const override
inline float getSubgraphValue() const final
const Shape &backwardShape() const
inline bool canShard() const override
inline int getOutBatchAxis(OutIndex) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class ReduceProdGradOp : public popart::ReduceGradOp

Public Functions

ReduceProdGradOp(const ReduceProdOp &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()
class ReduceProdOp : public popart::ReduceOp

Public Functions

ReduceProdOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceSumGradOp : public popart::ReduceGradOp

Public Functions

ReduceSumGradOp(const ReduceSumOp &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
class ReduceSumOp : public popart::ReduceOp

Subclassed by popart::AddArg0GradOp, popart::AddArg1GradOp, popart::AddBiasBiasGradOp, popart::SubtractArg0GradOp

Public Functions

ReduceSumOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class ReduceSumSquareGradOp : public popart::ReduceGradOp

Public Functions

ReduceSumSquareGradOp(const ReduceSumSquareOp &fwdOp, const Shape &backward_shape)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()
class ReduceSumSquareOp : public popart::ReduceOp

Public Functions

ReduceSumSquareOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final
void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, CommGroup shardingDomain) final
void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, const ReplicaGrouping &grouping) final
class ReluGradOp : public popart::Op

Public Functions

ReluGradOp(const ReluOp&)
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
std::unique_ptr<Op> clone() const final
inline float getSubgraphValue() const final
inline bool canShard() const override

Public Static Functions

static inline InIndex getReludInIndex()
static inline InIndex getGradReludInIndex()
static inline OutIndex getOutIndex()
class ReluInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

ReluInplaceOp(const ReluOp&)
ReluInplaceOp(const Op::Settings &opSettings)
std::unique_ptr<Op> clone() const final
class ReluOp : public popart::ElementWiseUnaryOp

Public Functions

ReluOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class RemoteBaseOp : public popart::ExchangeBaseOp

Subclassed by popart::RemoteLoadOp, popart::RemoteStoreOp

Public Functions

inline RemoteBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, RemoteBufferId rbid_)
virtual std::unique_ptr<Op> clone() const = 0
inline virtual RemoteBufferId getRemoteBufferId() const final
inline virtual bool canShard() const final
inline virtual void setRemoteBufferId(RemoteBufferId remoteBufferId_) final
virtual void appendOutlineAttributes(OpSerialiserBase&) const final

Public Static Functions

static inline InIndex getLocalTensorInIndex()
static inline InIndex getRemoteBufferOffsetInIndex()
class RemoteLoadInplaceOp : public popart::RemoteLoadOp

Remote Load Inplace Op.

See also

RemoteLoadOp for explanation.

Public Functions

RemoteLoadInplaceOp(const OperatorIdentifier&, const Op::Settings&, RemoteBufferId rbid_ = -1UL)

Construct the RemoteLoadInplaceOp.

See constructor of the parent class for the input parameters.

RemoteLoadInplaceOp(const RemoteLoadOp&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual view::Regions modifies(InIndex) const final

Return the input region which this op modifies (for inplace ops).

Parameters

InIndex – The input index.

Returns

The regions which this op modifies.

virtual view::Regions aliases(InIndex, OutIndex) const final

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters
  • InIndex – The input index.

  • OutIndex – The output index.

Returns

The regions which the output will alias.

virtual view::RegMap fwdRegMap(InIndex, OutIndex) const final

Map regions of the input tensor at the input index to the regions of the output tensor at the output index that these input regions alias.

Parameters
  • InIndex – The op input index.

  • OutIndex – The op output index.

virtual view::RegMap bwdRegMap(InIndex, OutIndex) const final

Map regions of the output tensor at the output index to the regions of the input tensor at the input index that these output regions alias.

Parameters
  • InIndex – The op input index.

  • OutIndex – The op output index.

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters

OperatorIdentifier – The operator identifier of the op to be instantiated.

Returns

An instance of the required op.
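
Independently of this particular op, the two methods are typically implemented along these lines; MyOp, MyOpInplace and myOpInplaceId are hypothetical:

C++:

std::vector<std::tuple<popart::OperatorIdentifier, float>>
MyOp::inplacePriorityDefault() const {
  // A single in-place variant; larger values indicate higher preference.
  return {{myOpInplaceId, 10.0f}};
}

std::unique_ptr<popart::Op>
MyOp::getInplaceVariant(const popart::OperatorIdentifier &id) const {
  if (id == myOpInplaceId) {
    return std::make_unique<MyOpInplace>(*this);
  }
  // Defer to the base class, which errors on unknown identifiers.
  return Op::getInplaceVariant(id);
}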

ExchangeDescriptor getExchangeDescriptor(int index) const final
class RemoteLoadOp : public popart::RemoteBaseOp

Remote Load Op.

Loads a tensor from a remote (off-chip) buffer. The tensor will be loaded from the memory location corresponding to RemoteBufferId, and will be stored in the memory location corresponding to inTensor.

This class takes between one and two TensorIds as inputs (as indicated in graphcoreoperators.hpp):

  1. The TensorId of the inTensor.

    • In the inplace version this will be aliased to the output tensor.

    • In the outplace version this Op will clone the inTensor, then write the loaded data to the clone.

  2. The (optional) TensorId of a 0-rank tensor called offset.

    • If set to a value >= 0, offset specifies the row in the remote buffer from which the tensor will be loaded.

    • If set to -1, RemoteSetup will assign a unique value.

The relationship between offset, RemoteBufferId and RemoteSetup is thoroughly described in RemoteStoreOp.

The output is the TensorId of the loaded tensor.

See also

RemoteStoreOp.

Subclassed by popart::RemoteLoadInplaceOp

Public Functions

RemoteLoadOp(const OperatorIdentifier&, const Op::Settings&, RemoteBufferId rbid_ = -1UL)

Construct the RemoteLoadOp.

Parameters specifically related to this class are listed below. See the constructor of the parent class for the rest of the input parameters.

Parameters

RemoteBufferId – The id of the remote buffer. Can be any integer. If not specified (or set to -1), the RemoteSetup will automatically choose the right buffer. The RemoteBufferId can only be used with tensors of identical shape.

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final

Return which inputs and outputs are replicated tensor sharding pairs.

ExchangeDescriptor getExchangeDescriptor(int index) const override
virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters

OperatorIdentifier – The operator identifier of the op to be instantiated.

Returns

An instance of the required op.

virtual void growAliasModel(AliasModel&) const final

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

Pre

All input tensors of this op have mappings in aliasModel before the call to growAliasModel.

Post

All output tensors of this op have mappings in aliasModel after the call to growAliasModel.

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const final

Translate a PopART inplacing proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.

Parameters
  • aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.

  • inplaceId – The operator identifier to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

Public Static Functions

static inline OutIndex getLocalTensorOutIndex()
class RemoteStoreOp : public popart::RemoteBaseOp

Remote Store Op.

Stores a tensor to a remote (off-chip) buffer. This Op is typically used when the user wants to store several different identically shaped tensors to the same remote buffer by specifying the offset (see below).

This class takes between one and two TensorIds as inputs (as indicated in graphcoreoperators.hpp):

  1. The TensorId of the inTensor to copy to remote memory.

  2. The (optional) TensorId of a 0-rank tensor called offset.

    • If set to a value >= 0, offset specifies the row in the remote buffer the inTensor will be written to (see below for explanation).

    • If set to -1, RemoteSetup will assign a unique value.

If inTensor is of rank x, the remote buffer of a certain RemoteBufferId will be of rank x+1, where the new dimension (the row) will be of size N.

Op instances with matching RemoteBufferId will outline together, meaning that if multiple different tensors are to be stored under the same remote buffer ID, a different offset value has to be supplied for each tensor.

When using the automatic RemoteSetup configuration, the offset tensor should be a unique constant tensor per inTensor per RemoteBufferId. If the constant offset tensor has value -1, RemoteSetup will assign a unique value; otherwise the supplied offset value will be used. RemoteSetup will call Ir::setRemoteBufferInfo to configure the shape (equal to the inTensor shape) and number of rows (N) in the remote memory.

If not using the automatic RemoteSetup, all offsets and RemoteBufferIds need to be >= 0, and each remote buffer ID then needs to be registered with Ir::setRemoteBufferInfo manually.

This Op does not have any output.
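
A minimal sketch of the manual configuration path, assuming an Ir instance named ir; the buffer ID, shape and row count below are illustrative:

C++:

// Register remote buffer ID 0 to hold 4 rows of float32 tensors of shape
// [2, 3]. Every RemoteStoreOp / RemoteLoadOp using this ID must then supply
// an offset in the range [0, 4).
popart::TensorInfo tensorInfo(popart::DataType::FLOAT, {2, 3});
ir.setRemoteBufferInfo(0, popart::RemoteBufferInfo(tensorInfo, 4));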

See also

RemoteLoadOp.

Public Functions

RemoteStoreOp(const OperatorIdentifier&, const Op::Settings&, RemoteBufferId rbid_ = -1UL)

Construct the RemoteStoreOp.

Parameters specifically related to this class are listed below. See the constructor of the parent class for the rest of the input parameters.

Parameters

RemoteBufferId – The id of the remote buffer. Can be any integer. If not specified (or set to -1), the RemoteSetup will automatically choose the right buffer. The RemoteBufferIds can only be used with tensors of identical shape.

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

inline virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

inline virtual bool hasSideEffect() const override

Check if the op has any effect that is not captured by the (modification of) input or output tensors, such as modifying the state of the IPU or host system.

Returns

true if the op has side effects, false otherwise. Default=false.

virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

Return which inputs and outputs are replicated tensor sharding pairs.

ExchangeDescriptor getExchangeDescriptor(int index) const final
class ReplicatedAllGatherOp : public popart::CollectivesBaseOp

Public Functions

ReplicatedAllGatherOp(const OperatorIdentifier&, CommGroup group, const Op::Settings&)
ReplicatedAllGatherOp(const OperatorIdentifier&, CommGroup group, const Op::Settings&, TensorInfo outInfo)
ReplicatedAllGatherOp(const OperatorIdentifier&, const ReplicaGrouping &grouping, const Op::Settings&)
ReplicatedAllGatherOp(const OperatorIdentifier&, const ReplicaGrouping &grouping, const Op::Settings&, const TensorInfo &outInfo)
std::unique_ptr<Op> clone() const final
void setup() final
inline float getSubgraphValue() const final
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
bool isConfigureOutputForReplicatedTensorSharding() const override

Check RTS mode (see collectives.hpp)

Returns

True if this operation is configured for replicated tensor sharding

std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
std::vector<std::unique_ptr<Op>> getGradOps() override
const std::vector<GradInOutMapper> &gradInputInfo() const override
const std::map<int, int> &gradOutToNonGradIn() const override
class ReplicatedAllReduceInplaceOp : public popart::ReplicatedAllReduceOp

Public Functions

ReplicatedAllReduceInplaceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, CommGroup group, const Op::Settings &settings_)
ReplicatedAllReduceInplaceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, const ReplicaGrouping &grouping, const Op::Settings &settings_)
ReplicatedAllReduceInplaceOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
ReplicatedAllReduceInplaceOp(const ReplicatedAllReduceOp&)
view::Regions modifies(InIndex) const override
view::Regions aliases(InIndex, OutIndex) const override
std::unique_ptr<Op> clone() const final
void setup() final
class ReplicatedAllReduceOp : public popart::CollectivesBaseOp

Subclassed by popart::ReplicatedAllReduceInplaceOp

Public Functions

ReplicatedAllReduceOp(const OperatorIdentifier&, CollectiveOperator op, CommGroup group, const Op::Settings&)
ReplicatedAllReduceOp(const OperatorIdentifier&, CollectiveOperator op, const ReplicaGrouping &grouping, const Op::Settings&)
ReplicatedAllReduceOp(const OperatorIdentifier&, const Op::Settings&)
std::unique_ptr<Op> clone() const override
virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
void setup() override
inline float getSubgraphValue() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline CollectiveOperator getCollectiveOp() const
void growAliasModel(AliasModel&) const override
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
inline bool hasCorrespondingLinkedIndexTensor(Tensor *t) override
inline Tensor *getCorrespondingLinkedIndexTensor(Tensor *t) override
inline bool isCollectiveLinkedIndexTensor(InIndex in) const override
inline bool isCollectiveLinkedIndexTensor(Tensor *t) const override
std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
class ReplicatedReduceScatterOp : public popart::CollectivesBaseOp

Public Functions

ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, CommGroup group, bool configureOutputForReplicatedTensorSharding, const Op::Settings&)
ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, CommGroup group, const Op::Settings&)
ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, const ReplicaGrouping &grouping, bool configureOutputForReplicatedTensorSharding, const Op::Settings&)
ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, const ReplicaGrouping &grouping, const Op::Settings&)
ReplicatedReduceScatterOp(const OperatorIdentifier&, const Op::Settings&)
std::unique_ptr<Op> clone() const override
void setup() override
inline float getSubgraphValue() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline CollectiveOperator getCollectiveOp() const
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
bool isConfigureOutputForReplicatedTensorSharding() const override

Check RTS mode (see collectives.hpp)

Returns

True if this operation is configured for replicated tensor sharding

std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
std::vector<std::unique_ptr<Op>> getGradOps() override
const std::vector<GradInOutMapper> &gradInputInfo() const override
const std::map<int, int> &gradOutToNonGradIn() const override
class ReshapeBaseOp : public popart::Op

Subclassed by popart::ReshapeInplaceOp, popart::ReshapeOp

Public Functions

inline ReshapeBaseOp(const OperatorIdentifier &_opid, const Shape &ots_, const Op::Settings &settings_, bool handleZero_ = true)
inline ReshapeBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, bool handleZero_ = true)
std::unique_ptr<Op> clone() const override
void setup() final
void setOutShape(const Shape &value)
const Shape &getOutShape() const
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
bool canBeReplacedByIdentity() const override
inline float getSubgraphValue() const final
void connectInTensor(InIndex inIndex, TensorId tenId) final
inline bool canShard() const override
void configureShardedOp(Op *const shardedOp, const Settings *const settings_) const override
void growAliasModel(AliasModel&) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class ReshapeGradOp : public popart::ReshapeOp

Public Functions

ReshapeGradOp(const ReshapeOp&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class ReshapeInplaceOp : public popart::ReshapeBaseOp

Public Functions

ReshapeInplaceOp(const OperatorIdentifier &_opid, const Shape&, const Op::Settings &settings_)
ReshapeInplaceOp(const ReshapeOp&)
std::unique_ptr<Op> clone() const final
inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
inline view::Regions aliases(InIndex in, OutIndex) const final
inline bool isInplaceViewChange() const override
class ReshapeOp : public popart::ReshapeBaseOp

Subclassed by popart::ReshapeGradOp

Public Functions

inline ReshapeOp(const OperatorIdentifier &_opid, const Shape &s, const Op::Settings &settings_, bool handleZero = true)
inline ReshapeOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, bool handleZero = true)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
inline bool isOutplaceViewChange() const override
class ResizeGradOp : public popart::ResizeOp

Public Functions

ResizeGradOp(const ResizeOp&)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline const std::vector<float> getFwdScales() const
class ResizeOp : public popart::Op

Subclassed by popart::ResizeGradOp

Public Functions

ResizeOp(const OperatorIdentifier&, const Op::Settings&, ResizeMode, const std::vector<float> &scales)
ResizeOp(const OperatorIdentifier&, const Op::Settings&, ResizeMode, const std::vector<float> &scales, ResizeNearestMode nearestMode, ResizeCoordinateTransformationMode)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
inline float getSubgraphValue() const final
inline ResizeMode getMode() const
inline const std::vector<float> &getScales() const
inline ResizeNearestMode getNearestMode() const
inline ResizeCoordinateTransformationMode getCoordinateTransformationMode() const

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class RestoreInplaceOp : public popart::RestoreOp

Public Functions

RestoreInplaceOp(const OperatorIdentifier&, int64_t stashSize, const Op::Settings&)
std::unique_ptr<Op> clone() const override
view::Regions aliases(InIndex in, OutIndex) const final
view::Regions modifies(InIndex) const final
inline void growAliasModel(AliasModel &m) const override

Public Members

bool requiredForRecompute = false

Public Static Functions

static inline InIndex getActToRestoreInIndex()
class RestoreOp : public popart::Op

Subclassed by popart::RestoreInplaceOp

Public Functions

RestoreOp(const OperatorIdentifier&, int64_t stashSize, const Op::Settings&)
std::unique_ptr<Op> clone() const override
void setup() final
TensorId getRestoredTensorId() const
inline float getSubgraphValue() const final
inline int64_t getStashSize() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline bool isOutlineable() const override

Public Static Functions

static inline InIndex getStashInIndex()
static inline OutIndex getRestoredActOutIndex()
class ReverseBaseOp : public popart::Op

Subclassed by popart::ReverseInplaceOp, popart::ReverseOp

Public Functions

inline ReverseBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const std::vector<int64_t> &dimensions_)
std::unique_ptr<Op> clone() const override
void setup() final
bool canBeReplacedByIdentity() const override
inline float getSubgraphValue() const final
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
inline std::vector<int64_t> getDimensions() const
void growAliasModel(AliasModel&) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class ReverseGradOp : public popart::ReverseOp

Public Functions

ReverseGradOp(const ReverseOp&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class ReverseInplaceOp : public popart::ReverseBaseOp

Public Functions

inline ReverseInplaceOp(const ReverseOp &op)
std::unique_ptr<Op> clone() const final
inline view::Regions aliases(InIndex in, OutIndex) const final
class ReverseOp : public popart::ReverseBaseOp

Subclassed by popart::ReverseGradOp

Public Functions

inline ReverseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const std::vector<int64_t> &dimensions_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
void appendOutlineAttributes(OpSerialiserBase&) const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
class RoiAlignGradOp : public popart::Op

Public Functions

RoiAlignGradOp(const RoiAlignOp&)
std::unique_ptr<Op> clone() const final
virtual void setup()
virtual const std::vector<popart::GradInOutMapper> &gradInputInfo() const
const std::map<int, int> &gradOutToNonGradIn() const
inline float getSubgraphValue() const final
void appendOutlineAttributes(OpSerialiserBase&) const final
inline float getSpatialScale() const
inline uint64_t getSamplingRatio() const
inline uint64_t getAlignedHeight() const
inline uint64_t getAlignedWidth() const
class RoiAlignOp : public popart::Op

Region of Interest (RoI) align operation described in the Mask R-CNN paper.

Param spatialScale

Multiplicative spatial scale factor used to translate ROI coordinates from their input spatial scale to the scale used when pooling, i.e., the spatial scale of the input feature map X relative to the input image.

Param samplingRatio

Number of sampling points in the interpolation grid used to compute the output value of each pooled output bin.

Param alignedHeight

Pooled output Y’s height.

Param alignedWidth

Pooled output Y’s width.
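As a hedged construction sketch (the operator identifier below is an assumption, and `settings` stands for an existing popart::Op::Settings), the following pools each ROI from a feature map at 1/16 of the input-image scale into a 7x7 grid, sampling a 2x2 interpolation grid per output bin:

popart::RoiAlignOp roiAlign(
    popart::Onnx::CustomOperators::RoiAlign, // assumed identifier
    settings,
    0.0625f, // spatialScale: feature map stride of 16 w.r.t. the image
    2,       // samplingRatio: 2x2 sampling points per output bin
    7,       // alignedHeight
    7);      // alignedWidth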

Public Functions

RoiAlignOp(const popart::OperatorIdentifier &_opid, const popart::Op::Settings &settings, const float spatialScale, const uint64_t samplingRatio, const uint64_t alignedHeight, const uint64_t alignedWidth)
RoiAlignOp(const RoiAlignOp&) = default
RoiAlignOp &operator=(const RoiAlignOp&) = delete
~RoiAlignOp() override = default
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() override

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual std::vector<std::unique_ptr<popart::Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
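To make the mapping concrete, the sketch below shows the pattern a grad op typically implements. MyOp and MyGradOp are hypothetical names used only for illustration; GradInOutMapper and GradOpInType are the PopART types used throughout this chapter:

// Grad-op inputs: where each input of MyGradOp comes from in the
// forward graph.
const std::vector<popart::GradInOutMapper> &MyGradOp::gradInputInfo() const {
  static const std::vector<popart::GradInOutMapper> inInfo = {
      // Input 0: the gradient of MyOp's output 0.
      {0, 0, popart::GradOpInType::GradOut},
      // Input 1: MyOp's input 0 (the forward activation).
      {1, 0, popart::GradOpInType::In}};
  return inInfo;
}

// Grad-op outputs: output 0 of MyGradOp is the gradient of MyOp's input 0.
const std::map<int, int> &MyGradOp::gradOutToNonGradIn() const {
  static const std::map<int, int> outInfo = {{0, 0}};
  return outInfo;
}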

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.
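In practice the override usually delegates to one of the two bounding helpers; a sketch with hypothetical op names:

// Expensive op: encourage outlining.
float MyConvLikeOp::getSubgraphValue() const { return getHighSubgraphValue(); }

// Cheap op: discourage outlining.
float MyReluLikeOp::getSubgraphValue() const { return getLowSubgraphValue(); }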

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters

OpSerialiserBase – The stream to which the attributes should be appended.

inline float getSpatialScale() const
inline uint64_t getSamplingRatio() const
inline uint64_t getAlignedHeight() const
inline uint64_t getAlignedWidth() const
class RoundInplaceOp : public popart::OneWayUnaryInPlaceOp

Public Functions

RoundInplaceOp(const RoundOp&)
std::unique_ptr<Op> clone() const final
class RoundOp : public popart::OneWayUnaryOp

Public Functions

RoundOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class ScaleGradOp : public popart::ScaleOp

Public Functions

ScaleGradOp(const ScaleOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class ScaleInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

ScaleInplaceOp(const ScaleOp&)
ScaleInplaceOp(const OperatorIdentifier &_opid, float scale_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
float getScaleFactor() const
void appendOutlineAttributes(OpSerialiserBase&) const override
class ScaleOp : public popart::ElementWiseUnaryOp

Subclassed by popart::ScaleGradOp

Public Functions

ScaleOp(const OperatorIdentifier &_opid, float scale_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
inline void setScaleFactor(float value)
float getScaleFactor() const
void appendOutlineAttributes(OpSerialiserBase&) const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
bool canBeReplacedByIdentity() const override
class ScaledAddLhsInplaceOp : public popart::ScaledAddOp

Public Functions

ScaledAddLhsInplaceOp(float scale_0_, float scale_1_, const Op::Settings &settings_)
ScaledAddLhsInplaceOp(const ScaledAddOp&)
std::unique_ptr<Op> clone() const final
view::Regions modifies(InIndex) const final
view::Regions aliases(InIndex, OutIndex) const final
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
class ScaledAddOp : public popart::Op

Subclassed by popart::ScaledAddLhsInplaceOp, popart::ScaledAddRhsInplaceOp

Public Functions

ScaledAddOp(const OperatorIdentifier &_opid, float scale_0_, float scale_1_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
void setup() override
inline float getScale0() const
inline float getScale1() const
void appendOutlineAttributes(OpSerialiserBase&) const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
inline bool canShard() const override
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
inline float getSubgraphValue() const override
void growAliasModel(AliasModel&) const override
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Public Static Functions

static inline InIndex getArg0InIndex()
static inline InIndex getArg1InIndex()
static inline InIndex getScale0InIndex()
static inline InIndex getScale1InIndex()
static inline OutIndex getOutIndex()
class ScaledAddRhsInplaceOp : public popart::ScaledAddOp

Public Functions

ScaledAddRhsInplaceOp(const ScaledAddOp&)
std::unique_ptr<Op> clone() const final
view::Regions modifies(InIndex) const final
view::Regions aliases(InIndex, OutIndex) const final
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
class ScanOp : public popart::SubgraphOp

Public Functions

ScanOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, Graph &callee_, int numScanInputs_, int numImplicitInputs_, std::vector<int64_t> scanInputAxes_, std::vector<int64_t> scanInputDirections_, std::vector<int64_t> scanOutputAxes_, std::vector<int64_t> scanOutputDirections_)
void setup() final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
std::unique_ptr<Op> clone() const override
Graph &getCalledGraph() const override
void setCalledGraph(Graph&) override
InIndex subgraphInToOpInIndex(InIndex index) const override
InIndex opInToSubgraphInIndex(InIndex index) const override
OutIndex subgraphOutToOpOutIndex(OutIndex index) const override
OutIndex opOutToSubgraphOutIndex(OutIndex index) const override
int getTripCountValue() const
inline int getNumScanInputs() const
inline int getNumVariables() const
inline int getNumImplicitInputs() const
inline int getNumScanOutputs() const
int64_t getScanInputAxis(int i) const
inline bool isScanInputReversed(int i) const
int64_t getScanOutputAxis(int i) const
inline bool isScanOutputReversed(int i) const
int64_t getScanInputDirection(int i) const
int64_t getScanOutputDirection(int i) const
class ScatterDataGradOp : public popart::Op

Public Functions

ScatterDataGradOp(const ScatterOp &op, int64_t axis)
std::unique_ptr<Op> clone() const final override
const std::vector<GradInOutMapper> &gradInputInfo() const final override
const std::map<int, int> &gradOutToNonGradIn() const final override
void setup() final override
void appendOutlineAttributes(OpSerialiserBase&) const override
float getSubgraphValue() const final override
int64_t getAxis() const noexcept
nonstd::optional<float> getAvailableMemoryProportion() const noexcept

Public Static Functions

static inline InIndex gradInIndex()
static inline InIndex indicesInIndex()
static inline OutIndex gradOutIndex()
class ScatterOp : public popart::ScatterReduceOp

Public Functions

ScatterOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings &settings_, const nonstd::optional<float> &available_memory_proportion_ = nonstd::nullopt)
InIndex srcDataInIndex() const noexcept override
InIndex initialValuesInIndex() const noexcept override
std::unique_ptr<Op> clone() const final override
std::vector<std::unique_ptr<Op>> getGradOps() final override
void appendOutlineAttributes(OpSerialiserBase&) const override

Public Static Functions

static inline InIndex dataInIndex()
static inline InIndex indicesInIndex()
static inline InIndex updatesInIndex()
static inline OutIndex outIndex()
class ScatterReduceGradOp : public popart::Op

Public Functions

ScatterReduceGradOp(const ScatterReduceOp &op)
void setup() final override
std::unique_ptr<Op> clone() const final override
const std::vector<GradInOutMapper> &gradInputInfo() const final override
const std::map<int, int> &gradOutToNonGradIn() const final override
void appendOutlineAttributes(OpSerialiserBase&) const override
float getSubgraphValue() const final override
int64_t getAxis() const noexcept
int64_t getGroupSize() const noexcept
ScatterReduction getReduction() const noexcept
bool indexBroadcasted() const noexcept
bool indexBroadcastEnabled() const noexcept
bool hasInitialValues() const noexcept
nonstd::optional<float> getAvailableMemoryProportion() const noexcept

Public Static Functions

static inline InIndex gradInIndex()
static inline InIndex indicesInIndex()
static inline InIndex srcDataInIndex()
static inline InIndex fwdOutInIndex()
static inline InIndex initialValuesInIndex()
static inline OutIndex gradDataOutIndex()
static inline OutIndex gradInitialValuesOutIndex()
class ScatterReduceOp : public popart::Op

Subclassed by popart::ScatterOp

Public Functions

ScatterReduceOp(const OperatorIdentifier &_opid, int64_t axis_, int64_t axis_size_, ScatterReduction reduction_, int64_t group_size_, bool enable_index_broadcast_, const nonstd::optional<float> &available_memory_proportion_, const Op::Settings &settings_)
inline virtual InIndex srcDataInIndex() const noexcept
inline InIndex indicesInIndex() const noexcept
inline virtual InIndex initialValuesInIndex() const noexcept
inline OutIndex outIndex() const noexcept
void setup() final
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() override
void appendOutlineAttributes(OpSerialiserBase&) const override
float getSubgraphValue() const final override
int64_t getAxis() const noexcept
int64_t getGroupSize() const noexcept
ScatterReduction getReduction() const noexcept
const Shape &getBackwardShape() const noexcept
bool indexBroadcasted() const noexcept
bool indexBroadcastEnabled() const noexcept
nonstd::optional<float> getAvailableMemoryProportion() const noexcept
void setAvailableMemoryProportion(const nonstd::optional<float> &v)

Public Static Functions

static std::string reductionToString(ScatterReduction reduction)
static ScatterReduction reductionFromString(const std::string &reductionStr)
class ScatterUpdateGradOp : public popart::Op

Public Functions

ScatterUpdateGradOp(const ScatterOp &op, int64_t axis)
std::unique_ptr<Op> clone() const final override
void setup() final override
const std::vector<GradInOutMapper> &gradInputInfo() const final override
const std::map<int, int> &gradOutToNonGradIn() const final override
void appendOutlineAttributes(OpSerialiserBase&) const override
float getSubgraphValue() const final override
int64_t getAxis() const noexcept
nonstd::optional<float> getAvailableMemoryProportion() const noexcept

Public Static Functions

static inline InIndex gradInIndex()
static inline InIndex indicesInIndex()
static inline OutIndex gradOutIndex()
class SeluGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

SeluGradOp(const SeluOp&)
std::unique_ptr<Op> clone() const final
void appendAttributes(OpSerialiserBase&) const override
inline float getAlpha() const
inline float getGamma() const
class SeluInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

SeluInplaceOp(const SeluOp&)
std::unique_ptr<Op> clone() const final
void appendAttributes(OpSerialiserBase&) const override
inline float getAlpha() const
inline float getGamma() const
class SeluOp : public popart::ElementWiseUnaryOp

Public Functions

SeluOp(const OperatorIdentifier &opid, float _alpha, float _gamma, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
void appendAttributes(OpSerialiserBase&) const override
inline float getAlpha() const
inline float getGamma() const
class SequenceSliceInplaceOp : public popart::SequenceSliceOp

Public Functions

SequenceSliceInplaceOp(const OperatorIdentifier&, bool zeroUnused, const Op::Settings&)
std::unique_ptr<Op> clone() const override
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
view::Regions aliases(InIndex, OutIndex) const final
view::Regions modifies(InIndex) const final
class SequenceSliceOp : public popart::Op

Subclassed by popart::SequenceSliceInplaceOp

Public Functions

SequenceSliceOp(const OperatorIdentifier&, bool zeroUnused, const Op::Settings&)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
inline float getSubgraphValue() const final
void setup() override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
void growAliasModel(AliasModel&) const override
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Public Members

const bool zeroUnused

Public Static Functions

static inline InIndex getSourceInIndex()
static inline InIndex getDestinationInIndex()
static inline InIndex getNInIndex()
static inline InIndex getSourceOffsetInIndex()
static inline InIndex getDestOffsetInIndex()
static inline OutIndex getOutIndex()
class ShapeOrLikeOp : public popart::Op

Subclassed by popart::RandomBaseOp, popart::ZerosBaseOp

Public Functions

ShapeOrLikeOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, const Op::Settings &settings_)
virtual std::unique_ptr<Op> clone() const override = 0
inline float getSubgraphValue() const override
void validateDataType(DataType dataType, OperatorIdentifier opid)
inline const OptionalDataType &getDataType() const
virtual std::vector<DataType> getSupportedDataTypes() const = 0

Public Static Functions

static OptionalDataType getOptionalDataType(const Attributes &attr, OperatorIdentifier opid)
static inline OutIndex getOutIndex()
static const OpDefinition::DataTypes &likeSupportedInputTypes()
class ShapedDropoutOp : public popart::DropoutBaseOp

Subclassed by popart::ShapedDropoutGradOp

Public Functions

ShapedDropoutOp(const OperatorIdentifier &_opid, float ratio_, const Shape &shape_, const Op::Settings &settings_)
inline const std::vector<int64_t> &getShape() const
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() override
void setup() override
void appendOutlineAttributes(OpSerialiserBase&) const override
class ShapedDropoutGradOp : public popart::ShapedDropoutOp

Public Functions

ShapedDropoutGradOp(const ShapedDropoutOp &fwdOp)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const override
const std::map<int, int> &gradOutToNonGradIn() const override

Public Static Functions

static inline InIndex getGradInIndex()
static inline OutIndex getOutIndex()
class ShrinkGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

ShrinkGradOp(const ShrinkOp&)
std::unique_ptr<Op> clone() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float lambd() const
inline float bias() const
class ShrinkInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

ShrinkInplaceOp(const ShrinkOp&)
std::unique_ptr<Op> clone() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float lambd() const
inline float bias() const
class ShrinkOp : public popart::ElementWiseUnaryOp

Public Functions

ShrinkOp(const OperatorIdentifier &opid, float lambd, float bias, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float lambd() const
inline float bias() const
class SigmoidGradOp : public popart::Op

Public Functions

SigmoidGradOp(const SigmoidOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getFwdOutInIndex()
static inline OutIndex getOutIndex()
class SigmoidInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

SigmoidInplaceOp(const SigmoidOp&)
std::unique_ptr<Op> clone() const final
class SigmoidOp : public popart::ElementWiseUnaryOp

Public Functions

SigmoidOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class SignInplaceOp : public popart::OneWayUnaryInPlaceOp

Public Functions

SignInplaceOp(const SignOp&)
std::unique_ptr<Op> clone() const final
class SignOp : public popart::OneWayUnaryOp

Public Functions

SignOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
inline float getSubgraphValue() const final

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)
class SinGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

SinGradOp(const SinOp &fwdOp)
std::unique_ptr<Op> clone() const final
class SinOp : public popart::ElementWiseUnaryOp

Public Functions

SinOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class SinhGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

SinhGradOp(const SinhOp&)
std::unique_ptr<Op> clone() const final
class SinhInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

SinhInplaceOp(const SinhOp&)
SinhInplaceOp(const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
class SinhOp : public popart::ElementWiseUnaryOp

Public Functions

SinhOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class SliceGradOp : public popart::BasePadOutplaceOp

Public Functions

SliceGradOp(const SliceOp&)
void appendOutlineAttributes(OpSerialiserBase&) const override
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline bool canShard() const override
class SliceInplaceOp : public popart::BaseSliceOp

Public Functions

SliceInplaceOp(const SliceOp&)
SliceInplaceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const std::vector<int64_t> &steps_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
view::Regions aliases(InIndex in, OutIndex) const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class SliceOp : public popart::BaseSliceOp

Subclassed by popart::PadGradOp

Public Functions

SliceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const std::vector<int64_t> &steps_, const Op::Settings &settings_)
SliceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class SoftPlusGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

SoftPlusGradOp(const SoftPlusOp&)
std::unique_ptr<Op> clone() const final
class SoftPlusInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

SoftPlusInplaceOp(const SoftPlusOp&)
std::unique_ptr<Op> clone() const final
class SoftPlusOp : public popart::ElementWiseUnaryOp

Public Functions

SoftPlusOp(const OperatorIdentifier &opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class SoftSignGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

SoftSignGradOp(const SoftSignOp&)
std::unique_ptr<Op> clone() const final
class SoftSignInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

SoftSignInplaceOp(const SoftSignOp&)
std::unique_ptr<Op> clone() const final
class SoftSignOp : public popart::ElementWiseUnaryOp

Public Functions

SoftSignOp(const OperatorIdentifier &opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class SoftmaxGradDirectOp : public popart::Op

Public Functions

SoftmaxGradDirectOp(const TensorId lossId, const nonstd::optional<int> ignoreIndex, const ReductionType reduction, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
void setup() final
bool hasNlllFwdOp() const
Op *nlllFwdOp() const
inline float getSubgraphValue() const final
inline ReductionType getReductionType() const
inline bool hasIgnoreIndex() const
inline nonstd::optional<int> getOptionalIgnoreIndex() const
inline int getIgnoreIndex() const
virtual void appendOutlineAttributes(OpSerialiserBase&) const final

Public Static Functions

static inline InIndex getProbsInIndex()
static inline InIndex getLabelInIndex()
static inline InIndex getGradProbsInIndex()
static inline OutIndex getOutIndex()
class SoftmaxGradOp : public popart::Op

Public Functions

SoftmaxGradOp(const SoftmaxOp&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
int64_t getAxis() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
inline bool canShard() const override

Public Static Functions

static inline InIndex getGradProbsInIndex()
static inline InIndex getProbsInIndex()
static inline OutIndex getOutIndex()
class SoftmaxInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

SoftmaxInplaceOp(const SoftmaxOp&)
std::unique_ptr<Op> clone() const final
inline int64_t getAxis() const
void appendOutlineAttributes(OpSerialiserBase&) const override
class SoftmaxOp : public popart::ElementWiseUnaryOp

Public Functions

SoftmaxOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings&)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
int64_t getAxis() const
void setAxis(int64_t)
void appendOutlineAttributes(OpSerialiserBase&) const override
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class SortOp : public popart::TopKOp

Public Functions

SortOp(const OperatorIdentifier &opid, int64_t axis, int64_t axis_size, bool descending, bool stable, const Op::Settings &settings, const nonstd::optional<float> &available_memory_proportion = nonstd::nullopt)
std::unique_ptr<Op> clone() const override
class SplineBasisOp : public popart::Op

Public Functions

SplineBasisOp(const OperatorIdentifier &opid, int degree, const Op::Settings &settings)
void setup() override
std::unique_ptr<Op> clone() const override
float getSubgraphValue() const override
void appendOutlineAttributes(OpSerialiserBase&) const override
unsigned getDegree() const noexcept

Public Static Functions

static inline constexpr InIndex pseudoIndex() noexcept
static inline constexpr InIndex kernelSizeIndex() noexcept
static inline constexpr InIndex isOpenSplineIndex() noexcept
static inline constexpr OutIndex outBasisIndex() noexcept
static inline constexpr OutIndex outWeightIndexIndex() noexcept
class SplineWeightingOp : public popart::Op

Public Functions

SplineWeightingOp(const OperatorIdentifier &opid, const Op::Settings &settings)
void setup() override
std::unique_ptr<Op> clone() const override
float getSubgraphValue() const override
void appendOutlineAttributes(OpSerialiserBase&) const override

Public Static Functions

static inline constexpr InIndex inputIndex() noexcept
static inline constexpr InIndex weightIndex() noexcept
static inline constexpr InIndex basisIndex() noexcept
static inline constexpr InIndex weightIndexIndex() noexcept
static inline constexpr OutIndex outputIndex() noexcept
class SplitGradOp : public popart::Op

Public Functions

SplitGradOp(const SplitOp&, const Op::Settings&)
void setup() final
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline float getSubgraphValue() const final
inline int64_t getAxis() const

Public Static Functions

static inline OutIndex getOutIndex()
class SplitOp : public popart::Op

Public Functions

SplitOp(const OperatorIdentifier&, int64_t axis_, const std::vector<int64_t> split_, const Op::Settings&)
void setup() final
std::unique_ptr<Op> clone() const final
inline float getSubgraphValue() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<int64_t> getSplitSizes() const
inline int64_t getAxis() const
inline bool canShard() const override

Public Static Functions

static inline InIndex getInIndex()
class SqrtGradOp : public popart::Op

Public Functions

SqrtGradOp(const SqrtOp &fwdOp)
std::unique_ptr<Op> clone() const final
void setup() final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getFwdOutInIndex()
static inline OutIndex getOutIndex()
class SqrtOp : public popart::ElementWiseUnaryOp

Public Functions

SqrtOp(const OperatorIdentifier &_opid, const Op::Settings&)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class SquareOp : public popart::ElementWiseUnaryOp

Public Functions

SquareOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class StashOp : public popart::Op

Public Functions

StashOp(const OperatorIdentifier&, int64_t stashSize_, const Op::Settings&)
std::unique_ptr<Op> clone() const override
void setup() final
int64_t getStashSize()
TensorId getStashedTensorId() const
inline float getSubgraphValue() const final
void appendOutlineAttributes(OpSerialiserBase&) const override
inline bool isOutlineable() const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class SubgraphOp : public popart::Op

Subclassed by popart::CallOp, popart::LoopOp, popart::ScanOp

Public Functions

SubgraphOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
virtual std::unique_ptr<Op> clone() const override = 0
void appendOutlineAttributes(OpSerialiserBase &os) const override
view::Regions modifies(InIndex) const override
view::Regions aliases(InIndex, OutIndex) const override
void addAlias(InIndex in, OutIndex out, view::Chains fwdChains, view::Chains bwdChains)
void adjustAliasInIndices(InIndex fromIn, InIndex toIn)
void adjustAliasOutIndices(OutIndex fromOut, OutIndex toOut)
void adjustModifiedIndices(InIndex fromIn, InIndex toIn)
void addModified(InIndex in, view::Regions regions)
void removeModified(InIndex in)
void removeAlias(InIndex in, OutIndex out)
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
virtual InIndex subgraphInToOpInIndex(InIndex index) const = 0
virtual InIndex opInToSubgraphInIndex(InIndex index) const = 0
virtual OutIndex subgraphOutToOpOutIndex(OutIndex index) const = 0
virtual OutIndex opOutToSubgraphOutIndex(OutIndex index) const = 0
virtual Graph &getCalledGraph() const = 0
std::vector<const Graph*> getCalledGraphs() const override
virtual void setCalledGraph(Graph&) = 0
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex index, std::set<OpId> &visited) const override
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex index, std::set<OpId> &visited) const override
bool hasSideEffect() const override
virtual InIndex opInToSubgraphInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override
virtual InIndex subgraphInToOpInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override
virtual OutIndex opOutToSubgraphOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override
virtual OutIndex subgraphOutToOpOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override
float calcAutoVirtualGraphCost(std::set<int> &inputs_seen) override
virtual void setCalledSubgraphGradInfo(const FwdGraphToBwdGraphInfo &calledGraphsGradInfo) override

Public Static Functions

static bool existsInBodyInputs(std::vector<std::string> &loopBodyInputIds, TensorId &tensorId)
static bool existsInOpInputs(std::vector<std::pair<TensorId, TensorInfo>> &opInputs, TensorId &tensorId)
static std::vector<TensorId> getBodyInputIds(const ONNX_NAMESPACE::GraphProto &bodyProto)
static std::vector<TensorId> getBodyOutputIds(const ONNX_NAMESPACE::GraphProto &bodyProto)
class SubsampleBaseOp : public popart::Op

Subclassed by popart::SubsampleInplaceOp, popart::SubsampleOp

Public Functions

SubsampleBaseOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &strides_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
void setup() override
std::vector<std::unique_ptr<Op>> getGradOps() final
inline std::vector<int64_t> getStrides() const
std::vector<uint32_t> strides_u32() const
bool strideSizeOne() const
void appendOutlineAttributes(OpSerialiserBase&) const override
bool canBeReplacedByIdentity() const override
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
void growAliasModel(AliasModel&) const override
inline float getSubgraphValue() const final

Public Members

std::vector<int64_t> strides

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class SubsampleGradOp : public popart::Op

Public Functions

SubsampleGradOp(const SubsampleBaseOp &fwdOp)
std::unique_ptr<Op> clone() const final
void setup() override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
inline std::vector<int64_t> getStrides() const
std::vector<uint32_t> strides_u32() const
inline const Shape &getFwdInputShape() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class SubsampleInplaceOp : public popart::SubsampleBaseOp

Public Functions

SubsampleInplaceOp(const SubsampleOp&)
std::unique_ptr<Op> clone() const final
inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
view::Regions aliases(InIndex in, OutIndex) const final
class SubsampleOp : public popart::SubsampleBaseOp

Public Functions

SubsampleOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &strides_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
class SubtractArg0GradOp : public popart::ReduceSumOp

Public Functions

SubtractArg0GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)
std::unique_ptr<Op> clone() const final
void setup() final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class SubtractArg1GradOp : public popart::ElementWiseBinaryArg1GradOp

Public Functions

SubtractArg1GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)
std::unique_ptr<Op> clone() const final
class SumArgGradOp : public popart::LinearVariadicGradOp

Public Functions

SumArgGradOp(const SumOp&, InIndex inIndex)
const std::vector<GradInOutMapper> &gradInputInfo() const final
std::unique_ptr<Op> clone() const final
bool canBeReplacedByIdentity() const override
inline bool canShard() const override
class SumOp : public popart::VariadicOp

Public Functions

SumOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
inline bool canShard() const override
class SwishGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

SwishGradOp(const SwishOp&)
std::unique_ptr<Op> clone() const final
class SwishInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

SwishInplaceOp(const SwishOp&)
SwishInplaceOp(const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
class SwishOp : public popart::ElementWiseUnaryOp

Public Functions

SwishOp(const OperatorIdentifier &opid, const Op::Settings &settings)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
class SyncOp : public popart::Op

Public Functions

SyncOp(const Op::Settings&, poplar::SyncType syncType)
std::unique_ptr<Op> clone() const override
const poplar::SyncType &getSyncType() const
inline void setup() final
inline float getSubgraphValue() const final
inline bool hasSideEffect() const override
class TanhGradOp : public popart::Op

Public Functions

TanhGradOp(const TanhOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
inline float getSubgraphValue() const final
inline bool canShard() const override

Public Static Functions

static inline InIndex getGradInIndex()
static inline InIndex getFwdOutInIndex()
static inline OutIndex getOutIndex()
class TanhOp : public popart::ElementWiseUnaryOp

Public Functions

TanhOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
class TensorRemapOp : public popart::Op

Op that creates a new output tensor whose tensor layout is determined by downstream consumers, and then copies the input tensor to the output tensor.

Remapping can improve tile memory liveness when the unremapped tensor's layout is unsuitable for its downstream consumers. It should only be used when actual issues occur, since remapping clones the tensor and can introduce more rearrangement and data copies than necessary.
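A hedged construction sketch inside a custom transform follows. The operator identifier and the TensorRemapType value are assumptions to be checked against tensorremap.hpp; `graph` is a popart::Graph in which "act" and a layout-reference tensor "ref" already exist:

auto *remap = graph.createConnectedOp<popart::TensorRemapOp>(
    {{popart::TensorRemapOp::getInIndex(), "act"},
     {popart::TensorRemapOp::getRefInIndex(), "ref"}},
    {{popart::TensorRemapOp::getOutIndex(), "act_remapped"}},
    popart::Onnx::CustomOperators::TensorRemap_1, // assumed identifier
    popart::TensorRemapType::Fwd,                 // assumed mode value
    popart::Op::Settings(graph, "tensorRemap"));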

Public Functions

TensorRemapOp(const OperatorIdentifier&, const TensorRemapType&, const Op::Settings&)
TensorRemapOp(const TensorRemapOp&)
virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

inline TensorRemapType getTensorRemapType() const
virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.

virtual const std::vector<GradInOutMapper> &gradInputInfo() const final

Get the mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

Returns

The mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

virtual const std::map<int, int> &gradOutToNonGradIn() const final

Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns

The subgraph value. Default: 0.

inline virtual bool isOutlineable() const final

Check if op can be outlined.

If this method returns false, any subgraph that this op is part of will not be cached.

Returns

true if the op can be outlined, false otherwise. Default: true.

Public Static Functions

static inline InIndex getInIndex()
static inline InIndex getRefInIndex()
static inline OutIndex getOutIndex()
class ThresholdedReluGradOp : public popart::ElementWiseNonLinearUnaryGradOp

Public Functions

ThresholdedReluGradOp(const ThresholdedReluOp&)
std::unique_ptr<Op> clone() const final
void appendAttributes(OpSerialiserBase&) const override
inline float getAlpha() const
class ThresholdedReluInplaceOp : public popart::ElementWiseInplaceUnaryOp

Public Functions

ThresholdedReluInplaceOp(const ThresholdedReluOp&)
std::unique_ptr<Op> clone() const final
void appendAttributes(OpSerialiserBase&) const override
inline float getAlpha() const
class ThresholdedReluOp : public popart::ElementWiseUnaryOp

Public Functions

ThresholdedReluOp(const OperatorIdentifier &opid, float _alpha, const Op::Settings &settings)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
void appendAttributes(OpSerialiserBase&) const override
inline float getAlpha() const
class TiedGatherGradOp : public popart::GatherGradOp

Public Functions

TiedGatherGradOp(const TiedGatherOp *fwdOp, int64_t axis)
std::unique_ptr<Op> clone() const final

Public Members

const TiedGatherOp *fwdOp
class TiedGatherOp : public popart::GatherOp

Public Functions

TiedGatherOp(int64_t axis_, const Op::Settings &settings_, const nonstd::optional<float> available_memory_proportion_ = nonstd::nullopt, bool zeroOutOfRangeIndices_ = false)
std::unique_ptr<Op> clone() const final
std::vector<std::unique_ptr<Op>> getGradOps() final
class TileGradOp : public popart::TileOp

Public Functions

TileGradOp(const TileOp&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class TileOp : public popart::Op

Subclassed by popart::TileGradOp

Public Functions

TileOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
TileOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &repeats_, const Shape &outShape_, const Op::Settings &settings_)
std::vector<std::unique_ptr<Op>> getGradOps() final
std::unique_ptr<Op> clone() const override
void setup() final
virtual void connectInTensor(InIndex, TensorId) final
const Shape &getOutShape()
const std::vector<int64_t> &getRepeats() const
bool canBeReplacedByIdentity() const override
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class TopKGradOp : public popart::Op

Public Functions

TopKGradOp(const TopKOp&)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
int64_t getAxis() const
const TensorInfo &getGradOutInfo() const
void appendOutlineAttributes(OpSerialiserBase&) const override
inline float getSubgraphValue() const final
inline nonstd::optional<float> getAvailableMemoryProportion() const

Public Static Functions

static inline InIndex gradInIndex()
static inline InIndex indicesInIndex()
static inline OutIndex gradOutIndex()
class TopKOp : public popart::BaseSortOp

Subclassed by popart::SortOp

Public Functions

TopKOp(const OperatorIdentifier &_opid, int64_t k, int64_t axis, bool largest, bool sorted, const Op::Settings &settings, const nonstd::optional<float> &available_memory_proportion = nonstd::nullopt)
std::unique_ptr<Op> clone() const override
void setup() final
int64_t getK() const noexcept
bool getLargest() const noexcept
bool getSorted() const noexcept
bool getStable() const noexcept
std::vector<std::unique_ptr<Op>> getGradOps() final
void appendOutlineAttributes(OpSerialiserBase&) const final
inline nonstd::optional<float> getAvailableMemoryProportion() const

Public Static Functions

static inline OutIndex getValuesOutIndex()
static inline OutIndex getIndicesOutIndex()
class TransposeBaseOp : public popart::Op

Subclassed by popart::TransposeInplaceOp, popart::TransposeOp

Public Functions

TransposeBaseOp(const OperatorIdentifier &_opid, const Shape &perm_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
void setup() final
inline float getSubgraphValue() const final
inline void setPerm(const Shape &value)
inline const Shape &getPerm() const
std::vector<uint64_t> getPerm_u64() const
view::RegMap fwdRegMap(InIndex, OutIndex) const final
view::RegMap bwdRegMap(InIndex, OutIndex) const final
Shape generateReversePermutation() const
inline bool canShard() const override
int getOutBatchAxis(OutIndex) const override
void growAliasModel(AliasModel&) const override

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class TransposeGradOp : public popart::TransposeOp

Public Functions

TransposeGradOp(const TransposeOp &fwdOp)
std::unique_ptr<Op> clone() const final
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
class TransposeInplaceOp : public popart::TransposeBaseOp

Public Functions

TransposeInplaceOp(const OperatorIdentifier &_opid, const Shape&, const Op::Settings &settings_)
TransposeInplaceOp(const TransposeOp&)
std::unique_ptr<Op> clone() const final
inline view::Regions aliases(InIndex in, OutIndex) const final
inline bool isInplaceViewChange() const override
class TransposeOp : public popart::TransposeBaseOp

Subclassed by popart::TransposeGradOp

Public Functions

TransposeOp(const OperatorIdentifier &_opid, const Shape &perm_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() final
void appendOutlineAttributes(OpSerialiserBase&) const override
bool canBeReplacedByIdentity() const override
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
inline bool isOutplaceViewChange() const override
class UnaryZeroGradOp : public popart::ZerosLikeOp

Public Functions

UnaryZeroGradOp(const OperatorIdentifier &opid_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
inline const std::vector<GradInOutMapper> &gradInputInfo() const
inline const std::map<int, int> &gradOutToNonGradIn() const

Public Static Functions

static std::vector<std::unique_ptr<Op>> getGradOpVector(const Op::Settings &settings_)
class UpsampleOp : public popart::Op

Public Functions

UpsampleOp(const OperatorIdentifier&, const Op::Settings&, UpsampleMode, const std::vector<float> &scales)
std::unique_ptr<Op> clone() const override
void setup() final
inline float getSubgraphValue() const final
void connectInTensor(InIndex inIndex, TensorId tenId) final
inline UpsampleMode getMode() const
inline const std::vector<float> &getScales() const

Public Static Functions

static inline InIndex getInIndex()
static inline OutIndex getOutIndex()
class VariadicGradOp : public popart::Op

Subclassed by popart::LinearVariadicGradOp, popart::NonLinearVariadicGradOp

Public Functions

VariadicGradOp(const OperatorIdentifier &_opid, const VariadicOp&, InIndex)
std::unique_ptr<Op> clone() const override
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
inline InIndex getFwdIndex()
inline const TensorInfo &getFwdInputInfo()
inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()
static inline OutIndex getOutIndex()
class VariadicOp : public popart::Op

Subclassed by popart::MaxOp, popart::MeanOp, popart::MinOp, popart::SumOp

Public Functions

VariadicOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
virtual std::unique_ptr<Op> clone() const override = 0
std::vector<std::unique_ptr<Op>> getGradOps() final
void setup() final
bool canBeReplacedByIdentity() const final
inline float getSubgraphValue() const final

Public Static Functions

static inline OutIndex getOutIndex()
class WhereLhsInplaceOp : public popart::WhereOp

Public Functions

WhereLhsInplaceOp(const WhereOp &op)
std::unique_ptr<Op> clone() const override
view::Regions modifies(InIndex index) const final
view::Regions aliases(InIndex index, OutIndex) const final
class WhereOp : public popart::Op

Subclassed by popart::WhereLhsInplaceOp, popart::WhereRhsInplaceOp

Public Functions

WhereOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
std::vector<std::unique_ptr<Op>> getGradOps() override
void setup() final
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
inline float getSubgraphValue() const final
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel &aliasModel, OperatorIdentifier opId) const override
void growAliasModel(AliasModel &m) const override

Public Static Functions

static inline InIndex conditionInIndex()
static inline InIndex xInIndex()
static inline InIndex yInIndex()
static inline OutIndex outIndex()
class WhereRhsInplaceOp : public popart::WhereOp

Public Functions

WhereRhsInplaceOp(const WhereOp &op)
std::unique_ptr<Op> clone() const override
view::Regions modifies(InIndex index) const final
view::Regions aliases(InIndex index, OutIndex) const final
class WhereXGradOp : public popart::Op

Public Functions

WhereXGradOp(const WhereOp &op)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
inline float getSubgraphValue() const final
std::vector<size_t> getFwdInShape() const

Public Static Functions

static inline InIndex fwdConditionInIndex()
static inline InIndex outGradInIndex()
static inline OutIndex outIndex()
class WhereYGradOp : public popart::Op

Public Functions

WhereYGradOp(const WhereOp &op)
std::unique_ptr<Op> clone() const override
const std::vector<GradInOutMapper> &gradInputInfo() const final
const std::map<int, int> &gradOutToNonGradIn() const final
void setup() final
inline float getSubgraphValue() const final
std::vector<size_t> getFwdInShape() const

Public Static Functions

static inline InIndex fwdConditionInIndex()
static inline InIndex outGradInIndex()
static inline OutIndex outIndex()
class ZerosBaseOp : public popart::ShapeOrLikeOp

Subclassed by popart::ZerosLikeOp, popart::ZerosOp

Public Functions

ZerosBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const override
inline std::vector<DataType> getSupportedDataTypes() const override

Public Static Functions

static std::vector<DataType> supportedDataTypes()
class ZerosLikeOp : public popart::ZerosBaseOp

Subclassed by popart::UnaryZeroGradOp

Public Functions

ZerosLikeOp(const OperatorIdentifier &opid_, const Op::Settings &settings_)
void setup() final
std::unique_ptr<Op> clone() const override
std::unique_ptr<ZerosOp> foldInputTensor(const Op::Settings&) const

Public Static Functions

static inline InIndex getInIndex()
class ZerosOp : public popart::ZerosBaseOp

Public Functions

ZerosOp(const OperatorIdentifier &opid_, const Shape &shape_, const OptionalDataType &dataType_, const Op::Settings &settings_)
std::unique_ptr<Op> clone() const final
void setup() final

14.8.4. Available Ops (Opx class)

class AbortOpx : public popart::popx::Opx

Public Functions

AbortOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AbsOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

AbsOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AccumulateBaseOpx : public popart::popx::VarUpdateOpx

Subclassed by popart::popx::AccumulateOpx, popart::popx::RescaleAccumulateOpx, popart::popx::SparseAccumulateOpx

Public Functions

AccumulateBaseOpx(Op*, Devicex*)
virtual void grow(poplar::program::Sequence&) const override = 0
poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const override
std::set<TensorId> mustExistBeforeCreate(InIndex) const override
InputCreatorType getInputCreatorType(InIndex) const final
bool hasCreatorViewChangers(InIndex index) const final
ViewChangers getCreatorViewChangers(InIndex index) const final
class AccumulateOpx : public popart::popx::AccumulateBaseOpx

Public Functions

AccumulateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AccumulatorScaleOpx : public popart::popx::VarUpdateOpx

Public Functions

AccumulatorScaleOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AdaDeltaUpdaterOpx : public popart::popx::Opx

Public Functions

AdaDeltaUpdaterOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const final
bool hasCreatorViewChangers(InIndex index) const final
ViewChangers getCreatorViewChangers(InIndex index) const final
class AdamUpdaterOpx : public popart::popx::Opx

Public Functions

AdamUpdaterOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AdamVarUpdateOpx : public popart::popx::VarUpdateOpx

Public Functions

AdamVarUpdateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AddArg0GradOpx : public popart::popx::ReduceSumOpx

Public Functions

AddArg0GradOpx(Op*, Devicex*)
class AddArg1GradOpx : public popart::popx::ReduceSumOpx

Public Functions

AddArg1GradOpx(Op*, Devicex*)
class AddBiasBiasGradOpx : public popart::popx::ReduceSumOpx

Public Functions

AddBiasBiasGradOpx(Op*, Devicex*)
class AddBiasDataGradOpx : public popart::popx::Opx

Public Functions

AddBiasDataGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AddBiasInplaceOpx : public popart::popx::AddBiasOpx

Public Functions

AddBiasInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AddBiasOpx : public popart::popx::Opx

Subclassed by popart::popx::AddBiasInplaceOpx

Public Functions

AddBiasOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
std::set<TensorId> mustExistBeforeCreate(int index0) const override
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
class AddLhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx

Public Functions

AddLhsInplaceOpx(Op*, Devicex*)
class AddOpx : public popart::popx::ElementWiseBinaryOutplaceOpx

Public Functions

AddOpx(Op*, Devicex*)
InputCreatorType getInputCreatorType(InIndex) const override
class AddRhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx

Public Functions

AddRhsInplaceOpx(Op*, Devicex*)
class AllReduceOpx : public popart::popx::Opx

Public Functions

AllReduceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const
InputCreatorType getInputCreatorType(int index0) const
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const
view::RegMap unwindRegion(InIndex, OutIndex) const
class AndOpx : public popart::popx::BinaryComparisonOpx

Public Functions

AndOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ArgExtremaOpx : public popart::popx::Opx

Subclassed by popart::popx::ArgMaxOpx, popart::popx::ArgMinOpx

Public Functions

ArgExtremaOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ArgMaxOpx : public popart::popx::ArgExtremaOpx
class ArgMinOpx : public popart::popx::ArgExtremaOpx
class AsinGradOpx : public popart::popx::Opx

Public Functions

AsinGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AsinInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

AsinInplaceOpx(Op*, Devicex*)
class AsinOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

AsinOpx(Op*, Devicex*)
class Atan2LhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx

Public Functions

Atan2LhsInplaceOpx(Op*, Devicex*)
class Atan2Opx : public popart::popx::ElementWiseBinaryOutplaceOpx

Public Functions

Atan2Opx(Op*, Devicex*)
class AtanGradOpx : public popart::popx::Opx

Public Functions

AtanGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class AtanInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

AtanInplaceOpx(Op*, Devicex*)
class AtanOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

AtanOpx(Op*, Devicex*)
class BaseConcatOpx : public popart::popx::Opx

Subclassed by popart::popx::ConcatInplaceOpx, popart::popx::ConcatOpx

Public Functions

BaseConcatOpx(Op*, Devicex*)
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class BaseExpandOpx : public popart::popx::Opx

Subclassed by popart::popx::ExpandInplaceOpx, popart::popx::ExpandOpx

Public Functions

BaseExpandOpx(Op*, Devicex*)
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class BasePadOpx : public popart::popx::Opx

Subclassed by popart::popx::PadInplaceOpx, popart::popx::PadOpx

Public Functions

BasePadOpx(Op*, Devicex*)
const BasePadOp &getBasePadOp() const
poplar::Tensor padGrow(poplar::Tensor inTensor, poplar::program::Sequence&, bool inPlaceAllowed) const
class BaseSliceOpx : public popart::popx::Opx

Subclassed by popart::popx::SliceInplaceOpx, popart::popx::SliceOpx

Public Functions

BaseSliceOpx(Op*, Devicex*)
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class BaseSortOpx : public popart::popx::Opx

Subclassed by popart::popx::TopKOpx

Public Functions

BaseSortOpx(Op*, Devicex*)
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex index) const final
std::set<TensorId> mustExistBeforeCreate(InIndex index0) const final
class BaseWhereOpx : public popart::popx::Opx

Subclassed by popart::popx::WhereLhsInplaceOpx, popart::popx::WhereOpx, popart::popx::WhereRhsInplaceOpx

Public Functions

BaseWhereOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex inIndex) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const override
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class BatchNormGradOpx : public popart::popx::NormOpx

Public Functions

BatchNormGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class BatchNormOpx : public popart::popx::NormOpx

Public Functions

BatchNormOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class BinaryComparisonOpx : public popart::popx::Opx

Subclassed by popart::popx::AndOpx, popart::popx::EqualOpx, popart::popx::GreaterOpx, popart::popx::LessOpx, popart::popx::OrOpx

Public Functions

BinaryComparisonOpx(Op*, Devicex*)
class BitwiseBinaryOpx : public popart::popx::ElementWiseBinaryOpx

Public Functions

BitwiseBinaryOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class BitwiseNotOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

BitwiseNotOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class CallGradOpx : public popart::popx::CallOpx

Public Functions

CallGradOpx(Op*, Devicex*)
class CallOpx : public popart::popx::SubgraphOpx

Subclassed by popart::popx::CallGradOpx

Public Functions

CallOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
void grow(std::vector<poplar::program::Sequence>&) const final
InputCreatorType getInputCreatorType(InIndex) const
class CastGradOpx : public popart::popx::CastOpx

Public Functions

CastGradOpx(Op*, Devicex*)
class CastOpx : public popart::popx::Opx

Subclassed by popart::popx::CastGradOpx

Public Functions

CastOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class CeilInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

CeilInplaceOpx(Op*, Devicex*)
class CeilOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

CeilOpx(Op*, Devicex*)
class ClipGradOpx : public popart::popx::Opx

Public Functions

ClipGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ClipInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

ClipInplaceOpx(Op*, Devicex*)
class ClipOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

ClipOpx(Op*, Devicex*)
class CollectivesBaseOpx : public popart::popx::Opx

Subclassed by popart::popx::MultiCollectiveBaseOpx, popart::popx::ReplicatedAllGatherOpx, popart::popx::ReplicatedAllReduceOpx, popart::popx::ReplicatedReduceScatterOpx

Public Functions

CollectivesBaseOpx(Op*, Devicex*)
ReplicatedTensorShardingGroup getCollectiveLinkedGroup(ReplicatedTensorShardingIndicesIndex groupIndex) const

Function to determine which collective Ops need to be in the same collective linked group.

Ops in the same collective linked group need to use the same collective balanced reorder (CBR) to ensure that the layouts of tensors that interact with each other in the graph are compatible.

Scenarios leading to collective Ops belonging to the same group:

  1. The CollectivesBaseOp::getCollectiveLinkedIndex() is connected to the same root tensor (that is, a tensor A connects, directly or indirectly, to the getCollectiveLinkedIndex() of both a ReduceScatter and an AllGather):

    A -> ReduceScatter -> IdentityOp -> AllGather

  2. The RTS enabled input/output tensors of RTS enabled collective operations meet in the compute graph:

    B -> ReduceScatter -> C -> AllGather -> F -> ReduceScatter -> G
                           \
                            VarUpdateOp
                           /
    D -> ReduceScatter -> E

    C, E and the VarUpdateOp in this graph are replicated tensor sharded (RTS), and therefore both ReduceScatter Ops and the AllGather Op end up in the same collective linked group. B, D, F and G are not sharded, so the ReduceScatter between F and G can be in a different collective linked group.

The primary motivation for collective linked groups is “folding” multiple RTS tensors together, for example via outlining. Folding, in this context, is when two operations or tensors that were previously unique now use the same code or memory, which implies that, for example, their tensor layouts also need to be identical. If the graph has three RTS enabled variables, for example, and two of them use the same VarUpdateOp due to outlining, then we need to ensure that all RTS related Ops connected to those two variables use an identical CBR (collective balanced reorder) rearrangement.

The CBR is set in the collective Ops themselves, either during Opx::unwindTensorLayout, Opx::createInput or Opx::grow, by calling createCollectiveBalancedReorder (see the sketch after the createCollectiveBalancedReorder entry below).

The third variable would use a separate VarUpdateOp, and is therefore in a separate collective linked group and can instantiate its own CBR, even if the tensor shapes match.

getCollectiveLinkedGroup uses the Ops that introduce RTS/CBR (ReduceScatter and AllGather) as a starting point and tracks all associated Ops that propagate RTS with a DFS search on the graph.

Parameters

groupIndex – The index of the rtsIndices for which to return the collective group.

Returns

All linked tensors and their connected Ops, used to coordinate the tensor mapping of collective inputs and outputs.

gcl::CollectiveBalancedReorder *getCollectiveBalancedReorder(ReplicatedTensorShardingIndicesIndex groupIndex) const

Get the existing CBR.

Parameters

groupIndex – The index of the rtsIndices for which to return the collective group.

Returns

The existing CBR for the input/output tensor of the collective Op.

gcl::CollectiveBalancedReorder *createCollectiveBalancedReorder(poplar::Tensor tensor, ReplicatedTensorShardingIndicesIndex groupIndex) const

Create a new CBR instance for the reference tensor.

Parameters
  • tensor – The non-sharded reference tensor.

  • groupIndex – The index of the rtsIndices for which to return the collective group.

Returns

A new CBR for the input/output tensor of the collective Op.
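
As a rough illustration of this pattern, the following sketch shows how a custom Opx derived from CollectivesBaseOpx might reuse the CBR of its collective linked group, or register a new one, while creating an input. MyCollectiveOpx, getReferenceTensor and layoutShardedInput are hypothetical names used only for illustration; only getCollectiveBalancedReorder and createCollectiveBalancedReorder are the members documented above:

// Hypothetical Opx; assumes a single RTS domain, hence group index 0.
poplar::Tensor
MyCollectiveOpx::createInput(InIndex index,
                             const poplar::DebugNameAndId &dnai) const {
  ReplicatedTensorShardingIndicesIndex groupIndex = 0;

  // Reuse the CBR another Op in the same collective linked group may
  // already have registered, so that all members share one rearrangement.
  gcl::CollectiveBalancedReorder *cbr =
      getCollectiveBalancedReorder(groupIndex);

  if (cbr == nullptr) {
    // No CBR yet: derive one from a non-sharded reference tensor.
    // getReferenceTensor is a stand-in for Op-specific logic.
    poplar::Tensor reference = getReferenceTensor();
    cbr = createCollectiveBalancedReorder(reference, groupIndex);
  }

  // layoutShardedInput is a stand-in for laying out the sharded input
  // according to the CBR's rearrangement.
  return layoutShardedInput(cbr, dnai);
}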

class ConcatGradOpx : public popart::popx::Opx

Public Functions

ConcatGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ConcatInplaceOpx : public popart::popx::BaseConcatOpx

Public Functions

ConcatInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ConcatOpx : public popart::popx::BaseConcatOpx

Public Functions

ConcatOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ConvFlipWeightsGradOpx : public popart::popx::Opx

Public Functions

ConvFlipWeightsGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ConvOpx : public popart::popx::MultiConvBaseOpx

Public Functions

ConvOpx(Op*, Devicex*)
poplar::Tensor createWeightsInput(const poplar::DebugNameAndId &dnai, int convIndex) const final
poplar::Tensor createDataInput(const poplar::DebugNameAndId &dnai, int convIndex) const final
InputCreatorType getInputCreatorType(InIndex idx) const override
std::vector<poplar::Tensor> convolve(poplar::program::Sequence&, const std::vector<poplar::Tensor> &weights) const final
class ConvWeightsGradOpx : public popart::popx::MultiConvWeightsGradBaseOpx

Public Functions

ConvWeightsGradOpx(Op*, Devicex*)
std::vector<poplar::Tensor> calculateWeightDeltas(poplar::program::Sequence&) const final
class CopyVarUpdateOpx : public popart::popx::VarUpdateOpx

Public Functions

CopyVarUpdateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const final
class CosOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

CosOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class CtcBeamSearchDecoderOpx : public popart::popx::Opx

Public Functions

CtcBeamSearchDecoderOpx(Op *op, Devicex *device)
~CtcBeamSearchDecoderOpx()
void grow(poplar::program::Sequence &prog) const final
class CtcGradOpx : public popart::popx::Opx

Public Functions

CtcGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class CtcOpx : public popart::popx::Opx

Public Functions

CtcOpx(Op*, Devicex*)
~CtcOpx()
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const override
InputCreatorType getInputCreatorType(InIndex index) const override
std::set<TensorId> mustExistBeforeCreate(InIndex index) const override
class CumSumGradOpx : public popart::popx::Opx

Public Functions

CumSumGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class CumSumOpx : public popart::popx::Opx

Public Functions

CumSumOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class DetachInplaceOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

DetachInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class DetachOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

DetachOpx(popart::Op*, popart::popx::Devicex*)
void grow(poplar::program::Sequence&) const
class DivOpx : public popart::popx::ElementWiseBinaryOpx

Public Functions

DivOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class DropoutOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

DropoutOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
InputCreatorType getInputCreatorType(InIndex) const override
class DynamicAddInplaceOpx : public popart::popx::DynamicAddOpx

Public Functions

inline DynamicAddInplaceOpx(Op *op, Devicex *devicex)
poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const override
class DynamicAddOpx : public popart::popx::DynamicUpdateOpx

Subclassed by popart::popx::DynamicAddInplaceOpx

Public Functions

inline DynamicAddOpx(Op *op, Devicex *devicex)
void grow(poplar::program::Sequence&) const final
class DynamicSliceInplaceOpx : public popart::popx::DynamicSliceOpx

Public Functions

DynamicSliceInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class DynamicSliceOpx : public popart::popx::Opx

Subclassed by popart::popx::DynamicSliceInplaceOpx

Public Functions

DynamicSliceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
InputCreatorType getInputCreatorType(InIndex index) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const final
class DynamicUpdateInplaceOpx : public popart::popx::DynamicUpdateOpx

Public Functions

DynamicUpdateInplaceOpx(Op*, Devicex*)
poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const override
class DynamicUpdateOpx : public popart::popx::Opx

Subclassed by popart::popx::DynamicAddOpx, popart::popx::DynamicUpdateInplaceOpx

Public Functions

DynamicUpdateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
InputCreatorType getInputCreatorType(InIndex index) const override
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const final
virtual poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const
class DynamicZeroInplaceOpx : public popart::popx::DynamicZeroOpx

Public Functions

inline DynamicZeroInplaceOpx(Op *op, Devicex *devicex)
poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const override
class DynamicZeroOpx : public popart::popx::Opx

Subclassed by popart::popx::DynamicZeroInplaceOpx

Public Functions

inline DynamicZeroOpx(Op *op, Devicex *devicex)
void grow(poplar::program::Sequence&) const override
InputCreatorType getInputCreatorType(InIndex index) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
virtual poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const
class ElementWiseBinaryInplaceOpx : public popart::popx::ElementWiseBinaryOpx

Subclassed by popart::popx::AddLhsInplaceOpx, popart::popx::AddRhsInplaceOpx, popart::popx::Atan2LhsInplaceOpx, popart::popx::MulLhsInplaceOpx, popart::popx::MulRhsInplaceOpx, popart::popx::PowLhsInplaceOpx

Public Functions

inline ElementWiseBinaryInplaceOpx(Op *op, Devicex *devx, std::unique_ptr<EwbComputex> cx_)
void grow(poplar::program::Sequence&) const final
class ElementWiseBinaryOpx : public popart::popx::Opx

Subclassed by popart::popx::BitwiseBinaryOpx, popart::popx::DivOpx, popart::popx::ElementWiseBinaryInplaceOpx, popart::popx::ElementWiseBinaryOutplaceOpx, popart::popx::FmodOpx, popart::popx::PReluOpx, popart::popx::SubtractOpx

Public Functions

ElementWiseBinaryOpx(Op*, Devicex*)
InputCreatorType getInputCreatorType(InIndex) const override
std::set<TensorId> mustExistBeforeCreate(InIndex) const override
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const override
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const override
view::RegMap unwindRegion(InIndex, OutIndex) const override
class ElementWiseBinaryOutplaceOpx : public popart::popx::ElementWiseBinaryOpx

Subclassed by popart::popx::AddOpx, popart::popx::Atan2Opx, popart::popx::MulOpx, popart::popx::PowOpx

Public Functions

inline ElementWiseBinaryOutplaceOpx(Op *op, Devicex *devx, std::unique_ptr<EwbComputex> cx_)
void grow(poplar::program::Sequence&) const final
class ElementWiseUnaryInplaceOpx : public popart::popx::ElementWiseUnaryOpx

Subclassed by popart::popx::AsinInplaceOpx, popart::popx::AtanInplaceOpx, popart::popx::CeilInplaceOpx, popart::popx::ClipInplaceOpx, popart::popx::EluInplaceOpx, popart::popx::ExpInplaceOpx, popart::popx::Expm1InplaceOpx, popart::popx::FloorInplaceOpx, popart::popx::GeluErfInplaceOpx, popart::popx::GeluInplaceOpx, popart::popx::HardSigmoidInplaceOpx, popart::popx::IncrementModInplaceOpx, popart::popx::LeakyReluInplaceOpx, popart::popx::Log1pInplaceOpx, popart::popx::LogSoftmaxInplaceOpx, popart::popx::NearbyIntInplaceOpx, popart::popx::ReluInplaceOpx, popart::popx::RoundInplaceOpx, popart::popx::ScaleInplaceOpx, popart::popx::SeluInplaceOpx, popart::popx::ShrinkInplaceOpx, popart::popx::SigmoidInplaceOpx, popart::popx::SignInplaceOpx, popart::popx::SinhInplaceOpx, popart::popx::SoftmaxInplaceOpx, popart::popx::SoftPlusInplaceOpx, popart::popx::SoftSignInplaceOpx, popart::popx::SwishInplaceOpx, popart::popx::ThresholdedReluInplaceOpx

Public Functions

inline ElementWiseUnaryInplaceOpx(Op *op, Devicex *devx, std::unique_ptr<EwuComputex> cx_)
void grow(poplar::program::Sequence &prog) const final
class ElementWiseUnaryOpx : public popart::popx::Opx

Subclassed by popart::popx::AbsOpx, popart::popx::BitwiseNotOpx, popart::popx::CosOpx, popart::popx::DetachInplaceOpx, popart::popx::DetachOpx, popart::popx::DropoutOpx, popart::popx::ElementWiseUnaryInplaceOpx, popart::popx::ElementWiseUnaryOutplaceOpx, popart::popx::ErfxGradOpx, popart::popx::ErfxOpx, popart::popx::IdentityGradOpx, popart::popx::IdentityOpx, popart::popx::IsInfx, popart::popx::IsNaNx, popart::popx::LogOpx, popart::popx::LogSoftmaxGradOpx, popart::popx::MeanOpx, popart::popx::NegateGradOpx, popart::popx::NegateOpx, popart::popx::NotOpx, popart::popx::ReciprocalOpx, popart::popx::SigmoidGradOpx, popart::popx::SinOpx, popart::popx::SoftmaxGradOpx, popart::popx::SqrtOpx, popart::popx::SquareOpx

Public Functions

ElementWiseUnaryOpx(Op*, Devicex*)
InputCreatorType getInputCreatorType(InIndex) const override
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const override
view::RegMap unwindRegion(InIndex, OutIndex) const override
class ElementWiseUnaryOutplaceOpx : public popart::popx::ElementWiseUnaryOpx

Subclassed by popart::popx::AsinOpx, popart::popx::AtanOpx, popart::popx::CeilOpx, popart::popx::ClipOpx, popart::popx::EluOpx, popart::popx::Expm1Opx, popart::popx::ExpOpx, popart::popx::FloorOpx, popart::popx::GeluErfOpx, popart::popx::GeluOpx, popart::popx::HardSigmoidOpx, popart::popx::IncrementModOpx, popart::popx::LeakyReluOpx, popart::popx::Log1pOpx, popart::popx::LogSoftmaxOpx, popart::popx::NearbyIntOpx, popart::popx::ReluOpx, popart::popx::RoundOpx, popart::popx::ScaleOpx, popart::popx::SeluOpx, popart::popx::ShrinkOpx, popart::popx::SigmoidOpx, popart::popx::SignOpx, popart::popx::SinhOpx, popart::popx::SoftmaxOpx, popart::popx::SoftPlusOpx, popart::popx::SoftSignOpx, popart::popx::SwishOpx, popart::popx::ThresholdedReluOpx

Public Functions

ElementWiseUnaryOutplaceOpx(Op*, Devicex*, std::unique_ptr<EwuComputex> cx_)
void grow(poplar::program::Sequence&) const final
class EluGradOpx : public popart::popx::Opx

Public Functions

EluGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class EluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

EluInplaceOpx(Op*, Devicex*)
class EluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

EluOpx(Op*, Devicex*)
class EqualOpx : public popart::popx::BinaryComparisonOpx

Public Functions

EqualOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ErfxGradOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

ErfxGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ErfxOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

ErfxOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ExchangeBaseOpx : public popart::popx::Opx

Subclassed by popart::popx::HostBaseOpx, popart::popx::MultiExchangeOpx, popart::popx::RemoteBaseOpx, popart::popx::RemoteCodeLoadOpx

Public Functions

ExchangeBaseOpx(Op*, Devicex*)
inline std::set<TensorId> mustExistBeforeCreate(int) const override
class ExpInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

ExpInplaceOpx(Op*, Devicex*)
class ExpOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

ExpOpx(Op*, Devicex*)
class ExpandGradOpx : public popart::popx::Opx

Public Functions

ExpandGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ExpandInplaceOpx : public popart::popx::BaseExpandOpx

Public Functions

ExpandInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ExpandOpx : public popart::popx::BaseExpandOpx

Public Functions

ExpandOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class Expm1InplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

Expm1InplaceOpx(Op*, Devicex*)
class Expm1Opx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

Expm1Opx(Op*, Devicex*)
class FloorInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

FloorInplaceOpx(Op*, Devicex*)
class FloorOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

FloorOpx(Op*, Devicex*)
class FmodOpx : public popart::popx::ElementWiseBinaryOpx

Public Functions

FmodOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class GRUGradOpx : public popart::popx::Opx

Public Functions

GRUGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class GRUOpx : public popart::popx::Opx

Public Functions

GRUOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const

Public Static Functions

static poplar::Tensor reshapePoplibWeightsForOnnx(poplar::Tensor)
static poplar::Tensor reshapePoplibBiasesForOnnx(poplar::Tensor)
class GatherBaseOpx : public popart::popx::Opx

Subclassed by popart::popx::GatherOpx, popart::popx::TiedGatherOpx

Public Functions

GatherBaseOpx(Op*, Devicex*)
virtual void grow(poplar::program::Sequence&) const override = 0
virtual poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const override = 0
virtual InputCreatorType getInputCreatorType(int index0) const override = 0
std::set<TensorId> mustExistBeforeCreate(InIndex) const final
class GatherGradOpx : public popart::popx::Opx

Public Functions

GatherGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex index) const final
inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final

Public Static Functions

static std::tuple<poplar::Tensor, poplar::Tensor, poplar::Tensor> handleNDMultiUpdate(poplar::Tensor target, poplar::Tensor update, poplar::Tensor indices, int64_t axis, int64_t group_size)
class GatherOpx : public popart::popx::GatherBaseOpx

Public Functions

GatherOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(int index, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(int index) const final
class GeluGradOpx : public popart::popx::Opx

Public Functions

GeluGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class GeluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

GeluInplaceOpx(Op*, Devicex*)
class GeluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

GeluOpx(Op*, Devicex*)
class GeluErfGradOpx : public popart::popx::Opx

Public Functions

GeluErfGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class GeluErfInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

GeluErfInplaceOpx(Op*, Devicex*)
class GeluErfOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

GeluErfOpx(Op*, Devicex*)
class GetRandomSeedOpx : public popart::popx::Opx

Public Functions

GetRandomSeedOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class GreaterOpx : public popart::popx::BinaryComparisonOpx

Public Functions

GreaterOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class GroupNormGradOpx : public popart::popx::NormOpx

Public Functions

GroupNormGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class GroupNormOpx : public popart::popx::NormOpx

Public Functions

GroupNormOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class HardSigmoidGradOpx : public popart::popx::Opx

Public Functions

HardSigmoidGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class HardSigmoidInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

HardSigmoidInplaceOpx(Op*, Devicex*)
class HardSigmoidOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

HardSigmoidOpx(Op*, Devicex*)
class HistogramOpx : public popart::popx::Opx

Public Functions

HistogramOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class HostBaseOpx : public popart::popx::ExchangeBaseOpx

Subclassed by popart::popx::HostLoadOpx, popart::popx::HostStoreOpx

Public Functions

HostBaseOpx(Op*, Devicex*)
class HostLoadInplaceOpx : public popart::popx::HostLoadOpx

Public Functions

HostLoadInplaceOpx(Op*, Devicex*)
class HostLoadOpx : public popart::popx::HostBaseOpx

Subclassed by popart::popx::HostLoadInplaceOpx

Public Functions

HostLoadOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class HostStoreOpx : public popart::popx::HostBaseOpx

Public Functions

HostStoreOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex) const final
class IdentityGradOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

IdentityGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class IdentityInplaceOpx : public popart::popx::Opx

Public Functions

IdentityInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class IdentityLossGradOpx : public popart::popx::Opx

Public Functions

IdentityLossGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
inline bool outputCreatedExternally(OutIndex) const final
class IdentityLossOpx : public popart::popx::Opx

Public Functions

IdentityLossOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const override
view::RegMap unwindRegion(InIndex, OutIndex) const override
class IdentityOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

IdentityOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class IfGradOpx : public popart::popx::IfOpx

Public Functions

IfGradOpx(Op*, Devicex*)
class IfOpx : public popart::popx::Opx

Subclassed by popart::popx::IfGradOpx

Public Functions

IfOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class IncrementModInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

IncrementModInplaceOpx(Op*, Devicex*)
class IncrementModOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

IncrementModOpx(Op*, Devicex*)
class InitOpx : public popart::popx::Opx

Public Functions

InitOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
inline bool outputCreatedExternally(OutIndex) const final
class InstanceNormGradOpx : public popart::popx::NormOpx

Public Functions

InstanceNormGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class InstanceNormOpx : public popart::popx::NormOpx

Public Functions

InstanceNormOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class IoTileCopyOpx : public popart::popx::Opx

Public Functions

IoTileCopyOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex index) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
inline bool outputCreatedExternally(OutIndex) const final
class IpuCopyOpx : public popart::popx::Opx

Public Functions

IpuCopyOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
PreparedCopyTensors createPipelinedOutput() const
void growPipelined(poplar::program::Sequence&, PreparedCopyTensors) const
inline InputCreatorType getInputCreatorType(InIndex index) const final
inline bool canUnwind(InIndex in, OutIndex out) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
poplar::Graph &srcGraph(InIndex) const final
poplar::Graph &dstGraph(OutIndex) const final
class L1GradOpx : public popart::popx::Opx

Public Functions

L1GradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class L1Opx : public popart::popx::Opx

Public Functions

L1Opx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const override
view::RegMap unwindRegion(InIndex, OutIndex) const override
class LRNGradOpx : public popart::popx::Opx

Public Functions

LRNGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class LRNOpx : public popart::popx::Opx

Public Functions

LRNOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class LSTMGradOpx : public popart::popx::Opx

Public Functions

LSTMGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class LSTMOpx : public popart::popx::Opx

Public Functions

LSTMOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const
popnn::lstm::LstmParams createLSTMParams() const

Public Static Functions

static poplar::Tensor reshapePoplibWeightsForOnnx(poplar::Tensor, bool transpose)
class LambSquareOpx : public popart::popx::Opx

Public Functions

LambSquareOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class LeakyReluGradOpx : public popart::popx::Opx

Public Functions

LeakyReluGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class LeakyReluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

LeakyReluInplaceOpx(Op*, Devicex*)
class LeakyReluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

LeakyReluOpx(Op*, Devicex*)
class LessOpx : public popart::popx::BinaryComparisonOpx

Public Functions

LessOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class Log1pInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

Log1pInplaceOpx(Op*, Devicex*)
class Log1pOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

Log1pOpx(Op*, Devicex*)
class LogOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

LogOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class LogSoftmaxGradOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

LogSoftmaxGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor cloneNcopyGrouped(poplar::program::Sequence &s, const poplar::Tensor &t) const
class LogSoftmaxInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

LogSoftmaxInplaceOpx(Op*, Devicex*)
class LogSoftmaxOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

LogSoftmaxOpx(Op*, Devicex*)
class LoopOpx : public popart::popx::SubgraphOpx

Public Functions

LoopOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const
bool canUnwind(InIndex in, OutIndex out) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class LossScaleUpdateOpx : public popart::popx::Opx

Public Functions

LossScaleUpdateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class MatMulOpx : public popart::popx::Opx

Public Functions

MatMulOpx(Op*, Devicex*)
~MatMulOpx() override = default
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex index) const final
std::set<TensorId> mustExistBeforeCreate(InIndex index0) const final
MatMulOp *getMatMulOp() const
void grow(poplar::program::Sequence&) const final
poplar::Type getOutputType(const poplar::Tensor &output) const
void verifyCacheSizeUnchanged(size_t beforeCacheSize) const

Public Static Functions

static std::vector<std::size_t> onnxShapeToPoplar(const Shape &shape)
static void appendPoplarOptionsForOp(const MatMulBaseOp &op, poplar::OptionFlags &opts)
static void addPartialsType(const MatMulPartialsType &partialsType, poplar::OptionFlags &opts)
static std::pair<poplar::Tensor, poplar::Tensor> groupedMatMulInputsFromOpxInputs(MatMulBaseOp &matmul, poplar::Tensor lhs, poplar::Tensor rhs)
class MaxArgGradOpx : public popart::popx::Opx

Public Functions

MaxArgGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class MaxOpx : public popart::popx::Opx

Public Functions

MaxOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const override
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const override
view::RegMap unwindRegion(InIndex, OutIndex) const override
class MeanArgGradOpx : public popart::popx::Opx

Public Functions

MeanArgGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class MeanOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

MeanOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class MinArgGradOpx : public popart::popx::Opx

Public Functions

MinArgGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class MinOpx : public popart::popx::Opx

Public Functions

MinOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const override
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const override
view::RegMap unwindRegion(InIndex, OutIndex) const override
class ModifyRandomSeedOpx : public popart::popx::Opx

Public Functions

ModifyRandomSeedOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class MulLhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx

Public Functions

MulLhsInplaceOpx(Op*, Devicex*)
class MulOpx : public popart::popx::ElementWiseBinaryOutplaceOpx

Public Functions

MulOpx(Op*, Devicex*)
class MulRhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx

Public Functions

MulRhsInplaceOpx(Op*, Devicex*)
class MultiCollectiveBaseOpx : public popart::popx::CollectivesBaseOpx

A base class for the lowering of different subclasses of MultiCollectiveBaseOp.

Each output tensor can be grown separately.

Subclassed by popart::popx::MultiReplicatedAllGatherOpx, popart::popx::MultiReplicatedAllReduceOpx, popart::popx::MultiReplicatedReduceScatterOpx

Public Functions

MultiCollectiveBaseOpx(Op *op, Devicex *devicex)
std::set<OpxGrowPartId> getInGrowPartIds(Tensor *inTensor) const override

Defines which “parts” use a particular input tensor. There are “output->n()” parts in the collective operation: part “i” uses input “i” and the indices tensor at “i + output->n()”. This logic is the same for all collective Ops, even in the absence of an indices tensor.

Parameters

inTensor – The tensor for which to return a part id.

OpxGrowPartId getOutGrowPartId(Tensor *outTensor) const override

Defines which “part” is responsible for constructing a particular output. There are “output->n()” parts: each part “i” produces output “i”.

Parameters

outTensor – The tensor for which to return the corresponding part id.
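
The index arithmetic that both functions describe can be restated in a few lines of plain C++. This is an illustration of the documented rule only, not PopART code; n stands for output->n():

#include <cstddef>
#include <set>

// Data input i (i < n) belongs to part i; the indices input at
// index i + n belongs to the same part i.
std::set<std::size_t> inGrowPartIds(std::size_t inIndex, std::size_t n) {
  return {inIndex < n ? inIndex : inIndex - n};
}

// Each part i is responsible for producing output i.
std::size_t outGrowPartId(std::size_t outIndex) { return outIndex; }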

class MultiConvBaseOpx : public popart::popx::Opx

Subclassed by popart::popx::ConvOpx, popart::popx::MultiConvOpx

Public Functions

inline MultiConvBaseOpx(Op *op, Devicex *dv)
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
std::set<TensorId> mustExistBeforeCreate(InIndex index0) const final
InputCreatorType getInputCreatorType(InIndex) const override
void grow(poplar::program::Sequence&) const final
poplar::OptionFlags getConvOptions(int, std::string pass = "") const
std::string getFwdPassFlagString() const
inline virtual std::vector<poplar::Tensor> convolve(poplar::program::Sequence &prog, const std::vector<poplar::Tensor> &weights) const
inline virtual poplar::Tensor createDataInput(const poplar::DebugNameAndId &dnai, int convIndex) const
inline virtual poplar::Tensor createWeightsInput(const poplar::DebugNameAndId &dnai, int convIndex) const
bool isWeightsInIndex(InIndex) const
bool isDataInIndex(InIndex) const
void verifyCacheSizeUnchanged(size_t beforeCacheSize) const
class MultiConvOpx : public popart::popx::MultiConvBaseOpx

Public Functions

MultiConvOpx(Op*, Devicex*)
poplar::Tensor createWeightsInput(const poplar::DebugNameAndId &dnai, int convIndex) const final
poplar::Tensor createDataInput(const poplar::DebugNameAndId &dnai, int convIndex) const final
std::vector<poplar::Tensor> convolve(poplar::program::Sequence&, const std::vector<poplar::Tensor>&) const final
class MultiConvWeightsGradBaseOpx : public popart::popx::Opx

Subclassed by popart::popx::ConvWeightsGradOpx, popart::popx::MultiConvWeightsGradOpx

Public Functions

inline MultiConvWeightsGradBaseOpx(Op *op, Devicex *dv)
void grow(poplar::program::Sequence&) const final
inline virtual std::vector<poplar::Tensor> calculateWeightDeltas(poplar::program::Sequence&) const
poplar::OptionFlags getConvOptions(int convIndex = 0) const
void verifyCacheSizeUnchanged(size_t beforeCacheSize) const
class MultiConvWeightsGradOpx : public popart::popx::MultiConvWeightsGradBaseOpx

Public Functions

MultiConvWeightsGradOpx(Op*, Devicex*)
std::vector<poplar::Tensor> calculateWeightDeltas(poplar::program::Sequence&) const final
class MultiExchangeOpx : public popart::popx::ExchangeBaseOpx

Public Functions

MultiExchangeOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
std::vector<std::pair<int, int>> getSegments() const
std::set<OpxGrowPartId> getInGrowPartIds(Tensor *inTensor) const final
OpxGrowPartId getOutGrowPartId(Tensor *outTensor) const final
void growPart(OpxGrowPartId id) const final
InputCreatorType getInputCreatorType(InIndex index) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
bool canUnwind(InIndex, OutIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class MultiReplicatedAllReduceOpx : public popart::popx::MultiCollectiveBaseOpx

Lowers the MultiReplicatedAllReduceOp to Poplar by growing each individual output tensor and performing a to-destination all-reduce on a concatenation of the input tensors.

Mixing in-place and outplace all-reduce operations is supported; a conceptual sketch follows the member list below.

Public Functions

MultiReplicatedAllReduceOpx(popart::Op *op, Devicex *devicex)
InputCreatorType getInputCreatorType(InIndex) const override
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex in, OutIndex out) const override
view::RegMap unwindRegion(InIndex, OutIndex) const override
void growPart(OpxGrowPartId id) const override
void grow(poplar::program::Sequence &prog) const override
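
As a conceptual outline of that strategy (not the actual implementation; inputs, prog and allReduceToDestination are placeholders for the Op's real inputs, the growing Poplar program, and the GCL collective that performs the reduction):

// Flatten and concatenate all inputs so that a single cross-replica
// collective handles every logical all-reduce at once.
std::vector<poplar::Tensor> flat;
flat.reserve(inputs.size());
for (const poplar::Tensor &t : inputs) {
  flat.push_back(t.flatten());
}
poplar::Tensor fused = poplar::concat(flat, /*dimension=*/0);

// Placeholder for the to-destination all-reduce: destination views that
// alias an input give in-place behaviour, fresh tensors give outplace
// behaviour, so the two can be mixed freely.
allReduceToDestination(fused, prog);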
class NegateGradOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

NegateGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class NegateOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

NegateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class NllGradOpx : public popart::popx::Opx

Public Functions

NllGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class NllOpx : public popart::popx::Opx

Public Functions

NllOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final

Public Static Functions

static void flattenAndEncodeOneHot(const Opx &opx, poplar::program::Sequence &prog, const poplar::Tensor &probs, const poplar::Tensor &label, poplar::Tensor &probs2D, poplar::Tensor &label1D, poplar::Tensor &oneHot)
static poplar::Tensor applyMaskInPlaceForIgnoredIndex(const Opx &opx, poplar::Tensor t, poplar::Tensor labels, int ignoreIndex, poplar::program::Sequence &prog)
static void applyScalingInPlaceForMeanReduction(const Opx &opx, poplar::Tensor t, poplar::Tensor scale, poplar::program::Sequence &prog)
static void applyScalingInPlaceForMeanReductionWithIgnoreIndex(const Opx &opx, poplar::Tensor t, poplar::Tensor scale, poplar::Tensor mask, poplar::program::Sequence &prog)
static void handleLossGradScaling(const Opx &opx, bool hasIgnoreIndex, int64_t ignoreIndex, bool meanReduce, poplar::Tensor &oneHot, poplar::Tensor &gradIn, poplar::Tensor &label1D, poplar::program::Sequence &prog)
static void handleLossOutReducedToScalar(const Opx &opx, bool hasIgnoreIndex, int64_t ignoreIndex, bool meanReduce, poplar::Tensor &reduction, poplar::Tensor &label1D, poplar::program::Sequence &prog, const OutIndex outIdx)
static void handleLossOutNotReducedToScalar(const Opx &opx, poplar::Tensor &reduction, const poplar::Tensor &label, poplar::Tensor &label1D, poplar::program::Sequence &prog)
class NlllWithSoftmaxGradDirectOpx : public popart::popx::Opx

Public Functions

NlllWithSoftmaxGradDirectOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class NopOpx : public popart::popx::Opx

Public Functions

NopOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class NormOpx : public popart::popx::Opx

Subclassed by popart::popx::BatchNormGradOpx, popart::popx::BatchNormOpx, popart::popx::GroupNormGradOpx, popart::popx::GroupNormOpx, popart::popx::InstanceNormGradOpx, popart::popx::InstanceNormOpx

Public Functions

NormOpx(Op*, Devicex*)
class NotOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

NotOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class NormalizeImageOpx : public popart::popx::Opx

Public Functions

NormalizeImageOpx(popart::Op *op, popart::popx::Devicex *devicex)
poplar::Tensor createInput(popart::InIndex index, const poplar::DebugNameAndId &dnai) const override
popart::popx::InputCreatorType getInputCreatorType(popart::InIndex index) const override
std::set<popart::TensorId> mustExistBeforeCreate(popart::InIndex) const override
poplar::Tensor createNormalizedImageInput(const poplar::DebugNameAndId &dnai) const
void grow(poplar::program::Sequence &prog) const final
class OnehotGradOpx : public popart::popx::Opx

Public Functions

OnehotGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class OnehotOpx : public popart::popx::Opx

Public Functions

OnehotOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class OrOpx : public popart::popx::BinaryComparisonOpx

Public Functions

OrOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class PReluOpx : public popart::popx::ElementWiseBinaryOpx

Public Functions

PReluOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class PadGradOpx : public popart::popx::SliceOpx

Public Functions

PadGradOpx(Op*, Devicex*)
class PadInplaceOpx : public popart::popx::BasePadOpx

Public Functions

PadInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class PadOpx : public popart::popx::BasePadOpx

Public Functions

PadOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
template<typename LSTMOP>
class PopartLSTMOpxBase : public popart::popx::Opx

Subclassed by popart::popx::PopartLSTMGradOpx, popart::popx::PopartLSTMOpx

Public Functions

inline PopartLSTMOpxBase(Op *op, Devicex *devicex)
class PowLhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx

Public Functions

PowLhsInplaceOpx(Op*, Devicex*)
class PowOpx : public popart::popx::ElementWiseBinaryOutplaceOpx

Public Functions

PowOpx(Op*, Devicex*)
class PrintTensorOpx : public popart::popx::Opx

Public Functions

PrintTensorOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class RMSPropUpdaterOpx : public popart::popx::Opx

Public Functions

RMSPropUpdaterOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class RNNGradOpx : public popart::popx::Opx

Public Functions

RNNGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const
class RNNOpx : public popart::popx::Opx

Public Functions

RNNOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const
class RandomNormalOpx : public popart::popx::Opx

Public Functions

RandomNormalOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class RandomUniformOpx : public popart::popx::Opx

Public Functions

RandomUniformOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ReciprocalOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

ReciprocalOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ReduceL1GradOpx : public popart::popx::Opx

Public Functions

ReduceL1GradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceL1Opx : public popart::popx::Opx

Public Functions

ReduceL1Opx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceL2GradOpx : public popart::popx::Opx

Public Functions

ReduceL2GradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceL2Opx : public popart::popx::Opx

Public Functions

ReduceL2Opx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceLogSumExpGradOpx : public popart::popx::Opx

Public Functions

ReduceLogSumExpGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceLogSumExpOpx : public popart::popx::Opx

Public Functions

ReduceLogSumExpOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceLogSumGradOpx : public popart::popx::Opx

Public Functions

ReduceLogSumGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceLogSumOpx : public popart::popx::Opx

Public Functions

ReduceLogSumOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceMaxGradOpx : public popart::popx::Opx

Public Functions

ReduceMaxGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceMaxOpx : public popart::popx::Opx

Public Functions

ReduceMaxOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceMeanGradOpx : public popart::popx::Opx

Public Functions

ReduceMeanGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceMeanOpx : public popart::popx::Opx

Public Functions

ReduceMeanOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceMedianGradOpx : public popart::popx::Opx

Public Functions

ReduceMedianGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceMedianOpx : public popart::popx::Opx

Public Functions

ReduceMedianOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceMinGradOpx : public popart::popx::Opx

Public Functions

ReduceMinGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceMinOpx : public popart::popx::Opx

Public Functions

ReduceMinOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceProdGradOpx : public popart::popx::Opx

Public Functions

ReduceProdGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceProdOpx : public popart::popx::Opx

Public Functions

ReduceProdOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceSumGradOpx : public popart::popx::Opx

Public Functions

ReduceSumGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceSumOpx : public popart::popx::Opx

Subclassed by popart::popx::AddArg0GradOpx, popart::popx::AddArg1GradOpx, popart::popx::AddBiasBiasGradOpx, popart::popx::SubtractArg0GradOpx

Public Functions

ReduceSumOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceSumSquareGradOpx : public popart::popx::Opx

Public Functions

ReduceSumSquareGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReduceSumSquareOpx : public popart::popx::Opx

Public Functions

ReduceSumSquareOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ReluGradOpx : public popart::popx::Opx

Public Functions

ReluGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ReluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

ReluInplaceOpx(Op*, Devicex*)
class ReluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

ReluOpx(Op*, Devicex*)
class RemoteBaseOpx : public popart::popx::ExchangeBaseOpx

Subclassed by popart::popx::RemoteLoadOpx, popart::popx::RemoteStoreOpx

Public Functions

RemoteBaseOpx(Op*, Devicex*)
class RemoteLoadInplaceOpx : public popart::popx::RemoteLoadOpx

Public Functions

RemoteLoadInplaceOpx(Op*, Devicex*)
class RemoteLoadOpx : public popart::popx::RemoteBaseOpx

Subclassed by popart::popx::RemoteLoadInplaceOpx

Public Functions

RemoteLoadOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex index) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class RemoteStoreOpx : public popart::popx::RemoteBaseOpx

Public Functions

RemoteStoreOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ReplicatedAllGatherOpx : public popart::popx::CollectivesBaseOpx

Public Functions

ReplicatedAllGatherOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex index) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
std::set<TensorId> mustExistBeforeCreate(InIndex index0) const final
bool hasCreatorViewChangers(InIndex index) const final
ViewChangers getCreatorViewChangers(InIndex index) const final
class ReplicatedAllReduceInplaceOpx : public popart::popx::ReplicatedAllReduceOpx

Public Functions

ReplicatedAllReduceInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ReplicatedAllReduceOpx : public popart::popx::CollectivesBaseOpx

Subclassed by popart::popx::ReplicatedAllReduceInplaceOpx

Public Functions

ReplicatedAllReduceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
InputCreatorType getInputCreatorType(InIndex index) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class ReplicatedReduceScatterOpx : public popart::popx::CollectivesBaseOpx

Public Functions

ReplicatedReduceScatterOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex) const final
DnfTensorIds mustExistBeforeCreateDNF(InIndex index0) const final
bool hasCreatorViewChangers(InIndex index) const final
ViewChangers getCreatorViewChangers(InIndex index) const final
class RescaleAccumulateOpx : public popart::popx::AccumulateBaseOpx

Public Functions

RescaleAccumulateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ReshapeBaseOpx : public popart::popx::Opx

Subclassed by popart::popx::ReshapeInplaceOpx, popart::popx::ReshapeOpx

Public Functions

ReshapeBaseOpx(Op*, Devicex*)
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final
view::RegMap unwindRegion(InIndex inIndex, OutIndex outIndex) const final
class ReshapeGradOpx : public popart::popx::ReshapeOpx

Public Functions

ReshapeGradOpx(Op*, Devicex*)
class ReshapeInplaceOpx : public popart::popx::ReshapeBaseOpx

Public Functions

ReshapeInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ReshapeOpx : public popart::popx::ReshapeBaseOpx

Subclassed by popart::popx::ReshapeGradOpx

Public Functions

ReshapeOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ResizeGradOpx : public popart::popx::Opx

Public Functions

ResizeGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ResizeOpx : public popart::popx::Opx

Public Functions

ResizeOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
template<typename Derived>
class RestoreBaseOpx : public popart::popx::Opx

Base class for restore opxs.

Template Parameters

Derived – The subclass of RestoreBaseOpx. It must define a type alias OpType naming the Op that it corresponds to.

Public Functions

RestoreBaseOpx(Op *op, Devicex *devicex)
virtual void grow(poplar::program::Sequence&) const = 0
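
For illustration, a hypothetical subclass might look as follows. This is a sketch only: the class name MyRestoreOpx and the choice of popart::RestoreOp are assumptions for illustration, not part of the documented API.

class MyRestoreOpx : public popart::popx::RestoreBaseOpx<MyRestoreOpx> {
public:
  // Required type alias: the Op that this opx corresponds to.
  using OpType = popart::RestoreOp;

  MyRestoreOpx(popart::Op *op, popart::popx::Devicex *devicex)
      : RestoreBaseOpx<MyRestoreOpx>(op, devicex) {}

  // grow() is pure virtual in RestoreBaseOpx and must be implemented.
  void grow(poplar::program::Sequence &prog) const final {
    // Lower the restore operation into prog here.
  }
};
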
class ReverseBaseOpx : public popart::popx::Opx

Subclassed by popart::popx::ReverseInplaceOpx, popart::popx::ReverseOpx

Public Functions

ReverseBaseOpx(Op*, Devicex*)
inline InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final
view::RegMap unwindRegion(InIndex inIndex, OutIndex outIndex) const final
class ReverseGradOpx : public popart::popx::ReverseOpx

Public Functions

ReverseGradOpx(Op*, Devicex*)
class ReverseInplaceOpx : public popart::popx::ReverseBaseOpx

Public Functions

ReverseInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ReverseOpx : public popart::popx::ReverseBaseOpx

Subclassed by popart::popx::ReverseGradOpx

Public Functions

ReverseOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class RoundInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

RoundInplaceOpx(Op*, Devicex*)
class RoundOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

RoundOpx(Op*, Devicex*)
class SGD0VarUpdateOpx : public popart::popx::VarUpdateOpx

Public Functions

SGD0VarUpdateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SGD1AcclUpdateOpx : public popart::popx::VarUpdateOpx

Public Functions

SGD1AcclUpdateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SGD1VarUpdateOpx : public popart::popx::VarUpdateOpx

Public Functions

SGD1VarUpdateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ScaleInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

ScaleInplaceOpx(Op*, Devicex*)
class ScaleGradOpx : public popart::popx::ScaleOpx

Public Functions

ScaleGradOpx(Op*, Devicex*)
class ScaleOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Subclassed by popart::popx::ScaleGradOpx

Public Functions

ScaleOpx(Op*, Devicex*)
class ScaledAddLhsInplaceOpx : public popart::popx::ScaledAddOpx

Public Functions

ScaledAddLhsInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ScaledAddOpx : public popart::popx::Opx

Subclassed by popart::popx::ScaledAddLhsInplaceOpx, popart::popx::ScaledAddRhsInplaceOpx

Public Functions

ScaledAddOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ScaledAddRhsInplaceOpx : public popart::popx::ScaledAddOpx

Public Functions

ScaledAddRhsInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ScaledVarUpdateOpx : public popart::popx::VarUpdateOpx

Public Functions

ScaledVarUpdateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ScatterDataGradOpx : public popart::popx::Opx

Public Functions

ScatterDataGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex index) const final
inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final
class ScatterOpx : public popart::popx::ScatterReduceOpx

Public Functions

ScatterOpx(Op*, Devicex*)
class ScatterReduceGradOpx : public popart::popx::Opx

Public Functions

ScatterReduceGradOpx(Op*, Devicex*)
~ScatterReduceGradOpx()
void grow(poplar::program::Sequence&) const final override
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final override
InputCreatorType getInputCreatorType(InIndex index) const final override
inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final override
class ScatterReduceOpx : public popart::popx::Opx

Subclassed by popart::popx::ScatterOpx

Public Functions

ScatterReduceOpx(Op*, Devicex*)
~ScatterReduceOpx()
void grow(poplar::program::Sequence&) const final override
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final override
InputCreatorType getInputCreatorType(InIndex index) const final override
inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final override
class ScatterUpdateGradOpx : public popart::popx::Opx

Public Functions

ScatterUpdateGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex index) const final
inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final
class SeluGradOpx : public popart::popx::Opx

Public Functions

SeluGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SeluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

SeluInplaceOpx(Op*, Devicex*)
class SeluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

SeluOpx(Op*, Devicex*)
class SequenceSliceInplaceOpx : public popart::popx::Opx

Public Functions

SequenceSliceInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SequenceSliceOpx : public popart::popx::Opx

Public Functions

SequenceSliceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ShapedDropoutOpx : public popart::popx::Opx

Public Functions

ShapedDropoutOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const override
class ShrinkGradOpx : public popart::popx::Opx

Public Functions

ShrinkGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ShrinkInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

ShrinkInplaceOpx(Op*, Devicex*)
class ShrinkOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

ShrinkOpx(Op*, Devicex*)
class SigmoidGradOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

SigmoidGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SigmoidInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

SigmoidInplaceOpx(Op*, Devicex*)
class SigmoidOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

SigmoidOpx(Op*, Devicex*)
class SignInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

SignInplaceOpx(Op*, Devicex*)
class SignOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

SignOpx(Op*, Devicex*)
class SinOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

SinOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SinhGradOpx : public popart::popx::Opx

Public Functions

SinhGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SinhInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

SinhInplaceOpx(Op*, Devicex*)
class SinhOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

SinhOpx(Op*, Devicex*)
class SliceInplaceOpx : public popart::popx::BaseSliceOpx

Public Functions

SliceInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SliceOpx : public popart::popx::BaseSliceOpx

Subclassed by popart::popx::PadGradOpx

Public Functions

SliceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SoftPlusGradOpx : public popart::popx::Opx

Public Functions

SoftPlusGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SoftPlusInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

SoftPlusInplaceOpx(Op*, Devicex*)
class SoftPlusOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

SoftPlusOpx(Op*, Devicex*)
class SoftSignGradOpx : public popart::popx::Opx

Public Functions

SoftSignGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SoftSignInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

SoftSignInplaceOpx(Op*, Devicex*)
class SoftSignOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

SoftSignOpx(Op*, Devicex*)
class SoftmaxGradDirectOpx : public popart::popx::Opx

Public Functions

SoftmaxGradDirectOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SoftmaxGradOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

SoftmaxGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SoftmaxInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

SoftmaxInplaceOpx(Op*, Devicex*)
class SoftmaxOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

SoftmaxOpx(Op*, Devicex*)
class SparseAccumulateOpx : public popart::popx::AccumulateBaseOpx

Public Functions

SparseAccumulateOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const final
std::set<TensorId> mustExistBeforeCreate(InIndex) const final
class SplitOpx : public popart::popx::Opx

Public Functions

SplitOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SqrtOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

SqrtOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SquareOpx : public popart::popx::ElementWiseUnaryOpx

Public Functions

SquareOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class StashOpx : public popart::popx::Opx

Public Functions

StashOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SubgraphOpx : public popart::popx::Opx

Subclassed by popart::popx::CallOpx, popart::popx::LoopOpx

Public Functions

SubgraphOpx(Op*, Devicex*)
inline bool outputCreatedExternally(OutIndex) const final
PreparedTensorInfos getInputsToPrepare() const override
PreparedTensorInfos getOutputsToPrepare() const override
class SubsampleGradOpx : public popart::popx::Opx

Public Functions

SubsampleGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SubsampleInplaceOpx : public popart::popx::Opx

Public Functions

SubsampleInplaceOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SubsampleOpx : public popart::popx::Opx

Public Functions

SubsampleOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SubtractArg0GradOpx : public popart::popx::ReduceSumOpx

Public Functions

SubtractArg0GradOpx(Op*, Devicex*)
class SubtractOpx : public popart::popx::ElementWiseBinaryOpx

Public Functions

SubtractOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SumArgGradOpx : public popart::popx::Opx

Public Functions

SumArgGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SumOpx : public popart::popx::Opx

Public Functions

SumOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class SwishGradOpx : public popart::popx::Opx

Public Functions

SwishGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class SwishInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

SwishInplaceOpx(Op*, Devicex*)
class SwishOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

SwishOpx(Op*, Devicex*)
class SyncOpx : public popart::popx::Opx

Public Functions

SyncOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class TanhGradOpx : public popart::popx::Opx

Public Functions

TanhGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class TanhOpx : public popart::popx::Opx

Public Functions

TanhOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class TensorRemapOpx : public popart::popx::Opx

Public Functions

TensorRemapOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
bool outputCreatedExternally(OutIndex) const final
InputCreatorType getInputCreatorType(InIndex index) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class ThresholdedReluGradOpx : public popart::popx::Opx

Public Functions

ThresholdedReluGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ThresholdedReluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx

Public Functions

ThresholdedReluInplaceOpx(Op*, Devicex*)
class ThresholdedReluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx

Public Functions

ThresholdedReluOpx(Op*, Devicex*)
class TiedGatherOpx : public popart::popx::GatherBaseOpx

Public Functions

TiedGatherOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(int index0) const final
poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final
class TileGradOpx : public popart::popx::Opx

Public Functions

TileGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class TileOpx : public popart::popx::Opx

Public Functions

TileOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class TopKGradOpx : public popart::popx::Opx

Public Functions

TopKGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
poplar::Tensor createInputTensor(InIndex index, const poplar::DebugNameAndId &dnai) const final
InputCreatorType getInputCreatorType(InIndex index) const final
inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final
class TopKOpx : public popart::popx::BaseSortOpx

Public Functions

TopKOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class TransposeGradOpx : public popart::popx::TransposeOpx

Public Functions

TransposeGradOpx(Op*, Devicex*)
class TransposeInplaceOpx : public popart::popx::Opx

Public Functions

TransposeInplaceOpx(Op*, Devicex*)
InputCreatorType getInputCreatorType(InIndex) const final
void grow(poplar::program::Sequence&) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class TransposeOpx : public popart::popx::Opx

Subclassed by popart::popx::TransposeGradOpx

Public Functions

TransposeOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
InputCreatorType getInputCreatorType(InIndex) const final
poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final
view::RegMap unwindRegion(InIndex, OutIndex) const final
class VarUpdateOpx : public popart::popx::Opx

Subclassed by popart::popx::AccumulateBaseOpx, popart::popx::AccumulatorScaleOpx, popart::popx::AdamVarUpdateOpx, popart::popx::CopyVarUpdateOpx, popart::popx::ScaledVarUpdateOpx, popart::popx::SGD0VarUpdateOpx, popart::popx::SGD1AcclUpdateOpx, popart::popx::SGD1VarUpdateOpx

Public Functions

inline VarUpdateOpx(Op *op, Devicex *devicex)
class WhereLhsInplaceOpx : public popart::popx::BaseWhereOpx

Public Functions

WhereLhsInplaceOpx(Op*, Devicex*)
void doGrow(poplar::program::Sequence&, const poplar::Tensor&, const poplar::Tensor&, const poplar::Tensor&) const final
class WhereOpx : public popart::popx::BaseWhereOpx

Public Functions

WhereOpx(Op*, Devicex*)
void doGrow(poplar::program::Sequence &prog, const poplar::Tensor&, const poplar::Tensor&, const poplar::Tensor&) const final
class WhereRhsInplaceOpx : public popart::popx::BaseWhereOpx

Public Functions

WhereRhsInplaceOpx(Op*, Devicex*)
void doGrow(poplar::program::Sequence&, const poplar::Tensor&, const poplar::Tensor&, const poplar::Tensor&) const final
class WhereXGradOpx : public popart::popx::Opx

Public Functions

WhereXGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class WhereYGradOpx : public popart::popx::Opx

Public Functions

WhereYGradOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ZerosOpx : public popart::popx::Opx

Public Functions

ZerosOpx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final

14.9. Patterns

#include <popart/patterns/patterns.hpp>
class Patterns

A class to hold which patterns are enabled and disabled.

Public Functions

Patterns(PatternsLevel level)

Constructor for the Patterns class.

Parameters

level – The pattern set to run.

inline Patterns()

Default constructor for the Patterns class.

The pattern set to run is set to PatternsLevel::Default.

Patterns(std::vector<std::string> patterns)

Constructor for the Patterns class.

Parameters

patterns – A vector of pattern names of patterns to be run.
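
For example, a Patterns object can be constructed either from a predefined pattern set or from explicit pattern names. A minimal sketch (the pattern names below are illustrative; see getAllPreAliasPatternNames() for the definitive list):

#include <popart/patterns/patterns.hpp>

// Run the default pattern set.
popart::Patterns defaults(popart::PatternsLevel::Default);

// Run only the named patterns.
popart::Patterns selected({"PostNRepl", "PreUniRepl"});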

bool isPatternEnabled(const std::type_index &t)

Check if a pattern (of class PreAliasPattern) is enabled.

Parameters

t – The pattern to check.

Returns

true if pattern is enabled; false otherwise.

bool isPatternEnabled(const std::string &t)

Check if pattern (not of class PreAliasPattern) is enabled.

Parameters

t – The name of the pattern to check.

Returns

true if pattern is enabled; false otherwise.

Patterns &enablePattern(const std::type_index &t, bool v)

Enable a pattern of class PreAliasPattern.

Parameters
  • t – The pattern to enable.

  • v – If true then enable pattern. If false then disable pattern.

Returns

The Patterns instance, to allow chaining of calls.

Patterns &enablePattern(const std::string &t, bool v)

Enable a pattern not of class PreAliasPattern.

Parameters
  • t – The pattern to enable.

  • v – If true then enable pattern. If false then disable pattern.

Returns

The Patterns instance, to allow chaining of calls.
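
Because both overloads of enablePattern() return a reference to the Patterns object, calls can be chained. A minimal sketch (the pattern names are illustrative):

popart::Patterns patterns(popart::PatternsLevel::Default);

// Enable one pattern and disable another in a single expression.
patterns.enablePattern("SplitGather", true)
        .enablePattern("TiedGather", false);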

bool isInitAccumulateEnabled()

Check if InitAccumulatePattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isPreUniReplEnabled()

Check if PreUniRepl is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isPostNReplEnabled()

Check if PostNRepl is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isSoftMaxGradDirectEnabled()

Check if SoftMaxGradDirect is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isNlllWithSoftMaxGradDirectEnabled()

Check if NlllWithSoftMaxGradDirect is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isSplitGatherEnabled()

Check if SplitGatherPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isOpToIdentityEnabled()

Check if OpToIdentityPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isUpsampleToResizeEnabled()

Check if UpsampleToResizePattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isSubtractArg1GradOpEnabled()

Check if SubtractArg1GradOp is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isMulArgGradOpEnabled()

Check if MulArgGradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isReciprocalGradOpEnabled()

Check if ReciprocalGradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isAtan2Arg0GradOpEnabled()

Check if Atan2Arg0GradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isAtan2Arg1GradOpEnabled()

Check if Atan2Arg1GradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isDivArg0GradOpEnabled()

Check if DivArg0GradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isDivArg1GradOpEnabled()

Check if DivArg1GradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isPowArg0GradOpEnabled()

Check if PowArg0GradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isPowArg1GradOpEnabled()

Check if PowArg1GradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isSinGradOpEnabled()

Check if SinGradOp is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isCosGradOpEnabled()

Check if CosGradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

inline bool isInPlaceEnabled()

Check if InPlace is enabled.

Returns

true if pattern is enabled; false otherwise.

inline bool isUpdateInplacePrioritiesForIpuEnabled()

Check if UpdateInplacePrioritiesForIpu is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isSqrtGradOpEnabled()

Check if SqrtGradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isConvFlipWeightsDoubleFlipEnabled()

Check if ConvFlipWeightsDoubleFlipPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isConvFlipWeightsGradOpEnabled()

Check if ConvFlipWeightsGradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isExpandCastEnabled()

Check if ExpandCastPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isExpGradOpEnabled()

Check if ExpGradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isExpm1GradOpEnabled()

Check if Expm1GradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isLog1pGradOpEnabled()

Check if Log1pGradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isLogGradOpEnabled()

Check if LogGradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isNegativeOneScaleEnabled()

Check if NegativeOneScalePattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isMatMulOpEnabled()

Check if MatMulOp is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isMatMulLhsGradOpEnabled()

Check if MatMulLhsGradOp is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isMatMulRhsGradOpEnabled()

Check if MatMulRhsGradOp is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isRandomNormalLikeOpPatternEnabled()

Check if RandomNormalLikeOp is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isRandomUniformLikeOpPatternEnabled()

Check if RandomUniformLikeOp is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isZerosLikeOpPatternEnabled()

Check if ZerosLikeOp is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isDecomposeBinaryConstScalarEnabled()

Check if DecomposeBinaryConstScalar is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isFmodArg0GradOpEnabled()

Check if FmodArg0GradOpPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isLambSerialisedWeightEnabled()

Check if LambSerialisedWeightPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isTiedGatherEnabled()

Check if TiedGatherPattern is enabled.

Returns

true if pattern is enabled; false otherwise.

bool isTiedGatherAccumulateEnabled()

Check if TiedGatherAccumulatePattern is enabled.

Returns

true if pattern is enabled; false otherwise.

Patterns &enableInitAccumulate(bool v)

Enable or disable InitAccumulatePattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enablePreUniRepl(bool v)

Enable or disable PreUniRepl.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enablePostNRepl(bool v)

Enable or disable PostNRepl.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableSoftMaxGradDirect(bool v)

Enable or disable SoftMaxGradDirect.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableNlllWithSoftMaxGradDirect(bool v)

Enable or disable NlllWithSoftMaxGradDirect.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableSplitGather(bool v)

Enable or disable SplitGatherPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableOpToIdentity(bool v)

Enable or disable OpToIdentityPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableUpsampleToResize(bool v)

Enable or disable UpsampleToResizePattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableSubtractArg1GradOp(bool v)

Enable or disable SubtractArg1GradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableMulArgGradOp(bool v)

Enable or disable MulArgGradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableReciprocalGradOp(bool v)

Enable or disable ReciprocalGradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableAtan2Arg0GradOp(bool v)

Enable or disable Atan2Arg0GradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableAtan2Arg1GradOp(bool v)

Enable or disable Atan2Arg1GradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableDivArg0GradOp(bool v)

Enable or disable DivArg0GradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableDivArg1GradOp(bool v)

Enable or disable DivArg1GradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enablePowArg0GradOp(bool v)

Enable or disable PowArg0GradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enablePowArg1GradOp(bool v)

Enable or disable PowArg1GradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableSinGradOp(bool v)

Enable or disable SinGradOp.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableCosGradOp(bool v)

Enable or disable CosGradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

inline Patterns &enableInPlace(bool v)

Enable or disable InPlace.

Parameters

v – If true then enable pattern. If false then disable pattern.

inline Patterns &enableUpdateInplacePrioritiesForIpu(bool v)

Enable or disable UpdateInplacePrioritiesForIpu.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableSqrtGradOp(bool v)

Enable or disable SqrtGradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableConvFlipWeightsDoubleFlip(bool v)

Enable or disable ConvFlipWeightsDoubleFlipPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableConvFlipWeightsGradOp(bool v)

Enable or disable ConvFlipWeightsGradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableExpGradOp(bool v)

Enable or disable ExpGradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableExpm1GradOp(bool v)

Enable or disable Expm1GradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableLog1pGradOp(bool v)

Enable or disable Log1pGradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableLogGradOp(bool v)

Enable or disable LogGradOpPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableNegativeOneScale(bool v)

Enable or disable NegativeOneScalePattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableMatMulOp(bool v)

Enable or disable MatMulOp.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableMatMulLhsGradOp(bool v)

Enable or disable MatMulLhsGradOp.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableMatMulRhsGradOp(bool v)

Enable or disable MatMulRhsGradOp.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableRandomNormalLikeOpPattern(bool v)

Enable or disable RandomNormalLikeOp.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableRandomUniformLikeOpPattern(bool v)

Enable or disable RandomUniformLikeOp.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableZerosLikeOpPattern(bool v)

Enable or disable ZerosLikeOp.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableDecomposeBinaryConstScalar(bool v)

Enable or disable DecomposeBinaryConstScalar.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableLambSerialisedWeight(bool v)

Enable or disable LambSerialisedWeightPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableTiedGather(bool v)

Enable or disable TiedGatherPattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

Patterns &enableTiedGatherAccumulate(bool v)

Enable or disable TiedGatherAccumulatePattern.

Parameters

v – If true then enable pattern. If false then disable pattern.

inline Patterns &enableRuntimeAsserts(bool b)

Enable or disable runtime asserts.

If runtime asserts are enabled, then a check is performed to confirm that all mandatory patterns are enabled.

Parameters

b – If true then enable runtime asserts. If false then disable runtime asserts.

std::vector<std::unique_ptr<PreAliasPattern>> getPreAliasList()

Get list of patterns to be run before aliasing.

Returns

A vector of pointers to patterns of class PreAliasPattern.

bool operator==(const Patterns &p) const

Equality operator.

Parameters

p – Pattern to compare to.

Returns

true if patterns are equal; false otherwise.

inline const std::map<std::type_index, bool> &getSettings() const

Get the settings (enabled or disabled) for patterns.

Returns

Map of which patterns are enabled or disabled, indexed by value of std::type_index.

inline bool getInplaceEnabled() const

Check if the pattern InPlace is enabled.

Returns

true if pattern is enabled; false otherwise.

inline bool getUpdateInplacePrioritiesForIpuEnabled() const

Check if the pattern UpdateInplacePrioritiesForIpu is enabled.

Returns

true if pattern is enabled; false otherwise.

inline bool getRuntimeAssertsOn() const

Check if runtime asserts are enabled.

If runtime asserts are enabled, then a check is performed to confirm that all mandatory patterns are enabled.

Returns

true if runtime asserts are enabled; false otherwise.

Public Static Functions

static Patterns create(std::vector<std::string> patterns)

Create a set of patterns to be run.

Parameters

patterns – A vector of pattern names of patterns to be run.

static std::vector<std::string> getAllPreAliasPatternNames()

Get the names of all patterns of class PreAliasPattern, using the same order as getPreAliasList().

Returns

A vector of the names of all patterns of class PreAliasPattern.
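
For example, the registered names can be listed and a subset passed to create(). A sketch (the chosen subset is illustrative):

#include <iostream>
#include <popart/patterns/patterns.hpp>

// Print every registered pre-alias pattern name.
for (const auto &name : popart::Patterns::getAllPreAliasPatternNames()) {
  std::cout << name << '\n';
}

// Build a Patterns object that runs only the named patterns.
auto patterns = popart::Patterns::create({"PostNRepl", "OpToIdentity"});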

static bool isMandatory(Pattern &pattern)

Check if a pattern is mandatory.

Mandatory patterns must be enabled and must be run.

This method throws an error at runtime if the pattern is disabled and if enableRuntimeAsserts() is true.

Parameters

pattern – The pattern to check.

Returns

true if the pattern is mandatory; false otherwise.

static bool isMandatory(std::string &patternName)

Check if a pattern is mandatory.

Mandatory patterns must be enabled and must be run.

This method throws an error at runtime if the pattern is disabled and if enableRuntimeAsserts() is true.

Parameters

patternName – The name of the pattern to check.

Returns

true if the pattern is mandatory; false otherwise.
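
A typical use is to guard against accidentally disabling a mandatory pattern. A sketch (note that isMandatory() takes a non-const reference, so a named variable is used; the pattern name is illustrative):

std::string name = "PostNRepl";
if (popart::Patterns::isMandatory(name) && !patterns.isPatternEnabled(name)) {
  // Mandatory patterns must be enabled and must be run.
  patterns.enablePattern(name, true);
}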

Friends

friend std::ostream &operator<<(std::ostream &os, const Patterns &patterns)

Write a string representation of patterns to an output stream.

Parameters
  • os – An output stream that the string representation should be written to.

  • patterns – The patterns for which the string representation is created.

Returns

An output stream containing the string representation of the patterns.

class PreAliasPattern : public popart::Pattern

Subclassed by popart::AllReduceToIdentityPattern, popart::BinaryGradOpPattern, popart::ContiguateIpuCopyIndicesPattern, popart::ConvDataGradPattern, popart::ConvFlipWeightsDoubleFlipPattern, popart::ConvFlipWeightsGradOpPattern, popart::ConvTransposePattern, popart::CosGradOpPattern, popart::CoshOpPattern, popart::DecomposeBinaryConstScalar, popart::ElementWiseGradOpPattern< GRADOP, DOP >, popart::ExpandCastPattern, popart::ExpGradOpPattern, popart::Expm1GradOpPattern, popart::Fuser, popart::InitAccumulatePattern, popart::LambSerialisedWeightPattern, popart::LikeOpsPattern< L >, popart::Log1pGradOpPattern, popart::LogGradOpPattern, popart::LoopScanOutPattern, popart::LSTMPattern, popart::MatMulGradPattern, popart::MatMulPattern, popart::MulArgGradOpPattern, popart::NlllWithSoftmaxGradDirect, popart::OptimizerDecompose, popart::PackedDataBlockPattern, popart::PadSumPattern, popart::PostNRepl, popart::PreUniRepl, popart::ReciprocalGradOpPattern, popart::RemoveUnnecessaryLossGradCast, popart::ScanToLoopPattern, popart::SequenceExpander, popart::SlicePattern, popart::SplitGatherPattern, popart::SplitOpPattern, popart::SqrtGradOpPattern, popart::SumToAddPattern, popart::TiedGatherAccumulatePattern, popart::TiedGatherPattern, popart::TransposeToIdentityOrReshapePattern, popart::UpsampleToResizePattern, popart::ViewSimplifyPattern

Public Functions

PreAliasPattern() = default
virtual ~PreAliasPattern() = default
virtual std::vector<const Tensor*> touches(Op *op) const = 0
Op *makeReplacementOpInIr(const OperatorIdentifier&, Op *oldOp, const std::string name = "") const
virtual bool matches(Op *op) const = 0
virtual bool apply(Op *op) const = 0
bool touchesAnchored(Op*) const
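
To define a new pre-alias pattern, subclass PreAliasPattern and implement matches(), touches() and apply(). The sketch below is illustrative only: the class name is hypothetical, the apply() body is elided, and the isConvertibleTo() check is assumed from the style of existing pattern implementations; only the virtual interface shown above is taken from the documentation.

class ReplaceNopSketchPattern : public popart::PreAliasPattern {
public:
  // Decide whether this pattern applies to the given op.
  bool matches(popart::Op *op) const override {
    return op->isConvertibleTo<popart::NopOp>();
  }

  // Tensors touched by the rewrite, beyond the op's own inputs and outputs.
  std::vector<const popart::Tensor *> touches(popart::Op *) const override {
    return {};
  }

  // Perform the rewrite; return true on success.
  bool apply(popart::Op *op) const override {
    // Rewire consumers of op's output to read op's input, then
    // disconnect and erase op from the graph.
    return true;
  }
};
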

14.9.1. Available patterns

class AllReduceToIdentityPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class BinaryGradOpPattern : public popart::PreAliasPattern

Subclassed by popart::Atan2Arg0GradOpPattern, popart::Atan2Arg1GradOpPattern, popart::DivArg0GradOpPattern, popart::DivArg1GradOpPattern, popart::FmodArg0GradOpPattern, popart::PowArg0GradOpPattern, popart::PowArg1GradOpPattern, popart::SubtractArg1GradOpPattern

Public Functions

std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const final
class ContiguateIpuCopyIndicesPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
class ConvDataGradPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class ConvFlipWeightsDoubleFlipPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class ConvFlipWeightsGradOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class ConvTransposePattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class CosGradOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class CoshOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class DecomposeBinaryConstScalar : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
template<class GRADOP, class DOP>
class ElementWiseGradOpPattern : public popart::PreAliasPattern

Public Functions

inline bool matches(Op *op) const override
inline std::vector<const Tensor*> touches(Op*) const override
inline bool apply(Op *op) const override
class ExpGradOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class ExpandCastPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class Expm1GradOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class Fuser : public popart::PreAliasPattern

Subclassed by popart::SoftmaxGradDirect

Public Functions

bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
class InitAccumulatePattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class LSTMPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op *op) const override
inline std::vector<const Tensor*> touches(Op*) const override
bool apply(Op *op) const override
class LambSerialisedWeightPattern : public popart::PreAliasPattern

This Pattern finds Weights that have been serialised and are being updated in the Lamb Optimizer in slices.

Transforming:

 Slice(W)     U_sliced        }
    | (R1)       | (R2)       }
    |      ReduceScatter |    } (Optional, to support RTS)
    |            |            }
LambSquare   LambSquare       } x N
    |            |            }
AllReduce    AllReduce        } (Optional, to support RTS)
      \         /             }
     AdamVarUpdate            }

Into:

 Slice(W)     U_sliced        }
    |            |            }
    |      ReduceScatter |    } (Optional, to support RTS)
    |            |            }
LambSquare   LambSquare       } x N
    |            |            }
   Sum          Sum
      \         /
AllReduce    AllReduce        } (Optional, to support RTS)
      \         /             }
     AdamVarUpdate            } x N

A key property of LambSquare is that the square root has not yet been applied to its output, so it is valid to simply sum the outputs.
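
In other words (our reading of this property, with w^(n) denoting the n-th weight slice): because LambSquare emits the sum of squares rather than the norm itself,

\[ \sum_n \mathrm{LambSquare}(w^{(n)}) \;=\; \sum_n \lVert w^{(n)} \rVert_2^2 \;=\; \lVert w \rVert_2^2, \]

so the per-slice outputs can be summed before the square root is eventually applied in AdamVarUpdate.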

Public Functions

bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
template<class L>
class LikeOpsPattern : public popart::PreAliasPattern

Public Functions

inline bool matches(Op *op) const final
inline std::vector<const Tensor*> touches(Op*) const final
inline bool apply(Op *op) const final
class Log1pGradOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class LogGradOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class LoopScanOutPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class MatMulGradPattern : public popart::PreAliasPattern

Subclassed by popart::MatMulLhsGradPattern, popart::MatMulRhsGradPattern

Public Functions

inline std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
virtual popart::Tensor *getIn(Op *op) const = 0
virtual popart::Tensor *getGradIn(Op *op) const = 0
virtual popart::Tensor *getGradOut(Op *op) const = 0
virtual InIndex getInIndex() const = 0
virtual InIndex getGradInIndex() const = 0
class MatMulPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op *op) const override
inline std::vector<const Tensor*> touches(Op*) const override
bool apply(Op *op) const override
class MulArgGradOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class NlllWithSoftmaxGradDirect : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class OptimizerDecompose : public popart::PreAliasPattern

Subclassed by popart::AdamDecompose, popart::AdaptiveDecompose, popart::SGD0Decompose, popart::SGD1Decompose, popart::SGD2Decompose

class PackedDataBlockPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class PadSumPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class PostNRepl : public popart::PreAliasPattern

Public Functions

PostNRepl() = default
~PostNRepl() override = default
bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
class PreUniRepl : public popart::PreAliasPattern

Public Functions

PreUniRepl() = default
~PreUniRepl() override = default
bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
class ReciprocalGradOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class RemoveUnnecessaryLossGradCast : public popart::PreAliasPattern

The RemoveUnnecessaryLossGradCast pattern changes

fp32_lossScale -- Cast -- fp16_lossScale -- NllLossGradOp -- fp16_grad
                          fp16_probs -------'

to

fp32_lossScale -- NllLossGradOp -- fp16_grad
fp16_probs -------'

This corner case can occur in a model with fp16 activations when its fp16 loss scale is anchored for summation and upcast to fp32 in order to prevent overflow. In this case, if the loss scale is >max(fp16), the downcasting will clip the loss scale.

Notice that even if the loss scale is >max(fp16), the resulting gradients can still be within fp16 range. If the resulting gradients are >max(fp16), they will be clipped (unless the user has enabled NaN on overflow).

Public Functions

bool matches(Op *lossOp) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op *lossOp) const final
class ScanToLoopPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class SequenceExpander : public popart::PreAliasPattern

Subclassed by popart::NegativeOneScalePattern, popart::OpToIdentityPattern, popart::SplitGradOpToConcatPattern

Public Functions

std::vector<const Tensor*> touches(Op *op) const final
bool apply(Op *op) const final
class SplitGatherPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class SplitOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class SqrtGradOpPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class SumToAddPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class TiedGatherAccumulatePattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class TiedGatherPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class TransposeToIdentityOrReshapePattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class UpsampleToResizePattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class ViewSimplifyPattern : public popart::PreAliasPattern

Public Functions

bool matches(Op*) const override
std::vector<const Tensor*> touches(Op*) const override
bool apply(Op*) const override
class AdamDecompose : public popart::OptimizerDecompose

Public Functions

bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
TensorId rescaleRatio(Graph &graph, AdamComboOp *combo) const
std::pair<Op*, TensorId> rescaleAccl(Graph &graph, AdamComboOp *combo, bool accl1, TensorId acclId, TensorId gradIntoAcclId, TensorId rescaleRatioId) const
class AdaptiveDecompose : public popart::OptimizerDecompose

Public Functions

bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
class Atan2Arg0GradOpPattern : public popart::BinaryGradOpPattern

Public Functions

bool matches(Op*) const override
class Atan2Arg1GradOpPattern : public popart::BinaryGradOpPattern

Public Functions

bool matches(Op*) const override
class DivArg0GradOpPattern : public popart::BinaryGradOpPattern

Public Functions

bool matches(Op*) const override
class DivArg1GradOpPattern : public popart::BinaryGradOpPattern

Public Functions

bool matches(Op*) const override
class FmodArg0GradOpPattern : public popart::BinaryGradOpPattern

Public Functions

bool matches(Op*) const final
class MatMulLhsGradPattern : public popart::MatMulGradPattern

Public Functions

bool matches(Op *op) const override
inline popart::Tensor *getIn(Op *op) const override
inline popart::Tensor *getGradIn(Op *op) const override
inline popart::Tensor *getGradOut(Op *op) const override
inline InIndex getInIndex() const override
inline InIndex getGradInIndex() const override
class MatMulRhsGradPattern : public popart::MatMulGradPattern

Public Functions

bool matches(Op *op) const override
inline popart::Tensor *getIn(Op *op) const override
inline popart::Tensor *getGradIn(Op *op) const override
inline popart::Tensor *getGradOut(Op *op) const override
inline InIndex getInIndex() const override
inline InIndex getGradInIndex() const override
class NegativeOneScalePattern : public popart::SequenceExpander

Public Functions

bool matches(Op*) const override
class OpToIdentityPattern : public popart::SequenceExpander

Public Functions

bool matches(Op*) const override
class PowArg0GradOpPattern : public popart::BinaryGradOpPattern

Public Functions

bool matches(Op*) const override
class PowArg1GradOpPattern : public popart::BinaryGradOpPattern

Public Functions

bool matches(Op*) const override
class SGD0Decompose : public popart::OptimizerDecompose

Decomposes an SGD0ComboOp into the Ops and Tensors that implement the SGD0 optimiser step it describes.

If gradient accumulation is enabled, this will create the accum tensor (gradient accumulator). This is not a persistent state tensor, so it will not be added to the Ir’s model proto. The tensor’s id will have prefix reservedAccumPrefix(). If the tensor has an initialiser in the model proto, the tensor will be initialised with that value. Otherwise, it will be initialised to 0. The DataType of the tensor is as specified in the SGD0ComboOp.

See also

SGD0ComboOp

See also

SGD.

Recall the SGD0 optimiser step, possibly with gradient accumulation and replication:

(_) for each micro batch
(1)     g = allReduce(g)  [if OptimizerReductionType=GradReduce]
(2)     a += g            [if grad acc]
(_) [let a := g if not grad acc]
(3) a = allReduce(a)      [if OptimizerReductionType=AccumReduce]
(4) w = (w * wdsf0) - (slr0 * a)
(5) a = 0                 [if grad acc]

(1) is implemented by a ReplicatedAllReduceOp.

(2) is implemented by an AccumulateOp.

(3) is implemented by a ReplicatedAllReduceInplaceOp.

(4) is implemented by an SGD0VarUpdateOp.

(5) is implemented by an AccumulatorUpdateOp.

For all the above ops, if they consume a non-const OptimizerValue, then the SGD0ComboOp will have an additional input for that scalar, which will be connected to the new Op.

If gradient accumulation is enabled, (3-5) are put outside the microbatches loop implicitly by setting op->settings.executionContext = ExecutionContext::AccumulateOuterFragment. Additionally, we set op->settings.schedulePriority = 0.0f and call op->setExecutionPhases({}), because VarUpdateOps default to the minimum possible schedule priority so they are scheduled last; this is not desirable for gradient accumulation, so we reset them to a neutral priority.

The SGD0ComboOp will be disconnected and erased.
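
As a minimal scalar sketch (not the PopART API) of the step above, with gradient accumulation enabled and the reduction steps (1) and (3) omitted:

// Scalar stand-ins for the tensors and optimiser scalars (example values).
float w = 1.0f, wdsf0 = 0.99f, slr0 = 0.01f;
float g[] = {0.1f, 0.2f};         // per-micro-batch gradients
float a = 0.0f;                   // accum tensor, initialised to 0
for (int mb = 0; mb < 2; ++mb)
  a += g[mb];                     // (2) AccumulateOp
w = (w * wdsf0) - (slr0 * a);     // (4) SGD0VarUpdateOp
a = 0.0f;                         // (5) AccumulatorUpdateOp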

Public Functions

bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
Op *varUpdateAndEraseCombo(Graph &graph, SGD0ComboOp *combo, const TensorId &weightId, const TensorId &gradIntoUpdateId, const TensorId &updatedWeightId) const
class SGD1Decompose : public popart::OptimizerDecompose

Decomposes an SGD1ComboOp into the Ops and Tensors that implement the SGD1 optimiser step it describes.

Will create the accl tensor (combined accumulator and first-order momentum). The tensor will be added to the Ir’s model proto, so will be part of the serialised Ir. The tensor’s id will have prefix reservedAcclPrefix(). If the tensor has an initialiser in the model proto, the tensor will be initialised with that value. Otherwise, it will be initialised to slr1 * w_0, where w_0 is the initial value of w.

See also

SGD1ComboOp

See also

SGD.

Recall the SGD1 optimiser step, possibly with gradient accumulation and replication:

(_) for each micro batch
(1)     allReduce(g)      [if OptimizerReductionType=GradReduce]
(2)     v += dpsf1 * g
(_)     if enable nesterov momentum:
(_)         a += g
(3) v = allReduce(v)      [if OptimizerReductionType=AcclReduce]
(_) if enable nesterov momentum:
(_)     a = allReduce(a)  [if OptimizerReductionType=AcclReduce]
(4) if enable nesterov momentum:
        ils = ndsf * dpsf1
        a = ngsf * (ils * a + wd * w) + mm * v
(_) [let x := g if enable nesterov momentum else v]
(5) w = w - slr1 * x
(6) v = v * smm1 + swd1 * w

See the SGD docs in optimizer.hpp for derivation of the above.

(1) is implemented by a ReplicatedAllReduceOp.

(2) is implemented by an AccumulateOp.

(3) is implemented by a ReplicatedAllReduceInplaceOp.

(4) is implemented by a MulOp and a SGD1NesterovOp.

(5) is implemented by an SGD1VarUpdateOp.

(6) is implemented by an SGD1AcclUpdateOp.

For all the above ops, if they consume a non-const OptimizerValue, then the SGD1ComboOp will have an additional input for that scalar, which will be connected to the new Op.

If gradient accumulation is enabled, (3), (4), (5) and (6) are put outside the microbatches loop implicitly by setting op->settings.executionContext = ExecutionContext::AccumulateOuterFragment. Additionally, we set op->settings.schedulePriority = 0.0f, because VarUpdateOps default to the minimum possible schedule priority so they are scheduled last; this is not desirable for gradient accumulation, so we reset them to a neutral priority.

The SGD1ComboOp will be disconnected and erased.

Additional topo cons are required to ensure the VarUpdateOps run in the manner described above. We must also transfer the topo cons from the SGD1ComboOp to the new ops without breaking this. To do this:

  1. At the start of apply, add a topo con from (1) to the combo op.

  2. Transfer topo cons from combo to (2). Since (1)/(2) are the first ops to run in the optimiser step (the other ops consume (2)’s output so will always run after), this ensures the pre-existing topo cons on combo are respected.

  3. Insert topo con from (5) to (6), to ensure w update happens before the next step’s v update.
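
As with SGD0, a minimal scalar sketch (not the PopART API) of the step above, without Nesterov momentum and with the reduction steps omitted:

// Scalar stand-ins for the tensors and optimiser scalars (example values).
float w = 1.0f, v = 0.0f;         // weight and accl tensor
float dpsf1 = 1.0f, slr1 = 0.01f, smm1 = 0.9f, swd1 = 0.0f;
float g[] = {0.1f, 0.2f};         // per-micro-batch gradients
for (int mb = 0; mb < 2; ++mb)
  v += dpsf1 * g[mb];             // (2) AccumulateOp
w = w - slr1 * v;                 // (5) SGD1VarUpdateOp
v = v * smm1 + swd1 * w;          // (6) SGD1AcclUpdateOp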

Public Functions

bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
class SGD2Decompose : public popart::OptimizerDecompose

Decomposes an SGD2ComboOp into the Ops and Tensors that implement the SGD2 optimiser step it describes.

Will create the accl1 tensor (first-order momentum). The tensor will be added to the Ir’s model proto, so will be part of the serialised Ir. The tensor’s id will have prefix reservedAccl1Prefix(). The tensor will be initialised to 0. The DataType of the tensor is as specified in the SGD2ComboOp.

See also

SGD2ComboOp

See also

SGD.

If gradient accumulation is enabled, this will create the accum tensor (gradient accumulator). This is not a persistent state tensor, so it will not be added to the Ir’s model proto. The tensor’s id will have prefix reservedAccumPrefix(). If the tensor has an initialiser in the model proto, the tensor will be initialised with that value. Otherwise, it will be initialised to 0. The DataType of the tensor is as specified in the SGD2ComboOp.

Recall the SGD2 optimiser step, possibly with gradient accumulation and replication:

(_) for each micro batch
(1)     g = allReduce(g)  [if OptimizerReductionType=GradReduce]
(2)     a += g            [if grad acc]
(_) [let a := g if not grad acc]
(3) a = allReduce(a)      [if OptimizerReductionType=AccumReduce]
(_) // Note we break the single v update equation into two steps:
(4) v += dpsf1 * a
(5) v = v * smm1 + swd1 * w
(6) if enable nesterov momentum:
        ils = ndsf * dpsf1
        a = ngsf * (ils * a + wd * w) + mm * v
(_) [let x := a if enable nesterov momentum else v]
(7) w = w - slr1 * x
(8) a = 0                 [if grad acc]

See the SGD docs in optimizer.hpp for derivation of the above.

(1) is implemented by a ReplicatedAllReduceOp.

(2) is implemented by an AccumulateOp.

(3) is implemented by a ReplicatedAllReduceInplaceOp.

(4) is implemented by an AccumulateOp.

(5) is implemented by an SGD2AcclUpdateOp. Note this is equivalent to an SGD1AcclUpdateOp.

(6) is implemented by a MulOp and a SGD1NesterovOp.

(7) is implemented by an SGD2VarUpdateOp. Note this is equivalent to an SGD1VarUpdateOp.

(8) is implemented by an AccumulatorUpdateOp.

For all the above ops, if they consume a non-const OptimizerValue, then the SGD2ComboOp will have an additional input for that scalar, which will be connected to the new Op.

If gradient accumulation is enabled, (3-8) are put outside the microbatches loop implicitly by setting op->settings.executionContext = ExecutionContext::AccumulateOuterFragment. Additionally, we set op->settings.schedulePriority = 0.0f and call op->setExecutionPhases({}), because VarUpdateOps default to the minimum possible schedule priority so they are scheduled last; this is not desirable for gradient accumulation, so we reset them to a neutral priority.

The SGD2ComboOp will be disconnected and erased.

Additional topo cons are required to ensure the VarUpdateOps run in the manner described above. We must also transfer the topo cons from the SGD2ComboOp to the new ops without breaking this. To do this:

  1. Transfer topo cons from combo to (1).

  2. Transfer topo cons from combo to (2).

  3. Insert topo con from (7) to (8) to ensure accum is not zeroed until after the v update (which consumes it).

  4. Transfer topo cons from combo to (8). Only required if not grad acc.
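
Again, a minimal scalar sketch (not the PopART API) of the step above, without Nesterov momentum and with the reduction steps omitted:

// Scalar stand-ins for the tensors and optimiser scalars (example values).
float w = 1.0f, v = 0.0f, a = 0.0f;  // weight, accl1 and accum tensors
float dpsf1 = 1.0f, slr1 = 0.01f, smm1 = 0.9f, swd1 = 0.0f;
float g[] = {0.1f, 0.2f};            // per-micro-batch gradients
for (int mb = 0; mb < 2; ++mb)
  a += g[mb];                        // (2) AccumulateOp
v += dpsf1 * a;                      // (4) AccumulateOp
v = v * smm1 + swd1 * w;             // (5) SGD2AcclUpdateOp
w = w - slr1 * v;                    // (7) SGD2VarUpdateOp
a = 0.0f;                            // (8) AccumulatorUpdateOp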

Public Functions

bool matches(Op*) const final
std::vector<const Tensor*> touches(Op*) const final
bool apply(Op*) const final
class SoftmaxGradDirect : public popart::Fuser
class SplitGradOpToConcatPattern : public popart::SequenceExpander

Public Functions

bool matches(Op*) const override
class SubtractArg1GradOpPattern : public popart::BinaryGradOpPattern

Public Functions

bool matches(Op*) const override

14.10. Transforms

#include <popart/transforms/transform.hpp>
class Transform

Subclassed by popart::AccumulateOuterFragmentParallelizer, popart::Autodiff, popart::AutomaticLossScale, popart::AutoVirtualGraph, popart::BatchSerialize, popart::ClipWeightGradientsByNorm, popart::ContiguateCollectivesTransform, popart::DecomposeLoops, popart::DecomposeSum, popart::DynamicOpTransform, popart::EnsureFp32LossScale, popart::ExplicitRecompute, popart::HostIOSetup, popart::InferPipelineStages, popart::InplaceAccumulateGradPartialsIntoOptimizerAccumTensor, popart::InterIpuCopy, popart::IoComputeTileCopy, popart::MainLoops, popart::MergeCollectivesTransform, popart::MergeCopies, popart::MergeDuplicateOps, popart::MergeExchange, popart::MergeLoops, popart::MergeVarUpdates, popart::OverlapIO, popart::Pipeline, popart::PreAutomaticLossScale, popart::Prune, popart::RandomSetup, popart::RemoteSetup, popart::SerializeMatMuls, popart::StochasticRounding, popart::StreamingMemory, popart::SubgraphOutline

Public Functions

inline Transform()
inline virtual ~Transform()
virtual bool apply(Graph &graph) const = 0
virtual std::size_t getId() const = 0
virtual std::string getName() const = 0

Public Static Functions

static void applyTransform(std::size_t transformId, Graph&)
static bool registerTransform(Transform *transform)
static std::size_t getIdFromName(const std::string &transformName)
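
As a hedged usage sketch of these static helpers, a registered transform can be looked up by name and applied to a graph; this assumes a popart::Graph &graph is in scope and that "Prune" is the registered name of the Prune transform listed below:

// Look up a registered transform by name and apply it to a graph.
std::size_t transformId = popart::Transform::getIdFromName("Prune");
popart::Transform::applyTransform(transformId, graph);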

14.10.1. Available transforms

class AccumulateOuterFragmentParallelizer : public popart::Transform

Public Functions

AccumulateOuterFragmentParallelizer()
virtual ~AccumulateOuterFragmentParallelizer()
virtual bool apply(Graph &graph) const final
virtual std::vector<std::vector<Op*>> getBinConstraints(const Graph &graph) const
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class AutoVirtualGraph : public popart::Transform

Public Functions

inline AutoVirtualGraph()
inline ~AutoVirtualGraph() override
bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final
float costFn(Op *op, bool training, float w_weights, float w_activations) const

Public Static Functions

static std::size_t id()
class Autodiff : public popart::Transform

Class responsible for the automatic differentiation (autodiff) transform.

Public Types

using TensorIds = std::vector<TensorId>

Vector of tensor IDs.

using FwdGraphId = GraphId

ID of the forward graph.

Public Functions

Autodiff()

Default constructor for the Autodiff class.

~Autodiff() override

Destructor for the Autodiff class.

bool apply(Graph &graph) const override

Perform automatic differentiation.

Implemented as applyToIr(graph.getIr()).

Parameters

graph – The autodiff transform is applied to the IR containing the Graph graph.

Returns

An indication of whether the automatic differentiation has been completed (true) or not (false).

virtual bool applyToIr(Ir &ir) const

Perform automatic differentiation.

Parameters

ir – The IR to apply the autodiff transform to.

Returns

An indication of whether the automatic differentiation has been completed (true) or not (false).

virtual FwdGraphToBwdGraphInfo apply(Ir &ir, const GraphId &fwdGraphId, const nonstd::optional<TensorIds> &gradsProvidedForFwdId, const nonstd::optional<TensorIds> &gradsRequiredForFwdId, const FwdGraphToBwdGraphInfo &calledGraphsGradInfo, AutodiffStitchStrategy stitchStrategy)

Create a backward graph.

Apply createBwdGraph() and stitch() recursively, top-down, to create a backward graph for the forward graph with ID fwdGraphId.

The forward graph being differentiated can call subgraphs. If the autodiff transform has already been applied to the subgraphs and the result stored in calledGraphsGradInfo, then the backward graph that has already been created for the subgraphs will be used. Otherwise, this method will recurse on the subgraphs.

When recursing on a subgraph, this method does not know for which tensors gradients are required. If a null gradsRequiredForFwdId is passed, the autodiff transform will produce gradients for all input tensors.

For control over which gradients are produced for the subgraph, first (manually) call the autodiff transform on the subgraph and pass gradsRequiredForFwdId. Store the resultant BwdGraphInfo in the FwdGraphToBwdGraphInfo map passed to the autodiff call for the forward graph.

NOTE: This method may fail if any required gradient cannot be produced.

Parameters
  • ir – The IR to which this transform is applied.

  • fwdGraphId – The ID of the graph to differentiate.

  • gradsProvidedForFwdId – An optional list of tensors (normally outputs of the forward graph) for which gradient tensors are available to be used as inputs to the backward graph. If set, the autodiff transform will make the gradients of these forward tensors the first inputs of the backward graph. If not set, the autodiff transform will use whatever gradients of outputs of the forward graph it needs as inputs of the backward graph to allow the backward graph to produce the gradients that are required.

  • gradsRequiredForFwdId – The IDs of the forward graph tensors for which the backward graph should produce gradients. If set, the backward graph will compute only the gradients of the specified tensors and mark them as outputs. If one of these gradients cannot be computed, it is an error. If not set, the backward graph will produce as many gradients of the forward graph inputs as possible, and mark these all as outputs. If set, but an empty vector is passed, this is an error, as you are requesting no gradients be computed at all, resulting in an empty graph.

  • calledGraphsGradInfo – The result of applying the autodiff transform to the graphs that are called by subgraph ops in the forward graph. It is a precondition of this method that the graphs provided in this map are stitched.

  • stitchStrategy – The method used to stitch any result of the autodiff transform for graphs that are directly or indirectly called by the graph. This stitch strategy will be universally applied to all relevant inputs.

Returns

An FwdGraphToBwdGraphInfo object that contains BwdGraphInfo for all descended graphs and for which all entries have the following properties:

  • expectedInputs may contain a tuple (t, ExpectedConnectionType::Fwd) iff t is an input or output tensor of the forward graph. Only tensors t in gradsProvidedForFwdId may appear as a tuple (t, ExpectedConnectionType::FwdGrad) in expectedInputs. If gradsProvidedForFwdId is set, the first inputs will match the gradients of gradsProvidedForFwdId, respecting the order.

  • expectedOutputs may only contain tuples of the type (t, ExpectedConnectionType::FwdGrad) where t is an input tensor of the forward graph. If gradsRequiredForFwdId is set, the expectedOutputs list matches the size and order of gradsRequiredForFwdId exactly. If not set, the list is ordered in the order of the forward graph inputs, although some gradients of the forward graph inputs may be missing.
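
A hedged sketch of a call to this overload follows; the graph ID, tensor IDs and stitch strategy value are illustrative assumptions, and ir is assumed to be a popart::Ir& in scope:

// Differentiate a forward graph, requesting the gradient of one input
// given the gradient of one output (IDs are hypothetical).
popart::Autodiff autodiff;
popart::FwdGraphToBwdGraphInfo calledGraphsGradInfo; // pre-stitched results
auto bwdInfoMap = autodiff.apply(
    ir,
    popart::GraphId("fwdGraph"),                   // hypothetical graph ID
    popart::Autodiff::TensorIds{"output"},         // gradsProvidedForFwdId
    popart::Autodiff::TensorIds{"weight"},         // gradsRequiredForFwdId
    calledGraphsGradInfo,
    popart::AutodiffStitchStrategy::RecomputeMinimal); // assumed enum value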

virtual BwdGraphInfo createBwdGraph(Ir &ir, const GraphId &fwdGraphId, const nonstd::optional<TensorIds> &gradsProvidedForFwdId, const nonstd::optional<TensorIds> &gradsRequiredForFwdId, const FwdGraphToBwdGraphInfo &calledGraphsGradInfo)

Create backward graph information for a specific subgraph (non-recursive).

This method returns an “unstitched” result. This means that it is not guaranteed that all non-gradient inputs to a backward graph are available as inputs or outputs of the forward graph. This is a precondition for BwdGraphInfo objects used as values in calledGraphsGradInfo. So, you must call stitch on the result before using the result information in an autodiff call.

NOTE: This method may fail if any required gradient cannot be produced.

Parameters
  • ir – The IR to which this transform is applied.

  • fwdGraphId – The ID of the subgraph to differentiate.

  • gradsProvidedForFwdId – An optional list of tensors (normally outputs of the forward graph) for which gradient tensors are available to be used as inputs to the backward graph. If set, autodiff will make the gradients of these forward tensors the first inputs of the backward graph. If not set, autodiff will use whatever gradients of outputs of the forward graph it needs as inputs of the backward graph to allow the backward graph to produce the gradients that are required.

  • gradsRequiredForFwdId – The IDs of the forward graph tensors for which the backward graph should produce gradients. If set, the backward graph will compute only the gradients of the specified tensors and mark them as outputs. If one of these gradients cannot be computed, it is an error. If not set, the backward graph will produce as many gradients of the forward graph inputs as possible, and mark all these as outputs. If set, but an empty vector is passed, this is an error, as you are requesting no gradients be computed at all, resulting in an empty graph.

  • calledGraphsGradInfo – The result of applying autodiff to the graphs that are called by subgraph ops in the forward graph. It is a precondition of this method that the graphs provided in this map are stitched.

Returns

A BwdGraphInfo object with the following properties:

  • expectedInputs may contain arbitrary tuples (t, ExpectedConnectionType::Fwd) where t is any tensor in the forward graph (it need not be an input or output). Only tensors t in gradsProvidedForFwdId may appear as a tuple (t, ExpectedConnectionType::FwdGrad) in expectedInputs. If gradsProvidedForFwdId is set, the first inputs will match the gradients of gradsProvidedForFwdId, respecting the order.

  • expectedOutputs may only contain tuples of the type (t, ExpectedConnectionType::FwdGrad) where t is an input tensor of the forward graph. If gradsRequiredForFwdId is set, the expectedOutputs list matches the size and order of gradsRequiredForFwdId exactly. If not set, the list is ordered in the order of the forward graph inputs, although some gradients of the forward graph inputs may be missing.

virtual BwdGraphInfo stitch(Ir &ir, const GraphId &fwdGraphId, const BwdGraphInfo &bwdGraphInfo, AutodiffStitchStrategy stitchStrategy, const nonstd::optional<std::vector<InIndex>> &stitchIndices)

Stitch a forward-backward graph pair.

To stitch a forward-backward graph pair means to ensure that the backward graph no longer has any non-gradient inputs that are tensors of the forward graph which are neither inputs nor outputs of the forward graph.

When applying the autodiff transform to a graph, PopART assumes that all input tensors to the gradient ops are either (1) a forward op input, (2) a forward op output, or (3) the gradient of a forward op output. For this to be true for gradient ops of subgraph ops (for example, CallOp and IfOp), typically the backward graphs of those called subgraphs must not have inputs that are associated with non-gradient forward tensors that are neither inputs nor outputs of the forward graph. This is because the inputs and outputs of a forward subgraph typically map to the inputs and outputs of the associated forward op. Similarly, the inputs and outputs of a backward subgraph typically map to the inputs and outputs of the associated gradient op.

For stitch strategies that affect the forward graph’s inputs or outputs, stitch() should also amend all call sites of the forward graph as appropriate. Conversely, for the backward graphs, it is assumed there are no call sites as it’s anticipated this method is called before parents of the backward graph exist.

NOTE: This method may modify the forward graph, backward graph, or any graphs that call these graphs, depending on the method. It also may raise a popart::error if it is unable to stitch an index.

Parameters
  • ir – The IR in the context of which this transformation is applied.

  • fwdGraphId – The ID of the subgraph to differentiate.

  • bwdGraphInfo – The data structure describing the backward graph.

  • stitchStrategy – The method by which to stitch any autodiff result for graphs that are directly or indirectly called by the graph.

  • stitchIndices – If provided, backward graph input indices not in this list must be ignored and backward graph input indices in this list must be stitched (or an exception raised). If not set, it is up to the stitcher to decide what indices to stitch.

Throws

popart::error – if unable to stitch an index.

Returns

An updated BwdGraphInfo data structure (with some expectedInputs removed).
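
A hedged sketch combining createBwdGraph() and stitch(); the stitch strategy value is an assumption, and the other arguments are assumed to be in scope:

// Create an unstitched backward graph, then stitch it so that its
// remaining inputs can be satisfied at the forward graph's call sites.
auto bwdInfo = autodiff.createBwdGraph(
    ir, fwdGraphId, gradsProvided, gradsRequired, calledGraphsGradInfo);
auto stitched = autodiff.stitch(
    ir, fwdGraphId, bwdInfo,
    popart::AutodiffStitchStrategy::RecomputeMinimal, // assumed enum value
    nonstd::nullopt); // let the stitcher decide which indices to stitch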

inline std::size_t getId() const override

Get the ID of the autodiff transform.

inline std::string getName() const override

Get the name of the autodiff transform.

Public Static Functions

static std::size_t id()

ID of the autodiff transform.

class AutomaticLossScale : public popart::Transform

Public Functions

inline AutomaticLossScale()
inline ~AutomaticLossScale() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
static Op *executeOpNTimesEveryMTimes(Op *op, unsigned n, unsigned m, const std::map<InIndex, OutIndex> &identityInputToOutputIndiciesMapping, const std::map<OutIndex, float> &outputIndiciesAndValues, AliasModel &aliasMode)

When applied to an op, the op will effectively be executed n times every m times.

It returns a pointer to an IfOp which either calls an ‘empty’ subgraph, or calls a subgraph containing the op passed as the argument. The ‘empty’ subgraph is meant to be low intensity compute. It is possible to connect inputs and outputs via nop operations and set up default values of outputs in the ‘empty’ subgraph.

Parameters
  • op – Operator whose execution frequency is modified.

  • n – Execute the op n times every m times.

  • m – Execute the op n times every m times.

  • identityInputToOutputIndiciesMapping – Specifies the connections of inputs to outputs via nop operations in the ‘empty’ subgraph. Each pair must have the same shape and type.

  • outputIndiciesAndValues – Map of pairs of output indices and values. Note: inplacing and aliasing of inputs are not supported. If the op inplace-modifies or aliases an input, this will no longer be the case in the transformed graph after this method is called.
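
A hedged usage sketch, with the op, frequency and input-to-output mapping chosen purely for illustration:

// Execute `op` once in every four iterations; in the skipped iterations
// the 'empty' subgraph passes input 0 through to output 0.
popart::AliasModel aliasModel; // assumed default-constructible
std::map<popart::InIndex, popart::OutIndex> passThrough{{0, 0}};
std::map<popart::OutIndex, float> defaultOutputs;  // no constant outputs
popart::Op *ifOp = popart::AutomaticLossScale::executeOpNTimesEveryMTimes(
    op, /*n=*/1, /*m=*/4, passThrough, defaultOutputs, aliasModel);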

class BatchSerialize : public popart::Transform

Public Functions

inline BatchSerialize(int pass_)
inline ~BatchSerialize() override
bool apply(Graph &graph) const final
inline std::size_t getId() const final
inline std::string getName() const final

Public Static Functions

static std::size_t id(int)
class ClipWeightGradientsByNorm : public popart::Transform

Public Functions

inline ClipWeightGradientsByNorm()
inline ~ClipWeightGradientsByNorm() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
static std::vector<std::vector<Op*>> findGradientClippingGroups(const Graph &graph)
class ContiguateCollectivesTransform : public popart::Transform

A transform that inserts topological constraints into the graph.

These force collective operations which can potentially be merged to be scheduled contiguously (one right after the other) in the schedule.

Currently supported collective types:

  • ReplicatedAllReduceOp

  • ReplicatedReduceScatterOp

  • ReplicatedAllGatherOp

Public Functions

inline ContiguateCollectivesTransform()
inline ~ContiguateCollectivesTransform() override
bool apply(Graph &graph) const override
std::vector<Op*> applyToOps(Graph &graph, const std::set<OpId> includeOps) const
inline std::size_t getId() const override
inline std::string getName() const override
template<typename BaseType>
void processOp(BaseType *baseOp, const std::vector<Op*> &schedule, std::set<Op*, POpCmp> &opsToProcess) const

Processing baseOp involves finding all other collective ops in the graph with which baseOp can be merged, then inserting constraints between the matching ops and baseOp that ensure the ops are scheduled contiguously, one after another.

Parameters
  • baseOp – is the Op that should be merged with other collectives

  • schedule – is a vector of ops sorted in schedule order

  • opsToProcess – is set of all other collective ops in the graph (which are candidates for merging with base op)

Returns

void, modifies the graph of baseOp

Public Static Functions

static std::size_t id()
template<typename BaseType>
static bool checkCollectiveOp(BaseType *baseOp, BaseType *candidate)

Check whether two ops use the same collective operator.

Parameters
  • baseOp – against which to compare the candidate op

  • candidate – op of the same type as baseOp

Returns

true, if the two ops use the same collective operator or if neither uses a collective operator

template<typename BaseType>
static std::set<BaseType*, POpCmp> lookForMatchingOps(BaseType *baseOp, const std::vector<Op*> &schedule, std::set<Op*, POpCmp> &opsToProcess)

Loop through the ops in the schedule and find those matching baseOp. To avoid merging the same op twice, make sure each match is still in opsToProcess.

Parameters
  • baseOp – the op that should be merged with other collectives

  • schedule – the schedule of the (Collective) ops in the graph

  • opsToProcess – the (Collective) ops that can still be considered for merging

Returns

a set of collective ops that can be merged with baseOp

class DecomposeGradSum : public popart::DecomposeSum

Public Functions

inline std::size_t getId() const override
inline std::string getName() const override

Public Static Functions

static std::size_t id()
class DecomposeLoops : public popart::Transform

Transform that generically decomposes/unrolls loop iterations to:

  • Unroll LoopOp iterations in general

  • Arrange IO Ops to enable overlap between IO and compute tiles

  • Arrange Ops’ PipelineStages to enable overlap between PipelineStages

If we want to unroll a loop by a factor of 2, each Op that existed in the loop needs 3 instances, denoted as 0, 1 and 2, one per apparent iteration. If we want to unroll such that iterations can partially overlap (IO and compute overlap), we can’t generally, for all operations, place 0 before the loop, 2 after the loop and 1 during the loop (see skewed unrolling below), because this would not lead to overlap between either pipeline stages or IO and compute operations.

Rather, we classify Ops (see DecomposeLoopOpTypeEnum), according to their data, topological dependencies and the tile set they are running on, into one of the categories. The available categories depend on the DecomposeLoopModel implementation. We can then shuffle the operations to before, during and after the loop accordingly. Note that every operation is cloned 2 extra times (for an unroll factor of 2), but the original operation in the loop remains.

However, the “apparent iteration” (iteration that the Op instance corresponds to in the LoopOp before unrolling) has changed.

The number of apparent iterations in total is always the unroll factor (counting all iterations before and after the loop) plus one iteration for the loop itself:

num_apparent_iterations = unroll_factor + 1

In loop iteration n, the Ops (depending on classification) now correspond to iterations i (0), i+1 (1) and i+2 (2) respectively. The Ops unrolled before the loop process iterations 0 (0) and 1 (1). The Ops unrolled after the loop process iterations n-1 (1) and n (2) (where (0), (1) and (2) correspond to the cloned operations).

As an example for apparent iteration: Before unrolling, there is an operation in a loop (denoted as {}): { Op }

If we unroll by a factor of 2, the operation is cloned into the parent graph twice, and there are different possible arrangements, depending on how we skew the unrolling:

a.) { Op } - Op0 - Op1

In this case:

Op  - unrollIndex -1 - apparent iteration 0 - before loop: no
Op0 - unrollIndex 0  - apparent iteration 1 - before loop: no
Op1 - unrollIndex 1  - apparent iteration 2 - before loop: no

(use case example: if Op is a HostStoreOp that should do overlapped IO with compute (such as a MatMulOp))

b.) Op0 - { Op } - Op1

In this case:

Op  - unrollIndex -1 - apparent iteration 1 - before loop: no
Op0 - unrollIndex 0  - apparent iteration 0 - before loop: yes
Op1 - unrollIndex 1  - apparent iteration 2 - before loop: no

(use case example: if Op is a MatMulOp that should do overlapped compute with IO (such as HostloadOp and HostStoreOp))

c.) Op0 - Op1 - { Op }

In this case:

Op  - unrollIndex -1 - apparent iteration 2 - before loop: no
Op0 - unrollIndex 0  - apparent iteration 0 - before loop: yes
Op1 - unrollIndex 1  - apparent iteration 1 - before loop: yes

(use case example: if Op is a HostLoadOp that should do overlapped IO with compute (such as a MatMulOp))

Use case example:

HostLoadOp0 HostLoadOp1 { HostLoadOp  }
            MatMulOp0   { MatMulOp    } MatMulOp1
                        { HostStoreOp } HostStoreOp0 HostStoreOp1
            ^^^^^^^^^^^   ^^^^^^^^^^^   ^^^^^^^^^^^^
            overlap       overlap       overlap

{ } denotes the LoopOp

Where the data dependencies are:

HostLoadOp0 -> MatMulOp0 -> HostStoreOp
HostLoadOp1 -> MatMulOp  -> HostStoreOp0
HostLoadOp  -> MatMulOp1 -> HostStoreOp1

This skew is controlled by the decomposition model (see DecomposeLoopOpTypeEnum for details). If the model is unrolling pipeline stages, for example, each stage will be skewed differently (see DecomposeLoopPipelineModel).

Public Functions

inline DecomposeLoops()
inline ~DecomposeLoops() override
virtual bool apply(Graph &graph) const final

Decomposes all LoopOps in the graph using the standard model of loop decomposition (which is DecomposeLoopOverlapModel())

Parameters

graph – Graph containing the LoopOp to decompose

Returns

true if apply is successful. An error will be thrown otherwise.

inline virtual std::size_t getId() const final
inline virtual std::string getName() const final
void decomposeLoop(Graph &graph, LoopOp *loopOp, const DecomposeLoopModel &model) const

Decompose a loop with a custom DecomposeLoopModel.

Parameters
  • graph – Graph containing the LoopOp to decompose

  • loopOp – LoopOp to decompose

  • model – DecomposeLoopModel to apply
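
A hedged sketch of decomposing a single LoopOp with the standard overlap model named above (assuming DecomposeLoopOverlapModel is default-constructible and that graph and loopOp are in scope):

// Decompose one LoopOp with an explicitly chosen model.
popart::DecomposeLoops decomposer;
popart::DecomposeLoopOverlapModel model; // the standard model named above
decomposer.decomposeLoop(graph, loopOp, model);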

Public Static Functions

static std::size_t id()
static bool isComputeOp(Op *op)

Check if an Op should be classified as compute.

The condition is that the operation is on compute tiles.

Parameters

op – Op to check

Returns

true if it is a Compute Op

static bool isIOOp(Op *op)

Checks if an Op is an IO operation.

The condition is that the operation is one of HostLoadOp, HostStoreOp, RemoteLoadOp, RemoteStoreOp, MultiExchangeOp.

Parameters

op – Op to check

Returns

true if it is an IO Op

static bool isComputeLikeIOOp(std::set<ExchangeStrategy> computeLikeStrategies, Op *op)

Checks if an Op is classified as IO and executes on IO tiles, but should still be handled like a compute operation, that is, classified, unrolled and scheduled as DecomposeLoopOpTypeEnum::Compute, instead of as an IO operation that should overlap with compute (classified as DecomposeLoopOpTypeEnum::IoBeforeCompute or DecomposeLoopOpTypeEnum::IoAfterCompute).

Operations should be handled like compute instead of IO operations when they are not required to overlap with compute.

Parameters
  • computeLikeStrategies – ExchangeStrategy that should be considered as compute

  • op – Op to check

Returns

True if it is a compute Op

class DynamicOpTransform : public popart::Transform

Public Functions

inline DynamicOpTransform()
inline ~DynamicOpTransform() override
bool apply(Graph &graph) const final
inline std::size_t getId() const final
void transferProperties(Op *from, Op *to) const
void inplace(Op *from) const
inline std::string getName() const final

Public Static Functions

static std::size_t id()
class EnsureFp32LossScale : public popart::Transform

Public Functions

inline EnsureFp32LossScale()
inline ~EnsureFp32LossScale() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final
bool isPassThroughOp(Op *op) const

For deciding whether to continue graph traversal from op’s outputs, or to terminate the traversal at this op.

Parameters

op – The op.

Returns

True if the op has a single input, and all its outputs are of the same type as the input.

FromLossScaleTraversalOps traverseFromLossScaleTensor(const Graph &graph) const

Traverse the graph from the loss scale tensor.

  • We ‘pass through’ single-input ops that do not combine the loss scale (or a descendant of it) with an activation tensor.

  • Otherwise we terminate the traversal. We refer to these terminal ops as ‘mixed precision loss grad op’ (or MPLGO) candidates.

Parameters

graph – The graph to be traversed.

Returns

A pair containing the list of pass-through ops and MPLGO candidates.

bool shouldApply(const Graph &graph) const

Run the checks to see if the transform should be applied.

Parameters

graph – The graph that the checks are run on.

Returns

True if the checks pass.

void upCastTensor(Op *op, InIndex index) const

Upcast fp16 tensor at input index index to op to fp32.

This is done by disconnecting the input tensor, inserting a CastOp, and re-connecting the output tensor of the CastOp at index.

Parameters
  • op – The op whose input is to be upcast.

  • index – The input index to op at which the tensor is to be upcast.

void downCastTensor(Tensor *tensor) const

Downcast fp32 tensor to fp16.

This is done by disconnecting it from its consumers, inserting a CastOp, and re-connecting the output tensor of the CastOp to the consumers.

Parameters

tensor – The tensor to be downcast.

Public Static Functions

static std::size_t id()
static bool isMixedPrecisionLossGradOp(Op *op)

To return true, the op’s implementation must be able to handle mixed precision maths.

We have no good way to know this programmatically at the point of running this transform, so we hard code this information here.

Parameters

op – The op to check for an implementation that is known to support mixed precision inputs.

Returns

True if it has an implementation known to support mixed precision inputs.

static Tensor *getLossScaleInputTensor(Op *op)

Only to be called on an op for which a call to isMixedPrecisionLossGradOp returns true.

Parameters

op – An MPLGO candidate whose loss scale tensor (or descendant thereof) you want to find.

Returns

The input tensor.

class ExplicitRecompute : public popart::Transform

Explicit recomputation is a transformation that clones forward-pass operations marked for recomputation and adds the clones to the backward pass.

Consider a fragment of the training graph before the explicit recomputation transform, where one gradient operation (CheckpointOp1Grad) requires a value from the forward pass (RecomputeOp1) which is considered for recomputation:

CheckpointOp0
      |
RecomputeOp0
      |
RecomputeOp1 ---.
 ...  |          \
CheckpointOp1     CheckpointOp1Grad
 ...               ...
      |             |
     Loss ----------'

(where CheckpointOp* is an op with op->settings.recomputeType == RecomputeType::Checkpoint and RecomputeOp* is an op with op->settings.recomputeType == RecomputeType::Recompute)

By marking these ops as ‘recompute’, the output of RecomputeOp1 does not need to remain live until the recomputation of CheckpointOp1Grad. In other words, the memory used to store this tensor is freed for allocation of other tensors as soon as RecomputeOp1’s output is read during the computation of CheckpointOp1. How does this work in practice?

After the transform, the graph fragment will look like:

CheckpointOp0 ------.
      |              \
RecomputeOp0          RecomputeOp0Clone
      |                |
RecomputeOp1           RecomputeOp1Clone
 ...  |                |
CheckpointOp1 -------- CheckpointOp1Grad
 ...                    ...
      |                  |
     Loss ---------------'

Where every operation marked as Recompute will be cloned and added to the backward pass, while all Checkpoint operations will remain connected as-is.

In pipelining, every copy operation between pipeline stages is (required to be) checkpointed (in order to not cause data dependencies between stages running in parallel), while everything else is recomputed. The user can choose to checkpoint more, but not recompute more (with pipelining).

The alternative, in the case of implicit recomputation, is to not transform the graph at the IR level, and to use these recomputation settings to affect the Ir lowering. In this case, the poplar::program::Sequences that correspond to the lowered RecomputeOps are added once to the main program as scheduled in the forward pass, and then again directly preceding the poplar::program::Sequence of the CheckpointOp1Grad. See the FindRequiredRecomputes class in irlowering.cpp.

Public Functions

inline ExplicitRecompute()
inline ~ExplicitRecompute() override
bool apply(Graph &graph) const final
inline std::size_t getId() const final
inline std::string getName() const final

Public Static Functions

static std::size_t id()
class HostIOSetup : public popart::Transform

Public Functions

inline HostIOSetup(int pass_)
inline ~HostIOSetup() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id(int)
class InferPipelineStages : public popart::Transform

Public Functions

inline InferPipelineStages()
inline ~InferPipelineStages() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class InplaceAccumulateGradPartialsIntoOptimizerAccumTensor : public popart::Transform

Replaces an accumulation tree consumed by an AccumulateOp (which has its own accumulator tensor), with an accumulation tree directly on the AccumulateOp’s accumulator tensor, thereby removing one allocation from the graph (the accumulation tree’s original accumulation tensor).

More precisely:

   Init
    |
   dW0  pW0
     \  /
AddLhsInPlace0
    |
   dW1  pW1
     \  /
AddLhsInPlace1
    |          A
   dw2         |
     \       accum
      \      /
     Accumulate3
         |
       accum'
         |
         B

Becomes:

      A
      |
    accum  pW0
       \   /
    Accumulate
        |
       dW1  pW1
         \   /
      Accumulate
          |
        accum'
          |
          B

See the comment below for more discussion of the conditions required to perform this transform.

The primary use case of this is a decomposed grad sum whose addition tree is fed into an AccumulateOp as part of the optimiser step.

Public Functions

InplaceAccumulateGradPartialsIntoOptimizerAccumTensor()
~InplaceAccumulateGradPartialsIntoOptimizerAccumTensor() final
bool apply(Graph &graph) const final
inline std::size_t getId() const final
inline std::string getName() const final

Public Static Functions

static std::size_t id()
class InterIpuCopy : public popart::Transform

Public Functions

inline InterIpuCopy()
inline ~InterIpuCopy() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class IoComputeTileCopy : public popart::Transform

Public Functions

inline IoComputeTileCopy()
inline ~IoComputeTileCopy() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class MainLoops : public popart::Transform

Public Functions

inline MainLoops()
inline ~MainLoops() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
static inline std::string getStepGraphName()

Return the name of the step subgraph.

The step subgraph is the body of the LoopOp stepLoop. The stepLoop is run when session.run(...) is called, and will run batchesPerStep number of times (i.e. the trip_count of the loop equals batchesPerStep). A step thus constitutes a call to session.run(...). As a call to session.run(...) involves a call to engine.run() (which is expensive, and will involve returning to the host for more data), we would like batchesPerStep to be as large as possible.

See https://github.com/onnx/onnx/blob/master/docs/Operators.md#Loop for details about the loop operator

Returns

The name of the step graph

static inline std::string getAccumulationGraphName()

Return the name of the gradient accumulation subgraph.

The gradient accumulation subgraph is the body of the LoopOp accumLoop. The accumLoop will run accumulationFactor number of times (i.e. the trip_count of the loop equals accumulationFactor) and will accumulate the gradients on each pass. These accumulated gradients will be used to calculate the weight update.

See https://github.com/onnx/onnx/blob/master/docs/Operators.md#Loop for details about the loop operator

Returns

The name of the accumulation graph

static Graph &getInnerLoopSubgraph(Ir &ir)

Helper function for accessing the subgraph of the inner loop.

The inner loop depends on the values of accumulationFactor and batchesPerStep. The inner loop equals:

  • The mainGraph if accumulationFactor = 1 and batchesPerStep = 1

  • The accumulationGraph if accumulationFactor > 1 and batchesPerStep = 1

  • The stepGraph if accumulationFactor = 1 and batchesPerStep > 1

  • The accumulationGraph if accumulationFactor > 1 and batchesPerStep > 1

Warning

Should only be used after the transform has been applied, that is, after a call to apply() has been made.

Note

innerLoop and outerLoop are represented by different graphs only when accumulationFactor > 1 and batchesPerStep > 1. In that case, the outerLoop repeats the innerLoop.

Returns

The inner loop subgraph

static const Graph &getInnerLoopSubgraph(const Ir &ir)
static Graph &getOuterLoopSubgraph(Ir &ir)

Helper function for accessing the subgraph of the outer loop.

The outer loop depends on the values of accumulationFactor and batchesPerStep. The outer loop equals:

  • The mainGraph if accumulationFactor = 1 and batchesPerStep = 1

  • The accumulationGraph if accumulationFactor > 1 and batchesPerStep = 1

  • The stepGraph if accumulationFactor = 1 and batchesPerStep > 1

  • The stepGraph if accumulationFactor > 1 and batchesPerStep > 1

Warning

Should only be used after the transform has been applied, that is, after a call to apply() has been made.

Note

innerLoop and outerLoop are represented by different graphs only when accumulationFactor > 1 and batchesPerStep > 1. In that case, the outerLoop repeats the innerLoop.

Returns

The outer loop subgraph

static LoopOp *getInnerLoopOp(Ir &ir)
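
A hedged sketch of querying the loop structure once the transform has been applied (ir is assumed to be a popart::Ir& to which MainLoops has been applied):

// Retrieve the inner and outer loop subgraphs and the inner LoopOp.
popart::Graph &inner = popart::MainLoops::getInnerLoopSubgraph(ir);
popart::Graph &outer = popart::MainLoops::getOuterLoopSubgraph(ir);
popart::LoopOp *innerLoop = popart::MainLoops::getInnerLoopOp(ir);
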
class MergeAllVarUpdates : public popart::MergeVarUpdates

Public Functions

inline MergeAllVarUpdates()
inline ~MergeAllVarUpdates() override
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class MergeAuto : public popart::MergeVarUpdates

Subclassed by popart::MergeLooseThreshold, popart::MergeTightThreshold

Public Functions

int64_t getThresholdMemory(const Graph&) const
class MergeLooseThreshold : public popart::MergeAuto

Public Functions

inline MergeLooseThreshold()
inline ~MergeLooseThreshold() override
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final
int64_t getMemToPlayWithAtPeak(const Graph&) const

Public Static Functions

static std::size_t id()
class MergeTightThreshold : public popart::MergeAuto

Public Functions

inline MergeTightThreshold()
inline ~MergeTightThreshold() override
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class MergeCollectivesTransform : public popart::Transform

A transform for merging multiple compatible collective operations into a single larger collective operation.

Ops are only merged if they appear in contiguous order in the schedule.

Currently supported collective types:

  • ReplicatedAllReduceOp

Public Functions

inline MergeCollectivesTransform()
inline ~MergeCollectivesTransform() override
bool apply(Graph &graph) const override
std::vector<Op*> applyToOps(Graph &graph, const std::set<OpId> includeOps) const
inline std::size_t getId() const override
inline std::string getName() const override
template<typename BaseType>
bool collectiveOpCheck(BaseType *A, BaseType *B) const

Confirm that two collective ops of the same BaseType use the same collective operation, i.e. ADD, MUL, etc. If the BaseType does not require a collective operation (e.g. gather), return true.

Parameters
  • A – the first op

  • B – the second op

Returns

true if A and B use the same collective operation, or if both use none

template<typename MultiOpType, typename BaseType>
Op *attemptToMergeOnOp(BaseType *baseOp, std::vector<Op*>::iterator &schedulePos, std::vector<Op*> &opSchedule) const

Given a collective operation, attempt to merge it with other compatible collective ops which are tied (in the schedule) to the current op.

Parameters
  • baseOp – a collective op that should be merged

  • opSchedule – the schedule of all (collective) ops in the graph

Returns

a pointer to the constructed op

template<typename MultiOpType, typename BaseType>
std::unique_ptr<MultiOpType> constructMultiOp(BaseType *baseOp, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet, std::vector<BaseType*> matchingOps) const

Constructs a new MultiOpType which will replace the baseOp and all matching ops.

Parameters
  • baseOp – is the operation to be replaced

  • outInfoFromBaseOps – is the output information for each output tensor collected from the ops with which base op will be merged.

  • inputVirtualGraphIdAndTileSet – the input virtual graph and tile set information collected from the ops that will be merged

  • outputVirtualGraphIdAndTileSet – the output virtual graph and tile set information collected from the ops that will be merged

  • matchingOps – the vector of matching ops

Returns

a unique pointer to the new multi-collective op

Public Static Functions

static std::size_t id()
class MergeCopies : public popart::Transform

Public Functions

inline MergeCopies()
inline ~MergeCopies() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class MergeDuplicateOps : public popart::Transform

Public Functions

inline MergeDuplicateOps()
inline ~MergeDuplicateOps() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class MergeExchange : public popart::Transform

Public Functions

inline MergeExchange()
inline ~MergeExchange() override
bool apply(Graph &graph) const override
std::vector<Op*> applyToOps(Graph &graph, const std::set<OpId> include_ops) const
inline std::size_t getId() const override
inline std::string getName() const override

Public Static Functions

static std::size_t id()
class MergeLoops : public popart::Transform

Public Functions

inline MergeLoops()
inline ~MergeLoops() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class MergeVarUpdates : public popart::Transform

Subclassed by popart::MergeAllVarUpdates, popart::MergeAuto

Public Types

using PartitionId = std::string
using PartitionMap = std::map<PartitionId, std::vector<VarUpdateStartEnd>>

Public Functions

PartitionId getPartitionId(Op *op) const
virtual bool apply(Graph&) const final
PartitionMap getLargestGroupTargetsMap(const Graph&) const
class OverlapIO : public popart::Transform

Public Functions

inline OverlapIO()
inline ~OverlapIO() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
static std::map<ExchangeStrategy, std::set<PipelineStage>> overlapIORequired(Ir &ir)

Check what level of ExchangeStrategy is required with overlapped IO.

Each pipeline stage can contain IO operations that belong to any of the strategies defined in the ExchangeStrategy enum. This will then inform how the IO operations of each pipeline stage have to be unrolled.

Parameters

ir – IR to check for overlapped IO settings

Returns

Map of required exchange strategies and pipeline stages in which exchanges occur. The set of stages will be empty if the ExchangeStrategy is not set on the InputSettings or AnchorReturnType of an input or output respectively. HostLoad and HostStore operations inserted by the HostIoSetup transform will inherit the ExchangeStrategy from InputSettings or AnchorReturnType respectively.

class Pipeline : public popart::Transform

Public Functions

inline Pipeline()
inline ~Pipeline() override
virtual bool apply(Graph &graph) const final

Checks if the pipelining settings are valid and applies either implicit or explicit pipelining transforms to the graph.

Parameters

graph – top-level IR graph (main graph) for implicit pipelining, pipeline loop subgraph for explicit pipelining

Returns

true if the transformation has changed the graph

inline virtual std::size_t getId() const final
inline virtual std::string getName() const final
bool addDynamicStashAndRestoreOps(Graph &graph) const

Add all required dynamic update and dynamic slice operations to the graph, which link forward and recompute/backward stages together via stashes. Only works for explicit pipelining.

Parameters

graph – Pipeline loop subgraph

Returns

True if successful, will raise error if not

bool contiguateIpuCopies(Graph &graph) const

Add required IpuCopyOps to ensure that within the pipelined execution, no copies between non-contiguous pipeline stages occur.

Parameters

graph – Pipeline loop subgraph

Returns

True if successful, will raise error if not

int getStashSize(const Ir &ir, PipelineStage stashStage, PipelineStage maxRestoreStage) const

Calculate the required stash size.

Parameters
  • ir – The current IR

  • stashStage – The stage in which the stash is updated

  • maxRestoreStage – The last stage in which the stash is restored

Returns

Required number of stash entries

Public Static Functions

static std::size_t id()
static bool checkIsFullRecompute(Graph &graph)
static bool checkIsFullCheckpoint(Graph &graph)
static bool inplaceRestoreRequiredForRecompute(Op *op)

Implicit pipelining and implicit recompute only! Test if the (implicit) recompute logic requires an inplace restored version of a forward ActGrad tensor (from the stash)

Parameters

op – the Op to check if it is convertible to RestoreInplaceOp and is required for (implicit) recompute

Returns

True if the inplace restore is required

static bool inplaceRecomputationConflict(Op *op, InIndex in, OutIndex out)

Implicit pipelining and implicit recompute only! Check if implicit recompute is in conflict with implicit pipelining when restoring a forward ActGrad tensor inplace.

Parameters
  • op – the Op to check

  • in – input index of the Op

  • out – output index of the Op

Returns

true if there is an inplace overwriting conflict

static void setFinalFwdStageRecomputation(Graph &graph)

Implicit pipelining and implicit recompute only! This annotation pass will try to set the Ops between the topologically final Checkpoints and the loss to NOT be recomputed.

This avoids a program where operations are run twice in a row with no benefit to liveness.

Parameters

graph – top-level IR graph (main graph)

static void checkOpsPipelineStage(Graph &graph)

Check and adjust pipeline stage annotations on operations.

Parameters

graph – Graph on which to check pipeline stages

static std::map<PipelineStage, PipelineStage> withStages(const Ir &ir)

Check which stages should be executed with which other stage.

Parameters

ir – IR from which to read the pipeline stages

Returns

Map of pipeline stages to which stage to execute with in sequence.

class PreAutomaticLossScale : public popart::Transform

A transform that annotates tensors in the forward graph, so that their gradients can be tracked in automatic loss scaling.

This transform reads a list of user-provided tensor IDs in the forward graph and inserts AutoLossScaleProxyOps after them (see example below). Later in the lowering process, the Autodiff transform will place the corresponding AutoLossScaleProxyGradOps in the backward graph, marking the tensor locations in the graph, for which to track gradients.

Example graph before applying the transform: A &#8212; MulOp &#8212; C B -’

Example graph after applying the transform with toTrackTensors = [“A”, “C”]: A &#8212; AlsProxyOp &#8212; A* &#8212; MulOp &#8212; C &#8212; AlsProxyOp &#8212; C* B ——————&#8212;’

It is important to apply the AutomaticLossScale transform after PreAutomaticLossScale and Autodiff to remove all AutoLossScaleProxyOps and AutoLossScaleProxyGradOps.

Public Functions

inline PreAutomaticLossScale()
inline ~PreAutomaticLossScale() override
virtual bool apply(Graph &graph) const final

Annotate tensors in the forward graph, so that their gradients can be found and tracked in automatic loss scaling.

See class documentation for details.

Parameters

graph – The graph to transform.

Throws
  • error – if the user provides an empty list to automaticLossScalingSettings.toTrackTensors.

  • error – if any of the tensor IDs in automaticLossScalingSettings.toTrackTensors don’t exist in the graph.

Returns

true if there was a change to the graph.

Returns

false if there wasn’t a change to the graph.

inline virtual std::size_t getId() const final
virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class Prune : public popart::Transform

Public Functions

inline Prune()
inline ~Prune() override
bool apply(Graph &graph) const override
inline std::size_t getId() const override
inline std::string getName() const override

Public Static Functions

static std::size_t id()
class RandomSetup : public popart::Transform

Public Functions

inline RandomSetup()
inline ~RandomSetup() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
static bool hasRandomSeed(const Ir &ir)
static bool requiresRandomSeed(const Ir &ir)
static TensorId getStreamedSeedTensorId()
class RemoteSetup : public popart::Transform

Public Functions

inline RemoteSetup()
inline ~RemoteSetup() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
static void getRemoteArgMapping(Graph &graph, RemoteArgOpMap&, RemoteOpArgMap&, RemoteArgBufferMap&)
class SerializeMatMuls : public popart::Transform

Public Functions

inline SerializeMatMuls()
inline ~SerializeMatMuls() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class StochasticRounding : public popart::Transform

Public Functions

inline StochasticRounding()
inline ~StochasticRounding() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
class StreamingMemory : public popart::Transform

Public Functions

inline StreamingMemory(int pass_)
inline ~StreamingMemory() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id(int)
class SubgraphOutline : public popart::Transform

Class for creating functionally equivalent subgraphs from SubgraphableOpClusters, and replacing instances of SubgraphableOpClusters with calls to these subgraphs.

Further down the stack, this allows for code-reuse, which results in a lower memory footprint for the compiled graph.

Public Functions

inline SubgraphOutline()
inline ~SubgraphOutline() override
virtual bool apply(Graph &graph) const final
inline virtual std::size_t getId() const final
inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()
static Graph &createSubgraph(const std::vector<SubgraphableOpCluster> instances, Ir &ir, std::map<Op*, int> &index_map, std::string subgraphId = "call")

Create a subgraph from a set of identical op clusters.

Parameters
  • instances – A set of SubgraphableOpClusters that can be replaced by a call to the same subgraph. All SubgraphableOpCluster instances must be functionally equivalent.

  • ir – The IR.

  • index_map – An empty map, passed by reference. Used to map from ops in the new subgraph to their corresponding indices in the first SubgraphableOpCluster instance. Required as input argument to ‘replaceWithCallOp’.

  • subgraphId – The returned subgraph’s id.

Returns

A Graph that is functionally equivalent to each SubgraphableOpCluster instance.

static Graph &createEmptySubgraph(const SubgraphableOpCluster &instance, Ir &ir, std::string subgraphId, const std::map<InIndex, OutIndex> &identityInputToOutputIndiciesMapping, const std::map<OutIndex, float> &outputIndiciesAndValues, AliasModel &aliasModel)

Create an ‘empty’ subgraph from an op cluster.

Parameters
  • instance – A SubgraphableOpCluster that is used as a template from which to build the ‘empty’ subgraph, in which input and output tensors can be connected via no-ops and output tensors can be set to default values.

  • ir – The IR.

  • subgraphId – The returned subgraph’s id.

  • identityInputToOutputIndiciesMapping – Specifies the connections of inputs to outputs via nop operations in the ‘empty’ subgraph. Each pair must have the same shape and type.

  • outputIndiciesAndValues – Map of pairs of output indices and values.

Returns

A low-compute Graph that stands in for the op when it is not executed.

static void setSubgraphOpSettingsFromClusterInstance(Op *op, const SubgraphableOpCluster &instance)
static Op *replaceWithCallOp(const SubgraphableOpCluster &instance, Graph &subgraph, const std::map<Op*, int> &index_map, AliasesMap &aliasesMap)

Replace a cluster of ops with a call to a subgraph.

Parameters
  • instance – The SubgraphableOpClusters instance to be replaced.

  • subgraph – The subgraph, a call to which is to replace the instance.

  • index_map – Used to map from ops in the new subgraph to their corresponding indices in the first SubgraphableOpCluster instance.

  • aliasesMap – AliasesMap with alias information for instance’s graph.

Returns

The replacement CallOp’s pointer.

static Op *replaceWithEmptyElseBranchIfOp(const SubgraphableOpCluster &instance, Graph &subgraph, Graph &emptySubgraph, const std::map<Op*, int> &index_map, AliasesMap &aliasesMap, Tensor *flag)

Replace an op with an IfOp.

The op is moved to the first ('then') branch of the IfOp. The second ('else') branch is a low-intensity compute branch which passes input tensors through to outputs or provides default output tensors.

Parameters
  • instance – The SubgraphableOpClusters instance which holds op to be replaced.

  • subgraph – The ‘then’ branch subgraph, which contains the op.

  • emptySubgraph – The ‘else’ branch: the low-intensity compute subgraph.

  • index_map – Used to map from ops in the new subgraph to their corresponding indices in the first SubgraphableOpCluster instance.

  • aliasesMap – AliasesMap with alias information for instance’s graph.

  • flag – A tensor that decides which branch is executed.

Returns

The replacement IfOp’s pointer.

#include <popart/bwdgraphinfo.hpp>
struct BwdGraphInfo

A data structure that captures the result of applying autodiff to a graph.

Public Functions

bool operator==(const BwdGraphInfo &rhs) const

Equality operator.

Public Members

GraphId bwdGraphId

The ID of the newly constructed backward graph.

ExpectedConnections expectedInputs

Expected connection details for each of bwdGraph’s inputs.

ExpectedConnections expectedOutputs

Expected connection details for each of bwdGraph’s outputs.

enum class popart::ExpectedConnectionType

The type of tensor expected to connect to a graph input or output.

Values:

enumerator Fwd = 0

A tensor from a forward graph.

enumerator FwdGrad = 1

The gradient of a tensor from a forward graph.

struct ExpectedConnection

Description of a tensor expected to connect to a graph input or output.

Public Functions

bool operator==(const ExpectedConnection &rhs) const

Equality operator.

Public Members

TensorId fwdId

TensorId in the fwdGraph.

ExpectedConnectionType type

Either fwdId or getGradId(fwdId).
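As an illustration of how these records might be consumed (a sketch only; the surrounding autodiff call that produces `info` is assumed, and ExpectedConnections is assumed to be iterable as a sequence of ExpectedConnection):

#include <popart/bwdgraphinfo.hpp>

// Sketch: walk the expected inputs of a backward graph produced by
// autodiff to see which forward tensor each input corresponds to.
void listExpectedInputs(const popart::BwdGraphInfo &info) {
  for (const popart::ExpectedConnection &ec : info.expectedInputs) {
    if (ec.type == popart::ExpectedConnectionType::Fwd) {
      // The input is the forward tensor ec.fwdId itself.
    } else {
      // ExpectedConnectionType::FwdGrad: the input is the gradient
      // of the forward tensor ec.fwdId.
    }
  }
}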

14.11. Utility classes

14.11.1. Graph

#include <popart/graphutils.hpp>
using popart::graphutils::CallStack = std::vector<Op*>

CallStack representation.

using popart::graphutils::TensorAndCallStack = std::pair<Tensor*, CallStack>

14.11.2. Region

#include <popart/region.hpp>

14.11.3. Error handling

#include <popart/error.hpp>
enum class popart::ErrorSource

Values:

enumerator popart = 0
enumerator popart_internal
enumerator poplar
enumerator poplibs
enumerator unknown
class error : public runtime_error

Exception class for popart.

Subclassed by popart::internal_error, popart::memory_allocation_err, popart::runtime_error

Public Functions

template<typename ...Args>
inline explicit error(const char *s, const Args&... args)

Variadic constructor for error which allows the user to use a fmt string for the message.

throw error("This is an error reason {}", 42);

template<typename ...Args>
inline explicit error(const std::string &s, const Args&... args)
template<typename ...Args>
inline explicit error(ErrorUid uid, const char *s, const Args&... args)
template<typename ...Args>
inline explicit error(ErrorUid uid, const std::string &s, const Args&... args)
const std::string &stackreport() const
inline ErrorUid uid() const
class internal_error : public popart::error

Exception class specific to internal errors. This should be used like an assert, for states that the user should not have been able to create.

Public Functions

template<typename ...Args>
inline explicit error(const char *s, const Args&... args)

Variadic constructor for error which allows the user to use a fmt string for the message.

throw error("This is an error reason {}", 42);

template<typename ...Args>
inline explicit error(const std::string &s, const Args&... args)
template<typename ...Args>
inline explicit error(ErrorUid uid, const char *s, const Args&... args)
template<typename ...Args>
inline explicit error(ErrorUid uid, const std::string &s, const Args&... args)
class memory_allocation_err : public popart::error

Subclassed by popart::popx::devicex_memory_allocation_err

Public Functions

inline memory_allocation_err(const std::string &info)
virtual std::unique_ptr<memory_allocation_err> clone() const = 0
virtual std::string getSummaryReport() const = 0
virtual std::string getProfilePath() const = 0
class runtime_error : public popart::error

Exception class specific to errors that occur when running a model.

For example, this error could be thrown when a user-implemented IStepIO callback doesn’t return any data.

NOTE: This is distinct from the C++ standard library’s std::runtime_error.

Public Functions

template<typename ...Args>
inline explicit error(const char *s, const Args&... args)

Variadic constructor for error which allows the user to use a fmt string for the message.

throw error("This is an error reason {}", 42);

template<typename ...Args>
inline explicit error(const std::string &s, const Args&... args)
template<typename ...Args>
inline explicit error(ErrorUid uid, const char *s, const Args&... args)
template<typename ...Args>
inline explicit error(ErrorUid uid, const std::string &s, const Args&... args)
class devicex_memory_allocation_err : public popart::memory_allocation_err

Public Functions

devicex_memory_allocation_err(const devicex_memory_allocation_err &rhs)
devicex_memory_allocation_err(const poplar::graph_memory_allocation_error &e, const poplar::OptionFlags &_reportOptions)
std::unique_ptr<memory_allocation_err> clone() const
std::string getSummaryReport() const
std::string getProfilePath() const
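As a brief sketch of how these classes might be used in application code (the surrounding session code is elided; handlers are ordered from most to least specific):

#include <popart/error.hpp>

// A minimal sketch: all PopART exceptions derive from popart::error,
// so they can be caught by increasing generality.
void runGuarded() {
  try {
    // ... build and run a session here ...
    throw popart::error("This is an error reason {}", 42);
  } catch (const popart::internal_error &e) {
    // Indicates a bug in PopART itself; e.what() carries the message.
  } catch (const popart::memory_allocation_err &e) {
    // The graph did not fit; e.getSummaryReport() gives details.
  } catch (const popart::error &e) {
    // Any other PopART error, including popart::runtime_error.
  }
}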

14.11.4. Debug context

#include <popart/debugcontext.hpp>
class DebugContext

Public Functions

DebugContext(SourceLocation loc = SourceLocation::Current())
DebugContext(const char *name, SourceLocation loc = SourceLocation::Current())
DebugContext(std::string name, SourceLocation loc = SourceLocation::Current())
DebugContext(const DebugInfo &debugInfo, std::string name = "", SourceLocation loc = SourceLocation::Current())
DebugContext(const DebugNameAndId &debugNameAndId, std::string name = "", SourceLocation loc = SourceLocation::Current())
DebugContext(DebugContext&&)
DebugContext(const DebugContext&)
~DebugContext()
std::string getPathName() const
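A brief sketch of typical construction (the name "conv1" is illustrative): a DebugContext captures its source location by default, and the name contributes to the path returned by getPathName().

#include <popart/debugcontext.hpp>

#include <string>

// Sketch: DebugContext records the call site via SourceLocation::Current()
// unless another location is supplied.
void annotate() {
  popart::DebugContext dc("conv1");
  std::string path = dc.getPathName(); // e.g. ".../conv1" (illustrative)
}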
class DebugInfo

Subclassed by popart::OnnxOpDebugInfo, popart::OnnxVariableDebugInfo, popart::OpDebugInfo, popart::TensorDebugInfo

Public Types

enum class SerializationFormat

Values:

enumerator JSON

Serialise in JSON format.

enumerator CBOR

Serialise in CBOR format.

Public Functions

DebugInfo(const DebugContext &debugContext, const std::string &layer)
DebugInfo &operator=(const DebugInfo&) = delete
DebugInfo(const DebugInfo&) = delete
virtual ~DebugInfo()
DebugId getId() const
std::string getPathName() const
bool setValue(std::string name, ProfileValue value)

Public Static Functions

static void initializeStreamer(const std::string &fileName, const SerializationFormat &format = SerializationFormat::CBOR)
static void closeStreamer()
class OnnxOpDebugInfo : public popart::DebugInfo

Public Functions

OnnxOpDebugInfo(const DebugContext &debugContext, const Node &node)
OnnxOpDebugInfo &operator=(const OnnxOpDebugInfo&) = delete
OnnxOpDebugInfo(const OnnxOpDebugInfo&) = delete
virtual ~OnnxOpDebugInfo() = default
class OnnxVariableDebugInfo : public popart::DebugInfo

Public Functions

OnnxVariableDebugInfo(const DebugContext &dc, const ONNX_NAMESPACE::TensorProto &proto)
OnnxVariableDebugInfo(const DebugContext &dc, const ONNX_NAMESPACE::ValueInfoProto &proto)
OnnxVariableDebugInfo(const DebugContext &dc, const ONNX_NAMESPACE::ValueInfoProto &proto, const TensorInfo &ti)
OnnxVariableDebugInfo &operator=(const OnnxVariableDebugInfo&) = delete
OnnxVariableDebugInfo(const OnnxVariableDebugInfo&) = delete
virtual ~OnnxVariableDebugInfo() = default
class OpDebugInfo : public popart::DebugInfo

Public Functions

OpDebugInfo(const DebugContext &debugContext, const Op &_op)
virtual ~OpDebugInfo()
OpDebugInfo &operator=(const OpDebugInfo&) = delete
OpDebugInfo(const OpDebugInfo&) = delete
void finalize()
class TensorDebugInfo : public popart::DebugInfo

Public Functions

TensorDebugInfo(const DebugContext &debugContext, const TensorId &tenid, const TensorInfo &info, const TensorType &tt)
TensorDebugInfo(const DebugContext &debugContext, const TensorId &tenid, const TensorType &tt)
TensorDebugInfo &operator=(const TensorDebugInfo&) = delete
TensorDebugInfo(const TensorDebugInfo&) = delete
virtual ~TensorDebugInfo() = default

14.11.5. Attributes

#include <popart/attributes.hpp>
class Attributes

Wrapper around the container of ONNX_NAMESPACE::AttributeProtos of a Node.

Provides faster and cleaner reads of values from keys (strings) than ONNX_NAMESPACE::AttributeProto.

Public Types

using Ints = std::vector<int64_t>

The types of attributes as defined in the ONNX spec.

using Int = int64_t
using Floats = std::vector<float>
using Float = float
using Strings = std::vector<std::string>
using String = std::string
using Graphs = std::vector<ONNX_NAMESPACE::GraphProto>
using Graph = ONNX_NAMESPACE::GraphProto

Public Functions

Attributes(const NodeAttributes&)
Attributes() = default
const std::vector<std::string> &getNames() const
onnxAttPtr at(const std::string &name) const
void append(std::stringstream &ss, std::string prefix = "") const
template<typename T>
void setIfPresent(T&, const std::string &key) const
template<typename T>
void set(T&, const std::string &key) const
bool hasAttribute(const std::string &key) const
void takeAttribute(const std::string &key, const Attributes &attributes)

Take an attribute identified by key from the given Attributes object.

template<typename UnaryPredicate>
inline Attributes filter(UnaryPredicate p) const

Take the set of attributes that match the given predicate.

template<typename T>
T getAttribute(const std::string &key, const T &defaultValue) const
Attributes::Graphs getAllGraphAttributes() const
template<typename T>
T getAttribute(const std::string &key) const
template<typename T>
void setAttribute(const std::string &key, T&)
template<>
Attributes filter(const char *key) const
template<>
Attributes filter(std::string key) const
template<>
void setIfPresent(std::vector<int64_t>&, const std::string &key) const
template<>
void setIfPresent(int64_t&, const std::string &key) const
template<>
void setIfPresent(bool &v, const std::string &key) const
template<>
void setIfPresent(std::string&, const std::string &key) const
template<>
void setIfPresent(float&, const std::string &key) const
template<>
void set(std::vector<int64_t> &vs, const std::string &key) const
template<>
void set(std::vector<float> &vs, const std::string &key) const
template<>
void set(std::vector<std::string> &vs, const std::string &key) const
template<>
void set(float &v, const std::string &key) const
template<>
void set(int64_t &v, const std::string &key) const
template<>
Attributes::Ints getAttribute(const std::string &key, const Attributes::Ints &defaultValue) const
template<>
Attributes::Int getAttribute(const std::string &key, const Attributes::Int &defaultValue) const
template<>
Attributes::String getAttribute(const std::string &key, const Attributes::String &defaultValue) const
template<>
Attributes::Float getAttribute(const std::string &key, const Attributes::Float &defaultValue) const
template<>
Attributes::Ints getAttribute(const std::string &key) const
template<>
void setAttribute(const std::string &key, Attributes::Ints&)
template<>
void setAttribute(const std::string &key, Attributes::Int&)
template<>
void setAttribute(const std::string &key, Attributes::String&)
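A sketch of typical reads using the functions and specialisations listed above (the attribute names "axis", "pads" and "alpha" are illustrative, not fixed by the API):

#include <popart/attributes.hpp>

// Sketch: common ways of reading node attributes.
void readAttributes(const popart::Attributes &attr) {
  // Overwrite the default only if the attribute is present.
  int64_t axis = 0;
  attr.setIfPresent(axis, "axis");

  // Read with an explicit fallback value.
  popart::Attributes::Ints pads =
      attr.getAttribute<popart::Attributes::Ints>("pads", {0, 0});

  // Guard a read that has no sensible default.
  if (attr.hasAttribute("alpha")) {
    float alpha = attr.getAttribute<popart::Attributes::Float>("alpha", 0.f);
  }
}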

14.11.6. Void data

#include <popart/voiddata.hpp>
class ConstVoidData

A class to point to constant data.

Public Functions

ConstVoidData() = default
ConstVoidData(const void *data_, const TensorInfo &info_)
inline bool storesData() const
void store(std::vector<char> &&d, const TensorInfo &i)

Public Members

const void *data = nullptr
TensorInfo info
class MutableVoidData

A class to point to non-constant data.

Public Members

void *data = nullptr
TensorInfo info
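A sketch showing both classes pointing at host buffers (the buffer contents and shapes are illustrative):

#include <popart/tensorinfo.hpp>
#include <popart/voiddata.hpp>

#include <vector>

// Sketch: wrap a read-only host buffer and a writable host buffer.
void wrapBuffers() {
  popart::TensorInfo info(popart::DataType::FLOAT, {2, 2});

  std::vector<float> weights(4, 1.0f);
  popart::ConstVoidData constData(weights.data(), info);

  std::vector<float> out(4, 0.0f);
  popart::MutableVoidData mutData;
  mutData.data = out.data();
  mutData.info = info;
}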

14.11.7. Input shape information

#include <popart/inputshapeinfo.hpp>
class InputShapeInfo

Class that contains what is known about the input tensors (as TensorInfo objects) in the IR prior to compilation.

This knowledge can sometimes be compiled into the IR, and for certain backends it is even required; for example, the IPU requires the shapes of all Stream Tensors.

Public Functions

InputShapeInfo() = default

Default constructor for the InputShapeInfo class.

void add(TensorId, const TensorInfo&)

Add the identifier and TensorInfo object for a tensor to the InputShapeInfo object.

Parameters
  • TensorId – The identifier of the tensor for which information is being added.

  • TensorInfo – The tensor information to be added.

const TensorInfo &get(TensorId) const

Get the information of a tensor.

Parameters

TensorId – The identifier of the tensor for which to get the tensor information.

bool has(TensorId) const

Check if the InputShapeInfo object contains information for a tensor.

Parameters

TensorId – The identifier of the tensor to check.

Returns

If true, the InputShapeInfo object contains information for the tensor. If false, the InputShapeInfo object does not contain information for the tensor.

std::vector<TensorId> getAllTensorIds() const

Get all unique tensor identifiers of tensors in the InputShapeInfo object.

Returns

Vector of tensor identifiers.

inline const std::map<TensorId, TensorInfo> &getInfos() const

Get all information contained in the InputShapeInfo object.

Returns

Map of tensor identifiers and the corresponding tensor information.
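A sketch of typical use (the tensor ID and shape are illustrative):

#include <popart/inputshapeinfo.hpp>
#include <popart/tensorinfo.hpp>

// Sketch: declare the shape of a stream tensor before compilation.
void declareInputShape() {
  popart::InputShapeInfo shapeInfo;
  shapeInfo.add("input/0",
                popart::TensorInfo(popart::DataType::FLOAT16,
                                   {1, 3, 224, 224}));

  if (shapeInfo.has("input/0")) {
    const popart::TensorInfo &ti = shapeInfo.get("input/0");
  }
}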

14.11.8. Profiling

#include <popart/liveness.hpp>
class LivenessAnalyzer

Public Types

using PendingCopies = std::vector<LivenessNode>

Public Functions

LivenessAnalyzer(const Ir *ir_, const SubgraphCopyingStrategy *subgraphCopyingStrat)
void apply()
int64_t getGlobalSchedulePosition(CallStack callStack) const
inline size_t getOpScheduleSize() const
inline const LivenessNode &getOpScheduleAt(int64_t scheduleIndex) const
inline const std::vector<Op*> &getGraphOpSchedule(GraphId id) const
inline const std::vector<int64_t> &getScheduleIndices(Op *op) const
inline const std::vector<int64_t> &getScheduleIndices(Tensor *t) const
inline const std::vector<int64_t> &getScheduleIndices(TensorId tid) const
inline const std::vector<int64_t> &getCallSiteLinksAt(int64_t scheduleIndex) const
inline const std::vector<int64_t> &getCallSiteLinksInvAt(int64_t scheduleIndex) const
inline const std::vector<Op*> &getGraphCallSites(GraphId id) const
int64_t getContextStartIndex(ExecutionContext context) const
int64_t getContextEndIndex(ExecutionContext context) const
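A sketch of how the analyzer might be wired up. The OnEnterAndExitSubgraphCopyingStrategy class, its setIr/setLivenessAnalyzer members, and the popart/subgraphcopyingstrategy.hpp header are assumptions for illustration:

#include <popart/liveness.hpp>
#include <popart/subgraphcopyingstrategy.hpp>

// Sketch: liveness analysis requires a subgraph-copying strategy, and
// the two objects are assumed to be cross-wired before use.
void analyse(const popart::Ir &ir) {
  popart::liveness::OnEnterAndExitSubgraphCopyingStrategy strategy;
  popart::liveness::LivenessAnalyzer analyzer(&ir, &strategy);
  strategy.setIr(&ir);
  strategy.setLivenessAnalyzer(&analyzer);

  analyzer.apply();
  size_t scheduleSize = analyzer.getOpScheduleSize();
}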
#include <popart/subgraphpartitioner.hpp>
class SubgraphPartitioner

When lowering CallOps, we would previously copy all tensors from the call site (the CallOp’s input tensors) to the subgraph’s input tensors, do the call and then copy the subgraph’s output tensors back to the call site’s output tensors:

Copy(caller_in_1, subgraph_in_1)
Copy(caller_in_2, subgraph_in_2)
Call(subgraph)
Copy(subgraph_out_1, caller_out_1)
Copy(subgraph_out_2, caller_out_2)

With this approach, both subgraph_in_1 and subgraph_in_2 are live during the whole call. This can be suboptimal: some subgraph inputs may not be required until later in the subgraph, and copying them later would reduce the memory required. Analogously, some subgraph outputs may be ready to copy well before the end of the subgraph, and it may be advantageous to do this copy early. This is especially true for subgraphs that deal with multiple inputs/outputs in sequence.

To that end, graphs now support lowering over multiple "subgraph parts", allowing CallOps that have these subgraphs as their called graph to copy inputs later and outputs earlier. Essentially, each graph is ‘split’ over multiple PopART fragments / Poplar sequences so that any parent graph that calls it can place copies of inputs or outputs between those parts, as illustrated below.
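For example, with the subgraph split into three parts, the lowering of the same call might instead look like this (illustrative only; the actual split is determined by the LivenessAnalyzer's schedule):

Copy(caller_in_1, subgraph_in_1)
Call(subgraph, part 0)
Copy(caller_in_2, subgraph_in_2)
Call(subgraph, part 1)
Copy(subgraph_out_1, caller_out_1)
Call(subgraph, part 2)
Copy(subgraph_out_2, caller_out_2)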

The scheduling of copies for subgraph ops is already modelled by the LivenessAnalyzer, and the partitioning is based on this model. This class interprets the LivenessAnalyzer’s schedule and determines how to split subgraphs into parts.

Public Types

enum class CallOpPartType

Enum type for CallOpPart types.

Values:

enumerator Undefined = 0
enumerator CopyInput
enumerator CopyOutput
enumerator CopyModified
enumerator CallSubgraphPart
using CallOpSchedule = std::vector<std::tuple<CallOpPart, SubgraphPartIndex>>

Public Functions

SubgraphPartitioner() = default

Default constructor.

virtual ~SubgraphPartitioner() = default

Default destructor.

virtual void apply()

Prepare the results.

Errors if the IR or LivenessAnalyzer dependency is not set.

virtual void setIr(const Ir*)

Set the IR dependency to use.

virtual void setLivenessAnalyzer(const LivenessAnalyzer*)

Set the LivenessAnalyzer dependency to use.

virtual int getNumSubgraphParts(const Graph&) const

Interpret the liveness analysis and work out how many subgraph parts a graph needs to lower all fragments between input/output copies.

Errors if apply() has not been run.

virtual SubgraphPartIndex getOpSubgraphPartBegin(Op*) const

Interpret the liveness analysis and work out which subgraph part an op is in, based on the copying of inputs/outputs in the subgraph that contains the op.

For ops that spread over multiple subgraph parts (that is, CallOps) this returns the first such part. Errors if apply() has not been run.

virtual SubgraphPartIndex getOpSubgraphPartEnd(Op*) const

Interpret the liveness analysis and work out the index one larger than the last subgraph part an op is in, based on the copying of inputs/outputs in the subgraph that contains the op.

For ops that spread over multiple subgraph parts (that is, CallOps) this returns one past the last such part. Errors if apply() has not been run.

virtual CallOpSchedule getCallOpSchedule(CallOp*) const

Interpret the liveness analysis results and work out how a CallOp is broken down over various subgraph parts.

The result is a vector of pairs of CallOp ‘parts’ and the ‘subgraph parts’ they should be lowered in.

Public Static Functions

static bool isPartitionable(const Graph &graph)

Returns true for a graph if we support it being ‘broken’ into multiple subgraph parts.

The main graph does not support this. Subgraphs that are called by any op that is not a CallOp also do not support this.

class CallOpPart

A class to represent a part of a CallOp.

Public Members

CallOpPartType type
InIndex inIndex
OutIndex outIndex
SubgraphPartIndex subgraphPartIndex
#include <popart/aliaszerocopy.hpp>
class AliasZeroCopy

Public Functions

AliasZeroCopy(const Ir *ir, const LivenessAnalyzer *analyzer)
void apply()
void removePostIRAliases(Tensor*)
std::set<Tensor*, PTensorCmp> getPostIRAliases(Tensor*) const
std::set<Tensor*, PTensorCmp> getTensorsWithPostIRAliases() const
std::set<Tensor*, PTensorCmp> getProposedAliasedTensors(std::set<Tensor*, PTensorCmp> tensors, bool fullyAliased) const
std::set<Tensor*, PTensorCmp> getActiveAliasedTensors(std::set<Tensor*, PTensorCmp> tensors, bool fullyAliased) const
void activateAlias(Tensor *ta, Tensor *tb)
bool nodeRequired(Op *op, OpStatus status, int index) const
bool opRequired(Op*) const
bool copyInputRequired(Op*, InIndex) const
bool copyLoopCarriedRequired(Op*, InIndex) const
bool copyModifiedRequired(Op*, InIndex) const
bool copyOutputRequired(Op*, OutIndex) const
void printLivenessIntervals(std::set<Tensor*, PTensorCmp> tensors, ProducerInterval producerInterval)
Intervals getLivenessIntervals(Tensor*, ProducerInterval)
Intervals getCandidateLivenessIntervals(Tensor*, ProducerInterval = ProducerInterval::Enforce, bool forceUpdateCache = false)

Public Static Functions

static std::size_t id()
static bool doOverlap(const Intervals &aIntervals, const Intervals &bIntervals)
class Intervals

Public Functions

Intervals()
Intervals(const Intervals &other)
~Intervals()
void insert(int64_t s, int64_t e)
bool empty() const
Intervals operator&(const Intervals &other) const
Intervals &operator=(const Intervals &other)
Intervals &operator+=(const Intervals &other)
bool operator==(const Intervals &other) const
bool operator!=(const Intervals &other) const

Friends

friend std::ostream &operator<<(std::ostream &os, const Intervals&)
enum popart::liveness::ProducerInterval

Values:

enumerator Enforce = 0
enumerator Ignore

14.11.9. Task information

#include <popart/taskid.hpp>
class TaskId

A class describing an IR-to-poplar lowering task.

This is a class that is cheap to construct. We construct and compare TaskIds a lot in irlowering.cpp so it pays to make these cheap operations. Note that previously TaskId was a std::string and creating a TaskId typically involved some string manipulation, meaning heap memory may be involved. Comparing strings for equality or ordering strings is also typically not constant-time.

Public Types

enum class Type

TaskId type.

Values:

enumerator AnchorStreamToHostTask = 0
enumerator AnchorSumTask
enumerator AnchorToHostTask
enumerator FromHostTask
enumerator FromHostUpdateTask
enumerator FromOpTask
enumerator InitBatchCounterTensorsTask
enumerator InitRngSeedsTask
enumerator InitRandomSeedTask
enumerator InitRngStateTensorTask
enumerator InitTensorTask
enumerator PipelinedCopyTask
enumerator RandomSeedToHostTask
enumerator RngStateFromHostTask
enumerator RngStateToHostTask
enumerator SetInitTensorValTask
enumerator StreamFromHostTask
enumerator UpdateBatchCountTask
enumerator WeightStreamToHostTask
enumerator WeightToHostTask
enumerator Undefined

Public Functions

TaskId()
explicit TaskId(Type type)
TaskId(Type, const TensorId &tensorId)
TaskId(Type type, const OpId &opId, const OperatorIdentifier &opIdentifier)
TaskId(Type type, const OpId &opId, const OperatorIdentifier &opIdentifier, const OpxGrowPartId &opxGrowPartId)
TaskId(Type type, nonstd::optional<TensorId> tensorId, nonstd::optional<OpId> opId, nonstd::optional<OperatorIdentifier> opIdentifier, nonstd::optional<OpxGrowPartId> opxGrowPartId)
bool empty() const
bool operator<(const TaskId &rhs) const
bool operator==(const TaskId &rhs) const
inline const nonstd::optional<TensorId> &getTensorId() const
inline const Type &getType() const
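A sketch of construction and comparison using the constructors listed above (the tensor ID "W0" is illustrative):

#include <popart/taskid.hpp>

// Sketch: TaskIds are cheap value types; no string formatting is involved.
void compareTasks() {
  popart::TaskId a(popart::TaskId::Type::FromHostTask,
                   popart::TensorId("W0"));
  popart::TaskId b(popart::TaskId::Type::InitTensorTask,
                   popart::TensorId("W0"));

  bool equal   = (a == b); // false: the types differ
  bool ordered = (a < b);  // TaskIds support ordering for use as map keys
}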

14.11.10. Type definitions

namespace onnx
namespace google
namespace protobuf
namespace popart
namespace view

Typedefs

using Regions = std::vector<Region>
using RegMap = std::function<Regions(const Region&)>
using LowBounds = std::vector<int64_t>
using UppBounds = std::vector<int64_t>

Typedefs

using Shape = std::vector<int64_t>

The dimensions of a tensor, equivalent to numpy.shape.

using Rank = int

Rank of a tensor. That is, the number of indices.

typedef std::string TensorId

Label put on a tensor to distinguish it from the others in the graph.

using DnfTensorIds = std::vector<std::set<TensorId>>
using OpName = std::string

Name of the instance of the operator.

using OpDomain = std::string

Specifies who created the operator.

Part of domain.type:version used as an Op identifier by ONNX (https://github.com/onnx/onnx/blob/master/docs/Versioning.md)

using OpType = std::string

Specifies the type of an operator.

Part of domain.type:version used as an Op identifier by ONNX (https://github.com/onnx/onnx/blob/master/docs/Versioning.md)

using OpVersion = unsigned

Specifies the version of the operator.

Part of domain.type:version used as an Op identifier by ONNX (https://github.com/onnx/onnx/blob/master/docs/Versioning.md)

using OpId = int

Label put on a operator to distinguish it from the others in the graph.

using ReturnPeriod = int
using ReplicaIndex = int

The index of a replica.

using SubgraphIndex = int

The index of a subgraph for an Op.

using SubgraphPartIndex = int

The index of the subgraph part.

using OpxGrowPartId = int

Identifies a part of an Opx grow function.

using InIndex = int

The position at which a tensor is input by an Op.

using OutIndex = int

The position at which a tensor is output by an Op.

using CollectiveBalancedReorderId = int

The identifier of the collective balanced host rearrangement.

using ReplicatedTensorShardingIndices = std::set<std::pair<std::set<InIndex>, std::set<OutIndex>>>

The set of indices that have to be replica sharded together, and the outputs that will be replica sharded as a result.

using ReplicatedTensorShardingIndicesIndex = int

The position in ReplicatedTensorShardingIndices for which to get the ReplicatedTensorShardingGroup.

using ReplicatedTensorShardingGroupId = int

The unique integer id for a ReplicatedTensorShardingGroup.

using PipelineCycle = int64_t
using VGraphId = int64_t
using PipelineStage = int64_t
using ExecutionPhase = int64_t
using BatchSerializedPhase = int64_t
using StashIndex = int64_t
using RemoteBufferId = int64_t
using RemoteBufferIndex = int64_t
using RandomReferenceId = int64_t
using ConvInputs = std::vector<TensorId>
using ConvDilations = std::vector<int64_t>
using ConvGroup = int64_t
using ConvPads = std::vector<int64_t>
using ConvStrides = std::vector<int64_t>
using ConvTruncs = std::vector<int64_t>
using MultiConvInputs = std::vector<ConvInputs>
using MultiConvDilations = std::vector<ConvDilations>
using MultiConvGroups = std::vector<ConvGroup>
using MultiConvPads = std::vector<ConvPads>
using MultiConvStrides = std::vector<ConvStrides>
using TensorInterval = std::pair<size_t, size_t>
using TensorIntervalList = std::vector<TensorInterval>
using onnxAttPtr = const ONNX_NAMESPACE::AttributeProto*
using NodeAttributes = google::protobuf::RepeatedPtrField<ONNX_NAMESPACE::AttributeProto>
using OnnxTensors = std::map<TensorId, ONNX_NAMESPACE::TensorProto>
using Node = ONNX_NAMESPACE::NodeProto
using OnnxTensorPtrs = std::map<TensorId, const ONNX_NAMESPACE::TensorProto*>
using OpsBeforeKey = std::map<Op*, std::vector<Op*>, POpCmp>
using IsReplicaEqual = bool
using ReplEqInputMap = std::map<InIndex, IsReplicaEqual>
using ReplEqOutputMap = std::map<InIndex, IsReplicaEqual>
using ReplEqModifiedInputMap = ReplEqInputMap
using ReplEqFun = std::function<std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap>(const ReplEqInputMap&)>
using ReplEqGraphFuns = std::function<ReplEqFun(const Graph*)>

Enums

enum StochasticRoundingMethod

Used to describe the stochastic rounding which is applied to the output(s) of an Op.

See also docs/notes/ir/attributes/stochasticroundingmethod.md

Values:

enumerator DifferingSeeds = 1

Apply stochastic rounding with a replica-local seed.

That is, stochastic rounding performed by an Op on one replica is nominally different to stochastic rounding performed by the same Op on another replica. Use this setting for Ops where you want to apply stochastic rounding but you cannot meet the condition of StochasticRoundingMethod::IdenticalSeeds. For example, this setting can be useful for gradient accumulation steps.

enumerator IdenticalSeeds = 2

Apply stochastic rounding with an RNG state (the value of poplar::getHwSeeds) that is identical across replicas.

Use this option on, e.g., the weight update step to ensure that the weight tensor on each replica has stochastic rounding applied to it in the same way and there is no weight drift.

REQUIREMENT: The ability to provide an RNG state (the value of poplar::getHwSeeds) that is identical on each replica relies on all Ops that use this setting to behave in a way that does not violate this property for Ops that follow it. More formally, you must only apply this setting to Ops for which you can guarantee that if the RNG state is the same across replicas before the Op is executed then the RNG state is still the same on all replicas after the Op is done executing. A typically sufficient (but not necessary) condition is that all input tensors of the Op have the same value across replicas.

using popart::FwdGraphToBwdGraphInfo = std::map<GraphId, BwdGraphInfo>

Mapping from fwdGraph to info on the bwdGraph.

using popart::popx::PreparedCopyTensors = std::map<InIndex, PreparedCopyTensor>
using popart::popx::PreparedTensorInfos = std::vector<PreparedTensorInfo>

14.11.11. Enums

enum class popart::AccumulationType

Values:

enumerator Add = 0
enumerator DampenedAdd
enumerator DampenedAddSquare
enumerator DecayAdd
enumerator DecayAddSquare
enumerator MovingAverage
enumerator MovingAverageSquare
enumerator Infinity
enumerator Mean
enum class popart::ActivationFunction

Values:

enumerator Sigmoid = 0
enumerator Relu
enumerator Tanh
enumerator Gelu
enumerator GeluErf
enumerator Swish
enumerator Softmax
enumerator SoftmaxStable
enumerator SoftmaxScaled
enumerator N
enumerator Invalid
enum class popart::AutoPad

Values:

enumerator NOTSET = 0
enumerator SAME_UPPER
enumerator SAME_LOWER
enumerator VALID
enum class popart::CollectiveOperator

Values:

enumerator Add = 0
enumerator Mean
enumerator Mul
enumerator Min
enumerator Max
enumerator LogicalAnd
enumerator LogicalOr
enumerator SquareAdd
enumerator Local
enumerator N
enum class popart::DeviceSelectionCriterion

Controls how to select an available IPU.

Values:

enumerator First = 0

Select the first device available. (Default.)

enumerator Random

Select a device randomly from those available.

enum class popart::InitType

Values:

enumerator NoInit = 0
enumerator Zero
enum class popart::MatMulPartialsType

Values:

enumerator HALF
enumerator FLOAT
enum class popart::ResizeCoordinateTransformationMode

Values:

enumerator HalfPixel
enumerator PytorchHalfPixel
enumerator AlignCorners
enumerator Asymmetric
enumerator TfCropAndResize
enumerator N
enum class popart::ResizeMode

Values:

enumerator Nearest
enumerator Linear
enumerator Cubic
enumerator N
enum class popart::ResizeNearestMode

Values:

enumerator RoundPreferFloor
enumerator RoundPreferCeil
enumerator Floor
enumerator Ceil
enumerator Pytorch
enumerator N
enum class popart::ScatterReduction

Values:

enumerator Sum = 0
enumerator Max
enumerator Min
enumerator Mul
enumerator None
enum class popart::TensorRemapType

Enum describing how the tensor layout should be remapped during the forward and backward pass (backward pass remapping requires the Op to exist in the IR before autodiff).

Values:

enumerator FwdBwdReverse = 0

Remap the tensor in the forward pass, reverse-apply the remapping in the backward pass.

enumerator FwdBwd

Remap the tensor in the forward pass and backward pass independently.

enumerator Fwd

Only remap the tensor in the forward pass, use identity for the backward pass.

14.11.12. Structs

struct BranchInfo

Public Functions

BranchInfo(const GraphId&, const std::map<int, int> inputIndicesMap, const std::map<int, int> outputIndicesMap)

Public Members

GraphId graphId
std::map<int, int> inputIndicesMap
std::map<int, int> outputIndicesMap
struct ClonedGraphMaps

Struct of maps that map cloned Op and Tensor Ids back to the original, and vice-versa.

Public Members

std::map<OpId, OpId> opIdMap
std::map<TensorId, TensorId> tensorIdMap
struct ConvParameters

Public Members

DataType type
int64_t batchSize
int64_t numInChannelsPerGroup
int64_t numOutChannelsPerGroup
int64_t numGroups
Shape inputShape
Shape kernelShape
struct popart::ConvParameters::Input inputTransformation
struct popart::ConvParameters::Input kernelTransformation
struct popart::ConvParameters::Output outputTransformation
struct Input

Public Members

std::vector<int64_t> lowerTruncation
std::vector<int64_t> upperTruncation
std::vector<int64_t> dilation
std::vector<int64_t> lowerPadding
std::vector<int64_t> upperPadding
std::vector<bool> flip
struct Output

Public Members

std::vector<int64_t> lowerTruncation
std::vector<int64_t> upperTruncation
std::vector<int64_t> stride
std::vector<int64_t> lowerPadding
std::vector<int64_t> upperPadding
struct OpxInAndOutIndex

Public Functions

inline OpxInAndOutIndex(const Opx *opx_, InIndex inIndex_, OutIndex outIndex_)
inline OpxInAndOutIndex(const Opx *opx_)
OpxInAndOutIndex() = default
inline bool operator==(const OpxInAndOutIndex &rhs) const

Public Members

const Opx *opx
InIndex inIndex
OutIndex outIndex
bool isDelegate
struct PTensorCmp

Public Functions

bool operator()(const Tensor *const &a, const Tensor *const &b) const
struct ReplicatedTensorShardingOpInfo

Struct that describes which inputs/outputs of an Op belong to the sharding group.

Regular operations typically belong to only one sharding group. However, the following can belong to multiple sharding groups, depending on the input and output indices:

  • Subgraphing operations (CallOp, LoopOp)

  • MultiExchangeOp

Public Functions

inline ReplicatedTensorShardingOpInfo()
inline ReplicatedTensorShardingOpInfo(OpId id_, std::set<InIndex> inIndices_, std::set<OutIndex> outIndices_)
bool operator<(ReplicatedTensorShardingOpInfo const &rhs) const

Public Members

OpId id

Unique ID of the operator.

std::set<InIndex> inIndices

Input indices belonging to the sharding group.

std::set<OutIndex> outIndices

Output indices belonging to the sharding group.

14.11.13. Other classes

template<typename T, uint32_t V = 0>
class BasicOptional

A temporary solution to removing boost::optional from certain header files. This class is an incomplete replacement of boost::optional (and std::optional).

Template parameter T: the type which will optionally be stored.

Template parameter V: has no effect, but enables compiler errors when two objects of type T should not be compared.

Public Functions

inline BasicOptional() noexcept

Construct an unset BasicOptional<T>

inline BasicOptional(T t)

Create a set BasicOptional<T> from a value.

BasicOptional(const BasicOptional<T, V> &rhs) = default
BasicOptional<T, V> &operator=(const BasicOptional<T, V>&) = default
inline BasicOptional<T, V> &operator=(const T &t)
inline const T &operator*() const &

Get a constant reference to the value.

inline T &operator*() &

Get a reference to the value.

inline explicit operator bool() const

Return true if set.

Can be used as:

BasicOptional<Foo> foo(6);
if (foo) {
  *foo = 7;
}

inline void reset() noexcept
class ExchangeDescriptor

Class describing an external exchange from the IPUs.

Public Functions

ExchangeDescriptor(ExchangeDirection direction, TensorId id, OptionalVGraphId vgid, TileSet tileSet, int numInputs, int numOutputs, bool inplace)

Create an ExchangeDescriptor for a host exchange.

Parameters
  • direction – Load (from host) or Store (to host)

  • id – Host stream tensor ID

  • vgid – Virtual graph for the exchange

  • tileSet – Tile set for the exchange

  • numInputs – Number of tensor inputs expected

  • numOutputs – Number of tensor outputs expected

  • inplace – If the output of the exchange should alias the input during Load

ExchangeDescriptor(ExchangeDirection direction, RemoteBufferId id, OptionalVGraphId vgid, TileSet tileSet, int numInputs, int numOutputs, bool inplace)

Create an ExchangeDescriptor for a remote exchange.

Parameters
  • direction – Load (from host) or Store (to host)

  • id – Remote buffer id

  • vgid – Virtual graph for the exchange

  • tileSet – Tile set for the exchange

  • numInputs – Number of tensor inputs expected

  • numOutputs – Number of tensor outputs expected

  • inplace – If the output of the exchange should alias the input during Load

ExchangeDescriptor(ExchangeDirection direction, GraphId id, TileSet destination, CodeMemoryType destinationType)

Create an ExchangeDescriptor for an External code copy op.

Parameters
  • direction – Load (from host) or Store (to host)

  • id – GraphId of the graph to load.

  • destination – The destination TileSet to load to.

  • destinationType – The destination memory type to load to.

inline const ExchangeDirection &getDirection() const
inline bool isRemoteExchange() const
inline bool isHostExchange() const
inline bool isCodeCopy() const

Returns true if this exchange descriptor is associated with a code copy operation.

Returns

true If it is associated with a code copy op.

Returns

false Otherwise.

inline const RemoteBufferId &getRemoteBufferId() const
inline void setRemoteBufferId(RemoteBufferId id)
inline const TensorId &getHostStreamTensorId() const
inline const OptionalGraphId &getGraphToLoadId() const

Get the GraphId of the graph which this op will load code for.

Returns

The ID in question (as an OptionalGraphId).

inline OptionalCodeMemoryType getDestinationCodeMemoryType() const

Get the destination memory type the code will be sent to, if this is an ExchangeDescriptor for a RemoteCodeLoadOp.

Returns

An OptionalCodeMemoryType, one of:

  • Buffer - stored in non-executable buffer memory.

  • ExecutableMemory - stored in executable memory.

const std::string getResourceId() const

Get an identifier representing which resource (landing pad tensor) this exchange will be using.

Returns

Resource identifier

inline OptionalVGraphId getVGraphID() const
inline TileSet getTileSet() const
inline int getNumInputs() const
inline int getNumOutputs() const
inline bool isInplace() const
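A sketch of creating a descriptor for a host load using the first constructor above (the header path, enumerator values, and the stream tensor ID are assumptions for illustration):

#include <popart/op/exchange/exchange.hpp>

// Sketch: describe loading a host stream tensor onto compute tiles.
popart::ExchangeDescriptor makeHostLoadDescriptor() {
  return popart::ExchangeDescriptor(popart::ExchangeDirection::Load,
                                    popart::TensorId("input_stream"),
                                    popart::OptionalVGraphId(),
                                    popart::TileSet::Compute,
                                    /*numInputs=*/1,
                                    /*numOutputs=*/1,
                                    /*inplace=*/false);
}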
class GraphId

Public Functions

GraphId() = delete
GraphId(const std::string&)
bool operator<(const GraphId&) const
bool operator==(const GraphId&) const
bool operator!=(const GraphId&) const
const std::string &str() const

Public Static Functions

static const GraphId &root()
class LeakyReluOpBaseAttributes

Subclassed by popart::LeakyReluGradOp, popart::LeakyReluInplaceOp, popart::LeakyReluOp

Public Functions

inline LeakyReluOpBaseAttributes(float _alpha)
inline float getAlpha() const
class MultiConvOptions

Public Functions

MultiConvOptions(const std::map<std::string, std::string> sessionConvOptions, const Attributes &attr)
std::map<std::string, std::string> getConvOptions(int convIndex) const
std::map<std::string, std::string> getGlobalOptions() const

Public Members

std::vector<float> availableMemoryProportions
std::vector<std::string> partialsTypes
nonstd::optional<std::string> planType
nonstd::optional<int> perConvReservedTiles
nonstd::optional<float> cycleBackOff
std::vector<int64_t> enableConvDithering
class OpEquivIdCreator : public popart::OpSerialiserBase

Public Functions

OpEquivIdCreator(const Op*)
void appendAttribute(const std::string&, nonstd::optional<int64_t>) override
void appendAttribute(const std::string&, nonstd::optional<float>) override
void appendAttribute(const std::string&, nonstd::optional<double>) override
void appendAttribute(const std::string&, const std::map<TensorId, uint64_t>) override
void appendForwardOp(const Op*) override
std::string str()
template<>
void appendAttr(const TensorIndexMap &tmap)
class OpJsonSerialiser : public popart::OpSerialiserBase

Public Functions

OpJsonSerialiser(const Op*, std::stringstream &ss_)
void appendAttribute(const std::string&, nonstd::optional<int64_t>) override
void appendAttribute(const std::string&, nonstd::optional<float>) override
void appendAttribute(const std::string&, nonstd::optional<double>) override
void appendAttribute(const std::string&, const std::map<TensorId, uint64_t>) override
void appendForwardOp(const Op*) override
class OpSerialiser : public popart::OpSerialiserBase

Public Functions

OpSerialiser(const Op*, std::stringstream &ss_)
void appendAttribute(const std::string&, nonstd::optional<int64_t>) override
void appendAttribute(const std::string&, nonstd::optional<float>) override
void appendAttribute(const std::string&, nonstd::optional<double>) override
void appendAttribute(const std::string&, const std::map<TensorId, uint64_t>) override
void appendForwardOp(const Op*) override
class OpSerialiserBase

Subclassed by popart::OpEquivIdCreator, popart::OpJsonSerialiser, popart::OpSerialiser

Public Functions

inline virtual ~OpSerialiserBase()
void appendAttribute(const std::string&, float)
void appendAttribute(const std::string&, double)
void appendAttribute(const std::string&, int)
void appendAttribute(const std::string&, int64_t)
void appendAttribute(const std::string&, uint32_t)
void appendAttribute(const std::string&, uint64_t)
void appendAttribute(const std::string&, const std::string&)
void appendAttribute(const std::string&, const std::vector<float>&)
void appendAttribute(const std::string&, const std::vector<double>&)
void appendAttribute(const std::string&, const std::vector<int64_t>&)
void appendAttribute(const std::string&, const Scope&)
void appendAttribute(const std::string&, bool)
virtual void appendAttribute(const std::string&, nonstd::optional<int64_t>) = 0
virtual void appendAttribute(const std::string&, nonstd::optional<float>) = 0
virtual void appendAttribute(const std::string&, nonstd::optional<double>) = 0
virtual void appendAttribute(const std::string&, const std::map<TensorId, uint64_t>) = 0
template<typename T, uint32_t V>
inline void appendAttribute(const std::string &key, const BasicOptional<T, V> &value)
template<typename T>
inline void appendAttribute(const std::string &key, const T &value)
virtual void appendForwardOp(const Op*) = 0
class PriTaskDependency

Public Functions

PriTaskDependency(TaskId taskId, DependencyType type)
PriTaskDependency(std::set<TaskId> taskIds, DependencyType type)
inline DependencyType getType() const
inline bool satisfiedBy(TaskId taskId) const
inline const std::set<TaskId> &getTaskIds() const
bool operator==(PriTaskDependency const &rhs) const
class ReplicaEqualAnalysisProxy

Interface for object passed to Op::fwdPropagateIsReplicaEqual.

Public Functions

virtual ReplEqModifiedInputMap getModifiedInputMapFromAliases(const Op *op, const ReplEqOutputMap &replEqOpOutputMap) const = 0

Work out replica-equal values for modified inputs by setting replica-equal values of modified inputs to true if and only if the Op has an output that is an alias of the modified input, containing all elements of the input, and the output is deemed replica-equal.

If this doesn’t hold, a modified input is assumed to be not replica-equal.

NOTE: It is possible for an Op to modify an input to a replica-equal value in a way that will not be detected by this implementation, but it’s generally true for currently supported Ops at time of writing.

Parameters
  • op – The Op to get the replica-equal values for modified inputs for.

  • replEqOpOutputMap – The Op’s replica-equal output values.

Returns

A mapping containing replica-equal values for modified outputs.

virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqualThroughGraph(const Graph *graph, const ReplEqInputMap &replEqGraphInputMap) = 0

A method that can be called to work out how replica-equal values for graph inputs propagate to replica-equal values for graph outputs.

NOTE: Graphs never copy-modify input tensors, although Ops that call graphs might (like CallOp, LoopOp).

Parameters
  • graph – The graph to propagate replica-equal values through.

  • replEqGraphInputMap – The replica-equal values for the graph’s inputs.

Returns

A tuple containing a ReplEqOutputMap that describes replica-equal values for the graph’s outputs and a ReplEqModifiedInputMap that describes the final replica-equal values of the graph’s inputs.

inline virtual ~ReplicaEqualAnalysisProxy()
class ReplicatedTensorShardingTracer

Class that traces the graph and finds all tensors that:

  1. Are replicated tensor sharded.

  2. Have the same meta-shape describing the tensor shape before sharding.

  3. Use the same collective balanced reorder (CBR) when lowered to Poplar.

  4. Share the same elementwise compatible tensor layout, by virtue of 2. and 3.

Public Functions

ReplicatedTensorShardingTracer(const Ir &ir_)

Instantiate the tracer and trace.

Parameters

ir_ – The IR to operate on. (The tensors to trace from are supplied separately via trace().)

bool hasGroup(const ReplicatedTensorShardingOpInfo &opInfo) const

Check if the Op associated with the opId has a replicated tensor sharding group.

Parameters

opInfo – OpId and input/output indices

Returns

True if there is a group associated with the opId

bool hasGroup(const TensorId &tensorId) const

Check if the tensor associated with the tensorId has a replicated tensor sharding group.

Parameters

tensorId – The TensorId

Returns

True if there is a group associated with the tensorId

const ReplicatedTensorShardingGroup &getGroup(const ReplicatedTensorShardingOpInfo &opInfo) const

Get the replicated tensor sharding group associated with the opId.

Parameters

opInfo – OpId and input/output indices

Returns

Associated replicated tensor sharding group.

const ReplicatedTensorShardingGroup &getGroup(const TensorId &tensorId) const

Get the replicated tensor sharding group associated with the tensorId.

Parameters

tensorId – The TensorId

Returns

Associated replicated tensor sharding group.

void trace(const std::set<Tensor*, PTensorCmp> &startTensors)

Traverse the graph to trace out operators and tensors belonging to the same replicated tensor sharding group.

Parameters

startTensors – The tensors to start tracing from.
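A sketch of querying the tracer (the header path is an assumption and the tensor ID "weights/0" is illustrative):

#include <popart/replicatedtensorsharding.hpp>

// Sketch: look up the sharding group of a tensor in an existing IR.
void queryGroups(const popart::Ir &ir) {
  popart::ReplicatedTensorShardingTracer tracer(ir);
  if (tracer.hasGroup(popart::TensorId("weights/0"))) {
    const popart::ReplicatedTensorShardingGroup &group =
        tracer.getGroup(popart::TensorId("weights/0"));
  }
}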

class TensorLocationInfo

Public Functions

inline void setRemote(bool remote_)
inline bool isRemote() const
inline void setSharded(bool sharded_)
inline bool isSharded() const
inline void setRemoteBufferInfo(RemoteBufferId rbId, RemoteBufferIndex index)
inline const std::pair<RemoteBufferId, RemoteBufferIndex> getRemoteBufferInfo() const
inline bool operator==(const TensorLocationInfo &rhs) const
class InputCreatorCandidate : public popart::popx::ICreatorCandidate

Public Functions

InputCreatorCandidate(InIndex index_, const Opx *opx_, std::vector<OpxInAndOutIndex> pathFromInput_, int64_t scheduleIndex_)
InputCreatorCandidate() = default
~InputCreatorCandidate() override = default
std::pair<poplar::Tensor, ViewChangers> createInput(const poplar::DebugNameAndId &dnai) override
DnfTensorIds mustExistBeforeCreate() override
double getMaxCreatorPriority() const override
int64_t getNumElems() const override
inline InIndex getIndex() const
inline const Opx *getOpx() const
inline std::vector<std::vector<OpxInAndOutIndex>> getPathsFromInput() final
inline void setPathFromInput(const std::vector<OpxInAndOutIndex> &value)
std::pair<poplar::Tensor, ViewChangers> unwind(poplar::Tensor) override
std::vector<popart::view::Region> unwind(popart::view::Region) override
std::vector<popart::view::Region> unwind() override
std::string str() override
inline int64_t getScheduleIndex() const final
class InputMultiCreatorCandidate : public popart::popx::ICreatorCandidate

Public Functions

InputMultiCreatorCandidate()
~InputMultiCreatorCandidate() override = default
std::pair<poplar::Tensor, ViewChangers> createInput(const poplar::DebugNameAndId &dnai) override
DnfTensorIds mustExistBeforeCreate() override
double getMaxCreatorPriority() const override
int64_t getNumElems() const override
std::string str() override
bool addCreatorCandidate(ICreatorCandidatePtr)
std::vector<std::vector<OpxInAndOutIndex>> getPathsFromInput() final
std::pair<poplar::Tensor, ViewChangers> unwind(poplar::Tensor) override
std::vector<popart::view::Region> unwind(popart::view::Region) override
std::vector<popart::view::Region> unwind() override
int64_t getScheduleIndex() const final
class IsInfx : public popart::popx::ElementWiseUnaryOpx

Public Functions

IsInfx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class IsNaNx : public popart::popx::ElementWiseUnaryOpx

Public Functions

IsNaNx(Op*, Devicex*)
void grow(poplar::program::Sequence&) const final
class ViewChanger

Subclassed by popart::popx::ReplicatedGatherInScatterOutViewChanger, popart::popx::ReplicatedGatherOutScatterInViewChanger

Public Functions

inline virtual ~ViewChanger()
inline virtual poplar::Tensor apply(poplar::Tensor tensor) const
inline virtual bool containsAllDataRegions() const
inline virtual bool operator==(const ViewChanger &rhs) const
inline virtual bool operator!=(const ViewChanger &rhs) const
class ViewChangers

Public Functions

ViewChangers()
ViewChangers(std::vector<std::shared_ptr<ViewChanger>> viewChangers_)
poplar::Tensor apply(poplar::Tensor tensor) const
inline bool empty() const
bool operator==(const ViewChangers &rhs) const
bool operator!=(const ViewChangers &rhs) const
class ReplicatedGatherInScatterOutViewChanger : public popart::popx::ViewChanger

Public Functions

inline ReplicatedGatherInScatterOutViewChanger(int64_t nelms_, ReplicatedTensorShardingGroupId group_)
inline poplar::Tensor apply(poplar::Tensor tensor) const final
inline bool containsAllDataRegions() const final
inline bool operator==(const ViewChanger &rhs) const final
class ReplicatedGatherOutScatterInViewChanger : public popart::popx::ViewChanger

Public Functions

inline ReplicatedGatherOutScatterInViewChanger(const gcl::CollectiveBalancedReorder *cbr_, ReplicatedTensorShardingGroupId group_)
inline poplar::Tensor apply(poplar::Tensor tensor) const final
inline bool operator==(const ViewChanger &rhs) const final
class Reader

A class which facilitates the deserialization process.

It allows reading serialized streams in order to restore PopART state. For more information on which components are deserialized, please refer to the Writer class.

Public Functions

Reader(const std::vector<std::shared_ptr<std::istream>> &in_vec)

Constructs Reader class object.

Parameters

in_vec – Vector of source streams from which a PopEF file will be read.

Reader(Reader &&reader)

Move constructor.

~Reader()

Default destructor.

size_t readExecutableHash() const
Returns

The executable hash or 0 if the stream contains corrupted data.

bool containsPoplarExecutable() const
Returns

True if the stream contains a Poplar executable.

bool containsExecutable() const
Returns

True if the stream contains a PopART executable.

bool containsPopefMetadata()
Returns

True if the stream contains PopEF metadata.

poplar::Executable deserializePoplarExecutable() const

Deserializes a Poplar executable from an executable blob which is part of a PopEF file.

Returns

Poplar executable.

std::unique_ptr<popart::popx::Executablex> deserializeExecutable(popart::Ir &ir, popart::popx::IrLowering &lowering) const

Load a PopART executable from a PopEF file.

Parameters
  • ir – Object which some of the deserialized data will be written to.

  • lowering – Object which some of the deserialized data will be written to.

Returns

PopART executable.

Public Static Functions

static nonstd::optional<size_t> checkFileForValidPoplarExecutable(const std::string &filePath)

Check that a PopART executable can be loaded from a PopEF file.

Parameters

filePath – The full path to the PopEF file.

Returns

nonstd::optional<size_t> The hash of the PopART IR if an executable could be loaded.
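A sketch of opening a PopEF file through streams and querying its contents. The namespace popart::popx::serialization and the header path are assumptions, and the file name is illustrative:

#include <popart/popx/executablexserialization.hpp>

#include <fstream>
#include <memory>
#include <vector>

// Sketch: construct a Reader over an input stream and inspect it.
void inspectPopef() {
  auto in = std::make_shared<std::ifstream>("model.popef",
                                            std::ifstream::binary);
  std::vector<std::shared_ptr<std::istream>> streams{in};

  popart::popx::serialization::Reader reader(streams);
  if (reader.containsExecutable()) {
    size_t hash = reader.readExecutableHash();
  }
}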