14. PopART C++ API

This chapter describes the PopART C++ API.

14.1. Sessions

#include <popart/session.hpp>

class Session

Session is a runtime instance that provides an interface for executing ONNX graphs on IPU hardware.

Subclassed by popart::InferenceSession, popart::TrainingSession

Public Functions

virtual ~Session() = 0: Destructor for the Session class.

std::vector<uint32_t> getRNGState(): Get state of the random number generator.

void setRNGState(const std::vector<uint32_t>): Set state of the random number generator.

void setRandomSeed(uint64_t seedValue)

Set the value of the random number generator seed.

This method explicitly seeds all random operations. It also derives a new state for the random number generator (RNG) from the seed and sets it on the device. This RNG state is used to resolve stochastic rounding. To deterministically store and restore the combined random state for a session, do the following:

C++:

// Store random state (session s0).
auto seed = s0.getRandomSeed();
auto rngState = s0.getRNGState();

// Restore random state (session s1).
s1.setRandomSeed(seed);   // <-- affects RNG state, order important
s1.setRNGState(rngState);

Python:

# Store random state (session s0).
seed = s0.getRandomSeed()
rngState = s0.getRNGState()

# Restore random state (session s1).
s1.setRandomSeed(seed)   # <-- affects RNG state, order important
s1.setRNGState(rngState)

Parameters: seedValue – The value of the seed.
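The ordering constraint in the snippets above can be illustrated with a toy model. This is a minimal sketch, not the PopART implementation: ToySession, restoreRandomState() and the way the state is derived from the seed are all hypothetical, chosen only to show why setRNGState() has to come after setRandomSeed().

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a session's random state; NOT the PopART class.
class ToySession {
public:
  // Seeding also derives a fresh RNG state, as described for setRandomSeed().
  void setRandomSeed(uint64_t seedValue) {
    seed_ = seedValue;
    rngState_ = deriveState(seedValue);
  }
  uint64_t getRandomSeed() const { return seed_; }

  void setRNGState(const std::vector<uint32_t> &state) { rngState_ = state; }
  std::vector<uint32_t> getRNGState() const { return rngState_; }

private:
  // Made-up derivation: split the 64-bit seed into two 32-bit words.
  static std::vector<uint32_t> deriveState(uint64_t seed) {
    return {static_cast<uint32_t>(seed & 0xffffffffu),
            static_cast<uint32_t>(seed >> 32)};
  }

  uint64_t seed_ = 0;
  std::vector<uint32_t> rngState_ = {0, 0};
};

// Restore in the documented order: seed first, then RNG state.
inline void restoreRandomState(ToySession &dst, const ToySession &src) {
  dst.setRandomSeed(src.getRandomSeed()); // overwrites the RNG state
  dst.setRNGState(src.getRNGState());     // so the saved state is set last
}
```

If the two calls were swapped, the state set by setRNGState() would be overwritten by the state derived from the seed.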

uint64_t getRandomSeed()

Get the value of the random number generator seed.

Calling setRandomSeed() with this value (at a later stage) reinstates the random state logic that seeds random operations.

Returns: The value used to seed current random operations.

void compileAndExport(const std::string &filename)

Compile the graph and export it to a file.

This method will first create a poplar::Graph and compile the poplar::Executable. Next, it will export the executable and PopART metadata to the file. The exported file will be in the PopEF format. This means that the file can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

Parameters: filename – The name of the file where the compiled executable and metadata will be saved.

void compileAndExport(std::ostream &out)

Compile the graph and export it to a stream.

This method will first create a poplar::Graph and compile the poplar::Executable. Next, it will export the executable and PopART metadata to the stream. The data will be streamed in the PopEF format. This means that the streamed data can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

This method automatically creates folders as needed if filename is located in a folder which does not exist.

Parameters: out – The stream that the compiled executable and metadata will be written to.

void saveExecutableToFile(const std::string &filename)

Save a compiled graph to a file.

The file will be in the PopEF format. This means that the file can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

This method automatically creates folders as needed if filename is located in a folder which does not exist.

Parameters: filename – The name of the file where the compiled executable and metadata will be saved.
Pre: prepareDevice() must have been called.

void saveExecutableToStream(std::ostream &out)

Save a compiled graph to a stream.

The data will be streamed in the PopEF format. This means that the streamed data can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

Parameters: out – The stream that the compiled executable and metadata will be written to.
Pre: prepareDevice() must have been called.

void saveExecutable(const std::string &path, bool savePopartMetadata = true, bool saveVariables = true)

Save a compiled graph with additional data to a file.

PopART is able to save its state after the model compilation is complete, so that it can be restored at a later time. To make this possible, it is necessary to save such elements as:

a serialised Poplar executable,
its associated metadata,
tensor data blobs if model parameters have not been frozen (see SessionOptions::constantWeights for more information),
a PopART-specific opaque blob to store information only relevant to PopART. This is needed to restore PopART state.

The file will be in the PopEF format. This means that the file can be used to restore the state of the PopART program without recompiling the graph, or to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information. To analyze the structure of the file saved by this function, refer to the PopEF dump tool.

Parameters

path – The name of the file or directory where the compiled executable, metadata and variables will be saved. If you specify a path to a directory, the function will write the data to the file “<path>/executable.popef”. If the file exists, the function will overwrite the old data with the new data.
savePopartMetadata – If you do not need the option to restore the PopART state later, you can set the flag to false to reduce the disk space taken up by the file.
saveVariables – If you do not need to save the state of the variables (tensors) now, for example because you want to save them later or in a different location, you can set the flag to false. The function will save data consistent with the variables contained within the model.

Pre

prepareDevice() must have been called.
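The path rule described for the path parameter can be sketched as a small helper. This is an illustration only: resolvePopefFile() is a hypothetical name, and the caller states whether the path is a directory, because how PopART detects this is not described here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the documented rule: a directory path
// resolves to "<path>/<defaultName>", a file path is used unchanged.
inline std::string resolvePopefFile(const std::string &path,
                                    bool pathIsDirectory,
                                    const std::string &defaultName) {
  if (!pathIsDirectory) {
    return path;
  }
  // Avoid a doubled separator when the directory already ends in '/'.
  if (!path.empty() && path.back() == '/') {
    return path + defaultName;
  }
  return path + "/" + defaultName;
}
```

For example, resolvePopefFile("out", true, "executable.popef") yields "out/executable.popef", which matches the file name documented for saveExecutable().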

void saveVariables(const std::string &path)

Save all variables to a file.

The function will save data consistent with the variables contained within the model.

The file will be in the PopEF format. If you want to analyze tensors saved by the function refer to the PopEF dump tool.

Parameters: path – The name of the file or directory where the variables will be saved. If you specify a path to a directory, the function will write the data to the file “<path>/variables.popef”. If the file exists, the function will overwrite the old data with the new data.
Pre: prepareDevice() must have been called.

void checkInplacingAmbiguity() const

Check for potential inplacing ambiguities.

This method creates an AliasModel object for each graph and runs the Poprithms ambiguity checker on it.

Throws an error if the graph has an inplacing ambiguity and will prompt the user to check the inplacing.

See poprithms::memory::inplace::Graph::AmbiguityStatus on the Poprithms GitHub repo for more on what constitutes an ambiguity.

void loadExecutableFromFile(const std::string &filename)

Load the compiled executable and metadata from a file.

The file must have been created with compileAndExport(const std::string).

Parameters: filename – The name of the file to load the executable and metadata from.

void loadExecutableFromStream(std::shared_ptr<std::istream> in)

Load the compiled executable and metadata from a stream.

The stream must have been created with compileAndExport(std::ostream).

Parameters: in – The shared pointer to the stream to load the executable from.

void prepareDevice(bool loadEngine = true)

Prepare the network for execution.

This will create the poplar::Graph and poplar::Engine.

Parameters: loadEngine – If true, load the engine and connect the streams once the device is ready.

void loadEngineAndConnectStreams()

Load the engine on the device and connect the streams.

This will set up the poplar::Streams.

Note: This call is optional. The engine will implicitly be loaded on the device when required.

void weightsFromHost(): Copy weights from the host to the device.

void buffersFromHost(): Copy buffers from the host to the device.

void weightsToHost(): Copy the weights from the device to the host stream memory.

uint64_t getCycleCount(std::string id = "")

Copy the cycle count tensor from the device to the host.

Parameters: id – The identifier of the cycle count tensor.

void connectStreamToCallback(const std::string &streamHandle, std::function<void(void*)> callback, unsigned index = 0)

Connect a Poplar stream with a callback.

The callback will be called whenever the stream is about to be read or has been written to by the device. The memory location passed to the callback is only valid for reading or writing for the duration of the callback.

Parameters

streamHandle – The name of the stream to connect to.
callback – The callback to be called whenever the stream is to be read or was written to by the device.
index – The replica index to connect to, when using replicated graphs. Default=0.
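Because the memory location is only valid while the callback runs, a callback would typically copy the data out rather than keep the pointer. The following is a minimal sketch under stated assumptions: makeCopyOutCallback() is a hypothetical helper, and the element type (float) and the element count (taken from the destination vector) are assumptions about the stream, not something this API reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Build a callback with the documented type std::function<void(void *)>.
// Assumption: each transfer carries dest.size() floats.
inline std::function<void(void *)>
makeCopyOutCallback(std::vector<float> &dest) {
  return [&dest](void *streamMemory) {
    // The pointer is only valid during this call, so copy the data now
    // instead of storing the pointer for later use.
    std::memcpy(dest.data(), streamMemory, dest.size() * sizeof(float));
  };
}
```

The returned object could then be passed as the callback argument of connectStreamToCallback(). The destination vector is captured by reference, so it must outlive the callback.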

void connectStream(const std::string &streamHandle, void *buffer)

Connect a Poplar stream with a fixed location in memory.

Each time data is copied to the stream, this location will be read and each time data is copied from the stream, this location will be written.

Parameters

streamHandle – The handle of the stream to connect to.
buffer – The pointer to the memory location.
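The "fixed location" behaviour can be pictured with a toy stream that stores only the pointer and reads through it on every copy. ToyInputStream is a hypothetical stand-in written for this sketch; it is not a Poplar or PopART type, and the float element type is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a host-to-device stream.
class ToyInputStream {
public:
  // Remember the fixed location; no data is copied at connect time.
  void connect(const float *buffer, std::size_t count) {
    buffer_ = buffer;
    count_ = count;
  }

  // Simulate one copy to the device: the connected location is read now.
  std::vector<float> copyToDevice() const {
    assert(buffer_ != nullptr);
    return std::vector<float>(buffer_, buffer_ + count_);
  }

private:
  const float *buffer_ = nullptr;
  std::size_t count_ = 0;
};
```

In this model, new input is supplied by overwriting the contents of the connected buffer between copies, not by connecting a new buffer each time.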

void connectHostFunction(const std::string &functionHandle, std::function<void(const void*const*, size_t, void*const*, size_t)> callback, unsigned index = 0)

Connect a host function to a callback.

The callback takes two arguments, which point to the locations in memory for each of the function’s input and output arguments, respectively. During a host function call, first the device transfers the input data to the host, then the callback is invoked, and finally the output data is copied back to the device. The memory pointed to by the callback arguments must only be accessed for the duration of the callback.

Parameters

functionHandle – The name of the host function.
callback – The function to be called whenever new input data is available.
index – The replica index to connect to, when using replicated graphs. Default=0.
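A callback matching the signature above might look like the following sketch. It rests on assumptions that are not stated in this section: that the two size_t values are the numbers of input and output pointers, and that every buffer holds kCount floats. The names addOnHost and kCount are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: every input and output buffer holds kCount floats. The real
// shapes depend on how the host function is declared in the model.
constexpr std::size_t kCount = 4;

// Compatible with
// std::function<void(const void *const *, size_t, void *const *, size_t)>.
// Assumption: numInputs and numOutputs count the pointers in each array.
inline void addOnHost(const void *const *inputs, std::size_t numInputs,
                      void *const *outputs, std::size_t numOutputs) {
  assert(numInputs == 2 && numOutputs == 1);
  const float *a = static_cast<const float *>(inputs[0]);
  const float *b = static_cast<const float *>(inputs[1]);
  float *out = static_cast<float *>(outputs[0]);
  // All reads and writes happen inside the callback, while the memory is
  // valid; no pointer is kept afterwards.
  for (std::size_t i = 0; i < kCount; ++i) {
    out[i] = a[i] + b[i];
  }
}
```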

void run(IStepIO &stepIO, std::string debugName = "")

Run one step.

Read input data from addresses in stepIO.in.

Write the output data to addresses in stepIO.out.

Parameters

stepIO – The input and output data.
debugName – A debug string to identify this run in logs.

void run(std::string programHandle, IStepIO &stepIO, std::string debugName = "")

Run one step of a custom program.

Read input data from addresses in stepIO.in.

Write the output data to addresses in stepIO.out.

Parameters

programHandle – The handle of the custom program to run.
stepIO – The input and output data.
debugName – A debug string to identify this run in logs.

void updateExternallySavedTensorLocations(const std::string &fromLocation, const std::string &toLocation)

Update the tensor locations of tensors in the session’s ONNX model.

A new file will be created at this point, and written to when the ONNX model is saved with a subsequent call to modelToHost().

Parameters

fromLocation – All externally saved tensors with location fromLocation will have their location updated to toLocation.
toLocation – The updated tensor locations. This must not already exist.

void modelToHost(const std::string &fn)

Write the current model to an ONNX file.

Parameters: fn – The path to file. The path can be absolute or relative. If you plan to run your program in multiple processes simultaneously, you should avoid possible race conditions by writing to different files, for example by using temporary files.

TensorInfo getInfo(TensorId) const

Get the tensor information for a tensor.

Parameters: TensorId – The identifier of the tensor to get the tensor information for.
Returns: The tensor information for the tensor.

bool hasInfo(TensorId) const

Check whether a tensor has information.

Parameters: TensorId – The identifier of the tensor to get the tensor information for.
Returns: true if the tensor with identifier TensorId has tensor information and false if not.

std::set<TensorId> getAllTensorIds() const

Returns the ids of all tensors in the model.

Pre: prepareDevice() must have been called.

std::string getSummaryReport(bool resetProfile = true) const

Retrieve the summary report from the poplar::Engine.

The options which were passed to the Session constructor will influence the information in the report.

This method may only be called after prepareDevice() has been called.

Parameters: resetProfile – If true, resets the execution profile. Default = true.
Returns: A string containing the report.

std::string getSerializedGraph() const

Retrieve the serialized graph from the poplar::Engine.

A JSON format report is produced.

This method may only be called after prepareDevice() has been called.

Returns: A string containing the serialized graph.

pva::Report getReport() const

Retrieve the graph report from the poplar::Engine.

The options which were passed to the Session constructor will influence the information in the report.

This method may only be called after prepareDevice() has been called.

Returns: The PopVision Analysis report object.

void resetHostWeights(const std::string &model, const bool ignoreWeightsInModelWithoutCorrespondingHostWeight = false)

Reset weights with weights in an ONNX model.

Note that the only differences between the ONNX model and the current model must be the weights. No other differences are allowed.

This method only updates the weights on the host. weightsFromHost() must be called after this method to update the weights on the device.

Parameters

model – An ONNX model protobuf, or the name of a file containing an ONNX model protobuf.
ignoreWeightsInModelWithoutCorrespondingHostWeight – If true, do not throw an error if there are initializers in the ONNX model without corresponding initializer tensor(s) in the session’s IR.

void readWeights(const IWeightsIO &weightsIo)

Read the weights from the host stream memory and write to the host.

This method may only be called after weightsToHost() has been called.

Parameters: weightsIo – The weight data that is read from the host stream memory is written to the addresses in weightsIo.out.

void writeWeights(const IWeightsIO &weightsIo)

Write the weights from the host to the IR tensor memory.

This method may only be called after weightsFromHost() has been called.

Parameters: weightsIo – The weight data is written to the addresses in weightsIo.out.

std::string serializeIr(IrSerializationFormat format)

Serialize the IR graph to a string.

Parameters: format – The format to use for serializing.

inline const Ir &getIr() const: Get the IR associated with the Session.

inline const popx::Devicex &getDevice() const: Get the device associated with the Session.

inline popx::Devicex &getDevice(): Get the device associated with the Session.

inline const popx::IrLowering &getIrLowering() const: Get the IR lowering associated with the Session.

inline const popx::Executablex &getExecutable() const: Get the executable associated with the Session.

void broadcastWeights(int rootRank = 0)

Broadcasts the weights from the PopRun instance with index rootRank to all other instances.

Parameters: rootRank – The index of the PopRun instance from which the weights should be broadcast.

void updateEngineCache(): Update cacheEntries from engine cache directory and update ir::hashMatched_ with the updated cacheEntries.

void setDeviceInfo(std::shared_ptr<DeviceInfo> deviceInfo): Set the DeviceInfo of the Session.

14.1.1. Training session

#include <popart/session.hpp>

class TrainingSession : public popart::Session 

TrainingSession is a runtime instance that provides an interface for executing ONNX graphs on IPU hardware with training provided by optimizing a loss tensor using an optimizer and automatic differentiation (backpropagation).

Public Functions

~TrainingSession() override: Destructor for the TrainingSession class.

void updateOptimizerFromHost(const Optimizer *optimizer)

Update the optimizer from the host.

This method updates the optimizer and the associated hyperparameters but not the optimizer state tensors.

NOTE: The optimizer parameter has to be compatible with the optimizer passed to the TrainingSession constructor. For example, you cannot call this function with an SGD1 optimizer if you created the session with an SGD0 optimizer. This is because it is not possible to change the IR after a session has been constructed.

Parameters: optimizer – A pointer to a popart::Optimizer.

void copyFromRemoteBuffer(const std::string &buffer, void *w, int repeat_index, unsigned replication_index = 0)

Copy from a remote buffer into a user buffer.

This can be useful when we run larger models with host side reductions since HEXOPT is currently limited to 128 MB.

Parameters

buffer – The name of the remote buffer to copy from.
w – Pointer to a user buffer to copy to.
repeat_index – The index in the remote buffer to copy from.
replication_index – The replicated graph index when using replicated graphs. Default=0.

void copyToRemoteBuffer(void *w, const std::string &buffer, int repeat_index, unsigned replication_index = 0)

Copy from a user buffer to a remote buffer.

This can be useful when we run larger models with host side reductions since HEXOPT is currently limited to 128 MB.

Parameters

w – Pointer to a user buffer to copy from.
buffer – The remote buffer to copy to.
repeat_index – The index in the remote buffer to copy to.
replication_index – The replicated graph index when using replicated graphs. Default=0.

Public Static Functions

static std::unique_ptr<TrainingSession> createFromIr(std::shared_ptr<Ir> ir, std::shared_ptr<DeviceInfo> deviceInfo, const std::string name = DefaultTrainingSessionName)

Create a session for training from an IR.

Parameters

ir – The IR to create the session from.
deviceInfo – The type of device that this session uses.
name – The name of this training session. Default: “training”.

static std::unique_ptr<TrainingSession> createFromOnnxModel(const std::string &model, const DataFlow &dataFlow, const TensorId &loss, const Optimizer &optimizer, std::shared_ptr<DeviceInfo> deviceInfo, const InputShapeInfo &inputShapeInfo = InputShapeInfo(), const SessionOptions &userOptions = SessionOptions(), const Patterns &patterns = Patterns(), const std::string name = DefaultTrainingSessionName)

Create a session for training from an ONNX model.

Parameters

model – An ONNX model protobuf, or the name of a file containing an ONNX model protobuf.
dataFlow – Configuration for the data feeds and fetches.
loss – The identifier of the final scalar loss tensor for training.
optimizer – The optimizer to use when training.
deviceInfo – The type of device that this session uses.
inputShapeInfo – (Optional) The sizes and dtypes of the input tensors. This is used to specify the sizes of the input tensors in the case that the ONNX model does not include this information. The Poplar graph programming framework uses statically allocated memory buffers and so it needs to know the size of tensors before the compilation. Default: InputShapeInfo().
userOptions – (Optional) The user configuration options for the Session class. Default: SessionOptions().
patterns – (Optional) A user-selected set of graph transformation patterns which will be applied to the graph. If this is not specified, a default set of optimisation transformations will be applied. Default: Patterns().
name – (Optional) The name of this training session. Default: “training”.

14.1.2. Inference session

#include <popart/session.hpp>

class InferenceSession : public popart::Session 

InferenceSession is a runtime instance that provides an interface for executing ONNX graphs on IPU hardware, without any automatic differentiation (backpropagation) or optimization.

Public Functions

~InferenceSession() override: Destructor for the InferenceSession class.

void popxlSetEngineIsLoaded(bool isLoaded)

Public Static Functions

static std::unique_ptr<InferenceSession> createFromIr(std::shared_ptr<Ir> ir, std::shared_ptr<DeviceInfo> deviceInfo, const std::string name = DefaultInferenceSessionName)

Create a session for inference from an IR.

Parameters

ir – The IR to create the session from.
deviceInfo – The type of device that this session uses.
name – The name of this inference session. Default: “inference”.

static std::unique_ptr<InferenceSession> createFromOnnxModel(const std::string &model, const DataFlow &dataFlow, std::shared_ptr<DeviceInfo> deviceInfo, const InputShapeInfo &inputShapeInfo = InputShapeInfo(), const SessionOptions &userOptions = SessionOptions(), const Patterns &patterns = Patterns(), const std::string name = DefaultInferenceSessionName)

Create a session for inference from an ONNX model.

Parameters

model – An ONNX model protobuf, or the name of a file containing an ONNX model protobuf.
dataFlow – Configuration for the data feeds and fetches.
deviceInfo – The type of device that this session uses.
inputShapeInfo – (Optional) The sizes and dtypes of the input tensors. This is used to specify the sizes of the input tensors in the case that the ONNX model does not include this information. The Poplar graph programming framework uses statically allocated memory buffers and so it needs to know the size of tensors before the compilation. Default: InputShapeInfo().
userOptions – (Optional) The user configuration options for the Session class. Default: SessionOptions().
patterns – (Optional) A user-selected set of graph transformation patterns which will be applied to the graph. If this is not specified, a default set of optimisation transformations will be applied. Default: Patterns().
name – (Optional) The name of this inference session. Default: “inference”.

14.1.3. Session options

#include <popart/sessionoptions.hpp>

enum class popart::AccumulateOuterFragmentSchedule

Enum type that determines how the operations in the accumulate outer fragment will be scheduled across virtual graphs (only relevant to pipelined modes).

Values:

enumerator Scheduler = 0: Don’t add additional constraints and let the scheduler work it out.

enumerator Serial: Add constraints that ensure ops are executed in virtual graph ID order.

enumerator OverlapCycleOptimized: Try and parallelise ops with different virtual graph IDs as much as possible.

enumerator OverlapMemoryOptimized: Try and parallelise ops with different virtual graph IDs but avoid certain steps that are costly in terms of memory usage.

enum class popart::AutodiffStitchStrategy

Enum type representing a strategy to ensure a backward graph’s inputs are either inputs of the forward graph, outputs of the forward graph or gradients of outputs of the forward graph.

Strategies may expose tensors that would otherwise have been internal to the forward graph as outputs of this forward graph.

Values:

enumerator RecomputeMinimal = 0: Recompute any backward graph inputs associated with non-gradient forward graph tensors that are neither inputs nor outputs in the forward graph.

enumerator RecomputeAllNonInputs: Recompute any backward graph inputs associated with non-gradient forward graph tensors that are not inputs in the forward graph.

enumerator AddFwdOutputs: For backward graph inputs associated with non-gradient forward graph tensors that are neither inputs nor outputs in the forward graph, add them as outputs to the forward graph.

Note

This strategy is not guaranteed to work for all circumstances. In particular, it is unable to deal with subgraphs of IfOp. Using this setting may therefore result in subsequent exceptions in the Autodiff transform and it is therefore inadvisable to use this as an Autodiff default.

enumerator SafeAddFwdOutputs

Like AutodiffStitchStrategy::AddFwdOutputs except that those backward graph inputs that can’t be stitched with AutodiffStitchStrategy::AddFwdOutputs (that is, by adding outputs to the forward graph) are stitched using the AutodiffStitchStrategy::RecomputeMinimal strategy instead.

This means that this is a safe strategy to use as an Autodiff default.

enumerator N: Number of AutodiffStitchStrategy values.

enum class popart::BatchSerializationBatchSchedule

Enum type that describes how to change the batch serialisation subgraph schedule before outlining.

Note

This setting is experimental and may change.

Values:

enumerator Scheduler = 0: Don’t encourage any particular scheduling for ops within batch subgraphs (leave it to the scheduler) but tell the scheduler to schedule subgraphs in sequence.

enumerator Isomorphic: Encourage all ops within batch subgraphs to be scheduled identically and for each subgraph to be scheduled in sequence (good for outlineability).

enumerator OverlapOnIo: Attempt to put the remote load op for batch N+1 right after the compute phase of batch N.

enumerator OverlapOnCompute: Attempt to put the remote load op for batch N+1 right before the compute phase of batch N.

enumerator N: The number of BatchSerializationBatchSchedule values.

enum class popart::BatchSerializationMethod

Enum type that describes how to apply the batch serialization.

Note

This setting is experimental and may change.

Values:

enumerator UnrollDynamic = 0: Unroll the batch with dynamic slicing.

enumerator UnrollStatic: Unroll the batch with static slicing.

enumerator Loop: Loop over the batch dimension.

enumerator N: The number of BatchSerializationMethod values.

enum class popart::BatchSerializationTransformContext

Enum type that describes when to apply batch serialization.

Note

This setting is experimental and may change.

Values:

enumerator Fwd = 0: Apply batch serialization before growing the backward pass.

enumerator Bwd: Apply batch serialization after growing the backward pass.

enumerator N: The number of BatchSerializationTransformContext values.

enum class popart::ExecutionPhaseIOSchedule

Enum type to specify when to load tensors.

Values:

enumerator Preload = 0: Preload tensors in previous phase for use in current phase.

enumerator OnDemand: Load tensors just before they are required.

enumerator N: The number of ExecutionPhaseIOSchedule values.

enum class popart::ExecutionPhaseSchedule

Enum type to specify the order of processing optimizer operations for different weights of the same execution phase.

The steps for phased execution are:

Copy to IO tiles if necessary.
Run collective operations if necessary.
Load optimizer state.
Update optimizer state.
Apply optimizer.
Store updated tensor if necessary.

Values:

enumerator Interleaving = 0

Process above steps for one weight at a time (for example: 123456, 123456, 123456).

The scheduler may interleave these steps.

enumerator Batch: Process above steps for all weights together, in a way that maximises overlap potential between compute and exchange (for example: 333, 111, 222, 444, 555, 666).

enumerator BatchClusteredIO: Process above steps for all weights together, in a way that maximises overlap potential between compute and exchange, and maximises stream copy merges by keeping RemoteLoad/RemoteStore operations clustered (for example: 333, 111, 222, 444, 555, 666).

enumerator N: The number of ExecutionPhaseSchedule values.

enum class popart::GradientTensorTrackingMethod

Enum type to specify the method for selecting gradient tensors whose statistics are to be tracked for the AutomaticLossScale transform.

Values:

enumerator AllNonViewChangingGradientTensors = 0: Track all gradients of non-view-changing gradient tensors.

enumerator ConvAndMatmulGradients: Track all gradients of inputs to MatMul and Convolution ops.

enumerator GradientsOfUserSpecifiedTensors: Track gradients of user-specified tensors.

enumerator N: The number of GradientTensorTrackingMethod values.

enum class popart::Instrumentation

Enum type used to specify an instrumentation type.

Values:

enumerator Outer = 0: Outer loop instrumentation, graph over all IPUs.

enumerator Inner: Inner loop instrumentation, graph per IPU.

enumerator N: The number of Instrumentation values.

enum class popart::IrSerializationFormat

Enum type used to specify a serialization format.

Values:

enumerator JSON: JavaScript Object Notation (JSON).

enum class popart::MeanReductionStrategy

Enum type that specifies when to divide by a mean reduction factor, when doing mean reduction over a sequence of tensors \(t_1, t_2, ..., t_k\).

Values:

enumerator Running = 0

Keep the reduction buffer as the mean of the tensors accumulated so far.

If \(t_1, ..., t_f\) has just been processed, the current accumulator \(s\) is the mean of these values, and the next accumulator update is \(s = \frac{f}{f+1} * s + \frac{1}{f+1} * t_{f+1}\) to keep \(s\) a running mean.

This strategy guarantees \(s \le \max(t_1, ..., t_k)\) throughout the accumulation, so it will not overflow, but it is generally slower than MeanReductionStrategy::Post.

enumerator Post

Keep the accumulation factor as the running sum, and divide once by \(k\) at the end of the accumulation.

This strategy will generally be faster than MeanReductionStrategy::Running, but is prone to overflow (especially when using fp16).

enumerator N: The number of MeanReductionStrategy values.

enum class popart::MergeVarUpdateType

Enum type used to specify which VarUpdateOp ops to merge.

Values:

enumerator None = 0: Do not merge VarUpdateOp ops.

enumerator All

Merge all VarUpdateOp ops into as few groups as possible.

This is a good choice when memory is not a constraint.

enumerator AutoLoose: Merge into groups while attempting not to increase maximum variable liveness, and without slicing tensor variables such that they would need to be processed by different VarUpdateOp ops.

enumerator AutoTight: Merge into groups, so that VarUpdateOp ops process tensors of exactly SessionOptions::mergeVarUpdateMemThreshold in size.

enumerator N: The number of MergeVarUpdateType values.

enum class popart::RecomputationType

Enum type to specify which ops to recompute in the backward pass when doing auto-recomputation.

Values:

enumerator None = 0: No ops are recomputed (Default).

enumerator Standard: Recompute using algorithm that picks checkpoints to try and minimise max liveness.

enumerator NormOnly: Only Norm ops (+ non-linearities, if following) are recomputed.

enumerator Pipeline: Recompute all forward pipeline stages.

enumerator RecomputeAll: Recompute all ops.

enumerator N: The number of RecomputationTypes values.

enum class popart::SubgraphCopyingStrategy

Enum type that describes how copies for inputs and outputs for subgraphs are lowered.

Currently this only affects subgraphs associated with CallOp ops.

Values:

enumerator OnEnterAndExit = 0

Copy all inputs before the start of the subgraph, copy all outputs after all ops in the subgraph.

With this strategy, subgraphs will always map to a single Poplar function.

enumerator JustInTime

Copy inputs just before they are consumed and copy outputs as soon as they are produced.

With this strategy, subgraphs may be lowered into multiple Poplar functions.

enumerator N: The number of SubgraphCopyingStrategy values.

enum class popart::SyntheticDataMode

Enum type used to specify the data source for input tensors.

Values:

enumerator Off = 0: Use real data.

enumerator Zeros: Input tensors are initialised to all zeros.

enumerator RandomNormal: Input tensors are initialised with a random normal distribution ~N(0,1).

enumerator RandomUniform: Input tensors are initialised with a uniform distribution.

enumerator N: The number of SyntheticDataMode values.

enum class popart::VirtualGraphMode

Enum type used to specify a virtual graph mode.

Values:

enumerator Off = 0: Virtual graphs are not enabled.

enumerator Manual: User must set the popart::Op::virtualGraph attribute on all ops.

enumerator Auto: Use the AutoVirtualGraph transform.

enumerator ExecutionPhases: Virtual graphs are tied to execution phases.

enumerator N: The number of VirtualGraphMode values.

struct AccumulateOuterFragmentSettings

A structure containing accumulate outer fragment settings.

Public Functions

AccumulateOuterFragmentSettings() = default

inline AccumulateOuterFragmentSettings(AccumulateOuterFragmentSchedule schedule_, const std::vector<int> &excludedVirtualGraphs_)

Constructor for AccumulateOuterFragmentSettings.

Parameters

schedule_ – Indicate how to schedule the accumulate outer fragment. This setting is experimental and may change. Default: AccumulateOuterFragmentSchedule::Serial
excludedVirtualGraphs_ – The IDs of the virtual graphs to explicitly exclude from parallelisation. This setting is experimental and may change.

Public Members

AccumulateOuterFragmentSchedule schedule = AccumulateOuterFragmentSchedule::Serial : Indicate how to schedule the accumulate outer fragment.

Note

This setting is experimental and may change.

std::vector<int> excludedVirtualGraphs = {}: The IDs of the virtual graphs to explicitly exclude from parallelisation.

Note

This setting is experimental and may change.

struct AutodiffSettings

The settings for the Autodiff transform.

Public Functions

AutodiffSettings() = default: Default constructor for the AutodiffSettings struct.

inline AutodiffSettings(AutodiffStitchStrategy stitchStrategy_)

Constructor for the AutodiffSettings struct.

Parameters: stitchStrategy_ – The strategy to ensure a backward graph’s inputs are either inputs of the forward graph, outputs of the forward graph or gradients of outputs of the forward graph. Default: AutodiffStitchStrategy::RecomputeAllNonInputs.

Public Members

AutodiffStitchStrategy stitchStrategy = AutodiffStitchStrategy::RecomputeAllNonInputs : The strategy PopART should use to ensure that all graph inputs of a backward graph are available as either inputs or outputs of the forward graph or gradients of outputs of the forward graph.

Note

This is an experimental option and may change.

struct AutomaticLossScalingSettings

A structure containing user configuration for automatic loss scaling settings.

Note

Automatic loss scaling is in preview. It is well tested and enabled in some of our example applications, but may not behave as expected in all models. Recommendation: if your model with automatic loss scaling enabled does not converge or triggers a compilation error, then you will need to set the loss scale manually.

Public Functions

AutomaticLossScalingSettings() = default: Default constructor for AutomaticLossScalingSettings.

AutomaticLossScalingSettings(bool enabled_, const nonstd::optional<std::vector<TensorId>> &toTrackTensors_, float binEdgeLocation_, float thresholdUpperCountProportion_, int updatePeriod_, GradientTensorTrackingMethod gradientTensorTrackingMethod_)

Constructor for AutomaticLossScalingSettings.

Parameters

enabled_ – Indicate whether to keep track (true) or not (false) of the distribution of gradient tensor elements over the floating point range. Default: false.
toTrackTensors_ – An optional list of model tensor names, for which gradient statistics will be collected. If not set, the gradients of all tensors produced by default operations (matmul, conv) will be used.
binEdgeLocation_ – The location of the bin edge as a proportion of the absolute numerical range of the tracked gradient tensor elements, in the range [0, 1]. 0 represents the smallest representable value, and 1 the maximum. This is the single bin edge of the histogram that is an input to the loss scale updater algorithm. Default: 0.125.
thresholdUpperCountProportion_ – The proportion of the elements in the upper bin above which the loss scale is increased, and below which the loss scale is decreased. Should be in the range [0, 1]. Default: 1e-7.
updatePeriod_ – Indicate how often the loss scale update factor should be updated with respect to optimizer steps. Default: 1
gradientTensorTrackingMethod_ – The method for selecting gradient tensors whose statistics are to be tracked. Default: GradientTensorTrackingMethod::AllNonViewChangingGradientTensors.

std::size_t hash() const

Public Members

bool enabled = false

float binEdgeLocation = 0.125f

float thresholdUpperCountProportion = 1e-7

nonstd::optional<std::vector<TensorId>> toTrackTensors

int updatePeriod = 1

GradientTensorTrackingMethod gradientTensorTrackingMethod = GradientTensorTrackingMethod::AllNonViewChangingGradientTensors 

struct BatchSerializationSettings

A structure containing batch serialization settings.

Public Functions

BatchSerializationSettings() = default: Default constructor for BatchSerializationSettings.

BatchSerializationSettings(int factor_, bool concatOnVirtualGraphChange_, bool concatOnExecutionPhaseChange_, bool concatOnPipelineStageChange_, BatchSerializationTransformContext transformContext_ = BatchSerializationTransformContext::Fwd, BatchSerializationMethod method_ = BatchSerializationMethod::UnrollDynamic, BatchSerializationBatchSchedule batchSchedule_ = BatchSerializationBatchSchedule::Isomorphic)

Constructor for BatchSerializationSettings.

Parameters

factor_ – The number of compute batches to split operations into. Default: 0.
concatOnVirtualGraphChange_ – Indicate to break batch serialization chains (true) when the virtual graph changes (by concatenating the compute batches to the local batch). Default: true.
concatOnExecutionPhaseChange_ – Indicate to break batch serialization chains (true) when the execution phase changes (by concatenating the compute batches to the local batch). Default: true.
concatOnPipelineStageChange_ – Indicate to break batch serialization chains (true) when the pipeline stage changes (by concatenating the compute batches to the local batch). Default: true.
transformContext_ – An experimental value to control when batch serialization is applied. Default: BatchSerializationTransformContext::Fwd.
method_ – An experimental value to control how batch serialization is applied. Default: BatchSerializationMethod::UnrollDynamic.
batchSchedule_ – An experimental value that changes how operations are scheduled. Default: BatchSerializationBatchSchedule::Isomorphic.

Public Members

int factor = 0: The number of compute batches to split operations into.

bool concatOnVirtualGraphChange = true: Break batch serialization chains when the virtual graph changes (by concatenating the compute batches to the local batch).

bool concatOnExecutionPhaseChange = true: Break batch serialization chains when the execution phase changes (by concatenating the compute batches to the local batch).

bool concatOnPipelineStageChange = true: Break batch serialization chains when the pipeline stage changes (by concatenating the compute batches to the local batch).

BatchSerializationTransformContext transformContext = BatchSerializationTransformContext::Fwd : Experimental value to control when batch serialization is applied.

BatchSerializationMethod method = BatchSerializationMethod::UnrollDynamic : Experimental value to control how batch serialization is applied.

BatchSerializationBatchSchedule batchSchedule = BatchSerializationBatchSchedule::Isomorphic : Experimental value that changes how operations are scheduled.

struct ExecutionPhaseSettings

A structure containing ExecutionPhase settings.

Public Functions

ExecutionPhaseSettings() = default: Default constructor for ExecutionPhaseSettings.

inline ExecutionPhaseSettings(int phases_, bool stages_, ExecutionPhaseIOSchedule weightIOSchedule_, ExecutionPhaseIOSchedule activationIOSchedule_, ExecutionPhaseIOSchedule optimizerStateIOSchedule_, ExecutionPhaseIOSchedule accumulatorIOSchedule_, ExecutionPhaseSchedule schedule_)

Constructor for ExecutionPhaseSettings.

Parameters

phases_ – The number of execution phases for the whole model. Default: 0.
stages_ – The number of overlapping stages:
- 1: Parallel streaming memory, default for 1 IPU per replica.
- 2: PingPong between 2 IPUs, default for 2 or more IPUs per replica (Default).
weightIOSchedule_ – The execution phase IO schedule for weight tensors. Default: ExecutionPhaseIOSchedule::Preload.
activationIOSchedule_ – The execution phase IO schedule for activation and gradient tensors. Default: ExecutionPhaseIOSchedule::Preload.
optimizerStateIOSchedule_ – The execution phase IO schedule for optimizer state tensors. Default: ExecutionPhaseIOSchedule::OnDemand.
accumulatorIOSchedule_ – The execution phase IO schedule for gradient accumulator tensors. Default: ExecutionPhaseIOSchedule::Preload.
schedule_ – An experimental value that changes how operations are scheduled. Default: ExecutionPhaseSchedule::Interleaving.

Public Members

int phases = 0: Number of ExecutionPhases for the whole model.

int stages = 2

Number of overlapping stages.

1: Parallel streaming memory, default for 1 IPU per replica.
2: PingPong between 2 IPUs, default for 2 or more IPUs per replica.

ExecutionPhaseIOSchedule weightIOSchedule = ExecutionPhaseIOSchedule::Preload : The execution phase IO schedule for weight tensors.

ExecutionPhaseIOSchedule activationIOSchedule = ExecutionPhaseIOSchedule::Preload : The execution phase IO schedule for activation and gradient tensors.

ExecutionPhaseIOSchedule optimizerStateIOSchedule = ExecutionPhaseIOSchedule::OnDemand 

ExecutionPhaseIOSchedule accumulatorIOSchedule = ExecutionPhaseIOSchedule::Preload 

ExecutionPhaseSchedule schedule = ExecutionPhaseSchedule::Interleaving 

struct ReplicatedCollectivesSettings

A structure containing settings for replicated collective operations.

Public Functions

ReplicatedCollectivesSettings(bool prepareScheduleForMergingCollectives = false, bool mergeAllReduceCollectives = false, bool mergeReduceScatterCollectives = false, bool mergeAllGatherCollectives = false)

Constructor for the ReplicatedCollectivesSettings struct.

Parameters

prepareScheduleForMergingCollectives – Insert constraints into the schedule such that collectives which can be merged occur one right after the other. true to insert constraints, false otherwise. Default: false.
mergeAllReduceCollectives – Identify allreduce operations which can be scheduled at the same time, and perform them as one larger operation to better utilize the bandwidth between replicas. true to identify operations, false otherwise. Default: false.

std::size_t hash() const

Public Members

bool prepareScheduleForMergingCollectives = false

bool mergeAllReduceCollectives = false

bool mergeReduceScatterCollectives = false: Identifies reduce-scatter operations which can be scheduled at the same time, and performs them as one larger operation so as to better utilize the bandwidth between replicas.

bool mergeAllGatherCollectives = false: Identifies allgather operations which can be scheduled at the same time, and performs them as one larger operation so as to better utilize the bandwidth between replicas.

struct SessionOptions

A structure containing user configuration options for the Session class.

Public Functions

inline bool explicitPipeliningEnabled() const

Check whether explicit pipelining is enabled.

Determined from values for enablePipelining, useHostCopyOps and enableExplicitMainLoops.

inline bool implicitPipeliningEnabled() const

Check whether implicit pipelining is enabled.

Determined from values for enablePipelining, useHostCopyOps and enableExplicitMainLoops.

inline void enableExplicitIR(bool enable)

Enable explicit representations in the IR (code paths).

Enabled if true, otherwise not.

bool shouldDelayVarUpdates() const

int64_t getGlobalReplicationFactor() const

Get the global replication factor.

Returns

If enableDistributedReplicatedGraphs is true, then return globalReplicationFactor.
If enableReplicatedGraphs is true, then return replicatedGraphCount.
Otherwise, return 1.
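
The order of these checks can be sketched as follows. This is a minimal, self-contained illustration and not the PopART implementation: ReplicationOptions and globalReplicationFactorOf are hypothetical names, and the struct only mirrors the four SessionOptions members that the rule above refers to.

```cpp
#include <cstdint>

// Hypothetical stand-in that holds only the SessionOptions members the
// documented return rule depends on.
struct ReplicationOptions {
  bool enableDistributedReplicatedGraphs = false;
  bool enableReplicatedGraphs = false;
  std::int64_t globalReplicationFactor = 1;
  std::int64_t replicatedGraphCount = 1;
};

// Follows the documented order: distributed replication is checked first,
// then local replication, otherwise a single replica is assumed.
std::int64_t globalReplicationFactorOf(const ReplicationOptions &opts) {
  if (opts.enableDistributedReplicatedGraphs) {
    return opts.globalReplicationFactor;
  }
  if (opts.enableReplicatedGraphs) {
    return opts.replicatedGraphCount;
  }
  return 1;
}
```

In this sketch, setting enableReplicatedGraphs to true with replicatedGraphCount = 4 gives 4, while additionally enabling enableDistributedReplicatedGraphs with globalReplicationFactor = 16 gives 16.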

unsigned getAccumulationFactor() const

Get the gradient accumulation factor.

Throws an error if gradient accumulation is not enabled (enableGradientAccumulation is false) and the factor (accumulationFactor) is set to >1.

Returns: The accumulation factor.
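
The error condition can be illustrated with a small stand-alone sketch. AccumulationOptions and accumulationFactorOf are hypothetical names, the exception type is an assumption, and returning 1 when gradient accumulation is disabled is also an assumption that the text above does not confirm.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the two SessionOptions members involved.
struct AccumulationOptions {
  bool enableGradientAccumulation = false;
  std::int64_t accumulationFactor = 1;
};

unsigned accumulationFactorOf(const AccumulationOptions &opts) {
  if (!opts.enableGradientAccumulation) {
    // Documented error case: a factor > 1 without gradient accumulation.
    if (opts.accumulationFactor > 1) {
      throw std::runtime_error(
          "accumulationFactor > 1 requires enableGradientAccumulation");
    }
    // Assumption: with accumulation disabled, the effective factor is 1.
    return 1;
  }
  return static_cast<unsigned>(opts.accumulationFactor);
}
```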

unsigned getBufferingDepth(const TensorId &id, bool rearrangedOnHost)

bool autoRecomputationEnabled() const: Returns true if auto-recomputation is enabled, false otherwise.

inline SessionOptions(): Constructor for SessionOptions.

Public Members

std::string logDir: A directory for log traces to be written into.

std::set<std::string> dotChecks = {}: When to write .dot files during IR construction.

int firstDotOp = 0

The ops written to the .dot file will be a part of the schedule, controlled by firstDotOp and finalDotOp.

In particular, it will be [max(0, firstDotOp), min(N ops in IR, finalDotOp)).
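
As a worked illustration of this half-open range, the following sketch computes its two bounds. The function name dotOpRange and the parameter nOpsInIr are hypothetical; only the max/min expression comes from the description above.

```cpp
#include <algorithm>
#include <utility>

// Returns {first, last} for the half-open schedule range [first, last)
// of ops that would be written to the .dot file.
std::pair<int, int> dotOpRange(int firstDotOp, int finalDotOp, int nOpsInIr) {
  const int first = std::max(0, firstDotOp);
  const int last = std::min(nOpsInIr, finalDotOp);
  return {first, last};
}
```

With the default values firstDotOp = 0 and finalDotOp = 10000, an IR with 250 ops gives the range [0, 250), so every op in the schedule is included.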

int finalDotOp = 10000: See firstDotOp.

bool dotOpNames = false

Enable inclusion of the op name in the .dot file (the op type is always exported).

Enabled when true. Default: false.

bool exportPoplarComputationGraph = false

Enable export of Poplar computational graph.

Enabled when true. Default: false.

bool exportPoplarVertexGraph = false

Enable export of Poplar vertex graph.

Enabled when true. Default: false.

bool separateCallOpPdfs = true

Enable creation of separate PDFs for each subgraph when generating PDFs of IR graphs.

Enabled when true. Default: true.

bool enableOutlining = true

Enable outlining.

This identifies and extracts repeated parts of computational graph into subgraphs. Enabled when true. Default: true.

bool enableOutliningCopyCostPruning = true

Enable inclusion of the cost of copying cached sections in the outlining cost model.

Enabled when true. Default: true.

float outlineThreshold = 1.0f

Specify the incremental value that a sub-graph requires, relative to its nested sub-graphs (if any), to be eligible for outlining.

A high threshold results in fewer sub-graphs being outlined, a negative value results in all being outlined. The gross value of a sub-graph is the sum of its constituent ops’ Op::getSubgraphValue() values. To disable outlining, it is better to set enableOutlining to false than to set this value to infinity. The default value of 1.0f results in all high value operations such as convolution being cached, but standalone low value operations such as ReLU will not be.

Default: 1.0f.

float outlineSequenceBreakCost = 10000.0f

Specify the penalty applied to outlining potential sub-graphs if the sub-graph to be created breaks up a sequence of operations that are more efficient (for example for overlapping compute and exchange) when outlined together.

Default: 10000.0f.

SubgraphCopyingStrategy subgraphCopyingStrategy = SubgraphCopyingStrategy::OnEnterAndExit 

Specify how copies for inputs and outputs for subgraphs are lowered.

Setting this value to SubgraphCopyingStrategy::JustInTime may save memory at the cost of fragmenting subgraphs into multiple Poplar functions. This may be particularly useful when a number of weight updates are outlined in one subgraph, as it may prevent multiple weight tensors from being live at the same time inside the subgraph.

Default: SubgraphCopyingStrategy::OnEnterAndExit.

RecomputationType autoRecomputation = RecomputationType::None 

Enable recomputation of operations in the graph in the backward pass.

This will reduce model size at the cost of computation cycles.

Default: RecomputationType::None (no recomputation).

MergeVarUpdateType mergeVarUpdate = MergeVarUpdateType::None 

Enable merging of VarUpdates into groups of VarUpdates, by flattening and concatenating variable tensors and updating tensors.

Default: MergeVarUpdateType::None (no merging).

int64_t mergeVarUpdateMemThreshold = 1000000

Specify the memory threshold for VarUpdateOp merging algorithms.

The MergeVarUpdateType::AutoLoose and MergeVarUpdateType::AutoTight VarUpdateOp merging algorithms have a threshold on the total memory of variable tensors to merge for updating. Defined as total memory in bytes.

Default: 1000000.

int64_t looseThresholdAtPeak = 8000

Specify the threshold at peak used in the calculation of the absolute threshold in the MergeVarUpdateType::AutoLoose VarUpdateOp merging algorithm.

min(mergeVarUpdateMemThreshold, liveAtPeak - liveCurrently + looseThresholdAtPeak)

where:

liveAtPeak is an estimate of the maximum live memory of the computation; and
liveCurrently is an estimate of the live memory where the threshold is being used to determine whether to schedule or postpone a VarUpdateOp.

Default: 8000.
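
The threshold expression above can be evaluated directly, as in the following sketch. The function name autoLooseAbsoluteThreshold is hypothetical, and the live-memory estimates are passed in as plain numbers because how PopART derives them is not described here.

```cpp
#include <algorithm>
#include <cstdint>

// Evaluates
//   min(mergeVarUpdateMemThreshold,
//       liveAtPeak - liveCurrently + looseThresholdAtPeak)
// with all quantities expressed in bytes.
std::int64_t autoLooseAbsoluteThreshold(std::int64_t mergeVarUpdateMemThreshold,
                                        std::int64_t looseThresholdAtPeak,
                                        std::int64_t liveAtPeak,
                                        std::int64_t liveCurrently) {
  return std::min(mergeVarUpdateMemThreshold,
                  liveAtPeak - liveCurrently + looseThresholdAtPeak);
}
```

For example, with mergeVarUpdateMemThreshold = 1000000, looseThresholdAtPeak = 8000, liveAtPeak = 500000 and liveCurrently = 450000, the result is min(1000000, 58000) = 58000. The closer the current live memory is to the peak, the smaller the threshold becomes.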

bool rearrangeAnchorsOnHost = true

Enable rearrangement (in memory) of anchor tensors to be done on the host.

Before anchor tensors are streamed from device to host, they are not necessarily arranged in memory as required when they are to be copied from host stream to host. This can be done on the device or on the host.

Default: true (rearrangement is done on the host to save memory, but often at the expense of cycles, especially for larger anchor tensors).

bool rearrangeStreamsOnHost = false

Enable rearrangement (in memory) of stream tensors to be done on the host.

Before stream tensors are streamed from host to device, they are not necessarily arranged in memory as required when they are to be copied from host stream to device. This can be done on the device or on the host.

Default: false (Rearrangement done on device).

bool enablePrefetchDatastreams = true

Enable prefetching for input data streams.

Poplar will speculatively read data for a stream before it is required in order to allow the ‘preparation’ of the data to occur in parallel with compute. Enabled when true. Default: true.

unsigned defaultBufferingDepth = 1

Specify the default buffering depth value used for streams that are not re-arranged on the host.

For tensors that are rearranged on the host, a buffering depth of 1 will always be used. This default value can be overridden via bufferingDepthMap.

unsigned defaultPrefetchBufferingDepth = initialDefaultPrefetchBufferingDepthValue

Deprecated: This session option name has been deprecated and will be removed in a future release.

Please use the alias defaultBufferingDepth instead.

std::map<TensorId, unsigned> bufferingDepthMap

This mapping can be used to set stream-specific buffering depths.

The buffering depth could be thought of as being the size of a circular buffer that feeds data to and from Poplar. A buffering depth greater than 1 may improve the performance due to increased parallelisation but comes at the cost of increasing the memory footprint. Streams for tensors that have no entry in this map will default to 1 (if a tensor is rearranged on host) or defaultBufferingDepth (if a tensor is not rearranged on host). Specifying a tensor that gets rearranged on host in this map will throw an error.
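
The lookup rules in this paragraph can be sketched as a small stand-alone function. This is an illustration only: resolveBufferingDepth is a hypothetical name, TensorId is modelled as std::string (an assumption), and the exception type is also an assumption.

```cpp
#include <map>
#include <stdexcept>
#include <string>

using TensorId = std::string; // Assumption: tensor IDs modelled as strings.

unsigned resolveBufferingDepth(
    const std::map<TensorId, unsigned> &bufferingDepthMap,
    unsigned defaultBufferingDepth,
    const TensorId &id,
    bool rearrangedOnHost) {
  const auto found = bufferingDepthMap.find(id);
  if (found != bufferingDepthMap.end()) {
    // A map entry for a tensor that is rearranged on the host is an error.
    if (rearrangedOnHost) {
      throw std::runtime_error("buffering depth set for tensor '" + id +
                               "', which is rearranged on the host");
    }
    return found->second;
  }
  // No entry: 1 if rearranged on the host, otherwise the default depth.
  return rearrangedOnHost ? 1u : defaultBufferingDepth;
}
```

With a map containing only {"input", 4} and defaultBufferingDepth = 2, the stream for "input" gets a depth of 4, a stream for an unlisted tensor gets 2, and an unlisted tensor that is rearranged on the host gets 1.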

std::map<TensorId, unsigned> prefetchBufferingDepthMap

Deprecated: This session option name has been deprecated and will be removed in a future release.

Please use the alias bufferingDepthMap instead.

bool enableNonStableSoftmax = false

Enable the non-stable softmax Poplar function.

By default, the stable softmax Poplar function is used. The input tensor to softmax, \(x\), is preprocessed by subtracting \(max(x)\) from each element before computing the exponentials, ensuring numerical stability. If the inputs to the softmax operations are small enough to not cause overflow when computing the exponential, then the non-stable version can be enabled instead, to increase the speed.

Default: false (not enabled).
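
The max-subtraction step used by the stable variant can be illustrated on the host with a short sketch. This is not the Poplar function: stableSoftmax is a hypothetical name and the code only demonstrates the numerical idea for a single vector.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax with max(x) subtracted from every element before exponentiation,
// so the largest exponent is exp(0) = 1 and cannot overflow.
std::vector<float> stableSoftmax(const std::vector<float> &x) {
  std::vector<float> y(x.size());
  if (x.empty()) {
    return y;
  }
  const float maxValue = *std::max_element(x.begin(), x.end());
  float sum = 0.0f;
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = std::exp(x[i] - maxValue);
    sum += y[i];
  }
  for (float &value : y) {
    value /= sum;
  }
  return y;
}
```

For the input {1000, 1000}, computing exp(1000) directly overflows single precision, whereas the shifted computation yields {0.5, 0.5}.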

bool enableReplicatedGraphs = false: Enable replication of graphs. Default: false (not enabled).

bool enableGradientAccumulation = false: Enable gradient accumulation. Default: false (not enabled).

ReductionType accumulationAndReplicationReductionType = ReductionType::Sum 

Specify how gradients are reduced when using gradient accumulation and graph replication.

Default: ReductionType::Sum.

MeanReductionStrategy meanAccumulationAndReplicationReductionStrategy = MeanReductionStrategy::Post 

Specify when to divide by a mean reduction factor when accumulationAndReplicationReductionType is set to ReductionType::Mean.

Default: MeanReductionStrategy::Post.

int64_t replicatedGraphCount = 1

Specify the number of model replications.

If enableReplicatedGraphs is true, replicatedGraphCount will set the number of model replications. For example, if the model uses 1 IPU, a replicatedGraphCount of 2 will use 2 IPUs. If the model is pipelined across 4 IPUs, a replicatedGraphCount of 4 will use 16 IPUs in total. Therefore, the number of IPUs requested must be a multiple of replicatedGraphCount. If the training is done across multiple instances of the program then the replicatedGraphCount is the number of replicas for this instance.
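
The IPU arithmetic in this description can be written out as a short sketch. The function names totalIpus and isValidIpuRequest, and the parameter ipusPerReplica, are hypothetical and are not part of the PopART API.

```cpp
#include <cstdint>

// Total IPUs used: IPUs needed by one copy of the model multiplied by
// the number of replicas.
std::int64_t totalIpus(std::int64_t ipusPerReplica,
                       std::int64_t replicatedGraphCount) {
  return ipusPerReplica * replicatedGraphCount;
}

// The number of IPUs requested must be a multiple of replicatedGraphCount.
bool isValidIpuRequest(std::int64_t requestedIpus,
                       std::int64_t replicatedGraphCount) {
  return replicatedGraphCount > 0 &&
         requestedIpus % replicatedGraphCount == 0;
}
```

This reproduces the two cases above: totalIpus(1, 2) is 2 and totalIpus(4, 4) is 16. A request for 6 IPUs with replicatedGraphCount = 4 would be rejected by isValidIpuRequest.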

int64_t accumulationFactor = 1: Specify the number of micro-batches to accumulate before applying the varUpdate.

VirtualGraphMode virtualGraphMode = VirtualGraphMode::Off 

Specify how to place ops on virtual graphs to achieve model parallelism, either manually using model annotations, or automatically.

Default: VirtualGraphMode::Off.

std::vector<float> virtualGraphSplitRatios

Specify split ratios when VirtualGraphMode::Auto is enabled.

These values represent the split ratios for each device, and each value is in the range (0, 1).

For example, to uniformly split the whole graph on 4 IPUs, the value should be [0.25, 0.25, 0.25, 0.25].
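
A uniform split for any number of IPUs can be generated as in the sketch below. Both helper names are hypothetical; the resulting vector is what would be assigned to virtualGraphSplitRatios.

```cpp
#include <cstddef>
#include <vector>

// Builds one equal ratio per IPU, for example 4 IPUs -> four values of 0.25.
std::vector<float> uniformSplitRatios(std::size_t numIpus) {
  if (numIpus == 0) {
    return {};
  }
  return std::vector<float>(numIpus, 1.0f / static_cast<float>(numIpus));
}

// Checks the documented constraint that each ratio lies in (0, 1).
bool ratiosInOpenUnitInterval(const std::vector<float> &ratios) {
  for (const float ratio : ratios) {
    if (!(ratio > 0.0f && ratio < 1.0f)) {
      return false;
    }
  }
  return true;
}
```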

bool enablePipelining = false: Enable pipelining of virtual graphs. Default: false (not enabled).

SyntheticDataMode syntheticDataMode = SyntheticDataMode::Off 

Specify whether to use real or synthetic data to initialize input tensors.

Streaming to/from the host is only enabled for SyntheticDataMode::Off which indicates that real data is being used.

Default: SyntheticDataMode::Off.

bool instrumentWithHardwareCycleCounter = false

Add instrumentation to the program to count the number of device cycles (of a single tile, on a single IPU) that the main program takes to execute.

Expect this to have a small detrimental impact on performance.

std::set<Instrumentation> hardwareInstrumentations = {Instrumentation::Outer}

bool disableGradAccumulationTensorStreams = false

Disable saving of weight gradient tensors off the device.

If true, the weight gradient tensors are not saved off the device when devicex.weightsFromHost() is called.

Note

This option is overridden if syntheticDataMode is not SyntheticDataMode::Off.

Note

Weight gradient tensors that are also optimiser tensors will only be disabled if both disableGradAccumulationTensorStreams and disableOptimizerStateTensorStreams are true.

bool disableOptimizerStateTensorStreams = false

Disable streaming of optimizer tensors.

If true, streaming of optimizer tensors is disabled. This setting can be used to conserve memory if you are not interested in checkpointing the optimizer state.

Note

Weight gradient tensors that are also optimiser tensors will only be disabled if both disableGradAccumulationTensorStreams and disableOptimizerStateTensorStreams are true.

bool compileEngine = true

Specify whether to compile the Poplar graph into an Engine after building it.

If false, the backend will build the Poplar graph but not compile it into an Engine. In this case, no execution can be performed, and nothing can be transferred to the device. API calls which retrieve information from the graph building stage, such as tile mapping introspection, can still be used.

bool constantWeights = true

Specify an optimization for an inference session to have constant weights.

Set this option to false in order to change the weights with a call to Session::resetHostWeights() after the session has been prepared. This option has no effect on a training session.

Default: true.

bool enableEngineCaching = false

Enable Poplar executable caching.

The file is saved to the location defined with cachePath. The file will be in the PopEF format. This means that it can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.

Default: false (not enabled).

bool enableVariablesCaching = true

Enable variable caching.

This means that the caching process will save variables as additional PopEF blobs to the file location defined with cachePath. If PopART requires data for variables while reading the cache, the data is read automatically from the cache file.

Note, turning this off allows a PopART Session to optimise the host memory it consumes during model runtime. Specifically, weightsToHost() can write directly to the IR tensor data buffers. If the option were on, this would not be safe and the session would have to create separate buffers to write the fetched data to.

Default: true (enabled).

std::string cachePath = "session_cache": Folder to save the poplar::Executable to.

bool enableFloatingPointChecks = false

Enable throwing of exceptions when floating point errors occur.

Default: false (not enabled).

bool enableStochasticRounding = false

Enable stochastic rounding.

PopART will set the Poplar engine option target.deterministicWorkers to true if this option is set and to false if it is not set. Adding a value for “target.deterministicWorkers” to SessionOptions::engineOptions overrides this behaviour.

Default: false (not enabled).

bool _enableRngStateManagement = false

ExecutionPhaseSettings executionPhaseSettings: Configuration settings for execution phases.

AccumulateOuterFragmentSettings accumulateOuterFragmentSettings: Configuration setting for operations in the accumulate outer fragment.

bool explicitRecomputation = false

Enable explicit recomputation.

Default: false (not enabled).

NumIOTiles numIOTiles: Number of IPU tiles dedicated to IO.

bool aliasZeroCopy = false: Enable zero-copy for subgraphs.

BatchSerializationSettings batchSerializationSettings: Configuration setting for batch serialization.

AutodiffSettings autodiffSettings: Configuration settings for the autodiff transform.

bool delayVarUpdates = true: Option to delay variable updates as much as possible.

bool scheduleNonWeightUpdateGradientConsumersEarly = false

bool enableFullyConnectedPass = true

Enable the global fullyConnectedPass option for matmuls.

See also

poplin::matMul(poplar::Graph, poplar::Tensor, poplar::Tensor, poplar::program::Sequence, poplar::Type, poplar::DebugContext, poplar::OptionFlags, matmul::PlanningCache).

bool enableSerializedMatmuls = true: Enable/disable the serializing of matmuls.

std::string partialsTypeMatMuls

Set the partials type globally for matmuls.

Can be overridden individually with Builder.setPartialsType(). Valid values are "float" and "half". By default, this is not set, so no global partials type is imposed.

bool enableStableNorm = false

If true, computes the mean first and subtracts the activations from it before computing the variance.

The implementation with this flag set to true is slower than when set to false. The stable version requires the first order moment to be estimated and applied to the sample set before the second order central moment is calculated.
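
The two-step calculation described above (mean first, then the second-order central moment of the shifted samples) can be sketched on the host as follows. This is not the PopART or Poplar implementation: twoPassVariance is a hypothetical name, and dividing by the sample count (population variance) is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Two-pass variance: estimate the mean, remove it from each sample, then
// average the squared differences.
float twoPassVariance(const std::vector<float> &x) {
  if (x.empty()) {
    return 0.0f;
  }
  const float count = static_cast<float>(x.size());

  // First pass: first-order moment (mean).
  float mean = 0.0f;
  for (const float value : x) {
    mean += value;
  }
  mean /= count;

  // Second pass: second-order central moment of the shifted samples.
  float sumOfSquares = 0.0f;
  for (const float value : x) {
    const float centred = value - mean;
    sumOfSquares += centred * centred;
  }
  return sumOfSquares / count;
}
```

The extra pass over the data is the reason the stable version is slower, in exchange for avoiding the cancellation that can occur when large, nearly equal values are squared before the mean is removed.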

std::map<std::string, std::string> engineOptions: Poplar engine options.

std::map<std::string, std::string> convolutionOptions: Poplar convolution options.

std::map<std::string, std::string> lstmOptions: Poplar LSTM options.

std::map<std::string, std::string> matmulOptions: Poplar matmul options.

std::map<std::string, std::string> reportOptions: Poplar reporting options.

std::map<std::string, std::string> gclOptions: GCL options.

ExperimentalSettings experimentalSettings: Configuration setting for custom transform applier.

std::vector<std::string> customCodelets

List of codelet files (with file extension) to be added to the Poplar graph.

See the Poplar documentation for poplar::Graph for more information.

std::vector<TensorId> updatableNamedBuffers

List of model named buffers that can be updated with a call to copyNamedBuffersToDevice().

This allows updating just a subset of the model weights, instead of all of them as happens with a copyWeightsToDevice() call.

std::string customCodeletCompileFlags

Compile flags for the custom codelets.

For example -g to generate debug info. See the Poplar documentation for poplar::Engine for more information.

double timeLimitScheduler = 1e9: The maximum allowed time (in seconds) that can be spent searching for a good graph schedule before a solution must be returned.

int64_t swapLimitScheduler = static_cast<int64_t>(1e9): The maximum number of improving steps allowed by the scheduling algorithm before a solution must be returned.

std::string serializedPoprithmsShiftGraphsDir = {}

The directory to serialize Poprithms graphs to.

PopART uses Poprithms for scheduling PopART graphs. The Poprithms graphs created for scheduling can be optionally serialised (written to file). If serializedPoprithmsShiftGraphsDir is empty, then the graphs will not be serialised. The names of serialization files will be poprithms_shift_graph_i.json for the lowest non-existing values of i. The directory must already exist; PopART will not create it.

std::string kahnTieBreaker = "greedy"

Specify which method is used to control how ops are scheduled.

The initial scheduling is done with Kahn’s algorithm. When several ops are free to be scheduled, this controls which method is used.

Options are described in the Poprithms KahnTieBreaker enum.

size_t transitiveClosureOptimizationThreshold = {100000}

Specify the transitive closure optimization threshold.

The transitive closure optimization pass can significantly accelerate the scheduler. It does not, in general, affect the final schedule returned. It is run between initialization with Kahn’s algorithm and the shifting swaps. The transitive closure optimization pass is O(nOps^2) and so should not be used for extremely large graphs. If a graph is above this threshold, the transitive closure optimization pass is not run.

bool decomposeGradSum = false

Enable replacement of single sums of partial gradients with a tree of additions.

This can reduce max liveness at the cost of extra cycles. A typical use case for this would be if a large weight tensor is used as an input to many operations.

Default: false (not enabled).

ReplicatedCollectivesSettings replicatedCollectivesSettings: Control the behavior of different collective operations.

bool enableDistributedReplicatedGraphs = false

Enable training with Poplar replicated graphs across multiple PopART instances.

Default: false (not enabled).

int64_t globalReplicationFactor = 1

The total number of replicas in a multi-instance, replicated-graph training session. This should be left as the default value (1) if distributed replicated graphs are disabled.

This value includes local replication.

int64_t globalReplicaOffset = 0: The first replica index that this PopART instance is running.

bool groupHostSync = false

Specify to group the streams from the host to the device at the beginning of the schedule, and the streams from the device to the host at the end of the schedule.

This trades off memory usage for speed. When true, tensors will stay live for longer.

Default: false (not enabled).

Note

This setting has no effect when useHostCopyOps is enabled (true).

bool strictOpVersions = true

Enable strict op version checks.

Strict op version checks will throw an error if the exact version of an op required for the model opset is not supported. Turning this check off will cause PopART to fall back to the latest implementation of the op that is supported.

Default: true (enabled).

Warning

Turning off these checks may cause undefined behaviour.

bool opxAliasChecking = false

Enable running Opx checks to verify that IR tensor aliasing information corresponds to the lowered Poplar tensor aliasing.

Default: false (not enabled).

bool opxModifyChecking = false

Enable running Opx checks to verify that IR tensor modification information corresponds to the lowered Poplar tensor modifications.

Default: false (not enabled).

bool useHostCopyOps = false

Enable use of IR graph operations for data and anchor streams.

Default: false (not enabled).

bool enableEfficientOverlapIOTopoCons = false

Enable simplified and equivalent overlapIO constraints.

Suppose there are N bins in each of the three stages (8 before the loop, 7 inside the loop, 6 after the loop) and L ops in each bin. The vanilla implementation of overlapped IO creates topological constraints with complexity O(N*N*L*L).

To make sure InitOps in each step are scheduled before HostLoadOps, it is only necessary to keep the topological constraints within each bin and to schedule the last op of each bin Bin0 before the first op of the following bin Bin1. This reduces the total complexity from O(N*N*L*L) to O(N*L).

Default: false (not enabled).

bool enableLoadAndOffloadRNGState = false

Enable load and offload of device RNG state from host.

Default: false (not enabled).

TensorLocationSettings activationTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}: Tensor location settings for activation/gradient tensors.

TensorLocationSettings weightTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}: Tensor location for weight tensors.

TensorLocationSettings optimizerStateTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}: Tensor location for optimizer state tensors.

TensorLocationSettings accumulatorTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}: Tensor location for gradient accumulator tensors.

std::map<TensorId, TensorLocation> tensorLocationSettingsOverride: Override tensor location for specific tensors by setting tensor locations for specific tensor ID values.

AutomaticLossScalingSettings automaticLossScalingSettings: Settings to enable and configure the automatic loss scaling behaviour when training.

Note

Automatic loss scaling is in preview. It is well tested and enabled in some of our example applications, but may not behave as expected in all models. Recommendation: if your model with automatic loss scaling enabled does not converge or triggers a compilation error, then you will need to set the loss scale manually.

DeveloperSettings developerSettings: Settings for developers to configure testing and benchmarking.

bool enableSupportedDataTypeCasting = true

Enable casting to supported data types.

If enabled (true), casts any tensor of unsupported data types to supported data types when lowering to Poplar. Currently, this implies casting:

INT64 -> INT32
UINT64 -> UINT32

The cast will throw an error for incompatible data types and over/underflows, and will warn about narrowing casts.

Default: true (enabled).
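The overflow behaviour described above can be pictured with a range-checked narrowing cast. This is a standalone sketch and not PopART's implementation: the function names and the choice of exception type are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Illustrative INT64 -> INT32 cast that rejects values outside the INT32 range.
inline std::int32_t narrowToInt32(std::int64_t v) {
  if (v < std::numeric_limits<std::int32_t>::min() ||
      v > std::numeric_limits<std::int32_t>::max()) {
    throw std::out_of_range("value does not fit in INT32");
  }
  return static_cast<std::int32_t>(v);
}

// Illustrative UINT64 -> UINT32 cast that rejects values above the UINT32 range.
inline std::uint32_t narrowToUint32(std::uint64_t v) {
  if (v > std::numeric_limits<std::uint32_t>::max()) {
    throw std::out_of_range("value does not fit in UINT32");
  }
  return static_cast<std::uint32_t>(v);
}
```

For example, narrowToInt32(42) returns 42, while a value such as 2^40 is rejected instead of being silently truncated.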

bool enableExplicitMainLoops = false: Enable explicit main loop transformation, and disable implicit training loops.

Note

This will be deprecated and enabled by default.

bool groupNormStridedChannelGrouping = false

Enable fast math mode for group norms.

Group norms have a fast math mode which changes the implementation to run faster on the IPU. As a consequence, it is incompatible with other implementations (for example, when running the trained weights on the host). The default (false) is to use the correct, but slightly slower, mode.

std::function<void(int, int)> compilationProgressLogger

Callback function used to indicate PopART compilation progress.

The function should not block. All calls to the callback function will be made from the main thread so blocking in the callback will block compilation from progressing.

If this logger is not set then compilation progress will be printed on the info channel.

Param int: The progress value.
Param int: The maximum value for the progress.
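A callback with this signature might look like the following sketch. It only records the latest progress as a percentage string, so it returns immediately and does not block compilation. The alias and factory function names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Same callable signature as the compilationProgressLogger member.
using ProgressLogger = std::function<void(int, int)>;

// Hypothetical factory: builds a non-blocking logger that stores the most
// recent progress as a percentage string in `lastMessage`.
// The caller must keep `lastMessage` alive for as long as the logger is used.
inline ProgressLogger makePercentLogger(std::string &lastMessage) {
  return [&lastMessage](int progress, int total) {
    assert(total > 0);
    lastMessage = std::to_string(progress * 100 / total) + "%";
  };
}
```

A logger built this way could be assigned to the compilationProgressLogger member. Doing only cheap work in the callback follows the advice above, because all calls are made from the main thread.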

int compilationProgressTotal = 100: Total progress ticks until compilation complete.

bool enableMergeExchange = true

Enable merging remote and host IO operations to facilitate IO overlap.

true to enable, otherwise false.

Default=true.

bool ensureFp32LossScaleTensor = false

Ensure that the loss scale tensor is fp32 and that this is combined with fp16 activations as late as possible to produce the first fp16 activation gradients.

This makes it possible to choose a loss scale value greater than max(fp16). This is also recommended when automatic loss scaling is enabled. Only compatible with models that have an fp16 loss scale tensor. true ensures that the loss scale tensor is fp32.

Default: false.

bool enableInplaceAmbiguityChecking = false

Enable creation of an AliasModel object for each graph and run the Poprithms ambiguity checker on it.

This throws an error if the graph has a potential inplacing ambiguity.

See poprithms::memory::inplace::Graph::AmbiguityStatus for more info on what constitutes an ambiguity.

If set to true, an AliasModel object is created for each graph and the Poprithms ambiguity checker is run on it. No ambiguity checking is performed if this option is set to false (default). However, inplace fallbacks will occur if necessary.

bool createImplicitPipeliningFwdOnlyProgram = false

Deprecated: Create a custom program containing the forward pipeline only.

bool throwIfLog2ScaleTensorNotInRange = true

If set to true, throw a Poplar error if any fused ops that consume a log2 scale tensor receive a log2 scale tensor value not in the integer range [-32, 32).

If set to false, no error is thrown. However, note that this may lead to undefined behaviour if the value of the log2 scale is outside the range.
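The range [-32, 32) is half-open: -32 is accepted and 32 is not. The check can be sketched as follows. The function name is hypothetical and the snippet only illustrates the range, not how PopART performs the check on the device.

```cpp
#include <cassert>

// True if the value lies in the half-open integer range [-32, 32).
constexpr bool log2ScaleInRange(int log2Scale) {
  return log2Scale >= -32 && log2Scale < 32;
}

// Boundary cases, checked at compile time.
static_assert(log2ScaleInRange(-32), "-32 is inside the range");
static_assert(!log2ScaleInRange(32), "32 is outside the range");
```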

bool enableConstantFoldingOfMultipleConsumers = true

If set to false, disable constant folding on ops if any input has multiple consumers.

Default=true.

bool useLoopCandidateCreator = false

Use the loop candidate creator for a constant if one exists.

Default=false.

bool stashAllTensorsInferencePipeline = false

Stash all tensors when pipelining inference.

Default=false.

struct ExperimentalSettings

Public Members

std::map<std::string, std::vector<std::string>> customTransformApplierSettings

Custom transform applier settings.

Enable to insert custom transform sequence at predefined checkpoint. Multiple checkpoint names and transform names can be passed for different model configurations.

The predefined checkpoint names are:

FWD0: Initial IR immediately after lowering from ONNX to the IR.

FWD1: After the pre-alias patterns have been applied to FWD0.

BWD0: After growing the backward pass (including the optimiser step). Note this happens before optimiser decomposition, so the optimiser will appear as a single special op rather than the many ops that implement it.

PREALIAS: After pre-alias transforms have been applied to BWD0.

MAINLOOPS: After the MainLoops transform has been applied. This transform adds explicit loop ops to the IR for device iterations (batches per step) and gradient accumulation.

FINAL: The final IR after preparation.

The transform names are defined by PopART and users.

For example, to execute ‘Transform A’ and ‘Transform B’ at the ‘Fwd0’ checkpoint and execute ‘Transform C’ at the ‘Fwd1’ checkpoint:

{ "Fwd0": [ "Transform A", "Transform B" ], "Fwd1": [ "Transform C" ] }

Note

This setting is experimental for inference and may change.
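In C++, the same mapping can be built as a std::map from checkpoint name to a list of transform names. The sketch below uses the keys and placeholder transform names from the example above. The alias and function names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Same type as the customTransformApplierSettings member.
using TransformApplierSettings =
    std::map<std::string, std::vector<std::string>>;

// Builds the mapping from the example above. "Transform A/B/C" are
// placeholder names, not real PopART transforms.
inline TransformApplierSettings makeExampleSettings() {
  TransformApplierSettings settings;
  settings["Fwd0"] = {"Transform A", "Transform B"};
  settings["Fwd1"] = {"Transform C"};
  return settings;
}
```

The resulting map could then be assigned to the customTransformApplierSettings member of an ExperimentalSettings instance.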

bool createHostTransferableTensorWithOffset = false

Accumulate the created tensors bytes, rotate the start tile of the next tensor to balance the tile mapping.

This is especially useful when there are many small input tensors, because enabling it avoids always mapping them to tile 0.

Default=false.

class NumIOTiles

A wrapper class for the SessionOptions::numIOTiles option that permits any int value and has an ‘unassigned’ state.

Public Functions

NumIOTiles(): Constructor.

NumIOTiles(int numIOTiles)

Constructor.

Parameters: numIOTiles – The number of IPU tiles dedicated to IO.

bool operator==(const int &rhs) const: Compare with int.

operator int() const: Auto convert to int.

NumIOTiles &operator=(const int &x): Assign value using int.

struct TensorLocationSettings

A structure containing user configuration for cache/offloading settings.

Public Functions

TensorLocationSettings() = default: Constructor.

TensorLocationSettings(TensorLocation location_, int minElementsForOffChip_ = 2, int minElementsForReplicatedTensorSharding_ = 8192)

Constructor.

Parameters

location_ – The tensor location information.
minElementsForOffChip_ – The minimum number of elements below which offloading won’t be considered.
minElementsForReplicatedTensorSharding_ – The minimum number of elements necessary for replicated tensor sharding.

TensorLocationSettings(TensorStorage storage_, int minElementsForOffChip_ = 2, int minElementsForReplicatedTensorSharding_ = 8192)

Constructor.

Parameters

storage_ – The tensor storage information.
minElementsForOffChip_ – The minimum number of elements below which offloading won’t be considered.
minElementsForReplicatedTensorSharding_ – The minimum number of elements necessary for replicated tensor sharding.

Public Members

TensorLocation location = TensorLocation(): The default tensor location for this tensor type.

int minElementsForOffChip = 2: The minimum number of elements below which offloading won’t be considered.

int minElementsForReplicatedTensorSharding = 8192: A minimum number of elements below which replicated tensor sharding won’t be considered.
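The two thresholds act as lower bounds on tensor size. The sketch below models only that comparison, using a hypothetical stand-in struct. Whether a tensor is actually offloaded or sharded also depends on the TensorLocation, which is not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in holding only the two thresholds documented above,
// with the same default values.
struct Thresholds {
  int minElementsForOffChip = 2;
  int minElementsForReplicatedTensorSharding = 8192;
};

// A tensor with fewer elements than the threshold is not considered for
// offloading.
inline bool consideredForOffChip(const Thresholds &t, std::int64_t numElements) {
  return numElements >= t.minElementsForOffChip;
}

// A tensor with fewer elements than the threshold is not considered for
// replicated tensor sharding.
inline bool consideredForSharding(const Thresholds &t, std::int64_t numElements) {
  return numElements >= t.minElementsForReplicatedTensorSharding;
}
```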

#include <popart/variablesettings.hpp>

class VariableSettings

A class to dictate behaviour of variables and reductions of such across multiple graphs.

Public Functions

void verify(): Runs test to see if the VariableSettings are invalid, and throws an error if so.

const CommGroup getSharedVariableDomain() const

Returns: the CommGroup sharedVariableDomain of this VariableSettings.

ReplicaGrouping getReplicaGrouping(unsigned numReplicas) const

Parameters: numReplicas – The number of replicas in the IR this is used in.
Returns: the ReplicaGrouping domain of this VariableSettings.

bool isUsingCommGroup() const

Returns: whether the VariableSettings were initialised using a CommGroup or a stride.

CommGroupType getCommGroupType() const

Returns: the CommGroupType. The value of this is invalid if VariableSettings::isUsingCommGroup returns false.

unsigned getStride() const

Returns: the stride. The value of this is invalid if VariableSettings::isUsingCommGroup returns true.

unsigned getGroupSize() const

Returns: the replica group size.

inline VariableRetrievalMode getRetrievalMode() const

Returns: the VariableRetrievalMode retrievalMode of this VariableSettings.

VariableSettings(): “Default” constructor, defaults CommGroup to [All, 0] and retrievalMode to OnePerGroup.

VariableSettings(CommGroup sharedVariableDomain_): Defaults VariableRetrievalMode to OnePerGroup.

VariableSettings(VariableRetrievalMode retrievalMode_): Defaults CommGroup to [All, 0].

VariableSettings(CommGroup sharedVariableDomain_, VariableRetrievalMode retrievalMode_): Entirely custom VariableSettings.

VariableSettings(unsigned stride, unsigned groupSize)

VariableSettings(unsigned stride, unsigned groupSize, VariableRetrievalMode retrievalMode)

unsigned numReplicasReturningVariable(unsigned replicaCount) const

Calculate the number of replicas that will return this variable.

Parameters: replicaCount – Number of global replicas.
Returns: Number of variables returned.

unsigned getGroupCount(unsigned replicaCount) const

Parameters: replicaCount – The replicationFactor of the graph.
Returns: The number of groups given the replicaFactor and the VariableSettings.

unsigned getStride(unsigned replicaCount) const

Parameters: replicaCount – The replicationFactor of the graph.
Returns: The stride between each member of a group.

unsigned getRealGroupSize(unsigned replicaCount) const

Because a CommGroup does not have a defined group size if the type is All or None, this function returns a group size that is always accurate, based on the number of replicas.

Parameters: replicaCount – The replication factor
Returns: The actual number of replicas in a group

unsigned getGroupRepresentative(unsigned group) const

Get the default first member of a group.

Parameters: group – The group to return the representative for.
Returns: The representative replica of this group.

Shape shapeOnReplica(Shape full_shape, unsigned replicaCount, const TensorId name) const

The shape Onnx reads holds an extra outer dimension in certain cases, where the outer dimension represents the number of returning replica variables.

This function takes an ONNX full shape and removes the outer dimension safely (that is, it checks that the outer dimension matches the expected outer dimension). It is a helper function to avoid duplicate code.

Parameters

full_shape – The shape as presented by Onnx.
replicaCount – The local replication factor, used to calculate the return factor.
name – The TensorId of the function, used to give good error feedback.

Returns

The shape of the data on the replica.

Shape shapeOnHost(Shape replica_shape, unsigned replicaCount) const

Takes the shape of a tensor on a replica and returns its full ONNX shape.

This is the inverse operation to shapeOnReplica.

Parameters

replica_shape – The shape of the data on a replica.
replicaCount – The local replication factor, used to calculate the return factor.

Returns

The shape as presented by Onnx.
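The relationship between shapeOnReplica and shapeOnHost can be sketched with two free functions. This is not the PopART implementation. It assumes that the extra outer dimension is present exactly when more than one replica returns the variable, and it takes that count directly as a parameter (numReturned) instead of deriving it from the replica count. All names live in a hypothetical sketch namespace.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

namespace sketch {
using Shape = std::vector<std::int64_t>;

// Remove the outer dimension after checking that it matches the expected
// number of returned variables (assumed to be known by the caller).
inline Shape shapeOnReplica(const Shape &fullShape, std::int64_t numReturned) {
  assert(numReturned >= 1);
  if (numReturned == 1) {
    return fullShape; // Assumption: no extra outer dimension in this case.
  }
  if (fullShape.empty() || fullShape.front() != numReturned) {
    throw std::invalid_argument("outer dimension does not match expectation");
  }
  return Shape(fullShape.begin() + 1, fullShape.end());
}

// Inverse operation: prepend the outer dimension when it is needed.
inline Shape shapeOnHost(const Shape &replicaShape, std::int64_t numReturned) {
  assert(numReturned >= 1);
  if (numReturned == 1) {
    return replicaShape;
  }
  Shape full{numReturned};
  full.insert(full.end(), replicaShape.begin(), replicaShape.end());
  return full;
}
} // namespace sketch
```

With four returned variables, a replica shape of {3, 5} maps to a host shape of {4, 3, 5}, and applying sketch::shapeOnReplica to that result gives {3, 5} again.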

std::vector<std::vector<std::int64_t>> groups(unsigned replicaCount) const

This function returns a set of vectors where each vector contains all the replicaId’s of the replicas with a sharedVariableDomain given the variableSettings and the replicaCount.

Parameters: replicaCount – The local replication factor
Returns: A set of sets, such that set.at(a).at(b) is member number b of group a, set.size() is the number of groups and set.at(a).size() is the size of group a.

bool operator==(const VariableSettings &other) const

Compare two variable-settings.

Parameters: other – VariableSettings to compare these settings to.
Returns: True if all internal elements are the same

bool operator!=(const VariableSettings &other) const

Compare two variable-settings.

Parameters: other – VariableSettings to compare these settings to.
Returns: False if all internal elements are the same

enum class popart::VariableRetrievalMode

Enum type that describes how to retrieve variables from the replicas.

Each replica is in a group defined by the VariableSettings::sharedVariableDomain. Replicas within a group have variables initialized with the same values.

Values:

enumerator OnePerGroup = 0: Returns one variable per group (defined by the VariableSettings::sharedVariableDomain CommGroup), automatically returns the first replica of each group, where first means the one with the lowest replica ID.

enumerator AllReduceReplicas: As OnePerGroup, but performs an AllReduce among the replicas in the same group according to VariableSettings::sharedVariableDomain. Currently unsupported.

enumerator AllReplicas: Returns all replica Weights.
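To make the effect of the retrieval mode concrete, the sketch below counts how many copies of a variable would be returned for a given replica count and group size. It uses a stand-in enum in a hypothetical sketch namespace and takes the group size as an explicit parameter. Throwing for AllReduceReplicas is an assumption based on the mode being documented as unsupported.

```cpp
#include <cassert>
#include <stdexcept>

namespace sketch {
// Stand-in mirroring the documented enumerators (not the PopART enum).
enum class VariableRetrievalMode {
  OnePerGroup = 0,
  AllReduceReplicas,
  AllReplicas
};

// Number of variable copies returned, assuming groupSize divides replicaCount.
inline unsigned numReturned(VariableRetrievalMode mode,
                            unsigned replicaCount,
                            unsigned groupSize) {
  assert(groupSize > 0 && replicaCount % groupSize == 0);
  switch (mode) {
  case VariableRetrievalMode::OnePerGroup:
    return replicaCount / groupSize; // One copy per group.
  case VariableRetrievalMode::AllReplicas:
    return replicaCount; // One copy per replica.
  case VariableRetrievalMode::AllReduceReplicas:
    break;
  }
  throw std::invalid_argument("AllReduceReplicas is documented as unsupported");
}
} // namespace sketch
```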

#include <popart/commgroup.hpp>

class CommGroup

Class to specify sub-groups of replicas.

Examples of derived sub-groups:

IPU-link domain sub-rack:

type == Consecutive && replicaGroupSize == 64/replica-size/N

where N is a power of two and replicaGroupSize > 1.

Complete IPU-link domain / full rack:

type == Consecutive && replicaGroupSize == 64/replica-size

Using GW-links only:

type == Orthogonal && replicaGroupSize == numberOfIpuLinkDomains

Public Functions

CommGroup()

Default CommGroup constructor.

Sets type to CommGroupType::All and replicaGroupSize to 0.

inline CommGroup(CommGroupType type, unsigned groupSize)

Construct CommGroup.

Parameters

type – The replica group type.
groupSize – The replica group size.

explicit CommGroup(const ReplicaGrouping &grouping)

Construct CommGroup from a ReplicaGrouping.

Parameters: grouping – The replica grouping.

ReplicaGrouping toReplicaGrouping(unsigned numReplicas) const

Convert this CommGroup to a ReplicaGrouping.

Parameters: numReplicas – The number of replicas to pass to create the replica grouping with.
Returns: The replica grouping.

bool operator==(const CommGroup &other) const

bool operator!=(const CommGroup &other) const

Public Members

CommGroupType type = CommGroupType::All : Replica group type.

unsigned replicaGroupSize = 0: Replica group size.

enum class popart::CommGroupType

PopART equivalent of GCL CommGroupType.

Each of these enumeration constants has a corresponding GCL CommGroupType value.

Values:

enumerator All = 0: All replicas viewed as one group, replica group size is ignored.

enumerator Consecutive

Groups are consecutive in replicas.

If there are N replicas denoted {0, ... N-1} and the group size is k, then there are N/k groups of size k as {0, 1, ... k-1}, {k, ... 2k-1} ... {N-k, ... N-1}.

enumerator Orthogonal

Groups are sliced orthogonal to the replica ordering.

If there are N replicas denoted {0, ... N-1} and the group size is k, then there are m = N/k groups of size k as {0, m, 2m, ...}, {1, m+1, 2m+1, ...} ... {m-1, 2m-1, ... N-1}.

enumerator None: Each replica is in its own group; the replica group size is ignored.

enumerator N: Number of values.

14.2. Data input and output (IStepIO)

#include <popart/istepio.hpp>

class IStepIO

An abstract base class through which input and output data is passed to a Session (see Session::run).

Data is passed via buffers. In the case of buffers returned by IStepIO::in, PopART reads from these buffers. In the case of IStepIO::out, PopART writes to these buffers. The IStepIO::inComplete() and IStepIO::outComplete() functions are called by PopART to signal it is done with an input or output buffer.

An IStepIO implementation should conceptually implement a rolling queue of active buffers for each input and output tensor. Every successful call to IStepIO::in should yield a new data buffer for PopART to read from and add it to the head of the conceptual queue. Conversely, every call to IStepIO::inComplete() should be taken to mean that the buffer at the tail-end of the queue is no longer being used by PopART. This buffer is removed from the conceptual queue.

Note that an IStepIO::in call with the prefetch flag set is only considered successful when it returns data.

Output works analogously to input.

The expected total number of input (or output) buffers that are ‘completed’ for a tensor in one Session::run call is bps \(\times\) SessionOptions::accumulationFactor \(\times\) SessionOptions::replicatedGraphCount, where bps is the number of batches per call to Session::run (this is a value captured by the DataFlow instance passed to the Session instance).

Note, however, that there may be additional ‘incomplete’ calls to IStepIO::in and IStepIO::out.

Furthermore, the number of input (or output) buffers that may be ‘incomplete’ at a given time for a given tensor should not normally be more than SessionOptions::bufferingDepth \(\times\) SessionOptions::replicatedGraphCount, but this bound is not guaranteed.

EXAMPLE: Suppose a session is configured such that the total expected number of input buffers is 6 and these are input buffers for a tensor with ID t with 100 elements. The associated input calls in IStepIO may look like this if SessionOptions::bufferingDepth is 3:

in("t", 100, false) -> Give buffer[0] to PopART.
in("t", 100, true) -> Give buffer[1] to PopART.
in("t", 100, true) -> Give buffer[2] to PopART.
inComplete("t", 100) -> buffer[0] is no longer required and can be reused.
in("t", 100, true) -> Give buffer[3] to PopART.
inComplete("t", 100) -> buffer[1] is no longer required and can be reused.
in("t", 100, true) -> Give buffer[4] to PopART.
inComplete("t", 100) -> buffer[2] is no longer required and can be reused.
in("t", 100, true) -> Give buffer[5] to PopART.
inComplete("t", 100) -> buffer[3] is no longer required and can be reused.
in("t", 100, true) -> No data available, return nullptr.
inComplete("t", 100) -> buffer[4] is no longer required and can be reused.
inComplete("t", 100) -> buffer[5] is no longer required and can be reused.
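The call sequence above can be reproduced with a small model of the conceptual queue for one input tensor. The class below is a hypothetical stand-in and does not derive from IStepIO. It ignores the tensor ID, element count and prefetch flag, and simply hands out a fixed set of buffers in order.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical model of the rolling queue of active input buffers for a
// single tensor. Not a PopART class.
class RollingInputQueue {
public:
  RollingInputQueue(std::size_t numBuffers, std::size_t numElements)
      : buffers_(numBuffers, std::vector<float>(numElements)) {}

  // Models IStepIO::in: returns the next buffer and marks it active, or
  // returns nullptr when no more data is available.
  const float *in() {
    if (next_ == buffers_.size()) {
      return nullptr;
    }
    active_.push_back(next_); // Newest buffer joins the head of the queue.
    return buffers_[next_++].data();
  }

  // Models IStepIO::inComplete: the oldest active buffer (the tail of the
  // queue) is released. Returns the index of the released buffer.
  std::size_t inComplete() {
    assert(!active_.empty());
    const std::size_t index = active_.front();
    active_.pop_front();
    return index;
  }

  std::size_t numActive() const { return active_.size(); }

private:
  std::vector<std::vector<float>> buffers_;
  std::deque<std::size_t> active_;
  std::size_t next_ = 0;
};
```

Constructing RollingInputQueue(6, 100) and replaying the calls listed above releases buffers 0 to 5 in order, and never has more than three buffers active at once, which matches a bufferingDepth of 3.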

Subclassed by popart::StepIOCallback, popart::StepIOGeneric< ARRAY_TYPE, ACCESSOR_TYPE, ArrayInfoT >, popart::StepIOGeneric< IArray, StepIONS::IArrayAccessor, IArray & >

Public Functions

virtual ~IStepIO() = default: Destructor for IStepIO.

virtual ConstVoidData in(TensorId id, int64_t numElements, bool prefetch, const bool isBroadcast = false) = 0

Request a new input data buffer.

The memory in this buffer is available for use in PopART until the corresponding inComplete() call.

Note

Failing to provide a valid data buffer will result in a runtime failure if prefetch is set to false.

Parameters

id – The ID of the tensor to return data for.
numElements – The number of elements in the tensor.
prefetch – If set to true the inability to provide data is not considered an error. If false, it is considered an error if no data can be provided.

Returns

The input buffer for this tensor (or nullptr on failure) returned as a ConstVoidData object.

virtual void inComplete(TensorId id, int64_t numElements, const bool isBroadcast = false) = 0

Notify the user (running a PopART program) that a previously retrieved input data buffer is no longer used by PopART.

Parameters

id – The ID of the tensor to return data for.
numElements – The number of elements in the tensor.

virtual MutableVoidData out(TensorId id, int64_t numElements) = 0

Request a new output data buffer.

The memory in this buffer is available for use in PopART until the corresponding outComplete() call and will be modified in-place.

Note

Failing to provide a valid data buffer will result in a runtime failure.

Parameters

id – The ID of the tensor to return data for.
numElements – The number of elements in the tensor.

Returns

The output buffer for this tensor returned as a MutableVoidData object.

inline virtual void outComplete(TensorId)

Notify the user (running a PopART program) that a previously retrieved output data buffer is no longer used by PopART.

Parameters

id – The ID of the tensor whose output buffer is no longer in use.

inline void enableRuntimeAsserts(bool b)

Enable or disable runtime asserts.

If runtime asserts are enabled, then a check that the input and output buffers have the correct number of elements is performed. As Session::run() is called multiple times during a user’s session, the check is only performed in the first call to Session::run(), under the assumption that the user is unlikely to change the size of buffers between runs.

Parameters: b – The setting to enable runtime asserts (true) or disable runtime asserts (false).

inline bool runtimeAssertsEnabled() const

Check if runtime asserts are enabled.

Returns: true if runtime asserts are enabled, otherwise false.

virtual void assertNumElements(const popx::Executablex&) const = 0

Check number of elements.

This check is performed when runtimeAssertsEnabled() is true.

Parameters: Executablex – The input executable to be checked that the input and output buffers have the correct number of elements.

#include <popart/stepio.hpp>

class StepIO : public popart::StepIOGeneric<IArray, StepIONS::IArrayAccessor, IArray&>

Class to provide a Session object with input and output data.

Public Functions

inline StepIO(std::map<TensorId, IArray&> inputs, std::map<TensorId, IArray&> outputs)

Constructor for StepIO.

Parameters

inputs – The input data.
outputs – The output data.

class StepIOCallback : public popart::IStepIO 

Class that implements the IStepIO interface using user-provided callback functions.

The IStepIO interface contains a number of pure virtual member functions through which PopART receives buffers to read data from and buffers to write data to. StepIOCallback inherits from IStepIO and implements those member functions by delegating the logic to the callback functions passed in the constructor. This gives the user full control as to how data buffers are provisioned.

See IStepIO for more details on the expected behaviour of the callbacks.

Public Types

using InputCallback = std::function<ConstVoidData(TensorId, bool)>: Callable object that implements IStepIO::in().

using InputCompleteCallback = std::function<void(TensorId)>: Callable object that implements IStepIO::inComplete().

using OutputCallback = std::function<MutableVoidData(TensorId)>: Callable object that implements IStepIO::out().

using OutputCompleteCallback = std::function<void(TensorId)>: Callable object that implements IStepIO::outComplete().

Public Functions

inline StepIOCallback(InputCallback inputCallback, InputCompleteCallback inputCompleteCallback, OutputCallback outputCallback, OutputCompleteCallback outputCompleteCallback)

Construct a StepIOCallback object.

Parameters

inputCallback – The callback function the constructed StepIOCallback instance will use when IStepIO::in() is called. See IStepIO for details on how to implement this method.
inputCompleteCallback – The callback function the constructed StepIOCallback instance will use when IStepIO::inComplete() is called. See IStepIO for details on how to implement this method.
outputCallback – The callback function the constructed StepIOCallback instance will use when IStepIO::out() is called. See IStepIO for details on how to implement this method.
outputCompleteCallback – The callback function the constructed StepIOCallback instance will use when IStepIO::outComplete() is called. See IStepIO for details on how to implement this method.
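The four callbacks could be written along the following lines. Because this sketch is meant to compile on its own, TensorId, ConstVoidData and MutableVoidData are replaced by minimal stand-ins in a hypothetical sketch namespace. They are not the PopART definitions, and the Callbacks struct and makeCallbacks function are also hypothetical. Only the callable signatures match the aliases documented above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

namespace sketch {
// Stand-ins so the example is self-contained (NOT the PopART types).
using TensorId = std::string;
struct ConstVoidData { const void *data = nullptr; };
struct MutableVoidData { void *data = nullptr; };

// Same callable signatures as the StepIOCallback aliases.
using InputCallback = std::function<ConstVoidData(TensorId, bool)>;
using InputCompleteCallback = std::function<void(TensorId)>;
using OutputCallback = std::function<MutableVoidData(TensorId)>;
using OutputCompleteCallback = std::function<void(TensorId)>;

struct Callbacks {
  InputCallback in;
  InputCompleteCallback inComplete;
  OutputCallback out;
  OutputCompleteCallback outComplete;
};

// Builds callbacks over caller-owned buffers keyed by tensor ID. The maps
// must outlive the returned callbacks.
inline Callbacks makeCallbacks(std::map<TensorId, std::vector<float>> &inputs,
                               std::map<TensorId, std::vector<float>> &outputs) {
  Callbacks cb;
  cb.in = [&inputs](TensorId id, bool /*prefetch*/) {
    ConstVoidData result; // data stays nullptr when no input is available.
    auto it = inputs.find(id);
    if (it != inputs.end()) {
      result.data = it->second.data();
    }
    return result;
  };
  cb.inComplete = [](TensorId) { /* buffer may be reused from here on */ };
  cb.out = [&outputs](TensorId id) {
    MutableVoidData result;
    result.data = outputs.at(id).data(); // Throws if the ID is unknown.
    return result;
  };
  cb.outComplete = [](TensorId) { /* output buffer is ready to be read */ };
  return cb;
}
} // namespace sketch
```

With the real PopART types, callables of this shape would be passed to the StepIOCallback constructor in the order shown in its signature.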

inline virtual void assertNumElements(const popx::Executablex&) const

Check number of elements.

This check is performed when IStepIO::runtimeAssertsEnabled() is true.

Parameters: Executablex – The input executable to be checked that the input and output buffers have the correct number of elements.

virtual ConstVoidData in(TensorId id, int64_t numElements, bool prefetch, bool) final

This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the inputCallback parameter passed to the constructor.

This function should not be called directly.

virtual void inComplete(TensorId id, int64_t numElements, bool) final

This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the inputCompleteCallback parameter passed to the constructor.

This function should not be called directly.

virtual MutableVoidData out(TensorId id, int64_t numElements) final

This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the outputCallback parameter passed to the constructor.

This function should not be called directly.

virtual void outComplete(TensorId id) final

This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the outputCompleteCallback parameter passed to the constructor.

This function should not be called directly.

class IWeightsIO

A virtual class for accessing pointers to the data required to perform a training step.

Subclassed by popart::WeightsIO

Public Functions

virtual ~IWeightsIO() = default: Destructor for IWeightsIO.

virtual bool contains(TensorId) const = 0

Check if the WeightsIO instance contains the weights for a specific tensor.

Parameters: TensorId – The ID of the tensor to look for weights for.
Returns: true if the WeightsIO instance contains weights for the tensor, false otherwise.

virtual MutableVoidData weight(TensorId) const = 0

Retrieve weights for a specific tensor.

Parameters: TensorId – The ID of the tensor to retrieve weights for.
Returns: The weights.

class WeightsIO : public popart::IWeightsIO 

Class representing weights.

Public Functions

~WeightsIO() override = default: Destructor for WeightsIO.

virtual bool contains(TensorId) const final

Check if the WeightsIO instance contains the weights for a specific tensor.

Parameters: TensorId – The ID of the tensor to look for weights for.
Returns: true if the WeightsIO instance contains weights for the tensor, false otherwise.

virtual MutableVoidData weight(TensorId) const final

Retrieve weights for a specific tensor from the WeightsIO object.

Parameters: TensorId – The ID of the tensor to retrieve weights for.
Returns: The weights.

void insert(TensorId, MutableVoidData)

Insert weights for a specific tensor into the WeightsIO object.

Parameters

TensorId – The ID of the tensor to insert weights for.
MutableVoidData – The weights to insert.

struct IArrayAccessor

Structure to help with accessing the data in IArray objects.

Public Static Functions

static inline void *getDataPointer(IArray &array)

Get pointer to the data.

Parameters: array – The IArray object.
Returns: A pointer to the data contained in the IArray object.

static inline size_t getArraySize(const IArray &array)

Get the number of data elements.

Parameters: array – The IArray object.
Returns: The number of data elements.

static inline DataType getArrayDataType(IArray &array)

Get the data type of the data.

Parameters: array – The IArray object.
Returns: The data type of the data.

static inline size_t getArrayRank(IArray &array)

Get the rank of the data array.

Parameters: array – The IArray object.
Returns: The rank of the data array.

static inline int64_t getArrayDim(IArray &array, size_t index)

Get the size of the data at a specific location.

Parameters

array – The IArray object.
index – The index of the data element in the IArray object.

Returns

The size of the data at the specific location.

#include <popart/stepio_generic.hpp>

template<typename ARRAY_TYPE, typename ACCESSOR_TYPE, typename ArrayInfoT> class StepIOGeneric : public popart::IStepIO 

Subclassed by popart::StepIO

Public Functions

inline void assertNumElements(const popx::Executablex &exe) const final

inline TensorInfo getTensorInfo(ARRAY_TYPE &array) const

template<typename T> inline T get(TensorId id, std::map<TensorId, ArrayInfo> &M, int64_t numElements, bool advance_, std::string mapName)

template<typename T> inline void advance(TensorId id, std::map<TensorId, ArrayInfo> &M, int64_t numElements, std::string mapName)

inline ConstVoidData in(TensorId id, int64_t numElements, bool, bool) final

inline void inComplete(TensorId id, int64_t numElements, bool) final

inline MutableVoidData out(TensorId id, int64_t numElements) final

struct ArrayInfo

Public Members

ArrayInfoT array

int64_t offset

#include <popart/iarray.hpp>

class IArray

Subclassed by popart::NDArrayWrapper< T >

Public Functions

inline virtual ~IArray()

virtual void *data() = 0

virtual DataType dataType() const = 0

virtual std::size_t rank() const = 0

virtual int64_t dim(size_t index) const = 0

virtual std::size_t nelms() const = 0

virtual const Shape shape() const = 0

14.3. Tensors

#include <popart/tensor.hpp>

class Tensor : public popart::Vertex

Public Functions

Tensor(TensorId, TensorType, Graph&, const DebugContext& = {})

Tensor(TensorId, VariableSettings, Graph&, const DebugContext& = {})

Tensor(TensorId, TensorType, VariableSettings, Graph&, const DebugContext& = {})

inline std::string str() const final

virtual std::unique_ptr<Tensor> clone(Graph &graph_) const

TensorType tensorType() const

std::string tensor_type() const

void setTensorType(TensorType)

inline ReplicatedStreamMode getReplicatedStreamMode() const

inline void setReplicatedStreamMode(const ReplicatedStreamMode &mode)

void setTensorLocationInfo(TensorLocation&, std::pair<RemoteBufferId, RemoteBufferIndex> &remoteBufferInfo)

std::set<PipelineStage> getPipelineStages() const

Op *getProducerUnsafe() const

Op *getProducer() const

void setProducer(Op*)

void resetProducer(Op*)

bool hasProducer() const

bool isGraphInput() const

InIndex getGraphInputIndex() const

bool isGraphOutput() const

OutIndex getGraphOutputIndex() const

bool isLoopInput() const

bool isImplicitLoopInput() const

bool isExplicitLoopInput() const

bool isLoopTripCounter() const

bool isUnmodifiable() const

bool isCheckpointTensor() const

bool isImplicitRecomputeTensor() const

bool isRestoreInplaceTensor() const

bool idIncludesPrefix(const std::vector<std::string>&) const

bool isOptimizerTensor() const

bool isRemoteArgTensor() const

bool isRandomSeedTensor() const

bool isOptimizerStateTensor() const

bool isAccumulatorTensor() const

bool isHostLoadTensor() const

Is this tensor produced by a HostLoad Op or MultiExchangeOp with HostLoad descriptor?

Returns: true if the producer is a HostLoad Op or a MultiExchangeOp with a HostLoad descriptor, false otherwise.

bool isWeightTensor() const

bool isAnchored() const

bool isRootAnchor() const

bool hasTensorData() const

TensorData *tensorData()

const TensorData *tensorData() const

bool anyAlias(std::function<bool(Tensor*)> predicate) const

bool anyAliasFor(std::function<bool(Tensor*)> predicate, const AliasModel &popMem) const

void setTensorDataFromCopyOf(const void *src, std::size_t size)

void setTensorDataFromViewOf(void *src, std::size_t size)

void setTensorDataByEmplaceOf(std::vector<char> &&data)

void setTensorData(const TensorData &td)

void setTensorData(TensorData &&td)

std::vector<Op*> associatedOps() const

inline Graph &getGraph()

inline const Graph &getGraph() const

Ir &getIr()

const Ir &getIr() const

bool hasVirtualGraphId() const

VGraphId getVirtualGraphId() const

VGraphId getVirtualGraphIdUnsafe() const

VGraphIdAndTileSet getVirtualGraphIdAndTileSet(std::set<OpId> &visited) const

VGraphIdAndTileSet getVirtualGraphIdAndTileSetUnsafe() const

VGraphIdAndTileSet getVirtualGraphIdAndTileSetUnsafe(std::set<OpId> &visited) const

int getBatchAxis() const

bool consumersAllPreLoss() const

bool isModified(bool considerLoopInput = true) const

Check if any of the consumers modify this tensor.

Parameters: considerLoopInput – If explicit loop inputs should be considered as being modified. If false, only operations modifying the tensor inplace will be considered.
Returns: True if the tensor is modified, otherwise false.

bool isAliased() const

Check if any of the consumers alias this tensor.

Returns: True if the tensor is aliased to any output, otherwise false.

view::Regions modifiedRegionsByOps(std::vector<Op*> ops, Aliases &aliases) const

view::Regions modifiedRegionsByOps(std::vector<OpId> opIds, Aliases &aliases) const

std::set<Op*, POpCmp> getInplaceModifiers() const

Find operations that modify a tensor.

Returns: All operations that (directly or indirectly) modify this tensor.

std::set<Op*, POpCmp> getInplaceModifiersFor(const AliasModel *popMem) const

Find operations that modify a tensor with the given poprithm graph.

Returns: All operations that (directly or indirectly) modify this tensor.

std::vector<char> getDataViaGraphTraversal() const

inline const popart::DebugInfo &getDebugInfo() const

inline void setVariableUpdateType(VariableUpdateType type): Member of the old subclass VariableTensor (class VariableTensor : public Tensor).

inline VariableUpdateType getVariableUpdateType() const

inline void setCopyFromTensor(TensorId value)

inline TensorId getCopyFromTensor()

inline VariableSettings getVariableSettings() const

Returns: The VariableSettings of this Variable

std::vector<int64_t> returnedShape(unsigned replicationFactor)

Returns the shape necessitated by IO.

Parameters: replicationFactor – The replication factor
Returns: the shape of the tensor, considering replica groups

void verifyMutableVoidInfo(const TensorInfo mutableVoidInfo, unsigned replicationFactor)

Check that the info of a mutableVoidData object matches the expectations set by the TensorInfo and VariableSettings.

Throws an error if there is a mismatch.

Parameters

mutableVoidInfo – The data of the MutableVoidInfo with the same id as this tensor
replicationFactor – The replicationFactor of this instance

void setPreparedVGraphIdAndTileSet(): Set the preparedVGraphIdAndTileSet.

Public Members

TensorId id

Consumers consumers

TensorInfo info

TensorLocationInfo tensorLocationInfo

InputSettings inputSettings

enum class popart::TensorType

Values:

enumerator ActGrad = 0

enumerator Const

enumerator Stream

enumerator Unknown

enumerator Variable

enumerator N

enum class popart::VariableUpdateType

Values:

enumerator None = 0

enumerator Gradient

enumerator Copy

#include <popart/tensorinfo.hpp>

enum class popart::DataType

There is a one-to-one correspondence between popart::DataTypes and ONNX_NAMESPACE::TensorProto_DataTypes, which is equivalent to decltype(ONNX_NAMESPACE::TensorProto().data_type()).

Values:

enumerator UINT8 = 0

enumerator INT8

enumerator FLOAT8_143

enumerator FLOAT8_152

enumerator UINT16

enumerator INT16

enumerator INT32

enumerator INT64

enumerator UINT32

enumerator UINT64

enumerator BOOL

enumerator FLOAT

enumerator FLOAT16

enumerator BFLOAT16

enumerator DOUBLE

enumerator COMPLEX64

enumerator COMPLEX128

enumerator STRING

enumerator UNDEFINED

class DataTypeInfo

Public Functions

DataTypeInfo(DataType type__, int nbytes__, bool isFixedPoint__, std::string name__, std::string lcasename__)

DataType type() const

const int &nbytes() const

const std::string &name() const

const std::string &lcasename() const

bool isFixedPoint() const

class TensorInfo

Public Functions

TensorInfo(DataType, const Shape&)

Create TensorInformation based on data type and shape.

Parameters

data_type – The data type.
shape – The actual shape of the tensor.

TensorInfo(DataType data_type, const Shape &shape, const Shape &meta_shape)

Create TensorInformation based on data type, shape and meta shape.

Parameters

data_type – The data type.
shape – The actual shape of the tensor.
meta_shape – The meta shape of the tensor, which can for example be used to store the original tensor shape before replicated tensor sharding was applied.

TensorInfo(std::string data_type, std::string shape)

TensorInfo(std::string data_type, const Shape&)

explicit TensorInfo(const ONNX_NAMESPACE::TensorProto&)

explicit TensorInfo(const ONNX_NAMESPACE::TypeProto&)

void set(const ONNX_NAMESPACE::TensorProto&)

void set(const ONNX_NAMESPACE::TypeProto&)

TensorInfo() = default

void set(DataType)

void set(DataType, const Shape&)

void set(DataType, const Shape&, const Shape&)

const Shape &shape() const

const Shape &metaShape() const

std::vector<size_t> shape_szt() const

inline Rank rank() const

inline int64_t nelms() const

int64_t nbytes() const

inline int64_t dim(int i) const

inline std::vector<int> strides(const std::vector<long> &shape)

Get the strides of the tensor, that is the number of bytes to step in each dimension when traversing an array in memory.

See https://numpy.org/doc/stable/reference/generated/numpy.ndarray.strides.html

Parameters: shape – The on-host ONNX shape of a tensor. This is different from this->shape(), which gives the on-replica shape of a tensor
Returns: std::vector<int> The strides vector.
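The stride definition above can be illustrated with a minimal sketch. It assumes a contiguous, row-major layout (as in the linked NumPy page); `rowMajorByteStrides` is a hypothetical helper written for this illustration and is not part of PopART.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a PopART function): byte strides of a contiguous,
// row-major array with the given shape and element size in bytes.
std::vector<int64_t> rowMajorByteStrides(const std::vector<int64_t> &shape,
                                         int64_t elementBytes) {
  assert(elementBytes > 0);
  std::vector<int64_t> strides(shape.size());
  int64_t step = elementBytes;
  // The innermost dimension steps by one element; each outer dimension steps
  // by the full extent of everything inside it.
  for (std::size_t i = shape.size(); i-- > 0;) {
    strides[i] = step;
    step *= shape[i];
  }
  return strides;
}
```

Under these assumptions, a 4-byte element type with shape {2, 3, 4} gives strides {48, 16, 4}.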

DataType dataType() const

const std::string &data_type() const

const std::string &data_type_lcase() const

void append(std::ostream&) const

bool isSet() const

bool operator==(const TensorInfo&) const

bool operator!=(const TensorInfo&) const

Shape shapeFromString(const std::string &s) const

ONNX_NAMESPACE::TypeProto getOnnxTypeProto() const

const DataTypeInfo *getDataTypeInfo() const

Public Static Functions

static std::string npOutDataTypeExceptionMessage(const TensorInfo &i0, const TensorInfo &i1, const std::string &debugName)

#include <popart/tensorindex.hpp>

class TensorIndexMap

Public Functions

TensorIndexMap() = default

~TensorIndexMap()

void insert(int, Tensor*)

void reset(int, Tensor*)

void erase(int)

void clear()

bool contains(Tensor*) const

Tensor *tensor(int)

const Tensor *tensor(int) const

TensorId id(int) const

bool hasIndex(int) const

const std::vector<int> &indices(Tensor*) const

const std::map<Tensor*, std::vector<int>, PTensorCmp> &indicesMap() const

const std::map<int, Tensor*> &tensorMap() const

const std::vector<Tensor*> tensors() const

std::map<int, TensorId> tensorIdMap() const

std::map<TensorId, int> idMap() const

int n() const

void append(std::stringstream&, std::string prefix, int max_id_length) const

void setInfoIfIndex(const TensorInfo&, int index)

std::vector<TensorId> getSerialised() const

int maxIdLength() const

std::map<int, Shape> getIndexShapeMap()

int minIndex() const

int maxIndex() const

#include <popart/tensorlocation.hpp>

enum class popart::ReplicatedTensorSharding

Enum type to specify whether to shard tensors over replicas.

Values:

enumerator Off = 0: Don’t shard tensors over replicas.

enumerator On = 1: Do shard tensors over replicas.

enumerator N = 2: Number of values.

class TensorLocation

Class that describes the memory characteristics of one or multiple tensors.

14.4. Optimizers

#include <popart/optimizer.hpp>

class Optimizer

Interface for describing an Optimizer and, internally, how to grow the optimiser step for each weight.

The end-user facing interface constructed by the user to describe what kind of optimiser to use.
Then also used internally by the Ir to grow the optimiser step for each weight.
Stores OptimizerValues for optimizer parameters like learning rate, loss scaling, etc.

See also

OptimizerValue.

Optimizer stores the values for each weight, and these can differ between weights. There is a “default” for all weights, then you can specify specific values for specific weights. This is encapsulated by an OptimizerValueMap, which is a sparse map from weight to value, with unspecified values implying the default.

See also

OptimizerValueMap.

At runtime, the user can dynamically update the Optimizer, for example by setting new OptimizerValues. validReplacement determines whether the new Optimizer is interchangeable with the one the Ir was built for. For example, trying to replace an SGD Optimizer with an Adam Optimizer would throw.

Subclassed by popart::Adam, popart::Adaptive, popart::SGD

Public Functions

virtual ~Optimizer() = default

The Optimizer class has a two-part initialisation: the constructor, used by the end-user, and setFactorsFromOptions, called by the Ir to finish initialisation once all the relevant information is available during Ir preparation.
Some key methods used by the Ir to grow the optimiser step for each weight are createOp, getInputIds and getOptimizerInputs.
If the OptimizerValue is const, no Ir tensor for that value is created and the VarUpdateOp created for that weight will not have the optional input for that tensor. The Opx of the VarUpdateOp will emit poplar code that uses the provided value directly.

If the OptimizerValue is not const, an Ir tensor for that value is created and the VarUpdateOp created for that weight will have the optional input for that tensor. The tensor will be a stream tensor, so that it can be updated later from host. The tensor will be streamed an initial value of the OptimizerValue’s value.
It is common for Optimizer implementations to make use of “compound scalars”. Take for example the SGD0 weight update equation: w <- w * (1 - lr * (1 - dm) * wd) - g * (lr * (1 - dm) / ls), where w is the weights and g is the grads. lr, dm, wd and ls are all the “atomic scalars”. These are the scalars/hyperparameters of the Optimizer that the user can set using OptimizerValues, as described above.

Multiple atomic scalars appear in expressions together, and will be operated on together before being used by an Op that also consumes a tensor (in this case the weights or grads). For SGD0, they can be grouped as follows:
```
w <- w * {1 -  lr * (1 - dm) * wd} -  g * { lr * (1 - dm) / ls }
         ^^^^^^^^^^^^^^^^^^^^^^^^^        ~~~~~~~~~~~~~~~~~~~~~~
                    |                               |
   weight decay scale factor 0                      |
                                           scaled learning rate 0
```
We call wdsf0 and slr0 the “compound scalars”.

We can statically precompute the OptimizerValues for these compound scalars using the OptimizerValues of the atomic scalars. This makes the Ir simpler, as we now have only:
```
w <- w * wdsf0 - g * slr0
```
The CompoundScalarHelpers are used to precompute the compound scalar values.

If any of the constituent atomic scalars is non-const, the compound scalar is non-const.

See also

compoundscalarhelper.hpp
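The precomputation of the SGD0 compound scalars described above can be sketched as follows. This is an illustration only: `Scalar` and `Sgd0Atomics` are hypothetical stand-ins for OptimizerValue and its grouping, and the functions do not reproduce the actual CompoundScalarHelper interface.

```cpp
#include <cassert>

// Hypothetical stand-in for an OptimizerValue: a value plus a const flag.
struct Scalar {
  float val;
  bool isConst;
};

// Atomic scalars of SGD0 (names follow the equation in the text).
struct Sgd0Atomics {
  Scalar lr, wd, dm, ls;
};

// Weight decay scale factor 0: 1 - lr * (1 - dm) * wd.
// It is const only if every atomic scalar it depends on is const.
Scalar wdsf0(const Sgd0Atomics &a) {
  return {1.0f - a.lr.val * (1.0f - a.dm.val) * a.wd.val,
          a.lr.isConst && a.dm.isConst && a.wd.isConst};
}

// Scaled learning rate 0: lr * (1 - dm) / ls.
Scalar slr0(const Sgd0Atomics &a) {
  assert(a.ls.val != 0.0f);
  return {a.lr.val * (1.0f - a.dm.val) / a.ls.val,
          a.lr.isConst && a.dm.isConst && a.ls.isConst};
}

// Simplified update using only the compound scalars:
// w <- w * wdsf0 - g * slr0
float sgd0Step(float w, float g, const Sgd0Atomics &a) {
  return w * wdsf0(a).val - g * slr0(a).val;
}
```

In this sketch a non-const learning rate makes both compound scalars non-const, which mirrors the rule that a single non-const atomic scalar makes the compound scalar non-const.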

Optimizer(OptimizerValue lossScaling, const std::vector<ClipNormSettings> &clipNormSettings, const DebugContext &debugContext)

Optimizer(const Optimizer&) = default

virtual void validReplacement(const Optimizer &other) const

virtual OptimizerType type() const = 0

virtual std::string type_s() const = 0

virtual std::unique_ptr<Optimizer> clone() const = 0

virtual void resetTensorData(Tensor&) const = 0

virtual void setTensorData(Tensor&) const = 0

virtual std::unique_ptr<Op> createOp(const Tensor &weight, Graph&) const = 0

virtual std::vector<TensorId> getInputIds(const Tensor &weight) const = 0

Returns the TensorIds of the input tensors to the VarUpdateOp this optimiser will create for the given weight.

Specifically, the TensorId at index i will be the id of the input tensor at InIndex i of the VarUpdateOp. If the input is an OptimizerValue and it is const, then “” will be returned; otherwise the relevant reserved prefix for that OptimizerValue will be used, followed by the weight id. The prefixes are defined in tensornames.hpp, for example reservedDefaultWeightDecayScaleFactor0Prefix or reservedSpecificScaledLearningRate1Prefix (note there are different prefixes depending on whether the weight has a specific or default value for that OptimizerValue).
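The naming rule described above might be sketched like this. The prefix strings and the `optimizerInputId` function are placeholders invented for the illustration; the real reserved prefixes live in tensornames.hpp and are not reproduced here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an OptimizerValue.
struct Scalar {
  float val;
  bool isConst;
};

// Placeholder prefixes for this sketch only; the real reserved prefixes are
// defined in tensornames.hpp.
const std::string kDefaultPrefix  = "exampleDefaultScaledLearningRate0___";
const std::string kSpecificPrefix = "exampleSpecificScaledLearningRate0___";

// Returns the id of the optimizer input tensor for one weight:
// empty for a const value, otherwise prefix + weight id.
std::string optimizerInputId(const Scalar &value,
                             bool weightHasSpecific,
                             const std::string &weightId) {
  assert(!weightId.empty());
  if (value.isConst) {
    return "";
  }
  return (weightHasSpecific ? kSpecificPrefix : kDefaultPrefix) + weightId;
}
```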

virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const = 0

inline const OptimizerValue &lossScaling() const

inline float getLossScalingVal() const

float getFinalLossScalingVal() const

virtual TensorId getInverseLossScalingTensorId(const Tensor &weight) const = 0

virtual void setFactorsFromOptions(const SessionOptions&)

bool gradientAccumulationEnabled() const

bool meanReductionEnabled() const

bool postMeanAccumulationEnabled() const

bool postMeanReplicationEnabled() const

int64_t getReplicatedGraphCount() const

int64_t getAccumulationFactor() const

bool meanGradientAccumulationEnabled() const

inline const std::vector<ClipNormSettings> &getClipNormSettings() const

virtual bool hasSpecific(const Tensor &w) const = 0

virtual bool hasSpecific() const = 0

virtual size_t hash() const

inline DebugContext getDebugContext() const

Public Static Functions

static TensorId getLossScalingTensorId(DataType)

enum class popart::OptimizerType

Types of optimizers.

Values:

enumerator SGD = 0

enumerator Adam

enumerator Adaptive

enumerator NTYPES

enum class popart::OptimizerReductionType

Reduction mode when doing data-parallel training over replicated graphs.

Depending on the optimizer used and its configuration, this option describes how the reduction of gradients over replicas will occur. For example, directly on the gradient, on the gradient accumulator, or on the momentum. See the documentation of individual optimizers for more information.

Values:

enumerator None = 0: No replicated graph reduction.

enumerator GradReduce: Gradient reduction (every iteration, after a weight’s gradient is produced)

enumerator AcclReduce: Momentum reduction (SGD1, after the gradient accumulation loop, if applicable)

enumerator AccumReduce: Accumulator reduction (Adam/SGD2 + gradient accumulation, after the gradient accumulation loop)

enum class popart::WeightDecayMode

Values:

enumerator Decay: Weight decay (e.g. AdamW)

enumerator L2Regularization: L2 regularization (e.g. PyTorch-like Adam)

#include <popart/optimizervalue.hpp>

class OptimizerValue

A class used to represent values of hyper parameters.

Public Functions

OptimizerValue() = default: Equivalent to OptimizerValue(0, false).

inline OptimizerValue(float v): Equivalent to OptimizerValue(v, true).

inline OptimizerValue(float v, bool c)

Constructor.

Parameters

v – The current value of the hyper parameter.
c – A boolean flag to indicate whether the parameter will remain at this value forever (true) or may change over time (false).

inline OptimizerValue(std::pair<float, bool> x)

inline float val() const

inline bool isConst() const

void validReplacement(const OptimizerValue &rhs) const

bool operator==(const OptimizerValue &rhs) const

#include <popart/optimizervaluemap.hpp>

class OptimizerValueMap

Public Functions

inline OptimizerValueMap(OptimizerValue g)

OptimizerValue get(const TensorId &id) const

void insertSpecific(const TensorId&, OptimizerValue)

inline bool hasSpecific(const TensorId &id) const

inline bool hasSpecific() const

inline OptimizerValue getDefault() const

void validReplacement(const OptimizerValueMap &rhs) const

inline const std::map<TensorId, OptimizerValue> &getSpecifics() const
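The idea of a sparse map with a default value, as described for OptimizerValueMap in the Optimizer class notes, can be sketched in a few lines. `SparseValueMap` and `Scalar` are hypothetical stand-ins that only mimic the shape of the interface listed above; they are not the PopART implementation.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an OptimizerValue.
struct Scalar {
  float val;
  bool isConst;
};

// Sparse map from weight id to value; weights without a specific entry
// fall back to the default.
class SparseValueMap {
public:
  explicit SparseValueMap(Scalar defaultValue) : default_(defaultValue) {}

  void insertSpecific(const std::string &weightId, Scalar value) {
    assert(!weightId.empty());
    specifics_[weightId] = value;
  }

  bool hasSpecific(const std::string &weightId) const {
    return specifics_.count(weightId) != 0;
  }

  Scalar get(const std::string &weightId) const {
    auto it = specifics_.find(weightId);
    return it == specifics_.end() ? default_ : it->second;
  }

private:
  Scalar default_;
  std::map<std::string, Scalar> specifics_;
};
```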

14.4.1. Stochastic Gradient Descent (SGD)

#include <popart/clipnormsettings.hpp>

class ClipNormSettings

A data structure used to represent a maximum value constraint on one or more weights.

This is passed to the optimizer on construction.

Public Types

enum class Mode

Values:

enumerator ClipSpecifiedWeights

enumerator ClipAllWeights

Public Functions

ClipNormSettings(const std::vector<TensorId> &weightIds_, float maxNorm_)

DEPRECATED This will be removed from a future release.

Constructor.

Parameters

weightIds_ – The weight tensor IDs that this constraint applies to.
maxNorm_ – The maximum permissible value.

const std::vector<TensorId> &getWeightIds() const

float getMaxNorm() const

Mode getMode() const

bool operator==(const ClipNormSettings&) const

bool operator!=(const ClipNormSettings &other) const

Public Members

std::vector<TensorId> weightIds

float maxNorm

Public Static Functions

static ClipNormSettings clipWeights(const std::vector<TensorId> &weightIds_, float maxNorm_)

static ClipNormSettings clipAllWeights(float maxNorm_)

#include <popart/sgd.hpp>

class SGD : public popart::Optimizer 

Stochastic Gradient Descent (SGD) optimizer.

Like any optimizer implementation, this class is responsible for updating each weight tensor ( \(w\)) in the model using the gradient ( \(g\)) of the loss function with respect to the weight as calculated during the backwards pass.

The SGD optimizer has the following state for each weight:

velocity ( \(v\))

The SGD optimizer has the following hyper parameters:

learning rate ( \(\text{lr}\))
momentum ( \(\text{mm}\))
weight decay ( \(\text{wd}\))
dampening ( \(\text{dm}\))
velocity scaling ( \(\text{vs}\))
loss scaling ( \(\text{ls}\))
nesterov
clip norm settings

The values of these parameters can be shared between all weights but some can be overridden with weight-specific values (see SGD::insertSpecific). Hyper parameters are captured using OptimizerValue objects and therefore can be either a constant value or a non-constant value that can be adjusted by the user.

In the following we will describe how this optimizer updates a weight using a gradient. In the context of this description the gradient is the value of the gradient after any gradient accumulation has been performed and after the application of a loss scaling factor to the gradient has been corrected for.

When the optimizer needs to update a weight, \(w\), using a gradient, \(g\), it first updates the optimizer state as follows:

\[ v' := v * \text{mm} + (1 - \text{dm}) * (g + \text{wd} * w) \text{ \ . } \]

Following the update of the optimizer state the optimizer uses said state to update the weight:

if nesterov is True:

\[ g' := g + \text{wd} * w + \text{mm} * v' \text{ \ . } \]

\[ w' := w - \text{lr} * g' \text{ \ . } \]

else:

\[ w' := w - \text{lr} * v' \text{ \ . } \]
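The update equations above can be written out as a scalar sketch. It operates on a single weight value rather than a tensor, ignores velocity scaling, loss scaling and clip norms, and uses hypothetical names (`SgdHyper`, `SgdState`, `sgdStep`) that are not part of PopART.

```cpp
#include <cassert>

// Hyper parameters used by the equations in the text.
struct SgdHyper {
  float lr, mm, wd, dm;
  bool nesterov;
};

// One weight value together with its velocity state.
struct SgdState {
  float w, v;
};

// Applies one optimizer step to a single scalar weight.
SgdState sgdStep(SgdState s, float g, const SgdHyper &h) {
  assert(h.lr >= 0.0f); // sanity check for this sketch only
  const float gWd  = g + h.wd * s.w;                      // g + wd * w
  const float vNew = s.v * h.mm + (1.0f - h.dm) * gWd;    // v'
  float wNew;
  if (h.nesterov) {
    const float gPrime = gWd + h.mm * vNew;               // g'
    wNew = s.w - h.lr * gPrime;
  } else {
    wNew = s.w - h.lr * vNew;
  }
  return {wNew, vNew};
}
```

With w = 1, v = 0.5, g = 2, lr = 0.5, mm = 0.5 and wd = dm = 0, the sketch gives v' = 2.25 in both modes, w' = -0.125 without Nesterov momentum and w' = -0.5625 with it.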

In addition to the above, the velocity scaling hyper parameter is a scaling factor that can provide improved numerical stability by ensuring the values stored in the optimizer state, \(v\), are scaled by this value. When using this parameter PopART will automatically deal with the artificially scaled velocity value during the weight update, and other hyper parameters do not need to be adjusted.

In addition, the loss scaling hyper parameter is similar in nature to the velocity scaling parameter. It is a scaling value that is applied to the loss gradient at the start of the backwards pass and, at the end of the backwards pass, this scaling is reversed by multiplying the gradients for each weight with the inverse of the loss scaling value prior to updating the optimizer state. Using loss scaling can also improve numerical stability in some cases.
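As an illustration of why the scaling can be reversed exactly, the following sketch uses a toy one-weight model with loss 0.5 * (w * x - y)^2. The model and the function names are assumptions made for this example; they do not correspond to PopART code.

```cpp
#include <cassert>

// Toy backward pass: dLoss/dw = seed * (w * x - y) * x, where `seed` is the
// gradient fed in at the start of the backwards pass (1 without loss scaling).
float backward(float w, float x, float y, float seed) {
  return seed * (w * x - y) * x;
}

// Applies the loss scaling at the start of the backwards pass and reverses it
// at the end by multiplying with the inverse of the loss scaling value.
float scaledThenUnscaledGradient(float w, float x, float y, float ls) {
  assert(ls != 0.0f);
  const float scaled = backward(w, x, y, ls);
  return scaled * (1.0f / ls);
}
```

Because the backwards pass is linear in the seed, the unscaled result matches the gradient computed without loss scaling; the intermediate values are simply larger, which is where the numerical benefit comes from in reduced precision.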

Finally, it is possible to add clip norm settings for this optimizer. These clip norms compute the L2 norm for a group of weights and add a scalar term to the weight update that effectively divides it by the norm (or a constant value that is provided as part of the clip norm, whichever is greater).
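A common way to express this kind of clipping is a scale factor of maxNorm / max(norm, maxNorm). The sketch below uses that formulation as an assumption; the source text does not spell out the exact operations PopART emits, and `groupL2Norm` and `clipScale` are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// L2 norm computed jointly over a group of tensors (each a flat vector here).
float groupL2Norm(const std::vector<std::vector<float>> &group) {
  float sumSquares = 0.0f;
  for (const auto &tensor : group) {
    for (float x : tensor) {
      sumSquares += x * x;
    }
  }
  return std::sqrt(sumSquares);
}

// Scalar applied to the weight update: 1 when the norm is within the limit,
// maxNorm / norm when it exceeds the limit.
float clipScale(float norm, float maxNorm) {
  assert(maxNorm > 0.0f);
  return maxNorm / std::max(norm, maxNorm);
}
```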

See the SGD notes in optimizer.hpp for a more detailed and comprehensive derivation of the SGD optimizer step in PopART.

Subclassed by popart::ConstSGD

Public Functions

SGD(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultMomentum, OptimizerValue defaultDampening, OptimizerValue defaultVelocityScaling, OptimizerValue lossScaling, OptimizerValue nesterov, const std::vector<ClipNormSettings> &clipNormSettings = {}, SGDAccumulatorAndMomentum sgdAccMm = SGDAccumulatorAndMomentum::Combined, DataType accumType = DataType::UNDEFINED, DataType accl1Type = DataType::UNDEFINED, const DebugContext &debugContext = {})

Constructor.

See also

SGDAccumulatorAndMomentum. Defaults to SGDAccumulatorAndMomentum::Combined.

Parameters

defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameter has been inserted.
defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameter has been inserted.
defaultMomentum – The momentum value to use for weights for which no weight-specific hyper parameter has been inserted.
defaultDampening – The dampening value to use for weights for which no weight-specific hyper parameter has been inserted.
defaultVelocityScaling – The velocity scaling value to use for weights for which no weight-specific hyper parameter has been inserted.
lossScaling – The loss scaling value to use.
nesterov – Option to enable Nesterov momentum. Defaults to false.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
sgdAccMm – The implementation strategy to use when gradient accumulation and/or momentum are used, otherwise ignored.
accumType – The DataType of the accum tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
accl1Type – The DataType of the accl1 tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
debugContext – Optional debug context.

SGD(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultMomentum, OptimizerValue defaultDampening, OptimizerValue defaultVelocityScaling, OptimizerValue lossScaling, const std::vector<ClipNormSettings> &clipNormSettings = {}, SGDAccumulatorAndMomentum sgdAccMm = SGDAccumulatorAndMomentum::Combined, DataType accumType = DataType::UNDEFINED, DataType accl1Type = DataType::UNDEFINED, const DebugContext &debugContext = {})

Constructor.

See also

SGDAccumulatorAndMomentum. Defaults to SGDAccumulatorAndMomentum::Combined.

Parameters

defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameter has been inserted.
defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameter has been inserted.
defaultMomentum – The momentum value to use for weights for which no weight-specific hyper parameter has been inserted.
defaultDampening – The dampening value to use for weights for which no weight-specific hyper parameter has been inserted.
defaultVelocityScaling – The velocity scaling value to use for weights for which no weight-specific hyper parameter has been inserted.
lossScaling – The loss scaling value to use.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
sgdAccMm – The implementation strategy to use when gradient accumulation and/or momentum are used, otherwise ignored.
accumType – The DataType of the accum tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
accl1Type – The DataType of the accl1 tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
debugContext – Optional debug context.

SGD(const std::map<std::string, std::pair<float, bool>> &params, const std::vector<ClipNormSettings> &clipNormSettings = {}, SGDAccumulatorAndMomentum sgdAccMm = SGDAccumulatorAndMomentum::Combined, DataType accumType = DataType::UNDEFINED, DataType accl1Type = DataType::UNDEFINED, const DebugContext &debugContext = {})

Constructor.

EXAMPLE:

SGD({{"defaultLearningRate", {0.02, false}},
     {"defaultMomentum", {0.6, true}}});

This will create an SGD Optimizer which has a constant momentum of 0.6 and a changeable learning rate initially of 0.02. All OptimizerValues not present in the map will take values from the getUnset* functions.

See also

SGDAccumulatorAndMomentum. Defaults to SGDAccumulatorAndMomentum::Combined.

Parameters

params – A parameter map where the keys are one or more of "defaultLearningRate", "defaultWeightDecay", "defaultMomentum", "defaultDampening", "defaultVelocityScaling", "lossScaling" or "nesterov". The map’s values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter because default values will be used where parameters are missing.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
sgdAccMm – The implementation strategy to use when gradient accumulation and/or momentum are used, otherwise ignored.
accumType – The DataType of the accum tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
accl1Type – The DataType of the accl1 tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
debugContext – Optional debug context.
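How a parameter map with missing entries could be resolved is sketched below. The key list is taken from the params description; `ParamMap`, `isKnownSgdKey` and `valueOrFallback` are hypothetical names, and the fallback passed by the caller stands in for whatever the getUnset* functions return.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

using ParamMap = std::map<std::string, std::pair<float, bool>>;

// Keys listed in the params description of this constructor.
bool isKnownSgdKey(const std::string &key) {
  static const std::set<std::string> keys = {
      "defaultLearningRate", "defaultWeightDecay",     "defaultMomentum",
      "defaultDampening",    "defaultVelocityScaling", "lossScaling",
      "nesterov"};
  return keys.count(key) != 0;
}

// Returns the (value, isConst) pair for `key`, or `fallback` when the map
// does not mention that hyper parameter.
std::pair<float, bool> valueOrFallback(const ParamMap &params,
                                       const std::string &key,
                                       std::pair<float, bool> fallback) {
  assert(isKnownSgdKey(key));
  auto it = params.find(key);
  return it == params.end() ? fallback : it->second;
}
```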

inline SGD(): Default constructor Creates SGD with default scalars (equivalent to getUnset<scalar>() methods), and other default parameters of main constructor.

SGD(const SGD&) = default: Copy constructor.

~SGD() = default

inline virtual OptimizerType type() const final

inline virtual std::string type_s() const final

inline SGDAccumulatorAndMomentum getSGDAccumulatorAndMomentum() const

virtual std::unique_ptr<Optimizer> clone() const final

virtual std::unique_ptr<Op> createOp(const Tensor &weight, Graph&) const final

Returns the VarUpdateOp for the given weight.

If there is no gradient accumulation or momentum, this will be an SGD0VarUpdateOp. Else, if getSGDAccumulatorAndMomentum() == ::Combined, this will be an SGD1ComboOp; else, if getSGDAccumulatorAndMomentum() == ::Separate, an SGD2ComboOp.

The required compound scalar OptimizerValues for the VarUpdateOp will be computed and passed to the Op. See the SGD notes above this class for how they are derived. Recall that if non-const, the VarUpdateOp will take an input Tensor for the compound scalar.

See also

Optimizer::createOp

The OptimizerReductionType of the Op is derived as follows:

No replication => None
Replication, no grad acc => GradReduce
Replication, grad acc, SGD1 => AcclReduce
Replication, grad acc, SGD2 => AccumReduce

See the SGD notes above this class for why this is.

If SGD2, the DataType of the accum and accl1 tensors passed to the SGD2ComboOp will be as set in the SGD constructor. Recall DataType::UNDEFINED means use the same as the weight.

An SGD1ComboOp will later be decomposed by the SGD1Decompose pattern into a series of Ops and Tensors that implement the SGD1 optimiser step.

An SGD2ComboOp will later be decomposed by the SGD2Decompose pattern into a series of Ops and Tensors that implement the SGD2 optimiser step.

See also

SGD1Decompose, SGD2Decompose
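The selection rules for the VarUpdateOp variant and its reduction type can be sketched as two small functions. The enums here are local stand-ins, and the sketch assumes that SGD2 corresponds to SGDAccumulatorAndMomentum::Separate, as the accumType and accl1Type parameter descriptions suggest.

```cpp
#include <cassert>

// Local stand-ins for the PopART enums named in the text.
enum class AccMm { Combined, Separate };
enum class Variant { SGD0, SGD1, SGD2 };
enum class Reduction { None, GradReduce, AcclReduce, AccumReduce };

// SGD0 when neither gradient accumulation nor momentum is used; otherwise
// SGD1 for Combined and SGD2 for Separate.
Variant chooseVariant(bool gradAcc, bool momentum, AccMm mode) {
  if (!gradAcc && !momentum) {
    return Variant::SGD0;
  }
  return mode == AccMm::Combined ? Variant::SGD1 : Variant::SGD2;
}

// Reduction type following the four cases listed in the text.
Reduction chooseReduction(bool replicated, bool gradAcc, Variant variant) {
  if (!replicated) {
    return Reduction::None;
  }
  if (!gradAcc) {
    return Reduction::GradReduce;
  }
  // With gradient accumulation the variant cannot be SGD0.
  assert(variant != Variant::SGD0);
  return variant == Variant::SGD1 ? Reduction::AcclReduce
                                  : Reduction::AccumReduce;
}
```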

virtual std::vector<TensorId> getInputIds(const Tensor &weight) const final

See also

Optimizer::getInputIds

virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const final: smm1 and wdsf0 have the same data type as the weight . Everything else

virtual void validReplacement(const Optimizer &other) const final

virtual void resetTensorData(Tensor&) const final

virtual void setTensorData(Tensor&) const final

float getStoredValue(const TensorId &optId) const: Tensor “opt” has an id, which it uses to match a compound scalar which this object can compute from the atomic scalars.

void insertSpecific(const TensorId &weight, OptimizerValue learningRate, OptimizerValue weightDecay, OptimizerValue momentum, OptimizerValue dampening, OptimizerValue velocityScaling, OptimizerValue nesterov)

Insert a weight-specific set of hyper parameters.

Parameters

weight – The TensorId of the weight.
learningRate – The learning rate value to use for this specific weight.
weightDecay – The weight decay value to use for this specific weight.
momentum – The momentum value to use for this specific weight.
dampening – The dampening value to use for this specific weight.
velocityScaling – The velocity scaling value to use for this specific weight.
nesterov – Option to enable Nesterov momentum. Defaults to false.

void insertSpecific(const TensorId &weight, const std::map<std::string, std::pair<float, bool>> &params)

Insert a weight-specific set of hyper parameters.

Parameters

weight – The TensorId of the weight.
params – A parameter map where the keys are one or more of "learningRate", "weightDecay", "momentum", "dampening", or "velocityScaling" and the map’s values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.

virtual bool hasSpecific(const Tensor &w) const final

virtual bool hasSpecific() const final

virtual TensorId getInverseLossScalingTensorId(const Tensor &weight) const

inline const OptimizerValueMap &learningRates() const

inline const OptimizerValueMap &weightDecays() const

inline const OptimizerValueMap &momentums() const

inline const OptimizerValueMap &dampenings() const

inline const OptimizerValueMap &velocityScalings() const

inline const OptimizerValueMap &nesterov() const

virtual size_t hash() const

Public Static Functions

static inline OptimizerValue getUnsetLearningRate(): Default learning rate value.

static inline OptimizerValue getUnsetWeightDecay(): Default weight decay value.

static inline OptimizerValue getUnsetMomentum(): Default momentum value.

static inline OptimizerValue getUnsetDampening(): Default dampening value.

static inline OptimizerValue getUnsetVelocityScaling(): Default velocity scaling value.

static inline OptimizerValue getUnsetLossScaling(): Default loss scaling value.

static inline OptimizerValue getUnsetNesterov(): Default nesterov.

static SGD fromDefaultMap(const std::map<std::string, OptimizerValue>&, const DebugContext &debugContext = {})

class ConstSGD : public popart::SGD 

Stochastic Gradient Descent (SGD) optimizer with constant learning rate, weight decay, loss scaling and clip norm settings (and default values for momentum, dampening or velocity scaling).

NOTE: See SGD for detailed meaning for these parameters.

NOTE: This class exists for backwards compatibility with the Python API and may be removed at some point in the future.

Public Functions

inline ConstSGD(float learningRate, float weightDecay = 0, float lossScaling = 1, const std::vector<ClipNormSettings> &clipNormSettings = {})

Constructor.

Parameters

learningRate – A constant learning rate.
weightDecay – A constant weight decay value.
lossScaling – A constant loss scaling value.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).

enum class popart::SGDAccumulatorAndMomentum

Strategy for implementing SGD with momentum and/or gradient accumulation.

Values:

enumerator Combined = 0: Implement SGD using a single tensor for the gradient accumulator (accum) and momentum (accl) tensors.

enumerator Separate: Implement SGD using separate tensors for the gradient accumulator (accum) and momentum (accl) tensors.

14.4.2. Adam, AdaMax & Lamb

#include <popart/adam.hpp>

enum class popart::AdamMode

Enum type describing the mode of an Adam optimizer instance.

Values:

enumerator Adam = 0: Adam or AdamW mode, depending on weight decay setting (see Kingma & Ba, 2015 and Loshchilov & Hutter, 2018).

enumerator AdamNoBias: Like Adam but without bias correction.

enumerator AdaMax: Adamax mode.

enumerator Lamb: Lamb mode (see You et al., 2020).

enumerator LambNoBias: Like Lamb but without bias correction.

class Adam : public popart::Optimizer 

AdamW, Lamb and AdaMax optimizer implementation.

Like any optimizer implementation, this class is responsible for updating each weight tensor ( \(w\)) in the model using the gradient ( \(g\)) of the loss function with respect to the weight as calculated during the backwards pass.

The optimizer has the following state for each weight:

first-order momentum ( \(m\))
second-order momentum ( \(v\))
time step ( \(t\))

The optimizer has the following hyper parameters:

learning rate ( \(\text{lr}\))
weight decay ( \(\text{wd}\))
beta1 ( \(\beta_1\))
beta2 ( \(\beta_2\))
epsilon ( \(\epsilon\))
loss scaling ( \(\text{ls}\))
maximum weight norm ( \(\text{mwn}\))

The values of these parameters can be shared between all weights but some can be overridden with weight-specific values (see Adam::insertSpecific). Hyper parameters are captured using OptimizerValue objects and therefore can be either a constant value or a non-constant value that can be adjusted by the user.

The values of AdamMode and WeightDecayMode passed to the constructor determine how weights are updated (see below).

In the following we will describe how this optimizer updates a weight using a gradient. In the context of this description the gradient is the value of the gradient after any gradient accumulation has been performed and after the application of a loss scaling factor to the gradient has been corrected for.

When the optimizer needs to update a weight, \(w\), using a gradient, \(g\), it first computes a term \(g_\text{tmp}\), which is effectively \(g\) with L2 regularization applied if the WeightDecayMode is set to WeightDecayMode::L2Regularization, as follows:

\[\begin{split} g_\text{tmp} := \left\{\begin{aligned} g & \text{ \; (Decay) } \\ (g + \text{wd} * w) & \text{ \; (L2Regularization) \; . } \\ \end{aligned}\right.\\ \end{split}\]

Secondly, the optimizer updates the optimizer state as follows:

\[\begin{split} m' &:= \beta_1 * m + (1 - \beta_1) * g_\text{tmp} \\ v' &:= \left\{\begin{aligned} \beta_2 * v + (1 - \beta_2) * g_\text{tmp}^2 & \text{ \; (Adam/AdamNoBias) } \\ \beta_2 * v + (1 - \beta_2) * g_\text{tmp}^2 & \text{ \; (Lamb/LambNoBias) } \\ \text{max}(\beta_2 * v, |g_\text{tmp}|) & \text{ \; (AdaMax) } \\ \end{aligned}\right.\\ t' &:= t + 1 \\ \end{split}\]

Next, it computes the following terms:

\[\begin{split} m_\text{tmp} &:= \left\{\begin{aligned} m' & \text{ \; (AdamNoBias/LambNoBias) } \\ \frac{m'}{(1 - \beta_1^{t'})} & \text{ \; (Adam/Lamb/AdaMax) } \\ \end{aligned}\right.\\ v_\text{tmp} &:= \left\{\begin{aligned} v' & \text{ \; (AdamNoBias/LambNoBias) } \\ \frac{v'}{(1 - \beta_2^{t'})} & \text{ \; (Adam/Lamb/AdaMax) } \\ \end{aligned}\right.\\ u_\text{tmp} &:= \left\{\begin{aligned} \frac{m_\text{tmp}}{(\sqrt{v_\text{tmp}} + \epsilon)} + \text{wd} * w &\text{ \; (Decay) } \\ \frac{m_\text{tmp}}{(\sqrt{v_\text{tmp}} + \epsilon)} &\text{ \; (L2Regularization) } \\ \end{aligned}\right. \end{split}\]

Finally, the optimizer updates the weight as follows:

\[\begin{split} w' := \left\{\begin{aligned} w - \text{lr} * u_\text{tmp} &\text{ \; (Adam/AdamNoBias/AdaMax) } \\ w - \biggl(\frac{\text{min}(\lVert{w}\rVert, \text{mwn})}{\lVert{u_\text{tmp}}\rVert}\biggr) * \text{lr} * u_\text{tmp} &\text{ \; (Lamb/LambNoBias) } \\ \end{aligned}\right. \end{split}\]
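The update rules above can be transcribed into a short stand-alone sketch. This is not the PopART implementation and does not use any PopART types: SketchMode, SketchDecay, AdamHyper, AdamState and adamStep are illustrative names, and the code simply follows the equations as written (including the square root in the AdaMax branch). It assumes the gradient has already had loss scaling removed and that \(\lVert u_\text{tmp} \rVert\) is non-zero in the Lamb modes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Local stand-ins for illustration only; these are not the popart:: enums.
enum class SketchMode { Adam, AdamNoBias, AdaMax, Lamb, LambNoBias };
enum class SketchDecay { Decay, L2Regularization };

struct AdamHyper { float lr, wd, beta1, beta2, eps, mwn; };

// Per-weight state: first/second-order momentum and time step.
struct AdamState {
  std::vector<float> m, v;
  int t = 0;
};

inline float l2Norm(const std::vector<float> &x) {
  float s = 0.0f;
  for (float e : x) s += e * e;
  return std::sqrt(s);
}

// Applies one update to w in place, following the equations above.
inline void adamStep(std::vector<float> &w, const std::vector<float> &g,
                     AdamState &s, const AdamHyper &h,
                     SketchMode mode, SketchDecay decay) {
  assert(w.size() == g.size());
  assert(s.m.size() == w.size() && s.v.size() == w.size());
  const bool noBias =
      mode == SketchMode::AdamNoBias || mode == SketchMode::LambNoBias;
  const bool lamb = mode == SketchMode::Lamb || mode == SketchMode::LambNoBias;

  s.t += 1; // t' := t + 1
  const float c1 = noBias ? 1.0f : 1.0f - std::pow(h.beta1, static_cast<float>(s.t));
  const float c2 = noBias ? 1.0f : 1.0f - std::pow(h.beta2, static_cast<float>(s.t));

  std::vector<float> u(w.size());
  for (std::size_t i = 0; i < w.size(); ++i) {
    // g_tmp: weight decay is folded into the gradient for L2 regularization.
    const float gTmp =
        decay == SketchDecay::L2Regularization ? g[i] + h.wd * w[i] : g[i];
    // m' and v'.
    s.m[i] = h.beta1 * s.m[i] + (1.0f - h.beta1) * gTmp;
    s.v[i] = mode == SketchMode::AdaMax
                 ? std::max(h.beta2 * s.v[i], std::fabs(gTmp))
                 : h.beta2 * s.v[i] + (1.0f - h.beta2) * gTmp * gTmp;
    // m_tmp, v_tmp (bias correction) and u_tmp.
    u[i] = (s.m[i] / c1) / (std::sqrt(s.v[i] / c2) + h.eps);
    if (decay == SketchDecay::Decay) u[i] += h.wd * w[i];
  }

  // Lamb rescales the step by min(||w||, mwn) / ||u_tmp||.
  float scale = h.lr;
  if (lamb) scale *= std::min(l2Norm(w), h.mwn) / l2Norm(u);
  for (std::size_t i = 0; i < w.size(); ++i) w[i] -= scale * u[i];
}
```

On the first step with bias correction enabled and zero weight decay, \(u_\text{tmp}\) is approximately the sign of the gradient, so a weight of 1.0 with gradient 0.5 and learning rate 0.1 moves to roughly 0.9 in Adam mode.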

In addition to the above, the loss scaling hyper parameter is similar in nature to the velocity scaling parameter of the SGD optimizer. It is a scaling value that is applied to the loss gradient at the start of the backwards pass and, at the end of the backwards pass, this scaling is reversed by multiplying the gradients for each weight with the inverse of the loss scaling value prior to updating the optimizer state. Using loss scaling can also improve numerical stability of the gradient calculations. If scaledOptimizerState is enabled then the lossScaling will not be removed before updating the optimizer state. This can improve the numerical stability when accl1Type is set to FLOAT16.
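The relationship between scaled and unscaled optimizer state can be checked with a small numerical sketch. It is an illustration under stated assumptions, not PopART code: MomentState, updateState, toScaledState and nearlyEqual are hypothetical names, and only the Adam/Lamb form of the state update is modelled. Because the first-order update is linear in the gradient and the second-order update is quadratic, feeding a gradient that still carries the loss scaling factor keeps the two state tensors scaled by lossScaling and lossScaling^2 respectively.

```cpp
#include <cassert>
#include <cmath>

// First- and second-order momentum state for one scalar weight
// (illustrative, not a PopART type).
struct MomentState {
  float accl1;
  float accl2;
};

// State update from the equations above (Adam/Lamb form of v').
inline MomentState updateState(MomentState s, float g, float beta1, float beta2) {
  assert(beta1 >= 0.0f && beta1 < 1.0f);
  assert(beta2 >= 0.0f && beta2 < 1.0f);
  return {beta1 * s.accl1 + (1.0f - beta1) * g,
          beta2 * s.accl2 + (1.0f - beta2) * g * g};
}

// Converts unscaled state into the representation that carries loss scaling:
// accl1 is multiplied by lossScaling, accl2 by lossScaling^2.
inline MomentState toScaledState(MomentState s, float lossScaling) {
  return {s.accl1 * lossScaling, s.accl2 * lossScaling * lossScaling};
}

// Relative comparison helper for the check below.
inline bool nearlyEqual(float a, float b) {
  return std::fabs(a - b) <= 1e-5f * std::fmax(1.0f, std::fabs(b));
}
```

Updating the unscaled state with gradient g and then converting gives the same result, up to rounding, as converting first and updating with g * lossScaling. This is why pre-existing optimizer state needs to be rescaled when it is loaded into a model that keeps its state scaled.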

NOTE: The maximum weight norm is referred to as \(\phi\) in You et al., 2020.

Public Functions

virtual bool hasSpecific(const Tensor &w) const final

virtual bool hasSpecific() const final

virtual TensorId getInverseLossScalingTensorId(const Tensor &weight) const final

Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, OptimizerValue maxWeightNorm, AdamMode adamMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})

Constructor.

Parameters

defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultBeta1 – The beta1 value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultBeta2 – The beta2 value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultEps – The epsilon value to use for weights for which no weight-specific hyper parameters have been inserted.
lossScaling – The loss scaling value to use.
maxWeightNorm – The maxWeightNorm value to use.
adamMode – The AdamMode value to use.
weightDecayMode – The WeightDecayMode value to use.
accumType – Data type to use for gradient accumulation.
accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.
accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
scaledOptimizerState – Experimental option. Does not remove lossScaling before updating the optimizer state. This should have no effect on the update equation. However, it does ensure a more numerically stable implementation when accl1Type is set to DataType::FLOAT16. Note: When loading a model that includes initialised optimizer state, ensure that accl1 and accl2 are scaled by lossScaling and lossScaling^2 respectively.
debugContext – Optional debug context.

Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, AdamMode adamMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})

Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, OptimizerValue maxWeightNorm, AdamMode adamMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})

Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, AdamMode adamMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})

Adam(const std::map<std::string, std::pair<float, bool>> &params, AdamMode adamMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})

Constructor.

EXAMPLE:

Adam({{"defaultLearningRate", {0.02, false}},
      {"defaultBeta1", {0.9, true}},
      {"defaultBeta2", {0.999, true}}},
      AdamMode::Adam,
      WeightDecayMode::Decay,
      DataType::FLOAT,
      DataType::FLOAT,
      DataType::FLOAT);

Parameters

params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultBeta1", "defaultBeta2", "defaultEps", "lossScaling" or "maxWeightNorm", and the map’s values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify every hyper parameter, as default values are used where parameters are missing.
adamMode – The AdamMode value to use.
weightDecayMode – The WeightDecayMode value to use.
accumType – Data type to use for gradient accumulation.
accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.
accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
scaledOptimizerState – Experimental option. Does not remove lossScaling before updating the optimizer state. This should have no effect on the update equation. However, it does ensure a more numerically stable implementation when accl1Type is set to DataType::FLOAT16. Note: When loading a model that includes initialised optimizer state, ensure that accl1 and accl2 are scaled by lossScaling and lossScaling^2 respectively.
debugContext – Optional debug context.

Adam(const Adam&) = default

~Adam() = default

inline virtual OptimizerType type() const final

inline virtual std::string type_s() const final

virtual std::unique_ptr<Optimizer> clone() const final

virtual std::unique_ptr<Op> createOp(const Tensor &weight, Graph&) const final

virtual std::vector<TensorId> getInputIds(const Tensor &weight) const final

The names of the inputs for the VarUpdateOp for the Variable Tensor “weight”.

In the returned vector, an empty string (“”) is used as a placeholder for constant inputs.

virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const final: The names and infos of the optimizer tensors.

virtual void validReplacement(const Optimizer &other) const final

virtual void resetTensorData(Tensor&) const final

virtual void setTensorData(Tensor&) const final

float getStoredValue(const TensorId &optId) const: Tensor “opt” has an id, based on which it matches a compound scalar which this object can compute from the atomic scalars.

void insertSpecific(const TensorId &weight, OptimizerValue learningRate, OptimizerValue weightDecay, OptimizerValue beta1, OptimizerValue beta2, OptimizerValue eps, OptimizerValue mwn)

Insert a weight-specific set of hyper parameters.

Parameters

weight – The TensorId of the weight.
learningRate – The learning rate value to use for this specific weight.
weightDecay – The weight decay value to use for this specific weight.
beta1 – The beta1 value to use for this specific weight.
beta2 – The beta2 value to use for this specific weight.
eps – The epsilon value to use for this specific weight.
mwn – The max weight norm value to use for this specific weight.

void setStep(int64_t step)

void setStep(const TensorId&, int64_t step)

void setStep(std::map<TensorId, int64_t> steps)

void insertSpecific(const TensorId &weight, const std::map<std::string, std::pair<float, bool>> &params)

Insert a weight-specific set of hyper parameters.

Parameters

weight – The TensorId of the weight.
params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultBeta1", "defaultBeta2", "defaultEps", "lossScaling" or "maxWeightNorm" and the map’s values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify every hyper parameter, as default values are used where parameters are missing.

inline const OptimizerValueMap &learningRates() const

inline const OptimizerValueMap &weightDecays() const

inline const OptimizerValueMap &beta1s() const

inline const OptimizerValueMap &beta2s() const

inline const OptimizerValueMap &epss() const

inline const OptimizerValueMap &maxWeightNorms() const

inline const WeightDecayMode &getWeightDecayMode() const

inline bool useScaledOptimizerState() const

virtual size_t hash() const final

virtual void setFactorsFromOptions(const SessionOptions&) final

Public Static Functions

static inline OptimizerValue getUnsetLearningRate(): Default learning rate value.

static inline OptimizerValue getUnsetWeightDecay(): Default weight decay value.

static inline OptimizerValue getUnsetBeta1(): Default beta1 value.

static inline OptimizerValue getUnsetBeta2(): Default beta2 value.

static inline OptimizerValue getUnsetEps(): Default epsilon value.

static inline OptimizerValue getUnsetLossScaling(): Default loss scaling value.

static inline OptimizerValue getUnsetMaxWeightNorm(): Default maximum weight norm value.

static Adam fromDefaultMap(const std::map<std::string, OptimizerValue>&, AdamMode adamMode_, WeightDecayMode decayMode_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, const DebugContext &debugContext = {})

14.4.3. AdaDelta, RMSProp & AdaGrad

#include <popart/adaptive.hpp>

enum class popart::AdaptiveMode

Enum class representing a type of adaptive optimizer.

Values:

enumerator AdaGrad = 0: AdaGrad optimizer.

enumerator RMSProp: RMSProp optimizer.

enumerator CenteredRMSProp: CenteredRMSProp optimizer.

enumerator AdaDelta: AdaDelta optimizer.

class Adaptive : public popart::Optimizer 

AdaDelta, RMSProp and AdaGrad optimizer implementation.

Like any optimizer implementation, this class is responsible for updating each weight tensor ( \(w\)) in the model using the gradient ( \(g\)) of the loss function with respect to the weight as calculated during the backwards pass.

The optimizer has the following state for each weight:

first-order momentum ( \(v_1\))
second-order momentum ( \(v_2\)) (only for CenteredRMSProp/AdaDelta)
third-order momentum ( \(v_3\))

The optimizer has the following hyper parameters:

learning rate ( \(\text{lr}\))
weight decay ( \(\text{wd}\))
alpha ( \(\alpha\))
momentum ( \(\text{m}\))
epsilon ( \(\epsilon\))
loss scaling ( \(\text{ls}\))

The values of these parameters can be shared between all weights but some can be overridden with weight-specific values (see Adaptive::insertSpecific). Hyper parameters are captured using OptimizerValue objects and therefore can be either a constant value or a non-constant value that can be adjusted by the user.

The values of AdaptiveMode and WeightDecayMode passed to the constructor determine how weights are updated (see below).

In the following we will describe how this optimizer updates a weight using a gradient. In the context of this description the gradient is the value of the gradient after any gradient accumulation has been performed and after the application of a loss scaling factor to the gradient has been corrected for.

When the optimizer needs to update a weight, \(w\), using a gradient, \(g\), it first computes a term \(g_\text{tmp}\), which is effectively \(g\) with L2 regularization applied if the WeightDecayMode is set to WeightDecayMode::L2Regularization, as follows:

\[\begin{split} g_\text{tmp} := \left\{\begin{aligned} g & \text{ \; (Decay) } \\ (g + \text{wd} * w) & \text{ \; (L2Regularization) \; . } \\ \end{aligned}\right.\\ \end{split}\]

Secondly, the optimizer updates the optimizer state \(v_1\) as follows:

\[\begin{split} v_1' &:= \left\{\begin{aligned} \alpha * v_1 + (1 - \alpha) * g_\text{tmp}^2 & \text{ \; (RMSProp/AdaDelta) } \\ \alpha * v_1 + (1 - \alpha) * g_\text{tmp}^2 & \text{ \; (CenteredRMSProp) } \\ v_1 + g_\text{tmp}^2 & \text{ \; (AdaGrad) } \\ \end{aligned}\right.\\ \end{split}\]

Next, \(v_2\) is updated, but only for CenteredRMSProp:

\[\begin{split} v_2' &:= \alpha * v_2 + (1 - \alpha) * g_\text{tmp} \text{ \; (CenteredRMSProp) } \\ \end{split}\]

Next, it computes the update term \(u_\text{tmp}\):

\[\begin{split} u_\text{tmp} &:= \left\{\begin{aligned} \frac{g_\text{tmp}}{\sqrt{v_1'} + \epsilon} & \text{ \; (AdaGrad/RMSProp) } \\ \frac{g_\text{tmp}}{\sqrt{v_1' - v_2'^2} + \epsilon} & \text{ \; (CenteredRMSProp) } \\ \frac{g_\text{tmp} * \sqrt{v_2 + \epsilon}}{\sqrt{v_1' + \epsilon}} & \text{ \; (AdaDelta) } \\ \end{aligned}\right. \end{split}\]

Next, \(v_2\) is updated, but only for AdaDelta:

\[\begin{split} v_2' := \alpha * v_2 + (1 - \alpha) * u_\text{tmp}^2 \text{ \; (AdaDelta) } \\ \end{split}\]

Next, the third-order momentum \(v_3\) is updated for all modes, using the momentum hyper parameter \(\text{m}\):

\[ v_3' := \text{m} * v_3 + u_\text{tmp} \]

Finally, the optimizer updates the weight as follows:

\[\begin{split} w' := \left\{\begin{aligned} w - \text{lr} * (v_3' + \text{wd} * w) &\text{ \; (Decay) } \\ w - \text{lr} * v_3' &\text{ \; (L2Regularization) } \\ \end{aligned}\right. \end{split}\]
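The sequence of updates above can be followed step by step in a stand-alone scalar sketch. It is not the PopART implementation: SketchMode, SketchDecay, AdaptiveHyper, AdaptiveState and adaptiveStep are illustrative names. The sketch assumes that the running average in the first state update is taken over \(v_1\), that the gradient has already had loss scaling removed, and that the denominators are non-zero.

```cpp
#include <cassert>
#include <cmath>

// Local stand-ins for illustration only; these are not the popart:: enums.
enum class SketchMode { AdaGrad, RMSProp, CenteredRMSProp, AdaDelta };
enum class SketchDecay { Decay, L2Regularization };

struct AdaptiveHyper { float lr, wd, alpha, momentum, eps; };

// Per-weight optimizer state for a single scalar weight.
struct AdaptiveState {
  float v1 = 0.0f;
  float v2 = 0.0f;
  float v3 = 0.0f;
};

// Returns the updated weight w' and updates the state in place.
inline float adaptiveStep(float w, float g, AdaptiveState &s,
                          const AdaptiveHyper &h,
                          SketchMode mode, SketchDecay decay) {
  assert(h.eps >= 0.0f);
  // g_tmp: weight decay is folded into the gradient for L2 regularization.
  const float gTmp =
      decay == SketchDecay::L2Regularization ? g + h.wd * w : g;

  // v1': accumulated (AdaGrad) or exponentially averaged squared gradient.
  if (mode == SketchMode::AdaGrad) {
    s.v1 += gTmp * gTmp;
  } else {
    s.v1 = h.alpha * s.v1 + (1.0f - h.alpha) * gTmp * gTmp;
  }

  float u = 0.0f;
  switch (mode) {
  case SketchMode::CenteredRMSProp:
    // v2' is the running mean of the gradient, updated before u_tmp.
    s.v2 = h.alpha * s.v2 + (1.0f - h.alpha) * gTmp;
    u = gTmp / (std::sqrt(s.v1 - s.v2 * s.v2) + h.eps);
    break;
  case SketchMode::AdaDelta:
    // u_tmp uses the old v2; v2' is then updated from u_tmp.
    u = gTmp * std::sqrt(s.v2 + h.eps) / std::sqrt(s.v1 + h.eps);
    s.v2 = h.alpha * s.v2 + (1.0f - h.alpha) * u * u;
    break;
  default: // AdaGrad and RMSProp
    u = gTmp / (std::sqrt(s.v1) + h.eps);
    break;
  }

  // v3' accumulates the update term with momentum.
  s.v3 = h.momentum * s.v3 + u;

  // Decoupled weight decay is applied here in Decay mode.
  return decay == SketchDecay::Decay ? w - h.lr * (s.v3 + h.wd * w)
                                     : w - h.lr * s.v3;
}
```

For example, with zero state, zero weight decay, zero momentum, epsilon 0 and learning rate 0.1, a weight of 1.0 with gradient 2.0 moves to about 0.9 in AdaGrad mode, and to about 0.8 in RMSProp mode with alpha 0.75.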

In addition to the above, the loss scaling hyper parameter is similar in nature to the velocity scaling parameter of the SGD optimizer. It is a scaling value that is applied to the loss gradient at the start of the backwards pass and, at the end of the backwards pass, this scaling is reversed by multiplying the gradients for each weight with the inverse of the loss scaling value prior to updating the optimizer state. Using loss scaling can also improve numerical stability in some cases.

Public Functions

virtual bool hasSpecific(const Tensor &w) const

virtual bool hasSpecific() const

virtual TensorId getInverseLossScalingTensorId(const Tensor &weight) const

Adaptive(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultAlpha, OptimizerValue defaultMomentum, OptimizerValue defaultEps, OptimizerValue lossScaling, AdaptiveMode adaptiveMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, DataType accl3Type, bool rmspropTFVariant = false, const DebugContext &debugContext = {})

Constructor.

Parameters

defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultAlpha – The alpha value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultMomentum – The momentum value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultEps – The epsilon value to use for weights for which no weight-specific hyper parameters have been inserted.
lossScaling – The loss scaling value to use.
adaptiveMode – The AdaptiveMode value to use.
weightDecayMode – The WeightDecayMode value to use.
accumType – Data type to use for gradient accumulation.
accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.
accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.
accl3Type – Data type to use for tensor that stores third-order momentum optimizer state.
debugContext – Optional debug context.

Adaptive(const std::map<std::string, std::pair<float, bool>> &params, AdaptiveMode adaptiveMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, DataType accl3Type, bool rmspropTFVariant = false, const DebugContext &debugContext = {})

Constructor.

EXAMPLE:

Adaptive({{"defaultLearningRate", {0.02, false}},
          {"defaultAlpha", {0.99, true}}},
         AdaptiveMode::RMSProp,
         WeightDecayMode::Decay,
         DataType::FLOAT,
         DataType::FLOAT,
         DataType::FLOAT,
         DataType::FLOAT);

Parameters

params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultAlpha", "defaultMomentum", "defaultEps" or "lossScaling", and the map’s values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify every hyper parameter, as default values are used where parameters are missing.
adaptiveMode – The AdaptiveMode value to use.
weightDecayMode – The WeightDecayMode value to use.
accumType – Data type to use for gradient accumulation.
accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.
accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.
accl3Type – Data type to use for tensor that stores third-order momentum optimizer state.
debugContext – Optional debug context.

Adaptive(const Adaptive&) = default

~Adaptive() = default

inline virtual OptimizerType type() const final

inline virtual std::string type_s() const final

virtual std::unique_ptr<Optimizer> clone() const final

virtual std::unique_ptr<Op> createOp(const Tensor &weight, Graph&) const final

virtual std::vector<TensorId> getInputIds(const Tensor &weight) const final

The names of the inputs for the VarUpdateOp for the Variable Tensor “weight”.

In the returned vector, an empty string (“”) is used as a placeholder for constant inputs.

virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const final: The names and infos of the optimizer tensors.

virtual void validReplacement(const Optimizer &other) const final

virtual void resetTensorData(Tensor&) const final

virtual void setTensorData(Tensor&) const final

float getStoredValue(const TensorId &optId) const: Tensor “opt” has an id, based on which it matches a compound scalar which this object can compute from the atomic scalars.

void insertSpecific(const TensorId &weight, OptimizerValue learningRate, OptimizerValue weightDecay, OptimizerValue alpha, OptimizerValue momentum, OptimizerValue eps)

Insert a weight-specific set of hyper parameters.

Parameters

weight – The TensorId of the weight.
learningRate – The learning rate value to use for this specific weight.
weightDecay – The weight decay value to use for this specific weight.
alpha – The alpha value to use for this specific weight.
momentum – The momentum value to use for this specific weight.
eps – The epsilon value to use for this specific weight.

void setStep(int64_t step)

void setStep(const TensorId&, int64_t step)

void setStep(std::map<TensorId, int64_t> steps)

void insertSpecific(const TensorId &weight, const std::map<std::string, std::pair<float, bool>> &params)

Insert a weight-specific set of hyper parameters.

Parameters

weight – The TensorId of the weight.
params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultAlpha", "defaultMomentum", "defaultEps" or "lossScaling" and the map’s values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify every hyper parameter, as default values are used where parameters are missing.

inline const OptimizerValueMap &learningRates() const

inline const OptimizerValueMap &weightDecays() const

inline const OptimizerValueMap &alphas() const

inline const OptimizerValueMap &momentums() const

inline const OptimizerValueMap &epss() const

virtual size_t hash() const

Public Static Functions

static inline OptimizerValue getUnsetLearningRate(): Default learning rate value.

static inline OptimizerValue getUnsetWeightDecay(): Default weight decay value.

static inline OptimizerValue getUnsetAlpha(): Default alpha value.

static inline OptimizerValue getUnsetMomentum(): Default momentum value.

static inline OptimizerValue getUnsetEps(): Default epsilon value.

static inline OptimizerValue getUnsetLossScaling(): Default loss scaling value.

static Adaptive fromDefaultMap(const std::map<std::string, OptimizerValue>&, AdaptiveMode adaptiveMode_, WeightDecayMode decayMode_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, DataType accl3Type_, const DebugContext &debugContext = {})

14.5. Builder

#include <popart/builder.hpp>

class Builder

An interface for a Builder, used for creating ONNX graphs.

ONNX defines a specification for describing graphs and serialising them as protobuf files. This class provides a builder interface for creating such a graph.

Note, in ONNX, all Ops belong to an “Opset”. The Builder itself does not have methods for creating Ops in the ONNX graph, but instead has accessors to Opsets, like AiGraphcoreOpset1, which contain the methods for creating Ops in the graph.

Public Functions

Builder &createSubgraphBuilder(): Create a builder for a graph which is nested inside this builder’s graph.

~Builder(): Destructor for the Builder class.

TensorId addInputTensor(const TensorInfo &tensorInfo, const popart::DebugContext &debugContext = {})

Add a new input tensor to the model.

Parameters

tensorInfo – The shape and data type of the input tensor.
debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addInputTensor(const std::string &dataType, const Shape &shape, const popart::DebugContext &debugContext = {})

Add a new input tensor to the model.

Parameters

dataType – The data type of the input tensor.
shape – The shape of the input tensor.
debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addInputTensor(const TensorInfo &tensorInfo, const InputSettings &settings, const popart::DebugContext &debugContext = {})

Add a new input tensor to the model.

Parameters

tensorInfo – The shape and data type of the input tensor.
settings – The InputSettings for TileSet and ExchangeStrategy.
debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addInputTensor(const std::string &dataType, const Shape &shape, const InputSettings &settings, const popart::DebugContext &debugContext = {})

Add a new input tensor to the model.

Parameters

dataType – The data type of the input tensor.
shape – The shape of the input tensor.
settings – The InputSettings for TileSet and ExchangeStrategy.
debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addUntypedInputTensor(const popart::DebugContext &debugContext = {})

Add a new input tensor without a type or shape to the model.

Parameters: debugContext – Optional debug information.
Returns: The tensor id of the input tensor.

void addInputTensorFromParentGraph(const TensorId &tensorId)

Add a new named input tensor (from the parent graph) to the model.

Parameters: tensorId – The identifier string of the input tensor. This identifier must already exist in the name scope of the parent GraphProto and must appear topologically before this sub-graph.

TensorId addInitializedInputTensor(const ConstVoidData &initData, const popart::DebugContext &debugContext = {})

Add a new pre-initialized input tensor to the model.

Parameters

initData – The initial data of the input tensor.
debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

TensorId addInitializedInputTensor(const ConstVoidData &initData, const VariableSettings &variableSettings, const popart::DebugContext &debugContext = {})

Add a new pre-initialized input tensor to the model.

Parameters

initData – The initial data of the input tensor.
variableSettings – The settings that determine how variables are retrieved from replicas.
debugContext – Optional debug information.

Returns

The tensor id of the input tensor.

void addOutputTensor(const TensorId &arg0)

Add an output tensor from a node in the graph into the list of output tensors.

Parameters: arg0 – The tensor id of the output tensor to be added.

inline AiOnnxOpset6 aiOnnxOpset6(): Return the builder interface for ai.onnx opset 6.

inline AiOnnxOpset7 aiOnnxOpset7(): Return the builder interface for ai.onnx opset 7.

inline AiOnnxOpset8 aiOnnxOpset8(): Return the builder interface for ai.onnx opset 8.

inline AiOnnxOpset9 aiOnnxOpset9(): Return the builder interface for ai.onnx opset 9.

inline AiOnnxOpset10 aiOnnxOpset10(): Return the builder interface for ai.onnx opset 10.

inline AiOnnxOpset11 aiOnnxOpset11(): Return the builder interface for ai.onnx opset 11.

inline AiOnnxMlOpset1 aiOnnxMlOpset1(): Return the builder interface for ai.onnx.ml opset 1.

inline AiGraphcoreOpset1 aiGraphcoreOpset1(): Return the builder interface for ai.graphcore opset 1.

std::vector<TensorId> customOp(const OperatorIdentifier &opid, int opsetVersion, const std::vector<TensorId> &inputs, const unsigned numOutputs, const std::map<std::string, popart::any> &attributes, const DebugContext &debugContext = {})

Return the output tensors from a custom op added to the model.

Parameters

opid – The id of the operator.
opsetVersion – The version of the opset.
inputs – A vector of input tensor ids.
numOutputs – The number of output tensors.
attributes – The map of attributes and their values to be added.
debugContext – Optional debug information.

Returns

The output tensors.

void customOp(const OperatorIdentifier &opid, int opsetVersion, const std::vector<TensorId> &inputs, const std::vector<TensorId> &outputs, const std::map<std::string, popart::any> &attributes, const DebugContext &debugContext = {})

Add a custom op to the model.

Parameters

opid – The id of the operator.
opsetVersion – The version of the opset.
inputs – A vector of input tensor ids.
outputs – The tensor ids of the output tensors.
attributes – The map of attributes and their values to be added.
debugContext – Optional debug information.

template<class T> inline TensorId reshape_const(T &t, const std::vector<TensorId> &args, const std::vector<int64_t> &shape, const std::string &name = {})

Add a constant and reshape a tensor using the provided domain.

Parameters

t – The builder interface.
args – The tensor ids of the tensors to be updated.
shape – The shape information to be used.
name – (Optional) The name of the updated tensor. Default: None.

Returns

The tensor id of the updated tensor.

inline void outputTensorLocation(const TensorId &nodeOutputName, TensorLocation value)

Set a value for the output tensor location attribute.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – The location of the tensor.

inline void recomputeOutput(const TensorId &nodeOutputName, RecomputeType value)

Enable recomputation of the output of the node in the backward pass.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – The type of the recompute.

inline void recomputeOutputInBackwardPass(const TensorId &nodeOutputName, RecomputeType value = RecomputeType::Recompute)

Enable or disable recomputation of the output of the node in the backward pass.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – (Optional) The type of the recompute. Default: RecomputeType::Recompute.

inline void recomputeOutputInBackwardPass(const std::set<TensorId> &nodeOutputNames, RecomputeType value = RecomputeType::Recompute)

Enable or disable recomputation of the output of the node in the backward pass.

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node.
value – (Optional) The type of the recompute. Default: RecomputeType::Recompute.

inline bool getRecomputeOutputInBackwardPass(const TensorId &nodeOutputName)

Check if a node will have its output recomputed in the backward pass.

Parameters: nodeOutputName – The tensor id of the output tensor of the ONNX node used to find the node in the ONNX model.
Returns: true if the output will be recomputed; false otherwise.

inline bool getRecomputeOutputInBackwardPass(const std::set<TensorId> &nodeOutputNames)

Check if a node will have its output recomputed in the backward pass.

Parameters: nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
Returns: true if the output will be recomputed; false otherwise.

std::vector<TensorId> checkpointOutput(const std::vector<TensorId> &nodeOutputNames)

Add checkpoint operations to the model.

This is the same as an identity op but RecomputeType is Checkpoint by default. Use this to checkpoint a subset of an operation’s output tensors.

Parameters: nodeOutputNames – The tensors to checkpoint.
Returns: The checkpointed tensors.

inline void virtualGraph(const TensorId &nodeOutputName, int64_t value = 0)

Set the virtual graph that computes the given node.

Applies when creating a graph for a multi-IPU configuration.

Parameters

nodeOutputName – Name of the output tensor of the ONNX node.
value – The index of the virtual graph that computes this node. Default=0.

inline void executionPhase(const TensorId &nodeOutputName, int64_t value = 0)

Set the execution phase that computes the given node.

Applies when creating a graph for a multi-IPU configuration.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – The index of the execution phase that computes this node. Default=0.

inline void pipelineStage(const TensorId &nodeOutputName, int64_t value)

Set the value on the pipeline stage attribute.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – The value to be set.

inline void pipelineStage(const std::set<TensorId> &nodeOutputNames, int64_t value)

Set the value on the pipeline stage attribute.

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node.
value – The value to be set.

inline void excludePatterns(const TensorId &nodeOutputName, const std::vector<std::string> &patternNames)

Set the patterns to be excluded.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node.
patternNames – The vector of pattern names to be excluded.

inline void excludePatterns(const std::set<TensorId> &nodeOutputNames, const std::vector<std::string> &patternNames)

Set the patterns to be excluded.

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node.
patternNames – The vector of pattern names to be excluded.

inline void setSerializeMatMul(const std::set<TensorId> &nodeOutputNames, std::string mode, int64_t factor, bool keep_precision)

Set the settings for matmuls that should be serialized.

This option will split a matmul into separate smaller matmuls that will be executed in series. This will also serialize the grad operations during training.

Parameters

nodeOutputNames – The tensor ids of the output matmul tensors of the ONNX node.
mode – The dimension of the matmul to serialize on. Options are: ‘input_channels’, ‘output_channels’, ‘reducing_dim’, ‘none’.
factor – The number of serialized matmuls. This must be a factor of the dimension to serialize on.
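The effect of serializing on the reducing dimension can be pictured with a small host-side sketch. The code below does not use PopART: Matrix, matmul and matmulSerializedReducingDim are hypothetical names introduced here for illustration only. It splits the reducing dimension into factor equal chunks, runs one smaller matmul per chunk in sequence and accumulates the partial products, which shows why factor has to divide the dimension being serialized.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Row-major dense matrix. A stand-in for a tensor, not a PopART type.
struct Matrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<float> data;
  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
  float &at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
  float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Reference: one full matmul, C = A * B.
Matrix matmul(const Matrix &a, const Matrix &b) {
  assert(a.cols == b.rows);
  Matrix c(a.rows, b.cols);
  for (std::size_t i = 0; i < a.rows; ++i)
    for (std::size_t k = 0; k < a.cols; ++k)
      for (std::size_t j = 0; j < b.cols; ++j)
        c.at(i, j) += a.at(i, k) * b.at(k, j);
  return c;
}

// Serialized over the reducing dimension: `factor` smaller matmuls run one
// after another, each adding its partial product into the same output.
Matrix matmulSerializedReducingDim(const Matrix &a, const Matrix &b,
                                   std::size_t factor) {
  assert(a.cols == b.rows);
  if (factor == 0 || a.cols % factor != 0) {
    throw std::invalid_argument("factor must divide the reducing dimension");
  }
  const std::size_t chunk = a.cols / factor;
  Matrix c(a.rows, b.cols);
  for (std::size_t s = 0; s < factor; ++s) {
    // One smaller matmul over columns [s * chunk, (s + 1) * chunk) of A
    // and the matching rows of B.
    for (std::size_t i = 0; i < a.rows; ++i)
      for (std::size_t k = s * chunk; k < (s + 1) * chunk; ++k)
        for (std::size_t j = 0; j < b.cols; ++j)
          c.at(i, j) += a.at(i, k) * b.at(k, j);
  }
  return c;
}
```

For a 2x4 by 4x3 product, a factor of 2 yields two 2x2 by 2x3 products whose sum equals the full result, while a factor of 3 is rejected because it does not divide 4.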

void setPartialsType(const TensorId &nodeOutputName, const std::string partialsType)

Set the partials type for the given node.

This is used in the convolution op.

Parameters

nodeOutputName – Name of the output tensor of the ONNX node.
partialsType – The type for the partials. Options are: FLOAT or HALF.

void setEnableConvDithering(const TensorId &nodeOutputName, int64_t value)

Enable convolution dithering.

Parameters

nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – Whether to enable convolution dithering. This should be 1 to enable convolution dithering and 0 otherwise.

std::string getPartialsType(const TensorId &nodeOutputName)

Get the partials type for the given node.

Parameters: nodeOutputName – The tensor id of the output tensor of the ONNX node.
Returns: The partials type.

inline void setInplacePreferences(const TensorId &nodeOutputName, const std::map<OpType, float> &prefs)

void setAvailableMemoryProportion(const TensorId &nodeOutputName, const float availableMemoryProportion)

Set the available memory proportion for the given node.

This is used in the convolution op.

See also

Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for some practical examples of using availableMemoryProportion.

Parameters

nodeOutputName – Name of the output tensor of the ONNX node.
availableMemoryProportion – The available memory proportion [0, 1).

void setAvailableMemoryProportion(const std::set<TensorId> &nodeOutputNames, const float availableMemoryProportion)

Set the available memory proportion for the given node.

This is used in the convolution op.

See also

Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for some practical examples of using availableMemoryProportion

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
availableMemoryProportion – The available memory proportion [0, 1).

void setAttribute(const std::string &attribute, popart::any value)

Set the value of an attribute that will be set on all subsequent operations.

Parameters

attribute – The name of the attribute to set.
value – The value to set on the attribute.

popart::any getAttribute(const std::string attribute) const

Get an attribute that has been set for all subsequent operations.

Parameters: attribute – The name of the attribute to get.
Returns: The attribute.

bool hasAttribute(const std::string &attribute) const

Check if an attribute exists.

Parameters: attribute – The name of the attribute to check.
Returns: true if the attribute exists; false otherwise.

void clearAttribute(const std::string &attribute)

Unset an attribute that will be set on all subsequent operations.

Parameters: attribute – The name of the attribute to unset.

bool hasAttribute(const std::string &attribute)

Check if an attribute is set.

Parameters: attribute – The name of the attribute to check.
Returns: true if the attribute is set; false otherwise.

popart::any getAttribute(const std::string &attribute)

Get the attribute value.

Parameters: attribute – The name of the attribute.
Returns: The value of the attribute.

int64_t getPipelineStage() const

Get the pipeline stage attribute.

Returns: The pipeline stage.

int64_t getExecutionPhase() const

Get the execution phase attribute.

Returns: The execution phase.

int64_t getVirtualGraph() const

Get the virtual graph attribute.

Returns: The virtual graph.

inline void virtualGraph(const std::set<TensorId> &nodeOutputNames, int64_t value = 0)

Set the virtual graph that computes the given node.

Applies when creating a graph for a multi-IPU configuration.

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
value – The index of the virtual graph that computes this node.

inline void executionPhase(const std::set<TensorId> &nodeOutputNames, int64_t value = 0)

Set the execution phase.

Applies when creating a graph for a multi-IPU configuration.

Parameters

nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
value – The index of the execution phase that computes this node.

void addNodeAttribute(const std::string &attributeName, const int64_t &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters

attributeName – The name of the attribute to add.
attributeValue – An int64_t value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const std::vector<int64_t> &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters

attributeName – The name of the attribute to add.
attributeValue – A std::vector<int64_t> value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const float &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters

attributeName – The name of the attribute to add.
attributeValue – A float value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const std::vector<float> &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters

attributeName – The name of the attribute to add.
attributeValue – A std::vector<float> value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const std::string &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters

attributeName – The name of the attribute to add.
attributeValue – A std::string value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const char *attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters

attributeName – The name of the attribute to add.
attributeValue – A C string (const char *) value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const std::vector<std::string> &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters

attributeName – The name of the attribute to add.
attributeValue – A std::vector<std::string> value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const bool attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters

attributeName – The name of the attribute to add.
attributeValue – A bool value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

void addNodeAttribute(const std::string &attributeName, const ConstVoidData &attributeValue, const std::set<TensorId> &nodeOutputNames)

Add an attribute to the ONNX node which is uniquely identified by the output tensors.

This function will throw an exception if it cannot find the unique node or if the attribute already exists.

Parameters

attributeName – The name of the attribute to add.
attributeValue – A constant tensor initializer.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

bool nodeHasAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Check whether the ONNX node has an attribute set.

This function will throw an exception if it cannot find the unique node.

Parameters

attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

true if the node has an attribute set; false otherwise.

int64_t getInt64NodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is an int64_t.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist or if it has not been set to the int64_t type.

Parameters

attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

std::vector<int64_t> getInt64VectorNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a std::vector<int64_t>.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist or if it has not been set to the std::vector<int64_t> type.

Parameters

attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

float getFloatNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a float.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist or if it has not been set to the float type.

Parameters

attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

std::vector<float> getFloatVectorNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a std::vector<float>.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist.

Parameters

attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

std::string getStringNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a string.

This function will throw an exception if it cannot find the unique node or the attribute does not exist or it has not been set to the std::string type.

Parameters

attributeName – The name of the attribute for which the value is required.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

std::vector<std::string> getStringVectorNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a vector of strings.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist.

Parameters

attributeName – The name of the attribute for which the value is required.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

bool getBoolNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Get the value of an attribute for the ONNX node where the value is a boolean.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist.

Parameters

attributeName – The name of the attribute for which the value is required.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

Returns

Value of the attribute.

void removeNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)

Remove an attribute from the ONNX node.

This function will throw an exception if it cannot find the unique node or if the attribute does not exist.

Parameters

attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.

std::vector<std::string> getAllNodeAttributeNames(const std::set<TensorId> &nodeOutputNames)

Get all the attribute names from the ONNX node.

This function will throw an exception if it cannot find the unique node.

Parameters: nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
Returns: The attribute names associated with the ONNX node.
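The node attribute functions above share one contract: the node is located through its set of output tensor ids, and a missing node, a duplicate attribute or a missing attribute raises an exception. The sketch below models that contract with standard containers only. NodeAttributeTable and addNode are hypothetical names, only int64_t attributes are modelled, and std::runtime_error stands in for whatever exception type PopART actually throws.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

using TensorId = std::string;

// Hypothetical model: nodes are keyed by the set of their output tensor ids.
class NodeAttributeTable {
public:
  using Attributes = std::map<std::string, std::int64_t>;

  void addNode(const std::set<TensorId> &outputs) {
    assert(!outputs.empty()); // a node needs at least one output to be found
    nodes_.emplace(outputs, Attributes{});
  }

  // Throws if the node is not found or the attribute already exists.
  void addNodeAttribute(const std::string &name, std::int64_t value,
                        const std::set<TensorId> &outputs) {
    if (!findNode(outputs).emplace(name, value).second) {
      throw std::runtime_error("attribute already exists: " + name);
    }
  }

  // Throws only if the node is not found.
  bool nodeHasAttribute(const std::string &name,
                        const std::set<TensorId> &outputs) const {
    return findNode(outputs).count(name) != 0;
  }

  // Throws if the node is not found or the attribute does not exist.
  std::int64_t getInt64NodeAttribute(const std::string &name,
                                     const std::set<TensorId> &outputs) const {
    const Attributes &attrs = findNode(outputs);
    auto it = attrs.find(name);
    if (it == attrs.end()) {
      throw std::runtime_error("no such attribute: " + name);
    }
    return it->second;
  }

  // Throws if the node is not found or the attribute does not exist.
  void removeNodeAttribute(const std::string &name,
                           const std::set<TensorId> &outputs) {
    if (findNode(outputs).erase(name) == 0) {
      throw std::runtime_error("no such attribute: " + name);
    }
  }

  std::vector<std::string>
  getAllNodeAttributeNames(const std::set<TensorId> &outputs) const {
    std::vector<std::string> names;
    for (const auto &entry : findNode(outputs)) {
      names.push_back(entry.first);
    }
    return names;
  }

private:
  const Attributes &findNode(const std::set<TensorId> &outputs) const {
    auto it = nodes_.find(outputs);
    if (it == nodes_.end()) {
      throw std::runtime_error("no node with the given output tensors");
    }
    return it->second;
  }

  Attributes &findNode(const std::set<TensorId> &outputs) {
    const NodeAttributeTable &self = *this;
    return const_cast<Attributes &>(self.findNode(outputs));
  }

  std::map<std::set<TensorId>, Attributes> nodes_;
};
```

Checking with nodeHasAttribute before calling addNodeAttribute or a getter avoids the exception paths when it is not known in advance whether an attribute has been set.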

inline int64_t getVirtualGraph(const TensorId &nodeOutputName)

Get the index of the virtual graph that computes this node.

This applies in a multi-IPU system.

This function will throw an exception if the virtual graph has not been set in the current scope.

Parameters: nodeOutputName – The tensor id of the output tensor of the ONNX node used to find the node in the ONNX model.
Returns: The virtual graph associated with the ONNX node.

inline int64_t getVirtualGraph(const std::set<TensorId> &nodeOutputNames)

Get the index of the virtual graph that computes this node based on multiple output tensors.

This applies in a multi-IPU system.

This function will throw an exception if the virtual graph has not been set in the current scope.

Parameters: nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
Returns: The virtual graph associated with the ONNX node.

inline int64_t getExecutionPhase(const TensorId &nodeOutputName)

Get the execution phase for a single output tensor.

This only applies to a multi-IPU system.

This function will throw an exception if the execution phase has not been set in the current scope.

Parameters: nodeOutputName – The tensor id of the output tensor of the ONNX node used to find the node in the ONNX model.
Returns: The execution phase associated with the ONNX node.

inline int64_t getExecutionPhase(const std::set<TensorId> &nodeOutputNames)

Get the execution phase for a set of output tensors.

This only applies to a multi-IPU system.

This function will throw an exception if the execution phase has not been set in the current scope.

Parameters: nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
Returns: The execution phase associated with the ONNX node.

std::string getModelProto(bool humanReadable = false) const

Retrieve the ONNX serialized ModelProto.

Parameters: humanReadable – If true, return a human readable text representation of the model, otherwise use a binary format.
Returns: A serialized ONNX ModelProto.

void saveModelProto(const std::string &fn)

Save the builder’s ONNX ModelProto to a file.

Parameters: fn – The name of the file to which the ONNX model protobuf is written.

void saveInitializersExternally(const std::vector<TensorId> &ids, const std::string &fn)

Save tensor data externally.

The model data cannot exceed 2 GB, the maximum size of a Protobuf message. To avoid exceeding this limit, the ONNX tensor data of large models can be saved separately.

Parameters

ids – The names of tensors for which data is to be saved externally.
fn – The name of a file containing the binary tensor data. This can be an absolute or relative path. If a relative path, when the ONNX model is saved, external tensor data will be written to a path relative to the current working directory.

std::vector<TensorId> getInputTensorIds() const

Return a list of ONNX graph input tensor ids.

Returns: A vector of input tensor ids.

std::vector<TensorId> getOutputTensorIds() const

Return a list of ONNX graph output tensor ids.

Returns: A vector of output tensor ids.

std::vector<TensorId> getValueTensorIds() const

Return a list of ONNX graph value tensor ids.

These tensors are stored in the value_info section of the ONNX GraphProto structure.

Returns: A vector of value tensor names.

std::vector<TensorId> getTrainableTensorIds() const

Return a list of ONNX graph initialized tensor ids.

These tensors are stored in the initializer section of the ONNX GraphProto structure.

Returns: A vector of names of initialized tensors.

bool hasValueInfo(const TensorId &id) const

Check if a tensor has value info.

A tensor may not have value info if this either does not exist or if shape inference has failed.

Returns: true if the tensor has value info; false otherwise.

std::vector<int64_t> getTensorShape(const TensorId id)

Return an ONNX graph tensor shape, from either the input, output, or value_info lists in GraphProto.

Parameters: id – The id of the tensor for which dimensions are required.
Returns: A vector of the tensor dimensions.

bool isInitializer(const TensorId id) const

Check if the ONNX tensor is in the initializer list of GraphProto.

Parameters: id – A tensor id.
Returns: true if the tensor is in the initializer list; false otherwise.

std::string getTensorDtypeString(const TensorId id)

Return an ONNX graph tensor type as a lower case string, from either the input, output, or value_info lists in GraphProto.

Parameters: id – The id of the tensor for which the type is required.
Returns: A lower case string of the tensor data type.

DataType getTensorDataType(const TensorId id)

Return a tensor type from either the input, output, or value_info lists in GraphProto.

Parameters: id – The id of the tensor for which the type is required.
Returns: The data type of the tensor.

void pushNameScope(const std::string &name)

Push a name onto the name scope stack.

The names of tensors and nodes added to the ONNX graph will be prefixed with a concatenation of the names in the name scope stack.

Parameters: name – The name to be pushed onto the name scope stack.

void popNameScope(): Remove the last entry in the name scope stack.

std::string getNameScope(const std::string &name = "") const

Get the current name scope stack using the default delimiter.

Parameters: name – (Optional) A string to concatenate to the end of the stack.
Returns: A string of the concatenated name scope stack.
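The interplay of pushNameScope(), popNameScope() and getNameScope() can be sketched as a small stack of strings. The class below is a hypothetical stand-in, not the Builder implementation. It assumes "/" as the delimiter and assumes that the delimiter follows every scope entry; check both against the PopART headers before relying on the exact string format.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the builder's name scope stack.
class NameScopeStack {
public:
  // Mirrors pushNameScope(): add a scope at the top of the stack.
  void push(const std::string &name) { scopes_.push_back(name); }

  // Mirrors popNameScope(): remove the most recently pushed scope.
  void pop() {
    assert(!scopes_.empty() && "pop() called on an empty name scope stack");
    scopes_.pop_back();
  }

  // Mirrors getNameScope(): concatenate every scope, oldest first, each
  // followed by the assumed delimiter, then append the optional name.
  std::string get(const std::string &name = "") const {
    std::string result;
    for (const auto &scope : scopes_) {
      result += scope;
      result += '/';
    }
    return result + name;
  }

private:
  std::vector<std::string> scopes_;
};
```

Under these assumptions, pushing "encoder" and then "layer0" gives a tensor named "w" the prefixed name "encoder/layer0/w", and popping once shortens it to "encoder/w".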

void setGraphName(const std::string &name)

Set a graph name.

Parameters: name – The string to name the graph.

void setParent(Builder *parent)

Set the parent graph of this builder.

Parameters: parent – The builder to set as the parent of this builder.

Builder *getParent() const: Return the parent graph of this builder or null if there is no parent.

inline bool hasParent() const

Check if this builder represents a subgraph.

Returns: true if the builder represents a subgraph; false otherwise.

void embedReplicationFactor(int replicationFactor)

Embed the value of replicationFactor into the OnnxModel.

The replication factor should be interpreted as 1 if it is not present in the model.

Parameters: replicationFactor – The replication factor.

Public Static Functions

static std::unique_ptr<Builder> create(): Create a builder for an ONNX model.

static std::unique_ptr<Builder> createFromOnnxModel(const std::string &modelProtoOrFilename)

Create a builder which loads a serialized ONNX ModelProto into the builder and validates it.

Parameters: modelProtoOrFilename – Either an ONNX model protobuf, or the name of a file containing an ONNX model protobuf.

class Ir

Public Types

enum class ExecutionMode

Values:

enumerator Inference

enumerator Training

enum class SerialiseFormat

Values:

enumerator JSON

Public Functions

poprithms::logging::TimePartitionLogger &timePartitionLogger() const

Get the logger that partitions compilation time into different components (scheduling, outlining, Poplar graph construction, and so on). It can be used as follows:

C++:

void foo() {
  auto timer = timePartitionLogger().scopedStopwatch("In foo");
  if (cond0()) {
    return;
  }
  bar();
  return;
}

When the method timePartitionLoggerStr() (see below) is called, there will be a line with “In foo” summarizing the time between the construction and destruction of timer, above. Something like:

In foo      : 0.03 [s] :  30 %
In bar      : 0.02 [s] :  20 %
unaccounted : 0.05 [s] :  50 %
total       : 0.10 [s] : 100 %

In the case where there are multiple timers which exist concurrently, only the most recently constructed one will accumulate time. This means that the most nested scope is the one which will accumulate time.

For more information, see the poprithms SwitchingTimePartitionLogger class.

Returns: An object used to track and summarize where wall clock time is spent in PopART compilation.
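The rule that only the most recently constructed timer accumulates time can be modelled without poprithms. In the sketch below, SwitchingLogger, its nested ScopedStopwatch and advance() are hypothetical names, and time is advanced by hand instead of being read from a clock so that the outcome is deterministic. It is an illustration of the switching behaviour, not the poprithms implementation.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a switching time partition logger.
class SwitchingLogger {
public:
  // RAII stopwatch: becomes the active label on construction and hands
  // control back to the enclosing stopwatch on destruction.
  class ScopedStopwatch {
  public:
    ScopedStopwatch(SwitchingLogger &logger, const std::string &label)
        : logger_(logger) {
      logger_.active_.push_back(label);
    }
    ~ScopedStopwatch() {
      assert(!logger_.active_.empty());
      logger_.active_.pop_back();
    }
    ScopedStopwatch(const ScopedStopwatch &) = delete;
    ScopedStopwatch &operator=(const ScopedStopwatch &) = delete;

  private:
    SwitchingLogger &logger_;
  };

  // Charge `seconds` to the innermost live stopwatch, or to "unaccounted"
  // when no stopwatch is alive.
  void advance(double seconds) {
    const std::string label =
        active_.empty() ? std::string("unaccounted") : active_.back();
    totals_[label] += seconds;
  }

  double total(const std::string &label) const {
    auto it = totals_.find(label);
    return it == totals_.end() ? 0.0 : it->second;
  }

private:
  std::vector<std::string> active_;
  std::map<std::string, double> totals_;
};

void bar(SwitchingLogger &logger) {
  SwitchingLogger::ScopedStopwatch timer(logger, "In bar");
  logger.advance(2.0); // charged to "In bar", not to the outer "In foo"
}

void foo(SwitchingLogger &logger) {
  SwitchingLogger::ScopedStopwatch timer(logger, "In foo");
  logger.advance(1.0); // charged to "In foo"
  bar(logger);
  logger.advance(2.0); // "In bar" has ended, so "In foo" accumulates again
}
```

Time spent inside bar() is not double counted under "In foo", so the per-label totals always sum to the overall total.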

std::string timePartitionLoggerStr() const

Ir()

~Ir()

Ir(Ir&&) = delete

Ir &operator=(Ir&&) = delete

Ir(const Ir&) = delete

Ir &operator=(const Ir&) = delete

inline uint64_t getId() const

void setOnnxModel(const ONNX_NAMESPACE::ModelProto &model)

inline bool hasOnnxModel() const

Check if there’s an ONNX model in the IR.

This is true if the IR has been created from an ONNX model or using the Builder.

Returns: true if there is an ONNX model; false otherwise.

void setDataFlow(const DataFlow &df)

void setUserOptions(const SessionOptions &flags)

void setInputShapeInfo(const InputShapeInfo &info)

inline const InputShapeInfo &getInputShapeInfo() const

void setOptimizer(const Optimizer&)

void ensureOptimizerTensorCreated(const TensorId &optId, const TensorInfo &info, const DebugContext &debugContext = {})

inline const Optimizer &getOptimizer() const

void setDeviceInfo(DeviceInfo&)

const DeviceInfo *getDeviceInfo() const

void setPatterns(const Patterns &p)

inline const Patterns &getPatterns() const

std::string getPatternLevelStr(const Patterns &p)

bool isPatternsLevel(const Patterns &p, PatternsLevel level)

void removeIsolatedTensors(bool retainUsedIOTensors = false, bool retainAllIOTensors = false, bool retainVarTensors = false, bool retainConstTensors = false)

void removeIsolatedGraphs()

void setExecutionMode(const ExecutionMode &mode)

inline bool isTraining() const

inline bool isTesting() const

void logIr() const

void compareWithSavedHash(const HashesMap &cacheEntries)

void prepare(const IrBundle &bundle, const HashesMap &cacheEntries = {}, size_t hashSeed = 0u)

Prepare the IR based on the IrBundle configuration.

If engine caching is enabled, the IR hash, which is based on the IrBundle and the forward graph, is compared to a saved file. If the hash matches, the rest of the IR preparation is skipped.

Parameters

bundle – The bundle to prepare.
cacheEntries – The engine cache.
hashSeed – The seed with which to initialize the IR hash. It should incorporate non-IR factors that could affect the compilation, such as engine options and session options.
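The role of hashSeed can be sketched as follows. This is not the PopART hashing code: hashCombine, PrepareResult and this free function prepare() are hypothetical, the IR is reduced to a string description, and the cache is assumed to be a map keyed by hash value. The point is only that the same IR with a different seed, for example after changing an engine option, yields a different hash and therefore misses the cache.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Mix a value into a running hash (a commonly used combine step).
std::size_t hashCombine(std::size_t seed, std::size_t value) {
  return seed ^ (value + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

struct PrepareResult {
  std::size_t hash; // hash of the seed combined with the IR description
  bool hashMatched; // true if the cache already holds this hash
};

// Hypothetical stand-in for the cache check made during preparation.
PrepareResult prepare(const std::string &irDescription,
                      const std::map<std::size_t, std::string> &cacheEntries,
                      std::size_t hashSeed = 0u) {
  assert(!irDescription.empty());
  const std::size_t hash =
      hashCombine(hashSeed, std::hash<std::string>{}(irDescription));
  // On a match, the remaining (expensive) preparation would be skipped.
  const bool matched = cacheEntries.count(hash) != 0;
  return PrepareResult{hash, matched};
}
```

A caller would typically record the returned hash in the cache after a successful compilation, so that a later call with the same IR and the same seed reports a match.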

void prepareCache(const HashesMap &cacheEntries, size_t hashSeed)

void finalizeOpDebugInfo()

inline bool isPrepared() const

inline bool hashMatched() const

void updateOptimizer(const Optimizer&)

ONNX_NAMESPACE::ModelProto step(int n)

void addAdditionalModelProtoTensor(const TensorId&)

void addAdditionalModelProtoTensor(Tensor*)

void addAdditionalModelProtoTensors()

inline bool additionalModelProtoTensorsHaveBeenAdded() const

inline const std::set<Tensor*, PTensorCmp> &getAdditionalModelProtoTensors() const

inline std::set<Tensor*, PTensorCmp> &getAdditionalModelProtoTensors()

bool isAnchored(const TensorId&) const

bool isRootAnchor(const TensorId&) const

std::set<TensorId> getAnchors() const

std::set<TensorId> getRootAnchors() const

void remapAnchor(const TensorId &from, const TensorId &to)

void addAnchor(const TensorId &t)

const BiMap<TensorId, TensorId> &getAnchorRemap() const

bool streamingIsDisabledForTensor(const Tensor*) const

bool streamingIsDisabledForTensor(const TensorId&) const

bool storingIsDisabledForTensor(const Tensor*) const

bool storingIsDisabledForTensor(const TensorId&) const

void append(std::stringstream&) const

void serialise(SerialiseFormat format, std::stringstream &ss, bool useScheduler = true) const

std::vector<Tensor*> optimizerTensors() const

std::vector<Tensor*> optimizerStateTensors() const

std::map<TensorId, std::vector<Tensor*>> getHostLoadTensors() const: The original input tensor ID (used to identify streams) and the tensors produced by associated HostLoadOp.

std::map<TensorId, std::vector<Tensor*>> getHostStoreTensors() const: The original anchor tensor ID (used to identify streams) and the tensors consumed by associated HostStoreOp.

std::vector<Tensor*> dataStreamTensors() const

std::vector<Op*> opsOfType(const OperatorIdentifier &opid) const

bool isConsumedByOpOfType(TensorId tid, const OperatorIdentifier &opid)

std::vector<const Graph*> getGraphSchedule() const

std::vector<const Graph*> getGraphSchedule(GraphId root) const

std::vector<Op*> getOpSchedule(const OpsBeforeKey&, RequireOptimalSchedule ros) const

bool isSchedulable(const OpsBeforeKey&) const

bool virtualGraphsEnabled() const

SyntheticDataMode syntheticDataMode() const

bool useSyntheticData() const

OpId getOpsCounter() const

OpId getAndIncrOpsCounter()

TensorId getFinalLossId() const

OpId getFinalLossOpId() const

void dotCheckpoint(const Ir &ir, std::string check) const

const ONNX_NAMESPACE::ModelProto &getModel() const

Throws: error – if there is no Onnx model.
Returns: const reference to the Onnx model.

std::vector<TensorId> getModelInputIds() const

Returns: the id of every input tensor of the Onnx model. If there is no Onnx model, returns empty.

void setExternalTensorDataInfo(TensorId, const ONNX_NAMESPACE::TensorProto&)

Set the Onnx TensorProto of the given tensor in the Onnx ModelProto.

Throws: error – if this Ir has no Onnx model.

inline const SessionOptions &getSessionOptions() const

inline SessionOptions &getSessionOptions()

inline void setSessionName(const std::string name)

inline const std::string getSessionName() const

std::set<TensorId> getAllTensorIds() const

std::vector<TensorId> getTensorIds(TensorType) const

Tensor *getTensor(const TensorId&) const

bool containsTensor(const TensorId&) const

std::vector<TensorId> getGraphInputIds() const

std::vector<TensorId> getGraphOutputIds() const

const Graph &getMainGraph() const

Graph &getMainGraph()

std::vector<const Graph*> getAllGraphs() const

Graph &getGraph(const GraphId&) const

bool hasGraph(const GraphId&) const

Graph &createGraph(const GraphId&)

void removeGraph(const GraphId&)

std::map<OpId, std::unique_ptr<Op>> &getMainGraphOps()

const std::map<OpId, std::unique_ptr<Op>> &getMainGraphOps() const

std::vector<Op*> getAllOps() const

Op *getOp(OpId opId) const

Returns the Op if it exists in any graph.

Throws an error if the Op could not be found.

Parameters: opId – The unique ID of the Op to find
Returns: The Op pointer if found

Tensors &getMainGraphTensors()

const Tensors &getMainGraphTensors() const

inline const DataFlow &getDataFlow() const

void applyTransform(std::size_t transformId, Graph &graph)

void validateAnchors() const

ExecutionMode getExecutionMode() const

bool canInfer() const

bool canTrain() const

bool hasConstructedBackwards() const

bool hasDecomposedOptimizers() const

bool containsInitialisers() const

bool tensorExistsInInitialisers(TensorId) const

void constructForwards()

Graph &constructFromOnnxGraph(const ONNX_NAMESPACE::GraphProto &graph, const Scope &scope)

void foldConstants(Graph&)

void constructBackwards()

void registerInputTensors()

void updateVertices()

void unsetAllVirtualGraphIds()

void applyPreAliasPatterns(Graph&)

void applyUpdateInplacePrioritiesForIpu()

void applyInplacePattern(Graph&)

void confirmConstIds() const

void confirmNoReservedIds() const

void setFinalLoss(const TensorId &loss)

int getDefaultOpsetVersion(const std::string &domain) const

unsigned getNumVirtualGraphIds() const

int getOpSetVersionFromModel(const std::string &domain) const

inline bool autoRecomputationEnabled() const

bool hasReplicatedTensorSharding() const

bool hasOverlappedIO() const

inline void setRequiresRandomSeed()

inline bool getRequiresRandomSeed() const

RandomReferenceId getAndIncrementRandomReferenceId()

TensorId getOrSetRandomReferenceTensor(RandomReferenceId, TensorId)

void mergeRandomReferenceIds(std::set<RandomReferenceId>&)

void setRemoteBufferInfo(RemoteBufferId, RemoteBufferInfo)

const RemoteBufferInfo getRemoteBufferInfo(RemoteBufferId) const

const std::map<RemoteBufferId, RemoteBufferInfo> getAllRemoteBufferInfos() const

inline void setExecutionPhasesReady()

inline bool getExecutionPhasesReady() const

PipelineStage getNumPipelineStages() const

PipelineInfo pipelineInfo() const

void setMainGraphPathFromLoss()

void verifyTensorInfos() const: Verifies that all tensors have valid TensorInfos.

void setIsPrepared()

Marks the Ir as “prepared”.

This means the Ir is now ready to be lowered. Failing to do this before lowering the Ir will result in an error. The schedule of all graphs will be fixed by calling this. Modifying the graphs after the IR is prepared will result in an error.

PipelineStage getFinalLossPipelineStage() const

Get the pipeline stage containing the final loss (the last forward pipeline stage).

Returns: pipeline stage containing the final loss

PipelineStage getMaxPipelineStage() const

Get the max pipeline stage that will exist after the backward pass has been added to the graph.

Returns: max pipeline stage of the graph

Op &getSubgraphAnchorPlaceholder()

inline const decltype(graphs) &getGraphs() const

TensorId createIntermediateTensorId(const TensorId &base_id)

TensorId createSliceTensorId(TensorId base_id, unsigned s, unsigned e)

TensorId createConcatTensorId(TensorId base_id)

GraphId createUniqueSubgraphId(GraphId base_id)

std::vector<std::vector<Op*>> getAccumulateOuterFragmentBinConstraints(const Graph &graph) const

size_t getHash() const

void computeHash(size_t hashSeed)

size_t getIrBundleHash() const

void setIrBundleHash(size_t)

ClonedGraphMaps cloneGraph(GraphId originalGraphId, GraphId newGraphId)

Clone a graph.

The OpIds and TensorIds will differ between the original and the cloned graph. Hence a map between the old OpId and cloned OpId will be returned. The new graph can be obtained by ir.getGraph(newGraphId);

Warning

Does not support cloning of the main graph.

Parameters

originalGraphId – The id of the graph to clone
newGraphId – The id of the cloned graph

Returns

A struct of maps between the OpIds and TensorIds in the original and new graphs

bool applyPreAliasPattern(const PreAliasPattern*, Graph&)

Public Static Functions

static bool usingEngineCache(const SessionOptions&, const DeviceInfo*)

using popart::HashesMap = std::map<size_t, std::string>

enum class popart::RequireOptimalSchedule

Values:

enumerator Yes = true

enumerator No = false

class Graph

Public Types

enum class CopyInputMarkings

Values:

enumerator Yes = 1

enumerator No = 0

enum class CopyOutputMarkings

Values:

enumerator Yes = 1

enumerator No = 0

Public Functions

Graph(Ir&, const GraphId&)

~Graph()

Graph() = delete

Graph(const Graph&) = delete

const std::map<OpId, std::unique_ptr<Op>> &getOps() const

std::map<OpId, std::unique_ptr<Op>> &getOps()

std::vector<OpId> getOpIds() const

const std::set<int64_t> getAllVirtualGraphIds(bool includeInvalid) const

const std::map<int64_t, int> getVirtualGraphCounts() const

Op *getOp(OpId opId) const

Return a pointer to the Op if it exists.

Throws an error if the Op could not be found.

See also

getOpUnsafe

Parameters: opId – The unique ID of the Op to find
Returns: The Op pointer if found

Op *getOpUnsafe(OpId opId) const

Returns a pointer to the Op if it exists, or nullptr otherwise.

See also

getOp

Parameters: opId – The unique ID of the Op to find
Returns: The Op pointer if found, or nullptr otherwise
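The difference between the throwing and the non-throwing lookup can be sketched as follows. This is a standalone illustration, not PopART code: `Op`, `OpId` and `OpMap` are simplified stand-ins, and `std::runtime_error` is used in place of whatever error type the library throws.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>

struct Op { int id; };  // hypothetical stand-in for popart::Op
using OpId = int;
using OpMap = std::map<OpId, std::unique_ptr<Op>>;

// Non-throwing lookup: nullptr signals "not found".
Op *getOpUnsafe(const OpMap &ops, OpId opId) {
  auto it = ops.find(opId);
  return it == ops.end() ? nullptr : it->second.get();
}

// Throwing lookup built on top of the non-throwing one.
Op *getOp(const OpMap &ops, OpId opId) {
  if (Op *op = getOpUnsafe(ops, opId)) {
    return op;
  }
  throw std::runtime_error("no Op with the given id");
}
```

The non-throwing form suits callers that expect a miss and want to test for it; the throwing form suits callers for which a miss is a logic error.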

const Tensors &getTensors() const

Tensors &getTensors()

Tensor *getTensor(const TensorId&)

void addActGrad(const TensorId&)

void addVarInit(const TensorId &name, const TensorInfo &info, const void *src, const DebugContext &debugContext)

Add a variable to this graph with the provided properties.

Parameters

name – The name of the variable.
info – The tensor info to create the variable with, including shape and data type.
src – The data to initialise the tensor with.
debugContext – The debug context to assist with debugging.

void addVarInit(const TensorId &name, const TensorInfo &info, const void *src, const VariableSettings &vs, const DebugContext &debugContext)

As per addVarInit, but passing a VariableSettings object to allow for grouped replicas.

See also

addVarInit(const TensorId &, const TensorInfo &, const void *, const DebugContext &)

Parameters

name – The name of the variable.
info – The tensor info to create the variable with, including shape and data type.
src – The data to initialise the tensor with.
vs – The VariableSettings to use.
debugContext – The debug context to assist with debugging.

void addConstInit(const TensorId&, const TensorInfo&, const void*, const DebugContext&)

void addStream(const TensorId&, const TensorInfo&, const DebugContext&)

inline const Ir &getIr() const

inline Ir &getIr()

inline const TensorId &getLoss() const

inline void setLoss(const TensorId &loss_)

void constructFromOnnxGraph(const ONNX_NAMESPACE::GraphProto &onnx_graph)

Op *growFromNode(const Node &node)

OpId moveIntoGraph(std::unique_ptr<Op> op)

template<typename OP, typename ...Args> OP *createOp(Args&&... args)

template<typename OP, typename ...Args> OP *createConnectedOp(const std::map<InIndex, TensorId> &in, const std::map<OutIndex, TensorId> &out, Args&&... args)

std::vector<const Graph*> getCalledGraphs() const

template<typename T> void connectInputs(const T &inContainer, OpId opId)

template<typename T> void connectOutputs(const T &outContainer, OpId opId)

void connectInputsFromInputMapWrapper(const InputMapWrapper &in, OpId id)

void connectOutputsFromOutputMapWrapper(const OutputMapWrapper&, OpId opId)

std::map<int, std::unique_ptr<popart::Op>>::iterator eraseOp(OpId id)

void setVarUpdateConstraints()

void setConvFlipWeightConstraints()

std::vector<Op*> getOpSchedule(const OpsBeforeKey&, RequireOptimalSchedule requireOptimalSchedule) const

void freezeSchedule(const OpsBeforeKey &gCons)

bool isSchedulable(const OpsBeforeKey&, bool respectExecutionPhases = false) const

bool hasUserRecomputeOps() const

std::vector<OpSet> getLiveSets(const std::vector<Op*> &topoOps) const

inline const std::vector<TensorId> &getInputIds() const

InIndex getInputIndex(TensorId id) const

Get the index of the graph input with a specific id.

If the id is not a valid input id then an error will be raised.

Parameters: id – Tensor name to find the index for.
Returns: The input index for the specified id, if it exists.

void addInput(const InIndex &index, const TensorId &id, const TensorInfo &info, bool overwrite)

Add a graph input at a specific index in the list.

Parameters

index – Force the input to be at the specified index in the graph.
id – Tensor name to create and connect
info – Tensor info
overwrite – Overwrites any existing input at the index if true, otherwise, moves all other inputs by one position
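The effect of the overwrite flag on the ordered input list can be sketched with a plain vector. This is an illustration only: `InputIds` and `addInputAt` are hypothetical names, and tensor creation and connection are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph's ordered list of input ids.
using InputIds = std::vector<std::string>;

// Place `id` at `index`: replace the existing entry when `overwrite` is true,
// otherwise insert it and move the later entries back by one position.
void addInputAt(InputIds &inputs, std::size_t index, const std::string &id,
                bool overwrite) {
  assert(index <= inputs.size());
  if (overwrite && index < inputs.size()) {
    inputs[index] = id;
  } else {
    inputs.insert(inputs.begin() + static_cast<std::ptrdiff_t>(index), id);
  }
}
```

Starting from `{"x", "y"}`, adding `"z"` at index 0 without overwriting gives `{"z", "x", "y"}`; adding `"w"` at index 1 with overwriting then gives `{"z", "w", "y"}`.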

void addInput(const TensorId &id, const TensorInfo &info)

Add a graph input to the end of the list.

Parameters

id – Tensor name to create and connect
info – Tensor info

void markAsInput(const TensorId&)

TensorId addInput(const TensorInfo&)

Tensor *getInputTensor(InIndex idx) const

inline TensorId getInputId(InIndex idx) const

bool hasInputId(const TensorId &id) const

void removeInput(const TensorId&)

void removeInput(const InIndex&)

inline const std::vector<TensorId> &getOutputIds() const

OutIndex getOutputIndex(TensorId id) const

void markAsOutput(const OutIndex &index, const TensorId &id, bool overwrite)

Mark a graph tensor as graph output at a specific index in the list.

Parameters

index – Force the output to be at the specified index in the graph.
id – Tensor in the graph to mark as output
overwrite – Overwrites any existing output at the index if true, otherwise, moves all other outputs by one position

void markAsOutput(const TensorId &id)

Mark a graph tensor as graph output at the end of the list.

Parameters: id – Tensor in the graph to mark as output

void removeOutput(const TensorId&)

void removeOutput(const OutIndex&)

inline TensorId getOutputId(OutIndex idx) const

bool hasOutputId(const TensorId &id) const

Tensor *getOutputTensor(OutIndex idx) const

Scope getScope() const

void replaceTensor(const TensorId &oldId, const TensorId &newId)

Replace oldId with newId on any consumers.

Both tensors need to exist.

Parameters

oldId – Tensor to disconnect from consumers & graph outputs
newId – Tensor to connect to consumers & graph outputs

std::vector<Op*> getCallSiteOps() const

std::vector<Op*> getCallSiteOps(size_t num) const

std::map<OpId, std::unordered_set<OpId>> getEdgeMap() const

inline const std::string &getGraphId() const

std::string getGraphString() const

void copyFrom(const Graph &other, CopyInputMarkings copyInputMarkings = CopyInputMarkings::Yes, CopyOutputMarkings copyOutputMarkings = CopyOutputMarkings::Yes)

std::pair<bool, std::vector<Op*>> getDirectViewChain(Tensor *from, Tensor *to)

Find a chain of view changing ops in the graph from “from” to “to” (if one exists) and return a vector of ops such that op1(op2(…opN(in))) = out for {op1, op2, …, opN}.

If no such chain exists, returns {false, {}}.

Parameters

from – The tensor to start at
to – The tensor to finish at

Returns

std::pair<bool, std::vector<Op *>> where the first element is a bool indicating whether the chain exists and the second is the vector of ops along the chain, in order from ‘from’ to ‘to’. Given that the ops are 1-in-1-out, this is also the schedule order.

void setOnnxToOnnx(std::unique_ptr<onnxpasses::IOnnxToOnnx>)

Set the object which will perform the ONNX -> ONNX transformation, which happens early on in the Graph constructor.

The default object, which is used if this method is not called, is an instance of the onnxpasses::Canonnxalizer class, which performs a set of required transformations, such as decomposing ASinh into more basic Nodes.

void finalizeSchedule()

Finalizes the graph schedule.

The schedule cannot change after this has been called. Calling finalizeSchedule() multiple times results in an error.

inline void removeIsolatedTensors(bool retainUsedIOTensors = false, bool retainAllIOTensors = false, bool retainVarTensors = false, bool retainConstTensors = false)

inline bool canBeRecursivelyAutodiffed() const: If this graph X is called in graph Y, when applying autodiff to Y, is it safe to autodiff X?

inline void setCanBeRecursivelyAutodiffed(bool value)

Public Members

std::unique_ptr<TopoCons> topoCons

const GraphId id

Public Static Attributes

static const int64_t NoVGraph

class AiOnnxMlOpset1 : public popart::DomainOpSet

Class that represents the AI ONNX ML opset.

Public Functions

inline AiOnnxMlOpset1(std::unique_ptr<BuilderImpl> &impl_)

Constructor for the AiOnnxMlOpset1 class.

Parameters: impl_ – A pointer to an implementation of the Builder class.

class AiGraphcoreOpset1 : public popart::DomainOpSet

Class that represents the AI Graphcore opset.

Public Functions

inline AiGraphcoreOpset1(std::unique_ptr<BuilderImpl> &impl_)

Constructor for the AiGraphcoreOpset1 class.

Parameters: impl_ – A pointer to an implementation of the Builder class.

TensorId copyvarupdate(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Copies a tensor to an initialised tensor (variable).

This is used to update an initialised tensor (a variable created using addInitializedInputTensor()) which retains its value between iterations, by setting the value to the value of another tensor (the updater). The purpose is to manually update the tensor in use cases for variables other than trained parameters (weights) or tensors used by other ops.

Parameters

args – A vector of input tensor ids, [tensor, updater], where tensor is the tensor to be updated and updater is the tensor containing the values for the update.
debugContext – Optional debug information.

Returns

An alias to the updated variable: to ensure correct ordering of the updated variable, you should use this variable for any op which should operate on the updated variable.

std::vector<TensorId> batchnormalization(const std::vector<TensorId> &args, unsigned num_outputs, float epsilon = 1e-05f, float momentum = 0.9f, const popart::DebugContext &debugContext = {})

Add a batch normalization operation to the model.

This version uses N-1 as the population size for calculating running variance (like PyTorch BatchNorm1d), whereas the ONNX version uses N.

Parameters

args – List of input tensor ids
num_outputs – The number of output tensor ids
epsilon – The ‘epsilon’ attribute
momentum – The ‘momentum’ attribute
debugContext – Optional debug information.

Returns

A list of normalized output tensors
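The difference between the two population sizes can be made concrete with a short worked example. The helper below is a standalone sketch (the name `variance` is hypothetical and it is not part of PopART); it only shows how dividing by N-1 rather than N changes the result.

```cpp
#include <cassert>
#include <vector>

// Variance of `x` with a configurable divisor: N - 1 when `unbiased` is true
// (the convention described for this op), N otherwise (the ONNX convention).
double variance(const std::vector<double> &x, bool unbiased) {
  assert(x.size() > 1);
  double mean = 0.0;
  for (double v : x) {
    mean += v;
  }
  mean /= static_cast<double>(x.size());
  double sumSq = 0.0;
  for (double v : x) {
    sumSq += (v - mean) * (v - mean);
  }
  const double divisor =
      static_cast<double>(x.size()) - (unbiased ? 1.0 : 0.0);
  return sumSq / divisor;
}
```

For the values 1, 2, 3, 4 the sum of squared deviations is 5, so the N convention gives 1.25 while the N-1 convention gives roughly 1.667. The gap shrinks as the number of elements grows.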

std::vector<TensorId> groupnormalization(const std::vector<TensorId> &args, int64_t num_groups, float epsilon = 1e-05f, const DebugContext &debugContext = {})

Add a group normalization operation to the model.

This is a Poplar extension.

The group will be created from a strided input.

Parameters

args – A vector of input tensor ids for input data x, scale scale, and bias bias as [x, scale, bias].
num_groups – The number of groups to separate the channels into.
epsilon – The epsilon value to use to avoid division by zero.
debugContext – Optional debug information.

Returns

A vector of output tensor ids for output data y, the mean mean and the variance var as [y, mean, var].

std::vector<TensorId> multiconv(const MultiConvInputs &tensors, const MultiConvDilations &dilations = {}, const MultiConvDilations &inDilations = {}, const MultiConvPads &pads = {}, const MultiConvPads &outPads = {}, const MultiConvStrides &strides = {}, const std::vector<float> &availableMemoryProportions = {}, const std::vector<std::string> &partialsTypes = {}, const nonstd::optional<std::string> planType = nonstd::nullopt, const nonstd::optional<int> perConvReservedTiles = nonstd::nullopt, const nonstd::optional<float> cycleBackOff = nonstd::nullopt, const std::vector<int64_t> enableConvDithering = {}, const DebugContext &debugContext = {})

Add a multi-convolution operation to the model.

Using this multi-convolution API ensures that the convolutions are executed in parallel on the device.

Functionally, a multi-convolution is equivalent to a series of single convolutions. Using this multi-convolution API is always equivalent to calling the single-convolution API (conv) once for each argument.

For example, calling:

A0 = conv({X0, W0, B0})
A1 = conv({X1, W1})

is functionally equivalent to calling:

{A0, A1} = multiconv({{X0, W0, B0}, {X1, W1}}).

It is possible that any two convolutions cannot be executed in parallel due to topological constraints. For example, the following:

B = conv({A, W0});
C = B + A
D = conv({C, W1});

cannot be converted to:

{B, D} = multiconv({{A, W0}, {C, W1}}).

Note that it is not possible to create such a cycle by adding a multi-convolution with this API.

Calls to multiconv() are mapped to poplar::poplin::multiconv::convolution().

All input vectors must be either empty, or equal in length to the number of convolutions. Note that groups for each convolution are automatically inferred from the shapes of the data and weight inputs.

See also

Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for some practical examples of using availableMemoryProportion.

Parameters

tensors – List of tensor ids for input tensors for data, weights and biases as [data, weight, bias] for each convolution. bias is optional.
dilations – The dilations attributes for each convolution.
inDilations – The input dilations attributes for each convolution.
pads – The pads for each convolution.
outPads – The output padding for each convolution.
strides – The strides for each convolution.
availableMemoryProportions – The available memory proportions per convolution, each [0, 1).
partialsTypes – The partials type per convolution.
planType – Run convolutions in parallel or series.
perConvReservedTiles – The number of tiles to reserve per convolution when planning.
cycleBackOff – Cycle back-off proportion, [0, 1).
enableConvDithering – Enable convolution dithering per convolution. If true, then convolutions with different parameters will be laid out from different tiles in an effort to improve tile balance in models.
debugContext – Optional debug information.

Returns

A vector of tensor ids of the output tensor from each convolution.

TensorId subsample(const std::vector<TensorId> &args, const std::vector<int64_t> &strides, const DebugContext &debugContext = {})

Add a sub-sample operation to the model.

This is a Poplar extension.

If multiple tensors are provided, the strides will be applied to them all.

Parameters

args – A vector of tensor ids to sub-sample.
strides – The strides to use.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
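As a rough picture of sub-sampling, the sketch below keeps every stride-th element along each axis of a row-major 2D array. The semantics shown are an assumption for illustration, and `subsample2d` is a hypothetical helper rather than a PopART function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Keep every strideR-th row and every strideC-th column of a row-major
// rows x cols matrix. Assumed semantics, for illustration only.
std::vector<int> subsample2d(const std::vector<int> &in, std::size_t rows,
                             std::size_t cols, std::size_t strideR,
                             std::size_t strideC) {
  assert(in.size() == rows * cols && strideR > 0 && strideC > 0);
  std::vector<int> out;
  for (std::size_t r = 0; r < rows; r += strideR) {
    for (std::size_t c = 0; c < cols; c += strideC) {
      out.push_back(in[r * cols + c]);
    }
  }
  return out;
}
```

With a 3 x 4 input holding 0 to 11 and strides of 2 in both axes, rows 0 and 2 and columns 0 and 2 are kept.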

TensorId printtensor(const std::vector<TensorId> &args, int64_t print_gradient = 1, const DebugContext &debugContext = {}, const std::string &title = {}, const int summariseThreshold = 1000, const int edgeItems = 3, const int maxLineWidth = 75, const int digits = 8, const int floatFormat = 0, const char separator = ' ', const char openBracket = '[', const char closeBracket = ']')

Add a print tensor operation to the model.

This is a Poplar extension.

Parameters

args – A vector of tensor ids to print.
print_gradient – Indicates whether the gradient tensor(s) associated with the input tensor(s) are also printed. If 1, the gradient tensor(s) are also printed, otherwise the gradient tensor(s) are not printed.
debugContext – Optional debug information.
title – An optional title to print.
summariseThreshold – (default 1000) If the number of elements of the tensor exceeds this threshold the output will be summarised. Only the edge elements will be displayed with an ellipsis indicating skipped elements. A value of 0 will disable summarisation.
edgeItems – (default 3) number of edge elements to include at the beginning and end when summarisation is enabled
maxLineWidth – (default 75) lines longer than this limit will be split across multiple lines. A value of 0 will disable line splitting.
digits – (default 8) number of digits to display. For integers this limit can be exceeded if any number is large enough. For floating points this does not include the exponent. The number of digits is used in conjunction with analysis of the tensor to determine the width of each element to align all elements when printed. A value of 0 disables this analysis and each element will be printed in an unaligned format.
floatFormat – (default 0=Auto) determines the floating point format to use. 0=auto, 1=fixed, 2=scientific, 3=none. Automatic mode determines the appropriate format based on the data. If digits==0 this option is disregarded and the floatFormat is set to none.
separator – (default space) character used to delineate values.
openBracket – (default square bracket) character used to open a tensor.
closeBracket – (default square bracket) character used to close a tensor.

Returns

The tensor id of the result tensor.
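The interaction between summariseThreshold and edgeItems can be sketched for a 1D tensor as below. This is a hypothetical illustration: `summarise` is not a PopART function, and the exact text layout produced by the real op is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Format `v`, showing only `edgeItems` values at each end when the element
// count exceeds `summariseThreshold`. A threshold of 0 disables summarising.
std::string summarise(const std::vector<int> &v,
                      std::size_t summariseThreshold, std::size_t edgeItems) {
  const bool summarised = summariseThreshold != 0 && edgeItems > 0 &&
                          v.size() > summariseThreshold &&
                          v.size() > 2 * edgeItems;
  std::ostringstream os;
  os << '[';
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (summarised && i == edgeItems) {
      os << "... ";
      i = v.size() - edgeItems;  // jump to the trailing edge
    }
    os << v[i];
    if (i + 1 < v.size()) {
      os << ' ';
    }
  }
  os << ']';
  return os.str();
}
```

Ten elements with a threshold of 5 and 2 edge items are shown as the first two values, an ellipsis, and the last two values; with a threshold of 0 every element is shown.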

TensorId nop(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a no-op operation to the model.

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId normalize_image(const std::vector<TensorId> &args, float scale, const DebugContext &debugContext = {})

Normalize image and pad it from 3 channels to 4 channels.

The input channel must be in the last dimension.

Parameters

args – A vector of input tensor ids for the image, offsets and scales, as required by PopLibs.
scale – The scale to apply.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId scale(const std::vector<TensorId> &args, float scale, const DebugContext &debugContext = {})

Add a scale operation to the model.

This is a Poplar extension.

Parameters

args – A vector of input tensor ids.
scale – The scale to apply.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId scaledadd(const std::vector<TensorId> &args, float scale0, float scale1, const DebugContext &debugContext = {})

Add a scaled add operation to the model.

The scaled add operation takes the form:

X = scale0 * T0 + scale1 * T1

where scale0 is the scale factor to be applied to tensor T0 and scale1 is the scale factor to be applied to tensor T1.

Parameters

args – A vector of input tensor ids: [T0, T1, scale0, scale1].
scale0 – The scale to apply (if no scale0 tensor is supplied).
scale1 – The scale to apply (if no scale1 tensor is supplied).
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
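The formula X = scale0 * T0 + scale1 * T1 can be checked with an element-wise sketch on plain vectors. The helper `scaledAdd` is hypothetical and covers only the case where the scales are given as floats rather than as tensors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise X = scale0 * T0 + scale1 * T1 for equally sized inputs.
std::vector<float> scaledAdd(const std::vector<float> &t0,
                             const std::vector<float> &t1, float scale0,
                             float scale1) {
  assert(t0.size() == t1.size());
  std::vector<float> x(t0.size());
  for (std::size_t i = 0; i < t0.size(); ++i) {
    x[i] = scale0 * t0[i] + scale1 * t1[i];
  }
  return x;
}
```

For T0 = [1, 2], T1 = [10, 20], scale0 = 2 and scale1 = 0.5, the result is [7, 14].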

std::vector<TensorId> lstm(const std::vector<TensorId> &args, int64_t outputFullSequence, const DebugContext &debugContext = {})

TensorId gelu(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a GELU operation to the model.

This is a Poplar extension.

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId geluerf(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an accurate GELU (ERF instead of TANH) operation to the model.

Parameters

args – A vector of input tensor IDs.
debugContext – Optional debug information.

Returns

The tensor ID of the result tensor.
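To show why the ERF form is described as more accurate, the sketch below writes out the standard textbook definitions of both variants. These formulas are general knowledge about GELU rather than a statement of how PopART implements either op, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
double geluErf(double x) {
  return 0.5 * x * (1.0 + std::erf(x / std::sqrt(2.0)));
}

// Widely used tanh approximation of GELU.
double geluTanh(double x) {
  const double k = std::sqrt(2.0 / 3.14159265358979323846);
  return 0.5 * x * (1.0 + std::tanh(k * (x + 0.044715 * x * x * x)));
}
```

The two functions agree closely but not exactly: at x = 1 they differ by roughly 1.5e-4.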

TensorId detach(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a detach operation to the model.

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId depthtospace(const std::vector<TensorId> &args, int64_t blocksize, const std::string &mode = "DCR", const DebugContext &debugContext = {})

Add a depth-to-space operation to the model.

This allows DepthToSpace_11 to be targeted from earlier opsets.

The purpose of a depth-to-space operation, also known as pixel shuffling, is to rearrange data from the depth (channels) dimension into the spatial (width and height) dimensions. It is an efficient means of learning upsampling alongside mixing convolution with bilinear interpolation and using transpose convolution.

See also

ONNX DepthToSpace operator.

Parameters

args – A vector containing a single tensor id of the input tensor of shape [N,C,H,W], where N is the batch axis, C is the channel or depth, H is the height and W is the width.
blocksize – The size of the blocks to be moved. If the input is [N, C, H, W] and the blocksize is B, the output will be [N, C/(B*B), H*B, W*B].
mode – Specifies how the data is rearranged:
- "DCR" (default): depth-column-row order
- "CRD": column-row-depth order
debugContext – Optional debug information.

Returns

A tensor which is a rearrangement of the input tensor.
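The output shape rule stated for blocksize can be sketched as a small shape calculation. The helper `depthToSpaceShape` is hypothetical and computes only the shape, not the data rearrangement.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Map an input shape [N, C, H, W] to [N, C/(B*B), H*B, W*B].
std::array<std::int64_t, 4>
depthToSpaceShape(const std::array<std::int64_t, 4> &in,
                  std::int64_t blocksize) {
  // The channel count must be divisible by blocksize squared.
  assert(blocksize > 0 && in[1] % (blocksize * blocksize) == 0);
  return {in[0], in[1] / (blocksize * blocksize), in[2] * blocksize,
          in[3] * blocksize};
}
```

An input of shape [1, 8, 2, 3] with a blocksize of 2 gives an output of shape [1, 2, 4, 6]; the total number of elements is unchanged.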

TensorId round(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a rounding operation to the model.

This allows Round_11 to be targeted from earlier opsets.

See also

ONNX Round operator.

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
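The ONNX Round operator rounds halfway cases to the nearest even integer. The sketch below reproduces that rule on the host with `std::nearbyint`, assuming the default floating-point rounding mode (round to nearest, ties to even) is in effect; `roundHalfToEven` is a hypothetical helper, not a PopART function.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Round each element to the nearest integer, with ties going to the even
// neighbour. Relies on the default rounding mode not having been changed.
std::vector<double> roundHalfToEven(std::vector<double> v) {
  for (double &x : v) {
    x = std::nearbyint(x);
  }
  return v;
}
```

Under this rule 0.5 rounds to 0, 1.5 and 2.5 both round to 2, and -2.5 rounds to -2.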

TensorId init(Attributes::Ints shape, Attributes::Int data_type, Attributes::Int init_type, Attributes::Int batch_axis, const DebugContext &debugContext = {})

Add an init operation to the model.

Parameters

shape – The shape of the tensor to initialise.
data_type – The data type to initialise tensor with. The value is the integer attribute taken from the DataType enum.
init_type – The mode of the tensor initialisation. The value is the integer attribute taken from the InitType enum.
batch_axis – Batch axis specifies the axis that the batches are split along and is a literal integer.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId init(Attributes::Ints shape, Attributes::Int data_type, Attributes::Int init_type, const DebugContext &debugContext = {})

Add an init operation to the model.

Parameters

shape – The shape of the tensor to initialise.
data_type – The data type to initialise tensor with. The value is the integer attribute taken from the DataType enum.
init_type – The mode of the tensor initialisation. The value is the integer attribute taken from the InitType enum.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId dynamicslice(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, Attributes::Int noOverlap, const DebugContext &debugContext = {})

Add a dynamic slice operation to the model.

Creates a new slice tensor, slice, at offset position, offset, in a tensor, tensor. For example:

slice = tensor[offset]

Parameters

args – A vector of input tensor ids: [tensor, offset].
axes – The axes along which to slice.
sizes – The size of the slice along each axis.
noOverlap – Indicates whether the slice regions overlap or not. If 1, slice regions do not overlap, otherwise they do overlap.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
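For a single axis, the slice = tensor[offset] relation can be sketched on a plain vector. The helper `dynamicSlice1d` is hypothetical; it treats the offset as a host-side value, whereas in the op it is supplied as a tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return the `size` elements of `tensor` starting at `offset`.
std::vector<float> dynamicSlice1d(const std::vector<float> &tensor,
                                  std::size_t offset, std::size_t size) {
  assert(offset + size <= tensor.size());
  const auto first = tensor.begin() + static_cast<std::ptrdiff_t>(offset);
  return std::vector<float>(first, first + static_cast<std::ptrdiff_t>(size));
}
```

Slicing [0, 1, 2, 3, 4] at offset 1 with size 2 gives [1, 2].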

TensorId dynamicupdate(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, Attributes::Int noOverlap, const DebugContext &debugContext = {})

Add a dynamic update operation to the model.

Creates a copy of a tensor, tensor, and updates the elements of the copied tensor at offset position, offset, with the elements contained in the slice tensor, slice. For example:

out = tensor
out[offset] = slice

Parameters

args – A vector of input tensor ids: [tensor, offset, slice].
axes – The axes along which to update.
sizes – The size of the slice along each axis.
noOverlap – Indicates whether the updates overlap or not. If 1, the updates do not overlap, otherwise they do overlap.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
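The copy-then-update behaviour can be sketched for a single axis as follows. `dynamicUpdate1d` is a hypothetical helper working on host vectors; taking the tensor by value models the fact that the input is copied rather than modified.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Copy `tensor` (passed by value) and overwrite the copy at `offset` with
// the contents of `slice`.
std::vector<float> dynamicUpdate1d(std::vector<float> tensor,
                                   std::size_t offset,
                                   const std::vector<float> &slice) {
  assert(offset + slice.size() <= tensor.size());
  std::copy(slice.begin(), slice.end(),
            tensor.begin() + static_cast<std::ptrdiff_t>(offset));
  return tensor;
}
```

Updating [0, 1, 2, 3] at offset 1 with the slice [9, 8] returns [0, 9, 8, 3] and leaves the caller's tensor untouched.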

TensorId dynamiczero(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, const DebugContext &debugContext = {})

Add a dynamic zero operation to the model.

Creates a copy of a tensor, tensor, with a slice tensor at offset position, offset, set to zero. For example:

out = tensor
out[offset] = 0.0

Parameters

args – A vector of input tensor ids: [tensor, offset].
axes – The axes along which to zero elements.
sizes – The size of the slice along each axis.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId dynamicadd(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, const DebugContext &debugContext = {})

Add a dynamic add operation to the model.

Creates a copy of a tensor, tensor, with a slice tensor, slice, added at an offset position, offset. For example:

out = tensor
out[offset] += slice

Parameters

args – A vector of input tensor ids: [tensor, offset, slice].
axes – The axes along which to add the slice.
sizes – The size of the slice along each axis.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
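A single-axis sketch of out[offset] += slice is shown below. `dynamicAdd1d` is a hypothetical helper on host vectors, with the offset passed as a plain value instead of a tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy `tensor` (passed by value) and add `slice` element-wise into the copy,
// starting at `offset`.
std::vector<float> dynamicAdd1d(std::vector<float> tensor, std::size_t offset,
                                const std::vector<float> &slice) {
  assert(offset + slice.size() <= tensor.size());
  for (std::size_t i = 0; i < slice.size(); ++i) {
    tensor[offset + i] += slice[i];
  }
  return tensor;
}
```

Adding the slice [0.5, 2] at offset 2 of [1, 1, 1, 1] gives [1, 1, 1.5, 3].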

TensorId sequenceslice(const std::vector<TensorId> &args, Attributes::Int zeroUnused, const DebugContext &debugContext = {})

Slice a 2D tensor based on offsets.

The outermost dimension is sliced. For the following:

source is the source tensor.
destination is the destination tensor.
N is the number of elements to copy.
sourceOffset is the first element read from the source tensor.
destinationOffset is the first element written to in the destination tensor.

Then, for each entry in N, sourceOffset and destinationOffset:
```
destination[destinationOffset:destinationOffset+N][...] =
source[sourceOffset:sourceOffset+N][...]
```

Entries after the first N==0 may be ignored. Unreferenced elements of destination are zeroed if zeroUnused is set. The same output element should not be written by multiple inputs.

source and destination must have rank greater than or equal to 2. The outer dimension is sliced; the product of the inner dimensions must match. sourceOffset, destinationOffset and N must be 1-dimensional and of the same size. For example:

N = [1, 1, 1]
sourceOffset = [0, 2, 4]
destinationOffset = [0, 1, 2]

Parameters

args – A vector of input tensor ids for the following tensors [source, destination, N, sourceOffset, destinationOffset].
zeroUnused – Determines whether to zero unreferenced destination elements. If 1, the unreferenced elements are zeroed, otherwise they are not zeroed.
debugContext – Optional debug information.
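
The copy rule for sequenceslice can be sketched on plain 2-D vectors. This is an illustrative reference only: the function name sequenceSlice, the Matrix alias and the row-by-row copy are assumptions, and the sketch processes every entry of N rather than stopping at the first zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>; // outer dimension is sliced

// Hypothetical reference helper (not a PopART function).
Matrix sequenceSlice(const Matrix &source, Matrix destination,
                     const std::vector<std::size_t> &N,
                     const std::vector<std::size_t> &sourceOffset,
                     const std::vector<std::size_t> &destinationOffset,
                     bool zeroUnused) {
  assert(N.size() == sourceOffset.size());
  assert(N.size() == destinationOffset.size());
  std::vector<bool> written(destination.size(), false);
  for (std::size_t k = 0; k < N.size(); ++k) {
    for (std::size_t i = 0; i < N[k]; ++i) {
      // destination[dOff : dOff + N] = source[sOff : sOff + N]
      destination[destinationOffset[k] + i] = source[sourceOffset[k] + i];
      written[destinationOffset[k] + i] = true;
    }
  }
  if (zeroUnused) {
    // Zero every destination row that no entry wrote to.
    for (std::size_t r = 0; r < destination.size(); ++r) {
      if (!written[r]) {
        destination[r].assign(destination[r].size(), 0.0f);
      }
    }
  }
  return destination;
}
```

With N = [1, 1, 1], sourceOffset = [0, 2, 4] and destinationOffset = [0, 1, 2], rows 0, 2 and 4 of the source land in rows 0, 1 and 2 of the destination.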

std::vector<TensorId> call(const std::vector<TensorId> &args, unsigned num_outputs, const Builder &callee, const DebugContext &debugContext = {})

Add a call operation to the model.

This is a Poplar extension, to expose manual code re-use to the builder.

Parameters

args – A vector of input tensor ids.
num_outputs – The number of output tensors from the call operation.
callee – The subgraph to call into.
debugContext – Optional debug information.

Returns

A vector of tensors; the subgraph outputs.

TensorId replicatedallreduce(const std::vector<TensorId> &args, const nonstd::optional<std::vector<int64_t>> &commGroup = nonstd::nullopt, const DebugContext &debugContext = {})

DEPRECATED: Add a replicated allreduce operation to the model.

This is a Poplar extension, to expose manual code re-use to the builder.

Parameters

args – A vector of input tensor ids to reduce across.
commGroup – GCL CommGroup parameter.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId replicatedallreduce(const std::vector<TensorId> &args, const nonstd::optional<CollectiveOperator> &collectiveOperator = nonstd::nullopt, const nonstd::optional<CommGroup> &commGroup = nonstd::nullopt, const DebugContext &debugContext = {})

Add a replicated allreduce operation to the model.

This is a Poplar extension, to expose manual code re-use to the builder.

Parameters

args – A vector of input tensor ids to reduce across.
collectiveOperator – A Graphcore Communication Library (GCL) collective operator.
commGroup – A GCL CommGroup parameter.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId replicatedreducescatter(const std::vector<TensorId> &args, const nonstd::optional<CollectiveOperator> &collectiveOperator = nonstd::nullopt, const nonstd::optional<CommGroup> &commGroup = nonstd::nullopt, const DebugContext &debugContext = {})

Add a replicated reduce-scatter operation to the model.

This is a Poplar extension, to expose manual code re-use to the builder.

Parameters

args – A vector of input tensor ids to reduce across.
collectiveOperator – A Graphcore Communication Library (GCL) collective operator.
commGroup – A GCL CommGroup parameter.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId l1loss(const std::vector<TensorId> &args, const float lambda, const ReductionType reduction = ReductionType::Mean, const DebugContext &debugContext = {})

Add an L1 loss operation to the model.

Calculates the mean absolute error between each element in the input and a zero target.

Parameters

args – A vector of input tensor ids.
lambda – The scale factor of the L1 loss.
reduction – The type of reduction to perform on the individual losses.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
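
A minimal sketch of the arithmetic, assuming ReductionType::Mean and assuming that lambda simply multiplies the reduced value. The helper name l1LossMean is hypothetical and the exact scaling used by PopART should be checked against the implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical reference helper (not a PopART function).
// Assumed formula: lambda * mean(|x_i|), that is, mean absolute error
// against a zero target, scaled by lambda.
double l1LossMean(const std::vector<double> &input, double lambda) {
  assert(!input.empty());
  double sum = 0.0;
  for (double v : input) {
    sum += std::fabs(v);
  }
  return lambda * sum / static_cast<double>(input.size());
}
```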

TensorId nllloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const nonstd::optional<int> ignoreIndex = nonstd::nullopt, bool inputIsLogProbability = false, const DebugContext &debugContext = {})

Add a negative log-likelihood loss operation to the model.

Calculates the negative log likelihood (NLL) loss given a probability tensor over classes, and a target tensor containing class labels.

Parameters

args – A vector of input tensor ids: [probability, target].
reduction – The type of reduction to perform on the individual losses.
ignoreIndex – Optional class index to ignore in loss calculation.
inputIsLogProbability – If true, the input tensor contains log-probabilities, otherwise raw probabilities. Default = false.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId identityloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const DebugContext &debugContext = {})

Add an identity loss operation to the model.

Calculates the loss using the identity operator.

Parameters

args – A vector of input tensor ids.
reduction – The type of reduction to perform on the individual losses.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId tensorremap(const std::vector<TensorId> &args, Attributes::Int remap_type, const DebugContext &debugContext = {})

Add a tensor remap operation to the model.

Changes the tensor layout to conform to the downstream consumers, which means the consumers can read the tensor without having to rearrange it.

Parameters

args – The tensor id of the tensor to remap. This is a single tensor that should be copied to a new tensor with a tensor layout conforming to the downstream consumer.
remap_type – The type of remap to perform on the forward/backward pass. Backward pass remapping requires the op to exist in the IR before autodiff. The value is the integer attribute value of the enum TensorRemapType.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId ctcloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const unsigned blank = 0, const std::string &outDataType = "UNDEFINED", const bool zeroInfinity = false, const DebugContext &debugContext = {})

Add a connectionist temporal classification (CTC) loss operation to the model.

With maximum input length T, batch size N, number of classes C and maximum target length S, this op calculates the CTC loss for a logarithmised probabilities tensor with shape [T, N, C], a class target tensor with shape [N, S], an input lengths tensor [N] and a target lengths tensor [N].

Note that C includes a blank class (default=0). The probabilities tensor is padded as required. Target sequences are also padded and are populated with values less than or equal to C, not including the blank class, up to their respective target lengths. Note that target lengths cannot exceed input lengths.

Parameters

args – A vector of input tensor ids: [log_probs, targets, input_lengths, target_lengths].
reduction – The type of reduction to perform on the individual losses.
blank – The integer representing the blank class.
outDataType – The data type of the output tensors. Default = UNDEFINED.
zeroInfinity – If true, infinite losses and the associated gradients are zeroed-out. Default = false.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

std::vector<TensorId> _ctcloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const unsigned blank = 0, const std::string &outDataType = "UNDEFINED", const bool zeroInfinity = false, const DebugContext &debugContext = {})

std::vector<TensorId> ctcbeamsearchdecoder(const std::vector<TensorId> &args, unsigned blank = 0, unsigned beamWidth = 100, unsigned topPaths = 1, const DebugContext &debugContext = {})

Add a connectionist temporal classification (CTC) beam search decoder operation to the model.

Calculate the most likely topPaths labels and their probabilities given the input logProbs with lengths dataLengths.

Parameters

args – A vector of input tensor ids. These are [logProbs, dataLengths], where logProbs is of shape [maxTime, batchSize, numClasses], and dataLengths is of shape [batchSize].
blank – The integer representing the blank class.
beamWidth – The number of beams to use when decoding.
topPaths – The number of most likely decoded paths to return, must be less than or equal to beamWidth.
debugContext – Optional debug information.

Returns

The names of the result tensors. These are [labelProbs, labelLengths, decodedLabels], where labelProbs is of shape [batchSize, topPaths], labelLengths is of shape [batchSize, topPaths], and decodedLabels is of shape [batchSize, topPaths, maxTime].

TensorId shapeddropout(const std::vector<TensorId> &args, const std::vector<int64_t> &shape, float ratio = 0.5f, const DebugContext &debugContext = {})

Add a shaped dropout operation to the model.

Applies a shaped dropout to the input tensor. This operator requires a shape parameter that is used to define the shape of the dropout mask so that strongly correlated features in the input tensor can be preserved. The provided shape must be broadcastable to the input tensor. Note that this operation targets the poprand library function of the same name.

Parameters

args – A vector of input tensor ids.
shape – The shape of dropout mask. This must be broadcastable to the input.
ratio – The probability of dropping an input feature. Default = 0.5.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId atan2(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an atan2 operation to the model.

Returns the element-wise angle \( \theta \) as a tensor, where \( -\pi < \theta \le \pi \), such that for two input tensors \(x\) and \(y\) and given \( r \ne 0 \), \( x = r \cos\theta \) and \( y = r \sin\theta \), element-wise.

In the case of \( x > 0 \), \( \theta = \arctan(y/x) \).

Parameters

args – A vector of input tensor ids: [y, x].
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
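
The quadrant behaviour can be checked with the C++ standard library, which follows the same [y, x] argument order. The wrapper name atan2Elementwise is hypothetical; this is a host-side sketch and not the PopART op itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not a PopART function).
// Computes theta[i] = atan2(y[i], x[i]) for equally sized inputs.
std::vector<double> atan2Elementwise(const std::vector<double> &y,
                                     const std::vector<double> &x) {
  assert(y.size() == x.size());
  std::vector<double> theta(y.size());
  for (std::size_t i = 0; i < y.size(); ++i) {
    theta[i] = std::atan2(y[i], x[i]); // note the order: y first, then x
  }
  return theta;
}
```

Unlike arctan(y/x), the two-argument form distinguishes all four quadrants: (y, x) = (1, -1) gives an angle in the second quadrant, while (-1, -1) gives one in the third.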

TensorId expm1(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an expm1 operation to the model.

This calculates the element-wise exponential of the input tensor and subtracts one: \( \exp(x) - 1 \).

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId log1p(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a log1p operation to the model.

This calculates the element-wise logarithm of the input tensor plus one: \( \log(x + 1) \).

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId reshape(const TensorId &arg, const Attributes::Ints &shape, const DebugContext &debugContext = {})

Add a reshape operation to the model.

This reshapes an input tensor. This reshape takes the target shape as an attribute instead of a tensor input as for the ONNX reshape op.

Parameters

arg – The tensor id of the input tensor.
shape – The shape of the output tensor. The output tensor must contain the same number of elements as the input tensor.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId fmod(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an fmod operation to the model.

This is equivalent to the C fmod function. The result has the same sign as the dividend.

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor, which contains the element-wise remainder of division. The remainder has the same sign as the dividend.

TensorId remainder(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a remainder operation to the model.

This is equivalent to Python’s modulo operator %. The result has the same sign as the divisor.

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor, which contains the element-wise remainder of division. The remainder has the same sign as the divisor.
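
The difference between the fmod and remainder sign conventions is easiest to see on scalars. The sketch below uses std::fmod for the C behaviour and adjusts its result to obtain the Python-style behaviour; the helper names fmodLike and remainderLike are hypothetical. Note that std::remainder from the C++ standard library implements a third convention (IEEE remainder) and is not what is sketched here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical reference helpers (not PopART functions).

// Result has the same sign as the dividend `a` (C fmod convention).
double fmodLike(double a, double b) {
  assert(b != 0.0);
  return std::fmod(a, b);
}

// Result has the same sign as the divisor `b` (Python % convention).
double remainderLike(double a, double b) {
  assert(b != 0.0);
  double r = std::fmod(a, b);
  // Shift into the divisor's sign range when the signs disagree.
  if (r != 0.0 && ((r < 0.0) != (b < 0.0))) {
    r += b;
  }
  return r;
}
```

For example, with a = -7 and b = 3, the fmod convention yields -1 while the remainder convention yields 2.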

TensorId reverse(const std::vector<TensorId> &args, const std::vector<int64_t> &dimensions, const DebugContext &debugContext = {})

Add a reverse operator to the model.

This reverses or flips the tensor along the specified dimensions.

Parameters

args – A vector of input tensor ids.
dimensions – The dimensions along which to reverse the tensor. If this is empty then this is equivalent to the identity operator.
debugContext – Optional debug information.

Returns

The tensor id of the reversed tensor.

TensorId slice(const std::vector<TensorId> &args, const std::vector<int64_t> &ends, const std::vector<int64_t> &starts, const std::vector<int64_t> &axes = std::vector<int64_t>(), const popart::DebugContext &debugContext = {})

Add a slice to the model.

This version of slice uses the starts, ends and axes attributes rather than tensor inputs. This reduces the number of ops as constant tensors are treated as ops while attributes are not.

Parameters

args – A vector of input tensor ids.
ends – The ends attribute.
starts – The starts attribute.
axes – The axes attribute.
debugContext – Optional debug information.

Returns

The normalized output tensor id.

TensorId packedDataBlock(const std::vector<TensorId> &args, const std::vector<int64_t> &maxSequenceLengths, int64_t resultSize, int64_t callbackBatchSize, const Builder &callback, const DebugContext &debugContext = {})

Add a packedDataBlock operator to the model.

Unpack packed sequences of data and call the callback function on the unpacked sequences.

Parameters

args – A vector of input tensor ids.
maxSequenceLengths – The maximum length of a sequence in each of the data inputs.
resultSize – The size of the first dimension of the result tensor.
callbackBatchSize – The number of batches to pass to the callback.
callback – The callback function.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

void abort(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add an abort operation to the model.

The operation can be conditional or unconditional.

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

TensorId bitwisenot(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise NOT operation to the model.

The operation computes the bitwise NOT of an integer tensor.

Parameters

args – An input tensor of type integer.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId bitwiseand(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise AND operation to the model.

The operation computes the bitwise AND of two integer tensors.

Parameters

args – Two broadcastable input tensors of type integer.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId bitwiseor(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise OR operation to the model.

The operation computes the bitwise OR of two integer tensors.

Parameters

args – Two broadcastable input tensors of type integer.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId bitwisexor(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise XOR operation to the model.

The operation computes the bitwise XOR of two integer tensors.

Parameters

args – Two broadcastable input tensors of type integer.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId bitwisexnor(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a bitwise XNOR operation to the model.

The operation computes the bitwise XNOR of two integer tensors.

Parameters

args – Two broadcastable input tensors of type integer.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

std::vector<TensorId> reducemedian(const std::vector<TensorId> &args, const nonstd::optional<std::vector<int64_t>> &axes = nonstd::nullopt, int64_t keepdims = 1, const DebugContext &debugContext = {})

Add a reducemedian operation to the model.

This method computes the median values along the specified axes. In the case of an even number of elements, the lower of the two medians is selected. By default, the input tensor is reduced over all axes. Additionally, the operation also returns the indices of found median values in the reduction axis. If reduction is performed over multiple axes, the indices are “flattened” over the reduced axes, similar to numpy.ndarray.flat. The index may not be the first occurrence of the median value found in the input tensor.

Parameters

args – A vector with a single input tensor id.
axes – The axes over which the reduction is performed.
keepdims – If 1, the result tensors are of equal size as the input, but with reduction axes of size 1. Otherwise, the reduction axes are squeezed and the result tensors have fewer dimensions compared to the input. Default = 1.
debugContext – Optional debug information.

Returns

The names of the two result tensors, one for median values and one for indices.
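
The "lower of the two medians" rule can be sketched for a single 1-D reduction. The helper name lowerMedian is hypothetical. The sketch returns the first matching index because it uses a stable sort, whereas the operation itself makes no such guarantee.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical reference helper (not a PopART function).
// Returns {median value, index of that value in `values`}.
std::pair<float, std::size_t> lowerMedian(const std::vector<float> &values) {
  assert(!values.empty());
  // Sort indices by value so the original position is retained.
  std::vector<std::size_t> order(values.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return values[a] < values[b];
                   });
  // For an even count, (n - 1) / 2 selects the lower of the two middles.
  const std::size_t mid = (values.size() - 1) / 2;
  return {values[order[mid]], order[mid]};
}
```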

TensorId groupedgather(const std::vector<TensorId> &args, Attributes::Int axis = 0, Attributes::Int group_size = 1, const DebugContext &debugContext = {})

TensorId groupedscatterreduce(const std::vector<TensorId> &args, Attributes::Int axis_size, Attributes::Int axis = -1, ScatterReduction reduction = ScatterReduction::Sum, Attributes::Int group_size = 1, Attributes::Int enable_index_broadcast = 1, const DebugContext &debugContext = {})

Add a grouped scatterreduce operation to the model.

Reduces all the values from the source tensor src at the indices specified along the given axis by index for each group. In some frameworks this is also known as a split-apply-combine operation as well as a reduce or aggregate by key. In this analogy the src input is the data we are splitting and the indices define the groups for the reduction operation.

In pseudocode the operator can be expressed as:

for g in range(group_size):
    for i in range(axis_size):
        output[g][i] = reduce(src[g][index == i])

where the looping over output indices is implicitly handled by Poplar.

Parameters

args – A vector of tensor ids as [src, index, initial_values]. initial_values is optional and if omitted the output will be initialised based on the selected reduction type. For example, a tensor of zeros is used to initialise the output tensor for ScatterReduction::Sum.
axis_size – The size of the reduced axis.
axis – The axis to reduce along. Default = -1.
reduction – The type of reduction to apply. Default = ScatterReduction::Sum.
group_size – The number of groups to reduce. Default = 1.
enable_index_broadcast – If 1, index will be broadcast to match the size of the data tensor; otherwise (0), its size will remain unchanged. Default = 1.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.

TensorId scatterreduce(const std::vector<TensorId> &args, Attributes::Int axis_size, Attributes::Int axis = -1, ScatterReduction reduction = ScatterReduction::Sum, Attributes::Int enable_index_broadcast = 1, const DebugContext &debugContext = {})

Add a scatterreduce operation to the model.

Reduces all the values from the source tensor src at the indices specified along the given axis by index. In some frameworks this is also known as a split-apply-combine operation as well as a reduce or aggregate by key. In this analogy the src input is the data we are splitting and the indices define the groups for the reduction operation.

In pseudocode the operator can be expressed as:

for i in range(axis_size):
    output[i] = reduce(src[index == i])

where the looping over output indices is implicitly handled by Poplar.

Parameters

args – A vector of tensor ids as [src, index, initial_values]. initial_values is optional and if omitted the output will be initialised based on the selected reduction type. For example, a tensor of zeros is used to initialise the output tensor for ScatterReduction::Sum.
axis_size – The size of the reduced axis.
axis – The axis to reduce along. Default = -1.
reduction – The type of reduction to apply. Default = ScatterReduction::Sum.
enable_index_broadcast – If 1, index will be broadcast to match the size of the data tensor; otherwise (0), its size will remain unchanged. Default = 1.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
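
The pseudocode for scatterreduce can be made concrete for the ScatterReduction::Sum case on 1-D inputs. The helper name scatterReduceSum is hypothetical, and the sketch assumes no initial_values input, so the output starts from zeros.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not a PopART function).
// output[i] = sum of src[j] over all j with index[j] == i.
std::vector<float> scatterReduceSum(const std::vector<float> &src,
                                    const std::vector<std::size_t> &index,
                                    std::size_t axisSize) {
  assert(src.size() == index.size());
  std::vector<float> output(axisSize, 0.0f); // zeros initialise a sum
  for (std::size_t j = 0; j < src.size(); ++j) {
    assert(index[j] < axisSize);
    output[index[j]] += src[j];
  }
  return output;
}
```

Output positions that no index refers to keep their initial value, which is zero in this sketch.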

TensorId swish(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a swish operation to the model.

The operation computes the swish activation function, also known as the SiLU activation.

Parameters

args – A vector with a single input tensor id.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
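
For reference, swish (SiLU) is commonly defined as x multiplied by the logistic sigmoid of x. The scalar sketch below uses that definition; the function name swishScalar is hypothetical.

```cpp
#include <cmath>

// Hypothetical reference helper (not a PopART function).
// swish(x) = x * sigmoid(x) = x / (1 + exp(-x))
double swishScalar(double x) {
  return x / (1.0 + std::exp(-x));
}
```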

TensorId incrementmod(const std::vector<TensorId> &args, Attributes::Float increment, Attributes::Float modulus, const DebugContext &debugContext = {})

Add an incrementmod operation to the model.

The operation is of the form y = (x + increment) % modulus.

Parameters

args – A vector with a single input tensor id.
increment – A scalar increment.
modulus – A scalar modulus.
debugContext – Optional debug information.

Returns

The tensor id of the result tensor.
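
A small sketch of the formula y = (x + increment) % modulus, applied element-wise. It uses std::fmod, so the handling of negative values follows the C convention; whether the operation behaves the same way for negative inputs is an assumption. The helper name incrementMod is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical reference helper (not a PopART function).
// Applies y = (x + increment) % modulus to every element.
std::vector<float> incrementMod(std::vector<float> x, float increment,
                                float modulus) {
  assert(modulus != 0.0f);
  for (float &v : x) {
    v = std::fmod(v + increment, modulus);
  }
  return x;
}
```

A typical use is a counter that wraps around: with increment 1 and modulus 4, the values 0, 1, 2, 3 map to 1, 2, 3, 0.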

TensorId bucketize(const std::vector<TensorId> &args, Attributes::Int right = 0, const DebugContext &debugContext = {})

Add a bucketize operation to the model.

The operation returns the indices of the buckets to which each value in the input tensor belongs. The ranges of each bucket are defined by the boundaries tensor. The returned index satisfies the following rules:

right == 1: boundaries[i-1] <= input[m][n]…[l][x] < boundaries[i]
right == 0: boundaries[i-1] < input[m][n]…[l][x] <= boundaries[i]

Parameters

args – A vector of tensor IDs containing [input, boundaries], where:
- input is an N-D tensor or a scalar containing the search values.
- boundaries is a 1-D tensor defining the ranges of the buckets. This must contain a monotonically increasing sequence.
right – If 0 (default), each bucket is open on the left and closed on the right. If 1, each bucket is closed on the left and open on the right.

Returns

The tensor ID of the result tensor. The result tensor has the same size and shape as the input tensor.
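
The two index rules map directly onto binary searches from the C++ standard library. The sketch below handles a single search value; the helper name bucketIndex is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not a PopART function).
// right == false: boundaries[i-1] <  value <= boundaries[i]
// right == true:  boundaries[i-1] <= value <  boundaries[i]
std::size_t bucketIndex(const std::vector<float> &boundaries, float value,
                        bool right) {
  assert(std::is_sorted(boundaries.begin(), boundaries.end()));
  // upper_bound: first boundary strictly greater than value.
  // lower_bound: first boundary greater than or equal to value.
  const auto it =
      right ? std::upper_bound(boundaries.begin(), boundaries.end(), value)
            : std::lower_bound(boundaries.begin(), boundaries.end(), value);
  return static_cast<std::size_t>(it - boundaries.begin());
}
```

The two settings differ only when the value equals a boundary. With boundaries [1, 3, 5] and value 3, right == 0 selects index 1 and right == 1 selects index 2.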

std::vector<TensorId> sort(const std::vector<TensorId> &args, Attributes::Int axis = -1, Attributes::Int descending = 0, Attributes::Int stable = 0, const popart::DebugContext &debugContext = {})

Add a sort operation to the model.

Parameters

args – A vector with a single input tensor id.
axis – The dimension to sort along.
descending – If ‘1’ then the elements are sorted in descending order by value.
stable – If ‘1’ then the sorting routine becomes stable, preserving the order of equivalent elements.

Returns

A vector of (values, indices) is returned, where the values are the sorted values and indices are the indices of the elements in the original input tensor.
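
The (values, indices) pair can be sketched for a 1-D input by sorting indices instead of values. The helper name sortWithIndices is hypothetical, and the sketch is always stable, which corresponds to stable = 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical reference helper (not a PopART function).
// Returns the sorted values and, for each, its index in `input`.
std::pair<std::vector<float>, std::vector<std::size_t>>
sortWithIndices(const std::vector<float> &input, bool descending) {
  std::vector<std::size_t> indices(input.size());
  std::iota(indices.begin(), indices.end(), std::size_t{0});
  // stable_sort keeps equal elements in their original order.
  std::stable_sort(indices.begin(), indices.end(),
                   [&](std::size_t a, std::size_t b) {
                     return descending ? input[a] > input[b]
                                       : input[a] < input[b];
                   });
  std::vector<float> values;
  values.reserve(input.size());
  for (std::size_t i : indices) {
    values.push_back(input[i]);
  }
  return {values, indices};
}
```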

TensorId nearbyint(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a nearby int rounding operation to the model.

Rounds the floating-point argument to an integer value in floating-point format.

Parameters

args – A vector of input tensor ids.
debugContext – Optional debug information.

Returns

The normalized output tensor ids.

std::vector<TensorId> splinebasis(const std::vector<TensorId> &args, Attributes::Int degree = 1, const DebugContext &debugContext = {})

Add a splinebasis operation to the model.

The operation returns two outputs: coefficients for the B-spline basis functions and weight indices for each spline coefficient.

Parameters

args – A vector of tensor IDs containing [pseudo, kernel_size, is_open_spline], where:
- pseudo is a 2-D tensor with pseudo coordinates, of shape [numEdges * numDims].
- kernel_size is a 1-D tensor containing the kernel size at each dimension of the edge pseudo coordinates.
- is_open_spline is a 1-D tensor that for each dimension encodes whether an open or a closed B-spline basis function must be used.
degree – The degree of the B-spline basis function.

Returns

The basis and weightIndex tensors, both of shape [numEdges * numSplines]. basis contains the coefficients for the B-spline basis functions. weightIndex contains weight indices for each spline.

TensorId splineweighting(const std::vector<TensorId> &args, const DebugContext &debugContext = {})

Add a splineweighting operation to the model.

The operation returns features weighted by a continuous B-spline kernel function.

Parameters

args – A vector of tensor IDs containing [input, weight, basis, weightIndex], where:
- input is a 2-D tensor (size: [numEdges * numInputChannels]) with input features.
- weight is a 3-D tensor (size: [numEdges * numInputChannels * numOutputChannels]) containing weights for B-spline functions.
- basis is a 2-D tensor (size: [numEdges * numSplines]) of the coefficients for the B-spline basis functions and is produced by the splinebasis op.
- weightIndex is a 2-D tensor (size: [numEdges * numSplines]) of the weight indices produced by the splinebasis op.

Returns

A tensor of shape [numEdges * numOutputChannels] containing features weighted by a continuous B-spline kernel function.

#include <popart/scope.hpp>

class Scope

Public Functions

inline bool empty() const

void pop()

Scope getCommonParent(const Scope&) const

inline size_t depth() const

bool operator==(const Scope&) const

bool operator!=(const Scope&) const

std::string str() const

Scope operator/(const std::string &name) const

inline operator std::string()

bool isSubscope(const Scope&) const

const std::vector<std::string> getScopeNames() const

Public Static Functions

static inline std::string delimiter()

static Scope getCommonParent(const std::vector<Op*>&)

14.6. Data flow

#include <popart/dataflow.hpp>

enum class popart::AnchorReturnTypeId

Class that defines the identifiers for the return type of the anchor tensors.

An anchor tensor is a tensor that the user wants returned after a call to Session::run(). Each call to Session::run() results in batchesPerStep x accumulationFactor x replicationFactor anchor tensors being computed. The samples associated with each computation are called a micro batch. The dimensions are user-specified with the following parameters:

batchesPerStep is the number of batches per step and the value is obtained from the DataFlow object.
accumulationFactor is the gradient accumulation factor and the value is defined by SessionOptions::accumulationFactor.
replicationFactor is the number of replicas and the value is defined by SessionOptions::replicatedGraphCount.

This enum type describes the strategy with which the micro batch values for anchor tensors (or their summaries) are written to the IStepIO instance passed to Session::run.

NOTE: Anchors are essentially what TensorFlow calls “fetches”.

See also

AnchorReturnType.

Values:

enumerator Final = 0

Only return the tensor value for the last micro batch of the Session::run call for each replica.

The buffer shape required for this anchor in IStepIO is [replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).

enumerator EveryN

Return the tensor value for every N-th global batch for each replica and for all accumulation steps in that global batch.

Note that the value of N is captured by AnchorReturnType.

The buffer shape required for this anchor in IStepIO is [batchesPerStep / N, accumulationFactor, replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).

enumerator All

Return the tensor value for all micro batches for each replica.

The buffer shape required for this anchor in IStepIO is [batchesPerStep, accumulationFactor, replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).

enumerator Sum

Return one tensor value for each replica, doing a sum reduction over the batchesPerStep and accumulationFactor dimensions.

The buffer shape required for this anchor in IStepIO is [replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).
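
The buffer shapes listed for the four enumerators can be summarised in one helper. This is a sketch under stated assumptions: the enum ReturnKind and the function anchorBufferShape are hypothetical stand-ins, batchesPerStep is assumed to be divisible by N, and "dimensions of size 1 removed" is assumed to apply only to the leading batch, accumulation and replication dimensions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mirror of AnchorReturnTypeId, for illustration only.
enum class ReturnKind { Final, EveryN, All, Sum };

// Hypothetical helper: the IStepIO buffer shape for one anchor tensor.
std::vector<std::int64_t>
anchorBufferShape(ReturnKind kind, std::int64_t batchesPerStep,
                  std::int64_t accumulationFactor,
                  std::int64_t replicationFactor,
                  const std::vector<std::int64_t> &anchorTensorShape,
                  std::int64_t n = 1) {
  std::vector<std::int64_t> leading;
  switch (kind) {
  case ReturnKind::Final: // last micro batch only, per replica
  case ReturnKind::Sum:   // summed over batches and accumulation steps
    leading = {replicationFactor};
    break;
  case ReturnKind::EveryN:
    assert(n > 0 && batchesPerStep % n == 0); // assumption of this sketch
    leading = {batchesPerStep / n, accumulationFactor, replicationFactor};
    break;
  case ReturnKind::All:
    leading = {batchesPerStep, accumulationFactor, replicationFactor};
    break;
  }
  std::vector<std::int64_t> shape;
  for (std::int64_t d : leading) {
    if (d != 1) { // drop leading dimensions of size 1
      shape.push_back(d);
    }
  }
  shape.insert(shape.end(), anchorTensorShape.begin(),
               anchorTensorShape.end());
  return shape;
}
```

For example, with batchesPerStep = 4, accumulationFactor = 1, replicationFactor = 2 and an anchor of shape [3], All gives [4, 2, 3] while Final and Sum give [2, 3].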

enum class popart::ExchangeStrategy

Enum type to specify an exchange strategy.

JustInTime:

       .- outer loop -------------.
       |.- inner loop -----------.|
       || load - compute - store ||
       |'------------------------'|
       '--------------------------'

OverlapInnerLoop:

Boxes denote subgraphs / subgraph Ops / loops
Inputs/outputs are loop carried in order

OverlapLoops

Boxes denote subgraphs / subgraph Ops / loops
Numbers on boxes are matching subgraph/loop inputs and outputs

Overlap indicators indicate compute & load/store pairs overlapping in time

         load
           |
        compute   load            load         < overlap
           |        |               |
           1        2               |
       .-- inner loop --.           |
       |   |        |   |           |
       | store  compute |           |          < overlap
       | load       |   |           |          < overlap
       |   |        |   |           |
       '----------------'           |
           2        1      load compute        < overlap
           |        |        |      |
           1        2        3      4

OverlapStep: Not supported yet

Values:

enumerator JustInTime = 0: Copy tensor when required.

enumerator OverlapInnerLoop = 1: Preload values in previous inner loop iteration for the next iteration.

enumerator OverlapLoops = 2: Preload values in the previous loop iteration for the next iteration (implies OverlapInnerLoop).

enumerator OverlapStep = 3: Preload values in the previous host training step for next step (implies OverlapLoops) - not supported yet.

enumerator N = 4: Number of values.

class AnchorReturnType

Class that captures an AnchorReturnTypeId value.

When the value is AnchorReturnTypeId::EveryN, the class also captures the associated N value. The constructor takes std::string values and converts them as appropriate.

Public Functions

AnchorReturnType(): Default constructor for the AnchorReturnType class.

AnchorReturnType(std::string artString, TileSet tileSet = TileSet::Compute, ExchangeStrategy exchangeStrategy = ExchangeStrategy::JustInTime)

Constructor for the AnchorReturnType class.

NOTE: Attempting to construct an AnchorReturnType for AnchorReturnTypeId::EveryN using this constructor will result in an error. Use AnchorReturnType(std::string,int,TileSet,ExchangeStrategy), which also specifies the return period.

Parameters

artString – The string to convert to an AnchorReturnTypeId value. The following values are acceptable (case insensitive):
- “final” = AnchorReturnTypeId::Final
- “all” = AnchorReturnTypeId::All
- “sum” = AnchorReturnTypeId::Sum
tileSet – (Optional) The type of the tile set. Default: TileSet::Compute.
exchangeStrategy – (Optional) The overlap strategy (between IO and compute) for anchor tensors. Default: ExchangeStrategy::JustInTime.

AnchorReturnType(std::string artString, int returnPeriod, TileSet tileSet = TileSet::Compute, ExchangeStrategy exchangeStrategy = ExchangeStrategy::JustInTime)

Constructor for the AnchorReturnType class.

Parameters

artString – The string to convert to an AnchorReturnTypeId value. The following values are acceptable (case insensitive):
- “final” = AnchorReturnTypeId::Final
- “all” = AnchorReturnTypeId::All
- “sum” = AnchorReturnTypeId::Sum
returnPeriod – The value of N in the case of AnchorReturnTypeId::EveryN.
tileSet – (Optional) The type of the tile set. Default: TileSet::Compute.
exchangeStrategy – (Optional) The overlap strategy (between IO and compute) for anchor tensors. Default: ExchangeStrategy::JustInTime.
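
The string conversion described for the two constructors can be pictured with a small, self-contained sketch. It does not use the PopART headers: ReturnTypeId, ParsedReturnType, toLower and parseReturnType are hypothetical stand-ins, and the acceptance of the string "everyn" together with a return period is an assumption based on the note about AnchorReturnTypeId::EVERYN above.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Stand-in for popart::AnchorReturnTypeId (hypothetical, not the PopART type).
enum class ReturnTypeId { Final, EveryN, All, Sum };

// Parsed result: the ID plus the return period (only meaningful for EveryN).
struct ParsedReturnType {
  ReturnTypeId id;
  int returnPeriod; // 0 when no period applies
};

// Lower-case a copy of the input so that matching is case insensitive.
inline std::string toLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Convert a string to a return type. A period of 0 means "not supplied".
// Assumption: "everyn" is the string form of EVERYN and needs a period.
inline ParsedReturnType parseReturnType(const std::string &artString,
                                        int returnPeriod = 0) {
  assert(returnPeriod >= 0 && "a return period cannot be negative");
  const std::string key = toLower(artString);
  if (key == "final") return ParsedReturnType{ReturnTypeId::Final, 0};
  if (key == "all") return ParsedReturnType{ReturnTypeId::All, 0};
  if (key == "sum") return ParsedReturnType{ReturnTypeId::Sum, 0};
  if (key == "everyn") {
    if (returnPeriod == 0) {
      throw std::invalid_argument("EveryN requires a return period");
    }
    return ParsedReturnType{ReturnTypeId::EveryN, returnPeriod};
  }
  throw std::invalid_argument("unknown anchor return type: " + artString);
}
```

In this sketch, parseReturnType("ALL") and parseReturnType("all") give the same result, while parseReturnType("EveryN") without a period throws, mirroring the error described in the note.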

inline const std::string &str() const: Get a string of AnchorReturnTypeId.

inline const TileSet &tileSet() const: Get the type of the tile set.

inline const ExchangeStrategy &exchangeStrategy() const: Get the type of overlap strategy.

class DataFlow

This class specifies parameters for host-device data streams.

The parameters are used to control the amount of input data processed in each step, that is, in each Session::run call. The parameters also determine how data is returned to the user.

See also

AnchorReturnType, AnchorReturnTypeId.

Public Functions

DataFlow()

Default constructor.

This constructor sets batchesPerStep to 0 and does not have any anchor tensors.

DataFlow(int batchesPerStep)

Construct a DataFlow instance without anchor tensors.

Parameters: batchesPerStep – The number of global batches to run in the inference or training session for each call to Session::run before returning control to the caller.

DataFlow(int batchesPerStep, const AnchorReturnTypeMap &anchorMap)

Construct a DataFlow instance with anchor tensors.

Parameters

batchesPerStep – The number of global batches to run in the inference or training session for each call to Session::run before returning control to the caller.
anchorMap – A mapping from output tensor TensorId to AnchorReturnType indicating the strategy with which to write the anchor tensor values to the IStepIO object provided to Session::run.

DataFlow(int batchesPerStep, const std::vector<TensorId> anchorTensorIds, const AnchorReturnType &anchorReturnType = AnchorReturnType("All"))

Construct a DataFlow instance with anchor tensors.

Parameters

batchesPerStep – The number of global batches to run in the inference or training session for each call to Session::run before returning control to the caller.
anchorTensorIds – The tensor IDs of the anchor tensors.
anchorReturnType – The strategy with which to write anchor tensor values to the IStepIO object provided to Session::run.
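
One way to read the relationship between the last two constructors is that a list of tensor IDs plus a single return type expands to a map in which every listed tensor uses that return type. The sketch below illustrates that reading only. It is self-contained and does not use the PopART headers: makeAnchorMap is a hypothetical helper, TensorId is modelled as std::string, and the return type is modelled as a plain string name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-ins for the PopART types (assumptions made for this sketch).
using TensorId = std::string;
using ReturnTypeName = std::string;

// Expand "these tensors, one strategy" into a per-tensor map, analogous in
// shape to AnchorReturnTypeMap = std::map<TensorId, AnchorReturnType>.
inline std::map<TensorId, ReturnTypeName>
makeAnchorMap(const std::vector<TensorId> &anchorTensorIds,
              const ReturnTypeName &anchorReturnType = "All") {
  assert(!anchorReturnType.empty() && "a return type name is required");
  std::map<TensorId, ReturnTypeName> anchorMap;
  for (const auto &id : anchorTensorIds) {
    anchorMap[id] = anchorReturnType; // every anchor shares the same strategy
  }
  return anchorMap;
}
```

The map form is the one to build by hand when different anchors need different strategies, for example the final value of a loss but every value of an output.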

DataFlow(const DataFlow &rhs) = default

inline void setBatchesPerStep(const int batchesPerStep): Set the value for batchesPerStep.

class InputSettings

Class that describes the TileSet, ExchangeStrategy, and ReplicatedStreamMode used for an input tensor.

Public Functions

InputSettings(): Constructor for the InputSettings class.

InputSettings(TileSet tileSet, ExchangeStrategy exchangeStrategy)

Constructor for the InputSettings class.

Parameters

tileSet – The type of the tile set.
exchangeStrategy – The overlap strategy (between IO and compute) for the input tensor.

InputSettings(ReplicatedStreamMode replicatedStreamMode)

Constructor for the InputSettings class.

Parameters: replicatedStreamMode – The mode used for the replicated stream.

inline const TileSet &tileSet() const: Get the type of the tile set.

inline const ExchangeStrategy &exchangeStrategy() const: Get the type of overlap strategy.

inline ReplicatedStreamMode replicatedStreamMode() const: Get the mode of the replicated stream.

inline void setTileSet(TileSet tileSet)

Set the type of the tile set.

Parameters: tileSet – The type of the tile set.

inline void setExchangeStrategy(ExchangeStrategy exchangeStrategy)

Set the overlap strategy (between IO and compute).

Parameters: exchangeStrategy – The overlap strategy.

inline void setReplicatedStreamMode(ReplicatedStreamMode streamMode)

Set the mode used for the replicated stream.

Parameters: streamMode – The mode used for the replicated stream.

using popart::AnchorReturnTypeMap = std::map<TensorId, AnchorReturnType>

#include <popart/replicatedstreammode.hpp>

enum class popart::ReplicatedStreamMode

Values:

enumerator Broadcast

enumerator Replicate

14.7. Device manager

#include <popart/devicemanager.hpp>

enum class popart::DeviceType

Defines the type of device to use for graph compilation and execution.

Values:

enumerator IpuModel = 0

Use the Poplar IPU Model for graph compilation and execution.

The IPU Model will simulate the behaviour of the IPU hardware. It will not completely implement every aspect of a real IPU. (Default).

enumerator Cpu: Use CPU for graph compilation and execution.

enumerator Ipu: Use IPU for graph execution.

enumerator OfflineIpu

Compile graph for later execution.

This can be done even if IPUs are not present. Offline graph compilation is also useful for verifying memory constraints.

enumerator Sim: [For Graphcore internal use only] Use a simulator for graph compilation and execution.

enum class popart::DeviceConnectionType

Controls when to connect to the IPU (if at all).

Values:

enumerator Always = 0: Attach to the IPU from the start (Default).

enumerator OnDemand: Wait until the compilation is complete and the executable is ready to be run before attaching to the IPU.

enumerator Never

Never try to attach to an IPU.

This is useful for offline compilation (DeviceType::OfflineIpu). Trying to run an executable will throw an error.

enum class popart::SyncPattern

Controls synchronisation in multi-IPU systems.

Values:

enumerator Full = 0: Require all IPUs to synchronise on every communication between IPUs or between IPUs and host (Default).

enumerator SinglePipeline

Allow IPUs to synchronise with the host independently, without having to synchronise with each other.

This permits any one IPU to perform host IO while other IPUs are processing data.

enumerator ReplicaAndLadder

Allow an IPU group to communicate with the host without requiring synchronisation between groups.

This permits multiple IPU groups to alternate between performing host IO and computation.

class DeviceInfo

Represents a specific device.

Subclassed by popart::popx::DevicexInfo, popart::popx::DevicexOfflineIpuInfo

Public Functions

DeviceInfo(DeviceType _type, DeviceConnectionType _connectionType, const poplar::OptionFlags &_flags)

Constructor for the DeviceInfo class.

Parameters

_type – The type of the device.
_connectionType – The setting for when to connect to the device, if at all.
_flags – A set of Poplar option/value string flags.

virtual ~DeviceInfo(): Destructor for DeviceInfo.

virtual bool attach() = 0

Attach to the device.

Returns: true if successfully attached to the device, false otherwise.

virtual void detach() = 0: Detach from the device.

virtual bool isAttached() const = 0

Check if attached to the device.

Returns: true if attached to the device, false otherwise.

inline DeviceType getType() const

Get the type of the device.

Returns: The type of the device.

inline DeviceConnectionType getConnectionType() const

Get the setting for when to connect to the device.

Returns: The setting for when to connect to the device.

std::string toString() const: Return a description of the device.

virtual int getId() const = 0: Get the device id.

virtual std::vector<int> getChildIds() const = 0

Get the child device IDs.

The value returned by getId() for a multi-IPU device is a ‘parent ID’ and does not relate to the IDs of the devices it comprises. This function, in the case of real devices, uses the Poplar API to work out which single-IPU device IDs it relates to. In the case of replication, a device includes all IPUs involved, so a 2-IPU model with 2x replication would expect to have 4 child IDs returned here.

virtual std::string getVersion() const = 0: Get the version of the software on the IPU.

virtual int getNumIpus() const = 0: Get the number of IPUs in the device.

virtual int getTilesPerIPU() const = 0: Get the number of tiles per IPU.

virtual int getNumWorkerContexts() const = 0: Get the number of worker contexts per tile.

virtual std::string getIpuVersion() const = 0: Get the IPU version.

virtual std::vector<unsigned> getDriverIds() const = 0: Get the version of the drivers on the IPU.

virtual const poplar::Target &getTarget() const = 0: Get the Poplar target.

inline virtual bool canCompileOffline() const

Get whether the device supports offline compilation.

Returns: true if the device supports offline compilation, otherwise false.

const poplar::OptionFlags &getOptionFlags() const

void setOnDemandAttachTimeout(const unsigned seconds)

Set timeout (in seconds) for trying to attach to a device.

If unable to attach to a device on the first try, the DeviceManager instance will periodically try to attach to the device until successfully attached or this timeout is reached.

Note

This only applies when trying to attach with DeviceConnectionType::OnDemand.

Parameters: seconds – The timeout (in seconds) for trying to attach to the device.

inline const unsigned &getOnDemandAttachTimeout() const

Get timeout (in seconds) for trying to attach to a device.

Returns: The timeout (in seconds) for trying to attach to the device.

bool tryAttachUntilTimeout(): Periodically try to attach to the device until either the attach timeout is reached or successfully attached.
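
The retry behaviour described for the on-demand attach timeout can be sketched as a simple polling loop. The code below is a self-contained illustration, not the PopART implementation: attachUntilTimeout is a hypothetical free function, the attach callable stands in for DeviceInfo::attach(), and the pollInterval parameter is an assumption, since the text does not say how often PopART retries.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Call `attach` repeatedly until it succeeds or `timeout` has elapsed.
// Returns true on a successful attach, false if the deadline passes first.
inline bool attachUntilTimeout(const std::function<bool()> &attach,
                               std::chrono::milliseconds timeout,
                               std::chrono::milliseconds pollInterval) {
  assert(pollInterval.count() > 0 && "the poll interval must be positive");
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    if (attach()) {
      return true; // attached, stop polling
    }
    if (std::chrono::steady_clock::now() >= deadline) {
      return false; // timeout reached without attaching
    }
    std::this_thread::sleep_for(pollInterval);
  }
}
```

The loop always makes at least one attempt, so a timeout of zero degenerates to a single call to attach.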

bool isHwCompatible() const

void writeToDeviceAccessLog(const std::string &event, const std::map<std::string, std::string> &auxKeyVals = {})

Log an event for device debugging purposes.

This event will get logged to the file location defined by the environment variable POPART_LOG_DEVICE_ACCESS_IN_TESTS, if it is set.

Parameters

event – A text description of the event to be written to the log.
auxKeyVals – Optional additional parameters to log.

class DevicexInfo : public popart::DeviceInfo 

Subclassed by popart::popx::DevicexCpuInfo, popart::popx::DevicexIpuInfo, popart::popx::DevicexIpuModelInfo, popart::popx::DevicexSimInfo

Public Functions

inline DevicexInfo(popart::DeviceType _type, popart::DeviceConnectionType _connectionType, poplar::Device &_device, const poplar::OptionFlags &_flags)

~DevicexInfo() override

bool attach() override

void detach() override

inline int getNumIpus() const override

inline int getTilesPerIPU() const override

inline int getNumWorkerContexts() const override

inline std::vector<unsigned> getDriverIds() const override

inline const poplar::Device &getDevice() const

inline const poplar::Target &getTarget() const override

inline std::string getIpuVersion() const override

inline bool isAttached() const override

virtual void setMostRecentlyLoaded(Devicex *devicex): Mark devicex as the last one that was loaded.

virtual bool isMostRecentlyLoaded(const Devicex *devicex) const: Check if Devicex was the last one that was loaded.

class DevicexCpuInfo : public popart::popx::DevicexInfo 

Public Functions

inline DevicexCpuInfo(poplar::Device &_device)

inline int getId() const override

inline std::vector<int> getChildIds() const override

inline std::string getVersion() const override

class DevicexIpuInfo : public popart::popx::DevicexInfo 

Public Functions

inline DevicexIpuInfo(popart::DeviceConnectionType _dct, int _id, poplar::Device &_device, const poplar::OptionFlags &_flags)

inline int getId() const override

std::vector<int> getChildIds() const override

std::string getVersion() const override

inline bool canCompileOffline() const override

class DevicexIpuModelInfo : public popart::popx::DevicexInfo 

Public Functions

inline DevicexIpuModelInfo(poplar::Device &_device, const std::string _ipuVersion)

inline int getId() const override

inline std::vector<int> getChildIds() const override

inline std::string getVersion() const override

class DevicexSimInfo : public popart::popx::DevicexInfo 

Public Functions

inline DevicexSimInfo(poplar::Device &_device)

inline int getId() const override

inline std::vector<int> getChildIds() const override

inline std::string getVersion() const override

class DevicexOfflineIpuInfo : public popart::DeviceInfo 

Public Functions

inline DevicexOfflineIpuInfo(poplar::Target &_target, const poplar::OptionFlags &_flags)

inline bool attach() override

inline void detach() override

inline int getId() const override

inline std::vector<int> getChildIds() const override

inline std::string getVersion() const override

inline int getNumIpus() const override

inline int getTilesPerIPU() const override

inline int getNumWorkerContexts() const override

inline std::string getIpuVersion() const override

inline std::vector<unsigned> getDriverIds() const override

inline const poplar::Target &getTarget() const override

inline bool canCompileOffline() const override

inline bool isAttached() const override

class DeviceManager

A class to manage devices.

Public Functions

DeviceManager(const DeviceManager&) = default

~DeviceManager() = default

void registerDeviceProvider(DeviceProvider *provider)

Register a device provider.

Parameters: provider – The device provider to be registered with the device manager.

virtual void enumerate(std::vector<std::shared_ptr<popart::DeviceInfo>> &devices, unsigned requiredNumIPUs, SyncPattern syncPattern, DeviceType type, DeviceConnectionType connectionType, uint32_t requiredTilesPerIPU)

Get the list of all devices that satisfy the specified criteria.

Parameters

devices – The list of devices.
requiredNumIPUs – The number of IPUs required.
syncPattern – The setting for when to synchronise in a multi-IPU system.
type – The type of the device to use for compilation and execution.
connectionType – The setting for when to connect to the device.
requiredTilesPerIPU – The number of tiles per IPU required.

std::vector<std::shared_ptr<DeviceInfo>> enumerateDevices(SyncPattern pattern = SyncPattern::Full, int numIpus = 1, DeviceType deviceType = DeviceType::Ipu, DeviceConnectionType connectionType = DeviceConnectionType::Always, int tilesPerIPU = 0)

Get the list of all devices with the required criteria.

Parameters

pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
numIpus – The number of IPUs required. (Default: 1).
deviceType – The type of the device required. (Default: DeviceType::Ipu).
connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).
tilesPerIPU – The number of tiles per IPU required. (Default: 0).

Returns

The list of devices with the required criteria.

std::shared_ptr<DeviceInfo> getDevice(SyncPattern syncPattern = SyncPattern::Full, uint32_t deviceManagerId = 0, DeviceConnectionType connectionType = DeviceConnectionType::Always)

Get a device with the required criteria.

Parameters

syncPattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
deviceManagerId – The ID of the requested device. (Default: 0)
connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).

Returns

A device, which can be used with a session. If no device is acquired, a nullptr is returned.

std::shared_ptr<DeviceInfo> tryAcquireAvailableDevice(int numIpus = 1, int tilesPerIPU = 0, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always, DeviceSelectionCriterion selectionCriterion = DeviceSelectionCriterion::First)

Finds an available hardware device, with the specified number of IPUs.

This method will attach to the device if connectionType is equal to DeviceConnectionType::Always. This method is suitable when polling for an available device when resources are constrained.

Parameters

numIpus – The number of IPUs on the device (Default: 1).
tilesPerIPU – The number of tiles per IPU. An input of 0 will match any number. (Default: 0).
pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).
selectionCriterion – The method for selecting a device from the list of valid selections. (Default: DeviceSelectionCriterion::First).

Returns

A device, which can be used with a session. If no device is acquired, a nullptr is returned.

std::shared_ptr<DeviceInfo> acquireAvailableDevice(int numIpus = 1, int tilesPerIPU = 0, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always, DeviceSelectionCriterion selectionCriterion = DeviceSelectionCriterion::First)

Finds an available hardware device, with a certain number of IPUs.

This method will attach to the device if connectionType is equal to DeviceConnectionType::Always. Throws an error if there are fewer than numIpus IPUs available.

Parameters

numIpus – The number of IPUs on the device (Default: 1).
tilesPerIPU – The number of tiles per IPU. An input of 0 will match any number. (Default: 0).
pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).
selectionCriterion – The method for selecting a device from the list of valid selections. (Default: DeviceSelectionCriterion::First).

Returns

A device, which can be used with a session.
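
The difference between the two acquisition methods is how failure is reported: the "try" variant returns a nullptr, while the other variant throws. The sketch below models that contract with stand-in types so that it compiles without PopART. FakeDevice, tryAcquire and acquire are hypothetical names, and picking the first match is meant to resemble DeviceSelectionCriterion::First.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <vector>

// Minimal stand-in for a device entry (not popart::DeviceInfo).
struct FakeDevice {
  int id;
  int numIpus;
  int tilesPerIpu;
  bool inUse;
};

using DevicePtr = std::shared_ptr<FakeDevice>;

// "try" flavour: return the first free matching device, or nullptr.
// A tilesPerIpu of 0 matches any number of tiles.
inline DevicePtr tryAcquire(std::vector<DevicePtr> &pool, int numIpus,
                            int tilesPerIpu = 0) {
  assert(numIpus > 0 && "at least one IPU must be requested");
  for (auto &dev : pool) {
    const bool tilesMatch =
        tilesPerIpu == 0 || dev->tilesPerIpu == tilesPerIpu;
    if (!dev->inUse && dev->numIpus == numIpus && tilesMatch) {
      dev->inUse = true; // models attaching to the device
      return dev;
    }
  }
  return nullptr;
}

// Throwing flavour: same search, but failure is an error.
inline DevicePtr acquire(std::vector<DevicePtr> &pool, int numIpus,
                         int tilesPerIpu = 0) {
  DevicePtr dev = tryAcquire(pool, numIpus, tilesPerIpu);
  if (!dev) {
    throw std::runtime_error("no available device matches the request");
  }
  return dev;
}
```

The nullptr-returning form suits a polling loop in a shared system, where "nothing free yet" is an expected outcome and not an error.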

std::shared_ptr<DeviceInfo> tryAcquireDeviceById(int id, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always)

Allocates the hardware device by ID.

This ID can be found by running gc-info -l. This method will try to attach to the device if connectionType is equal to DeviceConnectionType::Always. This method is suitable when polling for an available device when resources are constrained.

Parameters

id – The ID of the IPU to be used.
pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
connectionType – The connection type, for deciding when to attach to the device. (Default: DeviceConnectionType::Always).

Returns

A device, which can be used with a session. If no device is acquired, a nullptr is returned.

std::shared_ptr<DeviceInfo> acquireDeviceById(int id, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always)

Allocates the hardware device by ID.

This ID can be found by running gc-info -l. This method will attach to the device if connectionType is equal to DeviceConnectionType::Always.

Parameters

id – The ID of the IPU to be used.
pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
connectionType – The connection type, for deciding when to attach to the device. (Default: DeviceConnectionType::Always).

Returns

A device, which can be used with a session.

std::shared_ptr<DeviceInfo> createHostDevice(DeviceType type, const std::map<std::string, std::string> &options)

Create a simulated device on the host for testing purposes.

Parameters

type – The type of device to simulate.
options – The configuration settings for the host device.

Returns

The requested device for testing purposes.

std::shared_ptr<DeviceInfo> createCpuDevice()

Create a simulated CPU device for testing purposes.

Returns: A simulated CPU device.

std::shared_ptr<DeviceInfo> createIpuModelDevice(const std::map<std::string, std::string> &options)

Create a simulated IpuModel device for testing purposes.

The following options are supported:

numIPUs: The number of IPUs to simulate (Default: 1).
tilesPerIPU: The number of tiles per IPU (Default: defaultFewTiles).
compileIPUCode: Indicate whether or not to compile real IPU code for modelling.

Parameters: options – Configuration settings for the IPU Model.
Returns: A device.
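
Because the options are passed as a std::map<std::string, std::string>, numeric settings have to be converted to strings first. The helper below is a minimal sketch of that step. makeIpuModelOptions is a hypothetical name, the key for the tiles-per-IPU option is assumed to be tilesPerIPU, and the decimal string encoding of the values is also an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

// Build an options map of the shape accepted by createIpuModelDevice().
// Assumptions: the option keys are "numIPUs" and "tilesPerIPU", and the
// values are plain decimal strings.
inline std::map<std::string, std::string>
makeIpuModelOptions(int numIpus, int tilesPerIpu) {
  assert(numIpus > 0 && tilesPerIpu > 0 && "both counts must be positive");
  return {
      {"numIPUs", std::to_string(numIpus)},
      {"tilesPerIPU", std::to_string(tilesPerIpu)},
  };
}
```

A map built this way would then be passed as the options argument, for example makeIpuModelOptions(2, 4) for a two-IPU model with four tiles per IPU.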

std::shared_ptr<DeviceInfo> createSimDevice(const std::map<std::string, std::string> &options)

std::shared_ptr<DeviceInfo> createOfflineIPUDevice(const std::map<std::string, std::string> &options)

Create a simulated OfflineIpu device for testing purposes.

This resembles an IPU and is used for offline compilation.

The following options are supported:

numIPUs: The number of IPUs to compile for.
tilesPerIPU: The number of tiles per IPU (Default: defaultManyTiles).
ipuVersion: The IPU architecture (Default: “ipu2”).
syncPattern: The setting for synchronisation in a multi-IPU system.

Parameters: options – Configuration settings for the OfflineIpu device.
Returns: A simulated OfflineIpu device.

std::shared_ptr<DeviceInfo> createOfflineIpuFromDeviceInfo(const DeviceInfo &deviceInfo)

Create a simulated OfflineIpu device from the description of another device.

Parameters: deviceInfo – The device to create an OfflineIpu version of.
Returns: An OfflineIpu device.

std::shared_ptr<DeviceInfo> createOfflineIpuFromSystemString(const std::string &system, uint32_t numIpus)

Create a simulated OfflineIpu device from the name of a system.

Parameters

system – The name of the system to create an OfflineIpu version of.
numIpus – The number of IPUs. Providing 0 corresponds to all IPUs in the system.

Returns

An OfflineIpu device.

void setOnDemandAttachTimeout(const unsigned seconds)

Set the timeout (in seconds) for trying to attach to a device. If the first attempt to attach to a device fails, this is the length of time for which the DeviceManager will keep trying to attach.

Note: This only takes effect when trying to attach with DeviceConnectionType::OnDemand.

Parameters: seconds – The attach timeout in seconds.

Public Static Functions

static DeviceManager &createDeviceManager()

Accessor for the device manager.

Returns: A reference to the DeviceManager instance.

class DeviceProvider

The interface for device providers which are registered with the device manager.

Subclassed by popart::popx::DevicexManager

Public Functions

inline virtual ~DeviceProvider(): Destructor for DeviceProvider.

virtual std::shared_ptr<DeviceInfo> getDevice(SyncPattern syncPattern, unsigned deviceManagerId, DeviceConnectionType connectionType) = 0

Get the device that satisfies the specified criteria.

Throws an error if the connection type is DeviceConnectionType::Never.

Parameters

syncPattern – The setting for synchronisation on multi-IPU systems.
deviceManagerId – The ID of the requested device.
connectionType – The setting for when to connect to the device.

Returns

The device that satisfies the specified criteria.

virtual void enumerate(std::vector<std::shared_ptr<DeviceInfo>> &devices, uint32_t requiredNumIPUs, SyncPattern syncPattern, DeviceType type, DeviceConnectionType connectionType, uint32_t requiredTilesPerIPU) = 0

Get the list of all devices that satisfy the specified criteria.

Parameters

devices – The list of devices.
requiredNumIPUs – The number of IPUs required.
syncPattern – The setting for when to synchronise in a multi-IPU system.
type – The type of the device to use for compilation and execution.
connectionType – The setting for when to connect to the device.
requiredTilesPerIPU – The number of tiles per IPU required.

virtual std::shared_ptr<DeviceInfo> createHostDevice(DeviceType type, const std::map<std::string, std::string> &options, SyncPattern syncPattern = SyncPattern::Full) = 0

Create a host device for testing.

Parameters

type – The type of the device to use for compilation and execution.
options – The configuration for the created device. See createCpuDevice(), createIpuModelDevice(), createOfflineIPUDevice() and createSimDevice() for more information about options.
syncPattern – The setting for when to synchronise in a multi-IPU system.

Returns

The device for use in testing.

virtual std::shared_ptr<DeviceInfo> createOfflineIpuFromDeviceInfo(const DeviceInfo &deviceInfo) = 0

virtual std::shared_ptr<DeviceInfo> createOfflineIpuFromSystemString(const std::string &system, uint32_t numIpus) = 0

class DevicexManager : public popart::DeviceProvider 

Public Functions

DevicexManager()

std::shared_ptr<DeviceInfo> getDevice(SyncPattern syncPattern, uint32_t deviceManagerId, DeviceConnectionType connectionType) override

void enumerate(std::vector<std::shared_ptr<popart::DeviceInfo>> &devices, unsigned requiredNumIPUs, SyncPattern syncPattern, DeviceType type, DeviceConnectionType connectionType, uint32_t requiredTilesPerIPU) override

std::shared_ptr<popart::DeviceInfo> createHostDevice(popart::DeviceType type, const std::map<std::string, std::string> &options, SyncPattern syncPattern = SyncPattern::Full) override

std::shared_ptr<DeviceInfo> createOfflineIpuFromDeviceInfo(const DeviceInfo &deviceInfo) override

std::shared_ptr<DeviceInfo> createOfflineIpuFromSystemString(const std::string &system, uint32_t numIpus) override

#include <popart/popx/devicex.hpp>

class Devicex

Public Functions

const Ir &ir() const

const IrLowering &lowering() const

IrLowering &lowering()

Devicex(Executablex &exe, std::shared_ptr<DeviceInfo> deviceInfo)

~Devicex()

void prepare()

void weightsFromHost()

void buffersFromHost()

void remoteBufferWeightsFromHost(const bool isUpdate = false)

void optimizerFromHost()

void setRandomSeedFromHost()

uint64_t getRandomSeedToHost()

void setRngStateFromHost()

std::vector<uint32_t> getRngStateToHost()

void setRngStateValue(const std::vector<uint32_t>)

std::map<std::string, std::vector<uint64_t>> cycleCountTensorToHost()

void run(IStepIO&, std::string debugName = "")

void run(std::string programHandle, IStepIO&, std::string debugName = "")

void weightsToHost()

void remoteBufferWeightsToHost()

void weightsToHost(const std::map<TensorId, MutableVoidData>&)

void popxlWeightsToTensorData()

Copy data from the device, to the host buffers, to the tensor.tensorData() buffers.

Will not run a WeightsToHost program if the weights are already in sync with the IPU. After WeightsToHost, marks the weights as in sync with the IPU.

void popxlMarkHostWeightsOutOfSync(): Mark the d2hWeightBuffers as out of sync with the IPU.

void popxlMarkHostWeightsInSync(): Mark the d2hWeightBuffers as in sync with the IPU.

bool popxlAreHostWeightsInSync(): Check whether all the weights are in sync with the IPU.
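
The in-sync bookkeeping described for the popxl weight functions amounts to a dirty flag that guards the device-to-host copy. The sketch below shows that pattern in isolation. HostWeightCache and transfersForReadUpdateRead are hypothetical names, and the transfer counter merely stands in for running the WeightsToHost program. None of this is the Devicex implementation.

```cpp
#include <cassert>

// Tracks whether the host copy of the weights matches the device copy.
class HostWeightCache {
public:
  // Copy the weights to the host only when the host copy is stale.
  void weightsToTensorData() {
    if (inSync_) {
      return; // already up to date, skip the transfer
    }
    ++transfers_; // stands in for running the WeightsToHost program
    inSync_ = true;
  }
  void markOutOfSync() { inSync_ = false; }
  void markInSync() { inSync_ = true; }
  bool areInSync() const { return inSync_; }
  int transfers() const { return transfers_; }

private:
  bool inSync_ = false;
  int transfers_ = 0;
};

// Walk-through: read, read again, update on device, read once more.
inline int transfersForReadUpdateRead() {
  HostWeightCache cache;
  cache.weightsToTensorData(); // stale: performs a transfer
  cache.weightsToTensorData(); // in sync: no transfer
  cache.markOutOfSync();       // e.g. after the device changed the weights
  cache.weightsToTensorData(); // stale again: performs a transfer
  assert(cache.areInSync());
  return cache.transfers();
}
```

In the walk-through only two of the three reads trigger a transfer, which is the saving the in-sync flag is there to provide.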

void readWeights(const IWeightsIO &dst)

void writeWeights(const IWeightsIO &src)

std::string getSummaryReport(bool resetProfile = true) const

std::string getSerializedGraph() const

pva::Report getReport() const

bool isEngineLoaded() const

void setEngineIsLoaded(bool isLoaded)

void connectRandomSeedStream()

void connectRngStateStream()

void connectStreamToCallback(const std::string &streamHandle, std::function<void(void*)> callback, unsigned index)

void connectStream(const std::string &streamHandle, void *host_buffer)

void connectHostFunction(const std::string &functionHandle, std::function<void(const void*const*, size_t, void*const*, size_t)> callback, unsigned index)

void copyFromRemoteBuffer(const PopStreamId buffer, void *w, int repeat_index, unsigned replication_index = 0)

void copyToRemoteBuffer(void *w, const PopStreamId buffer, int repeat_index, unsigned replication_index = 0)

unsigned getReplicationFactor() const

unsigned getAccumulationFactor() const

unsigned getGlobalReplicaOffset() const

unsigned getGlobalReplicationFactor() const

bool isReplicatedGraph() const

inline const DeviceInfo *getDeviceInfo() const

inline DeviceInfo *getDeviceInfo()

inline void setDeviceInfo(std::shared_ptr<DeviceInfo> deviceInfo_)

std::set<TensorId> getLinearlyCreatedInputTensors() const

std::set<TensorId> getEfficientlyCreatedInputTensors() const

inline bool prepareHasBeenCalled() const

void loadEngineAndConnectStreams()

void serializeExecutable(std::ostream &out, bool serializePopartMetadata, bool serializeTensorData)

void serializeExecutable(const std::string &path, bool serializePopartMetadata, bool serializeTensorData)

void serializeTensorData(const std::string &path)

Public Members

poplin::PlanningCache convCache

poplin::matmul::PlanningCache matmulCache

bool prePlanConvolutions = true

bool prePlanMatMuls = true

Friends

friend class serialization::WriterImpl

typedef std::string popart::popx::PopStreamId

class Executablex

Public Functions

Executablex(IrLowering &ir_lowering_)

Executablex(IrLowering &ir_lowering_, std::unordered_map<TensorId, std::unique_ptr<Tensor>> &&tensorMap, std::map<TensorId, CollectiveBalancedReorderId> &&cbrIdMap, std::map<CollectiveBalancedReorderId, gcl::CollectiveBalancedHostRearrangement> &&cbrMap)

IrLowering &lowering()

const IrLowering &lowering() const

const Ir &ir() const

inline bool isDeserialized() const

bool shouldSerialize()

bool containsTensor(const TensorId &id) const

Tensor *getTensor(const TensorId&)

const Tensor *getTensor(const TensorId&) const

std::set<TensorId> getAllTensorIds()

std::vector<TensorId> getTensorIds(TensorType)

void setRandomSeedValue(uint64_t value)

void resetWeights(const ONNX_NAMESPACE::ModelProto &modelProto, const bool ignoreWeightsInModelWithoutCorrespondingIrWeight = false)

inline const SessionOptions &getSessionOptions() const

inline std::vector<Tensor*> &getWeightTensors()

inline const std::vector<Tensor*> &getWeightTensors() const

inline const std::vector<Tensor*> &getAnchorTensors() const

inline const std::vector<Tensor*> &getOptimizerTensors() const

inline const std::vector<Tensor*> &getDataStreamTensors() const

inline const Tensor *getSeedTensor() const

const gcl::CollectiveBalancedHostRearrangement &getCollectiveBalancedHostRearrangement(const TensorId &id) const

const std::map<CollectiveBalancedReorderId, gcl::CollectiveBalancedHostRearrangement> getCollectiveBalancedHostRearrangements() const

const std::map<TensorId, CollectiveBalancedReorderId> getCollectiveBalancedHostRearrangementIds() const

std::string getCachePath(const std::string &cacheDir) const

void updateOptimizerTensors()

Public Static Functions

static std::unique_ptr<Executablex> createFromLoweredIr(IrLowering &ir_lowering_)

static std::unique_ptr<Executablex> createFromStream(IrLowering &ir_lowering_, std::unordered_map<TensorId, std::unique_ptr<Tensor>> &&tensorMap, std::map<TensorId, CollectiveBalancedReorderId> &&cbrIdMap, std::map<CollectiveBalancedReorderId, gcl::CollectiveBalancedHostRearrangement> &&cbrMap)

#include <popart/popx/irlowering.hpp>

class IrLowering

Public Types

using FunctionBuffers = std::vector<std::pair<const poplar::Function, poplar::FunctionBuffer>>

Public Functions

IrLowering(const Ir&, std::shared_ptr<DeviceInfo> deviceInfo, bool prepareGraphHasBeenCalled = false)

virtual ~IrLowering()

inline const Ir &ir() const

void growOpx(Opx*, SequenceMap::SequenceInterval seqInterval)

void growOpxCall(Opx*, SequenceMap::SequenceInterval seqInterval)

inline void setDevicex(Devicex *d)

std::set<TensorId> getLinearlyCreatedInputTensors() const

inline void setLinearlyCreatedInputTensors(const std::set<TensorId> &s)

inline void addLinearlyCreatedInputTensors(TensorId id)

std::set<TensorId> getEfficientlyCreatedInputTensors() const

inline void setEfficientlyCreatedInputTensors(const std::set<TensorId> &s)

inline void addEfficientlyCreatedInputTensors(TensorId id)

bool tryInitTensorByPostIRAliasing(TensorId dstId, RequireParallelWritable requireParallelWritable, const ViewChangers &viewChangers)

inline const std::vector<std::string> &getCycleCountIds() const

inline void setCycleCountIds(const std::vector<std::string> &ids)

inline const PopTensors &tensors() const

inline PopTensors &tensors()

inline const PopPrograms &progs() const

inline PopPrograms &progs()

void instrumentWithHardwareCycleCounter(poplar::program::Sequence&, int64_t tileId = 0, std::string id = "")

inline poplar::Graph &graph()

inline const poplar::Graph &graph() const

void prepareGraph()

void loadPoplarExecutable(serialization::Reader &reader)

poplar::Executable getExecutable(const ProfileCacher &ProfileCacher)

std::string getPoplarGraphDebugName()

std::string getSerializedGraph() const

poplar::Graph &getVirtualGraph(VGraphId virtualGraphIndex, TileSet tileSet = TileSet::Compute)

PriTaskDependency taskWhichCreates(TensorId) const

TaskId taskWhichPopulates(TensorId) const

PriTask getDependencyFreeInitTensorCreatorTask(const TensorId&)

unsigned getReplicationFactor() const

unsigned getAccumulationFactor() const

unsigned getGlobalReplicaOffset() const

unsigned getGlobalReplicationFactor() const

bool isReplicatedGraph() const

bool doRearrangeOnHost(Tensor *tensor) const

int getNumFragments(const Graph &graph) const

bool containsFragments(const Graph &graph) const

bool containsFragment(const Graph &graph, SubgraphPartIndex subgraphPart) const

void createFragment(const Graph &graph, SubgraphPartIndex subgraphPart)

std::vector<poplar::Function> &getFragmentFunctions(const Graph &graph)

poplar::Function &getFragmentFunction(const Graph &graph, SubgraphPartIndex subgraphPart)

void addFunctionBuffers(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Add a vector of {function, buffer} pairs for a given (graph ID, FunctionBufferMappingType) pair.

This is enough for an [Internal|External]CodeCopy op to move code from the buffer into the function. Note that the subgraph partitioner may have split the graph into multiple functions, so a vector of pairs is required for each graph.

Parameters

gid – The graph id to add the functions and buffers for.
fbmt – The FunctionBufferMappingType to add the vector for.
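The bookkeeping described above can be pictured as a map keyed by the (graph ID, mapping type) pair, holding one {function, buffer} entry per fragment function. The following is a minimal sketch using stand-in types only: `Function`, `FunctionBuffer`, `MappingType` and `FunctionBufferRegistry` are hypothetical simplifications, not the Poplar or PopART classes, and the real implementation may differ.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Stand-in types (hypothetical): simplified models, not the Poplar classes.
struct Function { std::string name; };
struct FunctionBuffer { std::string name; };
enum class MappingType { Remote, IO };
using GraphId = std::string;
using FunctionBuffers = std::vector<std::pair<const Function, FunctionBuffer>>;

class FunctionBufferRegistry {
public:
  // Store one {function, buffer} pair per fragment function of the graph.
  void add(const GraphId &gid, MappingType fbmt,
           const std::vector<Function> &fragments) {
    FunctionBuffers pairs;
    for (const auto &f : fragments) {
      pairs.emplace_back(f, FunctionBuffer{f.name + "/buffer"});
    }
    buffers_.emplace(std::make_pair(gid, fbmt), std::move(pairs));
  }

  // True if a vector was registered for this (graph ID, mapping type) key.
  bool has(const GraphId &gid, MappingType fbmt) const {
    return buffers_.count({gid, fbmt}) > 0;
  }

  // Return a copy of the registered vector; throws std::out_of_range if absent.
  FunctionBuffers get(const GraphId &gid, MappingType fbmt) const {
    return buffers_.at({gid, fbmt});
  }

private:
  std::map<std::pair<GraphId, MappingType>, FunctionBuffers> buffers_;
};

// Usage: a graph split into two fragment functions gets two pairs.
inline void example() {
  FunctionBufferRegistry reg;
  reg.add("sub0", MappingType::Remote,
          {Function{"sub0/part0"}, Function{"sub0/part1"}});
  assert(reg.has("sub0", MappingType::Remote));
  assert(reg.get("sub0", MappingType::Remote).size() == 2);
}

} // namespace sketch
```

Keying on the pair rather than on the graph ID alone lets the same graph have separate buffer sets for different mapping types.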

inline FunctionBuffers getFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Get the Function Buffers for the given GraphId and FunctionBufferMappingType.

Wrapper around the corresponding PopPrograms function.

Parameters

gid – The GraphId to lookup.
fbmt – The FunctionBufferMappingType to lookup.

Returns

The vector of functions and buffers.

inline bool hasFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Returns true if a FunctionBuffers vector exists for the given GraphId and FunctionBufferMappingType.

Wrapper around the corresponding PopPrograms function.

Parameters

gid – The graph id to lookup.
fbmt – The FunctionBufferMappingType to lookup.

Returns: true if pairs exist, false otherwise.

std::vector<ICreatorCandidatePtr> getCreatorEndpoints(const Tensor *tensor, bool excludeEndpointsFromPath = true, bool includeDeadends = false) const

std::vector<ICreatorCandidatePtr> getTensorCreators(const Tensor *tensor, bool dependencyFree) const

poplar::Tensor getConst(poplar::Graph &graph, const poplar::Type &type, const std::vector<size_t> &shape, double val, const poplar::DebugContext &dc = {})

inline const ReplicatedTensorShardingBundle &getReplicatedTensorShardingBundle() const

inline ReplicatedTensorShardingBundle &getReplicatedTensorShardingBundle()

poplar::Tensor getScalarVariable(poplar::Graph &graph, const poplar::Type &type, const poplar::DebugContext &dc = {})

inline LinearMapper &getLinearMapper()

inline InitTensorOffsetMap &getInitTensorOffsetMap()

inline const liveness::LivenessAnalyzer *getLivenessAnalyzer() const

inline const liveness::SubgraphPartitioner *getSubgraphPartitioner() const

inline liveness::AliasZeroCopy *getAliasZeroCopy() const

inline const DeviceInfo *getDeviceInfo() const

inline void setDeviceInfo(std::shared_ptr<DeviceInfo> deviceInfo_)

std::unique_ptr<Opx> createOpx(Op*)

inline Opx *getOpx(OpId id)

inline const Opx *getOpx(OpId id) const

const std::vector<Op*> &getMainGraphOpSeries() const

std::map<Op*, int, POpCmp> getMainGraphOpSeriesNums() const

std::map<Op*, int, POpCmp> getMainGraphOpCounts() const

std::string getContextOpString(ExecutionContext context, const std::vector<TaskId> &taskOrder) const

inline bool prepareGraphHasBeenCalled() const

inline bool getOuterLoopFragEmpty() const

inline bool usingCachedExecutable() const

poplar::DataStream &insertGradientStoreStream(TensorId, TensorInfo, poplar::Graph&)

poplar::DataStream &insertGradientLoadStream(TensorId, TensorInfo, poplar::Graph&)

poplar::DataStream &insertWeightLoadStream(TensorId, TensorInfo, poplar::Graph&)

inline void addPipelineIndexTensor(const poplar::Tensor &tensor)

inline ExchangeBundle &getExchangeBundle()

Get the exchange bundle containing stream and remote buffer data structures.

Returns: Exchange bundle

inline const ExchangeBundle &getExchangeBundle() const

Get the exchange bundle containing stream and remote buffer data structures.

Returns: Exchange bundle

inline const std::vector<poplar::Tensor> getPipelineIndexTensors()

inline const std::map<TensorId, poplar::DataStream> &getFromHostStreams() const

inline const std::map<TensorId, poplar::DataStream> &getToHostAnchorStreams() const

inline const std::map<TensorId, poplar::DataStream> &getToHostWeightStreams() const

template<class T> inline T *getOpxState(OpId opid)

inline void setProgramHandleIndexMap(const std::map<std::string, unsigned> &programHandleIndexMap_)

inline const std::map<std::string, unsigned> &getProgramHandleIndexMap() const

Public Members

poplar::OptionFlags pooling_options

poplar::OptionFlags lstmOptions

poplar::OptionFlags matmulOptions

poplar::OptionFlags gclOptions

poplar::OptionFlags engineOptions

poplar::OptionFlags reportOptions

std::map<OpId, std::unique_ptr<Opx>> opxs

Public Static Functions

static std::string cycleCountStreamId(std::string id)

static void removeNonDependencyFreeCreators(std::vector<ICreatorCandidatePtr> &candidates)

static PopStreamId h2dId(TensorId)

static PopStreamId d2hId(TensorId, bool isAnchorStream)

static PopStreamId gradientStoreStreamId(TensorId id)

static PopStreamId gradientLoadStreamId(TensorId id)

static PopStreamId weightLoadStreamId(TensorId id)

#include <popart/popx/poptensors.hpp>

class PopTensors

Public Functions

PopTensors(const Ir&)

void insert(TensorId, const poplar::Tensor&)

void insertAliased(TensorId to, TensorId from)

void insertUnsafe(TensorId id, const poplar::Tensor &pt)

const poplar::Tensor &get(TensorId) const

const poplar::Tensor &getView(TensorId) const

bool hasViewChangers(TensorId) const

const ViewChangers &getViewChangers(TensorId)

void setViewChangers(TensorId, const ViewChangers &viewChangers)

bool contains(TensorId) const

const std::map<TensorId, std::shared_ptr<poplar::Tensor>> &getTensors() const

bool canAlias(TensorId, RequireParallelWritable requireParallelWritable) const

#include <popart/popx/popprograms.hpp>

class PopPrograms

Class for managing the complete set of programs that a Devicex can run.

A program in this context is an instance of the poplar::Program class, which represents a control program that executes operations on the graph.

The state std::vector<poplar::program::Sequence> seqs contains all these programs, and is populated during IrLowering. The programs are passed to poplar::compileGraph to construct the executable (see IrLowering::getExecutable()).

Public Types

enum ProgramIndex

Values:

enumerator WeightsFromHost = 0

enumerator OptimizerFromHost

enumerator RandomSeedFromHost

enumerator RandomSeedToHost

enumerator RngStateFromHost

enumerator Program

enumerator RngStateToHost

enumerator WeightsToHost

enumerator CycleCountTensorToHost

enumerator CustomProgramsStart

enumerator N

enum class ProgramFragmentIndex

Values:

enumerator StreamWeightsFromHost = 0

enumerator StreamOptimizerFromHost

enumerator RandomSeedFromHost

enumerator RandomSeedToHost

enumerator RngStateFromHost

enumerator Init

enumerator PreForward

enumerator Forward

enumerator Backward

enumerator VarUpdateFromAccumulator

enumerator RngStateToHost

enumerator WeightsToHost

enumerator ToHostFinalCopy

enumerator CycleCountTensorToHost

enumerator N

enum class PipelineFragmentId

Values:

enumerator ToDeviceStream = 0

enumerator Main

enumerator ToHostStream

using FunctionBuffers = std::vector<std::pair<const poplar::Function, poplar::FunctionBuffer>>

Public Functions

PopPrograms(IrLowering *ir_lowering_p_)

const poplar::program::Sequence &streamWeightsFromHostFragment() const

poplar::program::Sequence &streamWeightsFromHostFragment()

const poplar::program::Sequence &streamOptimizerFromHostFragment() const

poplar::program::Sequence &streamOptimizerFromHostFragment()

const poplar::program::Sequence &randomSeedFromHostFragment() const

poplar::program::Sequence &randomSeedFromHostFragment()

const poplar::program::Sequence &randomSeedToHostFragment() const

poplar::program::Sequence &randomSeedToHostFragment()

const poplar::program::Sequence &cycleCountTensorToHostFragment() const

poplar::program::Sequence &rngStateFromHostFragment()

const poplar::program::Sequence &rngStateFromHostFragment() const

poplar::program::Sequence &rngStateToHostFragment()

const poplar::program::Sequence &rngStateToHostFragment() const

poplar::program::Sequence &cycleCountTensorToHostFragment()

const poplar::program::Sequence &toHostFinalCopyFragment() const

poplar::program::Sequence &toHostFinalCopyFragment()

const poplar::program::Sequence &initFragment() const

poplar::program::Sequence &initFragment()

const poplar::program::Sequence &preForwardFragment() const

poplar::program::Sequence &preForwardFragment()

const poplar::program::Sequence &forwardFragment() const

poplar::program::Sequence &forwardFragment()

const poplar::program::Sequence &backwardFragment() const

poplar::program::Sequence &backwardFragment()

const poplar::program::Sequence &accumulateOuterFragment() const

poplar::program::Sequence &accumulateOuterFragment()

const poplar::program::Sequence &weightsToHostFragment() const

poplar::program::Sequence &weightsToHostFragment()

poplar::program::Sequence &forwardOrBackwardFragment(ScheduledPreLoss)

const std::vector<poplar::program::Program> progs() const

poplar::program::Sequence &programFragment(PopPrograms::ProgramFragmentIndex)

int getNumFragments(const Graph &graph) const

std::vector<poplar::program::Sequence> &scopeFragments(const Graph&)

poplar::program::Sequence &scopeFragment(const Graph&, SubgraphPartIndex subgraphPart)

bool containsFragments(const Graph &graph) const

bool containsFragment(const Graph &graph, SubgraphPartIndex subgraphPart) const

void createFragment(const Graph &graph, SubgraphPartIndex subgraphPart)

std::vector<poplar::Function> &getFragmentFunctions(const Graph &graph, poplar::Graph &poplarGrpah)

poplar::Function &getFragmentFunction(const Graph &graph, SubgraphPartIndex subgraphPart, poplar::Graph &poplarGraph)

std::vector<poplar::program::Sequence>::iterator recomputeFragment(OpId)

SequenceMap::SequenceInterval createRecomputeFragment(OpId)

bool hasBeenRecomputed(OpId, ExecutionPhase) const

void recordRecomputed(OpId, ExecutionPhase)

std::string getStrFromPipelineFragmentId(PipelineFragmentId) const

poplar::program::Sequence &pipelineFragment(PipelineStage, PipelineFragmentId, const std::string &desc)

poplar::program::Sequence &pipelineToDeviceStreamFragment(PipelineStage pipelineStage, const std::string &desc)

poplar::program::Sequence &pipelineMainFragment(PipelineStage, const std::string &desc)

poplar::program::Sequence &pipelineToHostStreamFragment(PipelineStage, const std::string &desc)

poplar::program::Sequence &pipelineIpuCopyFragment(const std::string &desc)

poplar::program::Sequence &namedBuffersCopyFragment()

void addPipelineCycle(PipelineInfo pInfo, PipelineCycle pCycle, poplar::program::Sequence &sq, std::ostringstream &ss) const

void addFunctionBuffers(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Add a vector of {function, buffer} pairs for a given (graph ID, FunctionBufferMappingType) pair.

This is enough for an [Internal|External]CodeCopy op to move code from the buffer into the function. Note that the subgraph partitioner may have split the graph into multiple functions, so a vector of pairs is required for each graph.

Parameters

gid – The graph id to add the functions and buffers for.
fbmt – The FunctionBufferMappingType to add the vector for.

inline FunctionBuffers getFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Get the Function Buffers for the given GraphId and FunctionBufferMappingType.

Parameters

gid – The GraphId to lookup.
fbmt – The FunctionBufferMappingType to lookup.

Returns

The vector of functions and buffers.

inline bool hasFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)

Returns true if a FunctionBuffers vector exists for the given GraphId and FunctionBufferMappingType.

Parameters

gid – The graph id to lookup.
fbmt – The FunctionBufferMappingType to lookup.

Returns: true if pairs exist, false otherwise.

unsigned addCustomProgram(const poplar::program::Program &program)

Add a custom program.

Parameters: program – Program to add
Returns: Index of the popart/poplar program
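To make the returned index concrete, the sketch below mirrors the enumerator order of ProgramIndex as listed earlier in this class and shows one way custom program indices could be handed out. It is an assumption, not something this reference states, that custom programs are numbered consecutively starting at CustomProgramsStart; `CustomProgramTable` is a hypothetical stand-in that stores strings instead of poplar::program::Program objects.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Mirrors the enumerator order of PopPrograms::ProgramIndex.
enum ProgramIndex {
  WeightsFromHost = 0,
  OptimizerFromHost,
  RandomSeedFromHost,
  RandomSeedToHost,
  RngStateFromHost,
  Program,
  RngStateToHost,
  WeightsToHost,
  CycleCountTensorToHost,
  CustomProgramsStart,
  N
};

// With implicit numbering from 0, CustomProgramsStart is the tenth enumerator.
static_assert(CustomProgramsStart == 9, "nine built-in programs precede it");

// Hypothetical stand-in for the container of custom programs.
class CustomProgramTable {
public:
  // ASSUMPTION: indices are consecutive, starting at CustomProgramsStart.
  unsigned add(const std::string &program) {
    programs_.push_back(program);
    return CustomProgramsStart + static_cast<unsigned>(programs_.size()) - 1;
  }

private:
  std::vector<std::string> programs_;
};

// Usage: the first custom program lands directly after the built-in ones.
inline void example() {
  CustomProgramTable table;
  unsigned first = table.add("myProgram");
  assert(first == static_cast<unsigned>(CustomProgramsStart));
}

} // namespace sketch
```

Under this assumption, the built-in indices stay stable regardless of how many custom programs are added.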

void createPipelineFunctions(): Turn pipeline sequences into callable pipeline functions.

poplar::program::Sequence getFullProgramFromPipelineFragments(bool fwdOnly) const

Return the program based on the pipeline fragments.

See docs/notes/transforms/pipelining.md#assemble-from-fragments for a detailed explanation.

Returns: The program based on the pipeline fragments

Public Members

IrLowering *ir_lowering_p

Public Static Attributes

static const std::unordered_map<popef::ProgramFlow::ProgramIndexType, std::string> commonPrograms

#include <popart/popx/inittensor.hpp>

class ICreatorCandidate

Subclassed by popart::popx::InputCreatorCandidate, popart::popx::InputMultiCreatorCandidate

Public Functions

ICreatorCandidate()

virtual ~ICreatorCandidate() = default

virtual std::pair<poplar::Tensor, ViewChangers> createInput(const poplar::DebugNameAndId &dnai) = 0

virtual DnfTensorIds mustExistBeforeCreate() = 0

virtual double getMaxCreatorPriority() const = 0

virtual int64_t getNumElems() const = 0

virtual std::vector<std::vector<OpxInAndOutIndex>> getPathsFromInput() = 0

virtual std::string str() = 0

virtual std::pair<poplar::Tensor, ViewChangers> unwind(poplar::Tensor) = 0

virtual std::vector<popart::view::Region> unwind(popart::view::Region) = 0

virtual std::vector<popart::view::Region> unwind() = 0

virtual int64_t getScheduleIndex() const = 0

Public Static Functions

static bool greaterThan(ICreatorCandidatePtr, ICreatorCandidatePtr)

#include <popart/popx/replicatedtensorshardingbundle.hpp>

class ReplicatedTensorShardingBundle

Helper class to bundle all replicated tensor sharding related lowering information together.

Public Functions

ReplicatedTensorShardingBundle(const Ir &ir)

Construct an empty replicated tensor sharding bundle.

Creates the replicatedTensorShardingTracer with the IR object.

Parameters: ir – IR to create the ReplicatedTensorShardingTracer with

bool hasCollectiveBalancedReorder(const TensorId &tensorId) const

Check whether a tensor has an associated CollectiveBalancedReorder.

Parameters: tensorId – TensorId to check
Returns: True if the tensor has an associated CollectiveBalancedReorder

std::shared_ptr<gcl::CollectiveBalancedReorder> getCollectiveBalancedReorder(const TensorId &tensorId) const

Get the associated CollectiveBalancedReorder of a tensor.

Throws an error if the tensor does not have one.

Parameters: tensorId – TensorId to return the CollectiveBalancedReorder for
Returns: Shared pointer to the associated CollectiveBalancedReorder

const gcl::CollectiveBalancedHostRearrangement &getCollectiveBalancedHostRearrangement(const TensorId &tensorId) const

Get the host rearrangement method of a tensor.

Can be applied to the host-side tensor data to rearrange the data before upload to, or after download from, the IPU.

Parameters: tensorId – TensorId to return the CBR host rearrangement for
Returns: CBR host rearrangement method

void setCollectiveBalancedReorder(const TensorId &tensorId, CollectiveBalancedReorderId cbrId)

Associate an existing CollectiveBalancedReorder with a tensor.

Parameters

tensorId – TensorId to associate the CollectiveBalancedReorder with
cbrId – Identifier of an existing, registered CollectiveBalancedReorder obtained by registerCollectiveBalancedReorder

CollectiveBalancedReorderId registerCollectiveBalancedReorder(std::shared_ptr<gcl::CollectiveBalancedReorder> cbr)

Register a new collective balanced reorder method.

Parameters: cbr – GCL CollectiveBalancedReorder to register
Returns: Registered ID for the CollectiveBalancedReorder
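The register, set, and get calls documented for this class form a two-level lookup: tensor ID to reorder ID, then reorder ID to the shared reorder object. The sketch below illustrates that flow with stand-in types only. `Reorder`, `ShardingBundle`, and the use of an incrementing integer as the ID are hypothetical and are not taken from the PopART sources.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

namespace sketch {

// Stand-in for gcl::CollectiveBalancedReorder (hypothetical).
struct Reorder { std::string label; };
using TensorId = std::string;
using ReorderId = int;

class ShardingBundle {
public:
  // Register a reorder object and hand back its ID.
  // ASSUMPTION: IDs are consecutive integers starting at 0.
  ReorderId registerReorder(std::shared_ptr<Reorder> cbr) {
    ReorderId id = static_cast<ReorderId>(reorders_.size());
    reorders_[id] = std::move(cbr);
    return id;
  }

  // Associate an already registered reorder with a tensor.
  void setReorder(const TensorId &tensorId, ReorderId id) {
    if (reorders_.count(id) == 0) {
      throw std::runtime_error("unregistered reorder ID");
    }
    ids_[tensorId] = id;
  }

  bool hasReorder(const TensorId &tensorId) const {
    return ids_.count(tensorId) > 0;
  }

  // Throws if the tensor has no associated reorder.
  std::shared_ptr<Reorder> getReorder(const TensorId &tensorId) const {
    auto it = ids_.find(tensorId);
    if (it == ids_.end()) {
      throw std::runtime_error("no reorder for tensor " + tensorId);
    }
    return reorders_.at(it->second);
  }

private:
  std::map<ReorderId, std::shared_ptr<Reorder>> reorders_;
  std::map<TensorId, ReorderId> ids_;
};

// Usage: two tensors share one registered reorder object.
inline void example() {
  ShardingBundle bundle;
  ReorderId id = bundle.registerReorder(std::make_shared<Reorder>(Reorder{"cbr0"}));
  bundle.setReorder("weight", id);
  bundle.setReorder("weight_accum", id);
  assert(bundle.getReorder("weight") == bundle.getReorder("weight_accum"));
}

} // namespace sketch
```

Separating registration from association is what allows several tensors to refer to the same reorder object.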

inline const std::map<CollectiveBalancedReorderId, std::shared_ptr<gcl::CollectiveBalancedReorder>> &getCollectiveReorders() const

Returns: Mapping of each registered CollectiveBalancedReorderId to its CollectiveBalancedReorder

inline const ReplicatedTensorShardingTracer &getReplicatedTensorShardingTracer() const

Returns: Tracer to resolve replicated tensor sharding groups

inline ReplicatedTensorShardingTracer &getReplicatedTensorShardingTracer()

Returns: Tracer to resolve replicated tensor sharding groups

inline const std::map<TensorId, CollectiveBalancedReorderId> &getCollectiveReorderIds() const

Get mapping to resolve which CollectiveBalancedReorder has to be applied to a tensor to restore the original data order.

Returns: Mapping of all tensors and their associated CollectiveBalancedReorderId

#include <popart/popx/linearmapper.hpp>

class LinearMapper

Public Functions

void mapTensor(poplar::Graph &graph, poplar::Tensor &tensor)

14.8. Ops

14.8.1. Op definition for PopART IR

#include <popart/op.hpp>

class Op : public popart::Vertex

Parent class for the concrete Op implementations.

The poplar implementation which the op represents can be found in the corresponding popx::Opx class, and will be lowered to poplar.

See also

Custom ops in the PopART User Guide.

Subclassed by popart::AbortOp, popart::AbsGradOp, popart::AdaDeltaUpdaterOp, popart::AdamUpdaterOp, popart::AddBiasOp, popart::AllReduceOp, popart::ArgExtremaOp, popart::AveragePoolGradOp, popart::BaseOnnxRNNGradOp, popart::BaseOnnxRNNOp, popart::BasePadOp, popart::BaseSliceOp, popart::BaseSortOp, popart::BatchNormGradOp, popart::BatchNormOp, popart::BinaryComparisonOp, popart::BoundaryOp, popart::BucketizeOp, popart::CastOp, popart::CastThenPow2ScaleOp, popart::CollectivesBaseOp, popart::ConcatGradOp, popart::ConcatOp, popart::ConvFlipWeightsOp, popart::ConvTransposeOp, popart::CoshOp, popart::CtcBeamSearchDecoderOp, popart::CtcGradOp, popart::CumSumGradOp, popart::CumSumOp, popart::DynamicBaseOp, popart::ElementWiseBinaryBaseOp, popart::ElementWiseBinaryGradOp, popart::ElementWiseNonLinearUnaryGradOp, popart::ElementWiseUnaryBooleanOp, popart::ElementWiseUnaryOp, popart::ExchangeBaseOp, popart::ExpandGradOp, popart::ExpandOp, popart::ExpGradOp, popart::Expm1GradOp, popart::GatherGradOp, popart::GatherOp, popart::GetRandomSeedOp, popart::GlobalAveragePoolGradOp, popart::GlobalAveragePoolOp, popart::GlobalMaxPoolGradOp, popart::GlobalMaxPoolOp, popart::GroupNormGradOp, popart::GroupNormOp, popart::HasReceptiveFieldOp, popart::HistogramOp, popart::IdentityLossGradOp, popart::IfOp, popart::InitOp, popart::InstanceNormGradOp, popart::InstanceNormOp, popart::InternalCodeCopyOp, popart::IoTileCopyOp, popart::IpuCopyOp, popart::L1GradOp, popart::LambSquareOp, popart::LeakyReluGradOp, popart::LogSoftmaxGradOp, popart::LossOp, popart::LossScaleUpdateOp, popart::LRNGradOp, popart::LRNOp, popart::MatMulBaseOp, popart::MaxPoolGradOp, popart::ModifyRandomSeedOp, popart::MultiConvBaseOp, popart::MultiConvDataGradBaseOp, popart::MultiConvWeightsGradBaseOp, popart::NllGradOp, popart::NlllWithSoftmaxGradDirectOp, popart::NormalizeImageOp, popart::OnehotGradOp, popart::OnehotOp, popart::PackedDataBlockOp, popart::ParameterizedOp< TDerivedOp, TOpParams >, popart::PlaceholderOp, 
popart::PopartLSTMGradOp, popart::PopartLSTMOp, popart::Pow2ScaleThenCastOp, popart::ReduceGradOp, popart::ReduceOp, popart::ReluGradOp, popart::ReshapeBaseOp, popart::ResizeOp, popart::RestoreOp, popart::ReverseBaseOp, popart::RMSPropUpdaterOp, popart::RoiAlignGradOp, popart::RoiAlignOp, popart::ScaledAddOp, popart::ScatterDataGradOp, popart::ScatterReduceGradOp, popart::ScatterReduceOp, popart::ScatterUpdateGradOp, popart::SequenceSliceOp, popart::SGD1NesterovOp, popart::ShapeOrLikeOp, popart::SigmoidGradOp, popart::SoftmaxGradDirectOp, popart::SoftmaxGradOp, popart::SplineBasisOp, popart::SplineWeightingOp, popart::SplitGradOp, popart::SplitOp, popart::SqrtGradOp, popart::StashOp, popart::SubgraphOp, popart::SubsampleBaseOp, popart::SubsampleGradOp, popart::SyncOp, popart::TanhGradOp, popart::TensorRemapOp, popart::TileOp, popart::TopKGradOp, popart::TransposeBaseOp, popart::UpsampleOp, popart::VariadicGradOp, popart::VariadicOp, popart::VarUpdateOp, popart::WhereOp, popart::WhereXGradOp, popart::WhereYGradOp

Public Types

using SubgraphInSig = std::tuple<Op*, fwtools::subgraph::OutIndex, std::string>: The functionality required for sub-graph matching.

Public Functions

inline Settings &getSettings()

Get the settings associated with the op.

Returns: The op settings.

inline const Settings &getSettings() const

Get the settings associated with the op.

Returns: The op settings.

virtual Settings getInSettings(InIndex) const

Return suitable settings for an op inserted before the input to an existing op.

Parameters: InIndex – The input index before which the op is inserted.
Returns: The settings for the op inserted before the input index.

virtual Settings getOutSettings(OutIndex) const

Return suitable settings for an op inserted after the output to an existing op.

Parameters: OutIndex – The output index after which the op is inserted.
Returns: The settings for the op inserted after the output index.

Settings adjustInSettings(InIndex, Op::Settings) const

Adjust the settings to be suitable as input at the input index.

Parameters

InIndex – The input index where the settings are to be applied.
Settings – The settings to be adjusted.

Returns

Adjusted settings suitable for input at the input index.

Settings adjustOutSettings(InIndex, Op::Settings) const

Adjust the settings to be suitable as output at an output index.

Parameters

OutIndex – The output index where the settings are to be applied.
Settings – The settings to be adjusted.

Returns

Adjusted settings suitable for output at the output index.

const OptionalVGraphId getOptionalVGraphId() const

Get the ID of the optional virtual graph.

Returns: The ID of the optional virtual graph.

VGraphId getVirtualGraphId() const

Get the ID of the virtual graph.

Returns: The ID of the virtual graph.

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex) const

Get virtual graph ID and tile set associated with an input index.

Parameters: InIndex – The input index.
Returns: The virtual graph ID and tile set at the input index.

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex) const

Get virtual graph ID and tile set associated with an output index.

Parameters: OutIndex – The output index.
Returns: The virtual graph ID and tile set at the output index.

virtual VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const

Get virtual graph ID and tile set associated with an input index.

Parameters

InIndex – The input index.
visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.

Returns

The virtual graph ID and tile set at the input index.

virtual VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const

Get virtual graph ID and tile set associated with an output index.

Parameters

OutIndex – The output index.
visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.

Returns

The virtual graph ID and tile set at the output index.

void setVirtualGraphId(const OptionalVGraphId)

Set a virtual graph ID for the op.

Parameters: OptionalVGraphId – The ID of the virtual graph to set on this op.

bool hasVirtualGraphId() const

Check if the op has a virtual graph ID set.

Returns: true if the op has a virtual graph ID set, false otherwise.
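The has/get/getOptional triple used for the virtual graph ID recurs for the pipeline stage, execution phase, and batch serialized phase documented for this class. The sketch below models the pattern with `std::optional` (C++17). `OpPlacement` is a hypothetical stand-in, and the choice to throw from the non-optional getter when nothing is set is an assumption, since this reference does not state what happens in that case.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>

namespace sketch {

using VGraphId = std::int64_t;
using OptionalVGraphId = std::optional<VGraphId>;

// Hypothetical stand-in holding a single optional placement attribute.
class OpPlacement {
public:
  // Passing an empty optional clears the attribute.
  void setVirtualGraphId(const OptionalVGraphId value) { vgraphId_ = value; }

  bool hasVirtualGraphId() const { return vgraphId_.has_value(); }

  // Safe to call at any time; the caller checks for emptiness.
  const OptionalVGraphId getOptionalVGraphId() const { return vgraphId_; }

  // ASSUMPTION: calling this without a value set is treated as an error.
  VGraphId getVirtualGraphId() const {
    if (!vgraphId_) {
      throw std::logic_error("virtual graph ID not set");
    }
    return *vgraphId_;
  }

private:
  OptionalVGraphId vgraphId_;
};

// Usage: check before reading the non-optional getter.
inline void example() {
  OpPlacement op;
  assert(!op.hasVirtualGraphId());
  op.setVirtualGraphId(1);
  if (op.hasVirtualGraphId()) {
    assert(op.getVirtualGraphId() == 1);
  }
}

} // namespace sketch
```

Code that cannot guarantee the attribute is set would use the optional getter, or guard the plain getter with the has check.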

void setPipelineStage(OptionalPipelineStage)

Set a pipeline stage for the op.

Parameters: OptionalPipelineStage – The pipeline stage to be set for the op.

bool hasPipelineStage() const

Check if the op has a pipeline stage set.

Returns: true if the op has a pipeline stage set, false otherwise.

PipelineStage getPipelineStage() const

Get the pipeline stage that has been set for the op.

Returns: The pipeline stage that has been set for the op.

OptionalPipelineStage getOptionalPipelineStage() const

Get the optional pipeline stage.

Returns: The optional pipeline stage that has been set for the op.

const OptionalExecutionPhase getOptionalExecutionPhase() const

Get the optional execution phase.

Returns: The optional execution phase that has been set for the op.

virtual ExecutionPhase getExecutionPhase() const

Get the execution phase that has been set for the op.

Returns: The execution phase that has been set for the op.

void setExecutionPhase(const OptionalExecutionPhase)

Set the execution phase for the op.

Parameters: OptionalExecutionPhase – The execution phase to be set for the op.

bool hasExecutionPhase() const

Check if the op has an execution phase set.

Returns: true if the op has an execution phase set, false otherwise.

const OptionalBatchSerializedPhase getOptionalBatchSerializedPhase() const

Get the optional batch serialized phase.

Returns: The optional batch serialized phase that has been set for the op.

virtual BatchSerializedPhase getBatchSerializedPhase() const

Get the batch serialized phase.

Returns: The batch serialized phase that has been set for the op.

void setBatchSerializedPhase(const OptionalBatchSerializedPhase)

Set the batch serialized phase.

Parameters: OptionalBatchSerializedPhase – The batch serialized phase to be set for the op.

bool hasBatchSerializedPhase() const

Check if the op has a batch serialized phase set.

Returns: true if the op has a batch serialized phase set, false otherwise.

const OptionalStochasticRoundingMethod getOptionalStochasticRoundingMethod() const

Get the optional stochastic rounding method.

Returns: The optional stochastic rounding method that has been set for the op.

virtual StochasticRoundingMethod getStochasticRoundingMethod() const

Get the stochastic rounding method.

Returns: The stochastic rounding method that has been set for the op.

void setStochasticRoundingMethod(const OptionalStochasticRoundingMethod)

Set the optional stochastic rounding method.

Parameters: OptionalStochasticRoundingMethod – The optional stochastic rounding method to be set for the op.

bool hasStochasticRoundingMethod() const

Check if the op has a stochastic rounding method set.

Returns: true if the op has a stochastic rounding method set, otherwise false.

bool isExcludedFromPattern(const Pattern*) const

Check if the op is excluded from a pattern.

Returns: true if the op is excluded from a pattern, false otherwise.

inline virtual int getInBatchAxis(InIndex) const

Get the batch axis for the input index.

Returns: The batch axis for the input index.

inline virtual int getOutBatchAxis(OutIndex) const

Get the batch axis for the output index.

Returns: The batch axis for the output index.

void inheritPlacementAttributes(bool inheritSerializations, AliasModel &aliasModel)

Helper function to set an op’s placement attributes by inheriting them from other ops in the graph.

The attributes that are set include:

Execution context.
Pipeline stage.
Execution phase.
Virtual graph ID.
Batch serialized phase (optional).

Parameters

inheritSerializations – The indicator to enable or disable the batch serialization phase. true enables the batch serialization phase and false disables it.
aliasModel – An AliasModel object containing alias info for this op’s graph.

Ir &getIr()

Get the IR associated with the op.

Returns: The IR associated with the op.

const Ir &getIr() const

Get the IR associated with the op.

Returns: The IR associated with the op.

inline Graph &getGraph()

Get the graph associated with the op.

Returns: The graph associated with the op.

inline const Graph &getGraph() const

Get the graph associated with the op.

Returns: The graph associated with the op.

inline const Scope &getScope() const

Get the scope associated with the op.

Returns: The scope associated with the op.

inline void setScope(const Scope &scope)

Set the scope associated with the op.

Parameters: scope – The scope to associate with the op.

inline const std::string &getName() const

Get the name of the op.

Returns: The name of the op.

inline void setName(const std::string &name)

Set the name of the op.

Parameters: name – The name to set for the op.

inline const OpDebugInfo &getDebugInfo() const

Get the debug info of the op.

Returns: The debug info for the op.

virtual bool isNorm() const

Checks if the op is a norm op.

Returns: true if the op is a norm op, false otherwise.

bool isElementWiseUnary() const

Checks if the op is an element-wise unary op.

Returns: true if the op is an element-wise unary op, false otherwise.

virtual bool canBeReplacedByIdentity() const

Check if the op can be replaced by the identity op.

Returns: true if the op can be replaced by the identity op, false otherwise.

Op(const OperatorIdentifier &_opid, const Op::Settings &settings)

Constructor of the Op class.

Parameters

_opid – The operator identifier specifying domain:type:version, minimum and maximum number of input tensors and number of output tensors.
settings – The general op settings such as graph, name and scope.

Op(const Op&): Copy constructor.

Note

This does NOT copy input and output.

Op &operator=(const Op&) = delete

virtual ~Op(): Destructor.

std::string str() const final: Return the op ID.

std::string debugName() const: Return the op name that is used for debug and profiling.

void createAndConnectOutTensor(OutIndex, TensorId)

Create an ActGrad (output) tensor and connect it to this op’s output.

Parameters

OutIndex – The output index that the output tensor should be connected to.
TensorId – The tensor ID of the tensor to be converted to an output tensor.

void append(std::stringstream &ss) const

Append this op to a stream.

Parameters: ss – The stream to append the op to.

void toJSON(std::stringstream &ss) const

Convert this op to JSON format and append it to a stream.

Parameters: ss – The stream to append the JSON-serialised op to.

int64_t memOfOutputs() const: Return the total memory used by all output tensors.

inline virtual std::set<InIndex> optionalInputs() const: Return the input indices of all optional inputs to the op.

void defaultConnectInTensor(InIndex, TensorId)

Connect a tensor to an input index.

This method updates the input and updates consumers of the tensor with the tensor ID.

Parameters

InIndex – The input index to connect the tensor to.
TensorId – The tensor ID of the tensor to connect.

virtual void connectInTensor(InIndex index, TensorId tensorId)

Connect an existing tensor to an input index.

Parameters

index – The input index at which to connect the tensor.
tensorId – The ID of the existing tensor.

virtual void connectInTensor(InIndex inIndex, TensorId tensorId, VGraphId vgid)

Connect an existing tensor to an index with the source virtual graph.

Parameters

inIndex – The input index at which to connect the tensor.
tensorId – The ID of the existing tensor.
vgid – The virtual graph on which the existing tensor resides.

void connectInTensorDispatch(InIndex inIndex, TensorId tensorId)

Connect an existing tensor at an index with the source virtual graph.

Dispatcher to resolve issues with templated inheritance overloads. This will automatically derive the virtual graph ID of the input when required.

Parameters

inIndex – The input index at which to connect the tensor.
tensorId – The ID of the existing tensor.

void connectInTensorLike(const Op *other, InIndex index, TensorId tenId)

Connects the input tensor analogously to another op.

This is useful when cloning graphs or ops, because it avoids having to check if the op requires special considerations when connecting inputs.

IpuCopyOp is currently the only op to which this applies, because connecting its inputs directly requires a source virtual graph to be specified:

void connectInTensor(InIndex, TensorId, uint64_t sourceIpu);

Parameters

other – An op of the same type as the current op, from which to copy how the tensor at the corresponding index should be connected.
index – The input index to connect.
tenId – The ID of the tensor to connect.

void connectOutTensor(OutIndex, TensorId)

Connect an existing tensor to an output index.

Parameters

index – The output index at which to connect the tensor.
tensorId – The ID of the existing tensor.

void disconnectInTensor(Tensor *tensor)

Disconnect an input tensor from the op.

Parameters: tensor – The tensor to disconnect.

virtual void disconnectInTensor(InIndex, Tensor *tensor)

Disconnect an input tensor from the op at a specific input index.

Parameters

tensor – The tensor to disconnect.
InIndex – The index of the input tensor in the op.

void disconnectInTensor(InIndex)

Disconnect an input tensor from the input index.

Parameters: InIndex – The input index to disconnect the tensor from.
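The connect and disconnect methods above keep two records consistent: the op's own input map and the consumer list of the tensor. The following is a minimal standalone sketch of that bookkeeping, not the PopART implementation. TensorSketch and OpSketch are hypothetical stand-ins for Tensor and Op, indices are plain ints, and the error handling for occupied or empty indices is an assumption.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct OpSketch;

// Hypothetical stand-in for a tensor: it counts how often each op consumes it.
struct TensorSketch {
  std::string id;
  std::map<const OpSketch *, int> consumers; // op -> number of connections
};

// Hypothetical stand-in for an op with an input index map.
struct OpSketch {
  std::map<int, TensorSketch *> input; // input index -> connected tensor

  // Record the tensor at the index and register this op as a consumer.
  void connectInTensor(int index, TensorSketch &tensor) {
    if (input.count(index) != 0) {
      throw std::runtime_error("input index already connected");
    }
    input[index] = &tensor;
    ++tensor.consumers[this];
  }

  // Undo both sides of the connection for the given index.
  void disconnectInTensor(int index) {
    const auto it = input.find(index);
    if (it == input.end()) {
      throw std::runtime_error("no tensor connected at input index");
    }
    TensorSketch *tensor = it->second;
    if (--tensor->consumers[this] == 0) {
      tensor->consumers.erase(this);
    }
    input.erase(it);
  }
};
```

The consumer count lets the same tensor be connected at several input indices of one op without losing track of the remaining connections.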

void disconnectOutTensor(Tensor *tensor)

Disconnect an output tensor from the op.

Parameters: tensor – The tensor to disconnect.

void disconnectAllInputs(): Disconnect all input tensors from the op.

void disconnectAllOutputs(): Disconnect all output tensors from the op.

const std::string &name() const: Return the op name.

virtual void setup()

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

void finalizeDebugInfo()

Finalize DebugInfo.

This method is called once after Ir::prepare() has completed.

virtual void setCalledSubgraphGradInfo(const FwdGraphToBwdGraphInfo &calledGraphsGradInfo)

Set information about the gradient graphs for this op’s called subgraphs.

If the op has called subgraphs, then this method will get called prior to getGradOps() to provide the op with the information it needs to call the grad version of the called subgraphs.

Parameters: calledGraphsGradInfo – The mapping between the forward graph and information on the gradient graph.

virtual std::vector<std::unique_ptr<Op>> getGradOps()

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples. When there are several variants, they should be returned in descending order of preference.
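As an illustration of the ordering requirement, the following standalone sketch sorts a list of variant/priority tuples so that the most preferred variant comes first. It is not PopART code: a std::string stands in for OperatorIdentifier, the variant names are hypothetical, and the sketch assumes that a larger float means a higher preference.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Assumption: a string stands in for OperatorIdentifier.
using VariantPriority = std::tuple<std::string, float>;

// Return the variants ordered by descending priority value.
std::vector<VariantPriority>
byDescendingPreference(std::vector<VariantPriority> variants) {
  std::stable_sort(variants.begin(), variants.end(),
                   [](const VariantPriority &a, const VariantPriority &b) {
                     return std::get<1>(a) > std::get<1>(b);
                   });
  return variants;
}
```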

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters: OperatorIdentifier – The operator identifier of the op to be instantiated.
Returns: An instance of the required op.

virtual void growAliasModel(AliasModel &aliasModel) const

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters: aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
Pre: All input tensors of this op have mappings in aliasModel before the call to growAliasModel().
Post: All output tensors of this op have mappings in aliasModel after the call to growAliasModel().

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel &aliasModel, OperatorIdentifier) const

Translate a PopART inplacing proposal into its AliasModel equivalent.

The proposal replaces this outplace op with an inplace op of type inplaceId, the operator identifier passed to this method.

This method is defined as a void method which sets a value passed by reference, as opposed to a getter method, so that no Poprithms headers need to be included in this file.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
OperatorIdentifier – The operator identifier to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

virtual view::Regions modifies(InIndex) const

Return the input region which this op modifies (for inplace ops).

Parameters: InIndex – The input index.
Returns: The regions which this op modifies.

virtual view::Regions uses(InIndex) const

Return the input region which this op uses.

Parameters: InIndex – The input index.
Returns: The regions which this op uses.

virtual view::Regions aliases(InIndex, OutIndex) const

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters

InIndex – The input index.
OutIndex – The output index.

Returns

The regions which the output will alias.

virtual view::RegMap fwdRegMap(InIndex, OutIndex) const

Map regions of the input tensor at the input index to the regions of the output tensor at the output index that these input regions alias.

Parameters

InIndex – The op input index.
OutIndex – The op output index.

virtual view::RegMap bwdRegMap(InIndex, OutIndex) const

Map regions of the output tensor at the output index to the regions of the input tensor at the input index that these output regions alias.

Parameters

InIndex – The op input index.
OutIndex – The op output index.

virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const

Determine whether output tensors are guaranteed to have an equal value across all replicas.

This means that they are “replica equal”. The check is based on information about the replica equal status of input tensors (and the same for any inputs that are modified by the op).

The default implementation sets each output tensor as being replica-equal if and only if all tensor inputs are replica-equal. For modified inputs, the default is to assume it is replica-equal only if there is an output that is deemed replica-equal that fully aliases all elements of the input. This default implementation is not correct for all ops. Ops that need a specialized implementation should override this virtual function.

Parameters

aliasModel – An alias model object.
inputMap – A map that stores, for each input, whether the inputs are data-equivalent over all replicas.
proxy – A helper object passed in by the replica-equal analysis.

Returns

A tuple comprising:

a mapping from output index to a replica-equal status with an entry for each output tensor.
a vector of input indices for inputs that were modified by the op to a value that is not replica-equal.
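The default rule for output tensors can be pictured with the following standalone sketch. It is a simplification under stated assumptions rather than the PopART implementation: indices are plain ints, a replica-equal status is a bool, and the handling of modified inputs and aliasing is left out entirely. The names InputStatus, OutputStatus and defaultOutputStatus are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <map>

// Assumption: indices are ints and a replica-equal status is a bool.
using InputStatus = std::map<int, bool>;  // input index -> replica-equal?
using OutputStatus = std::map<int, bool>; // output index -> replica-equal?

// Mark every output replica-equal if and only if all inputs are.
OutputStatus defaultOutputStatus(const InputStatus &inputs, int numOutputs) {
  const bool allEqual =
      std::all_of(inputs.begin(), inputs.end(),
                  [](const InputStatus::value_type &entry) {
                    return entry.second;
                  });
  OutputStatus outputs;
  for (int out = 0; out < numOutputs; ++out) {
    outputs[out] = allEqual;
  }
  return outputs;
}
```

An op whose outputs differ per replica even when its inputs are equal (or the reverse) is the kind of case where this default is not correct and an override is needed.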

bool doesAlias() const

Check if any input tensor aliases any output tensor.

Returns: true if any input tensor aliases any output tensor, otherwise false.

inline bool isOutplace() const

Check if this is an outplace op.

This means that no input tensor aliases any output tensor.

Returns: true if this is an outplace op, otherwise false.

bool doesAlias(InIndex inIndex, OutIndex outIndex) const

Check if the input tensor at an input index aliases the output tensor at an output index.

Returns: true if the input tensor at inIndex aliases the output tensor at outIndex, false otherwise.

bool modifies() const

Check if op modifies a tensor at any index.

Returns: true if the op modifies a tensor at any index, otherwise false.

bool modifiesIndex(InIndex in) const

Check if an op modifies a tensor at a specific index.

Parameters: in – The input index to check.
Returns: true if the op modifies the tensor, false otherwise.

bool overwritesTensor(Tensor *t) const

Check if an op overwrites a tensor.

Parameters: t – The tensor to check.
Returns: true if it overwrites the tensor, false otherwise.

bool modifiesTensor(Tensor *t) const

Check if an op modifies a tensor.

Parameters: t – The tensor to check.
Returns: true if it modifies the tensor, false otherwise.

inline virtual bool isInplaceViewChange() const

Check if this is an inplace op that changes a view.

Examples of inplace ops that change views are:

ReshapeInplaceOp
IdentityInplaceOp
TransposeInplaceOp.

See also

For more information on views, refer to the IPU Programmer’s Guide.

Returns: true if this is a view changing inplace op, false otherwise.

inline virtual bool isOutplaceViewChange() const

Check if this is an outplace op that changes a view.

Examples of outplace ops that change views are:

ReshapeOp
IdentityOp
TransposeOp.

See also

For more information on views, refer to the IPU Programmer’s Guide.

Returns: true if this is a view changing outplace op, otherwise false.

virtual int getNonGradInIndex(int gradOpOutIndex) const

Return the index in the non-grad op which has an output edge-gradient tensor in the matching grad op.

This method throws an error if the op this is called on is not a grad op.

Parameters: gradOpOutIndex – The index at which the grad op has an output of an edge-gradient tensor.
Returns: The index in the non-grad op containing the input tensor corresponding to the edge-gradient tensor in the grad op output.

virtual const std::vector<GradInOutMapper> &gradInputInfo() const

Get the mapping between input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

Returns: The mapping between input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

virtual const std::map<int, int> &gradOutToNonGradIn() const

Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.
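The relationship between gradOutToNonGradIn() and getNonGradInIndex() can be sketched as a map lookup. The example below is standalone and hypothetical: it describes an imagined grad op whose output 0 is the gradient of forward input 0 and whose output 1 is the gradient of forward input 2. The function names and the choice of exception are assumptions, not the PopART implementation.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>

// Hypothetical mapping: grad op output index -> non-grad op input index.
const std::map<int, int> &gradOutToNonGradInSketch() {
  static const std::map<int, int> outInfo = {{0, 0}, {1, 2}};
  return outInfo;
}

// Look up the non-grad input index for a given grad op output index.
int nonGradInIndexSketch(int gradOpOutIndex) {
  const auto &mapping = gradOutToNonGradInSketch();
  const auto it = mapping.find(gradOpOutIndex);
  if (it == mapping.end()) {
    throw std::out_of_range("no non-grad input for this grad output index");
  }
  return it->second;
}
```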

virtual std::unique_ptr<Op> clone() const = 0

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

template<typename T> inline bool isConvertibleTo() const

virtual bool isLossOp() const

Check if this is a LossOp op, for example NllOp or L1Op.

Note

The op SumOp which adds the losses together is not a LossOp.

Returns: true if this is a LossOp op, false otherwise.

virtual bool isIpuCopyOp() const

Check if this is an IpuCopyOp op.

Returns: true if this is an IpuCopyOp op, false otherwise.

virtual bool copiesOptimizerTensors() const

Check if this copies only optimizer tensors from one IPU to another.

Returns: true if this op copies only optimizer tensors from one IPU to another, false otherwise.

virtual bool isOptimizerOp() const: Check if op is part of the optimizer.

bool isGradientClippingOp() const: Check if op is a part of gradient clipping.

virtual bool requiresRandomSeed() const

Check if the op requires a random seed.

This is set to false by default and should be overridden and set to true if an IPU random seed tensor is required by the op. If so, it will be connected to inTensor(getSeedInIndex()) by the IR process.

Returns: true if the op requires a random seed, false otherwise.

virtual InIndex getSeedInIndex() const

Get the input index at which the random seed tensor is connected.

If requiresRandomSeed() returns true, the IR process connects the IPU random seed tensor to inTensor(getSeedInIndex()).

Returns: The input index of the random seed tensor.

bool hasInput(InIndex index) const

Check if the op has an input at the input index.

Returns: true if the op has an input at the input index, otherwise false.

bool hasOutput(OutIndex index) const

Check if the op has an output at the output index.

Returns: true if the op has an output at the output index, otherwise false.

Tensor *inTensor(InIndex index)

Get the input tensor at the input index.

Parameters: index – The input index.
Returns: The tensor at the input index.

const Tensor *inTensor(InIndex index) const

Get the input tensor at the input index.

Parameters: index – The input index.
Returns: The tensor at the input index.

Tensor *outTensor(OutIndex index)

Get the output tensor at the output index.

Parameters: index – The output index.
Returns: The tensor at the output index.

const Tensor *outTensor(OutIndex index) const

Get the output tensor at the output index.

Parameters: index – The output index.
Returns: The tensor at the output index.

TensorId inId(InIndex index)

Get the ID of the input tensor at the input index.

Parameters: index – The input index.
Returns: The tensor ID of the tensor at the input index.

const TensorId inId(InIndex index) const

Get the ID of the input tensor at the input index.

Parameters: index – The input index.
Returns: The tensor ID of the tensor at the input index.

TensorId outId(OutIndex index)

Get the ID of the output tensor at the output index.

Parameters: index – The output index.
Returns: The tensor ID of the tensor at the output index.

const TensorId outId(OutIndex index) const

Get the ID of the output tensor at the output index.

Parameters: index – The output index.
Returns: The tensor ID of the tensor at the output index.

TensorInfo &inInfo(InIndex index)

Get the info of the input tensor at the input index.

Parameters: index – The input index.
Returns: The tensor info of the tensor at the input index.

const TensorInfo &inInfo(InIndex index) const

Get the info of the input tensor at the input index.

Parameters: index – The input index.
Returns: The tensor info of the tensor at the input index.

TensorInfo &outInfo(OutIndex index)

Get the info of the output tensor at the output index.

Parameters: index – The output index.
Returns: The tensor info of the tensor at the output index.

const TensorInfo &outInfo(OutIndex index) const

Get the info of the output tensor at the output index.

Parameters: index – The output index.
Returns: The tensor info of the tensor at the output index.

const Shape &inShape(InIndex index) const

Get the shape info of the input tensor at the input index.

Parameters: index – The input index.
Returns: The shape info of the tensor at the input index.

const Shape &outShape(OutIndex index) const

Get the shape info of the output tensor at the output index.

Parameters: index – The output index.
Returns: The shape info of the tensor at the output index.

size_t inTensorCount() const

Get the number of input tensors of this op.

Returns: The number of input tensors this op has.

size_t outTensorCount() const

Get the number of output tensors of this op.

Returns: The number of output tensors this op has.

Rank inRank(InIndex index) const

Get the rank of the input tensor at the input index.

Parameters: index – The input index.
Returns: The rank of the tensor at the input index.

Rank outRank(OutIndex index) const

Get the rank of the output tensor at the output index.

Parameters: index – The output index.
Returns: The rank of the tensor at the output index.

InIndex inIndex(Tensor*) const

Get the input index of the tensor.

Parameters: Tensor – The input tensor.
Returns: The input index of the tensor in the op.

OutIndex outIndex(Tensor*) const

Get the output index of the tensor.

Parameters: Tensor – The output tensor.
Returns: The output index of the tensor in the op.

virtual void appendAttributes(OpSerialiserBase&) const

Append attributes when serialising the op to a stream.

This is used for debugging and also to generate the PopART IR hash. This hash is used to determine whether a Poplar cache can be reused so it is important that op attributes which may alter the Poplar compilation are appended to this stream. If this method is overridden, then it must also call the base class method.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

virtual void appendOutlineAttributes(OpSerialiserBase&) const

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

virtual void appendMore(OpSerialiserBase&) const

Append additional attributes to the stream.

This method should be overridden if the derived class has additional attributes.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

Shape prettyNpOut(const Shape &s0, const Shape &s1) const

Calculate the NumPy broadcast shape for two shapes.

This will throw an error if the broadcast is not aligned. The error will have operator context. Note: If the replicated tensor sharding meta-shape is required, use prettyNpOut with TensorInfo instead.

Parameters

s0 – The first shape.
s1 – The second shape.

Returns

The NumPy-like broadcasted output shape.
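For readers unfamiliar with NumPy broadcasting, the rule can be sketched in a few lines of standalone C++. This is not the PopART implementation: Shape is assumed here to be a vector of int64_t dimension sizes, npOutSketch is a hypothetical helper name, and the error it throws carries no operator context.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Assumption: a shape is a vector of dimension sizes, outermost first.
using Shape = std::vector<int64_t>;

// NumPy-style broadcast of two shapes, aligned from the trailing dimension.
Shape npOutSketch(const Shape &s0, const Shape &s1) {
  const std::size_t rank = std::max(s0.size(), s1.size());
  Shape out(rank, 1);
  for (std::size_t i = 0; i < rank; ++i) {
    // Missing leading dimensions are treated as size 1.
    const int64_t d0 = i < s0.size() ? s0[s0.size() - 1 - i] : 1;
    const int64_t d1 = i < s1.size() ? s1[s1.size() - 1 - i] : 1;
    if (d0 != d1 && d0 != 1 && d1 != 1) {
      throw std::runtime_error("shapes are not broadcast-aligned");
    }
    out[rank - 1 - i] = (d0 == 1) ? d1 : d0;
  }
  return out;
}
```

With this sketch, shapes {2, 1, 4} and {3, 1} broadcast to {2, 3, 4}, while {2, 3} and {4, 3} raise an error because 2 and 4 are neither equal nor 1.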

TensorInfo prettyNpOut(const TensorInfo &i0, const TensorInfo &i1, bool checkDataType = true) const

Calculate the NumPy broadcast shape for two shapes.

This will throw an error if the broadcast is not aligned. The error will have operator context.

Parameters

i0 – The info for the first tensor containing shape and meta-shape.
i1 – The info for the second tensor containing shape and meta-shape.
checkDataType – If true, check that the data types are identical and throw an error if they are not. If false, skip the check.

Returns

The NumPy-like broadcast output info containing the correct shape and meta-shape. The data type is taken from i0.

virtual std::vector<const Graph*> getCalledGraphs() const

Get all graphs that this op may call during its execution.

Returns: A vector of all graphs that this op may call during its execution.

std::vector<GraphId> getCalledGraphIds() const

Get the IDs of all graphs that this op may call during its execution.

Returns: A vector of IDs of all graphs that this op may call during its execution.

SubgraphIndex getCalledGraphIndex(const GraphId &id) const

Get the index in the op where the graph is called.

Parameters: id – The ID of the called graph.
Returns: The index at which the graph is called.

virtual InIndex opInToSubgraphInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const

Get the input index for the subgraph corresponding to the op input index.

Parameters

subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).
inIndex – The input index in the op.

Returns

The input index in the subgraph that corresponds to the input index in the op, or -1 if the op input index is not used by the subgraph.

virtual InIndex subgraphInToOpInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const

Get the input index for the op corresponding to the subgraph input index.

Parameters

subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).
inIndex – The input index in the subgraph.

Returns

The input index in the op that corresponds to the input index in the subgraph, or -1 if the subgraph input index is not used by the op.

virtual OutIndex opOutToSubgraphOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const

Get the output index for the subgraph corresponding to the op output index.

Parameters

subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).
outIndex – The output index in the op.

Returns

The output index in the subgraph that corresponds to the output index in the op, or -1 if the op output index is not used by the subgraph.

virtual OutIndex subgraphOutToOpOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const

Get the output index for the op corresponding to the subgraph output index.

Parameters

subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).
outIndex – The output index in the subgraph.

Returns

The output index in the op that corresponds to the output index in the subgraph, or -1 if the subgraph output index is not used by the op.
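The four index-mapping methods above follow the same pattern: translate an index between the op and one of its called subgraphs, and return -1 when there is no counterpart. The standalone sketch below shows the input direction for a hypothetical call-like op whose first leadingOpInputs inputs are not passed to the subgraph. The struct, its fields and the offset scheme are assumptions for illustration only; the output direction would be analogous.

```cpp
#include <cassert>

// Hypothetical op that forwards all but its leading inputs to one subgraph.
struct CallLikeSketch {
  int leadingOpInputs;   // op inputs that the subgraph does not receive
  int numSubgraphInputs; // number of inputs the called subgraph has

  // Op input index -> subgraph input index, or -1 if not used.
  int opInToSubgraphInIndex(int inIndex) const {
    const int subgraphIn = inIndex - leadingOpInputs;
    return (subgraphIn >= 0 && subgraphIn < numSubgraphInputs) ? subgraphIn
                                                               : -1;
  }

  // Subgraph input index -> op input index, or -1 if not used.
  int subgraphInToOpInIndex(int inIndex) const {
    return (inIndex >= 0 && inIndex < numSubgraphInputs)
               ? inIndex + leadingOpInputs
               : -1;
  }
};
```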

virtual std::set<OutIndex> opInToOpOutIndex(InIndex in) const

Get the set of outputs to visit based on the input index (for graph traversal).

Parameters: in – The input index used to determine the set of outputs to visit.
Returns: The set of outputs to visit based on the input index.

virtual std::set<InIndex> opOutToOpInIndex(OutIndex out) const

Get the set of inputs to visit based on the output index (for graph traversal).

Parameters: out – The output index used to determine the set of inputs to visit.
Returns: The set of inputs to visit based on the output index.

std::string getSubgraphEquivId(const std::map<std::string, popart::any> &externalAttrs = {}) const

Get a string that represents the equivalence class that this op belongs to.

This is used, for example by transforms, to determine if two ops are the same. Two ops can be considered to be in the same equivalence class if and only if they return the same equivalence ID.

Parameters: externalAttrs – Additional attributes by which to distinguish this op. The value types must be one of: float, double, int, int64_t, uint32_t, uint64_t, std::string, std::vector<float>, std::vector<double>, std::vector<int64_t>, popart::Scope, bool, nonstd::optional<int64_t>, nonstd::optional<float>, nonstd::optional<double> or std::map<TensorId, uint64_t>. This is used to add, for example, replica-equalness properties to the equivalence ID; these properties are calculated on the fly rather than stored in the op.
Returns: The equivalence ID.
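The idea of an equivalence ID can be illustrated with a standalone sketch that serialises the op type, its attributes and any external attributes into one string. The format of the string, the use of string-valued attributes and the name equivIdSketch are all assumptions; the actual PopART ID is built differently.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Assumption: attributes are reduced to string key/value pairs.
using AttrMap = std::map<std::string, std::string>;

// Build an ID such that equal type and attributes give an equal string.
std::string equivIdSketch(const std::string &opType, const AttrMap &attrs,
                          const AttrMap &externalAttrs = {}) {
  std::ostringstream ss;
  ss << opType;
  for (const auto &kv : attrs) {
    ss << ';' << kv.first << '=' << kv.second;
  }
  // External attributes distinguish ops by properties not stored in the op.
  for (const auto &kv : externalAttrs) {
    ss << ";ext:" << kv.first << '=' << kv.second;
  }
  return ss.str();
}
```

Two ops with the same type and attributes produce the same ID, and adding an external attribute such as a replica-equal flag to only one of them makes the IDs differ.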

std::map<fwtools::subgraph::InIndex, SubgraphInSig> getSubgraphInputs() const

Get all the producer ops of the tensors consumed at the input index.

Returns: A map of producer ops for the tensors consumed at the input index.

std::map<fwtools::subgraph::OutIndex, OpSet> getSubgraphOutputs() const

Get all the consumer ops of the tensors produced at the output index.

Returns: A map of consumer ops for the tensors produced at the output index.

virtual float getSubgraphValue() const = 0

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

inline float getHighSubgraphValue() const: Return the high subgraph value.

inline float getLowSubgraphValue() const: Return the low subgraph value.

virtual float calcAutoVirtualGraphCost(std::set<int> &inputs_seen): Get approximate cost of activations between forward and backward graphs.

virtual bool isOutlineable() const

Check if op can be outlined.

If this method returns false, it will mean that any possible subgraph that this op is part of will not be cached.

Returns: true if the op can be outlined, false otherwise. Default: true.

virtual bool hasSideEffect() const

Check if the op has any effect that is not captured by the (modification of) input or output tensors, such as modifying the state of the IPU or host system.

Returns: true if the op has side effects, false otherwise. Default: false.

virtual bool canRecompute() const

Check if the op can be recomputed.

To recompute an op means to clone it to produce the same output. The function checks the safeness of recompute in the context of explicit recompute. It may still be unsafe for implicit recompute.

Returns: true if the op can be recomputed, false otherwise. By default, an op can be recomputed if it has no side effects (see hasSideEffect()).

bool inputsUnmodifiable() const

Check if any input indices are unmodifiable or alias an unmodifiable tensor.

Returns: true if any connected variable tensor for all input indices has a non-empty alias chain and is unmodifiable, false otherwise.

bool consumesGraphOutput() const

Check if op consumes the outputs of the graph.

Returns: true if op consumes graph outputs, false otherwise.

bool producesGraphOutput() const

Check if op produces the outputs of the graph.

Returns: true if op produces graph outputs, false otherwise.

bool inputUnmodifiable(InIndex in) const

Check if the input index is unmodifiable or aliases an unmodifiable tensor.

Parameters: in – The input index to check.
Returns: true if any connected variable tensor has a non-empty alias chain and is unmodifiable, false otherwise.

bool inputUnmodifiableFor(InIndex in, const AliasModel *popMem) const

Check if the input index is unmodifiable or aliases an unmodifiable tensor, using the given Poprithms graph.

Parameters

in – The input index to check.
popMem – The alias model (Poprithms graph) to use for the check.

Returns

true if any connected variable tensor has a non-empty alias chain and is unmodifiable, false otherwise.

bool hasAliasedModifiers(OutIndex out) const

Check if output is modified by any consumer.

Parameters: out – The output index to check.
Returns: true if any consumer of any aliased tensor downstream modifies a non-empty region, false otherwise.

bool hasAliasedModifiersFor(OutIndex out, const AliasModel *popMem) const

Check if output is modified by any consumer, using the given Poprithms graph.

Parameters

out – The output index to check.
popMem – The alias model (Poprithms graph) to use for the check.

Returns

true if any consumer of any aliased tensor downstream modifies a non-empty region, false otherwise.

bool isParentOf(const Op*) const

Check if the graph is a parent of the op.

A graph is a parent of an op if and only if the op is a child of the graph.

Parameters: Op – The op that is being checked.
Returns: true if the graph is a parent graph, false otherwise.

bool isChildOf(const Op*) const

Check if the graph is a child graph.

A graph is a direct child of an op if the graph consumes any of the tensors the op produces.

Parameters: Op – The op that is being checked.
Returns: true if the graph is a child graph, false otherwise.

virtual bool canShard() const

Check if the operation can be sharded into multiple operations.

Returns: true if the operation can be sharded, false otherwise.

virtual ReductionType getShardReductionType(OutIndex index) const

Get the reduction type to apply after sharding, if the output shape does not change.

Parameters: index – The output index at which to determine the reduction type.
Returns: The reduction type.

inline virtual float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const

Get the scale factor to apply after sharding, if required.

Parameters

shardedOp – The sharded op.
index – The output index at which to determine the scale factor.

Returns

The scale factor. Default: 1.0.

std::map<TensorId, std::vector<TensorId>> shard(const std::map<TensorId, std::vector<TensorId>> &inputs)

Shard an operation into multiple operations according to the new, already sharded input tensors.

Parameters: inputs – The sharded input tensors.
Returns: The sharded output tensors.

ShardingPlan shard(const ShardingPlan plan)

Create an output sharding plan from sharding an op.

The sharding plan also contains the individual input/output shards of an operation. When sharding an operation, the new plan is updated with the resulting sharded tensors.

Parameters: plan – The input sharding.
Returns: The plan after sharding the operation containing the resulting sharded tensors.

virtual void configureShardedOp(Op *const shardedOp, const Settings *const settings_) const

Configure a sharded op.

Parameters

shardedOp – The sharded op to be configured.
settings_ – The settings to apply to the sharded op.

virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const: Return which inputs and outputs are replicated tensor sharding pairs.

virtual void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, CommGroup shardingDomain)

Configure the op for replicated tensor sharding at specific indices.

Parameters

indices – The indices at which to configure the op for replicated tensor sharding.
shardingDomain – The type and size of the replica group specified by a CommGroup object.

virtual void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, const ReplicaGrouping &grouping)

Configure the op for replicated tensor sharding at specific indices.

Parameters

indices – The indices at which to configure the op for replicated tensor sharding.
grouping – The stride and size of the replica group specified by a ReplicaGrouping object.

void transferBaseProperties(Op *to)

Transfer the base properties from this op to another op.

Parameters: to – The op to transfer the base properties to.

Op *getPrecedingOp(InIndex inIndex)

Get the producer op of the input tensor at the input index.

Parameters: inIndex – The index at which the input tensor is produced.
Returns: The op which produces the input tensor at the input index.

Op *getFollowingOp(OutIndex outIndex = 0)

Get the op that consumes an output tensor at an output index.

This will throw an error if there is more than one consumer op.

Parameters: outIndex – The index at which the output tensor is consumed.
Returns: The op which consumes the output tensor at the output index.

std::vector<Op*> getFollowingOps(OutIndex outIndex = 0)

Get all ops that consume an output tensor at an output index.

Parameters: outIndex – The index at which the output tensor is consumed.
Returns: A vector of ops which consume the output tensor at the output index.

template<typename T> inline T *getPrecedingOp(InIndex inIndex)

Get the producer op of the input tensor at the input index.

This will throw an error if the producer op cannot be converted to type T.

Parameters: inIndex – The index at which the input tensor is produced.
Returns: The op, converted to type T, which produces the input tensor at the input index.

template<typename T> inline T *getFollowingOp(OutIndex outIndex = 0)

Get the op that consumes an output tensor at an output index.

This will throw an error if there is more than one consumer op, or if the consumer op cannot be converted to type T.

Parameters: outIndex – The index at which the output tensor is consumed.
Returns: The op, converted to type T, which consumes the output tensor at the output index.

template<typename T> inline std::vector<T*> getFollowingOps(OutIndex outIndex = 0)

Get all ops that consume an output tensor at an output index.

This will throw an error if not all of the consumer ops can be converted to type T.

Parameters: outIndex – The index at which the output tensor is consumed.
Returns: A vector of ops, converted to type T, which consume the output tensor at the output index.
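The typed accessors above combine a lookup with a checked conversion. The standalone sketch below mirrors that behaviour for the single-consumer case using dynamic_cast. OpSketch, ReluSketch, ConvSketch and followingOpAs are hypothetical names, and the sketch also rejects the case of zero consumers, which is an assumption not stated in the text above.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical op hierarchy standing in for popart::Op and its subclasses.
struct OpSketch {
  virtual ~OpSketch() = default;
};
struct ReluSketch : OpSketch {};
struct ConvSketch : OpSketch {};

// Return the single consumer converted to T, or throw if that is impossible.
template <typename T>
T *followingOpAs(const std::vector<OpSketch *> &consumers) {
  if (consumers.size() != 1) {
    throw std::runtime_error("expected exactly one consumer op");
  }
  T *converted = dynamic_cast<T *>(consumers.front());
  if (converted == nullptr) {
    throw std::runtime_error("consumer op is not of the requested type");
  }
  return converted;
}
```

Throwing on a failed conversion, rather than returning a null pointer, means the caller can use the result directly without a further check.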

bool isPipelineIpuCopyOp() const

Check if the op is of the class IpuCopyOp that copies between pipeline stages.

Returns: true if op is of the class IpuCopyOp and copies between pipeline stages, false otherwise.

Public Members

std::unique_ptr<TensorIndexMap> input

std::unique_ptr<TensorIndexMap> output

OpId id = {-1}

OperatorIdentifier opid

bool pruneable = true

Settings settings

OpDebugInfo debugInfo

struct Settings

Structure to capture the settings for the op.

Public Functions

inline Settings(Graph &graph_, const std::string &name_)

Constructor for the Settings structure.

Parameters

graph_ – The graph the op belongs to.
name_ – The name of the op.

inline Settings(Graph &graph_, const std::string &name_, const Scope &scope_)

Constructor for the Settings structure.

Parameters

graph_ – The graph the op belongs to.
name_ – The name of the op.
scope_ – The scope of the op.

inline Settings(Graph &graph_, const std::string &name_, const Scope &scope_, const uint64_t parentId_)

Constructor for the Settings structure.

Parameters

graph_ – The graph the op belongs to.
name_ – The name of the op.
scope_ – The scope of the op.
parentId_ – The ID of the debug info.

inline Settings(Graph &graph_, const std::string &name_, const uint64_t parentId_)

Constructor for the Settings structure.

Parameters

graph_ – The main graph.
name_ – The name of the op.
parentId_ – The ID of the debug info.

virtual ~Settings() = default: Destructor for the Settings structure.

Settings(const Settings&) = default

inline Settings copy(const std::string &new_name)

Create a copy of the current settings with a new name.

Parameters: new_name – The name of the new settings.
Returns: A copy of the current settings with the new name.
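
As a rough illustration of the copy-with-new-name behaviour described for copy(), the following standalone sketch uses a reduced, hypothetical SketchSettings structure with only three of the documented members. It assumes that every field other than the name is carried over unchanged.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced stand-in for Op::Settings; not the PopART type.
struct SketchSettings {
  std::string name;
  double schedulePriority = 0.0;
  bool optimizerOp = false;

  // Copy every field, then replace only the name.
  SketchSettings copy(const std::string &newName) const {
    SketchSettings result = *this;
    result.name = newName;
    return result;
  }
};
```

The original object is left untouched, so it can keep serving as a template for further ops.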

virtual void setFromAttributes(const Attributes &attributes)

Append the optional attributes to the Settings structure depending on whether the attribute has been set in the ONNX model.

Parameters: attributes – The attributes to be added to the Settings structure.

Ir &getIr() const

Get the IR associated with the main graph.

Returns: The IR associated with the main graph.

Public Members

std::reference_wrapper<Graph> graph

std::string name = ""

Scope scope

RecomputeType recomputeType = RecomputeType::Undefined 

OptionalTensorLocation tensorLocation

std::vector<std::tuple<std::string, float>> inplacePriorityVeto

std::unordered_set<std::string> excludePatterns

OptionalVGraphId vgraphId

OptionalPipelineStage pipelineStage

OptionalExecutionPhase executionPhase

OptionalBatchSerializedPhase batchSerializedPhase

OptionalStochasticRoundingMethod stochasticRoundingMethod

TileSet tileSet = {TileSet::Compute}

ExecutionContext executionContext = {ExecutionContext::Normal}

std::map<InIndex, InIndex> inferTensorMappingToFrom

double schedulePriority = {0.0}

std::map<std::string, std::string> extraOutlineAttributes

uint64_t debugInfoId = {0}

bool optimizerOp = {false}

bool gradientClippingOp = {false}

class GradInOutMapper

Class that represents the mapping between the indices of the input tensors to the gradient operation and the indices of these same tensors in the non-gradient operation.

Public Functions

GradInOutMapper(InIndex iGrad_, int iNonGrad_, GradOpInType)

Constructor for the GradInOutMapper class.

Parameters

iGrad_ – The index of the input tensor to the gradient operation.
iNonGrad_ – The index of the gradient operation input tensor as it is indexed in the non-gradient operation.
GradOpInType – The type of the input tensor to the gradient operation.

bool operator==(const GradInOutMapper &rhs) const

Check if the current GradInOutMapper object is equal to another GradInOutMapper object.

Parameters: rhs – A GradInOutMapper object to be compared to the current object.
Returns: true if objects are equal, false otherwise.

Public Members

InIndex iGrad

int iNonGrad

GradOpInType type
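
To make the index mapping concrete, here is a standalone sketch for the gradient of a hypothetical op y = a * b. SketchGradInOutMapper and SketchGradOpInType mirror the documented members and enumerators but are not the PopART types, and the equality operator comparing all three members is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical mirror of the documented enum and members; not PopART code.
enum class SketchGradOpInType { In = 0, Out, GradOut };

struct SketchGradInOutMapper {
  int iGrad;    // input index on the gradient op
  int iNonGrad; // index of the same tensor on the non-gradient op
  SketchGradOpInType type;

  // Assumption: equality compares all three members.
  bool operator==(const SketchGradInOutMapper &rhs) const {
    return iGrad == rhs.iGrad && iNonGrad == rhs.iNonGrad && type == rhs.type;
  }
};

// Input mapping for the gradient op of a hypothetical op y = a * b.
std::vector<SketchGradInOutMapper> mulGradInputMapping() {
  return {
      {0, 0, SketchGradOpInType::GradOut}, // grad input 0: gradient of y (output 0)
      {1, 0, SketchGradOpInType::In},      // grad input 1: a (input 0)
      {2, 1, SketchGradOpInType::In},      // grad input 2: b (input 1)
  };
}
```

Each entry answers the question "where does gradient-op input iGrad come from on the forward op?", with the type selecting between the forward op's inputs, outputs and output gradients.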

enum class popart::ReductionType

Define the reduction operation to use over a sequence of tensors.

The two use-cases for this enum type are:

denoting how to reduce individual losses produced by a LossOp over a minibatch (specified by the LossOp reduction parameter)
denoting how to reduce weight gradients over a number of replicas when gradient accumulation is enabled (specified by the global session option SessionOptions::accumulationAndReplicationReductionType).

Values:

enumerator Sum = 0: Sum the input values and do not scale the output (Default).

enumerator Mean: Take the mean of the input values.

enumerator NoReduction

Do not reduce the input values.

Keep them stacked into a single tensor. So values \(t_1, ..., t_k\) get collected into a tensor \([t_1, ..., t_k]\).

enumerator N: The number of ReductionType values.

#include <popart/operatoridentifier.hpp>

struct OperatorIdentifier

Subclassed by popart::AiGraphcoreOpIdV1

Public Functions

inline OperatorIdentifier(const OpDomain &_domain, const OpType &_type, OpVersion _version, NumInputs inputs = {}, int outputs = 0)

inline bool operator==(const OperatorIdentifier &rhs) const

inline bool operator!=(const OperatorIdentifier &rhs) const

inline bool operator<(const OperatorIdentifier &rhs) const

Public Members

OpDomain domain

OpType type

OpVersion version

NumInputs numInputs

int numOutputs

struct NumInputs

Public Functions

inline NumInputs()

inline NumInputs(int f)

inline NumInputs(int _min, int _max)

Public Members

int min

int max

#include <popart/tensorlocation.hpp>

using popart::VGraphIdAndTileSet = std::pair<VGraphId, TileSet>

#include <popart/basicoptionals.hpp>

using popart::OptionalTensorLocation = BasicOptional<TensorLocation, 9>

using popart::OptionalVGraphId = BasicOptional<VGraphId, 2>

using popart::OptionalPipelineStage = BasicOptional<PipelineStage, 3>

using popart::OptionalExecutionPhase = BasicOptional<ExecutionPhase, 5>

using popart::OptionalBatchSerializedPhase = BasicOptional<BatchSerializedPhase, 7>

using popart::OptionalStochasticRoundingMethod = BasicOptional<StochasticRoundingMethod, 10>

using popart::OptionalDataType = BasicOptional<DataType, 0>

#include <popart/opmanager.hpp>

class OpDefinition

Public Types

using DataTypes = std::vector<DataType>

using Inputs = std::vector<Input>

using Outputs = std::vector<Output>

using Attributes = std::map<std::string, Attribute>

Public Functions

inline OpDefinition()

inline OpDefinition(Inputs i, Outputs o, Attributes a)

Public Members

Inputs inputs

Outputs outputs

Attributes attributes

struct Attribute

Public Functions

inline Attribute(std::string regex)

Public Members

std::string supportedValuesRegex

struct Input

Public Functions

inline Input(std::string n, std::vector<DataType> t, bool _constant = false)

Public Members

std::string name

std::vector<DataType> supportedTensors

bool constant

struct Output

Public Functions

inline Output(std::string n, std::vector<DataType> t)

Public Members

std::string name

std::vector<DataType> supportedTensors

class OpCreatorInfo

Public Functions

inline OpCreatorInfo(const OperatorIdentifier &_opid, const Op::Settings &_settings, const Attributes &_attributes, const std::vector<TensorId> &_inputIds, const std::vector<TensorId> &_outputIds)

inline bool hasInputIds() const

inline bool hasOutputIds() const

const std::vector<TensorId> &getInputIds() const

const std::vector<TensorId> &getOutputIds() const

Tensor *getInputTensor(int index) const

TensorData *getInputTensorData(int index) const

TensorInfo &getInputTensorInfo(int index) const

bool hasInputTensor(int index) const

std::string debugName() const

template<typename T> inline std::vector<T> getInputData(int index, const std::set<DataType> &acceptedTypes) const

template<typename T> inline std::vector<T> getInputData(int index) const

template<typename T> inline T getInputScalarValue(int index) const

template<typename T> inline T getInputScalarValue(int index, T defaultValue) const

Public Members

const OperatorIdentifier &opid

const Op::Settings &settings

const Attributes &attributes

class OpManager

Public Types

using OpFactoryFunc = std::function<std::unique_ptr<Op>(const OpCreatorInfo&)>

using ComplexOpFactoryFunc = std::function<Op*(const OpCreatorInfo&, Graph &graph)>

Public Functions

OpManager() = default

Public Static Functions

static void registerOp(const OpInfo &opInfo)

static Attributes getAttributesFromAnyMap(std::map<std::string, popart::any> attributes)

static std::unique_ptr<Op> createOp(const OpDomain &domain, const OpType &type, const int opsetVersion, Graph &graph, const std::string &name = "", const Scope &scope = {}, const Attributes &_attr = {}, const std::vector<TensorId> &inputIds = {}, const std::vector<TensorId> &outputIds = {})

static std::unique_ptr<Op> createOp(const OperatorIdentifier &opid, Graph &graph, const std::string &name = "", const Attributes &_attr = {})

static std::unique_ptr<Op> createOpWithInputs(const OperatorIdentifier &opid, Graph &graph, const std::string &name, const Attributes &_attr, const std::vector<TensorId> &inIds)

static Op *createOpInGraph(const Node &node, Graph &graph)

static const std::vector<OperatorIdentifier> getSupportedOperations(bool includePrivate)

static const std::vector<OperatorIdentifier> getUnsupportedOperations(int opsetVersion)

static const OpDefinitions getSupportedOperationsDefinition(bool includePrivate)

static OpVersion getOpVersionFromOpSet(const OpDomain &opDomain, const OpType &type, const int opsetVersion)

class OpInfo

Public Functions

inline OpInfo(const OperatorIdentifier &_id, bool _isPublic, const OpDefinition &_details, OpFactoryFunc _f1)

inline OpInfo(const OperatorIdentifier &_id, bool _isPublic, const OpDefinition &_details, ComplexOpFactoryFunc _f2)

OpFactoryFunc &getSimpleFactory()

ComplexOpFactoryFunc &getComplexFactory()

bool hasComplexFactory()

Public Members

bool isPublic

const OperatorIdentifier id

OpDefinition details
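
The OpFactoryFunc alias suggests a factory-registry pattern: a callable that builds an op is stored at registration time and invoked when an op is created. The standalone sketch below shows that pattern in general terms. It is not the OpManager implementation: all Registry* names are hypothetical, ops are keyed by a plain string here (the real registerOp() receives an OpInfo, which carries an OperatorIdentifier), and throwing for an unknown type is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for Op and OpCreatorInfo; not PopART types.
struct RegistryOp {
  virtual ~RegistryOp() = default;
  virtual std::string kind() const = 0;
};
struct RegistryCreatorInfo {
  std::string name;
};

// Same shape as the documented OpFactoryFunc alias.
using RegistryFactoryFunc =
    std::function<std::unique_ptr<RegistryOp>(const RegistryCreatorInfo &)>;

class RegistrySketch {
public:
  // Store the factory under the op type name.
  void registerOp(const std::string &type, RegistryFactoryFunc factory) {
    factories_[type] = std::move(factory);
  }

  // Look up the factory and call it. Assumption: unknown types throw here.
  std::unique_ptr<RegistryOp> createOp(const std::string &type,
                                       const std::string &name) const {
    auto found = factories_.find(type);
    if (found == factories_.end()) {
      throw std::runtime_error("no factory registered for " + type);
    }
    return found->second(RegistryCreatorInfo{name});
  }

private:
  std::map<std::string, RegistryFactoryFunc> factories_;
};

struct RegistryIdentityOp : RegistryOp {
  std::string kind() const override { return "Identity"; }
};
```

A caller would register a lambda that returns a std::unique_ptr<RegistryOp> holding a RegistryIdentityOp, then request it by type name through createOp().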

enum class popart::RecomputeType

Define the type of recomputation.

Values:

enumerator Undefined = 0: Default value if RecomputeType has not been set.

enumerator Checkpoint: Do not recompute. Outputs from the op are kept from the forward pass.

enumerator Recompute: Recompute operation.

enumerator Recomputed

For explicit recomputation, this marks a cloned operation that had RecomputeType::Recompute set.

After cloning, the original op is changed to RecomputeType::Checkpoint, and the cloned op is changed to Recomputed.
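
The relabelling performed during explicit recomputation can be sketched as follows. This is a standalone illustration with hypothetical names (SketchRecomputeType, SketchRecomputeOp, cloneForExplicitRecompute()); it models only the change of RecomputeType values described above, not the graph transformation itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the documented enumerators; not the PopART enum.
enum class SketchRecomputeType { Undefined = 0, Checkpoint, Recompute, Recomputed };

// Hypothetical, minimal op record holding just a name and a recompute type.
struct SketchRecomputeOp {
  std::string name;
  SketchRecomputeType recomputeType;
};

// Clone an op marked Recompute. Afterwards the original is a Checkpoint
// and the returned clone is marked Recomputed.
SketchRecomputeOp cloneForExplicitRecompute(SketchRecomputeOp &original) {
  assert(original.recomputeType == SketchRecomputeType::Recompute);
  SketchRecomputeOp clone = original;
  clone.recomputeType = SketchRecomputeType::Recomputed;
  original.recomputeType = SketchRecomputeType::Checkpoint;
  return clone;
}
```

After the call, no op in the pair carries RecomputeType::Recompute any more, which is what distinguishes an already-processed op from one that still needs cloning.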

enum class popart::ExecutionContext

Define the type of the execution context.

Values:

enumerator Normal = 0: Run the forward and backward passes (Default).

enumerator AccumulateOuterFragment: Used to run the AccumulateOps after the gradient accumulation loop completes.

enumerator WeightsFromHostFragment: Used to transfer weights from host to device.

enumerator WeightsToHostFragment: Used to download weights from the device to the host.

enumerator OptimizerFromHostFragment: Used to stream the optimizer state from the host.

enumerator Subgraph: Program fragment used for subgraph-specific operations.

enum class popart::GradOpInType

Define the relationship between the input tensors of a gradient operation and the corresponding non-gradient operation.

Values:

enumerator In = 0: Indicates that the input tensor to the gradient operation is an input tensor of the non-gradient operation (Default).

enumerator Out: Indicates that the input tensor to the gradient operation is an output tensor of the non-gradient operation.

enumerator GradOut: Indicates that the input tensor to the gradient operation is an output gradient tensor of the non-gradient operation.

#include <popart/op/varupdate.hpp>

class VarUpdateOp : public popart::Op 

Base class used to define PopART ops that update variable tensors.

Subclassed by popart::AccumulatorScaleOp, popart::VarUpdateWithUpdaterOp

Public Functions

VarUpdateOp(const OperatorIdentifier&, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const override = 0

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual view::Regions aliases(InIndex in, OutIndex) const override

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters

InIndex – The input index.
OutIndex – The output index.

Returns

The regions which the output will alias.

virtual view::Regions modifies(InIndex) const override

Return the input region which this op modifies (for inplace ops).

Parameters: InIndex – The input index.
Returns: The regions which this op modifies.

virtual std::map<InIndex, TensorId> optimizerInputs() const = 0

inline virtual bool isOptimizerOp() const override: Check if op is part of the optimizer.

virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override: Return which inputs and outputs are replicated tensor sharding pairs.

virtual void growAliasModel(AliasModel&) const override

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms graph held by aliasModel.

See also

AliasModel.

Parameters: aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
Pre: All input tensors of this op have mappings in aliasModel before the call to this method.
Post: All output tensors of this op have mappings in aliasModel after the call to this method.

Public Static Functions

static inline InIndex getVarToUpdateInIndex()

static inline OutIndex getUpdatedVarOutIndex()

class AccumulatorScaleOp : public popart::VarUpdateOp 

Inplace multiplies a tensor by an OptimizerValue factor.

As with other ops that consume OptimizerValues, this op only has an input tensor for the factor if the OptimizerValue is not const.

If the factor is const and 0, the op zeroes the input tensor directly.
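
The const and non-const cases can be sketched on a plain vector. This standalone example is an illustration only: SketchOptimizerValue and accumulatorScale() are hypothetical names, and the runtimeFactor parameter stands in for the factor input tensor that exists only when the OptimizerValue is not const.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for OptimizerValue: a value plus a const flag.
struct SketchOptimizerValue {
  float value;
  bool isConst;
};

// Scale accum in place. runtimeFactor models the extra input tensor and is
// only read when the factor is not const.
void accumulatorScale(std::vector<float> &accum,
                      const SketchOptimizerValue &factor,
                      float runtimeFactor = 1.0f) {
  if (factor.isConst && factor.value == 0.0f) {
    // Const zero factor: write zeros directly instead of multiplying.
    std::fill(accum.begin(), accum.end(), 0.0f);
    return;
  }
  const float f = factor.isConst ? factor.value : runtimeFactor;
  for (float &a : accum) {
    a *= f;
  }
}
```

A const factor is baked into the call, whereas a non-const factor can change between calls without rebuilding anything, which is the reason the non-const case needs its own input.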

Subclassed by popart::AccumulatorZeroOp

Public Functions

AccumulatorScaleOp(const OptimizerValue factor_, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::map<InIndex, TensorId> optimizerInputs() const override

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

inline const OptimizerValue &getFactor() const

inline virtual float getSubgraphValue() const override

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

virtual view::Regions modifies(InIndex) const override

Return the input region which this op modifies (for inplace ops).

Parameters: InIndex – The input index.
Returns: The regions which this op modifies.

Public Static Functions

static inline InIndex getFactorInIndex()

class AccumulatorZeroOp : public popart::AccumulatorScaleOp 

An AccumulatorScaleOp with a factor of 0, which therefore zeroes the input tensor.

Public Functions

inline AccumulatorZeroOp(const Op::Settings &settings)

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

class VarUpdateWithUpdaterOp : public popart::VarUpdateOp 

Subclassed by popart::AccumulateBaseOp, popart::AdamComboOp, popart::AdamVarUpdateOp, popart::AdaptiveComboOp, popart::CopyVarUpdateOp, popart::ScaledVarUpdateOp, popart::SGD0ComboOp, popart::SGD0VarUpdateOpBase, popart::SGD1AcclUpdateOp, popart::SGD1VarUpdateOp, popart::SGDMComboBaseOp

Public Functions

VarUpdateWithUpdaterOp(const OperatorIdentifier &opid, const Op::Settings &settings_)

virtual std::unique_ptr<Op> clone() const override = 0

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

Public Static Functions

static inline InIndex getUpdaterInIndex()

class AccumulateBaseOp : public popart::VarUpdateWithUpdaterOp 

Subclassed by popart::AccumulateOp, popart::RescaleAccumulateOp, popart::SparseAccumulateOp

Public Functions

AccumulateBaseOp(const OperatorIdentifier &opid, AccumulationType type_, OptimizerValue factor_, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const override = 0

std::map<InIndex, TensorId> optimizerInputs() const override

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const override

inline AccumulationType getAccumulationType() const

inline const OptimizerValue &getFactor() const

Public Static Functions

static inline constexpr InIndex getFactorInIndex()

class AccumulateOp : public popart::AccumulateBaseOp 

Public Functions

AccumulateOp(AccumulationType type, OptimizerValue factor, const Op::Settings&)

std::unique_ptr<Op> clone() const override

class RescaleAccumulateOp : public popart::AccumulateBaseOp 

The same as AccumulateOp, except that it also takes a rescale factor, which allows the accumulator to be rescaled at the same time.

Public Functions

RescaleAccumulateOp(AccumulationType type_, OptimizerValue factor_, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::map<InIndex, TensorId> optimizerInputs() const final

Public Static Functions

static inline InIndex getRescaleRatioInIndex()

class SparseAccumulateOp : public popart::AccumulateBaseOp 

Say you have:

    w -> Gather -> x

In the backward pass you have:

    dW <- GatherGrad <- x

and when the optimiser step is grown:

    dW <- GatherGrad <- x
     \
      Accumulate -> accum’
     /
    accum

GatherGrad is essentially a scatter. Then we Accumulate the resultant dW on accum. This involves creating an extra dW tensor, so instead we can do:

                    x
                    |
                    V
    accum -> SparseAccumulate -> accum’

Where SparseAccumulate can in one operation, without extra space, accumulate the slices of x into accum as required.

The input tensor at getOriginalVarToUpdateInIndex() is an optional input. This can be used when two different views of the weight are consumed in the forward pass (by ops that will be autodiffed), and one of those ops is a Gather, thus requiring a SparseAccumulate in the weight update step.

We connect getOriginalVarToUpdateInIndex() to the other view of the weight than the one this SparseAccumulate is for. Then, SparseAccumulateOpx will clone that tensor (and its layout) when creating accum.

You probably do not need this outside of the TiedGatherPattern.

See also

SparseAccumulateOpx::createInputTensor for further motivation of why it does this.
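
The saving described above can be sketched on small row-major arrays. This standalone example compares a scatter followed by a dense accumulate with a direct sparse accumulate. It is an illustration only: AccumRows, scatterThenAccumulate() and sparseAccumulate() are hypothetical names, and it assumes a plain additive accumulation along axis 0 with no scaling factor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using AccumRows = std::vector<std::vector<float>>;

// Dense path: scatter the slices of x into a temporary dW (what GatherGrad
// does), then accumulate dW into accum. dW is as large as accum.
AccumRows scatterThenAccumulate(AccumRows accum, const AccumRows &x,
                                const std::vector<std::size_t> &indices) {
  assert(x.size() == indices.size());
  const std::size_t width = accum.empty() ? 0 : accum.front().size();
  AccumRows dW(accum.size(), std::vector<float>(width, 0.0f));
  for (std::size_t i = 0; i < x.size(); ++i) {
    for (std::size_t j = 0; j < width; ++j) {
      dW.at(indices[i]).at(j) += x[i].at(j);
    }
  }
  for (std::size_t r = 0; r < accum.size(); ++r) {
    for (std::size_t j = 0; j < width; ++j) {
      accum[r][j] += dW[r][j];
    }
  }
  return accum;
}

// Sparse path: add each slice of x straight into the indexed row of accum,
// with no intermediate dW tensor.
AccumRows sparseAccumulate(AccumRows accum, const AccumRows &x,
                           const std::vector<std::size_t> &indices) {
  assert(x.size() == indices.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    std::vector<float> &row = accum.at(indices[i]);
    for (std::size_t j = 0; j < row.size(); ++j) {
      row[j] += x[i].at(j);
    }
  }
  return accum;
}
```

Both functions produce the same accumulator in this example; the sparse version simply never allocates the accum-sized temporary.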

Public Functions

SparseAccumulateOp(AccumulationType type, const OptimizerValue &factor, unsigned axis, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual std::set<InIndex> optionalInputs() const override: Return the input indices of all optional inputs to the op.

unsigned getAxis() const

Public Static Functions

static inline constexpr InIndex getIndicesInIndex()

static inline constexpr InIndex getOriginalVarToUpdateInIndex()

static bool supportsAccumulationType(AccumulationType type)

class AdamComboOp : public popart::VarUpdateWithUpdaterOp 

Public Functions

AdamComboOp(OptimizerValue initialLr, OptimizerValue initialWd, OptimizerValue initialB1, OptimizerValue initialB2, OptimizerValue initialEps, OptimizerValue initialLs, OptimizerValue mwn, OptimizerValue initialGs, AdamMode mode_, WeightDecayMode decayMode_, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, bool scaledOptimizerState_, const Op::Settings&)

std::unique_ptr<Op> clone() const final

std::map<InIndex, TensorId> optimizerInputs() const final

void appendOutlineAttributes(OpSerialiserBase&) const final

std::set<InIndex> optionalInputs() const final

inline float getSubgraphValue() const final

Public Members

const OptimizerValue initLr

const OptimizerValue initWd

const OptimizerValue initB1

const OptimizerValue initB2

const OptimizerValue initEps

const OptimizerValue initLs

const OptimizerValue initMwn

const OptimizerValue initGs

const AdamMode mode

const WeightDecayMode decayMode

const bool withGradAccum

const OptimizerReductionType reductionType

DataType accumType

DataType accl1Type

DataType accl2Type

const bool scaledOptimizerState

Public Static Functions

static inline InIndex getLrInIndex()

static inline InIndex getWdInIndex()

static inline InIndex getBeta1InIndex()

static inline InIndex getBeta2InIndex()

static inline InIndex getEpsInIndex()

static inline InIndex getLsInIndex()

static inline InIndex getMwnInIndex()

static inline InIndex getGsInIndex()

class AdamVarUpdateOp : public popart::VarUpdateWithUpdaterOp 

Public Functions

AdamVarUpdateOp(OptimizerValue initLr, OptimizerValue mwn, const Op::Settings&)

std::unique_ptr<Op> clone() const final

std::map<InIndex, TensorId> optimizerInputs() const final

void appendOutlineAttributes(OpSerialiserBase&) const final

inline float getSubgraphValue() const final

Public Members

const OptimizerValue initLr

const OptimizerValue initMwn

Public Static Functions

static inline InIndex getLambR1SqInIndex()

static inline InIndex getLambR2SqInIndex()

static inline InIndex getLrInIndex()

static inline InIndex getMwnInIndex()

class AdaptiveComboOp : public popart::VarUpdateWithUpdaterOp 

Public Functions

AdaptiveComboOp(OptimizerValue initialLr, OptimizerValue initialWd, OptimizerValue initialA, OptimizerValue initialM, OptimizerValue initialEps, OptimizerValue initialLs, OptimizerValue initialGs, AdaptiveMode mode_, WeightDecayMode decayMode_, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, DataType accl3Type_, bool rmspropTFVariant_, const Op::Settings&)

std::unique_ptr<Op> clone() const final

std::map<InIndex, TensorId> optimizerInputs() const final

void appendOutlineAttributes(OpSerialiserBase&) const final

std::set<InIndex> optionalInputs() const final

inline float getSubgraphValue() const final

Public Members

const OptimizerValue initLr

const OptimizerValue initWd

const OptimizerValue initA

const OptimizerValue initM

const OptimizerValue initEps

const OptimizerValue initLs

const OptimizerValue initGs

const AdaptiveMode mode

const WeightDecayMode decayMode

const bool withGradAccum

const OptimizerReductionType reductionType

DataType accumType

DataType accl1Type

DataType accl2Type

DataType accl3Type

const bool rmspropTFVariant

Public Static Functions

static inline InIndex getLrInIndex()

static inline InIndex getWdInIndex()

static inline InIndex getAlphaInIndex()

static inline InIndex getMomentumInIndex()

static inline InIndex getEpsInIndex()

static inline InIndex getLsInIndex()

static inline InIndex getGsInIndex()

class CopyVarUpdateOp : public popart::VarUpdateWithUpdaterOp 

Public Functions

CopyVarUpdateOp(const Op::Settings&)

CopyVarUpdateOp(const OperatorIdentifier&, const Op::Settings&)

std::unique_ptr<Op> clone() const final

inline std::map<InIndex, TensorId> optimizerInputs() const final

inline float getSubgraphValue() const final

view::Regions modifies(InIndex) const override

class SGD0ComboOp : public popart::VarUpdateWithUpdaterOp 

A single Op that encapsulates all the information needed to describe an SGD0 optimiser step.

The “0” in the name signifies that there is no optimizer state (note that a gradient accumulation tensor may still be required).

The “Combo” in the name signifies that this op will later be decomposed into many ops and tensors that actually implement the optimiser step. In this case, the decomposition is performed by the SGD0Decompose pattern.

See also

SGD for the definition of what SGD0 is.

See also

SGD0Decompose for the definition of this decomposition.

Public Functions

SGD0ComboOp(OptimizerValue initialSwd, OptimizerValue initialSlr, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::set<InIndex> optionalInputs() const override: Return the input indices of all optional inputs to the op.

virtual std::map<InIndex, TensorId> optimizerInputs() const override

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual float getSubgraphValue() const override

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

Public Members

OptimizerValue initSlr0

OptimizerValue initWdsf0

const bool withGradAccum

const OptimizerReductionType reductionType

const DataType accumType

Public Static Functions

static inline InIndex getSlr0InIndex()

static inline InIndex getWdsf0InIndex()

class SGD0VarUpdateOpBase : public popart::VarUpdateWithUpdaterOp 

Subclassed by popart::SGD0VarUpdateOp

Public Functions

SGD0VarUpdateOpBase(const OperatorIdentifier &_opid, OptimizerValue initialSlr0, OptimizerValue initialWdsf0, const Op::Settings &settings_)

virtual std::unique_ptr<Op> clone() const override = 0

std::map<InIndex, TensorId> optimizerInputs() const final

void appendOutlineAttributes(OpSerialiserBase&) const final

std::set<InIndex> optionalInputs() const final

Public Members

const OptimizerValue initSlr0

const OptimizerValue initWdsf0

Public Static Functions

static inline InIndex getSlr0InIndex()

static inline InIndex getWdsf0InIndex()

class SGD0VarUpdateOp : public popart::SGD0VarUpdateOpBase 

Public Functions

SGD0VarUpdateOp(OptimizerValue initialSlr0, OptimizerValue initialWdsf0, const Op::Settings&)

std::unique_ptr<Op> clone() const final

float getSubgraphValue() const final

class SGD1AcclUpdateOp : public popart::VarUpdateWithUpdaterOp 

Performs the part of the SGD1 velocity update equation that is pre-computed for the next time step after the weight update of the current time step.

Let:

v be the input at getVarToUpdateInIndex()
g be the input at getUpdaterInIndex()

then this op performs:

v <- v * smm1 + swd1 * g

See also

SGD for how this is derived and the definitions of smm1 and swd1.
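
The velocity update can be written out element-wise on plain vectors. This standalone sketch is an illustration only: sgd1AcclUpdate() is a hypothetical name, and smm1 and swd1 are passed as plain floats rather than as OptimizerValues.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// v <- v * smm1 + swd1 * g, applied element-wise and in place.
void sgd1AcclUpdate(std::vector<float> &v, const std::vector<float> &g,
                    float smm1, float swd1) {
  assert(v.size() == g.size());
  for (std::size_t i = 0; i < v.size(); ++i) {
    v[i] = v[i] * smm1 + swd1 * g[i];
  }
}
```

For example, with v = [1, 2], g = [4, 8], smm1 = 0.5 and swd1 = 0.25, the updated v is [1.5, 3].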

Subclassed by popart::SGD2PartialAcclUpdateOp

Public Functions

SGD1AcclUpdateOp(OptimizerValue initSmm1, OptimizerValue initSwd1, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::map<InIndex, TensorId> optimizerInputs() const override

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

Public Members

const OptimizerValue initSmm1

const OptimizerValue initSwd1

Public Static Functions

static inline InIndex getSmm1InIndex()

static inline InIndex getSwd1InIndex()

class SGD2PartialAcclUpdateOp : public popart::SGD1AcclUpdateOp 

This Op is by design exactly equivalent to an SGD1AcclUpdateOp.

Any logic based on an SGD1AcclUpdateOp, like transform code or lowering into Opx, can be applied to an SGD2PartialAcclUpdateOp. This includes the OperatorIdentifier being Onnx::CustomOperators::SGD1AcclUpdateOp.

For SGD2, the entire v update equation could be done in one op (see the equation derivation in optimizer.hpp); however, we reuse the SGD1AcclUpdateOp and AccumulateOp to implement the equation in two steps.

Public Functions

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

SGD1AcclUpdateOp(OptimizerValue initSmm1, OptimizerValue initSwd1, const Op::Settings&)

SGD1AcclUpdateOp(OptimizerValue initSmm1, OptimizerValue initSwd1, OperatorIdentifier opid, const Op::Settings&)

class SGD1VarUpdateOp : public popart::VarUpdateWithUpdaterOp 

Performs the SGD1 weight update equation.

Let:

w be the input at getVarToUpdateInIndex()
g be the input at getUpdaterInIndex()

then this op performs:

w <- w - slr1 * g

See also

SGD for how this is derived and the definition of slr1.
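
The weight update can likewise be written out element-wise on plain vectors. This standalone sketch is an illustration only: sgd1VarUpdate() is a hypothetical name, and slr1 is passed as a plain float rather than as an OptimizerValue.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// w <- w - slr1 * g, applied element-wise and in place.
void sgd1VarUpdate(std::vector<float> &w, const std::vector<float> &g,
                   float slr1) {
  assert(w.size() == g.size());
  for (std::size_t i = 0; i < w.size(); ++i) {
    w[i] = w[i] - slr1 * g[i];
  }
}
```

For example, with w = [1, 2], g = [4, 8] and slr1 = 0.125, the updated w is [0.5, 1].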

Subclassed by popart::SGD2VarUpdateOp

Public Functions

SGD1VarUpdateOp(OptimizerValue initSlr1, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::map<InIndex, TensorId> optimizerInputs() const final

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

The outlining algorithm uses this value to determine whether or not to outline ops. High bounding values are retrieved with getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values with getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

Public Members

const OptimizerValue initSlr1

Public Static Functions

static inline InIndex getSlr1InIndex()

class SGD2VarUpdateOp : public popart::SGD1VarUpdateOp 

This Op is by design exactly equivalent to an SGD1VarUpdateOp.

Any logic based on an SGD1VarUpdateOp, like transform code or lowering into Opx, can be applied to an SGD2VarUpdateOp. This includes the OperatorIdentifier being Onnx::CustomOperators::SGD1VarUpdate.

Public Functions

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

SGD1VarUpdateOp(OptimizerValue initSlr1, const Op::Settings&)

class SGDMComboBaseOp : public popart::VarUpdateWithUpdaterOp 

Subclassed by popart::SGD1ComboOp, popart::SGD2ComboOp

Public Functions

SGDMComboBaseOp(const OperatorIdentifier &opid, OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerReductionType reductionType_, const Op::Settings&)

SGDMComboBaseOp(const OperatorIdentifier &opid, OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerValue initialMm, OptimizerValue initialWd, OptimizerValue initialNgsf, OptimizerValue initialNdsf, OptimizerReductionType reductionType_, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const override = 0

std::map<InIndex, TensorId> optimizerInputs() const override

void appendOutlineAttributes(OpSerialiserBase&) const override

std::set<InIndex> optionalInputs() const override

inline float getSubgraphValue() const override

Public Members

const OptimizerValue initSmm1

const OptimizerValue initDpsf1

const OptimizerValue initSwd1

const OptimizerValue initSlr1

OptimizerValue initMm

OptimizerValue initWd

OptimizerValue initNgsf

OptimizerValue initNdsf

const OptimizerReductionType reductionType

bool nesterov

Public Static Functions

static inline InIndex getSmm1InIndex()

static inline InIndex getDpsf1InIndex()

static inline InIndex getSwd1InIndex()

static inline InIndex getSlr1InIndex()

static inline InIndex getMmInIndex()

static inline InIndex getWdInIndex()

static inline InIndex getNgsfInIndex()

static inline InIndex getNdsfInIndex()

class SGD1ComboOp : public popart::SGDMComboBaseOp 

A single Op that encapsulates all the information needed to describe an SGD1 optimiser step.

The “1” in the name signifies that only one extra optimiser tensor (the accl tensor) is required.

The “Combo” in the name signifies that this Op will later be decomposed into many Ops and Tensors that actually implement the optimiser step, in this case by the SGD1Decompose pattern.

See also

SGD for the definition of what SGD1 is.

See also

SGD1Decompose for the definition of this decomposition.

Public Functions

SGD1ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerReductionType reductionType_, const Op::Settings&)

SGD1ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerValue initialMm, OptimizerValue initialWd, OptimizerValue initialNgsf1, OptimizerValue initialNdsf1, OptimizerReductionType reductionType_, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

class SGD2ComboOp : public popart::SGDMComboBaseOp 

A single Op that encapsulates all the information needed to describe an SGD2 optimiser step.

The “2” in the name signifies that two extra optimiser tensors (the accum and accl1 tensors) may be required.

The “Combo” in the name signifies that this Op will later be decomposed into many Ops and Tensors that actually implement the optimiser step, in this case by the SGD2Decompose pattern.

See also

SGD for the definition of what SGD2 is.

See also

SGD2Decompose for the definition of this decomposition.

Public Functions

SGD2ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, const Op::Settings&)

SGD2ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerValue initialMm, OptimizerValue initialWd, OptimizerValue initialNgsf2, OptimizerValue initialNdsf2, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

Public Members

const bool withGradAccum

const DataType accumType

const DataType accl1Type

class ScaledVarUpdateOp : public popart::VarUpdateWithUpdaterOp 

Public Functions

ScaledVarUpdateOp(OptimizerValue initLr, OptimizerValue initWd, bool lrInUpdater, const Op::Settings&)

std::unique_ptr<Op> clone() const final

std::map<InIndex, TensorId> optimizerInputs() const final

void appendOutlineAttributes(OpSerialiserBase&) const final

inline float getSubgraphValue() const final

Public Members

const OptimizerValue initLr

const OptimizerValue initWd

const bool lrInUpdater

Public Static Functions

static inline InIndex getLrInIndex()

static inline InIndex getWdInIndex()

#include <popart/alias/aliasmodel.hpp>

class AliasModel

A container for the poprithms::memory::inplace::Graph which corresponds to a PopART Graph.

It contains the poprithms Graph, and mappings between PopART Tensors and Ops, and their poprithms equivalents.

Public Types

using PoprithmsTensorId = poprithms::memory::inplace::TensorId

using PoprithmsOpId = poprithms::memory::inplace::OpId

Public Functions

AliasModel()

~AliasModel() = default

void setGraph(const popart::Graph *graph): Set PopART graph.

void insertTensor(const PoprithmsTensorId &poprithmsTensor, const Tensor &popartTensor)

Register that a poprithms Tensor and a popart Tensor correspond to each other.

In addition to registering the Tensor correspondence, the Ops which produce the respective Tensors are registered to be corresponding.

Parameters

poprithmsTensor – The Tensor in the poprithms Graph.
popartTensor – The Tensor in the PopART Graph.

void insertOp(PoprithmsOpId, OpId)

Register that a poprithms Op and a popart Op correspond.

Note that multiple poprithms Ops can correspond to a single popart Op.

void insertUnaryModifier0(const Op &op)

This method performs the following steps:

(1) inserts an aliasGate which is open at index 0,

(2) appends a modify to the output of the aliasGate created in (1),

(3) registers that op.output(0) matches the output of (2),

(4) registers that the poprithms ops created at (1) and (2) correspond to #op.

Parameters: op – A PopART Op, which might have multiple inputs, and whose output is a modified alias of its input at index 0.

void insertUnaryModifier(const Op&, InIndex): As per insertUnaryModifier0, but the input index may be different from 0.

void insertBinaryModifier(const Op &op)

This method performs the following steps:

(1) inserts an aliasGate whose inputs are the 2 poprithms Tensors corresponding to the 2 inputs of #op. The alias gate is open at the index which #op aliases through, if any.

(2) appends a modify to the output of the aliasGate created at (1)

(3) registers that the poprithms ops (1) and (2) correspond to #op.

Diagrammatically, for the PopART Op:

   input0 … input1
      \     /
        op
        |
     output0

This method creates the following poprithms subgraph:

   input0 … input1
      \     /
     aliasGate
        |
      modify
        |
     output0

Parameters: op – A PopART Op with 2 inputs.

void insertNG2aryModifier(const Op &op, unsigned int numInputs)

This method is the same as insertBinaryModifier, except that it allows more than 2 inputs.

Parameters

op – A PopART Op with 2 or more inputs.
numInputs – The number of inputs.

void insertViewChange(PoprithmsTensorId viewChangeOut, const Tensor &t, bool isOutplace)

This method performs the following steps:

(1) adds an aliasGate whose (unique) input is viewChangeOut,

(2) registers that the output of the aliasGate corresponds to the PopART Tensor #t.

(3) registers that the creator of t (if there is any) corresponds to 2 poprithms ops: the creator of viewChangeOut and the aliasGate created at (1).

Parameters

viewChangeOut – This is a Tensor which is the output of a view changing Op, such as reshape and dimShuffle.
t – This PopART Tensor is the output of the corresponding PopART view changing Op.
isOutplace – This boolean determines if the AliasGate created at (1) should be open or closed. If isOutplace is true, then the AliasGate will be closed.

void update(OpId oldId, OpId newId)

Replace all appearances of #oldId in all maps between PopART and poprithms, with #newId.

This is useful when, for example, an Op is replaced in the PopART Graph during the inplacing transformation.
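
The re-keying described above can be sketched with a toy pair of maps. This is an illustration under stated assumptions, not the AliasModel implementation: the ToyOpMaps struct, its member names, and the integer ID aliases are all hypothetical. It only shows the idea that a PopART Op may correspond to several poprithms Ops and that every such entry must follow the Op when its ID changes.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the PopART and poprithms ID types.
using PopartOpId    = std::int64_t;
using PoprithmsOpId = std::int64_t;

// Toy two-way mapping: one PopART op may map to several poprithms ops.
struct ToyOpMaps {
  std::map<PopartOpId, std::vector<PoprithmsOpId>> toPoprithms;
  std::map<PoprithmsOpId, PopartOpId> toPopart;

  // Record that a poprithms op and a PopART op correspond.
  void insertOp(PoprithmsOpId poprithmsId, PopartOpId popartId) {
    toPoprithms[popartId].push_back(poprithmsId);
    toPopart[poprithmsId] = popartId;
  }

  // Re-key every entry that mentions oldId so that it mentions newId.
  void update(PopartOpId oldId, PopartOpId newId) {
    auto found = toPoprithms.find(oldId);
    if (found == toPoprithms.end() || oldId == newId) {
      return; // nothing registered under oldId, or nothing to change
    }
    std::vector<PoprithmsOpId> moved = std::move(found->second);
    toPoprithms.erase(found);
    for (PoprithmsOpId id : moved) {
      toPopart[id] = newId; // fix the reverse direction
    }
    auto &target = toPoprithms[newId];
    target.insert(target.end(), moved.begin(), moved.end());
  }
};
```

Keeping both directions in one struct makes it harder to update one map and forget the other, which is the main risk when an Op is swapped for its inplace variant.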

TensorId getTensorId(const PoprithmsTensorId &id) const

Returns: The TensorId corresponding to a poprithms TensorId.

bool contains(const PoprithmsTensorId&) const

PoprithmsTensorId getPoprithmsTensorId(const TensorId &id) const

Returns: The poprithms TensorId corresponding to a TensorId.

bool contains(const TensorId&) const

OpId getOpId(PoprithmsOpId) const

Returns: The OpId corresponding to a poprithms OpId.

bool contains(PoprithmsOpId) const

PoprithmsOpId getGate(OpId opId) const

Returns: The ID of the AliasGate in the poprithms Graph, which corresponds to the PopART Op #opId. If no such AliasGate exists, an error is thrown.

std::vector<PoprithmsOpId> getAll(OpId) const

Returns: The poprithms OpIds which correspond to a PopART OpId. It is possible for 1 PopART Op to correspond to multiple poprithms Ops.

bool contains(OpId) const

std::vector<Tensor*> allAliases(const Tensor &t) const

Get all aliases for a tensor for this given model.

Returned tensors include the argument #t, if it is non-empty.

bool contains(const Tensor &super, const Tensor &sub) const

Returns: true if all of the ‘allocation’ elements of sub are also in super.
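
The containment test can be pictured as a subset check. The sketch below is a toy model only: it assumes each tensor is summarised by a set of integer allocation element indices, and both the ElementSet alias and the containsAll function are hypothetical names that do not appear in PopART or poprithms.

```cpp
#include <algorithm>
#include <set>

// Hypothetical: the allocation elements a tensor touches, as indices.
using ElementSet = std::set<int>;

// True if every element of sub is also an element of super.
bool containsAll(const ElementSet &super, const ElementSet &sub) {
  // std::includes requires sorted ranges; std::set iterates in order.
  return std::includes(super.begin(), super.end(),
                       sub.begin(), sub.end());
}
```

Under this model an empty sub is contained in any super, since there is no element that could be missing.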

Public Members

poprithms::memory::inplace::Graph g: The poprithms Graph.

popart::Graph *thisGraph = nullptr: The PopART graph reference.

Public Static Attributes

static constexpr int loadFactor = 0.5: load factor used for hash map containers

#include <popart/op/ipucopy.hpp>

class IpuCopyOp : public popart::Op 

Public Functions

IpuCopyOp(const OperatorIdentifier &_opid, VGraphId _destIpu, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

inline VGraphId getDestIpu() const

const SourceIpuMap &getSourceIpus() const

const SourceTensorMap &getSourceTensors() const

VGraphId getSourceIpu(const TensorId &tenId) const

VGraphId getSourceIpu() const

VGraphId getMinSourceIpu() const

VGraphId getMaxSourceIpu() const

void setSourceIpus(const SourceIpuMap sourceIpus)

void setSourceTensors(const SourceTensorMap sourceTensors)

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

bool isOutlineable() const override

bool isIpuCopyOp() const final

bool copiesOptimizerTensors() const final

void connectInTensor(InIndex, TensorId, VGraphId sourceIpu) override

std::string getFromToStr() const

void disconnectInTensor(InIndex, Tensor*) override

inline bool canShard() const override

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex index, std::set<OpId> &visited) const override

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex index, std::set<OpId> &visited) const override

using popart::SourceIpuMap = std::map<TensorId, VGraphId>

using popart::SourceTensorMap = std::map<VGraphId, std::vector<TensorId>>

14.8.2. Op definition for Poplar implementation

#include <popart/popx/opx.hpp>

class Opx

Public Functions

Opx(Op*, Devicex*)

virtual ~Opx()

virtual poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const

virtual poplar::Tensor createInputTensor(popart::InIndex index, const poplar::DebugNameAndId &dnai) const

virtual InputCreatorType getInputCreatorType(InIndex index) const

virtual bool canUnwind(InIndex, OutIndex) const

virtual view::RegMap unwindRegion(InIndex, OutIndex) const

virtual poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const

virtual bool createsEquiv(int index0, const Opx *opx1, int index1) const

virtual bool outputCreatedExternally(OutIndex index) const

virtual std::set<TensorId> mustExistBeforeCreate(int index0) const

virtual DnfTensorIds mustExistBeforeCreateDNF(int index0) const

poplar::Tensor cloneNcopy(poplar::program::Sequence&, TensorId) const

poplar::Tensor cloneNcopy(poplar::program::Sequence&, const poplar::Tensor&, const std::string name = "") const

poplar::Tensor broadcast(const std::vector<int64_t>&, TensorId) const

poplar::Tensor broadcast(const std::vector<int64_t>&, poplar::Tensor) const

const Devicex *getDevicex() const

int64_t getVirtualGraphId() const

poplar::Graph &graph() const

poplar::Graph &topLevelGraph() const

virtual poplar::Graph &srcGraph(InIndex) const

virtual poplar::Graph &dstGraph(OutIndex) const

const poplar::Tensor &get(TensorId) const

const poplar::Tensor &getView(TensorId) const

void insert(TensorId, const poplar::Tensor&) const

Tensor *inTensor(InIndex) const

Tensor *outTensor(OutIndex) const

const poplar::Tensor &getInTensor(InIndex index) const

const poplar::Tensor &getOutTensor(OutIndex index) const

const poplar::Tensor &getInView(InIndex index) const

const poplar::Tensor &getOutView(OutIndex index) const

bool hasInViewChangers(InIndex index) const

const ViewChangers &getInViewChangers(InIndex index) const

void setOutViewChangers(OutIndex index, const ViewChangers &changers) const

const TensorInfo &inInfo(InIndex) const

const Shape &inShape(InIndex) const

const TensorInfo &outInfo(OutIndex) const

const Shape &outShape(OutIndex) const

template<class OP> inline OP &getOp() const

template<class OP> inline void verifyOp(Op *op, const OperatorIdentifier &opid)

template<class OP> inline void verifyOp(Op *op, std::vector<OperatorIdentifier> opids)

template<class OP> inline void verifyOp(Op *op)

bool hasInput(InIndex) const

bool hasOutput(OutIndex) const

void setOutTensor(OutIndex index, const poplar::Tensor &tensor) const

TensorId inId(InIndex index) const

TensorId outId(OutIndex index) const

poplar::Tensor getConst(const poplar::Type &type, const std::vector<size_t> &shape, double val, const std::string &name) const

poplar::Tensor getScalarVariable(const poplar::Type &type, const std::string &name) const

poplar::Tensor getZerosTensor(std::vector<std::size_t>, poplar::Type, std::string) const

poplar::Graph &inGraph(InIndex in) const

Return the virtual graph associated with input at index in.

Parameters: in – the input index
Returns: the corresponding poplar virtual graph

virtual std::set<OpxGrowPartId> getInGrowPartIds(Tensor *inTensor) const

virtual OpxGrowPartId getOutGrowPartId(Tensor *outTensor) const

virtual bool hasCreatorViewChangers(InIndex index) const

virtual ViewChangers getCreatorViewChangers(InIndex index) const

virtual void growPart(OpxGrowPartId id) const

virtual void grow(poplar::program::Sequence&) const

virtual void grow(std::vector<poplar::program::Sequence>&) const

const popart::DebugInfo &getDebugInfo() const

const poplar::DebugNameAndId getDebugNameAndId(const std::string name = "", poplar::SourceLocation loc = poplar::SourceLocation::Current()) const

poplar::DebugContext debugContext(const std::string name = "", poplar::SourceLocation loc = poplar::SourceLocation::Current()) const

virtual PreparedTensorInfos getOutputsToPrepare() const

virtual PreparedTensorInfos getInputsToPrepare() const

poplar::Graph &outGraph(OutIndex out) const

Return the virtual graph associated with output at index out.

Parameters: out – the output index
Returns: the corresponding poplar virtual graph

const std::vector<size_t> inShapeSzt(InIndex) const

poplar::Tensor mapMaybeInPlace(popops::expr::BinaryOpType, poplar::Tensor&, poplar::Tensor&, poplar::program::Sequence&, const poplar::DebugContext&, const poplar::OptionFlags&, const std::string&)

Public Members

double inputCreatorPriority = {0.0}

Op *op_p

class RoiAlignGradOpx : public popart::popx::Opx 

Public Functions

RoiAlignGradOpx(Op*, Devicex*)

~RoiAlignGradOpx() override = default

virtual void grow(poplar::program::Sequence&) const final

class RoiAlignOpx : public popart::popx::Opx 

Public Functions

RoiAlignOpx(Op*, Devicex*)

~RoiAlignOpx() override = default

virtual void grow(poplar::program::Sequence&) const final

14.8.3. Available Ops (Op class)

struct AiGraphcoreOpIdV1 : public popart::OperatorIdentifier 

Public Functions

inline AiGraphcoreOpIdV1(const OpType &_type, NumInputs inputs = {}, int outputs = 0)

class AbortOp : public popart::Op 

Public Functions

AbortOp(const OperatorIdentifier&, const Op::Settings&)

std::unique_ptr<Op> clone() const override

void setup() final

inline float getSubgraphValue() const final

inline bool hasSideEffect() const override

Public Static Functions

static inline InIndex getInIndex()

class AbsGradOp : public popart::Op 

Public Functions

AbsGradOp(const AbsOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

inline virtual float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getFwdArgInIndex()

static inline OutIndex getOutIndex()

class AbsOp : public popart::ElementWiseUnaryOp 

Public Functions

AbsOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class AdaDeltaUpdaterOp : public popart::Op 

Public Functions

AdaDeltaUpdaterOp(OptimizerValue eps, const Op::Settings&)

std::unique_ptr<Op> clone() const final

void setup() final

void appendOutlineAttributes(OpSerialiserBase&) const final

inline float getSubgraphValue() const final

inline bool isOptimizerOp() const override

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final

Public Members

const OptimizerValue initEps

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getAccl1InIndex()

static inline InIndex getAccl2InIndex()

static inline InIndex getEpsInIndex()

static inline OutIndex getUpdaterOutIndex()

class AdamUpdaterOp : public popart::Op 

Public Functions

AdamUpdaterOp(AdamMode mode_, OptimizerValue wd, OptimizerValue b1, OptimizerValue b2, OptimizerValue eps, const Op::Settings&)

std::unique_ptr<Op> clone() const final

void setup() final

void appendOutlineAttributes(OpSerialiserBase&) const final

inline float getSubgraphValue() const final

inline bool isOptimizerOp() const override

view::Regions modifies(InIndex) const final

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final

Public Members

AdamMode mode

const OptimizerValue initWd

const OptimizerValue initB1

const OptimizerValue initB2

const OptimizerValue initEps

Public Static Functions

static inline InIndex getVarInIndex()

static inline InIndex getAccl1InIndex()

static inline InIndex getAccl2InIndex()

static inline InIndex getStepInIndex()

static inline InIndex getWdInIndex()

static inline InIndex getBeta1InIndex()

static inline InIndex getBeta2InIndex()

static inline InIndex getEpsInIndex()

static inline OutIndex getUpdaterOutIndex()

class AddArg0GradOp : public popart::ReduceSumOp 

Public Functions

AddArg0GradOp(const Op&, const std::vector<int64_t> &axes)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

class AddArg1GradOp : public popart::ReduceSumOp 

Public Functions

AddArg1GradOp(const Op&, const std::vector<int64_t> &axes)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

class AddBiasBiasGradOp : public popart::ReduceSumOp 

Public Functions

AddBiasBiasGradOp(const AddBiasOp&, const std::vector<int64_t> &axes)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class AddBiasDataGradOp : public popart::IdentityOp 

Public Functions

AddBiasDataGradOp(const AddBiasOp&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class AddBiasInplaceOp : public popart::AddBiasOp 

Public Functions

AddBiasInplaceOp(const AddBiasOp&)

std::unique_ptr<Op> clone() const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

view::Regions modifies(InIndex) const override

view::Regions aliases(InIndex, OutIndex) const override

class AddBiasOp : public popart::Op 

Subclassed by popart::AddBiasInplaceOp

Public Functions

AddBiasOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getSubgraphValue() const final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

view::RegMap fwdRegMap(InIndex, OutIndex) const override

view::RegMap bwdRegMap(InIndex, OutIndex) const override

void growAliasModel(AliasModel&) const override

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Public Static Functions

static inline InIndex getDataInIndex()

static inline InIndex getBiasInIndex()

static inline OutIndex getOutIndex()

class AddLhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp 

Public Functions

inline AddLhsInplaceOp(const OperatorIdentifier &_, const Op::Settings &_settings)

inline AddLhsInplaceOp(const Op::Settings &_settings)

std::unique_ptr<Op> clone() const final

class AddRhsInplaceOp : public popart::ElementWiseBinaryInplaceRhsOp 

Public Functions

inline AddRhsInplaceOp(const Op::Settings &_settings)

std::unique_ptr<Op> clone() const final

class AllReduceGradOp : public popart::AllReduceOp 

Public Functions

AllReduceGradOp(CollectiveOperator op_, std::vector<int64_t> ipus_, const bool identicalInputs_, const bool identicalGradInputs_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class AllReduceOp : public popart::Op 

Subclassed by popart::AllReduceGradOp

Public Functions

AllReduceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, std::vector<int64_t> ipus_, const Op::Settings &settings_)

AllReduceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, std::vector<int64_t> ipus_, const bool identicalInputs_, const bool identicalGradInputs_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() override

void setup() final

void appendOutlineAttributes(OpSerialiserBase&) const override

bool canBeReplacedByIdentity() const override

inline float getSubgraphValue() const override

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex index, std::set<OpId> &visited) const override

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex index, std::set<OpId> &visited) const override

inline CollectiveOperator getReduceOp() const

inline bool getIdenticalInputs() const

inline std::vector<int64_t> getIpus() const

Public Static Functions

static inline InIndex getInStartIndex()

static inline OutIndex getOutStartIndex()

class AndOp : public popart::BinaryComparisonOp 

Public Functions

AndOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class ArgExtremaOp : public popart::Op 

Subclassed by popart::ArgMaxOp, popart::ArgMinOp

Public Functions

ArgExtremaOp(const OperatorIdentifier &_opid, int64_t axis, int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

void setup() final

int64_t getKeepDims() const

int64_t getAxis() const

void appendOutlineAttributes(OpSerialiserBase&) const final

inline float getSubgraphValue() const final

inline bool canShard() const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class ArgMaxOp : public popart::ArgExtremaOp 

Public Functions

std::unique_ptr<Op> clone() const final

class ArgMinOp : public popart::ArgExtremaOp 

Public Functions

std::unique_ptr<Op> clone() const final

class AsinGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

AsinGradOp(const AsinOp&)

std::unique_ptr<Op> clone() const final

class AsinInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

AsinInplaceOp(const AsinOp&)

std::unique_ptr<Op> clone() const final

class AsinOp : public popart::ElementWiseUnaryOp 

Public Functions

AsinOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class Atan2Arg0GradOp : public popart::ElementWiseBinaryArg0GradOp 

Public Functions

Atan2Arg0GradOp(const Op&, const std::vector<int64_t> &reduction_axes)

std::unique_ptr<Op> clone() const final

class Atan2Arg1GradOp : public popart::ElementWiseBinaryArg1GradOp 

Public Functions

Atan2Arg1GradOp(const Op&, const std::vector<int64_t> &reduction_axes)

std::unique_ptr<Op> clone() const final

class Atan2LhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp 

Public Functions

inline Atan2LhsInplaceOp(const Op::Settings &_settings)

std::unique_ptr<Op> clone() const final

class AtanGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

AtanGradOp(const AtanOp&)

std::unique_ptr<Op> clone() const final

class AtanInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

AtanInplaceOp(const AtanOp&)

std::unique_ptr<Op> clone() const final

class AtanOp : public popart::ElementWiseUnaryOp 

Public Functions

AtanOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class AutoLossScaleProxyGradOp : public popart::AutoLossScaleProxyOp 

Public Functions

AutoLossScaleProxyGradOp(const AutoLossScaleProxyOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class AutoLossScaleProxyOp : public popart::ElementWiseUnaryOp 

Subclassed by popart::AutoLossScaleProxyGradOp

Public Functions

AutoLossScaleProxyOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class AveragePoolGradOp : public popart::Op 

Public Functions

AveragePoolGradOp(const AveragePoolOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

inline float getSubgraphValue() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

Public Members

const Shape creatorSpatialK

const Shape creatorStrides

const Shape creatorLowerPads

const Shape creatorUpperPads

Public Static Functions

static inline InIndex getPrePooledInIndex()

static inline InIndex getPooledInIndex()

static inline InIndex getGradPooledInIndex()

static inline OutIndex getOutIndex()

class AveragePoolOp : public popart::HasReceptiveFieldOp 

Public Functions

AveragePoolOp(const OperatorIdentifier &_opid, int64_t _countIncludePad, const std::vector<int64_t> &_kernelShape, const HasReceptiveFieldOp::ReceptiveOpAttributes &attributes, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

int64_t getNOutChans() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

bool canBeReplacedByIdentity() const override

Shape getSpatialK() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class BaseOnnxRNNGradOp : public popart::Op 

Subclassed by popart::GRUGradOp, popart::LSTMGradOp, popart::RNNGradOp

Public Functions

BaseOnnxRNNGradOp(const OperatorIdentifier &_opid, const BaseOnnxRNNOp &fwd_op)

virtual std::unique_ptr<Op> clone() const override = 0

void setup() override

const std::vector<GradInOutMapper> &gradInputInfo() const override

const std::map<int, int> &gradOutToNonGradIn() const override

bool hasLastHiddenStateGradInput() const

bool hasFullHiddenStateGradInput() const

inline float getSubgraphValue() const final

Public Members

const bool hasBiasesInput

const bool hasInitialHInput

const unsigned batch_size

const unsigned input_size

const unsigned max_seq_length

const unsigned hidden_size

const unsigned num_directions = 1

Public Static Functions

static inline InIndex getInputInIndex()

static inline InIndex getInputWeightsInIndex()

static inline InIndex getRecurrenceWeightsInIndex()

static inline InIndex getBiasesInIndex()

static inline InIndex getInitialHInIndex()

static inline InIndex getFullHiddenStateInIndex()

static inline InIndex getLastHiddenStateGradInIndex()

static inline InIndex getFullHiddenStateGradInIndex()

static inline InIndex getSequenceLensInIndex()

static inline OutIndex getInputOutIndex()

static inline OutIndex getInputWeightsOutIndex()

static inline OutIndex getRecurrenceWeightsOutIndex()

static inline OutIndex getBiasesOutIndex()

static inline OutIndex getInitialHOutIndex()

class BaseOnnxRNNOp : public popart::Op 

Subclassed by popart::GRUOp, popart::LSTMOp, popart::RNNOp

Public Functions

BaseOnnxRNNOp(const OperatorIdentifier &_opid, nonstd::optional<int64_t> hidden_size, const Op::Settings &settings_)

virtual std::unique_ptr<Op> clone() const override = 0

int64_t getMaxSeqLength() const

int64_t getBatchSize() const

int64_t getInputSize() const

int64_t getHiddenSize() const

virtual int64_t getNumDirections() const

void checkHiddenSize() const

bool hasBiasesInput() const

bool hasInitialHInput() const

bool hasSeqLenInput() const

std::set<InIndex> optionalInputs() const override

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

inline virtual std::string getName() const

inline nonstd::optional<int64_t> getHiddenSizeAttribute() const

Public Static Functions

static inline InIndex getInputInIndex()

static inline InIndex getInputWeightsInIndex()

static inline InIndex getRecurrenceWeightsInIndex()

static inline InIndex getBiasesInIndex()

static inline InIndex getSequenceLensInIndex()

static inline InIndex getInitialHInIndex()

static inline OutIndex getFullHiddenStateOutIndex()

static inline OutIndex getLastHiddenStateOutIndex()

class BasePadOp : public popart::Op 

Subclassed by popart::BasePadOutplaceOp, popart::PadInplaceOp

Public Functions

BasePadOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_pads, const std::vector<unsigned> &_flips, float value_, const std::string &_mode, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

bool padSizeZero() const

inline float getSubgraphValue() const final

view::Region valueRegion() const

std::vector<int64_t> padDimensions() const

inline int64_t getLowerPadding(size_t dim) const

inline int64_t getUpperPadding(size_t dim) const

inline const std::string &getMode() const

inline float getPadValue() const

void appendOutlineAttributes(OpSerialiserBase&) const override

void setup() final

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

inline int64_t getRank() const

std::vector<Slice> getSlices() const

inline std::vector<std::ptrdiff_t> getLowerPadding() const

inline std::vector<std::ptrdiff_t> getUpperPadding() const

inline const std::vector<int64_t> &getPads() const

inline const std::vector<unsigned> &getFlips() const

void growAliasModel(AliasModel&) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class BasePadOutplaceOp : public popart::BasePadOp 

Subclassed by popart::PadOp, popart::SliceGradOp

Public Functions

BasePadOutplaceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_pads, const std::vector<unsigned> &_flips, float value_, const std::string &_mode, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

inline bool canBeReplacedByIdentity() const override

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

class BaseSliceOp : public popart::Op 

Subclassed by popart::SliceInplaceOp, popart::SliceOp

Public Functions

BaseSliceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const std::vector<int64_t> &steps_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

void growAliasModel(AliasModel&) const override

void setup() final

virtual void connectInTensor(InIndex, TensorId) final

void appendOutlineAttributes(OpSerialiserBase&) const override

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

view::Regions uses(InIndex) const final

view::Region createSlicedRegion(const Shape &toBeSliced) const

view::Region getFullInRegion() const

view::Region getFullOutRegion() const

inline const std::vector<int64_t> &getStarts() const

inline const std::vector<int64_t> &getEnds() const

inline const std::vector<int64_t> &getAxes() const

inline const std::vector<int64_t> &getSteps() const

inline void setStarts(const std::vector<int64_t> &x)

inline void setEnds(const std::vector<int64_t> &x)

inline void setAxes(const std::vector<int64_t> &x)

inline void setSteps(const std::vector<int64_t> &x)

std::array<std::vector<int64_t>, 2> getLowerUpper() const

std::vector<Slice> getSlices(std::vector<int64_t> input_shape) const

std::vector<Slice> getSlices() const

std::vector<int64_t> getPads() const

std::vector<unsigned> getFlips() const

inline float getSubgraphValue() const final

inline bool canShard() const override

Public Members

int unwindConcatDim = 0

Public Static Functions

static inline InIndex getInIndex()

static inline InIndex getStartsInIndex()

static inline InIndex getEndsInIndex()

static inline InIndex getAxesInIndex()

static inline InIndex getStepsInIndex()

static inline OutIndex getOutIndex()

class BaseSortOp : public popart::Op 

Subclassed by popart::TopKOp

Public Functions

BaseSortOp(const OperatorIdentifier &_opid, int64_t axis, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

int64_t getAxis() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Static Functions

static inline int getInIndex()

class BatchNormGradOp : public popart::Op 

Public Functions

BatchNormGradOp(const BatchNormOp&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

inline float getEpsilon() const

inline int64_t getSpatial() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getXInIndex()

static inline InIndex getScaleInIndex()

static inline InIndex getMeanInIndex()

static inline InIndex getVarInIndex()

static inline InIndex getYGradInIndex()

static inline OutIndex getXOutIndex()

static inline OutIndex getScaleOutIndex()

static inline OutIndex getBOutIndex()

class BatchNormOp : public popart::Op 

Public Functions

BatchNormOp(const OperatorIdentifier &_opid, float _epsilon, float _momentum, int64_t _spatial, bool _unbiased_variance, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getSubgraphValue() const final

inline float getEpsilon() const

inline float getMomentum() const

inline int64_t getSpatial() const

inline bool useUnbiasedVariance() const

inline bool isTraining() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline bool isNorm() const override

Public Static Functions

static inline InIndex getXInIndex()

static inline InIndex getScaleInIndex()

static inline InIndex getBInIndex()

static inline InIndex getMeanInIndex()

static inline InIndex getVarInIndex()

static inline OutIndex getYOutIndex()

static inline OutIndex getMeanOutIndex()

static inline OutIndex getVarOutIndex()

static inline OutIndex getSavedMeanOutIndex()

static inline OutIndex getSavedVarOutIndex()

class BinaryComparisonOp : public popart::Op 

Subclassed by popart::AndOp, popart::EqualOp, popart::GreaterOp, popart::LessOp, popart::OrOp

Public Functions

BinaryComparisonOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

void setup() final

inline float getSubgraphValue() const final

inline bool canShard() const override

Public Static Functions

static inline InIndex getArg0InIndex()

static inline InIndex getArg1InIndex()

static inline OutIndex getOutIndex()

class BinaryConstScalarOp : public popart::ElementWiseUnaryOp 

A unary Op that performs a binary operation (Mul, Div, etc.) between its single input tensor and a scalar whose value is stored as an Op attribute.

The operand positions (0 or 1) of the tensor and the scalar are controlled by the scalarInIndex attribute: scalarInIndex is the position of the scalar, and the tensor takes the other position.

Some examples. Let T be the input tensor of this Op.

[value = 2, opType = “Div”, scalarInIndex = 1]: T / 2.0

[value = 4, opType = “Pow”, scalarInIndex = 0]: 4.0 ** T

[value = 0.2, opType = “Add”, scalarInIndex = 0]: 0.2 + T

[value = 100, opType = “Sub”, scalarInIndex = 1]: T - 100.
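The operand ordering described above can be modelled on a single tensor element. The following is only a sketch under stated assumptions: ScalarOpType and applyConstScalar are hypothetical names introduced for illustration and are not part of the PopART API, and the real Op works on whole tensors on the device rather than on host floats.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for BinaryConstScalarOp::Type (illustration only).
enum class ScalarOpType { Add, Sub, Mul, Div, Pow };

// Apply `type` to one tensor element `t` and the constant `value`.
// scalarInIndex is the operand position of the scalar:
// 0 puts the scalar on the left, 1 puts it on the right.
inline float applyConstScalar(float t, float value, ScalarOpType type,
                              int64_t scalarInIndex) {
  assert(scalarInIndex == 0 || scalarInIndex == 1);
  const float lhs = (scalarInIndex == 0) ? value : t;
  const float rhs = (scalarInIndex == 0) ? t : value;
  switch (type) {
  case ScalarOpType::Add:
    return lhs + rhs;
  case ScalarOpType::Sub:
    return lhs - rhs;
  case ScalarOpType::Mul:
    return lhs * rhs;
  case ScalarOpType::Div:
    return lhs / rhs;
  case ScalarOpType::Pow:
    return std::pow(lhs, rhs);
  }
  return t; // Not reached for valid enumerators.
}
```

For instance, applyConstScalar(8.0f, 2.0f, ScalarOpType::Div, 1) models T / 2.0 for an element T = 8, and applyConstScalar(150.0f, 100.0f, ScalarOpType::Sub, 1) models T - 100.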

Public Types

enum class Type

Values:

enumerator Add = 0

enumerator Sub

enumerator Mul

enumerator Div

enumerator Pow

enumerator N

Public Functions

inline BinaryConstScalarOp(const OperatorIdentifier &x, float value, Type t, int64_t index, const Op::Settings &settings)

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.
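As a rough illustration of why clone() has to be implemented, the sketch below shows the general C++ pattern with a pure virtual clone() in a base class. ToyOp and ToyScaleOp are hypothetical stand-ins, not PopART classes; the point is only that a derived class that does not override clone() stays abstract and cannot be instantiated.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration; these are not the PopART classes.
class ToyOp {
public:
  virtual ~ToyOp() = default;
  // Pure virtual: every concrete op has to provide its own copy.
  virtual std::unique_ptr<ToyOp> clone() const = 0;
  virtual std::string name() const = 0;
};

class ToyScaleOp : public ToyOp {
public:
  explicit ToyScaleOp(float factor) : factor_(factor) {}

  // Return a heap-allocated copy of this op, including its attributes.
  std::unique_ptr<ToyOp> clone() const override {
    return std::make_unique<ToyScaleOp>(*this);
  }

  std::string name() const override { return "ToyScale"; }
  float factor() const { return factor_; }

private:
  float factor_;
};
```

Calling clone() on a ToyScaleOp yields a distinct object that carries the same factor, which is the behaviour a copy of an op is expected to have.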

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
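The index mapping returned by gradOutToNonGradIn() can be pictured with plain containers. This is a minimal sketch, assuming gradients are single floats: routeGradients is a hypothetical helper that is not part of PopART, and it only shows how a map from grad op output index to non-grad op input index assigns each produced gradient to the forward input it belongs to.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper (not a PopART function). Keys of gradOutToNonGradIn are
// output indices of the grad op; values are input indices of the non-grad op.
inline std::vector<float>
routeGradients(const std::map<int, int> &gradOutToNonGradIn,
               const std::vector<float> &gradOutputs,
               std::size_t numForwardInputs) {
  // One gradient slot per forward input, zero if no grad output maps to it.
  std::vector<float> gradForInput(numForwardInputs, 0.0f);
  for (const auto &entry : gradOutToNonGradIn) {
    const auto gradOut = static_cast<std::size_t>(entry.first);
    const auto fwdIn = static_cast<std::size_t>(entry.second);
    gradForInput.at(fwdIn) += gradOutputs.at(gradOut);
  }
  return gradForInput;
}
```

With the map {{0, 1}}, a grad op whose only output (index 0) holds 3.5 contributes that value to forward input 1 and leaves forward input 0 without a gradient. This matches the case of a separate gradient op per input.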

inline float value() const

inline Type opType() const

inline int64_t scalarInIndex() const

class BitwiseBinaryOp : public popart::ElementWiseBinaryOp 

Public Functions

BitwiseBinaryOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class BitwiseNotOp : public popart::ElementWiseUnaryOp 

Public Functions

BitwiseNotOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class BoundaryOp : public popart::Op 

Public Functions

inline BoundaryOp(const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

inline void setup() final

inline float getSubgraphValue() const final

inline bool isOutlineable() const override

inline bool hasSideEffect() const override

class BucketizeOp : public popart::Op 

Public Functions

BucketizeOp(const OperatorIdentifier &opid, bool right, const Op::Settings &settings)

void setup() override

std::unique_ptr<Op> clone() const override

float getSubgraphValue() const override

void appendOutlineAttributes(OpSerialiserBase&) const override

bool isRight() const noexcept

Public Static Functions

static inline InIndex inIndex()

static inline InIndex boundariesInIndex()

static inline OutIndex outIndex()

class CallGradOp : public popart::CallOp 

Public Functions

CallGradOp(CallOp &fwdOp, Graph &bwdGraph, const std::vector<GradInOutMapper> &gradInInfo_, const std::map<int, int> &gradOutInfo_)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class CallOp : public popart::SubgraphOp 

Subclassed by popart::CallGradOp

Public Functions

CallOp(const OperatorIdentifier&, Graph &callee, const Op::Settings &settings)

CallOp(const OperatorIdentifier&, Graph &callee, const std::vector<int> &modifiedInputsViaAttrs, const Op::Settings &settings)

void setup() final

std::unique_ptr<Op> clone() const final

Graph &getCalledGraph() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<TensorId> getGradOpInputIds(const Graph &gradGraph)

void appendOutlineAttributes(OpSerialiserBase &os) const override

inline float getSubgraphValue() const final

std::vector<const Graph*> getCalledGraphs() const override

void setCalledGraph(Graph&) override

inline InIndex subgraphInToOpInIndex(InIndex index) const override

inline InIndex opInToSubgraphInIndex(InIndex index) const override

inline OutIndex subgraphOutToOpOutIndex(OutIndex index) const override

inline OutIndex opOutToSubgraphOutIndex(OutIndex index) const override

inline std::set<OutIndex> opInToOpOutIndex(InIndex in) const override

inline std::set<InIndex> opOutToOpInIndex(OutIndex out) const override

inline void growAliasModel(AliasModel &m) const override

void connectInTensor(InIndex inIndex, TensorId tenId) override

class CastGradOp : public popart::CastOp 

Public Functions

CastGradOp(const CastOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class CastOp : public popart::Op 

Subclassed by popart::CastGradOp

Public Functions

CastOp(const OperatorIdentifier &_opid, DataType _to, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() override

inline DataType toDataType() const

inline float getSubgraphValue() const final

inline bool canShard() const override

inline ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

bool canBeReplacedByIdentity() const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class CeilInplaceOp : public popart::OneWayUnaryInPlaceOp 

Public Functions

CeilInplaceOp(const CeilOp&)

std::unique_ptr<Op> clone() const final

class CeilOp : public popart::OneWayUnaryOp 

Public Functions

CeilOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class ClipGradOp : public popart::ClipOp 

Public Functions

ClipGradOp(const ClipOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

Public Static Functions

static inline InIndex getClippedInIndex()

static inline InIndex getGradClippedInIndex()

class ClipInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

ClipInplaceOp(const ClipOp&)

std::unique_ptr<Op> clone() const final

float getClipMin() const

float getClipMax() const

void appendOutlineAttributes(OpSerialiserBase&) const override

class ClipOp : public popart::ElementWiseUnaryOp 

Subclassed by popart::ClipGradOp

Public Functions

ClipOp(const OperatorIdentifier &_opid, float min_, float max_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

inline void setClipMin(float value)

float getClipMin() const

inline void setClipMax(float value)

float getClipMax() const

void appendOutlineAttributes(OpSerialiserBase&) const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

bool canBeReplacedByIdentity() const override

Public Static Functions

static inline InIndex clip11MinInputIndex()

static inline InIndex clip11MaxInputIndex()

class CollectivesBaseOp : public popart::Op 

Subclassed by popart::MultiCollectiveBaseOp, popart::ReplicatedAllGatherOp, popart::ReplicatedAllReduceOp, popart::ReplicatedReduceScatterOp

Public Functions

CollectivesBaseOp(const OperatorIdentifier &_opid, CommGroup group, const Op::Settings &settings_)

CollectivesBaseOp(const OperatorIdentifier &_opid, const ReplicaGrouping &grouping, const Op::Settings &settings_)

virtual std::unique_ptr<Op> clone() const override = 0

virtual bool hasCorrespondingLinkedIndexTensor(Tensor *t)

inline bool hasCorrespondingLinkedIndexTensor(InIndex in)

virtual Tensor *getCorrespondingLinkedIndexTensor(Tensor *t)

inline Tensor *getCorrespondingLinkedIndexTensor(InIndex in)

virtual bool isCollectiveLinkedIndexTensor(InIndex in) const

virtual bool isCollectiveLinkedIndexTensor(Tensor *t) const

inline void setGCLCommGroup(CommGroup group)

inline CommGroup getGCLCommGroup() const

void setReplicaGrouping(const ReplicaGrouping &grouping)

const ReplicaGrouping &getReplicaGrouping() const

virtual int64_t getCommSize() const

Number of replicas the collective communicates across.

This will be used to create a CollectiveBalanceReorder in lowering to improve the tile mapping when using RTS.

void appendOutlineAttributes(OpSerialiserBase &os) const override

inline virtual bool isConfigureOutputForReplicatedTensorSharding() const

Check whether the op is configured for replicated tensor sharding (RTS) mode. Collective operations set up for RTS are allowed to scramble the data element order of the input (AllGather) / output (ReduceScatter) tensor such that the tensor layouts minimize inter-tile exchanges.

As a consequence, the RTS sharded tensor does not follow the original data order and can only be used in elementwise, RTS-enabled operations, such as optimizers, where all inputs consumed are rearranged in the same way.

Returns: True if this operation is configured for replicated tensor sharding.
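The reason a scrambled element order is harmless for elementwise operations can be checked on the host with ordinary vectors. The sketch below is an illustration only: scramble, unscramble and sgdStep are hypothetical helpers, the permutation is arbitrary, and the actual reordering chosen during lowering is not modelled.

```cpp
#include <cstddef>
#include <vector>

// Reorder data so that out[i] = data[perm[i]]. `perm` stands in for a
// layout-driven scrambling; it must be a permutation of 0..data.size()-1.
inline std::vector<float> scramble(const std::vector<float> &data,
                                   const std::vector<std::size_t> &perm) {
  std::vector<float> out(data.size());
  for (std::size_t i = 0; i < perm.size(); ++i) {
    out[i] = data.at(perm[i]);
  }
  return out;
}

// Undo scramble: out[perm[i]] = data[i].
inline std::vector<float> unscramble(const std::vector<float> &data,
                                     const std::vector<std::size_t> &perm) {
  std::vector<float> out(data.size());
  for (std::size_t i = 0; i < perm.size(); ++i) {
    out.at(perm[i]) = data[i];
  }
  return out;
}

// Elementwise update w - lr * g, a stand-in for an optimizer step.
inline std::vector<float> sgdStep(const std::vector<float> &w,
                                  const std::vector<float> &g, float lr) {
  std::vector<float> out(w.size());
  for (std::size_t i = 0; i < w.size(); ++i) {
    out[i] = w[i] - lr * g.at(i);
  }
  return out;
}
```

Applying sgdStep to weights and gradients that were scrambled with the same permutation, then unscrambling the result, gives the same values as applying sgdStep to the original order. An operation that mixes elements across positions would not have this property, which is why the sharded tensor is restricted to elementwise consumers.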

Public Static Functions

static inline InIndex getInIndex()

static inline InIndex getCollectiveLinkedIndex()

static inline OutIndex getOutIndex()

static inline ReplicatedTensorShardingIndicesIndex getDefaultTensorShardingGroupIndex()

class ConcatGradOp : public popart::Op 

Public Functions

ConcatGradOp(const ConcatOp &op, InIndex input)

ConcatGradOp(const ConcatInplaceOp &op, InIndex input)

std::unique_ptr<Op> clone() const override

void setup() override

void appendOutlineAttributes(OpSerialiserBase&) const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

int64_t getAxis() const

int64_t getStart() const

int64_t getEnd() const

inline float getSubgraphValue() const final

inline bool canShard() const override

inline ReductionType getShardReductionType(OutIndex index) const override

void configureShardedOp(Op *const shardedOp, const Settings *const settings_) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class ConcatInplaceOp : public popart::ConcatOp 

Public Functions

ConcatInplaceOp(int64_t axis_, const Op::Settings &settings)

ConcatInplaceOp(const ConcatOp &concatOp, int64_t axis_)

std::unique_ptr<Op> clone() const override

inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

inline view::Regions aliases(InIndex in, OutIndex) const final

class ConcatOp : public popart::Op 

Subclassed by popart::ConcatInplaceOp

Public Functions

ConcatOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

void setup() final

std::vector<std::unique_ptr<Op>> getGradOps() final

int64_t getAxis() const

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

void appendOutlineAttributes(OpSerialiserBase&) const override

bool canBeReplacedByIdentity() const override

inline float getSubgraphValue() const final

inline bool canShard() const override

void growAliasModel(AliasModel&) const override

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Public Static Functions

static inline InIndex getInIndex(InIndex index)

static inline OutIndex getOutIndex()

static Shape getOutputShape(int64_t axis, const std::vector<const Shape*> inputs)

class ConvDataGradOp : public popart::MultiConvDataGradBaseOp 

Public Functions

ConvDataGradOp(const ConvOp&)

std::unique_ptr<Op> clone() const final

inline int numConvs() const override

inline const ConvParameters &getParameters() const

Public Static Functions

static inline InIndex getWeightsInIndex()

static inline InIndex getGradConvolvedInIndex()

static inline OutIndex getOutIndex()

class ConvFlipWeightsGradOp : public popart::ConvFlipWeightsOp 

Public Functions

ConvFlipWeightsGradOp(const ConvFlipWeightsGradOp&) = default

ConvFlipWeightsGradOp(const ConvFlipWeightsOp &convFlipWeightsOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class ConvFlipWeightsOp : public popart::Op 

Subclassed by popart::ConvFlipWeightsGradOp

Public Functions

ConvFlipWeightsOp(const ConvFlipWeightsOp&) = default

ConvFlipWeightsOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

~ConvFlipWeightsOp() override

std::unique_ptr<Op> clone() const override

void setup() final

std::vector<std::unique_ptr<Op>> getGradOps() final

inline const ConvParameters &getParameters() const

inline void setParameters(const ConvParameters &p)

inline bool getGroupReshape() const

inline void setGroupReshape(bool reshape)

inline float getSubgraphValue() const final

void appendOutlineAttributes(OpSerialiserBase &os) const final

inline void setConvOptions(const MultiConvOptions &opts)

inline const MultiConvOptions &getMultiConvOptions() const

inline std::map<std::string, std::string> getConvOptions() const

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class ConvOp : public popart::MultiConvBaseOp 

Public Functions

ConvOp(const OperatorIdentifier &_opid, const Settings &settings_, std::vector<int64_t> strides, std::vector<int64_t> pads, std::vector<int64_t> dilations, int64_t group, const AutoPad &padType, const MultiConvOptions &convOpts)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline int numConvs() const final

inline int64_t getGroups() const

inline void setGroup()

inline int64_t getNInChans() const

inline int64_t getNOutChans() const

inline ConvParameters getParameters() const

void restoreAttributesFromParams(const std::vector<ConvParameters>&) override

bool isPow2ScaledConv() const: Returns true if and only if the inputs to the op constitute a valid set of inputs for a fused (float8) convolution.

inline std::set<InIndex> optionalInputs() const override

Public Static Functions

static inline InIndex getDataInIndex()

static inline InIndex getWeightsInIndex()

static inline InIndex getLog2ScaleInIndex()

static inline OutIndex getOutIndex()

class ConvTransposeOp : public popart::Op 

Public Functions

ConvTransposeOp(const OperatorIdentifier &_opid, const Settings &settings_, std::vector<int64_t> strides, std::vector<int64_t> pads, std::vector<int64_t> dilations, int64_t group, const AutoPad &padType, std::vector<int64_t> outputPadding, Shape outputShape, const MultiConvOptions &convOpts)

std::unique_ptr<Op> clone() const override

void setup() final

inline float getSubgraphValue() const final

bool isPow2ScaledConvTranspose() const

inline std::set<InIndex> optionalInputs() const override

Public Members

std::vector<int64_t> strides

std::vector<int64_t> dilations

int64_t group

const AutoPad padType

const MultiConvOptions convOpts

ConvParameters params

Public Static Functions

static inline InIndex getInIndex()

static inline InIndex getWeightsInIndex()

static inline InIndex getLog2ScaleInIndex()

static inline OutIndex getOutIndex()

class ConvWeightsGradOp : public popart::MultiConvWeightsGradBaseOp 

Public Functions

ConvWeightsGradOp(const ConvOp&)

std::unique_ptr<Op> clone() const final

ConvWeightsGradOp(const ConvWeightsGradOp&) = default

inline int numConvs() const final

inline const ConvParameters &getParameters() const

Public Static Functions

static inline InIndex getGradConvolvedInIndex()

static inline InIndex getPreConvolvedInIndex()

static inline OutIndex getOutIndex()

class CosGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

CosGradOp(const CosOp &fwdOp)

std::unique_ptr<Op> clone() const final

class CosOp : public popart::ElementWiseUnaryOp 

Public Functions

CosOp(const OperatorIdentifier &_opid, const Op::Settings&)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)

class CoshOp : public popart::Op 

Public Functions

CoshOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class CtcBeamSearchDecoderOp : public popart::Op 

Public Functions

CtcBeamSearchDecoderOp(const popart::OperatorIdentifier &_opid, unsigned _blankClass, unsigned _beamWidth, unsigned _topPaths, const popart::Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

void appendAttributes(popart::OpSerialiserBase &os) const override

void appendOutlineAttributes(popart::OpSerialiserBase &os) const override

std::vector<std::unique_ptr<Op>> getGradOps() final

float getSubgraphValue() const final

bool requiresRandomSeed() const override

inline unsigned getBlankClass() const

inline unsigned getBeamWidth() const

inline unsigned getTopPaths() const

inline unsigned getMaxTime() const

inline unsigned getBatchSize() const

inline unsigned getNumClasses() const

Public Static Functions

static inline InIndex getLogProbsInIndex()

static inline InIndex getDataLengthsInIndex()

static inline OutIndex getLabelProbsOutIndex()

static inline OutIndex getLabelLengthsOutIndex()

static inline OutIndex getDecodedLabelsOutIndex()

class CtcGradOp : public popart::Op 

Public Functions

CtcGradOp(const CtcOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

inline float getSubgraphValue() const final

inline ReductionType getReductionType() const

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

inline bool canShard() const override

inline bool getEnableReducedClassesInLabel() const

Public Static Functions

static inline InIndex getLogProbsGradientWrtCtcLossInIndex()

static inline InIndex getTargetLengthsInIndex()

static inline InIndex getCtcLossGradientInIndex()

static inline OutIndex getLogProbsGradientOutIndex()

class CtcOp : public popart::LossOp 

Public Functions

CtcOp(const OperatorIdentifier &_opid, const ReductionType reduction, const unsigned blank, const bool zeroInfinity, const Op::Settings &settings_, const bool enableReducedClassesInLabel, const DataType outDataType = DataType::UNDEFINED)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getSubgraphValue() const final

inline unsigned getBlank() const

inline bool getZeroInfinity() const

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

unsigned getBatchSize() const

unsigned getMaxInputLength() const

unsigned getMaxTargetLength() const

unsigned getNumClasses() const

inline bool canShard() const override

inline bool getEnableReducedClassesInLabel() const

Public Static Functions

static inline InIndex getLogProbsInIndex()

static inline InIndex getTargetsInIndex()

static inline InIndex getInputLengthsInIndex()

static inline InIndex getTargetLengthsInIndex()

static inline OutIndex getCtcLossOutIndex()

static inline OutIndex getLogProbsGradientWrtCtcLossOutIndex()

class CumSumGradOp : public popart::Op 

Public Functions

CumSumGradOp(const CumSumOp &op, bool exclusive, bool reverse, int64_t axis)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

bool getExclusive() const

bool getReverse() const

int64_t getAxis() const

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex outGradXInIndex()

static inline InIndex fwdXInIndex()

static inline OutIndex outIndex()

class CumSumOp : public popart::Op 

Public Functions

CumSumOp(const OperatorIdentifier &_opid, bool exclusive_, bool reverse_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() override

void setup() final

bool getExclusive() const

bool getReverse() const

int64_t getAxis() const

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex xInIndex()

static inline InIndex axisInIndex()

static inline OutIndex outIndex()

class DetachInplaceOp : public popart::DetachOp 

Public Functions

DetachInplaceOp(const DetachOp &detachOp)

DetachInplaceOp(const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

inline view::Regions aliases(InIndex in, OutIndex) const final

class DetachOp : public popart::ElementWiseUnaryOp 

Subclassed by popart::DetachInplaceOp

Public Functions

DetachOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

inline std::vector<std::unique_ptr<Op>> getGradOps() final

std::unique_ptr<Op> clone() const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

inline bool isIdentity() const final

inline bool isOutplaceViewChange() const override

class DivArg0GradOp : public popart::ElementWiseBinaryArg0GradOp 

Public Functions

DivArg0GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)

std::unique_ptr<Op> clone() const final

class DivArg1GradOp : public popart::ElementWiseBinaryArg1GradOp 

Public Functions

DivArg1GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)

std::unique_ptr<Op> clone() const final

class DropoutBaseOp : public popart::RandomBaseOp 

Subclassed by popart::DropoutOp, popart::ShapedDropoutOp

Public Functions

DropoutBaseOp(const OperatorIdentifier &_opid, float ratio_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

bool canBeReplacedByIdentity() const override

inline float getRatio() const

inline void setRatio(float r)

inline InIndex getSeedInIndex() const override

inline bool canShard() const override

void configureShardedOp(Op *const shardedOp, const Settings *const settings_) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

static float validateRatioAttribute(const OpCreatorInfo &info)

class DropoutOp : public popart::DropoutBaseOp 

Subclassed by popart::DropoutGradOp

Public Functions

DropoutOp(const OperatorIdentifier &_opid, float ratio_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() override

void setup() override

bool canBeReplacedByIdentity() const override

void appendAttributes(OpSerialiserBase &os) const override

inline void setOutputMask(bool v)

inline bool getOutputMask() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline void setReferenceId(RandomReferenceId id)

inline RandomReferenceId getReferenceId() const

TensorId getReferenceTensorId()

Public Static Functions

static inline OutIndex getMaskOutIndex()

class DropoutGradOp : public popart::DropoutOp 

Public Functions

DropoutGradOp(const DropoutOp &fwdOp)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const override

const std::map<int, int> &gradOutToNonGradIn() const override

Public Static Functions

static inline InIndex getGradInIndex()

static inline OutIndex getOutIndex()

class DynamicAddInplaceOp : public popart::DynamicTernaryBaseInplaceOp 

Public Functions

DynamicAddInplaceOp(const DynamicAddOp &dynamicAddOp)

DynamicAddInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

std::unique_ptr<Op> clone() const final

class DynamicAddOp : public popart::DynamicTernaryBaseOp 

Public Functions

DynamicAddOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

class DynamicBaseOp : public popart::Op 

Dynamic Base Op.

Base class for operators acting on a run-time selectable slice of a tensor.

The word “dynamic” refers to the fact that the index can be specified during runtime, where index is the second tensor argument of this operator as specified in graphcoreoperators.hpp. The axes specifies along which axes the tensor should be sliced. The size specifies the size of the slices.

A slice along an axis can be defined by the tuple (start, stop, step), where:

start – equals the index for the respective axis
stop – equals index + size for the respective axis
step – equals 1

Limitations: Assuming we would like to slice A with dimension (4, 3):

A step other than 1 is not supported (for example, A[::2,:] is not supported)
Negative slicing is not supported (for example, A[:-1,:] is not supported)
A stop greater than the size of the axis is not supported (for example, A[:5,:] is not supported)

Example: Given a Tensor A with shape (3, 2, 4, 5):

If we specify axes = {1, 3} (that is, we slice the first and third axes, counting from 0), the operator will operate on A[:, index[0]:(index[0]+size[0]), :, index[1]:(index[1]+size[1])].
If we instead specify axes = {0, 1, 3}, the operator will operate on A[index[0]:(index[0]+size[0]), index[1]:(index[1]+size[1]), :, index[2]:(index[2]+size[2])].
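The slicing rule and its limitations can be sketched in plain C++ without any PopART headers. The names SliceBounds and computeSliceBounds below are hypothetical and are not part of the PopART API; the sketch only reproduces the (start, stop, step) arithmetic described above, assuming one index and one size entry per sliced axis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper type (not a PopART class): the half-open range
// [start, stop) covered on one axis, always with step 1.
struct SliceBounds {
  int64_t start;
  int64_t stop;
};

// Compute, for every axis of a tensor with the given shape, the range a
// dynamic op would act on. Axes not listed in `axes` are taken whole.
std::vector<SliceBounds>
computeSliceBounds(const std::vector<int64_t> &shape,
                   const std::vector<int64_t> &axes,
                   const std::vector<int64_t> &sizes,
                   const std::vector<int64_t> &index) {
  if (axes.size() != sizes.size() || axes.size() != index.size()) {
    throw std::invalid_argument("axes, sizes and index must have equal length");
  }
  std::vector<SliceBounds> bounds;
  bounds.reserve(shape.size());
  for (int64_t dim : shape) {
    bounds.push_back(SliceBounds{0, dim}); // default: the full axis
  }
  for (std::size_t i = 0; i < axes.size(); ++i) {
    if (axes[i] < 0 || axes[i] >= static_cast<int64_t>(shape.size())) {
      throw std::out_of_range("axis is outside the tensor rank");
    }
    const std::size_t axis = static_cast<std::size_t>(axes[i]);
    const int64_t start    = index[i];            // start = index
    const int64_t stop     = index[i] + sizes[i]; // stop = index + size
    // Negative slicing and a stop beyond the axis size are rejected.
    if (start < 0 || sizes[i] < 0 || stop > shape[axis]) {
      throw std::out_of_range("slice does not fit inside the axis");
    }
    bounds[axis] = SliceBounds{start, stop};
  }
  assert(bounds.size() == shape.size());
  return bounds;
}
```

For the shape (3, 2, 4, 5) with axes = {1, 3}, sizes = {1, 2} and index = {1, 3}, this sketch yields the ranges [0, 3), [1, 2), [0, 4) and [3, 5), which corresponds to A[:, 1:2, :, 3:5].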

Subclassed by popart::DynamicBinaryBaseOp, popart::DynamicSliceBaseOp, popart::DynamicSlicePadGradOp

Public Functions

DynamicBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() override

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

inline const std::vector<int64_t> &getAxes() const

inline void setAxes(const std::vector<int64_t> &x)

inline const std::vector<int64_t> &getSizes() const

inline void setSizes(const std::vector<int64_t> &x)

inline bool isNotOverlapping() const

TensorInfo createOutInfo() const

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

Public Static Functions

static inline InIndex getIndexInIndex()

static inline OutIndex getOutIndex()

class DynamicBinaryBaseInplaceOp : public popart::DynamicBinaryBaseOp 

Subclassed by popart::DynamicZeroInplaceOp

Public Functions

DynamicBinaryBaseInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

std::unique_ptr<Op> clone() const override

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

view::Regions aliases(InIndex, OutIndex) const final

view::Regions modifies(InIndex) const final

class DynamicBinaryBaseOp : public popart::DynamicBaseOp 

Dynamic Binary Base Op.

Base class for operators acting on a run-time selectable slice of a tensor. The word “binary” refers to the fact that the operator takes two tensors as input.

See also

DynamicBaseOp for details

Subclassed by popart::DynamicBinaryBaseInplaceOp, popart::DynamicTernaryBaseOp, popart::DynamicUpdateToUpdateGradOp, popart::DynamicZeroGradOp, popart::DynamicZeroOp

Public Functions

DynamicBinaryBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

inline const TensorInfo &getUpdateTensorInfo() const

virtual void growAliasModel(AliasModel &m) const final

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters: aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
Pre: All input tensors of this op have mappings in aliasModel before the call to this method.
Post: All output tensors of this op have mappings in aliasModel after the call to this method.

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Translate a PopART inplacing proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.

This method is defined as a void method which sets a value passed by reference, as opposed to a getter method, so that no Poprithms headers need to be included in this file.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
inplaceId – The operator identifier to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

Public Static Functions

static inline InIndex getUpdateInIndex()

static inline InIndex getIndexInIndex()

static inline OutIndex getOutIndex()

class DynamicSliceBaseOp : public popart::DynamicBaseOp 

Subclassed by popart::DynamicSliceOp, popart::DynamicUpdateUpdaterGradOp

Public Functions

DynamicSliceBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)

std::unique_ptr<Op> clone() const override

void setup() final

TensorInfo createOutInfo() const

Public Static Functions

static inline InIndex getInIndex()

class DynamicSliceInplaceOp : public popart::DynamicSliceOp 

Dynamic Slice Inplace Op.

This Op takes two or three TensorIds as input (as indicated in graphcoreoperators.hpp):

The TensorId of the tensor to slice from.
The (optional) TensorId of the index of the starting point of the slice (see DynamicBaseOp for an explanation).
The TensorId of the tensor to write the slice into (not used in the outplace variant).

The output is the TensorId of the sliced tensor, aliased

Public Functions

DynamicSliceInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)

DynamicSliceInplaceOp(const DynamicSliceOp&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters: OperatorIdentifier – The operator identifier of the op to be instantiated.
Returns: An instance of the required op.
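The "descending order of preference" convention for the returned tuples can be illustrated with a small standalone sketch. It assumes a plain std::string as a stand-in for OperatorIdentifier so that it compiles without PopART; OpIdSketch, InplacePriorities and sortByDescendingPriority are hypothetical names, and the sketch says nothing about how PopART itself orders or consumes the priorities.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Stand-in for popart::OperatorIdentifier (an assumption for this sketch).
using OpIdSketch        = std::string;
using InplacePriorities = std::vector<std::tuple<OpIdSketch, float>>;

// Return the candidates ordered so that the most preferred in-place
// variant (highest priority value) comes first. Ties keep their order.
InplacePriorities sortByDescendingPriority(InplacePriorities candidates) {
  std::stable_sort(candidates.begin(), candidates.end(),
                   [](const std::tuple<OpIdSketch, float> &a,
                      const std::tuple<OpIdSketch, float> &b) {
                     return std::get<1>(a) > std::get<1>(b);
                   });
  return candidates;
}
```

A caller would then walk the sorted vector from the front and instantiate the first acceptable variant, which is the role getInplaceVariant() plays for a chosen identifier.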

virtual view::RegMap fwdRegMap(InIndex, OutIndex) const final

Map regions of the input tensor at the input index to the regions of the output tensor at the output index that these input regions alias.

Parameters

InIndex – The op input index.
OutIndex – The op output index.

virtual view::RegMap bwdRegMap(InIndex, OutIndex) const final

Map regions of the output tensor at the output index to the regions of the input tensor at the input index that these output regions alias.

Parameters

InIndex – The op input index.
OutIndex – The op output index.
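As a rough illustration of what forward and backward region maps express for a slice, consider the one-dimensional case. The types and functions below (Interval, fwdMap, bwdMap) are hypothetical and do not use PopART's view::Region or view::RegMap; they assume the slice covers [index, index + size) of the input and that the output aliases exactly that range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical 1-D region: the half-open range [begin, end).
struct Interval {
  int64_t begin;
  int64_t end;
};

// Forward direction: the part of the slice output that aliases the given
// input interval. An empty result is returned as {0, 0}.
Interval fwdMap(Interval in, int64_t index, int64_t size) {
  const int64_t b = std::max(in.begin, index);
  const int64_t e = std::min(in.end, index + size);
  if (b >= e) {
    return Interval{0, 0};
  }
  return Interval{b - index, e - index}; // shift into output coordinates
}

// Backward direction: the part of the input that the given output interval
// aliases. An empty result is returned as {0, 0}.
Interval bwdMap(Interval out, int64_t index, int64_t size) {
  const int64_t b = std::max<int64_t>(out.begin, 0);
  const int64_t e = std::min(out.end, size);
  if (b >= e) {
    return Interval{0, 0};
  }
  return Interval{b + index, e + index}; // shift into input coordinates
}
```

With index = 3 and size = 4, the whole input [0, 10) maps forward to the output range [0, 4), and the output range [1, 3) maps backward to the input range [4, 6).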

virtual view::Regions modifies(InIndex) const override

Return the input region which this op modifies (for inplace ops).

Parameters: InIndex – The input index.
Returns: The regions which this op modifies.

virtual view::Regions aliases(InIndex, OutIndex) const override

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters

InIndex – The input index.
OutIndex – The output index.

Returns

The regions which the output will alias.

class DynamicSliceOp : public popart::DynamicSliceBaseOp 

Dynamic Slice Op.

This Op takes two or three TensorIds as input (as indicated in graphcoreoperators.hpp):

The TensorId of the tensor to slice from.
The (optional) TensorId of the index of the starting point of the slice (see DynamicBaseOp for an explanation).
The TensorId of the tensor to write the slice into (not used in the outplace variant).

The output is the TensorId of the sliced tensor.

Subclassed by popart::DynamicSliceInplaceOp

Public Functions

DynamicSliceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
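The index mapping described above has the same shape as the std::map<int, int> returned by gradOutToNonGradIn(). The sketch below is a standalone illustration, not PopART code: describeGradOutputs is a hypothetical helper, and the input names passed to it are made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// For each (grad op output index -> forward op input index) pair, report
// which forward input the gradient output corresponds to.
std::vector<std::string>
describeGradOutputs(const std::vector<std::string> &fwdInputs,
                    const std::map<int, int> &gradOutToNonGradIn) {
  std::vector<std::string> lines;
  for (const auto &entry : gradOutToNonGradIn) {
    // at() throws std::out_of_range if the map points at a missing input.
    const std::string &input = fwdInputs.at(static_cast<std::size_t>(entry.second));
    lines.push_back("grad output " + std::to_string(entry.first) +
                    " -> gradient of input '" + input + "'");
  }
  return lines;
}
```

A grad op that produces a single gradient for the forward op's first input would use a map with the one entry {0, 0}.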

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters: OperatorIdentifier – The operator identifier of the op to be instantiated.
Returns: An instance of the required op.

virtual void growAliasModel(AliasModel&) const override

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters: aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
Pre: All input tensors of this op have mappings in aliasModel before the call to this method.
Post: All output tensors of this op have mappings in aliasModel after the call to this method.

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Translate a PopART inplacing proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.

This method is defined as a void method which sets a value passed by reference, as opposed to a getter method, so that no Poprithms headers need to be included in this file.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
inplaceId – The operator identifier to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

Public Static Functions

static inline InIndex getSliceInIndex()

class DynamicSlicePadGradOp : public popart::DynamicBaseOp 

Public Functions

DynamicSlicePadGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

void setup() final

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const override

inline std::set<InIndex> optionalInputs() const override

Public Static Functions

static inline InIndex getInIndex()

class DynamicTernaryBaseInplaceOp : public popart::DynamicTernaryBaseOp 

Subclassed by popart::DynamicAddInplaceOp, popart::DynamicUpdateInplaceOp

Public Functions

DynamicTernaryBaseInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

std::unique_ptr<Op> clone() const override

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

view::Regions aliases(InIndex, OutIndex) const final

view::Regions modifies(InIndex) const final

class DynamicTernaryBaseOp : public popart::DynamicBinaryBaseOp 

Dynamic Ternary Base Op.

Base class for operators acting on a run-time selectable slice of a tensor. The word “ternary” refers to the fact that the operator takes three tensors as input.

See also

DynamicBaseOp for details

Subclassed by popart::DynamicAddOp, popart::DynamicTernaryBaseInplaceOp, popart::DynamicUpdateOp

Public Functions

DynamicTernaryBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

Public Static Functions

static inline InIndex getUpdateInIndex()

static inline InIndex getInIndex()

class DynamicUpdateInplaceOp : public popart::DynamicTernaryBaseInplaceOp 

Public Functions

DynamicUpdateInplaceOp(const DynamicUpdateOp &dynamicUpdateOp)

DynamicUpdateInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

std::unique_ptr<Op> clone() const final

class DynamicUpdateOp : public popart::DynamicTernaryBaseOp 

Dynamic Update Op.

This class takes three TensorIds as input (as indicated in graphcoreoperators.hpp):

The TensorId of the tensor to be updated.
The TensorId of the index of the starting point of the slice (see DynamicBaseOp for an explanation).
The TensorId to update with (must match dimension with (index, axes, sizes)).

The output is the TensorId of the updated tensor.

See also

DynamicTernaryBaseOp for details.
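The effect of a dynamic update can be sketched for a one-dimensional tensor in plain C++. The function dynamicUpdate1D is hypothetical and does not call PopART; it assumes outplace behaviour (the input is copied and the copy is returned) and a single sliced axis.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Overwrite tensor[index, index + update.size()) with `update` and return
// the result. `tensor` is taken by value, so the caller's data is untouched.
std::vector<float> dynamicUpdate1D(std::vector<float> tensor,
                                   const std::vector<float> &update,
                                   std::size_t index) {
  // The slice must fit inside the axis (stop <= axis size).
  if (index > tensor.size() || update.size() > tensor.size() - index) {
    throw std::out_of_range("update does not fit inside the tensor");
  }
  std::copy(update.begin(), update.end(),
            tensor.begin() + static_cast<std::ptrdiff_t>(index));
  return tensor;
}
```

Updating a five-element zero tensor with {1, 2} at index 2 gives {0, 0, 1, 2, 0}, while the original tensor keeps its values.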

Public Functions

DynamicUpdateOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters: OperatorIdentifier – The operator identifier of the op to be instantiated.
Returns: An instance of the required op.

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

class DynamicUpdateToUpdateGradOp : public popart::DynamicBinaryBaseOp 

Public Functions

DynamicUpdateToUpdateGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class DynamicUpdateUpdaterGradOp : public popart::DynamicSliceBaseOp 

Public Functions

DynamicUpdateUpdaterGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class DynamicZeroGradOp : public popart::DynamicBinaryBaseOp 

Public Functions

DynamicZeroGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class DynamicZeroInplaceOp : public popart::DynamicBinaryBaseInplaceOp 

Public Functions

DynamicZeroInplaceOp(const DynamicZeroOp &dynamicZeroOp)

DynamicZeroInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&, TensorInfo updateInInfo_ = TensorInfo())

std::unique_ptr<Op> clone() const final

class DynamicZeroOp : public popart::DynamicBinaryBaseOp 

Public Functions

DynamicZeroOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

class ElementWiseBinaryArg0GradOp : public popart::ElementWiseBinaryGradOp 

Subclassed by popart::Atan2Arg0GradOp, popart::DivArg0GradOp, popart::FmodArg0GradOp, popart::MulArg0GradOp, popart::PowArg0GradOp

Public Functions

inline ElementWiseBinaryArg0GradOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_reduction_axes, const TensorInfo &_forward_op_arg_info, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

class ElementWiseBinaryArg1GradOp : public popart::ElementWiseBinaryGradOp 

Subclassed by popart::Atan2Arg1GradOp, popart::DivArg1GradOp, popart::MulArg1GradOp, popart::PowArg1GradOp, popart::SubtractArg1GradOp

Public Functions

inline ElementWiseBinaryArg1GradOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_reduction_axes, const TensorInfo &_forward_op_arg_info, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

class ElementWiseBinaryBaseOp : public popart::Op 

Subclassed by popart::ElementWiseBinaryInplaceLhsOp, popart::ElementWiseBinaryInplaceRhsOp, popart::ElementWiseBinaryOp

Public Functions

ElementWiseBinaryBaseOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

void setup() override

inline float getSubgraphValue() const final

inline bool canShard() const override

void growAliasModel(AliasModel&) const override

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

inline view::RegMap fwdRegMap(InIndex argIndex, OutIndex) const final

inline view::RegMap bwdRegMap(InIndex argIndex, OutIndex) const final

Public Static Functions

static inline InIndex getArg0InIndex()

static inline InIndex getArg1InIndex()

static inline OutIndex getOutIndex()

class ElementWiseBinaryGradOp : public popart::Op 

Subclassed by popart::ElementWiseBinaryArg0GradOp, popart::ElementWiseBinaryArg1GradOp

Public Functions

ElementWiseBinaryGradOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_reduction_axes, const TensorInfo &_forward_op_arg_info, const Op::Settings &_settings)

virtual std::unique_ptr<Op> clone() const override = 0

void setup() final

inline const std::vector<int64_t> &getReductionAxes() const

inline float getSubgraphValue() const final

inline const std::map<int, int> &gradOutToNonGradIn() const final

inline virtual const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getFwdArg0InIndex()

static inline InIndex getFwdArg1InIndex()

static inline InIndex getFwdOutIndex()

static inline OutIndex getOutIndex()

class ElementWiseBinaryInplaceLhsOp : public popart::ElementWiseBinaryBaseOp 

Subclassed by popart::AddLhsInplaceOp, popart::Atan2LhsInplaceOp, popart::MulLhsInplaceOp, popart::PowLhsInplaceOp

Public Functions

inline ElementWiseBinaryInplaceLhsOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

inline view::Regions modifies(InIndex index) const final

inline view::Regions aliases(InIndex index, OutIndex) const final

class ElementWiseBinaryInplaceRhsOp : public popart::ElementWiseBinaryBaseOp 

Subclassed by popart::AddRhsInplaceOp, popart::MulRhsInplaceOp

Public Functions

inline ElementWiseBinaryInplaceRhsOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

inline view::Regions modifies(InIndex index) const final

inline view::Regions aliases(InIndex index, OutIndex) const final

class ElementWiseBinaryOp : public popart::ElementWiseBinaryBaseOp 

Subclassed by popart::ElementWiseNpBroadcastableBinaryWithGradOp< AddArg0GradOp, AddArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< Atan2Arg0GradOp, Atan2Arg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< DivArg0GradOp, DivArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< MulArg0GradOp, MulArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< PowArg0GradOp, PowArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< SubtractArg0GradOp, SubtractArg1GradOp >, popart::BitwiseBinaryOp, popart::ElementWiseNpBroadcastableBinaryWithGradOp< Arg0GradOp, Arg1GradOp >, popart::FmodOp, popart::PReluOp

Public Functions

ElementWiseBinaryOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

void setInplacePriority(const OperatorIdentifier&, float)

float getInplacePriority(const OperatorIdentifier&) const

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

class ElementWiseInplaceUnaryOp : public popart::ElementWiseUnaryOp 

Public Functions

inline ElementWiseInplaceUnaryOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

inline view::Regions modifies(InIndex index) const final

inline view::Regions aliases(InIndex in, OutIndex) const final

class ElementWiseNonLinearUnaryGradOp : public popart::Op 

Subclassed by popart::AsinGradOp, popart::AtanGradOp, popart::CosGradOp, popart::EluGradOp, popart::ErfGradOp, popart::GeluErfGradOp, popart::GeluGradOp, popart::HardSigmoidGradOp, popart::Log1pGradOp, popart::LogGradOp, popart::ReciprocalGradOp, popart::SeluGradOp, popart::ShrinkGradOp, popart::SinGradOp, popart::SinhGradOp, popart::SoftPlusGradOp, popart::SoftSignGradOp, popart::SwishGradOp, popart::ThresholdedReluGradOp

Public Functions

ElementWiseNonLinearUnaryGradOp(const OperatorIdentifier &_opid, const ElementWiseUnaryOp &fwdOp)

std::unique_ptr<Op> clone() const override

void setup() final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline float getSubgraphValue() const final

inline bool canShard() const override

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getFwdArgInIndex()

static inline OutIndex getOutIndex()

template<class Arg0GradOp, class Arg1GradOp> class ElementWiseNpBroadcastableBinaryWithGradOp : public popart::ElementWiseBinaryOp 

Subclassed by popart::AddOp, popart::Atan2Op, popart::DivOp, popart::MulOp, popart::PowOp, popart::SubtractOp

Public Functions

inline ElementWiseNpBroadcastableBinaryWithGradOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)

inline std::unique_ptr<Op> clone() const override

inline virtual std::vector<std::unique_ptr<Op>> getGradOps() final

class ElementWiseUnaryBooleanOp : public popart::Op 

Subclassed by popart::IsInf, popart::IsNaN

Public Functions

ElementWiseUnaryBooleanOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

void setup() final

inline float getSubgraphValue() const override

inline bool canShard() const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class ElementWiseUnaryOp : public popart::Op 

Subclassed by popart::AbsOp, popart::AsinOp, popart::AtanOp, popart::AutoLossScaleProxyOp, popart::BinaryConstScalarOp, popart::BitwiseNotOp, popart::ClipOp, popart::CosOp, popart::DetachOp, popart::ElementWiseInplaceUnaryOp, popart::EluOp, popart::ErfOp, popart::Expm1Op, popart::ExpOp, popart::GeluErfOp, popart::GeluOp, popart::HardSigmoidOp, popart::IdentityOp, popart::IncrementModOp, popart::LeakyReluOp, popart::Log1pOp, popart::LogOp, popart::LogSoftmaxOp, popart::NegateOp, popart::NopOp, popart::NotOp, popart::OneWayUnaryOp, popart::PrintTensorOp, popart::ReciprocalOp, popart::ReluOp, popart::ScaleOp, popart::SeluOp, popart::ShrinkOp, popart::SigmoidOp, popart::SinhOp, popart::SinOp, popart::SoftmaxOp, popart::SoftPlusOp, popart::SoftSignOp, popart::SqrtOp, popart::SquareOp, popart::SwishOp, popart::TanhOp, popart::ThresholdedReluOp

Public Functions

ElementWiseUnaryOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)

std::unique_ptr<Op> clone() const override

void setup() final

inline float getSubgraphValue() const override

inline bool canShard() const override

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

void growAliasModel(AliasModel&) const override

inline virtual bool isIdentity() const

Returns: true, if and only if (iff) this Op is mathematically equivalent to f(x) = x. This is slightly different to canBeReplacedByIdentity; for example Detach and Identity have isIdentity overridden to return true, but still return false for canBeReplacedByIdentity.
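The distinction between the two predicates can be shown with a minimal class hierarchy. The classes UnaryOpSketch and DetachSketch are hypothetical stand-ins and do not derive from any PopART type. The comment about the backward pass is an assumption offered as one plausible reason for the difference, not a statement taken from the PopART sources.

```cpp
#include <cassert>

// Hypothetical stand-in for a unary op base class.
struct UnaryOpSketch {
  virtual ~UnaryOpSketch() = default;
  // True only if the op computes f(x) = x.
  virtual bool isIdentity() const { return false; }
  // True only if the op may be swapped for an identity op in the graph.
  virtual bool canBeReplacedByIdentity() const { return false; }
};

// A detach-like op: mathematically f(x) = x in the forward pass, but it is
// not declared replaceable (assumption: removing it could change how
// gradients flow in the backward pass).
struct DetachSketch : UnaryOpSketch {
  bool isIdentity() const override { return true; }
  // canBeReplacedByIdentity() is deliberately left at the default (false).
};
```

Code that only needs the forward-pass value can rely on isIdentity(), while graph transformations that remove ops would need to consult canBeReplacedByIdentity() instead.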

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class EluGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

EluGradOp(const EluOp&)

std::unique_ptr<Op> clone() const final

void appendAttributes(OpSerialiserBase&) const final

inline float alpha() const

class EluInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

EluInplaceOp(const EluOp&)

std::unique_ptr<Op> clone() const final

void appendAttributes(OpSerialiserBase&) const final

inline float alpha() const

class EluOp : public popart::ElementWiseUnaryOp 

Public Functions

EluOp(const OperatorIdentifier &opid, float alpha, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

void appendAttributes(OpSerialiserBase&) const final

inline float alpha() const

class EqualOp : public popart::BinaryComparisonOp 

Public Functions

EqualOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class ErfGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

ErfGradOp(const ErfOp &fwdOp)

std::unique_ptr<Op> clone() const final

class ErfOp : public popart::ElementWiseUnaryOp 

Public Functions

ErfOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class ExchangeBaseOp : public popart::Op 

Subclassed by popart::HostBaseOp, popart::MultiExchangeOp, popart::RemoteBaseOp, popart::RemoteCodeLoadOp

Public Functions

inline ExchangeBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

virtual std::unique_ptr<Op> clone() const override = 0

inline virtual int getNumExchanges() const

virtual ExchangeDescriptor getExchangeDescriptor(int index) const = 0

Return the exchange descriptor at index. A MultiExchangeOp can contain multiple descriptors, while RemoteLoad/Store and HostLoad/Store contain one each.

Parameters: index – Index of the exchange descriptor to return.
Returns: ExchangeDescriptor for the exchange.

inline float getSubgraphValue() const final

inline bool isOutlineable() const final

virtual std::pair<int, int> inIndexToDescriptorIndex(InIndex index) const

Get the descriptor index associated with the input index.

Parameters: index – input index
Returns: pair of descriptor index and input index relative to the descriptor

virtual std::pair<int, int> outIndexToDescriptorIndex(OutIndex index) const

Get the descriptor index associated with the output index.

Parameters: index – output index
Returns: pair of descriptor index and output index relative to the descriptor

virtual std::vector<InIndex> descriptorIndexToInIndices(int index) const

Get the input indices associated with the descriptor index.

Parameters: index – exchange descriptor index
Returns: The input indices associated with the descriptor.

virtual std::vector<OutIndex> descriptorIndexToOutIndices(int index) const

Get the output indices associated with the descriptor index.

Parameters: index – exchange descriptor index
Returns: The output indices associated with the descriptor.
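The relationship between op-level input indices and per-descriptor indices can be illustrated with a small standalone model. This is only a sketch under stated assumptions: ExchangeLayoutSketch and its numInputs member are hypothetical names, it does not use any PopART types, and it assumes each descriptor owns a consecutive run of inputs, which the real ops may not do.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical model (not a PopART class): descriptor d is assumed to own
// numInputs[d] consecutive op-level input indices.
struct ExchangeLayoutSketch {
  std::vector<int> numInputs;

  // Map an op-level input index to
  // (descriptor index, input index relative to that descriptor).
  // Returns {-1, -1} if the index is out of range.
  std::pair<int, int> inIndexToDescriptorIndex(int inIndex) const {
    if (inIndex < 0) {
      return {-1, -1};
    }
    int offset = 0;
    for (int d = 0; d < static_cast<int>(numInputs.size()); ++d) {
      if (inIndex < offset + numInputs[d]) {
        return {d, inIndex - offset};
      }
      offset += numInputs[d];
    }
    return {-1, -1};
  }

  // Inverse direction: all op-level input indices owned by one descriptor.
  std::vector<int> descriptorIndexToInIndices(int descIndex) const {
    assert(descIndex >= 0 &&
           descIndex < static_cast<int>(numInputs.size()));
    int offset = 0;
    for (int d = 0; d < descIndex; ++d) {
      offset += numInputs[d];
    }
    std::vector<int> indices;
    for (int i = 0; i < numInputs[descIndex]; ++i) {
      indices.push_back(offset + i);
    }
    return indices;
  }
};
```

With three descriptors owning 2, 1 and 2 inputs, op-level input 3 belongs to descriptor 2 at relative position 0, and descriptor 2 owns op-level inputs 3 and 4. The output-side methods follow the same pattern.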

class ExpGradOp : public popart::Op 

Public Functions

ExpGradOp(const ExpOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getFwdOutInIndex()

static inline OutIndex getOutIndex()

class ExpInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

ExpInplaceOp(const ExpOp&)

ExpInplaceOp(const Op::Settings &opSettings)

std::unique_ptr<Op> clone() const final

class ExpOp : public popart::ElementWiseUnaryOp 

Public Functions

ExpOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class ExpandGradOp : public popart::Op 

Public Functions

ExpandGradOp(const ExpandOp &op)

ExpandGradOp(const ExpandInplaceOp &op)

std::unique_ptr<Op> clone() const override

void setup() override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline std::vector<size_t> getXShape()

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getDYIndex()

static inline OutIndex getOutIndex()

class ExpandInplaceOp : public popart::ExpandOp 

Public Functions

ExpandInplaceOp(const OperatorIdentifier &_opid, const Shape&, const Op::Settings &settings_)

ExpandInplaceOp(const ExpandOp &expandOp)

std::unique_ptr<Op> clone() const override

inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

inline view::Regions aliases(InIndex in, OutIndex) const final

class ExpandOp : public popart::Op 

Subclassed by popart::ExpandInplaceOp

Public Functions

inline ExpandOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

ExpandOp(const OperatorIdentifier &_opid, const Shape &_outShape, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

void setup() final

std::vector<std::unique_ptr<Op>> getGradOps() final

inline Shape getOutShape() const

inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

inline bool canBeReplacedByIdentity() const override

void growAliasModel(AliasModel&) const override

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

inline float getSubgraphValue() const final

void connectInTensor(InIndex inIndex, TensorId tenId) final

Public Static Functions

static inline InIndex getInTensorIndex()

static inline InIndex getInShapeIndex()

static inline OutIndex getOutIndex()

class Expm1GradOp : public popart::Op 

Public Functions

Expm1GradOp(const Expm1Op &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getFwdOutInIndex()

static inline OutIndex getOutIndex()

class Expm1InplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

Expm1InplaceOp(const Expm1Op&)

std::unique_ptr<Op> clone() const final

class Expm1Op : public popart::ElementWiseUnaryOp 

Public Functions

Expm1Op(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class FloorInplaceOp : public popart::OneWayUnaryInPlaceOp 

Public Functions

FloorInplaceOp(const FloorOp&)

std::unique_ptr<Op> clone() const final

class FloorOp : public popart::OneWayUnaryOp 

Public Functions

FloorOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class FmodArg0GradOp : public popart::ElementWiseBinaryArg0GradOp 

Public Functions

FmodArg0GradOp(const FmodOp &op, const std::vector<int64_t> &reductionAxes)

std::unique_ptr<Op> clone() const final

class FmodOp : public popart::ElementWiseBinaryOp 

Public Functions

FmodOp(const OperatorIdentifier &opId, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class GRUGradOp : public popart::BaseOnnxRNNGradOp 

Gradient operator for GRUOp.

Public Functions

GRUGradOp(const GRUOp&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::set<InIndex> optionalInputs() const final: Return the input indices of all optional inputs to the op.

Public Members

const unsigned linear_before_reset_attribute

Public Static Functions

static inline InIndex getIntermediatesInIndex()

class GRUOp : public popart::BaseOnnxRNNOp 

This op applies a single-layer GRU with a non-linearity to a batch of input sequences.

The op follows the ONNX specification described in https://github.com/onnx/onnx/blob/main/docs/Operators.md#GRU

Public Functions

GRUOp(const OperatorIdentifier &_opid, nonstd::optional<int64_t> hidden_size, const std::string direction, bool linear_before_reset, const Op::Settings &settings_)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
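As a rough illustration of the mapping described above, the following standalone sketch models each grad op's gradOutToNonGradIn() result as a std::map<int, int> and collects the forward-op inputs that receive a gradient. The alias GradOutToNonGradInSketch and the function coveredForwardInputs are hypothetical names chosen for this example and are not part of PopART.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical alias: key = output index of the grad op,
// value = input index of the forward (non-grad) op.
using GradOutToNonGradInSketch = std::map<int, int>;

// Collect every forward-op input index that receives a gradient from at
// least one grad op. This works both when there is one grad op per input
// and when a single grad op produces gradients for all inputs.
std::set<int> coveredForwardInputs(
    const std::vector<GradOutToNonGradInSketch> &gradOps) {
  std::set<int> covered;
  for (const auto &mapping : gradOps) {
    for (const auto &entry : mapping) {
      covered.insert(entry.second);
    }
  }
  return covered;
}
```

Two grad ops that each map output 0 to forward inputs 0 and 1 respectively cover the same set of inputs as one grad op that maps outputs 0 and 1 to inputs 0 and 1.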

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

unsigned getNumChannels() const

int64_t getNumDirections() const override

bool hasOutput(OutIndex) const

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.
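The outlining rule stated above (identical type and identical outline attributes) can be sketched as a simple equality check. This is a simplified standalone model: OutlineKeySketch and sameOutlineKey are hypothetical names, the attributes are modelled as strings, and the real serialiser is not used.

```cpp
#include <map>
#include <string>

// Hypothetical model of what an op contributes to outlining decisions:
// its type plus the attributes it appended for outlining.
struct OutlineKeySketch {
  std::string opType;
  std::map<std::string, std::string> outlineAttributes;
};

// Two ops are candidates for outlining together only if both the type
// and every outline attribute match.
bool sameOutlineKey(const OutlineKeySketch &a, const OutlineKeySketch &b) {
  return a.opType == b.opType &&
         a.outlineAttributes == b.outlineAttributes;
}
```

Under this model, two ops of the same type that differ in a single appended attribute compare as different and would not be outlined together. This is why an op with additional attributes needs to append them.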

bool isTraining() const

inline virtual bool isOutlineable() const override

Check if op can be outlined.

If this method returns false, any subgraph that contains this op will not be cached.

Returns: true if the op can be outlined, false otherwise. Default: true.

inline std::string getDirectionAttribute() const

inline int getLinearBeforeResetAttribute() const

Public Static Functions

static inline OutIndex getInitialHPassThroughIndex()

static inline OutIndex getIntermediatesPassThroughIndex()

static inline OutIndex getInputWeightsPassThroughIndex()

static inline OutIndex getRecurrenceWeightsPassThroughIndex()

static inline OutIndex getBiasesPassThroughIndex()

class GatherGradOp : public popart::Op 

Subclassed by popart::TiedGatherGradOp

Public Functions

GatherGradOp(const GatherOp &op, int64_t axis, int64_t group_size)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

int64_t getAxis() const

int64_t getGroupSize() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

inline int getInBatchAxis(InIndex i) const override

inline int getOutBatchAxis(OutIndex) const override

inline bool canShard() const override

inline nonstd::optional<float> getAvailableMemoryProportion() const

inline void setAvailableMemoryProportion(const nonstd::optional<float> v)

Public Static Functions

static inline InIndex gradInIndex()

static inline InIndex indicesInIndex()

static inline OutIndex gradOutIndex()

class GatherOp : public popart::Op 

Subclassed by popart::TiedGatherOp

Public Functions

GatherOp(const OperatorIdentifier &_opid, int64_t axis_, int64_t group_size_, const Op::Settings &settings_, const nonstd::optional<float> &available_memory_proportion_ = nonstd::nullopt, bool zeroOutOfRangeIndices_ = false)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() override

void setup() final

int64_t getAxis() const

int64_t getGroupSize() const

void appendOutlineAttributes(OpSerialiserBase&) const override

bool canBeReplacedByIdentity() const override

inline float getSubgraphValue() const override

inline bool canShard() const override

inline nonstd::optional<float> getAvailableMemoryProportion() const

inline void setAvailableMemoryProportion(const nonstd::optional<float> v)

inline bool zeroOutOfRangeIndices() const

Public Static Functions

static inline InIndex dataInIndex()

static inline InIndex indicesInIndex()

static inline OutIndex outIndex()

class GeluGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

GeluGradOp(const GeluOp&)

std::unique_ptr<Op> clone() const final

class GeluInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

GeluInplaceOp(const GeluOp&)

GeluInplaceOp(const Op::Settings &opSettings)

std::unique_ptr<Op> clone() const final

class GeluOp : public popart::ElementWiseUnaryOp 

Public Functions

GeluOp(const OperatorIdentifier &opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class GeluErfGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

GeluErfGradOp(const GeluErfOp&)

std::unique_ptr<Op> clone() const final

class GeluErfInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

GeluErfInplaceOp(const GeluErfOp&)

GeluErfInplaceOp(const Op::Settings &opSettings)

std::unique_ptr<Op> clone() const final

class GeluErfOp : public popart::ElementWiseUnaryOp 

Public Functions

GeluErfOp(const OperatorIdentifier &opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class GetRandomSeedOp : public popart::Op 

Public Functions

GetRandomSeedOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

inline InIndex getSeedInIndex() const override

inline int getOutBatchAxis(OutIndex) const override

inline float getSubgraphValue() const final

inline bool isOutlineable() const final

view::Regions aliases(InIndex, OutIndex) const final

view::Regions modifies(InIndex) const final

inline void growAliasModel(AliasModel &m) const override

Public Static Functions

static inline OutIndex getUpdatedSeedOutIndex()

static inline TensorId getStreamedSeedTensorId()

static inline TensorId getUpdatedSeedTensorId()

class GlobalAveragePoolGradOp : public popart::Op 

Public Functions

GlobalAveragePoolGradOp(const GlobalAveragePoolOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Members

const Shape creatorSpatialK

const Shape creatorStrides

const Shape creatorLowerPads

const Shape creatorUpperPads

Public Static Functions

static inline InIndex getPrePooledInIndex()

static inline InIndex getPooledInIndex()

static inline InIndex getGradPooledInIndex()

static inline OutIndex getOutIndex()

class GlobalAveragePoolOp : public popart::Op 

Public Functions

GlobalAveragePoolOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() override

std::vector<std::unique_ptr<Op>> getGradOps() final

inline Shape getSpatialK() const

Shape getStrides() const

Shape getLowerPads() const

Shape getUpperPads() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class GlobalMaxPoolGradOp : public popart::Op 

Public Functions

GlobalMaxPoolGradOp(const GlobalMaxPoolOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Members

const Shape creatorSpatialK

const Shape creatorStrides

const Shape creatorLowerPads

const Shape creatorUpperPads

Public Static Functions

static inline InIndex getPrePooledInIndex()

static inline InIndex getPooledInIndex()

static inline InIndex getGradPooledInIndex()

static inline OutIndex getOutIndex()

class GlobalMaxPoolOp : public popart::Op 

Public Functions

GlobalMaxPoolOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

void setup() override

std::vector<std::unique_ptr<Op>> getGradOps() final

inline Shape getSpatialK() const

Shape getStrides() const

Shape getLowerPads() const

Shape getUpperPads() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class GreaterOp : public popart::BinaryComparisonOp 

Public Functions

GreaterOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class GroupNormGradOp : public popart::Op 

Public Functions

GroupNormGradOp(const GroupNormOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

inline float getEpsilon() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

inline bool canShard() const override

Public Static Functions

static inline InIndex getXInIndex()

static inline InIndex getScaleInIndex()

static inline InIndex getMeanInIndex()

static inline InIndex getInvStdDevInIndex()

static inline InIndex getYGradInIndex()

static inline OutIndex getXGradOutIndex()

static inline OutIndex getScaleOutIndex()

static inline OutIndex getBOutIndex()

class GroupNormOp : public popart::Op 

Public Functions

GroupNormOp(const OperatorIdentifier &opid_, int64_t num_groups_, float epsilon_, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getEpsilon() const

inline int64_t getNumGroups() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline bool isNorm() const override

inline float getSubgraphValue() const final

inline bool canShard() const override

bool canBeReplacedByIdentity() const final

Public Static Functions

static inline InIndex getXInIndex()

static inline InIndex getScaleInIndex()

static inline InIndex getBInIndex()

static inline OutIndex getYOutIndex()

static inline OutIndex getMeanOutIndex()

static inline OutIndex getInvStdDevOutIndex()

class HardSigmoidGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

HardSigmoidGradOp(const HardSigmoidOp&)

std::unique_ptr<Op> clone() const final

void appendAttributes(OpSerialiserBase&) const override

inline float getAlpha() const

inline float getBeta() const

class HardSigmoidInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

HardSigmoidInplaceOp(const HardSigmoidOp&)

std::unique_ptr<Op> clone() const final

void appendAttributes(OpSerialiserBase&) const override

inline float getAlpha() const

inline float getBeta() const

class HardSigmoidOp : public popart::ElementWiseUnaryOp 

Public Functions

HardSigmoidOp(const OperatorIdentifier &opid, float _alpha, float _beta, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

void appendAttributes(OpSerialiserBase&) const override

inline float getAlpha() const

inline float getBeta() const

class HasReceptiveFieldOp : public popart::Op 

Subclassed by popart::AveragePoolOp, popart::MaxPoolOp

Public Functions

HasReceptiveFieldOp(const OperatorIdentifier &_opid, const HasReceptiveFieldOp::ReceptiveOpAttributes &attributes, const Op::Settings &settings)

virtual std::unique_ptr<Op> clone() const override = 0

int getNSpatialDims() const

int64_t getBatchSize() const

int64_t getNInChans() const

virtual Shape getSpatialK() const = 0

Shape getStrides() const

inline Shape getLowerPads() const

inline Shape getUpperPads() const

inline Shape getLowerOutPads() const

inline Shape getUpperOutPads() const

Shape getPads() const

Shape getOutPads() const

Shape getDilations() const

Shape getInDilations() const

std::string getAutoPadStr(const AutoPad &x) const

std::vector<int64_t> getSpatialD() const

std::vector<int64_t> getSpatialO() const

void setup() override

virtual int64_t getNOutChans() const = 0

std::vector<int64_t> lowerPads() const

std::vector<int64_t> upperPads() const

std::vector<int64_t> lowerOutPads() const

std::vector<int64_t> upperOutPads() const

std::vector<size_t> spatialD_szt() const

std::vector<size_t> spatialK_szt() const

std::vector<uint32_t> lowerPads_u32() const

std::vector<uint32_t> upperPads_u32() const

std::vector<int> lowerPads_i32() const

std::vector<int> upperPads_i32() const

std::vector<uint32_t> dilations_u32() const

std::vector<uint32_t> strides_u32() const

void appendOutlineAttributes(OpSerialiserBase&) const override

Public Members

const std::vector<int64_t> basePads

const std::vector<int64_t> baseOutPads

const std::vector<int64_t> baseStrides

const std::vector<int64_t> baseDilations

const std::vector<int64_t> baseInDilations

const AutoPad padType

const bool ceilMode

Public Static Functions

static AutoPad getAutoPad(const std::string &autoPadStr)

static void alterPads(Shape &pads_, Shape OutShape_, Shape spatialD_, Shape spatialK_, std::vector<int64_t> strides_)

static std::vector<int64_t> lowerPads(Shape pads, int nSpatialDims, AutoPad padType)

static std::vector<int64_t> upperPads(Shape pads, int nSpatialDims, AutoPad padType)

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

static Shape getSpatialOutShape(Shape spatialD_, Shape spatialK_, std::vector<int64_t> pads_, std::vector<int64_t> outPads_, std::vector<int64_t> strides_, std::vector<int64_t> dilations_, std::vector<int64_t> inDilations_, AutoPad auto_pad_, bool ceil_mode_ = false)

struct ReceptiveOpAttributes

Public Functions

void setFromAttributes(const Attributes &attributes)

Public Members

std::vector<int64_t> pads

std::vector<int64_t> outPads

std::vector<int64_t> strides

std::vector<int64_t> dilations

std::vector<int64_t> inDilations

std::string auto_pad

int64_t ceil_mode = 0

class HistogramOp : public popart::Op 

Public Functions

HistogramOp(const OperatorIdentifier &_opid, const std::vector<float> &levels_, const bool absoluteOfInput_, const Op::Settings &settings_)

void setup() final

inline float getSubgraphValue() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

std::unique_ptr<Op> clone() const override

inline std::vector<float> getLevels() const

inline bool getAbsoluteOfInput() const

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class HostBaseOp : public popart::ExchangeBaseOp 

Subclassed by popart::HostLoadOp, popart::HostStoreOp

Public Functions

inline HostBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, TensorId sid_)

void appendOutlineAttributes(OpSerialiserBase&) const override

inline bool canShard() const final

virtual std::unique_ptr<Op> clone() const override = 0

inline bool hasSideEffect() const override

inline void setHostStreamTensorId(TensorId stream_id_)

inline TensorId getHostStreamTensorId() const

Public Static Functions

static inline InIndex getLocalTensorInIndex()

class HostLoadInplaceOp : public popart::HostLoadOp 

Public Functions

HostLoadInplaceOp(const OperatorIdentifier&, const Op::Settings&, TensorId sid_)

HostLoadInplaceOp(const HostLoadOp&)

std::unique_ptr<Op> clone() const override

void setup() final

view::Regions modifies(InIndex) const override

view::Regions aliases(InIndex, OutIndex) const override

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

ExchangeDescriptor getExchangeDescriptor(int index) const final

class HostLoadOp : public popart::HostBaseOp 

Host Load Op: an op to represent the transfer of data from the host to the device.

It uses the existing host to device transfers created when building the IR, but defers the actual poplar::Copy until the op itself runs. This allows the copy to be scheduled as part of the normal op scheduling.

There is a stage in the IR which adds the following ops:

Device :: InitOp -> input_prehostload -> HostLoadOp -> input -> etc…
Host :: data -> stream (feeds HostLoadOp)

Subclassed by popart::HostLoadInplaceOp

Public Functions

HostLoadOp(const OperatorIdentifier&, const Op::Settings&, TensorId sid_)

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() override

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

Determine whether output tensors are guaranteed to have an equal value across all replicas.

This means that they are “replica equal”. The check is based on information about the replica equal status of input tensors (and the same for any inputs that are modified by the op).

The default implementation sets each output tensor as being replica-equal if and only if all tensor inputs are replica-equal. For modified inputs, the default is to assume it is replica-equal only if there is an output that is deemed replica-equal that fully aliases all elements of the input. This default implementation is not correct for all ops. Ops that need a specialized implementation should override this virtual function.

Parameters

aliasModel – An alias model object.
inputMap – A map that stores, for each input, whether the inputs are data-equivalent over all replicas.
proxy – A helper object passed in by the replica-equal analysis.

Returns

A tuple comprising:

a mapping from output index to a replica-equal status with an entry for each output tensor.
a vector of input indices for inputs that were modified by the op to a value that is not replica-equal.
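The default rule for outputs described above can be sketched in a few lines. This is a simplified standalone model, not the PopART implementation: ReplEqMapSketch and defaultOutputReplicaEquality are hypothetical names, indices are plain int values, and the handling of modified inputs is deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>

// Hypothetical alias: tensor index -> "is replica-equal".
using ReplEqMapSketch = std::map<int, bool>;

// Default rule for outputs only: every output is marked replica-equal
// if and only if all inputs are replica-equal.
ReplEqMapSketch defaultOutputReplicaEquality(const ReplEqMapSketch &inputs,
                                             int numOutputs) {
  assert(numOutputs >= 0);
  const bool allInputsEqual =
      std::all_of(inputs.begin(), inputs.end(),
                  [](const std::pair<const int, bool> &entry) {
                    return entry.second;
                  });
  ReplEqMapSketch outputs;
  for (int out = 0; out < numOutputs; ++out) {
    outputs[out] = allInputsEqual;
  }
  return outputs;
}
```

A single input that is not replica-equal is enough to mark every output as not replica-equal. An op for which this is too pessimistic or too optimistic would need its own override.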

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.
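The ordering requirement can be illustrated with a standalone sketch. It assumes that the float in each tuple is a priority where a larger value means a stronger preference, and it uses std::string in place of OperatorIdentifier. VariantPrioritySketch and byDescendingPriority are hypothetical names for this example.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for std::tuple<OperatorIdentifier, float>.
using VariantPrioritySketch = std::tuple<std::string, float>;

// Return the candidate in-place variants ordered from most to least
// preferred, assuming a larger float means a stronger preference.
// stable_sort keeps the original order of equally ranked variants.
std::vector<VariantPrioritySketch>
byDescendingPriority(std::vector<VariantPrioritySketch> variants) {
  std::stable_sort(variants.begin(), variants.end(),
                   [](const VariantPrioritySketch &a,
                      const VariantPrioritySketch &b) {
                     return std::get<1>(a) > std::get<1>(b);
                   });
  return variants;
}
```

A caller would then walk the returned vector from the front and try each variant in turn.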

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters: OperatorIdentifier – The operator identifier of the op to be instantiated.
Returns: An instance of the required op.

virtual void growAliasModel(AliasModel &m) const final

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters: aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
Pre: All input tensors of this op have mappings in aliasModel before the call to growAliasModel().
Post: All output tensors of this op have mappings in aliasModel after the call to growAliasModel().
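The pre- and postcondition can be made concrete with a small standalone model. This sketch does not use the PopART AliasModel or any Poprithms types: AliasModelSketch and growSketch are hypothetical names, and tensors are identified by plain strings.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the mapping between PopART tensors and
// nodes in the aliasing graph.
struct AliasModelSketch {
  std::map<std::string, int> tensorToNode;
  int nextNode = 0;

  bool contains(const std::string &id) const {
    return tensorToNode.count(id) != 0;
  }
  void insert(const std::string &id) {
    tensorToNode.emplace(id, nextNode++);
  }
};

// Model of one op growing the alias model.
// Pre: every input tensor already has a mapping.
// Post: every output tensor has a mapping.
void growSketch(AliasModelSketch &model,
                const std::vector<std::string> &inputs,
                const std::vector<std::string> &outputs) {
  for (const auto &id : inputs) {
    if (!model.contains(id)) {
      throw std::logic_error("input tensor has no mapping: " + id);
    }
  }
  for (const auto &id : outputs) {
    if (!model.contains(id)) {
      model.insert(id);
    }
  }
}
```

In this model, ops must be visited in an order where producers come before consumers. Otherwise the precondition check fails and no outputs are added.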

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const final

Translate a PopART inplacing proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.

This method is defined as a void method which sets a value passed by reference, as opposed to a getter method, so that no Poprithms headers need to be included in this file.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
OperatorIdentifier – The operator identifier to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

ExchangeDescriptor getExchangeDescriptor(int index) const override

Public Static Functions

static inline OutIndex getLocalTensorOutIndex()

class HostStoreOp : public popart::HostBaseOp 

Public Functions

HostStoreOp(const OperatorIdentifier&, const Op::Settings&, TensorId sid_)

std::unique_ptr<Op> clone() const override

void setup() final

ExchangeDescriptor getExchangeDescriptor(int index) const final

class IdentityGradOp : public popart::IdentityOp 

Public Functions

IdentityGradOp(const IdentityOp &fwdOp)

IdentityGradOp(const Settings &settings_)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class IdentityInplaceOp : public popart::IdentityOp 

Public Functions

IdentityInplaceOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

IdentityInplaceOp(const IdentityOp &concatOp)

std::unique_ptr<Op> clone() const override

inline view::Regions aliases(InIndex in, OutIndex) const final

inline bool isInplaceViewChange() const override

class IdentityLossGradOp : public popart::Op 

Public Functions

IdentityLossGradOp(const IdentityLossOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

bool canBeReplacedByIdentity() const override

inline ReductionType getReductionType() const

inline float getSubgraphValue() const final

inline bool canShard() const override

float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class IdentityLossOp : public popart::LossOp 

Public Functions

IdentityLossOp(const OperatorIdentifier &_opid, const ReductionType &reduction, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

bool canBeReplacedByIdentity() const override

inline float getSubgraphValue() const final

inline bool canShard() const override

inline ReductionType getShardReductionType(OutIndex index) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class IdentityOp : public popart::ElementWiseUnaryOp 

Subclassed by popart::AddBiasDataGradOp, popart::IdentityGradOp, popart::IdentityInplaceOp, popart::IfConditionGradOp

Public Functions

IdentityOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

inline bool isIdentity() const final

inline bool isOutplaceViewChange() const override

class IfConditionGradOp : public popart::IdentityOp 

Public Functions

IfConditionGradOp(const IfOp&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class IfGradOp : public popart::IfOp 

Public Functions

IfGradOp(const IfOp&, const std::vector<GradInOutMapper> &gradInInfo, const BranchInfo &thenBranchInfo, const BranchInfo &elseBranchInfo)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class IfOp : public popart::Op 

Subclassed by popart::IfGradOp

Public Functions

IfOp(const OperatorIdentifier&, const BranchInfo &thenBranchInfo, const BranchInfo &elseBranchInfo, const Op::Settings&)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

Graph &getThenGraph() const

Graph &getElseGraph() const

const std::map<InIndex, InIndex> &getBranchInIndicesMap(const Graph&) const

const std::map<OutIndex, OutIndex> &getBranchOutIndicesMap(const Graph&) const

inline float getSubgraphValue() const final

std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

std::vector<const Graph*> getCalledGraphs() const override

virtual InIndex opInToSubgraphInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override

virtual InIndex subgraphInToOpInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override

virtual OutIndex opOutToSubgraphOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override

virtual OutIndex subgraphOutToOpOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override

std::set<OutIndex> opInToOpOutIndex(InIndex in) const override

std::set<InIndex> opOutToOpInIndex(OutIndex out) const override

float calcAutoVirtualGraphCost(std::set<int> &inputs_seen) override

virtual void setCalledSubgraphGradInfo(const FwdGraphToBwdGraphInfo &calledGraphsGradInfo) override

Public Static Functions

static inline InIndex getConditionInIndex()

class IncrementModInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Increment Modulo Op.

This Op takes one Tensor as input (as indicated in graphcoreoperators.hpp):

The Tensor to increment (modulo).

The output is the tensor x = (x + increment) % modulus.

Attributes:

increment - how much to increment the input tensor by (const scalar)
modulus - the modulo operand (const scalar)

Inplace - result is mapped back to the input Tensor.

Public Functions

IncrementModInplaceOp(double increment_, double modulus_, const Op::Settings &settings)

IncrementModInplaceOp(const IncrementModOp&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.
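The role of outline attributes can be illustrated with a small standalone sketch. It does not use the PopART headers: AttributeCollector, SketchIncrementModOp and sameOutlineAttributes are hypothetical names introduced here, and the comparison shown is only an assumed, simplified view of how two ops of the same type might be judged equivalent for outlining.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an op serialiser: it collects attributes by name.
struct AttributeCollector {
  std::map<std::string, double> attributes;

  void appendAttribute(const std::string &name, double value) {
    // Each attribute name is expected to be appended once per op.
    assert(attributes.count(name) == 0);
    attributes[name] = value;
  }
};

// Hypothetical op carrying the two attributes of an increment-modulo op.
struct SketchIncrementModOp {
  double increment;
  double modulus;

  // Append every attribute that changes the behaviour of the op.
  void appendOutlineAttributes(AttributeCollector &os) const {
    os.appendAttribute("increment", increment);
    os.appendAttribute("modulus", modulus);
  }
};

// Two ops of the same type are treated as interchangeable for outlining
// only when their serialised outline attributes match.
bool sameOutlineAttributes(const SketchIncrementModOp &a,
                           const SketchIncrementModOp &b) {
  AttributeCollector ca;
  AttributeCollector cb;
  a.appendOutlineAttributes(ca);
  b.appendOutlineAttributes(cb);
  return ca.attributes == cb.attributes;
}
```

In this sketch, two ops with increment 1 and modulus 4 compare equal, while changing either attribute on one of them makes the comparison fail.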

inline double getIncrement() const

inline double getModulus() const

class IncrementModOp : public popart::ElementWiseUnaryOp 

Increment Modulo Op.

This Op takes one Tensor as input (as indicated in graphcoreoperators.hpp):

The Tensor to increment (modulo).

The output is the tensor y = (x + increment) % modulus.

Attributes:

increment - how much to increment the input tensor by (const scalar)
modulus - the modulo operand (const scalar)
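The increment-modulo arithmetic can be illustrated with a minimal standalone sketch. It does not use the PopART API: incrementMod and incrementModInplace are hypothetical helpers operating on std::vector<double>, and the use of std::fmod for the % operation on floating-point values is an assumption about the intended semantics.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, not part of PopART: y = (x + increment) % modulus,
// applied element by element. std::fmod stands in for '%' (an assumption).
std::vector<double> incrementMod(const std::vector<double> &x,
                                 double increment,
                                 double modulus) {
  assert(modulus != 0.0);
  std::vector<double> y;
  y.reserve(x.size());
  for (double v : x) {
    y.push_back(std::fmod(v + increment, modulus));
  }
  return y;
}

// In-place counterpart: the result is written back into the input buffer.
void incrementModInplace(std::vector<double> &x,
                         double increment,
                         double modulus) {
  assert(modulus != 0.0);
  for (double &v : x) {
    v = std::fmod(v + increment, modulus);
  }
}
```

For example, with an increment of 1 and a modulus of 3, the input {0, 1, 2, 3} maps to {1, 2, 0, 1} in both variants.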

Public Functions

IncrementModOp(const OperatorIdentifier &opId, double increment_, double modulus_, const Op::Settings &settings)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

inline double getIncrement() const

inline double getModulus() const

class InitOp : public popart::Op 

Public Functions

InitOp(const OperatorIdentifier&, const TensorInfo&, const TensorType&, const InitType&, const Op::Settings&, const int = -1)

std::unique_ptr<Op> clone() const final

void setup() final

inline TensorInfo getTensorInfo() const

inline TensorType getTensorType() const

inline InitType getInitType() const

inline float getSubgraphValue() const final

inline bool isOutlineable() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline int getOutBatchAxis(OutIndex) const override

inline bool canShard() const override

Public Static Functions

static inline InIndex getOutIndex()

class InstanceNormGradOp : public popart::Op 

Public Functions

InstanceNormGradOp(const InstanceNormOp &fwd_op)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInputInIndex()

static inline InIndex getScaleInIndex()

static inline InIndex getOutGradInIndex()

static inline InIndex getMeanInIndex()

static inline InIndex getInvStdDevInIndex()

static inline OutIndex getInputOutIndex()

static inline OutIndex getScaleOutIndex()

static inline OutIndex getBOutIndex()

class InstanceNormOp : public popart::Op 

Public Functions

InstanceNormOp(const OperatorIdentifier &_opid, float _epsilon, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getEpsilon() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline bool isNorm() const override

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInputInIndex()

static inline InIndex getScaleInIndex()

static inline InIndex getBInIndex()

static inline OutIndex getOutIndex()

static inline OutIndex getMeanOutIndex()

static inline OutIndex getInvStdDevOutIndex()

class IoTileCopyOp : public popart::Op 

Public Functions

IoTileCopyOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

inline float getSubgraphValue() const final

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const final

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const final

inline bool canShard() const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class IsInf : public popart::ElementWiseUnaryBooleanOp 

Public Functions

IsInf(const OperatorIdentifier &_opid, const Op::Settings&)

std::unique_ptr<Op> clone() const override

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)

class IsNaN : public popart::ElementWiseUnaryBooleanOp 

Public Functions

IsNaN(const OperatorIdentifier &_opid, const Op::Settings&)

std::unique_ptr<Op> clone() const override

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)

class L1GradOp : public popart::Op 

Public Functions

L1GradOp(const L1Op&)

L1GradOp(const float lambda_, const ReductionType reduction_, const Op::Settings &settings_)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

inline float getSubgraphValue() const final

inline float getLambda() const

inline ReductionType getReductionType() const

inline bool canShard() const override

float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const override

Public Static Functions

static inline InIndex getFwdActInIndex()

static inline InIndex getGradInIndex()

static inline OutIndex getOutIndex()

class L1Op : public popart::LossOp 

Public Functions

L1Op(const OperatorIdentifier &_opid, const float lambda_, const ReductionType reduction_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getSubgraphValue() const final

inline float getLambda() const

inline bool canShard() const override

inline ReductionType getShardReductionType(OutIndex index) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class LRNGradOp : public popart::Op 

Public Functions

LRNGradOp(const LRNOp&)

std::unique_ptr<Op> clone() const final

void setup() final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline float getAlpha() const

inline float getBeta() const

inline float getBias() const

inline int64_t getSize() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()

static inline InIndex getFwdInInIndex()

static inline OutIndex getOutIndex()

class LRNOp : public popart::Op 

Public Functions

LRNOp(const OperatorIdentifier &_opid, float _alpha, float _beta, float _bias, int64_t _size, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getSubgraphValue() const final

inline float getAlpha() const

inline float getBeta() const

inline float getBias() const

inline int64_t getSize() const

void appendOutlineAttributes(OpSerialiserBase&) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class LSTMGradOp : public popart::BaseOnnxRNNGradOp 

Gradient operator for LSTM op.

Public Functions

LSTMGradOp(const LSTMOp&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual const std::map<int, int> &gradOutToNonGradIn() const final

Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

bool hasLastCellStateGradInput() const

virtual std::set<InIndex> optionalInputs() const final: Return the input indices of all optional inputs to the op.

Public Members

const bool hasInitialCInput

const std::string fwd_debug_name

const ActivationFunction activation

const ActivationFunction recurrent_activation

Public Static Functions

static inline InIndex getInitialCInIndex()

static inline InIndex getIntermediatesInIndex()

static inline InIndex getLastCellStateGradInIndex()

static inline OutIndex getInitialCOutIndex()

class LSTMOp : public popart::BaseOnnxRNNOp 

This op applies a single-layer LSTM with a non-linearity to a batch of input sequences.

The op follows the ONNX specification described in https://github.com/onnx/onnx/blob/main/docs/Operators.md#LSTM
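As a rough illustration of what a single LSTM time step computes, the sketch below follows the ONNX LSTM equations for the simplest possible case: hidden size 1, batch size 1, no peepholes, sigmoid gates and tanh elsewhere. ScalarLstmParams, ScalarLstmState and lstmStep are hypothetical names, the input and recurrence biases are merged into one bias per gate, and nothing here reflects how PopART or Poplar implement the op.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar parameters: one input weight (w), one recurrence
// weight (r) and one merged bias (b) for each of the input (i), output (o),
// forget (f) and cell (c) gates.
struct ScalarLstmParams {
  double wi, wo, wf, wc;
  double ri, ro, rf, rc;
  double bi, bo, bf, bc;
};

struct ScalarLstmState {
  double h; // hidden state
  double c; // cell state
};

inline double sigmoid(double v) { return 1.0 / (1.0 + std::exp(-v)); }

// One time step for input x, given the previous hidden and cell state.
ScalarLstmState lstmStep(const ScalarLstmParams &p,
                         const ScalarLstmState &prev,
                         double x) {
  const double i = sigmoid(p.wi * x + p.ri * prev.h + p.bi);
  const double f = sigmoid(p.wf * x + p.rf * prev.h + p.bf);
  const double candidate = std::tanh(p.wc * x + p.rc * prev.h + p.bc);
  const double c = f * prev.c + i * candidate; // new cell state
  const double o = sigmoid(p.wo * x + p.ro * prev.h + p.bo);
  return ScalarLstmState{o * std::tanh(c), c}; // new hidden state, cell state
}
```

Applying lstmStep once per element of a sequence, and feeding each returned state into the next call, gives the recurrence for one sequence.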

Public Functions

LSTMOp(const OperatorIdentifier &_opid, nonstd::optional<int64_t> hidden_size, ActivationFunction activation, ActivationFunction recurrent_activation, const Op::Settings &settings_, const nonstd::optional<float> available_memory_proportion_)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
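The index mapping described above can be sketched without any PopART types. Here GradOutToNonGradIn is a hypothetical alias for the kind of table gradOutToNonGradIn() returns, routeGradients is a hypothetical helper, and gradients are represented by plain strings purely for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the table returned by gradOutToNonGradIn():
// key = output index of the grad op, value = input index of the non-grad op.
using GradOutToNonGradIn = std::map<int, int>;

// Place each grad op output in the slot of the forward input whose gradient
// it is. Forward inputs that no grad output maps to keep an empty string.
std::vector<std::string> routeGradients(
    const GradOutToNonGradIn &mapping,
    const std::vector<std::string> &gradOutputs,
    int numFwdInputs) {
  std::vector<std::string> gradOfFwdInput(numFwdInputs);
  for (const auto &entry : mapping) {
    const int gradOut = entry.first;
    const int fwdIn = entry.second;
    assert(gradOut >= 0 && gradOut < static_cast<int>(gradOutputs.size()));
    assert(fwdIn >= 0 && fwdIn < numFwdInputs);
    gradOfFwdInput[fwdIn] = gradOutputs[gradOut];
  }
  return gradOfFwdInput;
}
```

With the mapping {0 -> 2, 1 -> 0}, output 0 of the grad op is treated as the gradient of forward input 2, output 1 as the gradient of forward input 0, and forward input 1 receives no gradient from this grad op.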

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

unsigned getNumChannels() const

nonstd::optional<float> getAvailableMemoryProportion() const

bool hasInitialCInput() const

bool hasOutput(OutIndex) const

virtual std::set<InIndex> optionalInputs() const final: Return the input indices of all optional inputs to the op.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

bool isTraining() const

inline virtual bool isOutlineable() const override

Check if op can be outlined.

If this method returns false, it will mean that any possible subgraph that this op is part of will not be cached.

Returns: true if the op can be outlined, false otherwise. Default: true.

virtual int getInBatchAxis(InIndex) const override

Get the batch axis for the input index.

Returns: The batch axis for the input index.

virtual int getOutBatchAxis(OutIndex) const override

Get the batch axis for the output index.

Returns: The batch axis for the output index.

inline ActivationFunction getActivation() const

inline ActivationFunction getRecurrentActivation() const

Public Static Functions

static inline InIndex getInitialCInIndex()

static inline InIndex getPeepholeInIndex()

static inline OutIndex getLastCellStateOutIndex()

static inline OutIndex getInitialHPassThroughIndex()

static inline OutIndex getInitialCPassThroughIndex()

static inline OutIndex getIntermediatesPassThroughIndex()

static inline OutIndex getInputWeightsPassThroughIndex()

static inline OutIndex getRecurrenceWeightsPassThroughIndex()

static inline OutIndex getBiasesPassThroughIndex()

class LambSquareOp : public popart::Op 

Public Functions

LambSquareOp(const Op::Settings&)

std::unique_ptr<Op> clone() const final

void setup() final

inline float getSubgraphValue() const final

inline bool isOptimizerOp() const override

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final

void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, CommGroup shardingDomain) final

void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, const ReplicaGrouping &grouping) final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class LeakyReluGradOp : public popart::Op, public popart::LeakyReluOpBaseAttributes 

Public Functions

LeakyReluGradOp(const LeakyReluOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

void appendAttributes(popart::OpSerialiserBase &os) const override

void appendOutlineAttributes(popart::OpSerialiserBase &os) const override

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getLeakyReluInIndex()

static inline InIndex getGradLeakyReluInIndex()

static inline OutIndex getOutIndex()

class LeakyReluInplaceOp : public popart::ElementWiseInplaceUnaryOp, public popart::LeakyReluOpBaseAttributes 

Public Functions

LeakyReluInplaceOp(const LeakyReluOp&)

std::unique_ptr<Op> clone() const final

void appendAttributes(popart::OpSerialiserBase &os) const override

void appendOutlineAttributes(popart::OpSerialiserBase &os) const override

class LeakyReluOp : public popart::ElementWiseUnaryOp, public popart::LeakyReluOpBaseAttributes 

Public Functions

LeakyReluOp(const OperatorIdentifier &_opid, float _alpha, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

void appendAttributes(popart::OpSerialiserBase &os) const override

void appendOutlineAttributes(popart::OpSerialiserBase &os) const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class LessOp : public popart::BinaryComparisonOp 

Public Functions

LessOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class LinearVariadicGradOp : public popart::VariadicGradOp 

Subclassed by popart::MeanArgGradOp, popart::SumArgGradOp

Public Functions

LinearVariadicGradOp(const OperatorIdentifier &_opid, const VariadicOp&, InIndex)

std::unique_ptr<Op> clone() const override

inline virtual bool hasScale() const

inline virtual float getScale() const

class Log1pGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

Log1pGradOp(const Log1pOp&)

std::unique_ptr<Op> clone() const final

class Log1pInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

Log1pInplaceOp(const Log1pOp&)

std::unique_ptr<Op> clone() const final

class Log1pOp : public popart::ElementWiseUnaryOp 

Public Functions

Log1pOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class LogGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

LogGradOp(const LogOp &fwdOp)

std::unique_ptr<Op> clone() const final

class LogOp : public popart::ElementWiseUnaryOp 

Public Functions

LogOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class LogSoftmaxGradOp : public popart::Op 

Public Functions

LogSoftmaxGradOp(const LogSoftmaxOp&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

inline int64_t getAxis() const

void appendOutlineAttributes(OpSerialiserBase&) const final

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradProbsInIndex()

static inline InIndex getActsInIndex()

static inline OutIndex getOutIndex()

class LogSoftmaxInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

LogSoftmaxInplaceOp(const LogSoftmaxOp&)

std::unique_ptr<Op> clone() const final

inline int64_t getAxis() const

void appendOutlineAttributes(OpSerialiserBase&) const override

class LogSoftmaxOp : public popart::ElementWiseUnaryOp 

Public Functions

LogSoftmaxOp(const OperatorIdentifier &_opid, int64_t axis, const Op::Settings &settings_)

std::vector<std::unique_ptr<Op>> getGradOps() final

std::unique_ptr<Op> clone() const final

int64_t getAxis() const

void appendOutlineAttributes(OpSerialiserBase&) const final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class LoopOp : public popart::SubgraphOp 

Public Functions

LoopOp(const OperatorIdentifier&, const Op::Settings&, Graph &callee_)

LoopOp(const OperatorIdentifier&, const Op::Settings&, Graph &callee_, int numImplicitScanOutputs_)

void setup() final

void appendOutlineAttributes(OpSerialiserBase&) const override

void connectInTensor(InIndex inIndex, TensorId tensorId) final

inline float getSubgraphValue() const final

std::unique_ptr<Op> clone() const override

std::vector<const Graph*> getCalledGraphs() const final

std::vector<TensorId> implicitInputTensors() const

Graph &getCalledGraph() const override

void setCalledGraph(Graph&) override

inline int getTripCountValue() const

inline void setTripCountValue(int value)

int getNumExplicitInputs() const

int getNumImplicitInputs() const

inline int getNumImplicitScanOutputs()

inline void setNumImplicitScanOutputs(int numOutputs)

InIndex subgraphInToOpInIndex(InIndex index) const override

InIndex opInToSubgraphInIndex(InIndex index) const override

OutIndex subgraphOutToOpOutIndex(OutIndex index) const override

OutIndex opOutToSubgraphOutIndex(OutIndex index) const override

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const override

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const override

void addLoopInput(InIndex index, TensorId tensorId, TensorId subgraphTensorId, bool overwrite)

Add a variadic input to the loop operator.

Parameters

index – The position at which a Tensor is consumed by the Op.
tensorId – The id of the tensor to add as an input.
subgraphTensorId – Tensor which is going to be created in the subgraph.
overwrite – If true the original tensor at index will be replaced.

void addLoopOutput(OutIndex index, TensorId tensorId, TensorId subgraphTensorId, bool overwrite)

void removeLoopInput(InIndex index)

void removeLoopOutput(OutIndex index)

inline void growAliasModel(AliasModel &m) const override

std::set<OutIndex> opInToOpOutIndex(InIndex in) const override

std::set<InIndex> opOutToOpInIndex(OutIndex out) const override

Public Static Functions

static inline InIndex getMaximumTripCountInIndex()

Indexing on the LoopOp.

Returns: The LoopOp input index for the maximum number of loop iterations

static inline InIndex getTerminationConditionInIndex()

Indexing on the LoopOp.

Returns: The LoopOp input index specifying the termination condition status

static inline InIndex getFirstInputInIndex()

Indexing on the LoopOp.

Returns: The first regular, user-defined LoopOp input index

static inline OutIndex getFirstOutputOutIndex()

Indexing on the LoopOp.

Returns: The first regular, user-defined LoopOp output index

static inline InIndex getLoopGraphIterationInIndex()

Indexing on the body graph.

Returns: The loop body graph input index specifying the current loop iteration

static inline InIndex getLoopGraphTerminationConditionInIndex()

Indexing on the body graph.

Returns: The loop body graph input index specifying the current termination condition status

static inline InIndex getLoopGraphFirstInputInIndex()

Indexing on the body graph.

Returns: The first regular, user-defined loop body graph input index

static inline OutIndex getLoopGraphTerminationConditionOutIndex()

Indexing on the body graph.

Returns: The loop body graph output index for the termination condition status after the loop body graph has run

static inline OutIndex getLoopGraphFirstOutputOutIndex()

Indexing on the body graph.

Returns: The first regular, user-defined loop body graph output index
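The roles of the indices listed above (maximum trip count, termination condition status, current iteration, and the regular user-defined inputs and outputs) can be sketched as a plain C++ loop. This is an assumed model based on the ONNX Loop operator, not PopART code: LoopBody and runLoop are hypothetical names, and loop-carried values are simplified to a vector of integers.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model of a loop body graph. Inputs: the current iteration,
// the current termination condition status and the loop-carried values.
// Outputs: the updated condition status and the updated values.
using LoopBody = std::function<std::pair<bool, std::vector<int64_t>>(
    int64_t iteration, bool condition, const std::vector<int64_t> &values)>;

// Run the body until the maximum trip count is reached or the condition
// status becomes false, and return the final loop-carried values.
std::vector<int64_t> runLoop(int64_t maxTripCount,
                             bool condition,
                             std::vector<int64_t> values,
                             const LoopBody &body) {
  assert(maxTripCount >= 0);
  for (int64_t i = 0; i < maxTripCount && condition; ++i) {
    auto result = body(i, condition, values);
    condition = result.first;          // condition status after this iteration
    values = std::move(result.second); // values carried into the next iteration
  }
  return values;
}
```

A body that adds the iteration number to a single carried value and keeps going while the value is below 3 stops after the third iteration when started from 0, even if the maximum trip count is larger.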

class LossOp : public popart::Op 

Subclassed by popart::CtcOp, popart::IdentityLossOp, popart::L1Op, popart::NllOp

Public Functions

LossOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const ReductionType reduction_)

virtual std::unique_ptr<Op> clone() const override = 0

bool isLossOp() const override

inline ReductionType getReductionType() const

Public Static Functions

static std::string reductionTypeToString(ReductionType reduction)

static ReductionType reductionTypeFromString(std::string reduction)

class LossScaleUpdateOp : public popart::Op 

Public Functions

inline LossScaleUpdateOp(const OperatorIdentifier &_opid, const DataType &updateFactorDType_, const Op::Settings &settings_)

void setup() final

inline float getSubgraphValue() const final

std::unique_ptr<Op> clone() const override

inline DataType getUpdateFactorDType() const

view::Regions aliases(InIndex in, OutIndex) const override

view::Regions modifies(InIndex) const override

void growAliasModel(AliasModel &m) const override

Public Static Functions

static inline InIndex getLossScaleUpdateFactorInIndex()

static inline InIndex getStatisticsTensorInIndex()

static inline OutIndex getUpdatedLossScaleUpdateFactorOutIndex()

class MatMulBaseGradOp : public popart::MatMulBaseOp 

Subclassed by popart::MatMulLhsGradOp, popart::MatMulRhsGradOp

Public Functions

MatMulBaseGradOp(const OperatorIdentifier &_opid, const MatMulOp &fwdOp, Phase phase)

MatMulBaseGradOp(const MatMulBaseGradOp&) = default

~MatMulBaseGradOp() override = default

virtual std::unique_ptr<Op> clone() const override = 0

const MatMulOp *getCloneOfCreator() const

inline float getSubgraphValue() const override

class MatMulBaseOp : public popart::Op 

The matmul op supports inputs of IR datatype FLOAT8_143 and FLOAT8_152.

Inputs of these types are a special case because they require an additional scalar INT32 tensor input known as the log2Scale. This argument may be used if and only if the two matmul operands are one of the FLOAT8_* types.

If the matmul has valid FLOAT8 and log2Scale inputs, then it is considered a ‘pow2 scaled matmul’. A pow2 scaled matmul is an operation of the form result := A @ B * 2^(log2Scale) where @ is the matrix multiply op. In this case, the output and partials type must be FLOAT16. Note that the multiplication by 2^(log2Scale) is handled by Poplar and is not listed as an Op in the IR.
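The arithmetic of a pow2 scaled matmul can be written out as a small reference sketch. The function pow2ScaledMatMul is hypothetical and not part of PopART; it uses plain float in place of the FLOAT8 operands and FLOAT16 result, so it shows only the formula, not the reduced-precision behaviour on the device.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference: C = (A @ B) * 2^(log2Scale), with row-major
// A of shape (m x k), B of shape (k x n) and C of shape (m x n).
std::vector<float> pow2ScaledMatMul(const std::vector<float> &a,
                                    const std::vector<float> &b,
                                    std::size_t m,
                                    std::size_t k,
                                    std::size_t n,
                                    int log2Scale) {
  assert(a.size() == m * k);
  assert(b.size() == k * n);
  std::vector<float> c(m * n, 0.0f);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (std::size_t p = 0; p < k; ++p) {
        acc += a[i * k + p] * b[p * n + j];
      }
      // std::ldexp(x, e) computes x * 2^e.
      c[i * n + j] = std::ldexp(acc, log2Scale);
    }
  }
  return c;
}
```

With a log2Scale of -1 every element of A @ B is halved, and with a log2Scale of 2 every element is multiplied by 4.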

Subclassed by popart::MatMulBaseGradOp, popart::MatMulOp

Public Types

enum class Phase

Values:

enumerator Fwd

enumerator BwdLHS

enumerator BwdRHS

Public Functions

MatMulBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const Phase phase_, const nonstd::optional<float> availableMemoryProportion_, const SerialiseSettings &serialization_, const OptionalDataType outputType_, const MatMulPartialsType partialsType_, const bool enableFullyConnectedPass_ = true)

MatMulBaseOp(const MatMulBaseOp&) = default

~MatMulBaseOp() override = default

virtual std::unique_ptr<Op> clone() const override = 0

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual Shape getExpandedLhsShape() const = 0

virtual Shape getExpandedRhsShape() const = 0

bool useFullyConnectedPass() const

inline void setUseFullyConnectedPass(bool b)

inline nonstd::optional<float> getAvailableMemoryProportion() const

inline void setAvailableMemoryProportion(const nonstd::optional<float> v)

inline const SerialiseSettings &getSerialiseSettings() const

inline SerialiseSettings &getSerialiseSettings()

inline OptionalDataType getOutputType() const

inline Phase getPhase() const

inline void setPhase(Phase p)

virtual void appendOutlineAttributes(OpSerialiserBase &os) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

virtual void appendMore(OpSerialiserBase &os) const override

Append additional attributes to the stream.

This method should be overridden if the derived class has additional attributes.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

inline MatMulPartialsType getPartialsType() const

inline void setPartialsType(const MatMulPartialsType &pt)

inline virtual bool canShard() const override

Check if the operation can be sharded into multiple operations.

Returns: true if the operation can be sharded, false otherwise.

struct SerialiseSettings

Public Types

enum class Mode

Values:

enumerator None

enumerator InputChannels

enumerator ReducingDim

enumerator OutputChannels

Public Members

Mode mode = Mode::None 

uint32_t factor = 0

bool keep_precision = false

class MatMulLhsGradOp : public popart::MatMulBaseGradOp 

Public Functions

MatMulLhsGradOp(const MatMulOp &op_)

MatMulLhsGradOp(const MatMulLhsGradOp&) = default

MatMulLhsGradOp &operator=(const MatMulLhsGradOp&) = delete

~MatMulLhsGradOp() override = default

void setup() final

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline Shape getExpandedLhsShape() const override

inline Shape getExpandedRhsShape() const override

Shape getGradInputShape() const

Shape getRhsInputShape() const

Shape getOutputShape() const

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getRhsInIndex()

static inline OutIndex getOutIndex()

class MatMulOp : public popart::MatMulBaseOp 

Public Functions

MatMulOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const nonstd::optional<float> &availableMemoryProportion, const SerialiseSettings &serialization_, const OptionalDataType &outputType, const MatMulPartialsType &partialsType_ = MatMulPartialsType::FLOAT)

MatMulOp(const MatMulOp&) = default

MatMulOp &operator=(const MatMulOp&) = delete

~MatMulOp() override = default

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

std::unique_ptr<Op> clone() const final

const Tensor *lhsIn() const

const Tensor *rhsIn() const

const Tensor *log2ScaleIn() const

const Tensor *out() const

inline Shape getExpandedLhsShape() const override

inline Shape getExpandedRhsShape() const override

inline Shape getExpandedOutShape() const

inline void setCanCreateInputs(bool value)

inline bool getCanCreateInputs() const

inline float getSubgraphValue() const final

Shape npMatMulOut(Shape lhs, Shape rhs)

bool isPow2ScaledMatMul() const

inline std::set<InIndex> optionalInputs() const override

Public Static Functions

static inline InIndex getLhsInIndex()

static inline InIndex getRhsInIndex()

static inline InIndex getLog2ScaleInIndex()

static inline OutIndex getOutIndex()

class MatMulRhsGradOp : public popart::MatMulBaseGradOp 

Public Functions

MatMulRhsGradOp(const MatMulOp &op_)

MatMulRhsGradOp(const MatMulRhsGradOp&) = default

MatMulRhsGradOp &operator=(const MatMulRhsGradOp&) = delete

~MatMulRhsGradOp() override = default

void setup() final

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline Shape getExpandedLhsShape() const override

inline Shape getExpandedRhsShape() const override

Shape getLhsInputShape() const

Shape getGradInputShape() const

Shape getOutputShape() const

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getLhsInIndex()

static inline OutIndex getOutIndex()

class MaxArgGradOp : public popart::NonLinearVariadicGradOp 

Public Functions

MaxArgGradOp(const MaxOp&, InIndex)

std::unique_ptr<Op> clone() const final

class MaxOp : public popart::VariadicOp 

Public Functions

MaxOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

class MaxPoolGradOp : public popart::Op 

Public Functions

MaxPoolGradOp(const MaxPoolOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Members

const Shape creatorSpatialK

const Shape creatorStrides

const Shape creatorLowerPads

const Shape creatorUpperPads

Public Static Functions

static inline InIndex getPrePooledInIndex()

static inline InIndex getPooledInIndex()

static inline InIndex getGradPooledInIndex()

static inline OutIndex getOutIndex()

class MaxPoolOp : public popart::HasReceptiveFieldOp 

Public Functions

MaxPoolOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &kernelShape_, int64_t storageOrder, const HasReceptiveFieldOp::ReceptiveOpAttributes &attributes, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

int64_t getNOutChans() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

bool canBeReplacedByIdentity() const override

Shape getSpatialK() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class MeanArgGradOp : public popart::LinearVariadicGradOp 

Public Functions

MeanArgGradOp(const MeanOp&, InIndex inIndex)

const std::vector<GradInOutMapper> &gradInputInfo() const final

std::unique_ptr<Op> clone() const final

inline bool hasScale() const final

inline float getScale() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

class MeanOp : public popart::VariadicOp 

Public Functions

MeanOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

class MinArgGradOp : public popart::NonLinearVariadicGradOp 

Public Functions

MinArgGradOp(const MinOp&, InIndex)

std::unique_ptr<Op> clone() const final

class MinOp : public popart::VariadicOp 

Public Functions

MinOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

class ModifyRandomSeedOp : public popart::Op 

Public Functions

ModifyRandomSeedOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

inline InIndex getSeedInIndex() const override

inline int getOutBatchAxis(OutIndex) const override

inline float getSubgraphValue() const final

inline bool isOutlineable() const final

Public Static Functions

static inline InIndex getSeedModifierInIndex()

static inline OutIndex getModifiedSeedOutIndex()

static inline TensorId getSeedInTensorId()

static inline TensorId getSeedModifierTensorId(const uint32_t modifier)

static inline TensorId getModifiedSeedTensorId(const uint32_t modifier)

class MulArg0GradOp : public popart::ElementWiseBinaryArg0GradOp 

Public Functions

MulArg0GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)

std::unique_ptr<Op> clone() const final

class MulArg1GradOp : public popart::ElementWiseBinaryArg1GradOp 

Public Functions

MulArg1GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)

std::unique_ptr<Op> clone() const final

class MulLhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp 

Public Functions

inline MulLhsInplaceOp(const Op::Settings &_settings)

std::unique_ptr<Op> clone() const final

void setup() final

class MulRhsInplaceOp : public popart::ElementWiseBinaryInplaceRhsOp 

Public Functions

inline MulRhsInplaceOp(const Op::Settings &_settings)

std::unique_ptr<Op> clone() const final

void setup() final

class MultiCollectiveBaseOp : public popart::CollectivesBaseOp 

The base class for a multi-collective which performs all-gather, all-reduce, or reduce-scatter operations on lists of tensors by first merging them into a larger tensor.

This improves bandwidth utilization and decreases the number of syncs needed.

Subclassed by popart::MultiReplicatedAllGatherOp, popart::MultiReplicatedAllReduceOp, popart::MultiReplicatedReduceScatterOp
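As a rough illustration of the merging idea, the following standalone sketch concatenates each replica's tensors into one buffer, performs a single sum all-reduce over the merged buffers, and splits the result back into tensors of the original sizes. It does not use PopART or Poplar. The names Tensor, ReplicaTensors, mergeTensors, allReduceSum, splitTensor and multiAllReduceSum are hypothetical, and replicas are simulated as entries of a std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Tensor         = std::vector<float>;  // a flat toy tensor
using ReplicaTensors = std::vector<Tensor>; // tensors held by one replica

// Concatenate one replica's tensors into a single buffer.
inline Tensor mergeTensors(const ReplicaTensors &tensors) {
  Tensor merged;
  for (const auto &t : tensors) {
    merged.insert(merged.end(), t.begin(), t.end());
  }
  return merged;
}

// Sum all-reduce over one merged buffer per replica: a single collective
// instead of one collective per tensor.
inline Tensor allReduceSum(const std::vector<Tensor> &perReplica) {
  assert(!perReplica.empty());
  Tensor result(perReplica.front().size(), 0.0f);
  for (const auto &buffer : perReplica) {
    assert(buffer.size() == result.size());
    for (std::size_t i = 0; i < buffer.size(); ++i) {
      result[i] += buffer[i];
    }
  }
  return result;
}

// Split a merged buffer back into tensors of the given sizes.
inline ReplicaTensors splitTensor(const Tensor &merged,
                                  const std::vector<std::size_t> &sizes) {
  ReplicaTensors out;
  std::size_t offset = 0;
  for (std::size_t n : sizes) {
    assert(offset + n <= merged.size());
    const auto first = merged.begin() + static_cast<std::ptrdiff_t>(offset);
    out.emplace_back(first, first + static_cast<std::ptrdiff_t>(n));
    offset += n;
  }
  return out;
}

// Merge, reduce once, then split. Every replica is assumed to hold
// tensors with the same sizes, in the same order.
inline ReplicaTensors
multiAllReduceSum(const std::vector<ReplicaTensors> &replicas) {
  assert(!replicas.empty());
  std::vector<std::size_t> sizes;
  for (const auto &t : replicas.front()) {
    sizes.push_back(t.size());
  }
  std::vector<Tensor> merged;
  for (const auto &r : replicas) {
    merged.push_back(mergeTensors(r));
  }
  return splitTensor(allReduceSum(merged), sizes);
}
```

With two simulated replicas that each hold a two-element and a one-element tensor, multiAllReduceSum performs one reduction over three elements rather than two separate reductions.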

Public Functions

MultiCollectiveBaseOp(const OperatorIdentifier &operatorIdentifier, CommGroup commGroup, const Op::Settings &settings, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet)

Constructor for the MultiCollectiveBaseOp.

Parameters

operatorIdentifier – the identifier for the constructed op
commGroup – all of the inputs will be reduce-scattered across the same communications group
settings – the settings of the op are shared across all inputs
outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedReduceScatterOp for that tensor
inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph
outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph

MultiCollectiveBaseOp(const OperatorIdentifier &operatorIdentifier, const ReplicaGrouping &grouping, const Op::Settings &settings, const std::vector<TensorInfo> &outInfoFromBaseOps, const std::vector<VGraphIdAndTileSet> &inputVirtualGraphIdAndTileSet, const std::vector<VGraphIdAndTileSet> &outputVirtualGraphIdAndTileSet)

Constructor for the MultiCollectiveBaseOp.

Parameters

operatorIdentifier – the identifier for the constructed op
grouping – all of the inputs will be reduce-scattered across the same communications group
settings – the settings of the op are shared across all inputs
outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedReduceScatterOp for that tensor
inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph
outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph

virtual std::unique_ptr<Op> clone() const override = 0

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() override

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex in) const

Get virtual graph ID and tile set associated with an input index.

Parameters: in – The input index.
Returns: The virtual graph ID and tile set at the input index.

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex out) const

Get virtual graph ID and tile set associated with an output index.

Parameters: out – The output index.
Returns: The virtual graph ID and tile set at the output index.

virtual VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex in, std::set<OpId> &visited) const override

Get virtual graph ID and tile set associated with an input index.

Parameters

in – The input index.
visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.

Returns

The virtual graph ID and tile set at the input index.

virtual VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex out, std::set<OpId> &visited) const override

Get virtual graph ID and tile set associated with an output index.

Parameters

out – The output index.
visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.

Returns

The virtual graph ID and tile set at the output index.

bool hasCorrespondingLinkedIndexTensor(Tensor *t) override

Tensor *getCorrespondingLinkedIndexTensor(Tensor *t) override

bool isCollectiveLinkedIndexTensor(InIndex in) const override

bool isCollectiveLinkedIndexTensor(Tensor *t) const override

virtual void growAliasModel(AliasModel &m) const override

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters: m – The alias model, which holds the mapping between this op’s (PopART) graph and the Poprithms graph.
Pre: All input tensors of this op have mappings in m before this method is called.
Post: All output tensors of this op have mappings in m after this method is called.

class MultiConvBaseOp : public popart::Op 

Subclassed by popart::ConvOp, popart::MultiConvOp

Public Functions

MultiConvBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, std::vector<int64_t> flatStrides_, std::vector<int64_t> flatPads_, std::vector<int64_t> flatDilations_, const AutoPad &padType_, const MultiConvOptions &convOpts_)

std::unique_ptr<Op> clone() const override

void setup() override

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

inline virtual int numConvs() const

inline int64_t getNSpatialDims(int convIndex) const

inline Shape getSpatialD(int convIndex) const

inline Shape getSpatialK(int convIndex) const

inline int64_t getGroups(int convIndex) const

inline int64_t getNOutChans(int convIndex) const

inline int64_t getNInChans(int convIndex) const

inline Shape lowerPads(int convIndex) const

inline Shape upperPads(int convIndex) const

inline Shape lowerOutPads(int convIndex) const

inline Shape upperOutPads(int convIndex) const

Shape getOutShape(int convIndex, const ConvPads &pads) const

ConvParameters getParameters(int convIndex) const

virtual void setParamsFromDataGradOp(const Op *dataGradOp)

virtual void restoreAttributesFromParams(const std::vector<ConvParameters>&)

inline const MultiConvOptions &getConvOptions() const

inline void setConvOptions(const MultiConvOptions &opts)

int64_t getCumulativeSpatialDims(int64_t i) const

ConvStrides getStrides(int64_t convIndex) const

ConvPads getPads(int64_t convIndex) const

ConvPads getOutPads(int64_t convIndex) const

ConvDilations getDilations(int64_t convIndex) const

ConvDilations getInDilations(int64_t convIndex) const

Shape lowerKernTruncs(int64_t convIndex) const

Shape upperKernTruncs(int64_t convIndex) const

Shape lowerInTruncs(int64_t convIndex) const

Shape upperInTruncs(int64_t convIndex) const

Shape lowerOutTruncs(int64_t convIndex) const

Shape upperOutTruncs(int64_t convIndex) const

Public Static Functions

static void appendConvParameterAttributes(const ConvParameters&, const std::string&, OpSerialiserBase&)

static inline InIndex getDataInIndex(int convIndex)

static inline InIndex getWeightsInIndex(int convIndex)

static inline OutIndex getOutIndex(int convIndex)

static inline int getConvIndexFromInIndex(InIndex index)

class MultiConvDataGradBaseOp : public popart::Op 

Subclassed by popart::ConvDataGradOp, popart::MultiConvDataGradOp

Public Functions

MultiConvDataGradBaseOp(const MultiConvBaseOp&, const OperatorIdentifier&)

std::unique_ptr<Op> clone() const override

void setup() final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

inline const std::vector<GradInOutMapper> &gradInputInfo() const final

inline const std::map<int, int> &gradOutToNonGradIn() const final

inline const ConvParameters &getParameters(int convIndex) const

inline virtual int numConvs() const

inline const MultiConvOptions &getConvOptions() const

inline void setConvOptions(const MultiConvOptions &opts)

inline TensorInfo getDataInfo(int convIndex) const

Public Static Functions

static inline InIndex getWeightsInIndex(int convIndex)

static inline InIndex getGradConvolvedInIndex(int convIndex)

static inline OutIndex getOutIndex(int convIndex)

class MultiConvDataGradOp : public popart::MultiConvDataGradBaseOp 

Public Functions

MultiConvDataGradOp(const MultiConvOp&)

std::unique_ptr<Op> clone() const final

void appendOutlineAttributes(OpSerialiserBase&) const final

class MultiConvOp : public popart::MultiConvBaseOp 

Public Functions

MultiConvOp(const OperatorIdentifier &_opid, const Settings &settings_, const std::vector<int64_t> &flatStrides_, const std::vector<int64_t> &flatPads_, const std::vector<int64_t> &flatDilations_, const MultiConvOptions &mcOpts_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void appendOutlineAttributes(OpSerialiserBase&) const final

class MultiConvWeightsGradBaseOp : public popart::Op 

Subclassed by popart::ConvWeightsGradOp, popart::MultiConvWeightsGradOp

Public Functions

MultiConvWeightsGradBaseOp(const MultiConvBaseOp&, const OperatorIdentifier&)

std::unique_ptr<Op> clone() const override

void setup() final

inline const std::vector<GradInOutMapper> &gradInputInfo() const final

inline const std::map<int, int> &gradOutToNonGradIn() const final

inline float getSubgraphValue() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline virtual int numConvs() const

inline const ConvParameters &getParameters(int convIndex) const

inline const MultiConvOptions &getConvOptions() const

Public Static Functions

static inline InIndex getGradConvolvedInIndex(int convIndex)

static inline InIndex getPreConvolvedInIndex(int convIndex)

static inline OutIndex getOutIndex(int convIndex)

class MultiConvWeightsGradOp : public popart::MultiConvWeightsGradBaseOp 

Public Functions

MultiConvWeightsGradOp(const MultiConvOp&)

std::unique_ptr<Op> clone() const final

void appendOutlineAttributes(OpSerialiserBase&) const final

class MultiExchangeOp : public popart::ExchangeBaseOp 

Public Functions

MultiExchangeOp(const OperatorIdentifier&, const Op::Settings&, const std::vector<ExchangeDescriptor>)

std::unique_ptr<Op> clone() const final

void setup() final

view::Regions modifies(InIndex) const final

view::Regions aliases(InIndex, OutIndex) const final

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

void appendOutlineAttributes(OpSerialiserBase&) const override

int numLoads() const

int numStores() const

inline bool isRemote(int index)

inline void setRemoteBufferId(int index, RemoteBufferId remotebuffer_id)

inline RemoteBufferId getRemoteBufferId(int index) const

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const final

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const final

inline void growAliasModel(AliasModel &m) const override

inline bool canShard() const final

bool hasSideEffect() const final

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

inline int getNumExchanges() const final

ExchangeDescriptor getExchangeDescriptor(int index) const final

std::pair<int, int> inIndexToDescriptorIndex(InIndex index) const override

Map input index to a tuple of integers (a,b) that corresponds to the input associated with index.

That is, the bth input of getExchangeDescriptor(a) corresponds to the input at index.

Parameters: index – the input index to look up.
Returns: a pair of integers comprising the index of the descriptor and the position of the input within that descriptor.
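The (a,b) mapping can be sketched as follows. This is a standalone illustration, not the PopART implementation: toyInIndexToDescriptorIndex and inputsPerDescriptor are hypothetical names, and the sketch assumes that the inputs of each descriptor occupy consecutive input indices in descriptor order, which the text above does not state.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// inputsPerDescriptor[a] is the (hypothetical) number of inputs consumed by
// descriptor a. Assumption: inputs are laid out contiguously in descriptor
// order. Returns {a, b}, where the b-th input of descriptor a is the input
// at `index`, or {-1, -1} if `index` is out of range.
inline std::pair<int, int>
toyInIndexToDescriptorIndex(int index,
                            const std::vector<int> &inputsPerDescriptor) {
  assert(index >= 0);
  int remaining = index;
  for (int a = 0; a < static_cast<int>(inputsPerDescriptor.size()); ++a) {
    if (remaining < inputsPerDescriptor[a]) {
      return {a, remaining};
    }
    remaining -= inputsPerDescriptor[a];
  }
  return {-1, -1};
}
```

For example, with three descriptors that take 2, 1 and 3 inputs respectively, input index 4 falls on the second input (position 1) of the third descriptor (index 2).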

std::pair<int, int> outIndexToDescriptorIndex(OutIndex index) const override

Map output index to a tuple of integers (a,b) that corresponds to the output associated with index.

That is, the bth output of getExchangeDescriptor(a) corresponds to the output at index.

Parameters: index – the output index to look up.
Returns: a pair of integers comprising the index of the descriptor and the position of the output within that descriptor.

std::vector<InIndex> descriptorIndexToInIndices(int index) const override

std::vector<OutIndex> descriptorIndexToOutIndices(int index) const override

class MultiReplicatedAllReduceOp : public popart::MultiCollectiveBaseOp 

A multi-collective class for performing an all-reduce operation on a list of tensors.

The tensors will be merged into a single large tensor and reduced as one, leading to better bandwidth utilization and fewer syncs between replicas than doing the all-reduce on a per-tensor basis. The class supports mixing in-place and out-of-place all-reduce operations, but requires that all tensors use the same collective group, that is, the reduction is over the same replicas. This op is usually constructed in the MergeCollectivesTransform.

Public Functions

MultiReplicatedAllReduceOp(CollectiveOperator collectiveOperator, CommGroup commGroup, const Settings &settings, std::vector<bool> modifiesIndexInplace, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet)

Constructor for the MultiReplicatedAllReduceOp.

Parameters

collectiveOperator – the collective operator is the same for all input tensors
commGroup – all of the inputs will be reduced across the same communications group
settings – the settings of the op are shared across all inputs
modifiesIndexInplace – for each of the inputs, specify whether it should be modified in place
outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedAllReduceOp for that tensor
inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph
outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph

MultiReplicatedAllReduceOp(CollectiveOperator collectiveOperator, const ReplicaGrouping &grouping, const Settings &settings, const std::vector<bool> &modifiesIndexInplace, const std::vector<TensorInfo> &outInfoFromBaseOps, const std::vector<VGraphIdAndTileSet> &inputVirtualGraphIdAndTileSet, const std::vector<VGraphIdAndTileSet> &outputVirtualGraphIdAndTileSet)

Constructor for the MultiReplicatedAllReduceOp.

Parameters

collectiveOperator – the collective operator is the same for all input tensors
grouping – all of the inputs will be reduced across the same communications group
settings – the settings of the op are shared across all inputs
modifiesIndexInplace – for each of the inputs, specify whether it should be modified in place
outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedAllReduceOp for that tensor
inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph
outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

inline CollectiveOperator getCollectiveOp() const

Returns the type of the collective used in the all-reduce, for example addition. The same collective operator is used across all the inputs to be reduced.

bool hasCorrespondingLinkedIndexTensor(Tensor *t) override

Tensor *getCorrespondingLinkedIndexTensor(Tensor *t) override

bool isCollectiveLinkedIndexTensor(InIndex in) const override

bool isCollectiveLinkedIndexTensor(Tensor *t) const override

virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override: Return which inputs and outputs are replicated tensor sharding pairs.

virtual view::Regions modifies(InIndex index) const override

Return the input region which this op modifies (for inplace ops).

Parameters: index – The input index.
Returns: The regions which this op modifies.

virtual view::Regions aliases(InIndex in, OutIndex out) const override

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters

in – The input index.
out – The output index.

Returns

The regions which the output will alias.

virtual void growAliasModel(AliasModel &m) const override

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters: m – The alias model, which holds the mapping between this op’s (PopART) graph and the Poprithms graph.
Pre: All input tensors of this op have mappings in m before this method is called.
Post: All output tensors of this op have mappings in m after this method is called.

virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

Determine whether output tensors are guaranteed to have an equal value across all replicas.

This means that they are “replica equal”. The check is based on information about the replica equal status of input tensors (and the same for any inputs that are modified by the op).

The default implementation sets each output tensor as being replica-equal if and only if all tensor inputs are replica-equal. For modified inputs, the default is to assume it is replica-equal only if there is an output that is deemed replica-equal that fully aliases all elements of the input. This default implementation is not correct for all ops. Ops that need a specialized implementation should override this virtual function.

Parameters

aliasModel – An alias model object.
inputMap – A map that stores, for each input, whether the inputs are data-equivalent over all replicas.
proxy – A helper object passed in by the replica-equal analysis.

Returns

A tuple comprising:

a mapping from output index to a replica-equal status with an entry for each output tensor.
a vector of input indices for inputs that were modified by the op to a value that is not replica-equal.
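The default rule for outputs described above can be sketched in standalone C++. ToyReplEqMap and toyDefaultFwdPropagate are hypothetical names, the maps use plain int indices rather than the PopART types, and the handling of modified inputs is deliberately left out of the sketch.

```cpp
#include <cassert>
#include <map>

// Hypothetical map from a tensor index to its replica-equal status.
using ToyReplEqMap = std::map<int, bool>;

// Default rule for outputs only: every output is marked replica-equal
// if and only if all inputs are replica-equal.
inline ToyReplEqMap toyDefaultFwdPropagate(const ToyReplEqMap &inputs,
                                           int numOutputs) {
  assert(numOutputs >= 0);
  bool allInputsEqual = true;
  for (const auto &entry : inputs) {
    allInputsEqual = allInputsEqual && entry.second;
  }
  ToyReplEqMap outputs;
  for (int out = 0; out < numOutputs; ++out) {
    outputs[out] = allInputsEqual;
  }
  return outputs;
}
```

An op such as an all-reduce is a case where this default is too conservative, since its output can be equal across the replicas in the group even when its inputs are not; that kind of op is where a specialized override would apply.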

class NearbyIntOp : public popart::OneWayUnaryOp 

Public Functions

std::unique_ptr<Op> clone() const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class NegateGradOp : public popart::NegateOp 

Public Functions

NegateGradOp(const NegateOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class NegateOp : public popart::ElementWiseUnaryOp 

Subclassed by popart::NegateGradOp

Public Functions

NegateOp(const OperatorIdentifier &_opid, const Op::Settings&)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class NllGradOp : public popart::Op 

Public Functions

NllGradOp(const NllOp&)

NllGradOp(const TensorId &lossId, const nonstd::optional<int> ignoreIndex, const ReductionType reduction, const bool inputIsLogProbability, const Op::Settings &settings)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

inline float getSubgraphValue() const final

inline ReductionType getReductionType() const

inline bool hasIgnoreIndex() const

inline nonstd::optional<int> getOptionalIgnoreIndex() const

int getIgnoreIndex() const

inline bool inputIsLogProbability() const

inline TensorId getLossTensorId() const

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

inline bool canShard() const override

float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const override

Public Static Functions

static inline InIndex getProbsInIndex()

static inline InIndex getLabelInIndex()

static inline InIndex getGradInIndex()

static inline OutIndex getOutIndex()

class NllOp : public popart::LossOp 

Public Functions

NllOp(const OperatorIdentifier &_opid, const nonstd::optional<int> ignoreIndex, const ReductionType reduction, bool inputIsLogProbability, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getSubgraphValue() const final

inline bool hasIgnoreIndex() const

inline nonstd::optional<int> getOptionalIgnoreIndex() const

int getIgnoreIndex() const

inline bool inputIsLogProbability() const

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

inline bool canShard() const override

inline ReductionType getShardReductionType(OutIndex index) const override

Public Static Functions

static inline InIndex getProbsInIndex()

static inline InIndex getLabelInIndex()

static inline OutIndex getOutIndex()

class NlllWithSoftmaxGradDirectOp : public popart::Op 

Public Functions

NlllWithSoftmaxGradDirectOp(const nonstd::optional<int> ignoreIndex, const ReductionType reduction, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

void setup() final

Op *nlllFwdOp() const

inline float getSubgraphValue() const final

inline ReductionType getReductionType() const

inline bool hasIgnoreIndex() const

inline int getIgnoreIndex() const

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

inline bool canShard() const override

inline ReductionType getShardReductionType(OutIndex index) const override

float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const override

Public Static Functions

static inline InIndex getProbsInIndex()

static inline InIndex getLabelInIndex()

static inline InIndex getGradProbsInIndex()

static inline OutIndex getLossOutIndex()

static inline OutIndex getGradOutIndex()

class NonLinearVariadicGradOp : public popart::VariadicGradOp 

Subclassed by popart::MaxArgGradOp, popart::MinArgGradOp

Public Functions

NonLinearVariadicGradOp(const OperatorIdentifier &_opid, const VariadicOp&, InIndex)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInIndex()

static inline InIndex getFwdOutInIndex()

class NopOp : public popart::ElementWiseUnaryOp 

Public Functions

NopOp(const OperatorIdentifier&, const Op::Settings&)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

inline bool isOutplaceViewChange() const override

class NotOp : public popart::ElementWiseUnaryOp 

Public Functions

NotOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class NormalizeImageOp : public popart::Op 

Public Functions

NormalizeImageOp(const popart::OperatorIdentifier &_opid, const popart::Op::Settings &settings_, float _scale)

std::unique_ptr<Op> clone() const override

void setup() override

inline float getSubgraphValue() const final

inline bool canShard() const override

const popart::Tensor *imgIn() const

const popart::Tensor *offsetsIn() const

const popart::Tensor *scalesIn() const

inline float getScale() const

bool canBeReplacedByIdentity() const override

bool verifyInputShapes(popart::Shape imgShape)

popart::Shape paddedShape(popart::Shape imgShape, popart::Shape offsetsShape, popart::Shape scalesShape)

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)

static inline popart::InIndex getImageInIndex()

static inline popart::InIndex getOffsetsInIndex()

static inline popart::InIndex getScalesInIndex()

static inline popart::OutIndex getOutIndex()

static inline std::string opName()

class OneWayUnaryInPlaceOp : public popart::ElementWiseInplaceUnaryOp 

Subclassed by popart::CeilInplaceOp, popart::FloorInplaceOp, popart::NearbyIntInplaceOp, popart::RoundInplaceOp, popart::SignInplaceOp

Public Functions

OneWayUnaryInPlaceOp(const OperatorIdentifier&, const Op::Settings&)

std::unique_ptr<Op> clone() const override

class OneWayUnaryOp : public popart::ElementWiseUnaryOp 

Subclassed by popart::CeilOp, popart::FloorOp, popart::NearbyIntOp, popart::RoundOp, popart::SignOp

Public Functions

OneWayUnaryOp(const OperatorIdentifier&, const Op::Settings&)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class OnehotGradOp : public popart::Op 

Public Functions

OnehotGradOp(const OnehotOp &fwdOp_)

std::unique_ptr<Op> clone() const final

void setup() override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline const Shape &getOutputShape() const

inline int64_t getAxis() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getIndicesInIndex()

static inline OutIndex getOutIndex()

class OnehotOp : public popart::Op 

Public Functions

OnehotOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() override

void appendOutlineAttributes(OpSerialiserBase&) const override

void connectInTensor(InIndex inIndex, TensorId tenId) override

inline int64_t getAxis() const

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getIndicesInIndex()

static inline InIndex getValuesInIndex()

static inline OutIndex getOutIndex()

class OrOp : public popart::BinaryComparisonOp 

Public Functions

OrOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class PReluOp : public popart::ElementWiseBinaryOp 

Public Functions

PReluOp(const OperatorIdentifier &opid_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

class PackedDataBlockOp : public popart::Op 

Public Functions

PackedDataBlockOp(const OperatorIdentifier&, const std::vector<int64_t> &maxSequenceLengths, int64_t resultSize, int64_t callbackBatchSize, Graph &callback, const Op::Settings&)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

void appendOutlineAttributes(OpSerialiserBase&) const final

inline float getSubgraphValue() const final

Graph &getCalledGraph() const

void setCalledGraph(Graph&)

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const override

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const override

int64_t numCallbackInputs() const

int64_t numDataInputs() const

int64_t getCallbackIterations() const

std::vector<PackedSequences> getPackedInputs()

PackedSequences getPackedOutput()

inline InIndex dataIndex(InIndex i)

inline InIndex offsetsIndex(InIndex i)

inline InIndex lengthsIndex(InIndex i)

inline int64_t getCallbackBatchSize()

inline std::vector<int64_t> getMaxSequenceLengths()

inline int64_t getMaxSequenceLength(int64_t dataIndex)

std::vector<TensorInfo> callbackSequenceInInfos()

class PadGradOp : public popart::SliceOp 

Public Functions

PadGradOp(const PadOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class PadInplaceOp : public popart::BasePadOp 

Public Functions

PadInplaceOp(const BasePadOutplaceOp&)

std::unique_ptr<Op> clone() const final

inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

view::Regions aliases(InIndex, OutIndex) const override

view::Regions uses(InIndex index) const override

class PadOp : public popart::BasePadOutplaceOp 

Public Functions

PadOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_pads, const std::vector<unsigned> &_flips, float value_, const std::string &_mode, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void connectInTensor(InIndex inIndex, TensorId tenId) final

template<typename TDerivedOp, typename TOpParams> class ParameterizedOp : public popart::Op 

Generic base class for simple ops with parameterized attributes.

This class gathers the logic that is common to custom op implementations. In particular, it requires all parameters and attributes to be collected into a single data structure, which makes the rest of the code easier to generalise.

Template Parameters

TDerivedOp – CRTP template type.
TOpParams – Structure containing the op parameters.

Public Types

using ParamsType = TOpParams 

Public Functions

inline ParameterizedOp(const popart::OperatorIdentifier &_opid, const ParamsType &_params, const popart::Op::Settings &_settings)

Construct a custom op.

Parameters

_opid – Operator id (default one if not provided).
_params – Operation parameters.
_settings – Settings.

inline ParameterizedOp(const ParamsType &_params, const popart::Op::Settings &_settings)

template<typename T> inline ParameterizedOp(const popart::OperatorIdentifier &_opid, const ParameterizedOp<T, TOpParams> &_op)

Construct a custom op from another op with same parameters.

Typically, this constructor builds a grad op from a fwd op.

Template Parameters

T – Op input type.

Parameters

_opid – Operator identifier (default one if not provided).
_op – Operation to extract settings and parameters from.

template<typename T> inline ParameterizedOp(const ParameterizedOp<T, TOpParams> &_op)

inline virtual std::unique_ptr<Op> clone() const override

Clone the operator.

Note: the generic implementation relies on the CRTP (curiously recurring template pattern).

Returns: A unique pointer (std::unique_ptr<Op>) to the cloned op.
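The following is a minimal sketch of how a CRTP base class can provide a generic clone(), assuming nothing about the real PopART implementation. All names in it (BaseOp, ParamOpSketch, ScaleParams, ScaleOpSketch, cloneExample) are hypothetical stand-ins and are not part of the PopART API.

```cpp
#include <cassert>
#include <memory>

// Hypothetical minimal op interface; NOT popart::Op.
struct BaseOp {
  virtual ~BaseOp() = default;
  virtual std::unique_ptr<BaseOp> clone() const = 0;
  virtual int paramValue() const = 0;
};

// Hypothetical CRTP base: TDerived is the concrete op type,
// TParams is the structure holding the op parameters.
template <typename TDerived, typename TParams>
class ParamOpSketch : public BaseOp {
public:
  explicit ParamOpSketch(const TParams &params) : params_(params) {}

  // Generic clone: the base knows the concrete type through TDerived,
  // so it can copy-construct it without each derived op writing clone().
  std::unique_ptr<BaseOp> clone() const override {
    return std::make_unique<TDerived>(static_cast<const TDerived &>(*this));
  }

  const TParams &params() const { return params_; }

private:
  TParams params_;
};

// Hypothetical parameter structure and concrete op.
struct ScaleParams {
  int factor;
};

class ScaleOpSketch : public ParamOpSketch<ScaleOpSketch, ScaleParams> {
public:
  explicit ScaleOpSketch(const ScaleParams &p)
      : ParamOpSketch<ScaleOpSketch, ScaleParams>(p) {}
  int paramValue() const override { return params().factor; }
};

// Usage: the clone is a distinct object that carries the same parameters.
inline void cloneExample() {
  ScaleOpSketch op(ScaleParams{3});
  std::unique_ptr<BaseOp> copy = op.clone();
  assert(copy.get() != &op);
  assert(copy->paramValue() == 3);
}
```

In this sketch the derived op only supplies its parameters and behaviour; copying is handled once in the base class.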

inline virtual void appendAttributes(popart::OpSerialiserBase &os) const override

Append attributes when serialising the op to a stream.

This is used for debugging and also to generate the PopART IR hash. This hash is used to determine whether a Poplar cache can be reused so it is important that op attributes which may alter the Poplar compilation are appended to this stream. If this method is overridden, then it must also call the base class method.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.
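As an illustration of why every attribute that affects compilation should be appended, and why an override should call the base class method, here is a small sketch. It assumes a simplified string-based serialiser; SerialiserSketch, AttrBaseSketch, AttrScaleSketch and cacheKey are hypothetical names, not PopART classes, and the sketch compares the serialised text directly instead of computing a hash.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a serialiser stream; NOT popart::OpSerialiserBase.
class SerialiserSketch {
public:
  void appendAttribute(const std::string &name, const std::string &value) {
    stream_ << name << '=' << value << ';';
  }
  void appendAttribute(const std::string &name, float value) {
    stream_ << name << '=' << value << ';';
  }
  std::string str() const { return stream_.str(); }

private:
  std::ostringstream stream_;
};

// Hypothetical base op: serialises what every op has in common.
struct AttrBaseSketch {
  virtual ~AttrBaseSketch() = default;
  virtual void appendAttributes(SerialiserSketch &os) const {
    os.appendAttribute("opid", "ScaleSketch");
  }
};

// Hypothetical derived op with one parameter that changes the compiled code.
struct AttrScaleSketch : AttrBaseSketch {
  explicit AttrScaleSketch(float s) : scale(s) {}
  float scale;

  void appendAttributes(SerialiserSketch &os) const override {
    AttrBaseSketch::appendAttributes(os); // keep the base class attributes
    os.appendAttribute("scale", scale);   // own attribute must be included
  }
};

// The serialised text plays the role of the cache key in this sketch.
inline std::string cacheKey(const AttrBaseSketch &op) {
  SerialiserSketch os;
  op.appendAttributes(os);
  return os.str();
}

// Usage: equal parameters give equal keys, different parameters do not.
inline void cacheKeyExample() {
  assert(cacheKey(AttrScaleSketch(2.0f)) == cacheKey(AttrScaleSketch(2.0f)));
  assert(cacheKey(AttrScaleSketch(2.0f)) != cacheKey(AttrScaleSketch(0.5f)));
}
```

If the derived op did not append scale, both ops in the usage function would produce the same key, and a cached result for one could wrongly be reused for the other.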

inline virtual void appendOutlineAttributes(popart::OpSerialiserBase &os) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

inline virtual float getSubgraphValue() const override

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

inline virtual bool requiresRandomSeed() const override

Check if the op requires a random seed.

This is set to false by default and should be overridden to return true if the op requires an IPU random seed tensor. If so, the seed tensor is connected to inTensor(getSeedInIndex()) by the IR process.

Returns: true if the op requires a random seed, false otherwise.

inline const TOpParams &params() const

Returns: Custom op parameters.

Public Static Functions

static inline std::unique_ptr<TDerivedOp> createOpFromCreatorInfo(const popart::OpCreatorInfo &info)

Build the op from a PopART OpCreatorInfo data structure.

Parameters: info – The OpCreatorInfo to use.
Returns: Unique ptr of the op created.

static inline TDerivedOp *createOpInGraph(popart::Graph &graph, const std::map<popart::InIndex, popart::TensorId> &in, const std::map<popart::OutIndex, popart::TensorId> &out, const popart::OperatorIdentifier &opid, const TOpParams &params, const popart::Op::Settings &settings)

Create the custom op connected in a graph.

Parameters

graph – Graph where to create and connect the op.
in – Map of input tensor ids (that is, names).
out – Map of output tensor ids (that is, names).
opid – PopART operator identifier (default one if not provided).
params – Custom op parameters.
settings – Custom op settings.

Returns

Pointer to the custom op created (owned by the graph?)

static inline TDerivedOp *createOpInGraph(popart::Graph &graph, const std::map<popart::InIndex, popart::TensorId> &in, const std::map<popart::OutIndex, popart::TensorId> &out, const TOpParams &params, const popart::Op::Settings &settings)

class PlaceholderOp : public popart::Op 

Public Functions

PlaceholderOp(const OperatorIdentifier&, const Op::Settings&)

std::unique_ptr<Op> clone() const override

inline float getSubgraphValue() const final

class PopartLSTMGradOp : public popart::Op 

Gradient operator for PopartLSTMOp.

Public Functions

PopartLSTMGradOp(const PopartLSTMOp&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual const std::vector<GradInOutMapper> &gradInputInfo() const final

Get the mapping between input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

Returns: The mapping between input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

virtual const std::map<int, int> &gradOutToNonGradIn() const final

Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

virtual std::set<InIndex> optionalInputs() const final: Return the input indices of all optional inputs to the op.

int64_t getInputSize() const

int64_t getMaxSeqLength() const

int64_t getBatchSize() const

int64_t getHiddenSize() const

inline ActivationFunction getActivation() const

inline ActivationFunction getRecurrentActivation() const

Public Members

const bool outputFullSequence

Public Static Functions

static inline InIndex getInitialStateInIndex()

static inline InIndex getIntermediatesInIndex()

static inline InIndex getWeightsInIndex()

static inline InIndex getBiasesInIndex()

static inline InIndex getSequenceLensInIndex()

static inline InIndex getInputInIndex()

static inline InIndex getFwdOutputInIndex()

static inline InIndex getFwdOutputGradInIndex()

static inline InIndex getFwdCellStateGradInIndex()

static inline OutIndex getInputOutIndex()

static inline OutIndex getWeightsOutIndex()

static inline OutIndex getBiasesOutIndex()

static inline OutIndex getInitialStateOutIndex()

class PopartLSTMOp : public popart::Op 

Public Functions

PopartLSTMOp(const OperatorIdentifier&, bool outputFullSequence_, const Op::Settings&, const nonstd::optional<float> available_memory_proportion_ = nonstd::nullopt)

PopartLSTMOp(const OperatorIdentifier&, bool outputFullSequence_, ActivationFunction activation, ActivationFunction recurrent_activation, const Op::Settings&, const nonstd::optional<float> available_memory_proportion_ = nonstd::nullopt)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getSubgraphValue() const final

bool hasBiasesInput() const

std::set<InIndex> optionalInputs() const final

bool hasSeqLenInput() const

int64_t getMaxSeqLength() const

int64_t getBatchSize() const

int64_t getInputSize() const

int64_t getHiddenSize() const

int64_t getNumIntermediates() const

nonstd::optional<float> getAvailableMemoryProportion() const

int getInBatchAxis(InIndex) const override

int getOutBatchAxis(OutIndex) const override

inline ActivationFunction getActivation() const

inline ActivationFunction getRecurrentActivation() const

Public Members

const bool outputFullSequence

Public Static Functions

static inline InIndex getInputInIndex()

static inline InIndex getWeightsInIndex()

static inline InIndex getBiasesInIndex()

static inline InIndex getInitialStateInIndex()

static inline InIndex getSequenceLensInIndex()

static inline OutIndex getOutputOutIndex()

static inline OutIndex getCellStateOutIndex()

static inline OutIndex getIntermediatesOutIndex()

class PowArg0GradOp : public popart::ElementWiseBinaryArg0GradOp 

Public Functions

PowArg0GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)

std::unique_ptr<Op> clone() const final

class PowArg1GradOp : public popart::ElementWiseBinaryArg1GradOp 

Public Functions

PowArg1GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)

std::unique_ptr<Op> clone() const final

class PowLhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp 

Public Functions

inline PowLhsInplaceOp(const Op::Settings &_settings)

std::unique_ptr<Op> clone() const final

class PrintTensorOp : public popart::ElementWiseUnaryOp 

Public Functions

PrintTensorOp(const OperatorIdentifier&, bool printSelf, bool printGradient, const std::string &title, const Op::Settings&)

PrintTensorOp(const OperatorIdentifier&, bool printSelf, bool printGradient, const std::string &title, const PrintTensorFmt &fmt, const Op::Settings&)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void appendOutlineAttributes(OpSerialiserBase &os) const final

inline bool canBeReplacedByIdentity() const final

inline bool hasSideEffect() const override

inline bool shouldPrint() const

inline const std::string &getTitle() const

inline void setTitle(std::string title_)

inline const PrintTensorFmt &getFmt() const

class RMSPropUpdaterOp : public popart::Op 

Public Functions

RMSPropUpdaterOp(OptimizerValue eps, bool TFVariant, const Op::Settings&)

std::unique_ptr<Op> clone() const final

void setup() final

void appendOutlineAttributes(OpSerialiserBase&) const final

inline float getSubgraphValue() const final

inline bool isOptimizerOp() const override

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final

Public Members

const OptimizerValue initEps

const bool TFVariant

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getAccl1InIndex()

static inline InIndex getAccl2InIndex()

static inline InIndex getEpsInIndex()

static inline OutIndex getUpdaterOutIndex()

class RNNGradOp : public popart::BaseOnnxRNNGradOp 

Gradient operator for RNNOp.

Public Functions

RNNGradOp(const RNNOp&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::set<InIndex> optionalInputs() const final: Return the input indices of all optional inputs to the op.

Public Members

const ActivationFunction activation_attribute

class RNNOp : public popart::BaseOnnxRNNOp 

This op applies a single-layer Elman RNN with a non-linearity to a batch of input sequences.

The op follows the ONNX specification described in https://github.com/onnx/onnx/blob/main/docs/Operators.md#RNN

For each batch element, the following output is computed:

\[ h_t = f(W x_t + b_x + R h_{t-1} + b_h) \]

where:

\(f\) is a supported nonlinearity function
\(W\) is the input weight
\(x_t\) is the t’th element of the input sequence
\(R\) is the recurrence weight matrix
\(h_{t-1}\) is the previous output sequence element. \(h_0\) can be provided by the user
\(b_x\) and \(b_h\) are the input and recurrence biases respectively

The op outputs the full sequence \(h_1, h_2, ...\), as well as the last element of the sequence.

If the biases or \(h_0\) are not set, they are treated as constant zeros in the model and are not trained.
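The recurrence above can be sketched in plain C++ as follows. This is only an illustration of the formula for a single batch element, not the PopART implementation: the helper names (Vec, Mat, matVec, rnnStep, rnnSequence, rnnExample), the row-major matrix layout and the choice to pass the activation \(f\) as a function argument are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>; // row-major: Mat[row][col]

// Matrix-vector product y = M v.
inline Vec matVec(const Mat &m, const Vec &v) {
  Vec y(m.size(), 0.0f);
  for (std::size_t r = 0; r < m.size(); ++r) {
    assert(m[r].size() == v.size());
    for (std::size_t c = 0; c < v.size(); ++c) {
      y[r] += m[r][c] * v[c];
    }
  }
  return y;
}

// One step: h_t = f(W x_t + b_x + R h_{t-1} + b_h).
inline Vec rnnStep(const Mat &W, const Mat &R, const Vec &bx, const Vec &bh,
                   const Vec &x, const Vec &hPrev,
                   const std::function<float(float)> &f) {
  const Vec wx = matVec(W, x);
  const Vec rh = matVec(R, hPrev);
  assert(bx.size() == wx.size() && bh.size() == wx.size());
  Vec h(wx.size());
  for (std::size_t i = 0; i < h.size(); ++i) {
    h[i] = f(wx[i] + bx[i] + rh[i] + bh[i]);
  }
  return h;
}

// Full output sequence h_1..h_T; the last entry is the final output.
inline std::vector<Vec> rnnSequence(const Mat &W, const Mat &R, const Vec &bx,
                                    const Vec &bh, const std::vector<Vec> &xs,
                                    Vec h,
                                    const std::function<float(float)> &f) {
  std::vector<Vec> out;
  for (const Vec &x : xs) {
    h = rnnStep(W, R, bx, bh, x, h, f);
    out.push_back(h);
  }
  return out;
}

// Usage with the identity as f, so the arithmetic can be checked by hand.
inline void rnnExample() {
  const Mat W = {{1.0f, 2.0f}, {3.0f, 4.0f}};
  const Mat R = {{1.0f, 0.0f}, {0.0f, 1.0f}};
  const Vec bx = {0.0f, 1.0f};
  const Vec bh = {1.0f, 0.0f};
  const Vec x1 = {1.0f, 1.0f};
  const Vec h0 = {0.5f, -0.5f};
  const Vec h1 = rnnStep(W, R, bx, bh, x1, h0, [](float v) { return v; });
  assert(h1[0] == 4.5f); // 3 + 0 + 0.5 + 1
  assert(h1[1] == 7.5f); // 7 + 1 - 0.5 + 0
}
```

Passing zero vectors for the biases and for the initial state corresponds to the case described above where they are not set.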

Public Functions

RNNOp(const OperatorIdentifier &_opid, ActivationFunction activation, nonstd::optional<int64_t> hidden_size, const Op::Settings &settings_)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
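To make the index mapping concrete, the sketch below shows one way a map from grad op output indices to non-grad op input indices could be read. The index values and the names gradOutToNonGradInSketch and gradSourceForInputs are hypothetical and chosen only for illustration; they do not reflect the actual indices used by RNNOp or RNNGradOp.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical forward op with three inputs: 0 = x, 1 = w, 2 = b.
// Hypothetical grad op with two outputs:
//   output 0 carries the gradient of input 1 (w),
//   output 1 carries the gradient of input 0 (x).
// Input 2 (b) receives no gradient in this sketch.
inline const std::map<int, int> &gradOutToNonGradInSketch() {
  static const std::map<int, int> mapping = {{0, 1}, {1, 0}};
  return mapping;
}

// For each forward input, find the grad op output index that carries its
// gradient, or -1 if no grad op output maps to that input.
inline std::vector<int>
gradSourceForInputs(std::size_t numFwdInputs,
                    const std::map<int, int> &outToIn) {
  std::vector<int> source(numFwdInputs, -1);
  for (const auto &entry : outToIn) {
    assert(entry.second >= 0 &&
           static_cast<std::size_t>(entry.second) < numFwdInputs);
    source[static_cast<std::size_t>(entry.second)] = entry.first;
  }
  return source;
}
```

With the hypothetical mapping above, calling gradSourceForInputs(3, gradOutToNonGradInSketch()) reports that the gradient of input 0 comes from grad op output 1, the gradient of input 1 from output 0, and that input 2 has no gradient.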

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual void appendOutlineAttributes(OpSerialiserBase&) const override

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

virtual int getInBatchAxis(InIndex) const override

Get the batch axis for the input index.

Returns: The batch axis for the input index.

virtual int getOutBatchAxis(OutIndex) const override

Get the batch axis for the output index.

Returns: The batch axis for the output index.

inline virtual bool isOutlineable() const override

Check if op can be outlined.

If this method returns false, any subgraph that this op is part of will not be cached.

Returns: true if the op can be outlined, false otherwise. Default: true.

inline virtual std::string getName() const final

Public Members

const ActivationFunction activation_attribute

class RandomBaseOp : public popart::ShapeOrLikeOp 

Subclassed by popart::DropoutBaseOp, popart::RandomNormalBaseOp, popart::RandomUniformBaseOp

Public Functions

RandomBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

inline bool requiresRandomSeed() const final

inline std::vector<DataType> getSupportedDataTypes() const override

Public Static Functions

static std::vector<DataType> supportedDataTypes()

static void errorIfSeedIsSet(const Attributes &attr, OperatorIdentifier opid)

class RandomNormalBaseOp : public popart::RandomBaseOp 

Subclassed by popart::RandomNormalLikeOp, popart::RandomNormalOp

Public Functions

RandomNormalBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float mean_, float scale_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getMean() const

inline float getScale() const

class RandomNormalLikeOp : public popart::RandomNormalBaseOp 

Public Functions

RandomNormalLikeOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float mean_, float scale_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

std::vector<std::unique_ptr<Op>> getGradOps() final

inline InIndex getSeedInIndex() const final

std::unique_ptr<RandomNormalOp> foldInputTensor(const Op::Settings&) const

Public Static Functions

static inline InIndex getInIndex()

class RandomNormalOp : public popart::RandomNormalBaseOp 

Public Functions

RandomNormalOp(const OperatorIdentifier &opid_, const Shape &shape_, const OptionalDataType &dataType_, float mean_, float scale_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

inline InIndex getSeedInIndex() const final

class RandomUniformBaseOp : public popart::RandomBaseOp 

Subclassed by popart::RandomUniformLikeOp, popart::RandomUniformOp

Public Functions

RandomUniformBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float high_, float low_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getHigh() const

inline float getLow() const

class RandomUniformLikeOp : public popart::RandomUniformBaseOp 

Public Functions

RandomUniformLikeOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float high_, float low_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

std::vector<std::unique_ptr<Op>> getGradOps() final

inline InIndex getSeedInIndex() const final

std::unique_ptr<RandomUniformOp> foldInputTensor(const Op::Settings&) const

Public Static Functions

static inline InIndex getInIndex()

class RandomUniformOp : public popart::RandomUniformBaseOp 

Public Functions

RandomUniformOp(const OperatorIdentifier &opid_, const Shape &shape_, const OptionalDataType &dataType_, float high_, float low_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

inline InIndex getSeedInIndex() const final

class ReciprocalGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

ReciprocalGradOp(const ReciprocalOp&)

std::unique_ptr<Op> clone() const final

class ReciprocalOp : public popart::ElementWiseUnaryOp 

Public Functions

ReciprocalOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceGradOp : public popart::Op 

Subclassed by popart::ReduceL1GradOp, popart::ReduceL2GradOp, popart::ReduceLogSumExpGradOp, popart::ReduceLogSumGradOp, popart::ReduceMaxGradOp, popart::ReduceMeanGradOp, popart::ReduceMedianGradOp, popart::ReduceMinGradOp, popart::ReduceProdGradOp, popart::ReduceSumGradOp, popart::ReduceSumSquareGradOp

Public Functions

ReduceGradOp(const AiGraphcoreOpIdV1 &opid, const ReduceOp &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const override

void setup() override

const std::vector<int64_t> &getAxes() const

const std::vector<GradInOutMapper> &gradInputInfo() const override

const std::map<int, int> &gradOutToNonGradIn() const final

const Shape &backwardShape() const

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class ReduceL1GradOp : public popart::ReduceGradOp 

Public Functions

ReduceL1GradOp(const ReduceL1Op &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()

class ReduceL1Op : public popart::ReduceOp 

Public Functions

ReduceL1Op(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceL2GradOp : public popart::ReduceGradOp 

Public Functions

ReduceL2GradOp(const ReduceL2Op &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()

static inline InIndex getFwdOutInIndex()

class ReduceL2Op : public popart::ReduceOp 

Public Functions

ReduceL2Op(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceLogSumExpGradOp : public popart::ReduceGradOp 

Public Functions

ReduceLogSumExpGradOp(const ReduceLogSumExpOp &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()

static inline InIndex getFwdOutInIndex()

class ReduceLogSumExpOp : public popart::ReduceOp 

Public Functions

ReduceLogSumExpOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceLogSumGradOp : public popart::ReduceGradOp 

Public Functions

ReduceLogSumGradOp(const ReduceLogSumOp &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdOutInIndex()

class ReduceLogSumOp : public popart::ReduceOp 

Public Functions

ReduceLogSumOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceMaxGradOp : public popart::ReduceGradOp 

Public Functions

ReduceMaxGradOp(const ReduceMaxOp &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()

static inline InIndex getFwdOutInIndex()

class ReduceMaxOp : public popart::ReduceOp 

Public Functions

ReduceMaxOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceMeanGradOp : public popart::ReduceGradOp 

Public Functions

ReduceMeanGradOp(const ReduceMeanOp &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

class ReduceMeanOp : public popart::ReduceOp 

Public Functions

ReduceMeanOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceMedianGradOp : public popart::ReduceGradOp 

Public Functions

ReduceMedianGradOp(const ReduceMedianOp &fwd_op, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const override

Public Static Functions

static inline InIndex getIndicesInIndex()

class ReduceMedianOp : public popart::ReduceOp 

Public Functions

ReduceMedianOp(const OperatorIdentifier &opid, const nonstd::optional<std::vector<int64_t>> &axes, int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

void setup() override

std::vector<std::unique_ptr<Op>> getGradOps() final

bool canBeReplacedByIdentity() const override

Public Static Functions

static inline OutIndex getIndicesOutIndex()

class ReduceMinGradOp : public popart::ReduceGradOp 

Public Functions

ReduceMinGradOp(const ReduceMinOp &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()

static inline InIndex getFwdOutInIndex()

class ReduceMinOp : public popart::ReduceOp 

Public Functions

ReduceMinOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceOp : public popart::Op 

Subclassed by popart::ReduceL1Op, popart::ReduceL2Op, popart::ReduceLogSumExpOp, popart::ReduceLogSumOp, popart::ReduceMaxOp, popart::ReduceMeanOp, popart::ReduceMedianOp, popart::ReduceMinOp, popart::ReduceProdOp, popart::ReduceSumOp, popart::ReduceSumSquareOp

Public Functions

ReduceOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

void setup() override

const std::vector<int64_t> &getAxes() const

bool getKeepDims() const

void setAxes(std::vector<int64_t> value)

void setKeepDims(int64_t value)

void appendOutlineAttributes(OpSerialiserBase&) const override

bool canBeReplacedByIdentity() const override

inline float getSubgraphValue() const final

const Shape &backwardShape() const

inline bool canShard() const override

inline int getOutBatchAxis(OutIndex) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class ReduceProdGradOp : public popart::ReduceGradOp 

Public Functions

ReduceProdGradOp(const ReduceProdOp &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()

class ReduceProdOp : public popart::ReduceOp 

Public Functions

ReduceProdOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceSumGradOp : public popart::ReduceGradOp 

Public Functions

ReduceSumGradOp(const ReduceSumOp &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

class ReduceSumOp : public popart::ReduceOp 

Subclassed by popart::AddArg0GradOp, popart::AddArg1GradOp, popart::AddBiasBiasGradOp, popart::SubtractArg0GradOp

Public Functions

ReduceSumOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class ReduceSumSquareGradOp : public popart::ReduceGradOp 

Public Functions

ReduceSumSquareGradOp(const ReduceSumSquareOp &fwdOp, const Shape &backward_shape)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

Public Static Functions

static inline InIndex getFwdInInIndex()

class ReduceSumSquareOp : public popart::ReduceOp 

Public Functions

ReduceSumSquareOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final

void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, CommGroup shardingDomain) final

void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, const ReplicaGrouping &grouping) final

class ReluGradOp : public popart::Op 

Public Functions

ReluGradOp(const ReluOp&)

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

std::unique_ptr<Op> clone() const final

inline float getSubgraphValue() const final

inline bool canShard() const override

Public Static Functions

static inline InIndex getReludInIndex()

static inline InIndex getGradReludInIndex()

static inline OutIndex getOutIndex()

class ReluInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

ReluInplaceOp(const ReluOp&)

ReluInplaceOp(const Op::Settings &opSettings)

std::unique_ptr<Op> clone() const final

class ReluOp : public popart::ElementWiseUnaryOp 

Public Functions

ReluOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class RemoteBaseOp : public popart::ExchangeBaseOp 

Subclassed by popart::RemoteLoadOp, popart::RemoteStoreOp

Public Functions

inline RemoteBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, RemoteBufferId rbid_)

virtual std::unique_ptr<Op> clone() const = 0

inline virtual RemoteBufferId getRemoteBufferId() const final

inline virtual bool canShard() const final

inline virtual void setRemoteBufferId(RemoteBufferId remoteBufferId_) final

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

Public Static Functions

static inline InIndex getLocalTensorInIndex()

static inline InIndex getRemoteBufferOffsetInIndex()

class RemoteLoadInplaceOp : public popart::RemoteLoadOp 

Remote Load Inplace Op.

See also

RemoteLoadOp for explanation.

Public Functions

RemoteLoadInplaceOp(const OperatorIdentifier&, const Op::Settings&, RemoteBufferId rbid_ = -1UL)

Construct the RemoteLoadInplaceOp.

See constructor of the parent class for the input parameters.

RemoteLoadInplaceOp(const RemoteLoadOp&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual view::Regions modifies(InIndex) const final

Return the input region which this op modifies (for inplace ops).

Parameters: InIndex – The input index.
Returns: The regions which this op modifies.

virtual view::Regions aliases(InIndex, OutIndex) const final

Return the input region which the op output will alias (for inplace and view-changing ops).

See also

For more information on views, refer to the IPU Programmer’s Guide.

Parameters

InIndex – The input index.
OutIndex – The output index.

Returns

The regions which the output will alias.

virtual view::RegMap fwdRegMap(InIndex, OutIndex) const final

Map regions of the input tensor at the input index to the regions of the output tensor at the output index that these input regions alias.

Parameters

InIndex – The op input index.
OutIndex – The op output index.

virtual view::RegMap bwdRegMap(InIndex, OutIndex) const final

Map regions of the output tensor at the output index to the regions of the input tensor at the input index that these output regions alias.

Parameters

InIndex – The op input index.
OutIndex – The op output index.

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.
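The ordering requirement above can be illustrated with a small, self-contained sketch. It does not use the PopART headers: ToyOperatorIdentifier and sortedByPreference are hypothetical names introduced only for illustration, and the sketch assumes that a larger float means a more preferred variant.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for popart::OperatorIdentifier; only a name is kept.
struct ToyOperatorIdentifier {
  std::string type;
};

using InplaceCandidates =
    std::vector<std::tuple<ToyOperatorIdentifier, float>>;

// Return the candidates in descending order of preference.
// Assumption: a larger float means a more preferred in-place variant.
inline InplaceCandidates sortedByPreference(InplaceCandidates candidates) {
  assert(!candidates.empty()); // the sketch expects at least one variant
  std::stable_sort(candidates.begin(), candidates.end(),
                   [](const std::tuple<ToyOperatorIdentifier, float> &a,
                      const std::tuple<ToyOperatorIdentifier, float> &b) {
                     return std::get<1>(a) > std::get<1>(b);
                   });
  return candidates;
}
```

An op with a single in-place variant would return a one-element vector, in which case the ordering is trivially satisfied.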

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters: OperatorIdentifier – The operator identifier of the op to be instantiated.
Returns: An instance of the required op.

ExchangeDescriptor getExchangeDescriptor(int index) const final

class RemoteLoadOp : public popart::RemoteBaseOp 

Remote Load Op.

Loads a tensor from remote (off-chip) buffer. The tensor will be loaded from the memory location corresponding to RemoteBufferId, and will be stored in the memory location corresponding to inTensor.

This class takes between one and two TensorIds as inputs (as indicated in graphcoreoperators.hpp):

1. The TensorId of the inTensor.
   - In the inplace version this will be aliased to the output tensor.
   - In the outplace version this Op will clone the inTensor, then write the loaded data to the clone.
2. The (optional) TensorId of a 0-rank tensor called offset.
   - If set to a value >= 0, offset will specify the row in the remote buffer from which the tensor will be loaded.
   - If set to -1, RemoteSetup will assign a unique value.

The relationship between offset, RemoteBufferId and RemoteSetup is thoroughly described in RemoteStoreOp.

The output is the TensorId of the loaded tensor.

See also

RemoteStoreOp.

Subclassed by popart::RemoteLoadInplaceOp

Public Functions

RemoteLoadOp(const OperatorIdentifier&, const Op::Settings&, RemoteBufferId rbid_ = -1UL)

Construct the RemoteLoadOp.

The parameter specifically related to this class is listed below. See the constructor of the parent class for the rest of the input parameters.

Parameters: RemoteBufferId – The ID of the remote buffer. Can be any integer. If not specified (or set to -1), RemoteSetup will automatically choose the right buffer. A RemoteBufferId can only be used with tensors of identical shape.

virtual std::unique_ptr<Op> clone() const override

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final: Return which inputs and outputs are replicated tensor sharding pairs.

ExchangeDescriptor getExchangeDescriptor(int index) const override

virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override

Return the variants of this op (if any) which can modify / alias the inputs at the given indices.

This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().

Parameters: OperatorIdentifier – The operator identifier of the op to be instantiated.
Returns: An instance of the required op.

virtual void growAliasModel(AliasModel&) const final

For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.

The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.

To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.

See also

AliasModel.

Parameters: aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
Pre: All input tensors of this op have mappings in aliasModel before the call to this method.
Post: All output tensors of this op have mappings in aliasModel after the call to this method.

virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const final

Translate a PopART inplacing proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.

This method is defined as a void method which sets a value passed by reference, as opposed to a getter method, so that no Poprithms headers need to be included in this file.

Parameters

aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
inplaceId – The operator identifier to translate to the AliasModel equivalent.

Returns

A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.

Public Static Functions

static inline OutIndex getLocalTensorOutIndex()

class RemoteStoreOp : public popart::RemoteBaseOp 

Remote Store Op.

Stores a tensor to a remote (off-chip) buffer. This Op is typically used when the user wants to store several different identically shaped tensors to the same remote buffer by specifying the offset (see below).

This class takes between one and two TensorIds as inputs (as indicated in graphcoreoperators.hpp):

1. The TensorId of the inTensor to copy to remote memory.
2. The (optional) TensorId of a 0-rank tensor called offset.
   - If set to a value >= 0, offset will specify the row in the remote buffer the inTensor will be written to (see below for explanation).
   - If set to -1, RemoteSetup will assign a unique value.

If inTensor is of rank x, the remote buffer of a certain RemoteBufferId will be of rank x+1, where the new dimension (the row) will be of size N.

Op instances with matching RemoteBufferId will outline together, meaning that if multiple different tensors are to be stored under the same remote buffer ID, a different offset value has to be supplied for each tensor.

When using the automatic RemoteSetup configuration, the offset tensor should be a unique constant tensor per inTensor per RemoteBufferId. If the constant offset tensor has value -1, RemoteSetup will assign a unique value, otherwise the supplied offset value will be used. RemoteSetup will call Ir::setRemoteBufferInfo to configure the shape (equal to the inTensor shape) and number of rows (N) in the remote memory.

If not using the automatic RemoteSetup, all offsets and RemoteBufferIds need to be >= 0. Each remote buffer ID then needs to be registered with Ir::setRemoteBufferInfo manually.

This Op does not have any output.

See also

RemoteLoadOp.
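The row layout described for the remote buffer can be pictured with a toy host-side model. The sketch below is an assumption-laden illustration, not the PopART implementation: ToyRemoteBuffer and its methods are hypothetical names, tensors are flattened to a fixed number of elements, and the automatic assignment performed by RemoteSetup for an offset of -1 is deliberately left out, so offsets must be explicit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy model of a single remote buffer (NOT the PopART implementation).
// The buffer has `rows` rows (N in the text). Each row holds one flattened
// tensor of `elementsPerRow` elements, so all stored tensors share a shape.
class ToyRemoteBuffer {
public:
  ToyRemoteBuffer(std::size_t rows, std::size_t elementsPerRow)
      : rows_(rows), elementsPerRow_(elementsPerRow),
        data_(rows * elementsPerRow, 0.0f) {
    assert(rows > 0); // a buffer without rows cannot hold any tensor
  }

  // Mimics a store: copy `tensor` into the row selected by `offset`.
  void store(std::int64_t offset, const std::vector<float> &tensor) {
    if (tensor.size() != elementsPerRow_) {
      throw std::invalid_argument("tensor shape does not match the buffer");
    }
    std::copy(tensor.begin(), tensor.end(), data_.begin() + rowStart(offset));
  }

  // Mimics a load: return a copy of the row selected by `offset`.
  std::vector<float> load(std::int64_t offset) const {
    const auto first = data_.begin() + rowStart(offset);
    return std::vector<float>(
        first, first + static_cast<std::ptrdiff_t>(elementsPerRow_));
  }

private:
  // Validate the offset and convert it to an element index into data_.
  std::ptrdiff_t rowStart(std::int64_t offset) const {
    if (offset < 0 || static_cast<std::size_t>(offset) >= rows_) {
      throw std::out_of_range("offset must satisfy 0 <= offset < rows");
    }
    return static_cast<std::ptrdiff_t>(static_cast<std::size_t>(offset) *
                                       elementsPerRow_);
  }

  std::size_t rows_;
  std::size_t elementsPerRow_;
  std::vector<float> data_;
};
```

In this model, two identically shaped tensors stored under the same buffer need different offsets, otherwise the second store overwrites the first. This mirrors the requirement that each tensor sharing a remote buffer ID is given its own offset value.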

Public Functions

RemoteStoreOp(const OperatorIdentifier&, const Op::Settings&, RemoteBufferId rbid_ = -1UL)

Construct the RemoteStoreOp.

The parameter specifically related to this class is listed below. See the constructor of the parent class for the rest of the input parameters.

Parameters: RemoteBufferId – The ID of the remote buffer. Can be any integer. If not specified (or set to -1), RemoteSetup will automatically choose the right buffer. A RemoteBufferId can only be used with tensors of identical shape.

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

inline virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

inline virtual bool hasSideEffect() const override

Check if the op has any effect that is not captured by the (modification of) input or output tensors, such as modifying the state of the IPU or host system.

Returns: true if the op has side effects, false otherwise. Default=false.

virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override: Return which inputs and outputs are replicated tensor sharding pairs.

ExchangeDescriptor getExchangeDescriptor(int index) const final

class ReplicatedAllGatherOp : public popart::CollectivesBaseOp 

Public Functions

ReplicatedAllGatherOp(const OperatorIdentifier&, CommGroup group, const Op::Settings&)

ReplicatedAllGatherOp(const OperatorIdentifier&, CommGroup group, const Op::Settings&, TensorInfo outInfo)

ReplicatedAllGatherOp(const OperatorIdentifier&, const ReplicaGrouping &grouping, const Op::Settings&)

ReplicatedAllGatherOp(const OperatorIdentifier&, const ReplicaGrouping &grouping, const Op::Settings&, const TensorInfo &outInfo)

std::unique_ptr<Op> clone() const final

void setup() final

inline float getSubgraphValue() const final

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

bool isConfigureOutputForReplicatedTensorSharding() const override

Check RTS mode (see collectives.hpp)

Returns: True if this operation is configured for replicated tensor sharding

std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

std::vector<std::unique_ptr<Op>> getGradOps() override

const std::vector<GradInOutMapper> &gradInputInfo() const override

const std::map<int, int> &gradOutToNonGradIn() const override

class ReplicatedAllReduceInplaceOp : public popart::ReplicatedAllReduceOp 

Public Functions

ReplicatedAllReduceInplaceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, CommGroup group, const Op::Settings &settings_)

ReplicatedAllReduceInplaceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, const ReplicaGrouping &grouping, const Op::Settings &settings_)

ReplicatedAllReduceInplaceOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

ReplicatedAllReduceInplaceOp(const ReplicatedAllReduceOp&)

view::Regions modifies(InIndex) const override

view::Regions aliases(InIndex, OutIndex) const override

std::unique_ptr<Op> clone() const final

void setup() final

class ReplicatedAllReduceOp : public popart::CollectivesBaseOp 

Subclassed by popart::ReplicatedAllReduceInplaceOp

Public Functions

ReplicatedAllReduceOp(const OperatorIdentifier&, CollectiveOperator op, CommGroup group, const Op::Settings&)

ReplicatedAllReduceOp(const OperatorIdentifier&, CollectiveOperator op, const ReplicaGrouping &grouping, const Op::Settings&)

ReplicatedAllReduceOp(const OperatorIdentifier&, const Op::Settings&)

std::unique_ptr<Op> clone() const override

virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override

void setup() override

inline float getSubgraphValue() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline CollectiveOperator getCollectiveOp() const

void growAliasModel(AliasModel&) const override

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

inline bool hasCorrespondingLinkedIndexTensor(Tensor *t) override

inline Tensor *getCorrespondingLinkedIndexTensor(Tensor *t) override

inline bool isCollectiveLinkedIndexTensor(InIndex in) const override

inline bool isCollectiveLinkedIndexTensor(Tensor *t) const override

std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

class ReplicatedReduceScatterOp : public popart::CollectivesBaseOp 

Public Functions

ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, CommGroup group, bool configureOutputForReplicatedTensorSharding, const Op::Settings&)

ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, CommGroup group, const Op::Settings&)

ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, const ReplicaGrouping &grouping, bool configureOutputForReplicatedTensorSharding, const Op::Settings&)

ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, const ReplicaGrouping &grouping, const Op::Settings&)

ReplicatedReduceScatterOp(const OperatorIdentifier&, const Op::Settings&)

std::unique_ptr<Op> clone() const override

void setup() override

inline float getSubgraphValue() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline CollectiveOperator getCollectiveOp() const

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

bool isConfigureOutputForReplicatedTensorSharding() const override

Check RTS mode (see collectives.hpp)

Returns: True if this operation is configured for replicated tensor sharding

std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

std::vector<std::unique_ptr<Op>> getGradOps() override

const std::vector<GradInOutMapper> &gradInputInfo() const override

const std::map<int, int> &gradOutToNonGradIn() const override

class ReshapeBaseOp : public popart::Op 

Subclassed by popart::ReshapeInplaceOp, popart::ReshapeOp

Public Functions

inline ReshapeBaseOp(const OperatorIdentifier &_opid, const Shape &ots_, const Op::Settings &settings_, bool handleZero_ = true)

inline ReshapeBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, bool handleZero_ = true)

std::unique_ptr<Op> clone() const override

void setup() final

void setOutShape(const Shape &value)

const Shape &getOutShape() const

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

bool canBeReplacedByIdentity() const override

inline float getSubgraphValue() const final

void connectInTensor(InIndex inIndex, TensorId tenId) final

inline bool canShard() const override

void configureShardedOp(Op *const shardedOp, const Settings *const settings_) const override

void growAliasModel(AliasModel&) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class ReshapeGradOp : public popart::ReshapeOp 

Public Functions

ReshapeGradOp(const ReshapeOp&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class ReshapeInplaceOp : public popart::ReshapeBaseOp 

Public Functions

ReshapeInplaceOp(const OperatorIdentifier &_opid, const Shape&, const Op::Settings &settings_)

ReshapeInplaceOp(const ReshapeOp&)

std::unique_ptr<Op> clone() const final

inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

inline view::Regions aliases(InIndex in, OutIndex) const final

inline bool isInplaceViewChange() const override

class ReshapeOp : public popart::ReshapeBaseOp 

Subclassed by popart::ReshapeGradOp

Public Functions

inline ReshapeOp(const OperatorIdentifier &_opid, const Shape &s, const Op::Settings &settings_, bool handleZero = true)

inline ReshapeOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, bool handleZero = true)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

inline bool isOutplaceViewChange() const override

class ResizeGradOp : public popart::ResizeOp 

Public Functions

ResizeGradOp(const ResizeOp&)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline const std::vector<float> getFwdScales() const

class ResizeOp : public popart::Op 

Subclassed by popart::ResizeGradOp

Public Functions

ResizeOp(const OperatorIdentifier&, const Op::Settings&, ResizeMode, const std::vector<float> &scales)

ResizeOp(const OperatorIdentifier&, const Op::Settings&, ResizeMode, const std::vector<float> &scales, ResizeNearestMode nearestMode, ResizeCoordinateTransformationMode)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

inline float getSubgraphValue() const final

inline ResizeMode getMode() const

inline const std::vector<float> &getScales() const

inline ResizeNearestMode getNearestMode() const

inline ResizeCoordinateTransformationMode getCoordinateTransformationMode() const

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class RestoreInplaceOp : public popart::RestoreOp 

Public Functions

RestoreInplaceOp(const OperatorIdentifier&, int64_t stashSize, const Op::Settings&)

std::unique_ptr<Op> clone() const override

view::Regions aliases(InIndex in, OutIndex) const final

view::Regions modifies(InIndex) const final

inline void growAliasModel(AliasModel &m) const override

Public Members

bool requiredForRecompute = false

Public Static Functions

static inline InIndex getActToRestoreInIndex()

class RestoreOp : public popart::Op 

Subclassed by popart::RestoreInplaceOp

Public Functions

RestoreOp(const OperatorIdentifier&, int64_t stashSize, const Op::Settings&)

std::unique_ptr<Op> clone() const override

void setup() final

TensorId getRestoredTensorId() const

inline float getSubgraphValue() const final

inline int64_t getStashSize() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline bool isOutlineable() const override

Public Static Functions

static inline InIndex getStashInIndex()

static inline OutIndex getRestoredActOutIndex()

class ReverseBaseOp : public popart::Op 

Subclassed by popart::ReverseInplaceOp, popart::ReverseOp

Public Functions

inline ReverseBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const std::vector<int64_t> &dimensions_)

std::unique_ptr<Op> clone() const override

void setup() final

bool canBeReplacedByIdentity() const override

inline float getSubgraphValue() const final

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

inline std::vector<int64_t> getDimensions() const

void growAliasModel(AliasModel&) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class ReverseGradOp : public popart::ReverseOp 

Public Functions

ReverseGradOp(const ReverseOp&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class ReverseInplaceOp : public popart::ReverseBaseOp 

Public Functions

inline ReverseInplaceOp(const ReverseOp &op)

std::unique_ptr<Op> clone() const final

inline view::Regions aliases(InIndex in, OutIndex) const final

class ReverseOp : public popart::ReverseBaseOp 

Subclassed by popart::ReverseGradOp

Public Functions

inline ReverseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const std::vector<int64_t> &dimensions_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

void appendOutlineAttributes(OpSerialiserBase&) const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

class RoiAlignGradOp : public popart::Op 

Public Functions

RoiAlignGradOp(const RoiAlignOp&)

std::unique_ptr<Op> clone() const final

virtual void setup()

virtual const std::vector<popart::GradInOutMapper> &gradInputInfo() const

const std::map<int, int> &gradOutToNonGradIn() const

inline float getSubgraphValue() const final

void appendOutlineAttributes(OpSerialiserBase&) const final

inline float getSpatialScale() const

inline uint64_t getSamplingRatio() const

inline uint64_t getAlignedHeight() const

inline uint64_t getAlignedWidth() const

class RoiAlignOp : public popart::Op 

Region of Interest (RoI) align operation described in the Mask R-CNN paper.

Param spatialScale: Multiplicative spatial scale factor to translate ROI coordinates from their input spatial scale to the scale used when pooling, i.e., spatial scale of the input feature map X relative to the input image.
Param samplingRatio: Number of sampling points in the interpolation grid used to compute the output value of each pooled output bin.
Param alignedHeight: Pooled output Y’s height.
Param alignedWidth: Pooled output Y’s width.
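How these parameters interact can be sketched with a little arithmetic. The code below is a simplified illustration, not the PopART or Poplar kernel: ToyRoi, ToyBinGeometry and roiBinGeometry are hypothetical names, the RoI is assumed to be given as corner coordinates (x1, y1, x2, y2) in input-image units, and any clamping or half-pixel adjustment a real implementation may apply is omitted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical RoI in input-image coordinates (corner format).
struct ToyRoi {
  float x1, y1, x2, y2;
};

// Geometry of the pooled output grid on the feature map.
struct ToyBinGeometry {
  float startX, startY;      // RoI origin on the feature map
  float binWidth, binHeight; // size of one pooled output bin
};

// Scale the RoI onto the feature map and split it into
// alignedHeight x alignedWidth bins.
inline ToyBinGeometry roiBinGeometry(const ToyRoi &roi, float spatialScale,
                                     std::uint64_t alignedHeight,
                                     std::uint64_t alignedWidth) {
  assert(alignedHeight > 0 && alignedWidth > 0);
  const float startX = roi.x1 * spatialScale;
  const float startY = roi.y1 * spatialScale;
  const float roiWidth = roi.x2 * spatialScale - startX;
  const float roiHeight = roi.y2 * spatialScale - startY;
  return ToyBinGeometry{startX, startY,
                        roiWidth / static_cast<float>(alignedWidth),
                        roiHeight / static_cast<float>(alignedHeight)};
}
```

Within each bin, samplingRatio then determines how many interpolation points are sampled before they are combined into the single output value for that bin.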

Public Functions

RoiAlignOp(const popart::OperatorIdentifier &_opid, const popart::Op::Settings &settings, const float spatialScale, const uint64_t samplingRatio, const uint64_t alignedHeight, const uint64_t alignedWidth)

RoiAlignOp(const RoiAlignOp&) = default

RoiAlignOp &operator=(const RoiAlignOp&) = delete

~RoiAlignOp() override = default

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. The compiler throws an error if this method is not implemented.

virtual void setup() override

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

virtual std::vector<std::unique_ptr<popart::Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.
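The index mapping mentioned above can be pictured with a toy example. This sketch does not use PopART types: routeGradients is a hypothetical helper, tensors are represented by their names only, and the map has the same shape as the value returned by gradOutToNonGradIn() (gradient op output index to non-grad op input index).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// For each forward-op input index, look up which gradient-op output
// carries its gradient. Key: grad op output index. Value: forward op
// input index (same orientation as gradOutToNonGradIn()).
inline std::map<int, std::string>
routeGradients(const std::map<int, int> &gradOutToNonGradIn,
               const std::vector<std::string> &gradOpOutputs) {
  std::map<int, std::string> gradForFwdInput;
  for (const auto &entry : gradOutToNonGradIn) {
    const int gradOutIndex = entry.first;
    const int fwdInIndex = entry.second;
    assert(gradOutIndex >= 0 &&
           static_cast<std::size_t>(gradOutIndex) < gradOpOutputs.size());
    gradForFwdInput[fwdInIndex] =
        gradOpOutputs[static_cast<std::size_t>(gradOutIndex)];
  }
  return gradForFwdInput;
}
```

A single gradient op that produces gradients for all inputs has one map entry per output. With a separate gradient op per input, each op would typically carry a one-entry map.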

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

Append the op attributes that are relevant for outlining ops.

Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.

Parameters: OpSerialiserBase – The stream to which the attributes should be appended.

inline float getSpatialScale() const

inline uint64_t getSamplingRatio() const

inline uint64_t getAlignedHeight() const

inline uint64_t getAlignedWidth() const

class RoundInplaceOp : public popart::OneWayUnaryInPlaceOp 

Public Functions

RoundInplaceOp(const RoundOp&)

std::unique_ptr<Op> clone() const final

class RoundOp : public popart::OneWayUnaryOp 

Public Functions

RoundOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class ScaleGradOp : public popart::ScaleOp 

Public Functions

ScaleGradOp(const ScaleOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class ScaleInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

ScaleInplaceOp(const ScaleOp&)

ScaleInplaceOp(const OperatorIdentifier &_opid, float scale_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

float getScaleFactor() const

void appendOutlineAttributes(OpSerialiserBase&) const override

class ScaleOp : public popart::ElementWiseUnaryOp 

Subclassed by popart::ScaleGradOp

Public Functions

ScaleOp(const OperatorIdentifier &_opid, float scale_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

inline void setScaleFactor(float value)

float getScaleFactor() const

void appendOutlineAttributes(OpSerialiserBase&) const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

bool canBeReplacedByIdentity() const override

class ScaledAddLhsInplaceOp : public popart::ScaledAddOp 

Public Functions

ScaledAddLhsInplaceOp(float scale_0_, float scale_1_, const Op::Settings &settings_)

ScaledAddLhsInplaceOp(const ScaledAddOp&)

std::unique_ptr<Op> clone() const final

view::Regions modifies(InIndex) const final

view::Regions aliases(InIndex, OutIndex) const final

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

class ScaledAddOp : public popart::Op 

Subclassed by popart::ScaledAddLhsInplaceOp, popart::ScaledAddRhsInplaceOp

Public Functions

ScaledAddOp(const OperatorIdentifier &_opid, float scale_0_, float scale_1_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

void setup() override

inline float getScale0() const

inline float getScale1() const

void appendOutlineAttributes(OpSerialiserBase&) const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

inline bool canShard() const override

ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override

inline float getSubgraphValue() const override

void growAliasModel(AliasModel&) const override

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Public Static Functions

static inline InIndex getArg0InIndex()

static inline InIndex getArg1InIndex()

static inline InIndex getScale0InIndex()

static inline InIndex getScale1InIndex()

static inline OutIndex getOutIndex()

class ScaledAddRhsInplaceOp : public popart::ScaledAddOp 

Public Functions

ScaledAddRhsInplaceOp(const ScaledAddOp&)

std::unique_ptr<Op> clone() const final

view::Regions modifies(InIndex) const final

view::Regions aliases(InIndex, OutIndex) const final

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

class ScanOp : public popart::SubgraphOp 

Public Functions

ScanOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, Graph &callee_, int numScanInputs_, int numImplicitInputs_, std::vector<int64_t> scanInputAxes_, std::vector<int64_t> scanInputDirections_, std::vector<int64_t> scanOutputAxes_, std::vector<int64_t> scanOutputDirections_)

void setup() final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

std::unique_ptr<Op> clone() const override

Graph &getCalledGraph() const override

void setCalledGraph(Graph&) override

InIndex subgraphInToOpInIndex(InIndex index) const override

InIndex opInToSubgraphInIndex(InIndex index) const override

OutIndex subgraphOutToOpOutIndex(OutIndex index) const override

OutIndex opOutToSubgraphOutIndex(OutIndex index) const override

int getTripCountValue() const

inline int getNumScanInputs() const

inline int getNumVariables() const

inline int getNumImplicitInputs() const

inline int getNumScanOutputs() const

int64_t getScanInputAxis(int i) const

inline bool isScanInputReversed(int i) const

int64_t getScanOutputAxis(int i) const

inline bool isScanOutputReversed(int i) const

int64_t getScanInputDirection(int i) const

int64_t getScanOutputDirection(int i) const

class ScatterDataGradOp : public popart::Op 

Public Functions

ScatterDataGradOp(const ScatterOp &op, int64_t axis)

std::unique_ptr<Op> clone() const final override

const std::vector<GradInOutMapper> &gradInputInfo() const final override

const std::map<int, int> &gradOutToNonGradIn() const final override

void setup() final override

void appendOutlineAttributes(OpSerialiserBase&) const override

float getSubgraphValue() const final override

int64_t getAxis() const noexcept

nonstd::optional<float> getAvailableMemoryProportion() const noexcept

Public Static Functions

static inline InIndex gradInIndex()

static inline InIndex indicesInIndex()

static inline OutIndex gradOutIndex()

class ScatterOp : public popart::ScatterReduceOp 

Public Functions

ScatterOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings &settings_, const nonstd::optional<float> &available_memory_proportion_ = nonstd::nullopt)

InIndex srcDataInIndex() const noexcept override

InIndex initialValuesInIndex() const noexcept override

std::unique_ptr<Op> clone() const final override

std::vector<std::unique_ptr<Op>> getGradOps() final override

void appendOutlineAttributes(OpSerialiserBase&) const override

Public Static Functions

static inline InIndex dataInIndex()

static inline InIndex indicesInIndex()

static inline InIndex updatesInIndex()

static inline OutIndex outIndex()

class ScatterReduceGradOp : public popart::Op 

Public Functions

ScatterReduceGradOp(const ScatterReduceOp &op)

void setup() final override

std::unique_ptr<Op> clone() const final override

const std::vector<GradInOutMapper> &gradInputInfo() const final override

const std::map<int, int> &gradOutToNonGradIn() const final override

void appendOutlineAttributes(OpSerialiserBase&) const override

float getSubgraphValue() const final override

int64_t getAxis() const noexcept

int64_t getGroupSize() const noexcept

ScatterReduction getReduction() const noexcept

bool indexBroadcasted() const noexcept

bool indexBroadcastEnabled() const noexcept

bool hasInitialValues() const noexcept

nonstd::optional<float> getAvailableMemoryProportion() const noexcept

Public Static Functions

static inline InIndex gradInIndex()

static inline InIndex indicesInIndex()

static inline InIndex srcDataInIndex()

static inline InIndex fwdOutInIndex()

static inline InIndex initialValuesInIndex()

static inline OutIndex gradDataOutIndex()

static inline OutIndex gradInitialValuesOutIndex()

class ScatterReduceOp : public popart::Op 

Subclassed by popart::ScatterOp

Public Functions

ScatterReduceOp(const OperatorIdentifier &_opid, int64_t axis_, int64_t axis_size_, ScatterReduction reduction_, int64_t group_size_, bool enable_index_broadcast_, const nonstd::optional<float> &available_memory_proportion_, const Op::Settings &settings_)

inline virtual InIndex srcDataInIndex() const noexcept

inline InIndex indicesInIndex() const noexcept

inline virtual InIndex initialValuesInIndex() const noexcept

inline OutIndex outIndex() const noexcept

void setup() final

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() override

void appendOutlineAttributes(OpSerialiserBase&) const override

float getSubgraphValue() const final override

int64_t getAxis() const noexcept

int64_t getGroupSize() const noexcept

ScatterReduction getReduction() const noexcept

const Shape &getBackwardShape() const noexcept

bool indexBroadcasted() const noexcept

bool indexBroadcastEnabled() const noexcept

nonstd::optional<float> getAvailableMemoryProportion() const noexcept

void setAvailableMemoryProportion(const nonstd::optional<float> &v)

Public Static Functions

static std::string reductionToString(ScatterReduction reduction)

static ScatterReduction reductionFromString(const std::string &reductionStr)

class ScatterUpdateGradOp : public popart::Op 

Public Functions

ScatterUpdateGradOp(const ScatterOp &op, int64_t axis)

std::unique_ptr<Op> clone() const final override

void setup() final override

const std::vector<GradInOutMapper> &gradInputInfo() const final override

const std::map<int, int> &gradOutToNonGradIn() const final override

void appendOutlineAttributes(OpSerialiserBase&) const override

float getSubgraphValue() const final override

int64_t getAxis() const noexcept

nonstd::optional<float> getAvailableMemoryProportion() const noexcept

Public Static Functions

static inline InIndex gradInIndex()

static inline InIndex indicesInIndex()

static inline OutIndex gradOutIndex()

class SeluGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

SeluGradOp(const SeluOp&)

std::unique_ptr<Op> clone() const final

void appendAttributes(OpSerialiserBase&) const override

inline float getAlpha() const

inline float getGamma() const

class SeluInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

SeluInplaceOp(const SeluOp&)

std::unique_ptr<Op> clone() const final

void appendAttributes(OpSerialiserBase&) const override

inline float getAlpha() const

inline float getGamma() const

class SeluOp : public popart::ElementWiseUnaryOp 

Public Functions

SeluOp(const OperatorIdentifier &opid, float _alpha, float _gamma, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

void appendAttributes(OpSerialiserBase&) const override

inline float getAlpha() const

inline float getGamma() const

class SequenceSliceInplaceOp : public popart::SequenceSliceOp 

Public Functions

SequenceSliceInplaceOp(const OperatorIdentifier&, bool zeroUnused, const Op::Settings&)

std::unique_ptr<Op> clone() const override

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

view::Regions aliases(InIndex, OutIndex) const final

view::Regions modifies(InIndex) const final

class SequenceSliceOp : public popart::Op 

Subclassed by popart::SequenceSliceInplaceOp

Public Functions

SequenceSliceOp(const OperatorIdentifier&, bool zeroUnused, const Op::Settings&)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

inline float getSubgraphValue() const final

void setup() override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

void growAliasModel(AliasModel&) const override

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

Public Members

const bool zeroUnused

Public Static Functions

static inline InIndex getSourceInIndex()

static inline InIndex getDestinationInIndex()

static inline InIndex getNInIndex()

static inline InIndex getSourceOffsetInIndex()

static inline InIndex getDestOffsetInIndex()

static inline OutIndex getOutIndex()

class ShapeOrLikeOp : public popart::Op 

Subclassed by popart::RandomBaseOp, popart::ZerosBaseOp

Public Functions

ShapeOrLikeOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, const Op::Settings &settings_)

virtual std::unique_ptr<Op> clone() const override = 0

inline float getSubgraphValue() const override

void validateDataType(DataType dataType, OperatorIdentifier opid)

inline const OptionalDataType &getDataType() const

virtual std::vector<DataType> getSupportedDataTypes() const = 0

Public Static Functions

static OptionalDataType getOptionalDataType(const Attributes &attr, OperatorIdentifier opid)

static inline OutIndex getOutIndex()

static const OpDefinition::DataTypes &likeSupportedInputTypes()

class ShapedDropoutOp : public popart::DropoutBaseOp 

Subclassed by popart::ShapedDropoutGradOp

Public Functions

ShapedDropoutOp(const OperatorIdentifier &_opid, float ratio_, const Shape &shape_, const Op::Settings &settings_)

inline const std::vector<int64_t> &getShape() const

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() override

void setup() override

void appendOutlineAttributes(OpSerialiserBase&) const override

class ShapedDropoutGradOp : public popart::ShapedDropoutOp 

Public Functions

ShapedDropoutGradOp(const ShapedDropoutOp &fwdOp)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const override

const std::map<int, int> &gradOutToNonGradIn() const override

Public Static Functions

static inline InIndex getGradInIndex()

static inline OutIndex getOutIndex()

class ShrinkGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

ShrinkGradOp(const ShrinkOp&)

std::unique_ptr<Op> clone() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float lambd() const

inline float bias() const

class ShrinkInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

ShrinkInplaceOp(const ShrinkOp&)

std::unique_ptr<Op> clone() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float lambd() const

inline float bias() const

class ShrinkOp : public popart::ElementWiseUnaryOp 

Public Functions

ShrinkOp(const OperatorIdentifier &opid, float lambd, float bias, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float lambd() const

inline float bias() const

class SigmoidGradOp : public popart::Op 

Public Functions

SigmoidGradOp(const SigmoidOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getFwdOutInIndex()

static inline OutIndex getOutIndex()

class SigmoidInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

SigmoidInplaceOp(const SigmoidOp&)

std::unique_ptr<Op> clone() const final

class SigmoidOp : public popart::ElementWiseUnaryOp 

Public Functions

SigmoidOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class SignInplaceOp : public popart::OneWayUnaryInPlaceOp 

Public Functions

SignInplaceOp(const SignOp&)

std::unique_ptr<Op> clone() const final

class SignOp : public popart::OneWayUnaryOp 

Public Functions

SignOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

inline float getSubgraphValue() const final

Public Static Functions

static OperatorIdentifier getOpId(const Ir &ir)

class SinGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

SinGradOp(const SinOp &fwdOp)

std::unique_ptr<Op> clone() const final

class SinOp : public popart::ElementWiseUnaryOp 

Public Functions

SinOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class SinhGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

SinhGradOp(const SinhOp&)

std::unique_ptr<Op> clone() const final

class SinhInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

SinhInplaceOp(const SinhOp&)

SinhInplaceOp(const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

class SinhOp : public popart::ElementWiseUnaryOp 

Public Functions

SinhOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class SliceGradOp : public popart::BasePadOutplaceOp 

Public Functions

SliceGradOp(const SliceOp&)

void appendOutlineAttributes(OpSerialiserBase&) const override

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline bool canShard() const override

class SliceInplaceOp : public popart::BaseSliceOp 

Public Functions

SliceInplaceOp(const SliceOp&)

SliceInplaceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const std::vector<int64_t> &steps_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

view::Regions aliases(InIndex in, OutIndex) const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class SliceOp : public popart::BaseSliceOp 

Subclassed by popart::PadGradOp

Public Functions

SliceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const std::vector<int64_t> &steps_, const Op::Settings &settings_)

SliceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class SoftPlusGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

SoftPlusGradOp(const SoftPlusOp&)

std::unique_ptr<Op> clone() const final

class SoftPlusInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

SoftPlusInplaceOp(const SoftPlusOp&)

std::unique_ptr<Op> clone() const final

class SoftPlusOp : public popart::ElementWiseUnaryOp 

Public Functions

SoftPlusOp(const OperatorIdentifier &opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class SoftSignGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

SoftSignGradOp(const SoftSignOp&)

std::unique_ptr<Op> clone() const final

class SoftSignInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

SoftSignInplaceOp(const SoftSignOp&)

std::unique_ptr<Op> clone() const final

class SoftSignOp : public popart::ElementWiseUnaryOp 

Public Functions

SoftSignOp(const OperatorIdentifier &opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class SoftmaxGradDirectOp : public popart::Op 

Public Functions

SoftmaxGradDirectOp(const TensorId lossId, const nonstd::optional<int> ignoreIndex, const ReductionType reduction, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

void setup() final

bool hasNlllFwdOp() const

Op *nlllFwdOp() const

inline float getSubgraphValue() const final

inline ReductionType getReductionType() const

inline bool hasIgnoreIndex() const

inline nonstd::optional<int> getOptionalIgnoreIndex() const

inline int getIgnoreIndex() const

virtual void appendOutlineAttributes(OpSerialiserBase&) const final

Public Static Functions

static inline InIndex getProbsInIndex()

static inline InIndex getLabelInIndex()

static inline InIndex getGradProbsInIndex()

static inline OutIndex getOutIndex()

class SoftmaxGradOp : public popart::Op 

Public Functions

SoftmaxGradOp(const SoftmaxOp&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

int64_t getAxis() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

inline bool canShard() const override

Public Static Functions

static inline InIndex getGradProbsInIndex()

static inline InIndex getProbsInIndex()

static inline OutIndex getOutIndex()

class SoftmaxInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

SoftmaxInplaceOp(const SoftmaxOp&)

std::unique_ptr<Op> clone() const final

inline int64_t getAxis() const

void appendOutlineAttributes(OpSerialiserBase&) const override

class SoftmaxOp : public popart::ElementWiseUnaryOp 

Public Functions

SoftmaxOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings&)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

int64_t getAxis() const

void setAxis(int64_t)

void appendOutlineAttributes(OpSerialiserBase&) const override

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class SortOp : public popart::TopKOp 

Public Functions

SortOp(const OperatorIdentifier &opid, int64_t axis, int64_t axis_size, bool descending, bool stable, const Op::Settings &settings, const nonstd::optional<float> &available_memory_proportion = nonstd::nullopt)

std::unique_ptr<Op> clone() const override

class SplineBasisOp : public popart::Op 

Public Functions

SplineBasisOp(const OperatorIdentifier &opid, int degree, const Op::Settings &settings)

void setup() override

std::unique_ptr<Op> clone() const override

float getSubgraphValue() const override

void appendOutlineAttributes(OpSerialiserBase&) const override

unsigned getDegree() const noexcept

Public Static Functions

static inline constexpr InIndex pseudoIndex() noexcept

static inline constexpr InIndex kernelSizeIndex() noexcept

static inline constexpr InIndex isOpenSplineIndex() noexcept

static inline constexpr OutIndex outBasisIndex() noexcept

static inline constexpr OutIndex outWeightIndexIndex() noexcept

class SplineWeightingOp : public popart::Op 

Public Functions

SplineWeightingOp(const OperatorIdentifier &opid, const Op::Settings &settings)

void setup() override

std::unique_ptr<Op> clone() const override

float getSubgraphValue() const override

void appendOutlineAttributes(OpSerialiserBase&) const override

Public Static Functions

static inline constexpr InIndex inputIndex() noexcept

static inline constexpr InIndex weightIndex() noexcept

static inline constexpr InIndex basisIndex() noexcept

static inline constexpr InIndex weightIndexIndex() noexcept

static inline constexpr OutIndex outputIndex() noexcept

class SplitGradOp : public popart::Op 

Public Functions

SplitGradOp(const SplitOp&, const Op::Settings&)

void setup() final

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline float getSubgraphValue() const final

inline int64_t getAxis() const

Public Static Functions

static inline OutIndex getOutIndex()

class SplitOp : public popart::Op 

Public Functions

SplitOp(const OperatorIdentifier&, int64_t axis_, const std::vector<int64_t> split_, const Op::Settings&)

void setup() final

std::unique_ptr<Op> clone() const final

inline float getSubgraphValue() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<int64_t> getSplitSizes() const

inline int64_t getAxis() const

inline bool canShard() const override

Public Static Functions

static inline InIndex getInIndex()

class SqrtGradOp : public popart::Op 

Public Functions

SqrtGradOp(const SqrtOp &fwdOp)

std::unique_ptr<Op> clone() const final

void setup() final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getFwdOutInIndex()

static inline OutIndex getOutIndex()

class SqrtOp : public popart::ElementWiseUnaryOp 

Public Functions

SqrtOp(const OperatorIdentifier &_opid, const Op::Settings&)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class SquareOp : public popart::ElementWiseUnaryOp 

Public Functions

SquareOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class StashOp : public popart::Op 

Public Functions

StashOp(const OperatorIdentifier&, int64_t stashSize_, const Op::Settings&)

std::unique_ptr<Op> clone() const override

void setup() final

int64_t getStashSize()

TensorId getStashedTensorId() const

inline float getSubgraphValue() const final

void appendOutlineAttributes(OpSerialiserBase&) const override

inline bool isOutlineable() const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class SubgraphOp : public popart::Op 

Subclassed by popart::CallOp, popart::LoopOp, popart::ScanOp

Public Functions

SubgraphOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

virtual std::unique_ptr<Op> clone() const override = 0

void appendOutlineAttributes(OpSerialiserBase &os) const override

view::Regions modifies(InIndex) const override

view::Regions aliases(InIndex, OutIndex) const override

void addAlias(InIndex in, OutIndex out, view::Chains fwdChains, view::Chains bwdChains)

void adjustAliasInIndices(InIndex fromIn, InIndex toIn)

void adjustAliasOutIndices(OutIndex fromOut, OutIndex toOut)

void adjustModifiedIndices(InIndex fromIn, InIndex toIn)

void addModified(InIndex in, view::Regions regions)

void removeModified(InIndex in)

void removeAlias(InIndex in, OutIndex out)

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override

virtual InIndex subgraphInToOpInIndex(InIndex index) const = 0

virtual InIndex opInToSubgraphInIndex(InIndex index) const = 0

virtual OutIndex subgraphOutToOpOutIndex(OutIndex index) const = 0

virtual OutIndex opOutToSubgraphOutIndex(OutIndex index) const = 0

virtual Graph &getCalledGraph() const = 0

std::vector<const Graph*> getCalledGraphs() const override

virtual void setCalledGraph(Graph&) = 0

VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex index, std::set<OpId> &visited) const override

VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex index, std::set<OpId> &visited) const override

bool hasSideEffect() const override

virtual InIndex opInToSubgraphInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override

virtual InIndex subgraphInToOpInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override

virtual OutIndex opOutToSubgraphOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override

virtual OutIndex subgraphOutToOpOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override

float calcAutoVirtualGraphCost(std::set<int> &inputs_seen) override

virtual void setCalledSubgraphGradInfo(const FwdGraphToBwdGraphInfo &calledGraphsGradInfo) override

Public Static Functions

static bool existsInBodyInputs(std::vector<std::string> &loopBodyInputIds, TensorId &tensorId)

static bool existsInOpInputs(std::vector<std::pair<TensorId, TensorInfo>> &opInputs, TensorId &tensorId)

static std::vector<TensorId> getBodyInputIds(const ONNX_NAMESPACE::GraphProto &bodyProto)

static std::vector<TensorId> getBodyOutputIds(const ONNX_NAMESPACE::GraphProto &bodyProto)

class SubsampleBaseOp : public popart::Op 

Subclassed by popart::SubsampleInplaceOp, popart::SubsampleOp

Public Functions

SubsampleBaseOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &strides_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

void setup() override

std::vector<std::unique_ptr<Op>> getGradOps() final

inline std::vector<int64_t> getStrides() const

std::vector<uint32_t> strides_u32() const

bool strideSizeOne() const

void appendOutlineAttributes(OpSerialiserBase&) const override

bool canBeReplacedByIdentity() const override

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

void growAliasModel(AliasModel&) const override

inline float getSubgraphValue() const final

Public Members

std::vector<int64_t> strides

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class SubsampleGradOp : public popart::Op 

Public Functions

SubsampleGradOp(const SubsampleBaseOp &fwdOp)

std::unique_ptr<Op> clone() const final

void setup() override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

inline std::vector<int64_t> getStrides() const

std::vector<uint32_t> strides_u32() const

inline const Shape &getFwdInputShape() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class SubsampleInplaceOp : public popart::SubsampleBaseOp 

Public Functions

SubsampleInplaceOp(const SubsampleOp&)

std::unique_ptr<Op> clone() const final

inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

view::Regions aliases(InIndex in, OutIndex) const final

class SubsampleOp : public popart::SubsampleBaseOp 

Public Functions

SubsampleOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &strides_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

class SubtractArg0GradOp : public popart::ReduceSumOp 

Public Functions

SubtractArg0GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)

std::unique_ptr<Op> clone() const final

void setup() final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class SubtractArg1GradOp : public popart::ElementWiseBinaryArg1GradOp 

Public Functions

SubtractArg1GradOp(const Op&, const std::vector<int64_t> &_reduction_axes)

std::unique_ptr<Op> clone() const final

class SumArgGradOp : public popart::LinearVariadicGradOp 

Public Functions

SumArgGradOp(const SumOp&, InIndex inIndex)

const std::vector<GradInOutMapper> &gradInputInfo() const final

std::unique_ptr<Op> clone() const final

bool canBeReplacedByIdentity() const override

inline bool canShard() const override

class SumOp : public popart::VariadicOp 

Public Functions

SumOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

inline bool canShard() const override

class SwishGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

SwishGradOp(const SwishOp&)

std::unique_ptr<Op> clone() const final

class SwishInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

SwishInplaceOp(const SwishOp&)

SwishInplaceOp(const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

class SwishOp : public popart::ElementWiseUnaryOp 

Public Functions

SwishOp(const OperatorIdentifier &opid, const Op::Settings &settings)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

class SyncOp : public popart::Op 

Public Functions

SyncOp(const Op::Settings&, poplar::SyncType syncType)

std::unique_ptr<Op> clone() const override

const poplar::SyncType &getSyncType() const

inline void setup() final

inline float getSubgraphValue() const final

inline bool hasSideEffect() const override

class TanhGradOp : public popart::Op 

Public Functions

TanhGradOp(const TanhOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

inline float getSubgraphValue() const final

inline bool canShard() const override

Public Static Functions

static inline InIndex getGradInIndex()

static inline InIndex getFwdOutInIndex()

static inline OutIndex getOutIndex()

class TanhOp : public popart::ElementWiseUnaryOp 

Public Functions

TanhOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

class TensorRemapOp : public popart::Op 

Op that creates a new output tensor whose layout is determined by downstream consumers, and then copies the input tensor to the output tensor.

This can improve tile memory liveness if the tensor layout without remapping is unsuitable for downstream consumers. It should only be used if actual issues occur, since remapping clones the tensor and can introduce more rearrangement and data copies than necessary.

Public Functions

TensorRemapOp(const OperatorIdentifier&, const TensorRemapType&, const Op::Settings&)

TensorRemapOp(const TensorRemapOp&)

virtual std::unique_ptr<Op> clone() const final

Return a copy of the op.

This method must be implemented. Compilation fails with an error if this method is not implemented.

virtual void setup() final

Set the shape and type of the arguments to the op.

This MUST set the type and shape information for all the output TensorInfo objects.

inline TensorRemapType getTensorRemapType() const

virtual std::vector<std::unique_ptr<Op>> getGradOps() final

Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.

There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.

The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.

Throws an error if this op is already a gradient op.

virtual const std::vector<GradInOutMapper> &gradInputInfo() const final

Get the mapping from the input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.

Returns: The mapping from the input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.
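The following is a minimal, standalone sketch of what such a mapping can look like. The types in the `sketch` namespace are simplified stand-ins written for this illustration, not the PopART definitions, and the index values chosen for the tanh-like grad op are assumptions.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Simplified stand-ins for illustration; not the PopART definitions.
enum class GradOpInType { In, Out, GradOut };

struct GradInOutMapper {
  int iGrad;         // input index on the grad op
  int iNonGrad;      // index on the non-grad op that this input refers to
  GradOpInType type; // input, output, or gradient of an output of the non-grad op
};

// Hypothetical mapping for a tanh-like grad op (index values are assumptions):
//   grad-op input 0 is the gradient of forward output 0,
//   grad-op input 1 is forward output 0 itself.
inline const std::vector<GradInOutMapper> &tanhLikeGradInputInfo() {
  static const std::vector<GradInOutMapper> info = {
      {0, 0, GradOpInType::GradOut},
      {1, 0, GradOpInType::Out}};
  return info;
}

// Count the grad-op inputs of a given kind.
inline int countInputsOfType(const std::vector<GradInOutMapper> &info,
                             GradOpInType type) {
  int count = 0;
  for (const auto &entry : info) {
    assert(entry.iGrad >= 0 && entry.iNonGrad >= 0); // indices are non-negative
    if (entry.type == type) {
      ++count;
    }
  }
  return count;
}

} // namespace sketch
```

Each entry answers one question: which tensor of the non-grad op must be wired to a given input of the grad op. Returning a reference to a function-local static keeps the table alive for the lifetime of the program, which matches the reference return type in the signature above.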

virtual const std::map<int, int> &gradOutToNonGradIn() const final

Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.

This method throws an error if the op this is called on is not a grad op.
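As a rough illustration of this output-to-input mapping, the sketch below models a where-like forward op with one grad op per differentiable input. All names in the `sketch` namespace and the index values are assumptions made for this example, not taken from the PopART sources.

```cpp
#include <cassert>
#include <map>

namespace sketch {

// Assumed input layout of a where-like forward op (values are illustrative).
// Input 0 would be the condition, which receives no gradient.
constexpr int kXIn = 1;
constexpr int kYIn = 2;

// One grad op per differentiable input. Each grad op has a single output
// (index 0) that carries the gradient of one forward input.
inline const std::map<int, int> &xGradOutToNonGradIn() {
  static const std::map<int, int> mapping = {{0, kXIn}};
  return mapping;
}

inline const std::map<int, int> &yGradOutToNonGradIn() {
  static const std::map<int, int> mapping = {{0, kYIn}};
  return mapping;
}

// Look up which forward input receives the gradient produced at gradOut.
// std::map::at throws std::out_of_range if the grad output is not mapped.
inline int forwardInputFor(const std::map<int, int> &mapping, int gradOut) {
  assert(gradOut >= 0);
  return mapping.at(gradOut);
}

} // namespace sketch
```

A single grad op that produces gradients for all inputs would instead return one map with several entries, for example `{{0, kXIn}, {1, kYIn}}`.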

inline virtual float getSubgraphValue() const final

Get the subgraph value.

This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) or low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).

Returns: The subgraph value. Default: 0.

inline virtual bool isOutlineable() const final

Check if op can be outlined.

If this method returns false, any subgraph that this op is part of will not be cached.

Returns: true if the op can be outlined, false otherwise. Default: true.
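To make the role of the subgraph value and the outlineable flag more concrete, here is a toy decision rule. It is only a sketch under stated assumptions: the summing rule, the threshold parameter, the constants `kHighValue` and `kLowValue`, and every name in the `sketch` namespace are hypothetical and do not describe the actual PopART outlining algorithm.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Hypothetical bounding values for expensive and inexpensive ops.
constexpr float kHighValue = 1000.0f;
constexpr float kLowValue = 0.1f;

// The two properties of an op that this toy rule looks at.
struct OpInfo {
  float subgraphValue;
  bool outlineable;
};

// Toy rule: a candidate subgraph is worth outlining if every op in it can be
// outlined and the summed subgraph value exceeds the given threshold.
inline bool worthOutlining(const std::vector<OpInfo> &ops, float threshold) {
  assert(threshold >= 0.0f);
  float total = 0.0f;
  for (const auto &op : ops) {
    if (!op.outlineable) {
      return false; // one non-outlineable op rules out the whole subgraph
    }
    total += op.subgraphValue;
  }
  return total > threshold;
}

} // namespace sketch
```

Under this rule a subgraph containing one expensive op clears a threshold of 1.0, while a subgraph of two inexpensive ops does not, which is the kind of distinction the high and low bounding values are meant to express.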

Public Static Functions

static inline InIndex getInIndex()

static inline InIndex getRefInIndex()

static inline OutIndex getOutIndex()

class ThresholdedReluGradOp : public popart::ElementWiseNonLinearUnaryGradOp 

Public Functions

ThresholdedReluGradOp(const ThresholdedReluOp&)

std::unique_ptr<Op> clone() const final

void appendAttributes(OpSerialiserBase&) const override

inline float getAlpha() const

class ThresholdedReluInplaceOp : public popart::ElementWiseInplaceUnaryOp 

Public Functions

ThresholdedReluInplaceOp(const ThresholdedReluOp&)

std::unique_ptr<Op> clone() const final

void appendAttributes(OpSerialiserBase&) const override

inline float getAlpha() const

class ThresholdedReluOp : public popart::ElementWiseUnaryOp 

Public Functions

ThresholdedReluOp(const OperatorIdentifier &opid, float _alpha, const Op::Settings &settings)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

void appendAttributes(OpSerialiserBase&) const override

inline float getAlpha() const

class TiedGatherGradOp : public popart::GatherGradOp 

Public Functions

TiedGatherGradOp(const TiedGatherOp *fwdOp, int64_t axis)

std::unique_ptr<Op> clone() const final

Public Members

const TiedGatherOp *fwdOp

class TiedGatherOp : public popart::GatherOp 

Public Functions

TiedGatherOp(int64_t axis_, const Op::Settings &settings_, const nonstd::optional<float> available_memory_proportion_ = nonstd::nullopt, bool zeroOutOfRangeIndices_ = false)

std::unique_ptr<Op> clone() const final

std::vector<std::unique_ptr<Op>> getGradOps() final

class TileGradOp : public popart::TileOp 

Public Functions

TileGradOp(const TileOp&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class TileOp : public popart::Op 

Subclassed by popart::TileGradOp

Public Functions

TileOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

TileOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &repeats_, const Shape &outShape_, const Op::Settings &settings_)

std::vector<std::unique_ptr<Op>> getGradOps() final

std::unique_ptr<Op> clone() const override

void setup() final

virtual void connectInTensor(InIndex, TensorId) final

const Shape &getOutShape()

const std::vector<int64_t> &getRepeats() const

bool canBeReplacedByIdentity() const override

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class TopKGradOp : public popart::Op 

Public Functions

TopKGradOp(const TopKOp&)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

int64_t getAxis() const

const TensorInfo &getGradOutInfo() const

void appendOutlineAttributes(OpSerialiserBase&) const override

inline float getSubgraphValue() const final

inline nonstd::optional<float> getAvailableMemoryProportion() const

Public Static Functions

static inline InIndex gradInIndex()

static inline InIndex indicesInIndex()

static inline OutIndex gradOutIndex()

class TopKOp : public popart::BaseSortOp 

Subclassed by popart::SortOp

Public Functions

TopKOp(const OperatorIdentifier &_opid, int64_t k, int64_t axis, bool largest, bool sorted, const Op::Settings &settings, const nonstd::optional<float> &available_memory_proportion = nonstd::nullopt)

std::unique_ptr<Op> clone() const override

void setup() final

int64_t getK() const noexcept

bool getLargest() const noexcept

bool getSorted() const noexcept

bool getStable() const noexcept

std::vector<std::unique_ptr<Op>> getGradOps() final

void appendOutlineAttributes(OpSerialiserBase&) const final

inline nonstd::optional<float> getAvailableMemoryProportion() const

Public Static Functions

static inline OutIndex getValuesOutIndex()

static inline OutIndex getIndicesOutIndex()

class TransposeBaseOp : public popart::Op 

Subclassed by popart::TransposeInplaceOp, popart::TransposeOp

Public Functions

TransposeBaseOp(const OperatorIdentifier &_opid, const Shape &perm_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

void setup() final

inline float getSubgraphValue() const final

inline void setPerm(const Shape &value)

inline const Shape &getPerm() const

std::vector<uint64_t> getPerm_u64() const

view::RegMap fwdRegMap(InIndex, OutIndex) const final

view::RegMap bwdRegMap(InIndex, OutIndex) const final

Shape generateReversePermutation() const

inline bool canShard() const override

int getOutBatchAxis(OutIndex) const override

void growAliasModel(AliasModel&) const override

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class TransposeGradOp : public popart::TransposeOp 

Public Functions

TransposeGradOp(const TransposeOp &fwdOp)

std::unique_ptr<Op> clone() const final

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

class TransposeInplaceOp : public popart::TransposeBaseOp 

Public Functions

TransposeInplaceOp(const OperatorIdentifier &_opid, const Shape&, const Op::Settings &settings_)

TransposeInplaceOp(const TransposeOp&)

std::unique_ptr<Op> clone() const final

inline view::Regions aliases(InIndex in, OutIndex) const final

inline bool isInplaceViewChange() const override

class TransposeOp : public popart::TransposeBaseOp 

Subclassed by popart::TransposeGradOp

Public Functions

TransposeOp(const OperatorIdentifier &_opid, const Shape &perm_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() final

void appendOutlineAttributes(OpSerialiserBase&) const override

bool canBeReplacedByIdentity() const override

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override

inline bool isOutplaceViewChange() const override

class UnaryZeroGradOp : public popart::ZerosLikeOp 

Public Functions

UnaryZeroGradOp(const OperatorIdentifier &opid_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

inline const std::vector<GradInOutMapper> &gradInputInfo() const

inline const std::map<int, int> &gradOutToNonGradIn() const

Public Static Functions

static std::vector<std::unique_ptr<Op>> getGradOpVector(const Op::Settings &settings_)

class UpsampleOp : public popart::Op 

Public Functions

UpsampleOp(const OperatorIdentifier&, const Op::Settings&, UpsampleMode, const std::vector<float> &scales)

std::unique_ptr<Op> clone() const override

void setup() final

inline float getSubgraphValue() const final

void connectInTensor(InIndex inIndex, TensorId tenId) final

inline UpsampleMode getMode() const

inline const std::vector<float> &getScales() const

Public Static Functions

static inline InIndex getInIndex()

static inline OutIndex getOutIndex()

class VariadicGradOp : public popart::Op 

Subclassed by popart::LinearVariadicGradOp, popart::NonLinearVariadicGradOp

Public Functions

VariadicGradOp(const OperatorIdentifier &_opid, const VariadicOp&, InIndex)

std::unique_ptr<Op> clone() const override

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

inline InIndex getFwdIndex()

inline const TensorInfo &getFwdInputInfo()

inline float getSubgraphValue() const final

Public Static Functions

static inline InIndex getGradInIndex()

static inline OutIndex getOutIndex()

class VariadicOp : public popart::Op 

Subclassed by popart::MaxOp, popart::MeanOp, popart::MinOp, popart::SumOp

Public Functions

VariadicOp(const OperatorIdentifier &_opid, const Op::Settings &settings)

virtual std::unique_ptr<Op> clone() const override = 0

std::vector<std::unique_ptr<Op>> getGradOps() final

void setup() final

bool canBeReplacedByIdentity() const final

inline float getSubgraphValue() const final

Public Static Functions

static inline OutIndex getOutIndex()

class WhereLhsInplaceOp : public popart::WhereOp 

Public Functions

WhereLhsInplaceOp(const WhereOp &op)

std::unique_ptr<Op> clone() const override

view::Regions modifies(InIndex index) const final

view::Regions aliases(InIndex index, OutIndex) const final

class WhereOp : public popart::Op 

Subclassed by popart::WhereLhsInplaceOp, popart::WhereRhsInplaceOp

Public Functions

WhereOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

std::vector<std::unique_ptr<Op>> getGradOps() override

void setup() final

std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final

std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final

inline float getSubgraphValue() const final

poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel &aliasModel, OperatorIdentifier opId) const override

void growAliasModel(AliasModel &m) const override

Public Static Functions

static inline InIndex conditionInIndex()

static inline InIndex xInIndex()

static inline InIndex yInIndex()

static inline OutIndex outIndex()

class WhereRhsInplaceOp : public popart::WhereOp 

Public Functions

WhereRhsInplaceOp(const WhereOp &op)

std::unique_ptr<Op> clone() const override

view::Regions modifies(InIndex index) const final

view::Regions aliases(InIndex index, OutIndex) const final

class WhereXGradOp : public popart::Op 

Public Functions

WhereXGradOp(const WhereOp &op)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

inline float getSubgraphValue() const final

std::vector<size_t> getFwdInShape() const

Public Static Functions

static inline InIndex fwdConditionInIndex()

static inline InIndex outGradInIndex()

static inline OutIndex outIndex()

class WhereYGradOp : public popart::Op 

Public Functions

WhereYGradOp(const WhereOp &op)

std::unique_ptr<Op> clone() const override

const std::vector<GradInOutMapper> &gradInputInfo() const final

const std::map<int, int> &gradOutToNonGradIn() const final

void setup() final

inline float getSubgraphValue() const final

std::vector<size_t> getFwdInShape() const

Public Static Functions

static inline InIndex fwdConditionInIndex()

static inline InIndex outGradInIndex()

static inline OutIndex outIndex()

class ZerosBaseOp : public popart::ShapeOrLikeOp 

Subclassed by popart::ZerosLikeOp, popart::ZerosOp

Public Functions

ZerosBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const override

inline std::vector<DataType> getSupportedDataTypes() const override

Public Static Functions

static std::vector<DataType> supportedDataTypes()

class ZerosLikeOp : public popart::ZerosBaseOp 

Subclassed by popart::UnaryZeroGradOp

Public Functions

ZerosLikeOp(const OperatorIdentifier &opid_, const Op::Settings &settings_)

void setup() final

std::unique_ptr<Op> clone() const override

std::unique_ptr<ZerosOp> foldInputTensor(const Op::Settings&) const

Public Static Functions

static inline InIndex getInIndex()

class ZerosOp : public popart::ZerosBaseOp 

Public Functions

ZerosOp(const OperatorIdentifier &opid_, const Shape &shape_, const OptionalDataType &dataType_, const Op::Settings &settings_)

std::unique_ptr<Op> clone() const final

void setup() final

14.8.4. Available Ops (Opx class)

class AbortOpx : public popart::popx::Opx 

Public Functions

AbortOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AbsOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

AbsOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AccumulateBaseOpx : public popart::popx::VarUpdateOpx 

Subclassed by popart::popx::AccumulateOpx, popart::popx::RescaleAccumulateOpx, popart::popx::SparseAccumulateOpx

Public Functions

AccumulateBaseOpx(Op*, Devicex*)

virtual void grow(poplar::program::Sequence&) const override = 0

poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const override

std::set<TensorId> mustExistBeforeCreate(InIndex) const override

InputCreatorType getInputCreatorType(InIndex) const final

bool hasCreatorViewChangers(InIndex index) const final

ViewChangers getCreatorViewChangers(InIndex index) const final

class AccumulateOpx : public popart::popx::AccumulateBaseOpx 

Public Functions

AccumulateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AccumulatorScaleOpx : public popart::popx::VarUpdateOpx 

Public Functions

AccumulatorScaleOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AdaDeltaUpdaterOpx : public popart::popx::Opx 

Public Functions

AdaDeltaUpdaterOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const final

bool hasCreatorViewChangers(InIndex index) const final

ViewChangers getCreatorViewChangers(InIndex index) const final

class AdamUpdaterOpx : public popart::popx::Opx 

Public Functions

AdamUpdaterOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AdamVarUpdateOpx : public popart::popx::VarUpdateOpx 

Public Functions

AdamVarUpdateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AddArg0GradOpx : public popart::popx::ReduceSumOpx 

Public Functions

AddArg0GradOpx(Op*, Devicex*)

class AddArg1GradOpx : public popart::popx::ReduceSumOpx 

Public Functions

AddArg1GradOpx(Op*, Devicex*)

class AddBiasBiasGradOpx : public popart::popx::ReduceSumOpx 

Public Functions

AddBiasBiasGradOpx(Op*, Devicex*)

class AddBiasDataGradOpx : public popart::popx::Opx 

Public Functions

AddBiasDataGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AddBiasInplaceOpx : public popart::popx::AddBiasOpx 

Public Functions

AddBiasInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AddBiasOpx : public popart::popx::Opx 

Subclassed by popart::popx::AddBiasInplaceOpx

Public Functions

AddBiasOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

std::set<TensorId> mustExistBeforeCreate(int index0) const override

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

class AddLhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx 

Public Functions

AddLhsInplaceOpx(Op*, Devicex*)

class AddOpx : public popart::popx::ElementWiseBinaryOutplaceOpx 

Public Functions

AddOpx(Op*, Devicex*)

InputCreatorType getInputCreatorType(InIndex) const override

class AddRhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx 

Public Functions

AddRhsInplaceOpx(Op*, Devicex*)

class AllReduceOpx : public popart::popx::Opx 

Public Functions

AllReduceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const

InputCreatorType getInputCreatorType(int index0) const

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const

view::RegMap unwindRegion(InIndex, OutIndex) const

class AndOpx : public popart::popx::BinaryComparisonOpx 

Public Functions

AndOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ArgExtremaOpx : public popart::popx::Opx 

Subclassed by popart::popx::ArgMaxOpx, popart::popx::ArgMinOpx

Public Functions

ArgExtremaOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ArgMaxOpx : public popart::popx::ArgExtremaOpx 

class ArgMinOpx : public popart::popx::ArgExtremaOpx 

class AsinGradOpx : public popart::popx::Opx 

Public Functions

AsinGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AsinInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

AsinInplaceOpx(Op*, Devicex*)

class AsinOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

AsinOpx(Op*, Devicex*)

class Atan2LhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx 

Public Functions

Atan2LhsInplaceOpx(Op*, Devicex*)

class Atan2Opx : public popart::popx::ElementWiseBinaryOutplaceOpx 

Public Functions

Atan2Opx(Op*, Devicex*)

class AtanGradOpx : public popart::popx::Opx 

Public Functions

AtanGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class AtanInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

AtanInplaceOpx(Op*, Devicex*)

class AtanOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

AtanOpx(Op*, Devicex*)

class BaseConcatOpx : public popart::popx::Opx 

Subclassed by popart::popx::ConcatInplaceOpx, popart::popx::ConcatOpx

Public Functions

BaseConcatOpx(Op*, Devicex*)

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class BaseExpandOpx : public popart::popx::Opx 

Subclassed by popart::popx::ExpandInplaceOpx, popart::popx::ExpandOpx

Public Functions

BaseExpandOpx(Op*, Devicex*)

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class BasePadOpx : public popart::popx::Opx 

Subclassed by popart::popx::PadInplaceOpx, popart::popx::PadOpx

Public Functions

BasePadOpx(Op*, Devicex*)

const BasePadOp &getBasePadOp() const

poplar::Tensor padGrow(poplar::Tensor inTensor, poplar::program::Sequence&, bool inPlaceAllowed) const

class BaseSliceOpx : public popart::popx::Opx 

Subclassed by popart::popx::SliceInplaceOpx, popart::popx::SliceOpx

Public Functions

BaseSliceOpx(Op*, Devicex*)

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class BaseSortOpx : public popart::popx::Opx 

Subclassed by popart::popx::TopKOpx

Public Functions

BaseSortOpx(Op*, Devicex*)

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex index) const final

std::set<TensorId> mustExistBeforeCreate(InIndex index0) const final

class BaseWhereOpx : public popart::popx::Opx 

Subclassed by popart::popx::WhereLhsInplaceOpx, popart::popx::WhereOpx, popart::popx::WhereRhsInplaceOpx

Public Functions

BaseWhereOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex inIndex) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const override

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class BatchNormGradOpx : public popart::popx::NormOpx 

Public Functions

BatchNormGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class BatchNormOpx : public popart::popx::NormOpx 

Public Functions

BatchNormOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class BinaryComparisonOpx : public popart::popx::Opx 

Subclassed by popart::popx::AndOpx, popart::popx::EqualOpx, popart::popx::GreaterOpx, popart::popx::LessOpx, popart::popx::OrOpx

Public Functions

BinaryComparisonOpx(Op*, Devicex*)

class BitwiseBinaryOpx : public popart::popx::ElementWiseBinaryOpx 

Public Functions

BitwiseBinaryOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class BitwiseNotOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

BitwiseNotOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class CallGradOpx : public popart::popx::CallOpx 

Public Functions

CallGradOpx(Op*, Devicex*)

class CallOpx : public popart::popx::SubgraphOpx 

Subclassed by popart::popx::CallGradOpx

Public Functions

CallOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

void grow(std::vector<poplar::program::Sequence>&) const final

InputCreatorType getInputCreatorType(InIndex) const

class CastGradOpx : public popart::popx::CastOpx 

Public Functions

CastGradOpx(Op*, Devicex*)

class CastOpx : public popart::popx::Opx 

Subclassed by popart::popx::CastGradOpx

Public Functions

CastOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class CeilInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

CeilInplaceOpx(Op*, Devicex*)

class CeilOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

CeilOpx(Op*, Devicex*)

class ClipGradOpx : public popart::popx::Opx 

Public Functions

ClipGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ClipInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

ClipInplaceOpx(Op*, Devicex*)

class ClipOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

ClipOpx(Op*, Devicex*)

class CollectivesBaseOpx : public popart::popx::Opx 

Subclassed by popart::popx::MultiCollectiveBaseOpx, popart::popx::ReplicatedAllGatherOpx, popart::popx::ReplicatedAllReduceOpx, popart::popx::ReplicatedReduceScatterOpx

Public Functions

CollectivesBaseOpx(Op*, Devicex*)

ReplicatedTensorShardingGroup getCollectiveLinkedGroup(ReplicatedTensorShardingIndicesIndex groupIndex) const

Function to determine which collective Ops need to be in the same collective linked group.

Ops in the same collective linked group need to use the same collective balanced reorder to ensure that the layouts of tensors that interact with each other in the graph are compatible.

Scenarios leading to collective Ops belonging to the same group:

1. The CollectivesBaseOp::getCollectiveLinkedIndex() is connected to the same root tensor (for example, tensor A connects to the getCollectiveLinkedIndex of a ReduceScatter and an AllGather, directly or indirectly):

   A -> ReduceScatter -> IdentityOp -> AllGather

2. The RTS-enabled input/output tensors of RTS-enabled collective operations meet in the compute graph:

   B -> ReduceScatter -> C -> AllGather -> F -> ReduceScatter -> G
                          \
                           VarUpdateOp
                          /
   D -> ReduceScatter -> E

C, E and the VarUpdateOp in this graph are replicated tensor sharded (RTS). Therefore, the two ReduceScatter Ops that produce C and E, and the AllGather Op that consumes C, end up in the same collective linked group. B, D, F and G are not sharded, so the ReduceScatter between F and G can be in a different collective linked group.

The primary motivation for collective linked groups is “folding” multiple RTS tensors together, for example via outlining. Folding in this context is when two operations or tensors that were unique now use the same code or memory, which implies that, for example, tensor layouts need to be identical too. If the graph has 3 RTS-enabled variables, for example, and 2 of them use the same VarUpdateOp due to outlining, we need to ensure that all RTS-related Ops connected to those 2 variables use an identical CBR (collective balanced reorder) rearrangement.

CBR is set in the collective Ops themselves during Opx::unwindTensorLayout, Opx::createInput or Opx::grow by calling createCollectiveBalancedReorder.

The third variable would use a separate VarUpdateOp, and is therefore in a separate collective linked group. It can instantiate its own CBR, even if the tensor shapes match.

getCollectiveLinkedGroup uses Ops that introduce RTS/CBR as a starting point (ReduceScatter and AllGather) and tracks all associated Ops that propagate RTS with a depth-first search (DFS) on the graph.

Parameters: groupIndex – The index of the rtsIndices for which to return the collective group.
Returns: All linked tensors and their connected ops, used to coordinate the tensor mapping of collective inputs and outputs.
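The depth-first search described above can be sketched on a toy graph that mirrors the second scenario. This is an illustration only: the `Graph` type, the node names and the `linkedGroup` helper are hypothetical, and an edge here simply means "RTS propagates between these two nodes", which is a simplification of what the real implementation tracks.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

namespace sketch {

// Undirected adjacency list. An edge means RTS propagates between two nodes.
using Graph = std::vector<std::vector<std::size_t>>;

// Nodes of the second scenario that matter for grouping (names are illustrative).
enum Node : std::size_t {
  ReduceScatterBC, TensorC, AllGatherCF, VarUpdate,
  TensorE, ReduceScatterDE, ReduceScatterFG, NumNodes
};

inline Graph exampleGraph() {
  Graph g(NumNodes);
  auto link = [&g](std::size_t a, std::size_t b) {
    g[a].push_back(b);
    g[b].push_back(a);
  };
  link(ReduceScatterBC, TensorC);
  link(TensorC, AllGatherCF);
  link(TensorC, VarUpdate);
  link(TensorE, VarUpdate);
  link(ReduceScatterDE, TensorE);
  // ReduceScatterFG has no RTS edges because F and G are not sharded.
  return g;
}

// Iterative depth-first search: collect every node reachable from start.
inline std::set<std::size_t> linkedGroup(const Graph &g, std::size_t start) {
  assert(start < g.size());
  std::set<std::size_t> visited;
  std::vector<std::size_t> stack{start};
  while (!stack.empty()) {
    const std::size_t node = stack.back();
    stack.pop_back();
    if (!visited.insert(node).second) {
      continue; // already visited
    }
    for (std::size_t next : g[node]) {
      if (visited.count(next) == 0) {
        stack.push_back(next);
      }
    }
  }
  return visited;
}

} // namespace sketch
```

Starting from `ReduceScatterBC`, the search reaches C, the AllGather, the VarUpdateOp, E and the second ReduceScatter, so those nodes form one group. `ReduceScatterFG` is never reached and forms a group of its own.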

gcl::CollectiveBalancedReorder *getCollectiveBalancedReorder(ReplicatedTensorShardingIndicesIndex groupIndex) const

Get the existing CBR.

Parameters: groupIndex – The index of the rtsIndices for which to return the collective group.
Returns: Existing CBR for the input/output tensor of the collective Op

gcl::CollectiveBalancedReorder *createCollectiveBalancedReorder(poplar::Tensor tensor, ReplicatedTensorShardingIndicesIndex groupIndex) const

Create a new CBR instance for the reference tensor.

Parameters:

tensor – Non-sharded reference tensor.
groupIndex – The index of the rtsIndices for which to return the collective group.

Returns: New CBR for the input/output tensor of the collective Op.
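The get/create pair above suggests a "look up first, create only if missing" pattern, so that all ops in one collective linked group end up sharing a single CBR. The sketch below shows that pattern with a plain registry keyed by a group id. The `Reorder` and `ReorderRegistry` types, the integer group id and the `getOrCreate` helper are all hypothetical stand-ins and are not part of the PopART or GCL APIs.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>

namespace sketch {

// Hypothetical stand-in for a reorder object built from a reference tensor.
struct Reorder {
  int referenceElements; // size of the non-sharded reference tensor
};

class ReorderRegistry {
public:
  // Return the existing reorder for this group, or nullptr if there is none.
  Reorder *get(int groupId) const {
    auto it = reorders_.find(groupId);
    return it == reorders_.end() ? nullptr : it->second.get();
  }

  // Create and store a new reorder for this group.
  Reorder *create(int groupId, int referenceElements) {
    assert(referenceElements >= 0);
    std::unique_ptr<Reorder> reorder(new Reorder{referenceElements});
    Reorder *raw = reorder.get();
    reorders_[groupId] = std::move(reorder);
    return raw;
  }

  // Reuse the group's reorder if it exists, otherwise create it.
  Reorder *getOrCreate(int groupId, int referenceElements) {
    Reorder *existing = get(groupId);
    return existing != nullptr ? existing : create(groupId, referenceElements);
  }

private:
  std::map<int, std::unique_ptr<Reorder>> reorders_;
};

} // namespace sketch
```

Two ops in the same group receive the same pointer from `getOrCreate`, while an op in a different group receives a separate object even when the reference tensor has the same size.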

class ConcatGradOpx : public popart::popx::Opx 

Public Functions

ConcatGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ConcatInplaceOpx : public popart::popx::BaseConcatOpx 

Public Functions

ConcatInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ConcatOpx : public popart::popx::BaseConcatOpx 

Public Functions

ConcatOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ConvFlipWeightsGradOpx : public popart::popx::Opx 

Public Functions

ConvFlipWeightsGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ConvOpx : public popart::popx::MultiConvBaseOpx 

Public Functions

ConvOpx(Op*, Devicex*)

poplar::Tensor createWeightsInput(const poplar::DebugNameAndId &dnai, int convIndex) const final

poplar::Tensor createDataInput(const poplar::DebugNameAndId &dnai, int convIndex) const final

InputCreatorType getInputCreatorType(InIndex idx) const override

std::vector<poplar::Tensor> convolve(poplar::program::Sequence&, const std::vector<poplar::Tensor> &weights) const final

class ConvWeightsGradOpx : public popart::popx::MultiConvWeightsGradBaseOpx 

Public Functions

ConvWeightsGradOpx(Op*, Devicex*)

std::vector<poplar::Tensor> calculateWeightDeltas(poplar::program::Sequence&) const final

class CopyVarUpdateOpx : public popart::popx::VarUpdateOpx 

Public Functions

CopyVarUpdateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const final

class CosOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

CosOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class CtcBeamSearchDecoderOpx : public popart::popx::Opx 

Public Functions

CtcBeamSearchDecoderOpx(Op *op, Devicex *device)

~CtcBeamSearchDecoderOpx()

void grow(poplar::program::Sequence &prog) const final

class CtcGradOpx : public popart::popx::Opx 

Public Functions

CtcGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class CtcOpx : public popart::popx::Opx 

Public Functions

CtcOpx(Op*, Devicex*)

~CtcOpx()

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const override

InputCreatorType getInputCreatorType(InIndex index) const override

std::set<TensorId> mustExistBeforeCreate(InIndex index) const override

class CumSumGradOpx : public popart::popx::Opx 

Public Functions

CumSumGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class CumSumOpx : public popart::popx::Opx 

Public Functions

CumSumOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class DetachInplaceOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

DetachInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class DetachOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

DetachOpx(popart::Op*, popart::popx::Devicex*)

void grow(poplar::program::Sequence&) const

class DivOpx : public popart::popx::ElementWiseBinaryOpx 

Public Functions

DivOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class DropoutOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

DropoutOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

InputCreatorType getInputCreatorType(InIndex) const override

class DynamicAddInplaceOpx : public popart::popx::DynamicAddOpx 

Public Functions

inline DynamicAddInplaceOpx(Op *op, Devicex *devicex)

poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const override

class DynamicAddOpx : public popart::popx::DynamicUpdateOpx 

Subclassed by popart::popx::DynamicAddInplaceOpx

Public Functions

inline DynamicAddOpx(Op *op, Devicex *devicex)

void grow(poplar::program::Sequence&) const final

class DynamicSliceInplaceOpx : public popart::popx::DynamicSliceOpx 

Public Functions

DynamicSliceInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class DynamicSliceOpx : public popart::popx::Opx 

Subclassed by popart::popx::DynamicSliceInplaceOpx

Public Functions

DynamicSliceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

InputCreatorType getInputCreatorType(InIndex index) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const final

class DynamicUpdateInplaceOpx : public popart::popx::DynamicUpdateOpx 

Public Functions

DynamicUpdateInplaceOpx(Op*, Devicex*)

poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const override

class DynamicUpdateOpx : public popart::popx::Opx 

Subclassed by popart::popx::DynamicAddOpx, popart::popx::DynamicUpdateInplaceOpx

Public Functions

DynamicUpdateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

InputCreatorType getInputCreatorType(InIndex index) const override

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const final

virtual poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const

class DynamicZeroInplaceOpx : public popart::popx::DynamicZeroOpx 

Public Functions

inline DynamicZeroInplaceOpx(Op *op, Devicex *devicex)

poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const override

class DynamicZeroOpx : public popart::popx::Opx 

Subclassed by popart::popx::DynamicZeroInplaceOpx

Public Functions

inline DynamicZeroOpx(Op *op, Devicex *devicex)

void grow(poplar::program::Sequence&) const override

InputCreatorType getInputCreatorType(InIndex index) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

virtual poplar::Tensor cloneNcopyOpt(poplar::program::Sequence&, const poplar::Tensor&) const

class ElementWiseBinaryInplaceOpx : public popart::popx::ElementWiseBinaryOpx 

Subclassed by popart::popx::AddLhsInplaceOpx, popart::popx::AddRhsInplaceOpx, popart::popx::Atan2LhsInplaceOpx, popart::popx::MulLhsInplaceOpx, popart::popx::MulRhsInplaceOpx, popart::popx::PowLhsInplaceOpx

Public Functions

inline ElementWiseBinaryInplaceOpx(Op *op, Devicex *devx, std::unique_ptr<EwbComputex> cx_)

void grow(poplar::program::Sequence&) const final

class ElementWiseBinaryOpx : public popart::popx::Opx 

Subclassed by popart::popx::BitwiseBinaryOpx, popart::popx::DivOpx, popart::popx::ElementWiseBinaryInplaceOpx, popart::popx::ElementWiseBinaryOutplaceOpx, popart::popx::FmodOpx, popart::popx::PReluOpx, popart::popx::SubtractOpx

Public Functions

ElementWiseBinaryOpx(Op*, Devicex*)

InputCreatorType getInputCreatorType(InIndex) const override

std::set<TensorId> mustExistBeforeCreate(InIndex) const override

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const override

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const override

view::RegMap unwindRegion(InIndex, OutIndex) const override

class ElementWiseBinaryOutplaceOpx : public popart::popx::ElementWiseBinaryOpx 

Subclassed by popart::popx::AddOpx, popart::popx::Atan2Opx, popart::popx::MulOpx, popart::popx::PowOpx

Public Functions

inline ElementWiseBinaryOutplaceOpx(Op *op, Devicex *devx, std::unique_ptr<EwbComputex> cx_)

void grow(poplar::program::Sequence&) const final

class ElementWiseUnaryInplaceOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

inline ElementWiseUnaryInplaceOpx(Op *op, Devicex *devx, std::unique_ptr<EwuComputex> cx_)

void grow(poplar::program::Sequence &prog) const final

class ElementWiseUnaryOpx : public popart::popx::Opx 

Subclassed by popart::popx::AbsOpx, popart::popx::BitwiseNotOpx, popart::popx::CosOpx, popart::popx::DetachInplaceOpx, popart::popx::DetachOpx, popart::popx::DropoutOpx, popart::popx::ElementWiseUnaryInplaceOpx, popart::popx::ElementWiseUnaryOutplaceOpx, popart::popx::ErfxGradOpx, popart::popx::ErfxOpx, popart::popx::IdentityGradOpx, popart::popx::IdentityOpx, popart::popx::IsInfx, popart::popx::IsNaNx, popart::popx::LogOpx, popart::popx::LogSoftmaxGradOpx, popart::popx::MeanOpx, popart::popx::NegateGradOpx, popart::popx::NegateOpx, popart::popx::NotOpx, popart::popx::ReciprocalOpx, popart::popx::SigmoidGradOpx, popart::popx::SinOpx, popart::popx::SoftmaxGradOpx, popart::popx::SqrtOpx, popart::popx::SquareOpx

Public Functions

ElementWiseUnaryOpx(Op*, Devicex*)

InputCreatorType getInputCreatorType(InIndex) const override

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const override

view::RegMap unwindRegion(InIndex, OutIndex) const override

class ElementWiseUnaryOutplaceOpx : public popart::popx::ElementWiseUnaryOpx 

Subclassed by popart::popx::AsinOpx, popart::popx::AtanOpx, popart::popx::CeilOpx, popart::popx::ClipOpx, popart::popx::EluOpx, popart::popx::Expm1Opx, popart::popx::ExpOpx, popart::popx::FloorOpx, popart::popx::GeluErfOpx, popart::popx::GeluOpx, popart::popx::HardSigmoidOpx, popart::popx::IncrementModOpx, popart::popx::LeakyReluOpx, popart::popx::Log1pOpx, popart::popx::LogSoftmaxOpx, popart::popx::NearbyIntOpx, popart::popx::ReluOpx, popart::popx::RoundOpx, popart::popx::ScaleOpx, popart::popx::SeluOpx, popart::popx::ShrinkOpx, popart::popx::SigmoidOpx, popart::popx::SignOpx, popart::popx::SinhOpx, popart::popx::SoftmaxOpx, popart::popx::SoftPlusOpx, popart::popx::SoftSignOpx, popart::popx::SwishOpx, popart::popx::ThresholdedReluOpx

Public Functions

ElementWiseUnaryOutplaceOpx(Op*, Devicex*, std::unique_ptr<EwuComputex> cx_)

void grow(poplar::program::Sequence&) const final

class EluGradOpx : public popart::popx::Opx 

Public Functions

EluGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class EluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

EluInplaceOpx(Op*, Devicex*)

class EluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

EluOpx(Op*, Devicex*)

class EqualOpx : public popart::popx::BinaryComparisonOpx 

Public Functions

EqualOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ErfxGradOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

ErfxGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ErfxOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

ErfxOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ExchangeBaseOpx : public popart::popx::Opx 

Subclassed by popart::popx::HostBaseOpx, popart::popx::MultiExchangeOpx, popart::popx::RemoteBaseOpx, popart::popx::RemoteCodeLoadOpx

Public Functions

ExchangeBaseOpx(Op*, Devicex*)

inline std::set<TensorId> mustExistBeforeCreate(int) const override

class ExpInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

ExpInplaceOpx(Op*, Devicex*)

class ExpOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

ExpOpx(Op*, Devicex*)

class ExpandGradOpx : public popart::popx::Opx 

Public Functions

ExpandGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ExpandInplaceOpx : public popart::popx::BaseExpandOpx 

Public Functions

ExpandInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ExpandOpx : public popart::popx::BaseExpandOpx 

Public Functions

ExpandOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class Expm1InplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

Expm1InplaceOpx(Op*, Devicex*)

class Expm1Opx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

Expm1Opx(Op*, Devicex*)

class FloorInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

FloorInplaceOpx(Op*, Devicex*)

class FloorOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

FloorOpx(Op*, Devicex*)

class FmodOpx : public popart::popx::ElementWiseBinaryOpx 

Public Functions

FmodOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class GRUGradOpx : public popart::popx::Opx 

Public Functions

GRUGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class GRUOpx : public popart::popx::Opx 

Public Functions

GRUOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const

Public Static Functions

static poplar::Tensor reshapePoplibWeightsForOnnx(poplar::Tensor)

static poplar::Tensor reshapePoplibBiasesForOnnx(poplar::Tensor)

class GatherBaseOpx : public popart::popx::Opx 

Subclassed by popart::popx::GatherOpx, popart::popx::TiedGatherOpx

Public Functions

GatherBaseOpx(Op*, Devicex*)

virtual void grow(poplar::program::Sequence&) const override = 0

virtual poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const override = 0

virtual InputCreatorType getInputCreatorType(int index0) const override = 0

std::set<TensorId> mustExistBeforeCreate(InIndex) const final

class GatherGradOpx : public popart::popx::Opx 

Public Functions

GatherGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex index) const final

inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final

Public Static Functions

static std::tuple<poplar::Tensor, poplar::Tensor, poplar::Tensor> handleNDMultiUpdate(poplar::Tensor target, poplar::Tensor update, poplar::Tensor indices, int64_t axis, int64_t group_size)

class GatherOpx : public popart::popx::GatherBaseOpx 

Public Functions

GatherOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(int index, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(int index) const final

class GeluGradOpx : public popart::popx::Opx 

Public Functions

GeluGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class GeluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

GeluInplaceOpx(Op*, Devicex*)

class GeluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

GeluOpx(Op*, Devicex*)

class GeluErfGradOpx : public popart::popx::Opx 

Public Functions

GeluErfGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class GeluErfInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

GeluErfInplaceOpx(Op*, Devicex*)

class GeluErfOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

GeluErfOpx(Op*, Devicex*)

class GetRandomSeedOpx : public popart::popx::Opx 

Public Functions

GetRandomSeedOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class GreaterOpx : public popart::popx::BinaryComparisonOpx 

Public Functions

GreaterOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class GroupNormGradOpx : public popart::popx::NormOpx 

Public Functions

GroupNormGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class GroupNormOpx : public popart::popx::NormOpx 

Public Functions

GroupNormOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class HardSigmoidGradOpx : public popart::popx::Opx 

Public Functions

HardSigmoidGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class HardSigmoidInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

HardSigmoidInplaceOpx(Op*, Devicex*)

class HardSigmoidOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

HardSigmoidOpx(Op*, Devicex*)

class HistogramOpx : public popart::popx::Opx 

Public Functions

HistogramOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class HostBaseOpx : public popart::popx::ExchangeBaseOpx 

Subclassed by popart::popx::HostLoadOpx, popart::popx::HostStoreOpx

Public Functions

HostBaseOpx(Op*, Devicex*)

class HostLoadInplaceOpx : public popart::popx::HostLoadOpx 

Public Functions

HostLoadInplaceOpx(Op*, Devicex*)

class HostLoadOpx : public popart::popx::HostBaseOpx 

Subclassed by popart::popx::HostLoadInplaceOpx

Public Functions

HostLoadOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class HostStoreOpx : public popart::popx::HostBaseOpx 

Public Functions

HostStoreOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex) const final

class IdentityGradOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

IdentityGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class IdentityInplaceOpx : public popart::popx::Opx 

Public Functions

IdentityInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class IdentityLossGradOpx : public popart::popx::Opx 

Public Functions

IdentityLossGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

inline bool outputCreatedExternally(OutIndex) const final

class IdentityLossOpx : public popart::popx::Opx 

Public Functions

IdentityLossOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const override

view::RegMap unwindRegion(InIndex, OutIndex) const override

class IdentityOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

IdentityOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class IfGradOpx : public popart::popx::IfOpx 

Public Functions

IfGradOpx(Op*, Devicex*)

class IfOpx : public popart::popx::Opx 

Subclassed by popart::popx::IfGradOpx

Public Functions

IfOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class IncrementModInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

IncrementModInplaceOpx(Op*, Devicex*)

class IncrementModOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

IncrementModOpx(Op*, Devicex*)

class InitOpx : public popart::popx::Opx 

Public Functions

InitOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

inline bool outputCreatedExternally(OutIndex) const final

class InstanceNormGradOpx : public popart::popx::NormOpx 

Public Functions

InstanceNormGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class InstanceNormOpx : public popart::popx::NormOpx 

Public Functions

InstanceNormOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class IoTileCopyOpx : public popart::popx::Opx 

Public Functions

IoTileCopyOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex index) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

inline bool outputCreatedExternally(OutIndex) const final

class IpuCopyOpx : public popart::popx::Opx 

Public Functions

IpuCopyOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

PreparedCopyTensors createPipelinedOutput() const

void growPipelined(poplar::program::Sequence&, PreparedCopyTensors) const

inline InputCreatorType getInputCreatorType(InIndex index) const final

inline bool canUnwind(InIndex in, OutIndex out) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

poplar::Graph &srcGraph(InIndex) const final

poplar::Graph &dstGraph(OutIndex) const final

class L1GradOpx : public popart::popx::Opx 

Public Functions

L1GradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class L1Opx : public popart::popx::Opx 

Public Functions

L1Opx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex, OutIndex) const override

view::RegMap unwindRegion(InIndex, OutIndex) const override

class LRNGradOpx : public popart::popx::Opx 

Public Functions

LRNGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class LRNOpx : public popart::popx::Opx 

Public Functions

LRNOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class LSTMGradOpx : public popart::popx::Opx 

Public Functions

LSTMGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class LSTMOpx : public popart::popx::Opx 

Public Functions

LSTMOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const

popnn::lstm::LstmParams createLSTMParams() const

Public Static Functions

static poplar::Tensor reshapePoplibWeightsForOnnx(poplar::Tensor, bool transpose)

class LambSquareOpx : public popart::popx::Opx 

Public Functions

LambSquareOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class LeakyReluGradOpx : public popart::popx::Opx 

Public Functions

LeakyReluGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class LeakyReluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

LeakyReluInplaceOpx(Op*, Devicex*)

class LeakyReluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

LeakyReluOpx(Op*, Devicex*)

class LessOpx : public popart::popx::BinaryComparisonOpx 

Public Functions

LessOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class Log1pInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

Log1pInplaceOpx(Op*, Devicex*)

class Log1pOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

Log1pOpx(Op*, Devicex*)

class LogOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

LogOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class LogSoftmaxGradOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

LogSoftmaxGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor cloneNcopyGrouped(poplar::program::Sequence &s, const poplar::Tensor &t) const

class LogSoftmaxInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

LogSoftmaxInplaceOpx(Op*, Devicex*)

class LogSoftmaxOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

LogSoftmaxOpx(Op*, Devicex*)

class LoopOpx : public popart::popx::SubgraphOpx 

Public Functions

LoopOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const

bool canUnwind(InIndex in, OutIndex out) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class LossScaleUpdateOpx : public popart::popx::Opx 

Public Functions

LossScaleUpdateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class MatMulOpx : public popart::popx::Opx 

Public Functions

MatMulOpx(Op*, Devicex*)

~MatMulOpx() override = default

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex index) const final

std::set<TensorId> mustExistBeforeCreate(InIndex index0) const final

MatMulOp *getMatMulOp() const

void grow(poplar::program::Sequence&) const final

poplar::Type getOutputType(const poplar::Tensor &output) const

void verifyCacheSizeUnchanged(size_t beforeCacheSize) const

Public Static Functions

static std::vector<std::size_t> onnxShapeToPoplar(const Shape &shape)

static void appendPoplarOptionsForOp(const MatMulBaseOp &op, poplar::OptionFlags &opts)

static void addPartialsType(const MatMulPartialsType &partialsType, poplar::OptionFlags &opts)

static std::pair<poplar::Tensor, poplar::Tensor> groupedMatMulInputsFromOpxInputs(MatMulBaseOp &matmul, poplar::Tensor lhs, poplar::Tensor rhs)

class MaxArgGradOpx : public popart::popx::Opx 

Public Functions

MaxArgGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class MaxOpx : public popart::popx::Opx 

Public Functions

MaxOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const override

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const override

view::RegMap unwindRegion(InIndex, OutIndex) const override

class MeanArgGradOpx : public popart::popx::Opx 

Public Functions

MeanArgGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class MeanOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

MeanOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class MinArgGradOpx : public popart::popx::Opx 

Public Functions

MinArgGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class MinOpx : public popart::popx::Opx 

Public Functions

MinOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const override

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const override

view::RegMap unwindRegion(InIndex, OutIndex) const override

class ModifyRandomSeedOpx : public popart::popx::Opx 

Public Functions

ModifyRandomSeedOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class MulLhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx 

Public Functions

MulLhsInplaceOpx(Op*, Devicex*)

class MulOpx : public popart::popx::ElementWiseBinaryOutplaceOpx 

Public Functions

MulOpx(Op*, Devicex*)

class MulRhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx 

Public Functions

MulRhsInplaceOpx(Op*, Devicex*)

class MultiCollectiveBaseOpx : public popart::popx::CollectivesBaseOpx 

A base class for the lowering of different subclasses of MultiCollectiveBaseOp.

Each output tensor can be grown separately.

Subclassed by popart::popx::MultiReplicatedAllGatherOpx, popart::popx::MultiReplicatedAllReduceOpx, popart::popx::MultiReplicatedReduceScatterOpx

Public Functions

MultiCollectiveBaseOpx(Op *op, Devicex *devicex)

std::set<OpxGrowPartId> getInGrowPartIds(Tensor *inTensor) const override

Defines which “parts” use a particular input tensor. There are “output->n()” parts in the collective operation: part “i” uses input “i” and the indices tensor at “i + output->n()”. This logic is the same for all collective ops, even in the absence of an indices tensor.

Parameters: inTensor – The tensor for which to return the part IDs.

OpxGrowPartId getOutGrowPartId(Tensor *outTensor) const override

Defines which “part” is responsible for constructing a particular output. There are “output->n()” parts: each part “i” produces output “i”.

Parameters: outTensor – The tensor for which to return the corresponding part ID.
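The part numbering described for getInGrowPartIds() and getOutGrowPartId() can be illustrated with a small standalone sketch. It is not the PopART implementation: PartId, inGrowPartIds and outGrowPartId are hypothetical names, and plain integer positions stand in for the Tensor* arguments. It assumes only what the descriptions state, namely that with n outputs, input “i” and the indices tensor at “i + n” both belong to part “i”, and that output “i” is produced by part “i”.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for a grow-part identifier; not a PopART type.
using PartId = int;

// Parts that consume the input at position inIndex, for a collective op
// with numOutputs outputs. Positions [0, n) are the data inputs and
// positions [n, 2n) are the optional indices tensors.
std::set<PartId> inGrowPartIds(int inIndex, int numOutputs) {
  assert(numOutputs > 0);
  assert(inIndex >= 0 && inIndex < 2 * numOutputs);
  const PartId part = inIndex < numOutputs ? inIndex : inIndex - numOutputs;
  return {part};
}

// Part responsible for constructing the output at position outIndex.
PartId outGrowPartId(int outIndex, int numOutputs) {
  assert(outIndex >= 0 && outIndex < numOutputs);
  return outIndex;
}
```

With three outputs, for example, input 1 and input 4 (the indices tensor paired with input 1) both map to part 1, which is also the part that produces output 1.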

class MultiConvBaseOpx : public popart::popx::Opx 

Subclassed by popart::popx::ConvOpx, popart::popx::MultiConvOpx

Public Functions

inline MultiConvBaseOpx(Op *op, Devicex *dv)

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

std::set<TensorId> mustExistBeforeCreate(InIndex index0) const final

InputCreatorType getInputCreatorType(InIndex) const override

void grow(poplar::program::Sequence&) const final

poplar::OptionFlags getConvOptions(int, std::string pass = "") const

std::string getFwdPassFlagString() const

inline virtual std::vector<poplar::Tensor> convolve(poplar::program::Sequence &prog, const std::vector<poplar::Tensor> &weights) const

inline virtual poplar::Tensor createDataInput(const poplar::DebugNameAndId &dnai, int convIndex) const

inline virtual poplar::Tensor createWeightsInput(const poplar::DebugNameAndId &dnai, int convIndex) const

bool isWeightsInIndex(InIndex) const

bool isDataInIndex(InIndex) const

void verifyCacheSizeUnchanged(size_t beforeCacheSize) const

class MultiConvOpx : public popart::popx::MultiConvBaseOpx 

Public Functions

MultiConvOpx(Op*, Devicex*)

poplar::Tensor createWeightsInput(const poplar::DebugNameAndId &dnai, int convIndex) const final

poplar::Tensor createDataInput(const poplar::DebugNameAndId &dnai, int convIndex) const final

std::vector<poplar::Tensor> convolve(poplar::program::Sequence&, const std::vector<poplar::Tensor>&) const final

class MultiConvWeightsGradBaseOpx : public popart::popx::Opx 

Subclassed by popart::popx::ConvWeightsGradOpx, popart::popx::MultiConvWeightsGradOpx

Public Functions

inline MultiConvWeightsGradBaseOpx(Op *op, Devicex *dv)

void grow(poplar::program::Sequence&) const final

inline virtual std::vector<poplar::Tensor> calculateWeightDeltas(poplar::program::Sequence&) const

poplar::OptionFlags getConvOptions(int convIndex = 0) const

void verifyCacheSizeUnchanged(size_t beforeCacheSize) const

class MultiConvWeightsGradOpx : public popart::popx::MultiConvWeightsGradBaseOpx 

Public Functions

MultiConvWeightsGradOpx(Op*, Devicex*)

std::vector<poplar::Tensor> calculateWeightDeltas(poplar::program::Sequence&) const final

class MultiExchangeOpx : public popart::popx::ExchangeBaseOpx 

Public Functions

MultiExchangeOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

std::vector<std::pair<int, int>> getSegments() const

std::set<OpxGrowPartId> getInGrowPartIds(Tensor *inTensor) const final

OpxGrowPartId getOutGrowPartId(Tensor *outTensor) const final

void growPart(OpxGrowPartId id) const final

InputCreatorType getInputCreatorType(InIndex index) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

bool canUnwind(InIndex, OutIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class MultiReplicatedAllReduceOpx : public popart::popx::MultiCollectiveBaseOpx 

Lowers the MultiReplicatedAllReduceOp to Poplar by growing each individual output tensor and performing a to-destination all-reduce on a concatenation of the input tensors.

In-place and out-place all-reduce operations can be mixed in the same op.
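The idea of reducing a concatenation of the inputs, rather than each input separately, can be sketched on the host with plain vectors. This is only an illustration under stated assumptions: it makes no Poplar or GCL calls, the names Tensor, ReplicaInputs and concatAllReduce are hypothetical, summation is assumed as the reduction operator, and the distinction between in-place and out-place outputs is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Tensor = std::vector<float>;         // hypothetical flattened tensor
using ReplicaInputs = std::vector<Tensor>; // input tensors held by one replica

// Sum-reduces matching tensors across replicas by accumulating into one
// concatenated buffer, then slices that buffer back into one output per input.
std::vector<Tensor> concatAllReduce(const std::vector<ReplicaInputs> &replicas) {
  assert(!replicas.empty());
  const ReplicaInputs &first = replicas.front();

  // Length of the concatenated buffer.
  std::size_t total = 0;
  for (const Tensor &t : first) {
    total += t.size();
  }

  // One reduction over the whole concatenation.
  Tensor reduced(total, 0.0f);
  for (const ReplicaInputs &inputs : replicas) {
    assert(inputs.size() == first.size());
    std::size_t offset = 0;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
      assert(inputs[i].size() == first[i].size());
      for (float v : inputs[i]) {
        reduced[offset++] += v;
      }
    }
  }

  // Slice the reduced buffer back into per-output tensors.
  std::vector<Tensor> outputs;
  std::size_t offset = 0;
  for (const Tensor &t : first) {
    const auto begin = reduced.begin() + static_cast<std::ptrdiff_t>(offset);
    outputs.emplace_back(begin, begin + static_cast<std::ptrdiff_t>(t.size()));
    offset += t.size();
  }
  return outputs;
}
```

In this sketch every replica receives the same list of reduced tensors, so a single result is returned. Working on one concatenated buffer means one reduction is issued instead of one per tensor, which is the motivation the class description suggests.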

Public Functions

MultiReplicatedAllReduceOpx(popart::Op *op, Devicex *devicex)

InputCreatorType getInputCreatorType(InIndex) const override

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex in, OutIndex out) const override

view::RegMap unwindRegion(InIndex, OutIndex) const override

void growPart(OpxGrowPartId id) const override

void grow(poplar::program::Sequence &prog) const override

class NegateGradOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

NegateGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class NegateOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

NegateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class NllGradOpx : public popart::popx::Opx 

Public Functions

NllGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class NllOpx : public popart::popx::Opx 

Public Functions

NllOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

Public Static Functions

static void flattenAndEncodeOneHot(const Opx &opx, poplar::program::Sequence &prog, const poplar::Tensor &probs, const poplar::Tensor &label, poplar::Tensor &probs2D, poplar::Tensor &label1D, poplar::Tensor &oneHot)

static poplar::Tensor applyMaskInPlaceForIgnoredIndex(const Opx &opx, poplar::Tensor t, poplar::Tensor labels, int ignoreIndex, poplar::program::Sequence &prog)

static void applyScalingInPlaceForMeanReduction(const Opx &opx, poplar::Tensor t, poplar::Tensor scale, poplar::program::Sequence &prog)

static void applyScalingInPlaceForMeanReductionWithIgnoreIndex(const Opx &opx, poplar::Tensor t, poplar::Tensor scale, poplar::Tensor mask, poplar::program::Sequence &prog)

static void handleLossGradScaling(const Opx &opx, bool hasIgnoreIndex, int64_t ignoreIndex, bool meanReduce, poplar::Tensor &oneHot, poplar::Tensor &gradIn, poplar::Tensor &label1D, poplar::program::Sequence &prog)

static void handleLossOutReducedToScalar(const Opx &opx, bool hasIgnoreIndex, int64_t ignoreIndex, bool meanReduce, poplar::Tensor &reduction, poplar::Tensor &label1D, poplar::program::Sequence &prog, const OutIndex outIdx)

static void handleLossOutNotReducedToScalar(const Opx &opx, poplar::Tensor &reduction, const poplar::Tensor &label, poplar::Tensor &label1D, poplar::program::Sequence &prog)

class NlllWithSoftmaxGradDirectOpx : public popart::popx::Opx 

Public Functions

NlllWithSoftmaxGradDirectOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class NopOpx : public popart::popx::Opx 

Public Functions

NopOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class NormOpx : public popart::popx::Opx 

Subclassed by popart::popx::BatchNormGradOpx, popart::popx::BatchNormOpx, popart::popx::GroupNormGradOpx, popart::popx::GroupNormOpx, popart::popx::InstanceNormGradOpx, popart::popx::InstanceNormOpx

Public Functions

NormOpx(Op*, Devicex*)

class NotOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

NotOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class NormalizeImageOpx : public popart::popx::Opx 

Public Functions

NormalizeImageOpx(popart::Op *op, popart::popx::Devicex *devicex)

poplar::Tensor createInput(popart::InIndex index, const poplar::DebugNameAndId &dnai) const override

popart::popx::InputCreatorType getInputCreatorType(popart::InIndex index) const override

std::set<popart::TensorId> mustExistBeforeCreate(popart::InIndex) const override

poplar::Tensor createNormalizedImageInput(const poplar::DebugNameAndId &dnai) const

void grow(poplar::program::Sequence &prog) const final

class OnehotGradOpx : public popart::popx::Opx 

Public Functions

OnehotGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class OnehotOpx : public popart::popx::Opx 

Public Functions

OnehotOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class OrOpx : public popart::popx::BinaryComparisonOpx 

Public Functions

OrOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class PReluOpx : public popart::popx::ElementWiseBinaryOpx 

Public Functions

PReluOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class PadGradOpx : public popart::popx::SliceOpx 

Public Functions

PadGradOpx(Op*, Devicex*)

class PadInplaceOpx : public popart::popx::BasePadOpx 

Public Functions

PadInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class PadOpx : public popart::popx::BasePadOpx 

Public Functions

PadOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

template<typename LSTMOP> class PopartLSTMOpxBase : public popart::popx::Opx 

Subclassed by popart::popx::PopartLSTMGradOpx, popart::popx::PopartLSTMOpx

Public Functions

inline PopartLSTMOpxBase(Op *op, Devicex *devicex)

class PowLhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx 

Public Functions

PowLhsInplaceOpx(Op*, Devicex*)

class PowOpx : public popart::popx::ElementWiseBinaryOutplaceOpx 

Public Functions

PowOpx(Op*, Devicex*)

class PrintTensorOpx : public popart::popx::Opx 

Public Functions

PrintTensorOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class RMSPropUpdaterOpx : public popart::popx::Opx 

Public Functions

RMSPropUpdaterOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class RNNGradOpx : public popart::popx::Opx 

Public Functions

RNNGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const

class RNNOpx : public popart::popx::Opx 

Public Functions

RNNOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const

class RandomNormalOpx : public popart::popx::Opx 

Public Functions

RandomNormalOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class RandomUniformOpx : public popart::popx::Opx 

Public Functions

RandomUniformOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ReciprocalOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

ReciprocalOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ReduceL1GradOpx : public popart::popx::Opx 

Public Functions

ReduceL1GradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceL1Opx : public popart::popx::Opx 

Public Functions

ReduceL1Opx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceL2GradOpx : public popart::popx::Opx 

Public Functions

ReduceL2GradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceL2Opx : public popart::popx::Opx 

Public Functions

ReduceL2Opx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceLogSumExpGradOpx : public popart::popx::Opx 

Public Functions

ReduceLogSumExpGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceLogSumExpOpx : public popart::popx::Opx 

Public Functions

ReduceLogSumExpOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceLogSumGradOpx : public popart::popx::Opx 

Public Functions

ReduceLogSumGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceLogSumOpx : public popart::popx::Opx 

Public Functions

ReduceLogSumOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceMaxGradOpx : public popart::popx::Opx 

Public Functions

ReduceMaxGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceMaxOpx : public popart::popx::Opx 

Public Functions

ReduceMaxOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceMeanGradOpx : public popart::popx::Opx 

Public Functions

ReduceMeanGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceMeanOpx : public popart::popx::Opx 

Public Functions

ReduceMeanOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceMedianGradOpx : public popart::popx::Opx 

Public Functions

ReduceMedianGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceMedianOpx : public popart::popx::Opx 

Public Functions

ReduceMedianOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceMinGradOpx : public popart::popx::Opx 

Public Functions

ReduceMinGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceMinOpx : public popart::popx::Opx 

Public Functions

ReduceMinOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceProdGradOpx : public popart::popx::Opx 

Public Functions

ReduceProdGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceProdOpx : public popart::popx::Opx 

Public Functions

ReduceProdOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceSumGradOpx : public popart::popx::Opx 

Public Functions

ReduceSumGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceSumOpx : public popart::popx::Opx 

Subclassed by popart::popx::AddArg0GradOpx, popart::popx::AddArg1GradOpx, popart::popx::AddBiasBiasGradOpx, popart::popx::SubtractArg0GradOpx

Public Functions

ReduceSumOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceSumSquareGradOpx : public popart::popx::Opx 

Public Functions

ReduceSumSquareGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReduceSumSquareOpx : public popart::popx::Opx 

Public Functions

ReduceSumSquareOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ReluGradOpx : public popart::popx::Opx 

Public Functions

ReluGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ReluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

ReluInplaceOpx(Op*, Devicex*)

class ReluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

ReluOpx(Op*, Devicex*)

class RemoteBaseOpx : public popart::popx::ExchangeBaseOpx 

Subclassed by popart::popx::RemoteLoadOpx, popart::popx::RemoteStoreOpx

Public Functions

RemoteBaseOpx(Op*, Devicex*)

class RemoteLoadInplaceOpx : public popart::popx::RemoteLoadOpx 

Public Functions

RemoteLoadInplaceOpx(Op*, Devicex*)

class RemoteLoadOpx : public popart::popx::RemoteBaseOpx 

Subclassed by popart::popx::RemoteLoadInplaceOpx

Public Functions

RemoteLoadOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex index) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class RemoteStoreOpx : public popart::popx::RemoteBaseOpx 

Public Functions

RemoteStoreOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ReplicatedAllGatherOpx : public popart::popx::CollectivesBaseOpx 

Public Functions

ReplicatedAllGatherOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex index) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

std::set<TensorId> mustExistBeforeCreate(InIndex index0) const final

bool hasCreatorViewChangers(InIndex index) const final

ViewChangers getCreatorViewChangers(InIndex index) const final

class ReplicatedAllReduceInplaceOpx : public popart::popx::ReplicatedAllReduceOpx 

Public Functions

ReplicatedAllReduceInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ReplicatedAllReduceOpx : public popart::popx::CollectivesBaseOpx 

Subclassed by popart::popx::ReplicatedAllReduceInplaceOpx

Public Functions

ReplicatedAllReduceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

InputCreatorType getInputCreatorType(InIndex index) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class ReplicatedReduceScatterOpx : public popart::popx::CollectivesBaseOpx 

Public Functions

ReplicatedReduceScatterOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex) const final

DnfTensorIds mustExistBeforeCreateDNF(InIndex index0) const final

bool hasCreatorViewChangers(InIndex index) const final

ViewChangers getCreatorViewChangers(InIndex index) const final

class RescaleAccumulateOpx : public popart::popx::AccumulateBaseOpx 

Public Functions

RescaleAccumulateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ReshapeBaseOpx : public popart::popx::Opx 

Subclassed by popart::popx::ReshapeInplaceOpx, popart::popx::ReshapeOpx

Public Functions

ReshapeBaseOpx(Op*, Devicex*)

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final

view::RegMap unwindRegion(InIndex inIndex, OutIndex outIndex) const final

class ReshapeGradOpx : public popart::popx::ReshapeOpx 

Public Functions

ReshapeGradOpx(Op*, Devicex*)

class ReshapeInplaceOpx : public popart::popx::ReshapeBaseOpx 

Public Functions

ReshapeInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ReshapeOpx : public popart::popx::ReshapeBaseOpx 

Subclassed by popart::popx::ReshapeGradOpx

Public Functions

ReshapeOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ResizeGradOpx : public popart::popx::Opx 

Public Functions

ResizeGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ResizeOpx : public popart::popx::Opx 

Public Functions

ResizeOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

template<typename Derived> class RestoreBaseOpx : public popart::popx::Opx 

Base class for restore opxs.

Template Parameters: Derived – A subclass of RestoreBaseOpx. It must define the type alias OpType as the Op that it corresponds to.

Public Functions

RestoreBaseOpx(Op *op, Devicex *devicex)

virtual void grow(poplar::program::Sequence&) const = 0
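The template parameter requirement above follows the curiously recurring template pattern (CRTP). The following is a minimal sketch of that pattern only, assuming stand-in types: FakeOp, FakeRestoreOp, FakeRestoreInplaceOp, FakeRestoreBaseOpx, getTypedOp() and label() are all hypothetical names invented for illustration and are not part of PopART. It shows how a base class template might use an OpType alias supplied by the subclass.

```cpp
#include <cassert>
#include <string>

// Stand-in op types for illustration only; these are NOT PopART classes.
struct FakeOp {
  virtual ~FakeOp() = default;
};

struct FakeRestoreOp : FakeOp {
  std::string label() const { return "Restore"; }
};

struct FakeRestoreInplaceOp : FakeOp {
  std::string label() const { return "RestoreInplace"; }
};

// Hypothetical CRTP base. Derived must define a type alias OpType.
template <typename Derived> class FakeRestoreBaseOpx {
public:
  explicit FakeRestoreBaseOpx(FakeOp *op) : op_(op) {}
  virtual ~FakeRestoreBaseOpx() = default;

  // Pure virtual, so each subclass supplies its own grow().
  virtual std::string grow() const = 0;

protected:
  // Deduced return type: Derived is still incomplete when this class is
  // instantiated, so Derived::OpType may only be named inside the body.
  const auto &getTypedOp() const {
    using OpType = typename Derived::OpType;
    const auto *typed = dynamic_cast<const OpType *>(op_);
    assert(typed != nullptr);
    return *typed;
  }

private:
  FakeOp *op_;
};

class FakeRestoreOpx : public FakeRestoreBaseOpx<FakeRestoreOpx> {
public:
  using OpType = FakeRestoreOp;
  explicit FakeRestoreOpx(FakeOp *op)
      : FakeRestoreBaseOpx<FakeRestoreOpx>(op) {}
  std::string grow() const override { return "grow " + getTypedOp().label(); }
};

class FakeRestoreInplaceOpx
    : public FakeRestoreBaseOpx<FakeRestoreInplaceOpx> {
public:
  using OpType = FakeRestoreInplaceOp;
  explicit FakeRestoreInplaceOpx(FakeOp *op)
      : FakeRestoreBaseOpx<FakeRestoreInplaceOpx>(op) {}
  std::string grow() const override { return "grow " + getTypedOp().label(); }
};
```

In this sketch, the shared helper in the base class recovers the concrete op type without each subclass repeating the cast. Whether the PopART base class uses OpType in exactly this way is an assumption.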

class ReverseBaseOpx : public popart::popx::Opx 

Subclassed by popart::popx::ReverseInplaceOpx, popart::popx::ReverseOpx

Public Functions

ReverseBaseOpx(Op*, Devicex*)

inline InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final

view::RegMap unwindRegion(InIndex inIndex, OutIndex outIndex) const final

class ReverseGradOpx : public popart::popx::ReverseOpx 

Public Functions

ReverseGradOpx(Op*, Devicex*)

class ReverseInplaceOpx : public popart::popx::ReverseBaseOpx 

Public Functions

ReverseInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ReverseOpx : public popart::popx::ReverseBaseOpx 

Subclassed by popart::popx::ReverseGradOpx

Public Functions

ReverseOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class RoundInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

RoundInplaceOpx(Op*, Devicex*)

class RoundOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

RoundOpx(Op*, Devicex*)

class SGD0VarUpdateOpx : public popart::popx::VarUpdateOpx 

Public Functions

SGD0VarUpdateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SGD1AcclUpdateOpx : public popart::popx::VarUpdateOpx 

Public Functions

SGD1AcclUpdateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SGD1VarUpdateOpx : public popart::popx::VarUpdateOpx 

Public Functions

SGD1VarUpdateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ScaleInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

ScaleInplaceOpx(Op*, Devicex*)

class ScaleGradOpx : public popart::popx::ScaleOpx 

Public Functions

ScaleGradOpx(Op*, Devicex*)

class ScaleOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Subclassed by popart::popx::ScaleGradOpx

Public Functions

ScaleOpx(Op*, Devicex*)

class ScaledAddLhsInplaceOpx : public popart::popx::ScaledAddOpx 

Public Functions

ScaledAddLhsInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ScaledAddOpx : public popart::popx::Opx 

Subclassed by popart::popx::ScaledAddLhsInplaceOpx, popart::popx::ScaledAddRhsInplaceOpx

Public Functions

ScaledAddOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ScaledAddRhsInplaceOpx : public popart::popx::ScaledAddOpx 

Public Functions

ScaledAddRhsInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ScaledVarUpdateOpx : public popart::popx::VarUpdateOpx 

Public Functions

ScaledVarUpdateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ScatterDataGradOpx : public popart::popx::Opx 

Public Functions

ScatterDataGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex index) const final

inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final

class ScatterOpx : public popart::popx::ScatterReduceOpx 

Public Functions

ScatterOpx(Op*, Devicex*)

class ScatterReduceGradOpx : public popart::popx::Opx 

Public Functions

ScatterReduceGradOpx(Op*, Devicex*)

~ScatterReduceGradOpx()

void grow(poplar::program::Sequence&) const final override

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final override

InputCreatorType getInputCreatorType(InIndex index) const final override

inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final override

class ScatterReduceOpx : public popart::popx::Opx 

Subclassed by popart::popx::ScatterOpx

Public Functions

ScatterReduceOpx(Op*, Devicex*)

~ScatterReduceOpx()

void grow(poplar::program::Sequence&) const final override

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final override

InputCreatorType getInputCreatorType(InIndex index) const final override

inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final override

class ScatterUpdateGradOpx : public popart::popx::Opx 

Public Functions

ScatterUpdateGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex index) const final

inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final

class SeluGradOpx : public popart::popx::Opx 

Public Functions

SeluGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SeluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

SeluInplaceOpx(Op*, Devicex*)

class SeluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

SeluOpx(Op*, Devicex*)

class SequenceSliceInplaceOpx : public popart::popx::Opx 

Public Functions

SequenceSliceInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SequenceSliceOpx : public popart::popx::Opx 

Public Functions

SequenceSliceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ShapedDropoutOpx : public popart::popx::Opx 

Public Functions

ShapedDropoutOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const override

class ShrinkGradOpx : public popart::popx::Opx 

Public Functions

ShrinkGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ShrinkInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

ShrinkInplaceOpx(Op*, Devicex*)

class ShrinkOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

ShrinkOpx(Op*, Devicex*)

class SigmoidGradOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

SigmoidGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SigmoidInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

SigmoidInplaceOpx(Op*, Devicex*)

class SigmoidOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

SigmoidOpx(Op*, Devicex*)

class SignInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

SignInplaceOpx(Op*, Devicex*)

class SignOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

SignOpx(Op*, Devicex*)

class SinOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

SinOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SinhGradOpx : public popart::popx::Opx 

Public Functions

SinhGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SinhInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

SinhInplaceOpx(Op*, Devicex*)

class SinhOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

SinhOpx(Op*, Devicex*)

class SliceInplaceOpx : public popart::popx::BaseSliceOpx 

Public Functions

SliceInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SliceOpx : public popart::popx::BaseSliceOpx 

Subclassed by popart::popx::PadGradOpx

Public Functions

SliceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SoftPlusGradOpx : public popart::popx::Opx 

Public Functions

SoftPlusGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SoftPlusInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

SoftPlusInplaceOpx(Op*, Devicex*)

class SoftPlusOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

SoftPlusOpx(Op*, Devicex*)

class SoftSignGradOpx : public popart::popx::Opx 

Public Functions

SoftSignGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SoftSignInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

SoftSignInplaceOpx(Op*, Devicex*)

class SoftSignOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

SoftSignOpx(Op*, Devicex*)

class SoftmaxGradDirectOpx : public popart::popx::Opx 

Public Functions

SoftmaxGradDirectOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SoftmaxGradOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

SoftmaxGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SoftmaxInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

SoftmaxInplaceOpx(Op*, Devicex*)

class SoftmaxOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

SoftmaxOpx(Op*, Devicex*)

class SparseAccumulateOpx : public popart::popx::AccumulateBaseOpx 

Public Functions

SparseAccumulateOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInput(InIndex, const poplar::DebugNameAndId &dnai) const final

std::set<TensorId> mustExistBeforeCreate(InIndex) const final

class SplitOpx : public popart::popx::Opx 

Public Functions

SplitOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SqrtOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

SqrtOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SquareOpx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

SquareOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class StashOpx : public popart::popx::Opx 

Public Functions

StashOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SubgraphOpx : public popart::popx::Opx 

Subclassed by popart::popx::CallOpx, popart::popx::LoopOpx

Public Functions

SubgraphOpx(Op*, Devicex*)

inline bool outputCreatedExternally(OutIndex) const final

PreparedTensorInfos getInputsToPrepare() const override

PreparedTensorInfos getOutputsToPrepare() const override

class SubsampleGradOpx : public popart::popx::Opx 

Public Functions

SubsampleGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SubsampleInplaceOpx : public popart::popx::Opx 

Public Functions

SubsampleInplaceOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SubsampleOpx : public popart::popx::Opx 

Public Functions

SubsampleOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SubtractArg0GradOpx : public popart::popx::ReduceSumOpx 

Public Functions

SubtractArg0GradOpx(Op*, Devicex*)

class SubtractOpx : public popart::popx::ElementWiseBinaryOpx 

Public Functions

SubtractOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SumArgGradOpx : public popart::popx::Opx 

Public Functions

SumArgGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SumOpx : public popart::popx::Opx 

Public Functions

SumOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class SwishGradOpx : public popart::popx::Opx 

Public Functions

SwishGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class SwishInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

SwishInplaceOpx(Op*, Devicex*)

class SwishOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

SwishOpx(Op*, Devicex*)

class SyncOpx : public popart::popx::Opx 

Public Functions

SyncOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class TanhGradOpx : public popart::popx::Opx 

Public Functions

TanhGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class TanhOpx : public popart::popx::Opx 

Public Functions

TanhOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class TensorRemapOpx : public popart::popx::Opx 

Public Functions

TensorRemapOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

bool outputCreatedExternally(OutIndex) const final

InputCreatorType getInputCreatorType(InIndex index) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor, InIndex, OutIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class ThresholdedReluGradOpx : public popart::popx::Opx 

Public Functions

ThresholdedReluGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ThresholdedReluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx 

Public Functions

ThresholdedReluInplaceOpx(Op*, Devicex*)

class ThresholdedReluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx 

Public Functions

ThresholdedReluOpx(Op*, Devicex*)

class TiedGatherOpx : public popart::popx::GatherBaseOpx 

Public Functions

TiedGatherOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(int index0) const final

poplar::Tensor createInput(InIndex index, const poplar::DebugNameAndId &dnai) const final

class TileGradOpx : public popart::popx::Opx 

Public Functions

TileGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class TileOpx : public popart::popx::Opx 

Public Functions

TileOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class TopKGradOpx : public popart::popx::Opx 

Public Functions

TopKGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

poplar::Tensor createInputTensor(InIndex index, const poplar::DebugNameAndId &dnai) const final

InputCreatorType getInputCreatorType(InIndex index) const final

inline std::set<TensorId> mustExistBeforeCreate(InIndex) const final

class TopKOpx : public popart::popx::BaseSortOpx 

Public Functions

TopKOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class TransposeGradOpx : public popart::popx::TransposeOpx 

Public Functions

TransposeGradOpx(Op*, Devicex*)

class TransposeInplaceOpx : public popart::popx::Opx 

Public Functions

TransposeInplaceOpx(Op*, Devicex*)

InputCreatorType getInputCreatorType(InIndex) const final

void grow(poplar::program::Sequence&) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class TransposeOpx : public popart::popx::Opx 

Subclassed by popart::popx::TransposeGradOpx

Public Functions

TransposeOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

InputCreatorType getInputCreatorType(InIndex) const final

poplar::Tensor unwindTensorLayout(poplar::Tensor tensor, InIndex inIndex, OutIndex outIndex) const final

view::RegMap unwindRegion(InIndex, OutIndex) const final

class VarUpdateOpx : public popart::popx::Opx 

Subclassed by popart::popx::AccumulateBaseOpx, popart::popx::AccumulatorScaleOpx, popart::popx::AdamVarUpdateOpx, popart::popx::CopyVarUpdateOpx, popart::popx::ScaledVarUpdateOpx, popart::popx::SGD0VarUpdateOpx, popart::popx::SGD1AcclUpdateOpx, popart::popx::SGD1VarUpdateOpx

Public Functions

inline VarUpdateOpx(Op *op, Devicex *devicex)

class WhereLhsInplaceOpx : public popart::popx::BaseWhereOpx 

Public Functions

WhereLhsInplaceOpx(Op*, Devicex*)

void doGrow(poplar::program::Sequence&, const poplar::Tensor&, const poplar::Tensor&, const poplar::Tensor&) const final

class WhereOpx : public popart::popx::BaseWhereOpx 

Public Functions

WhereOpx(Op*, Devicex*)

void doGrow(poplar::program::Sequence &prog, const poplar::Tensor&, const poplar::Tensor&, const poplar::Tensor&) const final

class WhereRhsInplaceOpx : public popart::popx::BaseWhereOpx 

Public Functions

WhereRhsInplaceOpx(Op*, Devicex*)

void doGrow(poplar::program::Sequence&, const poplar::Tensor&, const poplar::Tensor&, const poplar::Tensor&) const final

class WhereXGradOpx : public popart::popx::Opx 

Public Functions

WhereXGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class WhereYGradOpx : public popart::popx::Opx 

Public Functions

WhereYGradOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ZerosOpx : public popart::popx::Opx 

Public Functions

ZerosOpx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

14.9. Patterns

#include <popart/patterns/patterns.hpp>

class Patterns

A class that records which patterns are enabled and which are disabled.

Public Functions

Patterns(PatternsLevel level)

Constructor for the Patterns class.

Parameters: level – The pattern set to run.

inline Patterns()

Default constructor for the Patterns class.

The pattern set to run is set to PatternsLevel::Default.

Patterns(std::vector<std::string> patterns)

Constructor for the Patterns class.

Parameters: patterns – A vector of the names of the patterns to run.

bool isPatternEnabled(const std::type_index &t)

Check if a pattern (of class PreAliasPattern) is enabled.

Parameters: t – The pattern to check.
Returns: true if pattern is enabled; false otherwise.

bool isPatternEnabled(const std::string &t)

Check if a pattern (not of class PreAliasPattern) is enabled.

Parameters: t – The name of the pattern to check.
Returns: true if pattern is enabled; false otherwise.

Patterns &enablePattern(const std::type_index &t, bool v)

Enable or disable a pattern of class PreAliasPattern.

Parameters

t – The pattern to enable or disable.
v – If true, enable the pattern. If false, disable the pattern.

Returns

A reference to the Patterns object.

Patterns &enablePattern(const std::string &t, bool v)

Enable or disable a pattern not of class PreAliasPattern.

Parameters

t – The name of the pattern to enable or disable.
v – If true, enable the pattern. If false, disable the pattern.

Returns

A reference to the Patterns object.
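The overloads above are keyed either by std::type_index or by name, and enablePattern() has the return type Patterns &. The following sketch illustrates only the shape of such an interface. MiniPatterns, FakePreUniRepl, FakePostNRepl and makeExampleSettings() are hypothetical stand-ins, not PopART classes, and the choice to treat unknown patterns as disabled is an assumption of this sketch. A std::type_index key can be obtained from any type with std::type_index(typeid(T)).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical pattern types; they only exist to provide distinct type ids.
struct FakePreUniRepl {};
struct FakePostNRepl {};

// Stand-in for illustration only. This is NOT popart::Patterns; it just
// copies the shape of the documented overloads.
class MiniPatterns {
public:
  // Lookup keyed by type. Unknown types count as disabled (an assumption).
  bool isPatternEnabled(const std::type_index &t) const {
    auto it = byType_.find(t);
    return it != byType_.end() && it->second;
  }

  // Lookup keyed by name. Unknown names count as disabled (an assumption).
  bool isPatternEnabled(const std::string &name) const {
    auto it = byName_.find(name);
    return it != byName_.end() && it->second;
  }

  // Returning *this by reference is what makes chaining possible.
  MiniPatterns &enablePattern(const std::type_index &t, bool v) {
    byType_[t] = v;
    return *this;
  }

  MiniPatterns &enablePattern(const std::string &name, bool v) {
    byName_[name] = v;
    return *this;
  }

private:
  std::map<std::type_index, bool> byType_;
  std::map<std::string, bool> byName_;
};

// Builds a settings object with one chained expression.
inline MiniPatterns makeExampleSettings() {
  MiniPatterns p;
  p.enablePattern(std::type_index(typeid(FakePreUniRepl)), true)
      .enablePattern(std::type_index(typeid(FakePostNRepl)), false)
      .enablePattern(std::string("InPlace"), true);
  return p;
}
```

Calling makeExampleSettings() in this sketch yields an object in which FakePreUniRepl and "InPlace" are enabled and FakePostNRepl is disabled. Passing false as the second argument is how a pattern is switched off.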

bool isInitAccumulateEnabled()

Check if InitAccumulatePattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isPreUniReplEnabled()

Check if PreUniRepl is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isPostNReplEnabled()

Check if PostNRepl is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isSoftMaxGradDirectEnabled()

Check if SoftMaxGradDirect is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isNlllWithSoftMaxGradDirectEnabled()

Check if NlllWithSoftMaxGradDirect is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isSplitGatherEnabled()

Check if SplitGatherPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isOpToIdentityEnabled()

Check if OpToIdentityPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isUpsampleToResizeEnabled()

Check if UpsampleToResizePattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isSubtractArg1GradOpEnabled()

Check if SubtractArg1GradOp is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isMulArgGradOpEnabled()

Check if MulArgGradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isReciprocalGradOpEnabled()

Check if ReciprocalGradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isAtan2Arg0GradOpEnabled()

Check if Atan2Arg0GradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isAtan2Arg1GradOpEnabled()

Check if Atan2Arg1GradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isDivArg0GradOpEnabled()

Check if DivArg0GradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isDivArg1GradOpEnabled()

Check if DivArg1GradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isPowArg0GradOpEnabled()

Check if PowArg0GradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isPowArg1GradOpEnabled()

Check if PowArg1GradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isSinGradOpEnabled()

Check if SinGradOp is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isCosGradOpEnabled()

Check if CosGradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

inline bool isInPlaceEnabled()

Check if InPlace is enabled.

Returns: true if pattern is enabled; false otherwise.

inline bool isUpdateInplacePrioritiesForIpuEnabled()

Check if UpdateInplacePrioritiesForIpu is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isSqrtGradOpEnabled()

Check if SqrtGradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isConvFlipWeightsDoubleFlipEnabled()

Check if ConvFlipWeightsDoubleFlipPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isConvFlipWeightsGradOpEnabled()

Check if ConvFlipWeightsGradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isExpandCastEnabled()

Check if ExpandCastPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isExpGradOpEnabled()

Check if ExpGradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isExpm1GradOpEnabled()

Check if Expm1GradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isLog1pGradOpEnabled()

Check if Log1pGradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isLogGradOpEnabled()

Check if LogGradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isNegativeOneScaleEnabled()

Check if NegativeOneScalePattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isMatMulOpEnabled()

Check if MatMulOp is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isMatMulLhsGradOpEnabled()

Check if MatMulLhsGradOp is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isMatMulRhsGradOpEnabled()

Check if MatMulRhsGradOp is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isRandomNormalLikeOpPatternEnabled()

Check if RandomNormalLikeOp is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isRandomUniformLikeOpPatternEnabled()

Check if RandomUniformLikeOp is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isZerosLikeOpPatternEnabled()

Check if ZerosLikeOp is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isDecomposeBinaryConstScalarEnabled()

Check if DecomposeBinaryConstScalar is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isFmodArg0GradOpEnabled()

Check if FmodArg0GradOpPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isLambSerialisedWeightEnabled()

Check if LambSerialisedWeightPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isTiedGatherEnabled()

Check if TiedGatherPattern is enabled.

Returns: true if pattern is enabled; false otherwise.

bool isTiedGatherAccumulateEnabled()

Check if TiedGatherAccumulatePattern is enabled.

Returns: true if pattern is enabled; false otherwise.

Patterns &enableInitAccumulate(bool v)

Enable or disable InitAccumulatePattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enablePreUniRepl(bool v)

Enable or disable PreUniRepl.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enablePostNRepl(bool v)

Enable or disable PostNRepl.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableSoftMaxGradDirect(bool v)

Enable or disable SoftMaxGradDirect.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableNlllWithSoftMaxGradDirect(bool v)

Enable or disable NlllWithSoftMaxGradDirect.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableSplitGather(bool v)

Enable or disable SplitGatherPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableOpToIdentity(bool v)

Enable or disable OpToIdentityPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableUpsampleToResize(bool v)

Enable or disable UpsampleToResizePattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableSubtractArg1GradOp(bool v)

Enable or disable SubtractArg1GradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableMulArgGradOp(bool v)

Enable or disable MulArgGradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableReciprocalGradOp(bool v)

Enable or disable ReciprocalGradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableAtan2Arg0GradOp(bool v)

Enable or disable Atan2Arg0GradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableAtan2Arg1GradOp(bool v)

Enable or disable Atan2Arg1GradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableDivArg0GradOp(bool v)

Enable or disable DivArg0GradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableDivArg1GradOp(bool v)

Enable or disable DivArg1GradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enablePowArg0GradOp(bool v)

Enable or disable PowArg0GradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enablePowArg1GradOp(bool v)

Enable or disable PowArg1GradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableSinGradOp(bool v)

Enable or disable SinGradOp.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableCosGradOp(bool v)

Enable or disable CosGradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

inline Patterns &enableInPlace(bool v)

Enable or disable InPlace.

Parameters: v – If true then enable pattern. If false then disable pattern.

inline Patterns &enableUpdateInplacePrioritiesForIpu(bool v)

Enable or disable UpdateInplacePrioritiesForIpu.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableSqrtGradOp(bool v)

Enable or disable SqrtGradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableConvFlipWeightsDoubleFlip(bool v)

Enable or disable ConvFlipWeightsDoubleFlipPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableConvFlipWeightsGradOp(bool v)

Enable or disable ConvFlipWeightsGradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableExpGradOp(bool v)

Enable or disable ExpGradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableExpm1GradOp(bool v)

Enable or disable Expm1GradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableLog1pGradOp(bool v)

Enable or disable Log1pGradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableLogGradOp(bool v)

Enable or disable LogGradOpPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableNegativeOneScale(bool v)

Enable or disable NegativeOneScalePattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableMatMulOp(bool v)

Enable or disable MatMulOp.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableMatMulLhsGradOp(bool v)

Enable or disable MatMulLhsGradOp.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableMatMulRhsGradOp(bool v)

Enable or disable MatMulRhsGradOp.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableRandomNormalLikeOpPattern(bool v)

Enable or disable RandomNormalLikeOp.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableRandomUniformLikeOpPattern(bool v)

Enable or disable RandomUniformLikeOp.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableZerosLikeOpPattern(bool v)

Enable or disable ZerosLikeOp.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableDecomposeBinaryConstScalar(bool v)

Enable or disable DecomposeBinaryConstScalar.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableLambSerialisedWeight(bool v)

Enable or disable LambSerialisedWeightPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableTiedGather(bool v)

Enable or disable TiedGatherPattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

Patterns &enableTiedGatherAccumulate(bool v)

Enable or disable TiedGatherAccumulatePattern.

Parameters: v – If true then enable pattern. If false then disable pattern.

inline Patterns &enableRuntimeAsserts(bool b)

Enable or disable runtime asserts.

If runtime asserts are enabled, then a check is performed to confirm that all mandatory patterns are enabled.

Parameters: b – If true then enable runtime asserts. If false then disable runtime asserts.

std::vector<std::unique_ptr<PreAliasPattern>> getPreAliasList()

Get list of patterns to be run before aliasing.

Returns: A vector of pointers to patterns of class PreAliasPattern.

bool operator==(const Patterns &p) const

Equality operator.

Parameters: p – Pattern to compare to.
Returns: true if patterns are equal; false otherwise.

inline const std::map<std::type_index, bool> &getSettings() const

Get the settings (enabled or disabled) for patterns.

Returns: Map of which patterns are enabled or disabled, indexed by value of std::type_index.
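
As an illustration of a settings map indexed by std::type_index, combined with setters that return a reference so that calls can be chained, the following is a minimal, self-contained sketch. It does not use the PopART headers: ToyPatterns, ToyInPlace and ToyMatMulOp are hypothetical stand-ins, and treating an unset pattern as disabled is an assumption of the sketch, not a statement about Patterns.

```cpp
#include <cassert>
#include <map>
#include <typeindex>
#include <typeinfo>

// Hypothetical stand-ins for pattern classes; these are NOT PopART types.
struct ToyInPlace {};
struct ToyMatMulOp {};

// Toy model of a container that records, per pattern type, whether it is enabled.
class ToyPatterns {
public:
  // Record the setting for pattern type P and return *this so calls can be chained.
  template <typename P> ToyPatterns &enable(bool v) {
    settings_[std::type_index(typeid(P))] = v;
    return *this;
  }

  // Assumption of this sketch: a pattern that was never set counts as disabled.
  template <typename P> bool isEnabled() const {
    auto it = settings_.find(std::type_index(typeid(P)));
    return it != settings_.end() && it->second;
  }

  // Settings (enabled or disabled), indexed by std::type_index.
  const std::map<std::type_index, bool> &getSettings() const {
    return settings_;
  }

  // Two containers are equal when they hold the same settings.
  bool operator==(const ToyPatterns &other) const {
    return settings_ == other.settings_;
  }

private:
  std::map<std::type_index, bool> settings_;
};
```

With this sketch, p.enable<ToyInPlace>(true).enable<ToyMatMulOp>(false) leaves two entries in p.getSettings(), and the order in which the setters are called does not affect equality.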

inline bool getInplaceEnabled() const

Check if the pattern InPlace is enabled.

Returns: true if pattern is enabled; false otherwise.

inline bool getUpdateInplacePrioritiesForIpuEnabled() const

Check if the pattern UpdateInplacePrioritiesForIpu is enabled.

Returns: true if pattern is enabled; false otherwise.

inline bool getRuntimeAssertsOn() const

Check if runtime asserts are enabled.

If runtime asserts are enabled, then a check is performed to confirm that all mandatory patterns are enabled.

Returns: true if runtime asserts are enabled; false otherwise.

Public Static Functions

static Patterns create(std::vector<std::string> patterns)

Create a set of patterns to be run.

Parameters: patterns – A vector of pattern names of patterns to be run.

static std::vector<std::string> getAllPreAliasPatternNames()

Get the names of all patterns of class PreAliasPattern, using the same order as getPreAliasList().

Returns: A vector of the names of all patterns of class PreAliasPattern.

static bool isMandatory(Pattern &pattern)

Check if a pattern is mandatory.

Mandatory patterns must be enabled and must be run.

This method throws an error at runtime if the pattern is disabled and if enableRuntimeAsserts() is true.

Parameters: pattern – The pattern to check.
Returns: true if the pattern is mandatory; false otherwise.

static bool isMandatory(std::string &patternName)

Check if a pattern is mandatory.

Mandatory patterns must be enabled and must be run.

This method throws an error at runtime if the pattern is disabled and if enableRuntimeAsserts() is true.

Parameters: patternName – The name of the pattern to check.
Returns: true if the pattern is mandatory; false otherwise.
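
The interaction between mandatory patterns and runtime asserts can be sketched with a small standalone model. This is not the PopART implementation: checkMandatory, checkFails, the string-keyed settings map and the pattern names are all hypothetical, and the sketch only assumes that the check is skipped when runtime asserts are off and raises an error when a mandatory pattern is disabled.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Toy check: with runtime asserts on, every mandatory pattern must be enabled.
// The names in `mandatory` are hypothetical and chosen by the caller.
inline void checkMandatory(const std::map<std::string, bool> &settings,
                           const std::set<std::string> &mandatory,
                           bool runtimeAssertsOn) {
  if (!runtimeAssertsOn) {
    return; // The check is skipped when runtime asserts are disabled.
  }
  for (const auto &name : mandatory) {
    auto it = settings.find(name);
    const bool enabled = it != settings.end() && it->second;
    if (!enabled) {
      throw std::runtime_error("Mandatory pattern is disabled: " + name);
    }
  }
}

// Convenience wrapper: returns true if checkMandatory would throw.
inline bool checkFails(const std::map<std::string, bool> &settings,
                       const std::set<std::string> &mandatory,
                       bool runtimeAssertsOn) {
  try {
    checkMandatory(settings, mandatory, runtimeAssertsOn);
  } catch (const std::runtime_error &) {
    return true;
  }
  return false;
}
```

In this model, disabling a mandatory pattern is only reported while runtime asserts are on. Turning them off silences the check but does not make the configuration valid.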

Friends

friend std::ostream &operator<<(std::ostream &os, const Patterns &patterns)

Write a string representation of patterns to an output stream.

Parameters

os – An output stream that the string representation should be written to.
patterns – The patterns for which the string representation is created.

Returns

An output stream containing the string representation of the patterns.

class PreAliasPattern : public popart::Pattern

Public Functions

PreAliasPattern() = default

virtual ~PreAliasPattern() = default

virtual std::vector<const Tensor*> touches(Op *op) const = 0

Op *makeReplacementOpInIr(const OperatorIdentifier&, Op *oldOp, const std::string name = "") const

virtual bool matches(Op *op) const = 0

virtual bool apply(Op *op) const = 0

bool touchesAnchored(Op*) const
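
The reference above lists the interface without describing how matches() and apply() are used together. A common reading, taken here as an assumption, is that apply() is called only on ops for which matches() returns true, and that it returns true when the graph was changed. The sketch below models that contract with hypothetical types (ToyOp, ToyPattern, ToySumToAdd, runPatterns). It does not use the PopART headers and ignores touches().

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical op: only its type name is modelled.
struct ToyOp {
  std::string type;
};

// Hypothetical pattern interface, loosely mirroring matches()/apply().
class ToyPattern {
public:
  virtual ~ToyPattern() = default;
  // Does this pattern apply to the given op?
  virtual bool matches(const ToyOp &op) const = 0;
  // Rewrite the op; returns true if something changed (assumed convention).
  virtual bool apply(ToyOp &op) const = 0;
};

// Illustrative rewrite: turn a "Sum" op into an "Add" op (arity is ignored).
class ToySumToAdd : public ToyPattern {
public:
  bool matches(const ToyOp &op) const override { return op.type == "Sum"; }
  bool apply(ToyOp &op) const override {
    op.type = "Add";
    return true;
  }
};

// Apply every pattern to every op it matches; returns the number of rewrites.
inline int runPatterns(std::vector<ToyOp> &ops,
                       const std::vector<std::unique_ptr<ToyPattern>> &patterns) {
  int rewrites = 0;
  for (auto &op : ops) {
    for (const auto &p : patterns) {
      if (p->matches(op) && p->apply(op)) {
        ++rewrites;
      }
    }
  }
  return rewrites;
}
```

Running runPatterns a second time on the same ops performs no further rewrites, because no op matches any more.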

14.9.1. Available patterns

class AllReduceToIdentityPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class BinaryGradOpPattern : public popart::PreAliasPattern 

Subclassed by popart::Atan2Arg0GradOpPattern, popart::Atan2Arg1GradOpPattern, popart::DivArg0GradOpPattern, popart::DivArg1GradOpPattern, popart::FmodArg0GradOpPattern, popart::PowArg0GradOpPattern, popart::PowArg1GradOpPattern, popart::SubtractArg1GradOpPattern

Public Functions

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const final

class ContiguateIpuCopyIndicesPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

class ConvDataGradPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class ConvFlipWeightsDoubleFlipPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class ConvFlipWeightsGradOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class ConvTransposePattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class CosGradOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class CoshOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class DecomposeBinaryConstScalar : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

template<class GRADOP, class DOP> class ElementWiseGradOpPattern : public popart::PreAliasPattern 

Public Functions

inline bool matches(Op *op) const override

inline std::vector<const Tensor*> touches(Op*) const override

inline bool apply(Op *op) const override

class ExpGradOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class ExpandCastPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class Expm1GradOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class Fuser : public popart::PreAliasPattern 

Subclassed by popart::SoftmaxGradDirect

Public Functions

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

class InitAccumulatePattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class LSTMPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op *op) const override

inline std::vector<const Tensor*> touches(Op*) const override

bool apply(Op *op) const override

class LambSerialisedWeightPattern : public popart::PreAliasPattern 

This Pattern finds Weights that have been serialised and are being updated in the Lamb Optimizer in slices.

Transforming:

  Slice(W)       U_sliced      }
     | (R1)         | (R2)     }
ReduceScatter       |          }  (Optional, to support RTS)
     |              |          }
 LambSquare     LambSquare     }  x N
     |              |          }
 AllReduce      AllReduce      }  (Optional, to support RTS)
        \        /             }
      AdamVarUpdate            }

Into:

  Slice(W)       U_sliced      }
     |              |          }
ReduceScatter       |          }  (Optional, to support RTS)
     |              |          }
 LambSquare     LambSquare     }  x N
     |              |          }
    Sum            Sum
      \            /
 AllReduce      AllReduce      }  (Optional, to support RTS)
        \        /             }
      AdamVarUpdate            }  x N

A key property of LambSquare is that the output has not been sqrt yet, so it is valid to just Sum the outputs.
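
To see why summing is valid, the sketch below assumes that LambSquare yields the sum of squares of its input (a squared L2 norm with no square root applied). Under that assumption, the per-slice results add up to the result for the whole tensor, and the square root is taken once at the end. The function names are hypothetical and the code does not use PopART.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Sum of squares of one slice: a stand-in for LambSquare under the
// assumption that it returns the squared norm (no sqrt applied).
inline double sumOfSquares(const std::vector<double> &x) {
  return std::inner_product(x.begin(), x.end(), x.begin(), 0.0);
}

// L2 norm of a tensor given as N slices: sum the partial results, sqrt once.
inline double normFromSlices(const std::vector<std::vector<double>> &slices) {
  double total = 0.0;
  for (const auto &s : slices) {
    total += sumOfSquares(s);
  }
  return std::sqrt(total);
}
```

For the slices {3, 0} and {0, 4} the partial results are 9 and 16, and the combined norm is 5, the same as for the unsliced tensor {3, 0, 0, 4}. Had each slice already been square-rooted, summing would give 3 + 4 = 7, which is not the norm of the whole tensor.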

Public Functions

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

template<class L> class LikeOpsPattern : public popart::PreAliasPattern 

Public Functions

inline bool matches(Op *op) const final

inline std::vector<const Tensor*> touches(Op*) const final

inline bool apply(Op *op) const final

class Log1pGradOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class LogGradOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class LoopScanOutPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class MatMulGradPattern : public popart::PreAliasPattern 

Subclassed by popart::MatMulLhsGradPattern, popart::MatMulRhsGradPattern

Public Functions

inline std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

virtual popart::Tensor *getIn(Op *op) const = 0

virtual popart::Tensor *getGradIn(Op *op) const = 0

virtual popart::Tensor *getGradOut(Op *op) const = 0

virtual InIndex getInIndex() const = 0

virtual InIndex getGradInIndex() const = 0

class MatMulPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op *op) const override

inline std::vector<const Tensor*> touches(Op*) const override

bool apply(Op *op) const override

class MulArgGradOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class NlllWithSoftmaxGradDirect : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class OptimizerDecompose : public popart::PreAliasPattern

Subclassed by popart::AdamDecompose, popart::AdaptiveDecompose, popart::SGD0Decompose, popart::SGD1Decompose, popart::SGD2Decompose

class PackedDataBlockPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class PadSumPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class PostNRepl : public popart::PreAliasPattern 

Public Functions

PostNRepl() = default

~PostNRepl() override = default

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

class PreUniRepl : public popart::PreAliasPattern 

Public Functions

PreUniRepl() = default

~PreUniRepl() override = default

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

class ReciprocalGradOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class RemoveUnnecessaryLossGradCast : public popart::PreAliasPattern 

The RemoveUnnecessaryLossGradCast changes

fp32_lossScale -- Cast -- fp16_lossScale -- NllLossGradOp -- fp16_grad
                          fp16_probs -------'

to

fp32_lossScale -- NllLossGradOp -- fp16_grad
fp16_probs -------'

This corner case can occur in a model with fp16 activations when its fp16 loss scale is anchored for summation and upcast to fp32 in order to prevent overflow. In this case, if the loss scale is greater than max(fp16), the downcast clips the loss scale.

Notice that even if the loss scale is >max(fp16) the resulting gradients can be within fp16 range. If the resulting gradients are >max(fp16), they will be clipped (unless the user has enabled NaN on overflow).
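
The effect of the cast can be illustrated numerically. The sketch below models only the saturation of a downcast to fp16, using 65504 (the largest finite IEEE 754 half-precision value) as max(fp16). Rounding of in-range values is ignored, and scaledGrad is a hypothetical stand-in for the gradient computation, not NllLossGradOp itself.

```cpp
#include <algorithm>
#include <cassert>

// Largest finite IEEE 754 half-precision (fp16) value.
constexpr float kMaxFp16 = 65504.0f;

// Models only the clipping of a downcast to fp16; in-range rounding is ignored.
inline float clipToFp16Range(float x) {
  return std::max(-kMaxFp16, std::min(x, kMaxFp16));
}

// Toy gradient: loss scale times an upstream term (hypothetical simplification).
inline float scaledGrad(float lossScale, float upstream) {
  return lossScale * upstream;
}
```

With a loss scale of 131072 and an upstream term of 0.25, passing the loss scale through the cast gives scaledGrad(clipToFp16Range(131072.0f), 0.25f) = 16376, whereas using the fp32 loss scale directly gives 32768. Both gradients fit in fp16, but only the second reflects the intended loss scale.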

Public Functions

bool matches(Op *lossOp) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op *lossOp) const final

class ScanToLoopPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class SequenceExpander : public popart::PreAliasPattern 

Subclassed by popart::NegativeOneScalePattern, popart::OpToIdentityPattern, popart::SplitGradOpToConcatPattern

Public Functions

std::vector<const Tensor*> touches(Op *op) const final

bool apply(Op *op) const final

class SplitGatherPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class SplitOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class SqrtGradOpPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class SumToAddPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class TiedGatherAccumulatePattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class TiedGatherPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class TransposeToIdentityOrReshapePattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class UpsampleToResizePattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class ViewSimplifyPattern : public popart::PreAliasPattern 

Public Functions

bool matches(Op*) const override

std::vector<const Tensor*> touches(Op*) const override

bool apply(Op*) const override

class AdamDecompose : public popart::OptimizerDecompose 

Public Functions

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

TensorId rescaleRatio(Graph &graph, AdamComboOp *combo) const

std::pair<Op*, TensorId> rescaleAccl(Graph &graph, AdamComboOp *combo, bool accl1, TensorId acclId, TensorId gradIntoAcclId, TensorId rescaleRatioId) const

class AdaptiveDecompose : public popart::OptimizerDecompose 

Public Functions

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

class Atan2Arg0GradOpPattern : public popart::BinaryGradOpPattern 

Public Functions

bool matches(Op*) const override

class Atan2Arg1GradOpPattern : public popart::BinaryGradOpPattern 

Public Functions

bool matches(Op*) const override

class DivArg0GradOpPattern : public popart::BinaryGradOpPattern 

Public Functions

bool matches(Op*) const override

class DivArg1GradOpPattern : public popart::BinaryGradOpPattern 

Public Functions

bool matches(Op*) const override

class FmodArg0GradOpPattern : public popart::BinaryGradOpPattern 

Public Functions

bool matches(Op*) const final

class MatMulLhsGradPattern : public popart::MatMulGradPattern 

Public Functions

bool matches(Op *op) const override

inline popart::Tensor *getIn(Op *op) const override

inline popart::Tensor *getGradIn(Op *op) const override

inline popart::Tensor *getGradOut(Op *op) const override

inline InIndex getInIndex() const override

inline InIndex getGradInIndex() const override

class MatMulRhsGradPattern : public popart::MatMulGradPattern 

Public Functions

bool matches(Op *op) const override

inline popart::Tensor *getIn(Op *op) const override

inline popart::Tensor *getGradIn(Op *op) const override

inline popart::Tensor *getGradOut(Op *op) const override

inline InIndex getInIndex() const override

inline InIndex getGradInIndex() const override

class NegativeOneScalePattern : public popart::SequenceExpander 

Public Functions

bool matches(Op*) const override

class OpToIdentityPattern : public popart::SequenceExpander 

Public Functions

bool matches(Op*) const override

class PowArg0GradOpPattern : public popart::BinaryGradOpPattern 

Public Functions

bool matches(Op*) const override

class PowArg1GradOpPattern : public popart::BinaryGradOpPattern 

Public Functions

bool matches(Op*) const override

class SGD0Decompose : public popart::OptimizerDecompose 

Decomposes an SGD0ComboOp into the Ops and Tensors that implement the SGD0 optimiser step it describes.

If gradient accumulation is enabled, this will create the accum tensor (gradient accumulator). This is not a persistent state tensor, so it will not be added to the Ir’s model proto. The tensor’s id will have prefix reservedAccumPrefix(). If the tensor has an initialiser in the model proto, the tensor will be initialised with that value. Otherwise, it will be initialised to 0. The DataType of the tensor is as specified in the SGD0ComboOp.

See also: SGD0ComboOp, SGD.

Recall the SGD0 optimiser step, possibly with gradient accumulation, replication:

(_) for each micro batch
(1)   g = allReduce(g)                 [if OptimizerReductionType=GradReduce]
(2)   a += g                           [if grad acc]
(_) [let a := g if not grad acc]
(3) a = allReduce(a)                   [if OptimizerReductionType=AccumReduce]
(4) w = (w * wdsf0) - (slr0 * a)
(5) a = 0                              [if grad acc]

(1) is implemented by a ReplicatedAllReduceOp.

(2) is implemented by an AccumulateOp.

(3) is implemented by a ReplicatedAllReduceInplaceOp.

(4) is implemented by an SGD0VarUpdateOp.

(5) is implemented by an AccumulatorUpdateOp.

For all the above ops, if they consume a non-const OptimizerValue, then the SGD0ComboOp will have an additional input for that scalar, which will be connected to the new Op.

If gradient accumulation is enabled, (3-5) are put outside the microbatches loop implicitly by setting

op->settings.executionContext = ExecutionContext::AccumulateOuterFragment

Additionally, we will set

op->settings.schedulePriority = 0.0f
op->setExecutionPhases({})

because VarUpdateOps default to the minimum possible schedule priority so that they are scheduled last. This is not desirable for gradient accumulation, so we reset to a neutral priority.

The SGD0ComboOp will be disconnected and erased.
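
The arithmetic of the SGD0 step can be sketched for a scalar weight as follows. This is a plain C++ model, not the ops that the pattern creates: it assumes a single replica (so the allReduce steps (1) and (3) are omitted) and gradient accumulation enabled. Sgd0State and sgd0Step are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// State for one scalar weight (hypothetical struct, for illustration only).
struct Sgd0State {
  float w; // weight
  float a; // gradient accumulator
};

// One SGD0 optimiser step over several micro batches on a single replica.
// Steps (1) and (3), the allReduce steps, are omitted.
inline void sgd0Step(Sgd0State &s, const std::vector<float> &microBatchGrads,
                     float wdsf0, float slr0) {
  for (float g : microBatchGrads) {
    s.a += g; // (2) accumulate the gradient of each micro batch
  }
  s.w = (s.w * wdsf0) - (slr0 * s.a); // (4) weight update
  s.a = 0.0f;                         // (5) reset the accumulator
}
```

Starting from w = 2 with micro-batch gradients 1 and 3, wdsf0 = 0.5 and slr0 = 0.25, the accumulator reaches 4 and the update gives w = 2 * 0.5 - 0.25 * 4 = 0, after which the accumulator is reset to 0.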

Public Functions

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

Op *varUpdateAndEraseCombo(Graph &graph, SGD0ComboOp *combo, const TensorId &weightId, const TensorId &gradIntoUpdateId, const TensorId &updatedWeightId) const

class SGD1Decompose : public popart::OptimizerDecompose 

Decomposes an SGD1ComboOp into the Ops and Tensors that implement the SGD1 optimiser step it describes.

Will create the accl tensor (combined accumulator and first-order momentum). The tensor will be added to the Ir’s model proto, so will be part of the serialised Ir. The tensor’s id will have prefix reservedAcclPrefix(). If the tensor has an initialiser in the model proto, the tensor will be initialised with that value. Otherwise, it will be initialised to slr1 * w_0, where w_0 is the initial value of w.

See also: SGD1ComboOp, SGD.

Recall the SGD1 optimiser step, possibly with gradient accumulation and replication:

(_) for each micro batch
(1)   allReduce(g)                     [if OptimizerReductionType=GradReduce]
(2)   v += dpsf1 * g
(_)   if enable nesterov momentum:
(_)     a += g
(3) v = allReduce(v)                   [if OptimizerReductionType=AcclReduce]
(_) if enable nesterov momentum:
(_)   a = allReduce(a)                 [if OptimizerReductionType=AcclReduce]
(4) if enable nesterov momentum:
      ils = ndsf * dpsf1
      a = ngsf * (ils * a + wd * w) + mm * v
(_) [let x := g if enable nesterov momentum else v]
(5) w = w - slr1 * x
(6) v = v * smm1 + swd1 * w

See the SGD docs in optimizer.hpp for derivation of the above.

(1) is implemented by a ReplicatedAllReduceOp.

(2) is implemented by an AccumulateOp.

(3) is implemented by a ReplicatedAllReduceInplaceOp.

(4) is implemented by a MulOp and an SGD1NesterovOp.

(5) is implemented by an SGD1VarUpdateOp.

(6) is implemented by an SGD1AcclUpdateOp.

For all the above ops, if they consume a non-const OptimizerValue, then the SGD1ComboOp will have an additional input for that scalar, which will be connected to the new Op.

If gradient accumulation is enabled, (3), (4), (5), (6) are put outside the microbatches loop implicitly by setting

op->settings.executionContext = ExecutionContext::AccumulateOuterFragment

Additionally, we will set

op->settings.schedulePriority = 0.0f

because VarUpdateOps default to the minimum possible schedule priority so that they are scheduled last. This is not desirable for gradient accumulation, so we reset to a neutral priority.

The SGD1ComboOp will be disconnected and erased.

Additional topological constraints (topo cons) are required to ensure the VarUpdateOps run in the manner described above. We also must transfer the topo cons from the SGD1ComboOp to the new ops without breaking this. To do this:

At the start of apply, add a topo con from (1) to the combo op.
Transfer topo cons from combo to (2). Since (1)/(2) are the first ops to run in the optimiser step (the other ops consume (2)’s output so will always run after), this ensures the pre-existing topo cons on combo are respected.
Insert topo con from (5) to (6), to ensure w update happens before the next step’s v update.
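
The arithmetic of the SGD1 step described above can be sketched for a scalar weight as follows. This is a plain C++ model, not the ops that the pattern creates: it assumes a single replica (so the allReduce steps (1) and (3) are omitted) and Nesterov momentum disabled (so step (4) is skipped and x := v). Sgd1State and sgd1Step are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// State for one scalar weight (hypothetical struct, for illustration only).
struct Sgd1State {
  float w; // weight
  float v; // accl tensor: combined accumulator and first-order momentum
};

// One SGD1 optimiser step on a single replica, without Nesterov momentum.
inline void sgd1Step(Sgd1State &s, const std::vector<float> &microBatchGrads,
                     float dpsf1, float slr1, float smm1, float swd1) {
  for (float g : microBatchGrads) {
    s.v += dpsf1 * g; // (2) accumulate the scaled gradient into v
  }
  s.w = s.w - slr1 * s.v;        // (5) weight update, with x := v
  s.v = s.v * smm1 + swd1 * s.w; // (6) prepare v for the next step
}
```

Starting from w = 1 and v = 0 with micro-batch gradients 1 and 1, dpsf1 = 0.5, slr1 = 0.5, smm1 = 0.5 and swd1 = 0.25, step (2) gives v = 1, step (5) gives w = 0.5, and step (6) gives v = 0.625. Step (6) reads the updated w, which is why the w update must run before the v update.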

Public Functions

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

class SGD2Decompose : public popart::OptimizerDecompose 

Decomposes an SGD2ComboOp into the Ops and Tensors that implement the SGD2 optimiser step it describes.

Will create the accl1 tensor (first-order momentum). The tensor will be added to the Ir’s model proto, so will be part of the serialised Ir. The tensor’s id will have prefix reservedAccl1Prefix(). The tensor will be initialised to 0. The DataType of the tensor is as specified in the SGD2ComboOp.

See also

SGD2ComboOp, SGD.

If gradient accumulation, will create the accum tensor (gradient accumulator). This is not a persistent state tensor so will not be added to the Ir’s model proto. The tensor’s id will have prefix reservedAccumPrefix(). If the tensor has an initialiser in the model proto, the tensor will be initialised with that value. Otherwise, it will be initialised to 0. The DataType of the tensor is as specified in the SGD2ComboOp.

Recall the SGD2 optimiser step, possibly with gradient accumulation, replication:

(_) for each micro batch
(1)   g = allReduce(g) [if OptimizerReductionType=GradReduce]
(2)   a += g [if grad acc]
(_) [let a := g if not grad acc]
(3) a = allReduce(a) [if OptimizerReductionType=AccumReduce]
(_) // Note we break the single v update equation into two steps:
(4) v += dpsf1 * a
(5) v = v * smm1 + swd1 * w
(6) if enable nesterov momentum:
      ils = ndsf * dpsf1
      a = ngsf * (ils * a + wd * w) + mm * v
(_) [let x := a if enable nesterov momentum else v]
(7) w = w - slr1 * x
(8) a = 0 [if grad acc]

See the SGD docs in optimizer.hpp for derivation of the above.
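As a rough illustration, the step above can be applied to a single scalar weight. This is only a sketch under stated assumptions: there is no replication (so both allReduce steps are the identity), gradient accumulation is enabled, and Nesterov momentum is disabled (so x := v). The struct and function names are hypothetical; only the symbols dpsf1, smm1, swd1 and slr1 are taken from the step description.

```cpp
#include <cassert>
#include <vector>

// Hypothetical scalar stand-ins for the optimizer values named in the step.
struct Sgd2Params {
  double dpsf1;
  double smm1;
  double swd1;
  double slr1;
};

// Hypothetical scalar stand-ins for the tensors involved.
struct Sgd2State {
  double w; // weight
  double v; // accl1 (first-order momentum)
  double a; // accum (gradient accumulator)
};

// One optimiser step over a set of micro-batch gradients.
// Assumes: no replication, gradient accumulation on, Nesterov momentum off.
inline void sgd2Step(Sgd2State &s, const Sgd2Params &p,
                     const std::vector<double> &microBatchGrads) {
  assert(!microBatchGrads.empty());
  for (double g : microBatchGrads) {
    s.a += g; // (2) accumulate inside the micro-batch loop
  }
  s.v += p.dpsf1 * s.a;              // (4)
  s.v = s.v * p.smm1 + p.swd1 * s.w; // (5)
  s.w = s.w - p.slr1 * s.v;          // (7) with x := v
  s.a = 0.0;                         // (8) reset the accumulator
}
```

Steps (4) to (8) run once per call, after the loop, which mirrors how the corresponding ops are placed outside the microbatches loop.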

(1) is implemented by a ReplicatedAllReduceOp.

(2) is implemented by an AccumulateOp.

(3) is implemented by a ReplicatedAllReduceInplaceOp.

(4) is implemented by an AccumulateOp.

(5) is implemented by an SGD2AcclUpdateOp. Note this is equivalent to an SGD1AcclUpdateOp.

(6) is implemented by a MulOp and a SGD1NesterovOp.

(7) is implemented by an SGD2VarUpdateOp. Note this is equivalent to an SGD1VarUpdateOp.

(8) is implemented by an AccumulatorUpdateOp.

For all the above ops, if they consume a non-const OptimizerValue, then the SGD2ComboOp will have an additional input for that scalar, which will be connected to the new Op.

With gradient accumulation, (3)-(8) are put outside the microbatches loop implicitly by setting op->settings.executionContext = ExecutionContext::AccumulateOuterFragment. Additionally, we will set op->settings.schedulePriority = 0.0f and call op->setExecutionPhases({}), because VarUpdateOps default to the minimum possible schedule priority so that they are scheduled last. This is not desirable for gradient accumulation, so we reset to a neutral priority.

The SGD2ComboOp will be disconnected and erased.

Additional topo cons are required to ensure the VarUpdateOps run in the manner described above. We also must transfer the topo cons from the SGD2ComboOp to the new ops without breaking this. To do this:

Transfer topo cons from combo to (1).
Transfer topo cons from combo to (2).
Insert topo con from (7) to (8) to ensure accum is not zeroed until after the v update (which consumes it).
Transfer topo cons from combo to (8). Only required if not grad acc.

Public Functions

bool matches(Op*) const final

std::vector<const Tensor*> touches(Op*) const final

bool apply(Op*) const final

class SoftmaxGradDirect : public popart::Fuser 

class SplitGradOpToConcatPattern : public popart::SequenceExpander 

Public Functions

bool matches(Op*) const override

class SubtractArg1GradOpPattern : public popart::BinaryGradOpPattern 

Public Functions

bool matches(Op*) const override

14.10. Transforms

#include <popart/transforms/transform.hpp>

class Transform

Subclassed by popart::AccumulateOuterFragmentParallelizer, popart::Autodiff, popart::AutomaticLossScale, popart::AutoVirtualGraph, popart::BatchSerialize, popart::ClipWeightGradientsByNorm, popart::ContiguateCollectivesTransform, popart::DecomposeLoops, popart::DecomposeSum, popart::DynamicOpTransform, popart::EnsureFp32LossScale, popart::ExplicitRecompute, popart::HostIOSetup, popart::InferPipelineStages, popart::InplaceAccumulateGradPartialsIntoOptimizerAccumTensor, popart::InterIpuCopy, popart::IoComputeTileCopy, popart::MainLoops, popart::MergeCollectivesTransform, popart::MergeCopies, popart::MergeDuplicateOps, popart::MergeExchange, popart::MergeLoops, popart::MergeVarUpdates, popart::OverlapIO, popart::Pipeline, popart::PreAutomaticLossScale, popart::Prune, popart::RandomSetup, popart::RemoteSetup, popart::SerializeMatMuls, popart::StochasticRounding, popart::StreamingMemory, popart::SubgraphOutline

Public Functions

inline Transform()

inline virtual ~Transform()

virtual bool apply(Graph &graph) const = 0

virtual std::size_t getId() const = 0

virtual std::string getName() const = 0

Public Static Functions

static void applyTransform(std::size_t transformId, Graph&)

static bool registerTransform(Transform *transform)

static std::size_t getIdFromName(const std::string &transformName)
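The static functions above suggest a registry keyed by transform ID. The following self-contained sketch shows one way such a registry could work. It does not use PopART: ToyGraph, ToyTransform, ToyRegistry and AddOneOp are hypothetical names, the registry takes a std::unique_ptr rather than a raw pointer, and deriving the ID from typeid(...).hash_code() is an assumption made for the sketch.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <typeinfo>
#include <utility>

// Hypothetical stand-in for a graph.
struct ToyGraph {
  int opCount = 0;
};

// Hypothetical analogue of the Transform interface.
class ToyTransform {
public:
  virtual ~ToyTransform() = default;
  virtual bool apply(ToyGraph &graph) const = 0;
  virtual std::size_t getId() const = 0;
  virtual std::string getName() const = 0;
};

class ToyRegistry {
public:
  // Takes ownership. Returns false if a transform with this ID exists.
  static bool registerTransform(std::unique_ptr<ToyTransform> transform) {
    const std::size_t id = transform->getId();
    return transforms().emplace(id, std::move(transform)).second;
  }

  // Looks up the transform by ID and applies it to the graph.
  static void applyTransform(std::size_t transformId, ToyGraph &graph) {
    auto it = transforms().find(transformId);
    if (it == transforms().end()) {
      throw std::out_of_range("no transform registered for this ID");
    }
    it->second->apply(graph);
  }

  // Linear search by name; throws if the name is unknown.
  static std::size_t getIdFromName(const std::string &name) {
    for (const auto &entry : transforms()) {
      if (entry.second->getName() == name) {
        return entry.first;
      }
    }
    throw std::out_of_range("no transform registered with name " + name);
  }

private:
  static std::map<std::size_t, std::unique_ptr<ToyTransform>> &transforms() {
    static std::map<std::size_t, std::unique_ptr<ToyTransform>> registry;
    return registry;
  }
};

// Example transform: pretends to add one op to the graph.
class AddOneOp : public ToyTransform {
public:
  static std::size_t id() { return typeid(AddOneOp).hash_code(); }
  bool apply(ToyGraph &graph) const final {
    ++graph.opCount;
    return true;
  }
  std::size_t getId() const final { return id(); }
  std::string getName() const final { return "AddOneOp"; }
};
```

A caller would register each transform once, then apply it by ID, for example ToyRegistry::applyTransform(AddOneOp::id(), graph).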

14.10.1. Available transforms

class AccumulateOuterFragmentParallelizer : public popart::Transform 

Public Functions

AccumulateOuterFragmentParallelizer()

virtual ~AccumulateOuterFragmentParallelizer()

virtual bool apply(Graph &graph) const final

virtual std::vector<std::vector<Op*>> getBinConstraints(const Graph &graph) const

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class AutoVirtualGraph : public popart::Transform 

Public Functions

inline AutoVirtualGraph()

inline ~AutoVirtualGraph() override

bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

float costFn(Op *op, bool training, float w_weights, float w_activations) const

Public Static Functions

static std::size_t id()

class Autodiff : public popart::Transform 

Class responsible for the automatic differentiation (autodiff) transform.

Public Types

using TensorIds = std::vector<TensorId>: Vector of tensor IDs.

using FwdGraphId = GraphId : ID of the forward graph.

Public Functions

Autodiff(): Default constructor for the Autodiff class.

~Autodiff() override: Destructor for the Autodiff class.

bool apply(Graph &graph) const override

Perform automatic differentiation.

Implemented as applyToIr(graph.getIr()).

Parameters: graph – The autodiff transform is applied to the IR containing the Graph graph.
Returns: An indication of whether the automatic differentiation has been completed (true) or not (false).

virtual bool applyToIr(Ir &ir) const

Perform automatic differentiation.

Parameters: ir – The IR to apply the autodiff transform to.
Returns: An indication of whether the automatic differentiation has been completed (true) or not (false).

virtual FwdGraphToBwdGraphInfo apply(Ir &ir, const GraphId &fwdGraphId, const nonstd::optional<TensorIds> &gradsProvidedForFwdId, const nonstd::optional<TensorIds> &gradsRequiredForFwdId, const FwdGraphToBwdGraphInfo &calledGraphsGradInfo, AutodiffStitchStrategy stitchStrategy)

Create a backward graph.

Apply createBwdGraph() and stitch() recursively, top-down, to create a backward graph for the forward graph with ID fwdGraphId.

The forward graph being differentiated can call subgraphs. If the autodiff transform has already been applied to the subgraphs and the result stored in , then the backward graph that has already been created for the subgraphs will be used. Otherwise, this method will recurse on the subgraphs.

When recursing on a subgraph, this method does not know for which tensors gradients are required. If a null gradsRequiredForFwdId is passed, the autodiff transform will produce gradients for all input tensors.

For control over which gradients are produced for the subgraph, first (manually) call the autodiff transform on the subgraph and pass gradsRequiredForFwdId. Store the resultant BwdGraphInfo in the FwdGraphToBwdGraphInfo map passed to the autodiff call for the forward graph.

NOTE: This method may fail if any required gradient cannot be produced.

Parameters

ir – The IR to which this transform is applied.
fwdGraphId – The ID of the graph to differentiate.
gradsProvidedForFwdId – An optional list of tensors (normally outputs of the forward graph) for which gradient tensors are available to be used as inputs to the backward graph. If set, the autodiff transform will make the gradients of these forward tensors the first inputs of the backward graph. If not set, the autodiff transform will use whatever gradients of outputs of the forward graph it needs as outputs of the backward graph to allow the backward graph to produce the gradients that are required.
gradsRequiredForFwdId – The IDs of the forward graph tensors for which the backward graph should produce gradients. If set, the backward graph will compute only the gradients of the specified tensors and mark them as outputs. If one of these gradients cannot be computed, it is an error. If not set, the backward graph will produce as many gradients of the forward graph inputs as possible, and mark these all as outputs. If set, but an empty vector is passed, this is an error, as you are requesting no gradients be computed at all, resulting in an empty graph.
calledGraphsGradInfo – The result of applying the autodiff transform to the graphs that are called by subgraph ops in the forward graph. It is a precondition of this method that the graphs provided in this map are stitched.
stitchStrategy – The method used to stitch any result of the autodiff transform for graphs that are directly or indirectly called by the graph. This stitch strategy will be universally applied to all relevant inputs.

Returns

An FwdGraphToBwdGraphInfo object that contains BwdGraphInfo for all descended graphs and for which all entries have the following properties:

expectedInputs may contain a tuple (t, ExpectedConnectionType::Fwd) iff t is an input or output tensor of the forward graph. Only tensors t in gradsProvidedForFwdId may appear as a tuple (t, ExpectedConnectionType::FwdGrad) in expectedInputs. If gradsProvidedForFwdId is set, the first inputs will match the gradients of gradsProvidedForFwdId, respecting the order.
expectedOutputs may only contain tuples of the type (t, ExpectedConnectionType::FwdGrad) where t is an input tensor of the forward graph. If gradsRequiredForFwdId is set, the expectedOutputs list matches the size and order of gradsRequiredForFwdId exactly. If not set, the list is ordered in the order of the forward graph inputs, although some gradients of the forward graph inputs may be missing.

virtual BwdGraphInfo createBwdGraph(Ir &ir, const GraphId &fwdGraphId, const nonstd::optional<TensorIds> &gradsProvidedForFwdId, const nonstd::optional<TensorIds> &gradsRequiredForFwdId, const FwdGraphToBwdGraphInfo &calledGraphsGradInfo)

Create backward graph information for a specific subgraph (non-recursive).

This method returns an “unstitched” result. This means that it is not guaranteed that all non-gradient inputs to a backward graph are available as inputs or outputs of the forward graph. This is a precondition for BwdGraphInfo objects used as values in calledGraphsGradInfo. So, you must call stitch on the result before using the result information in an autodiff call.

NOTE: This method may fail if any required gradient cannot be produced.

Parameters

ir – The IR to which this transform is applied.
fwdGraphId – The ID of the subgraph to differentiate.
gradsProvidedForFwdId – An optional list of tensors (normally outputs of the forward graph) for which gradient tensors are available to be used as inputs to the backward graph. If set, autodiff will make the gradients of these forward tensors the first inputs of the backward graph. If not set, autodiff will use whatever gradients of outputs of the forward graph it needs as outputs of the backward graph to allow the backward graph to produce the gradients that are required.
gradsRequiredForFwdId – The IDs of the forward graph tensors for which the backward graph should produce gradients. If set, the backward graph will compute only the gradients of the specified tensors and mark them as outputs. If one of these gradients cannot be computed, it is an error. If not set, the backward graph will produce as many gradients of the forward graph inputs as possible, and mark all these as outputs. If set, but an empty vector is passed, this is an error, as you are requesting no gradients be computed at all, resulting in an empty graph.
calledGraphsGradInfo – The result of applying autodiff to the graphs that are called by subgraph ops in the forward graph. It is a precondition of this method that the graphs provided in this map are stitched.

Returns

A BwdGraphInfo object with the following properties:

expectedInputs may contain arbitrary tuples (t, ExpectedConnectionType::Fwd) where t is any tensor in the forward graph (it need not be an input or output). Only tensors t in gradsProvidedForFwdId may appear as a tuple (t, ExpectedConnectionType::FwdGrad) in expectedInputs. If gradsProvidedForFwdId is set, the first inputs will match the gradients of gradsProvidedForFwdId, respecting the order.
expectedOutputs may only contain tuples of the type (t, ExpectedConnectionType::FwdGrad) where t is an input tensor of the forward graph. If gradsRequiredForFwdId is set, the expectedOutputs list matches the size and order of gradsRequiredForFwdId exactly. If not set, the list is ordered in the order of the forward graph inputs, although some gradients of the forward graph inputs may be missing.

virtual BwdGraphInfo stitch(Ir &ir, const GraphId &fwdGraphId, const BwdGraphInfo &bwdGraphInfo, AutodiffStitchStrategy stitchStrategy, const nonstd::optional<std::vector<InIndex>> &stitchIndices)

Stitch a forward-backward graph pair.

To stitch a forward-backward graph pair means to make it so that the backward graph no longer has any non-gradient inputs of the forward graph tensors that are neither inputs nor outputs of the forward graph.

When applying the autodiff transform to a graph, PopART assumes that all input tensors to the gradient ops are either 1) a forward op input 2) a forward op output or 3) the gradient of a forward op output. For this to be true for gradient ops of subgraph ops (for example: CallOp and IfOp), typically the backward graphs of those called subgraphs must not have inputs that are associated with non-gradient forward tensors that are neither inputs nor outputs of the forward graph. This is because the inputs and outputs of a forward subgraph typically map to the inputs and outputs of the associated forward op. Similarly, the inputs and outputs of a backward subgraph typically map to the inputs and outputs of the associated gradient op.

For stitch strategies that affect the forward graph’s inputs or outputs, stitch() should also amend all call sites of the forward graph as appropriate. Conversely, for the backward graphs, it is assumed there are no call sites as it’s anticipated this method is called before parents of the backward graph exist.

NOTE: This method may modify the forward graph, backward graph, or any graphs that call these graphs, depending on the method. It also may raise a popart::error if it is unable to stitch an index.

Parameters

ir – The IR in the context of which this transformation is applied.
fwdGraphId – The ID of the subgraph to differentiate.
bwdGraphInfo – The data structure describing the backward graph.
stitchStrategy – The method by which to stitch any autodiff result for graphs that are directly or indirectly called by the graph.
stitchIndices – If provided, backward graph input indices not in this list must be ignored and backward graph input indices in this list must be stitched (or an exception raised). If not set, it is up to the stitcher to decide what indices to stitch.

Throws

popart::error – if unable to stitch an index.

Returns

An updated BwdGraphInfo data structure (with some expectedInputs removed).
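The "stitched" condition described above can be expressed as a simple check over the backward graph's expected inputs. The sketch below is a toy model, not PopART code: ToyConnType mirrors the two ExpectedConnectionType values mentioned in this section, tensors are plain strings, and all type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy mirror of the Fwd / FwdGrad connection types.
enum class ToyConnType { Fwd, FwdGrad };

// (forward tensor id, connection type) for one backward graph input.
using ToyExpectedInput = std::pair<std::string, ToyConnType>;

// Toy forward graph: only its input and output tensor ids matter here.
struct ToyFwdGraph {
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
};

inline bool toyContains(const std::vector<std::string> &ids,
                        const std::string &id) {
  return std::find(ids.begin(), ids.end(), id) != ids.end();
}

// Returns the backward graph input indices that refer to a non-gradient
// forward tensor that is neither an input nor an output of the forward graph.
inline std::vector<std::size_t>
indicesNeedingStitch(const ToyFwdGraph &fwd,
                     const std::vector<ToyExpectedInput> &expectedInputs) {
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < expectedInputs.size(); ++i) {
    const ToyExpectedInput &in = expectedInputs[i];
    if (in.second != ToyConnType::Fwd) {
      continue; // gradient inputs never need stitching in this model
    }
    if (!toyContains(fwd.inputs, in.first) &&
        !toyContains(fwd.outputs, in.first)) {
      result.push_back(i);
    }
  }
  return result;
}

// A pair is stitched when no backward graph input needs stitching.
inline bool isStitched(const ToyFwdGraph &fwd,
                       const std::vector<ToyExpectedInput> &expectedInputs) {
  return indicesNeedingStitch(fwd, expectedInputs).empty();
}
```

How an offending index is actually resolved depends on the stitch strategy, which this sketch does not model.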

inline std::size_t getId() const override: Get the ID of the autodiff transform.

inline std::string getName() const override: Get the name of the autodiff transform.

Public Static Functions

static std::size_t id(): ID of the autodiff transform.

class AutomaticLossScale : public popart::Transform 

Public Functions

inline AutomaticLossScale()

inline ~AutomaticLossScale() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

static Op *executeOpNTimesEveryMTimes(Op *op, unsigned n, unsigned m, const std::map<InIndex, OutIndex> &identityInputToOutputIndiciesMapping, const std::map<OutIndex, float> &outputIndiciesAndValues, AliasModel &aliasMode)

Transform an op so that it is effectively executed n times out of every m times.

It returns a pointer to an IfOp which either calls an ‘empty’ subgraph, or calls a subgraph containing the op passed as the argument. The ‘empty’ subgraph is meant to be low intensity compute. It is possible to connect inputs and outputs via nop operations and set up default values of outputs in the ‘empty’ subgraph.

Parameters

op – Operator whose execution frequency is modified.
n – Execute the op n times every m times.
m – Execute the op n times every m times.
identityInputToOutputIndiciesMapping – Specifies the connections of inputs to outputs via nop operations in the ‘empty’ subgraph. Each pair must have the same shape and type.
outputIndiciesAndValues – Map of pairs of output indices and values. Note: inplacing and aliasing of inputs are not supported. If the op inplace-modifies or aliases an input, this will no longer be the case in the transformed graph after this method is called.
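The "n times every m times" pattern can be pictured with a call counter. The sketch below assumes that the op runs on the first n of every m consecutive calls; where the n executions fall within each window of m is not stated here, so this is only one plausible reading. The function name runsOnCall is hypothetical.

```cpp
#include <cassert>

// Returns true if the wrapped op would run on call number callIndex
// (0-based), assuming it runs on the first n calls of every window of m.
inline bool runsOnCall(unsigned callIndex, unsigned n, unsigned m) {
  assert(m > 0 && n <= m);
  return (callIndex % m) < n;
}
```

On calls where this predicate is false, the 'empty' subgraph branch would be taken instead.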

class BatchSerialize : public popart::Transform 

Public Functions

inline BatchSerialize(int pass_)

inline ~BatchSerialize() override

bool apply(Graph &graph) const final

inline std::size_t getId() const final

inline std::string getName() const final

Public Static Functions

static std::size_t id(int)

class ClipWeightGradientsByNorm : public popart::Transform 

Public Functions

inline ClipWeightGradientsByNorm()

inline ~ClipWeightGradientsByNorm() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

static std::vector<std::vector<Op*>> findGradientClippingGroups(const Graph &graph)

class ContiguateCollectivesTransform : public popart::Transform 

A transform that inserts topological constraints into the graph.

These force collective operations which can potentially be merged to be scheduled contiguously (one right after the other) in the schedule.

Currently supported collective types:

ReplicatedAllReduceOp
ReplicatedReduceScatterOp
ReplicatedAllGatherOp

Public Functions

inline ContiguateCollectivesTransform()

inline ~ContiguateCollectivesTransform() override

bool apply(Graph &graph) const override

std::vector<Op*> applyToOps(Graph &graph, const std::set<OpId> includeOps) const

inline std::size_t getId() const override

inline std::string getName() const override

template<typename BaseType> void processOp(BaseType *baseOp, const std::vector<Op*> &schedule, std::set<Op*, POpCmp> &opsToProcess) const

Processing baseOp involves finding all other collective ops in the graph with which baseOp can be merged, then inserting constraints between the matching ops and baseOp that ensure the ops are scheduled contiguously, one after another.

Parameters

baseOp – is the Op that should be merged with other collectives
schedule – is a vector of ops sorted in schedule order
opsToProcess – is the set of all other collective ops in the graph (which are candidates for merging with baseOp)

Returns

void, modifies the graph of baseOp

Public Static Functions

static std::size_t id()

template<typename BaseType> static bool checkCollectiveOp(BaseType *baseOp, BaseType *candidate)

Check whether two ops use the same collective operator.

Parameters

baseOp – against which to compare the candidate op
candidate – op of the same type as baseOp

Returns

true, if the two ops use the same collective operator or if neither uses a collective operator

template<typename BaseType> static std::set<BaseType*, POpCmp> lookForMatchingOps(BaseType *baseOp, const std::vector<Op*> &schedule, std::set<Op*, POpCmp> &opsToProcess)

Loop through the ops in the schedule and find those matching baseOp. To avoid merging the same op twice, each matching op must still be in opsToProcess.

Parameters

baseOp – the op that should be merged with other collectives
schedule – the schedule of the (Collective) ops in the graph
opsToProcess – the (Collective) ops that can still be considered for merging

Returns

a set of collective ops that can be merged with the baseOp
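The search described above can be sketched with plain data types. This is a simplified model, not the PopART implementation: the only matching criteria are the op kind and the collective operator, baseOp itself is excluded from the result (an assumption), and ToyCollective and matchingOps are hypothetical names.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy description of a collective op.
struct ToyCollective {
  int id;
  std::string kind;       // for example "ReplicatedAllReduceOp"
  std::string collective; // for example "Add"
};

// Walks the schedule in order and collects the ids of ops that match base
// and are still pending in opsToProcess.
inline std::vector<int> matchingOps(const ToyCollective &base,
                                    const std::vector<ToyCollective> &schedule,
                                    const std::set<int> &opsToProcess) {
  std::vector<int> matches;
  for (const ToyCollective &op : schedule) {
    const bool sameKind =
        op.kind == base.kind && op.collective == base.collective;
    const bool stillPending = opsToProcess.count(op.id) > 0;
    if (op.id != base.id && sameKind && stillPending) {
      matches.push_back(op.id);
    }
  }
  return matches;
}
```

Once an op has been merged, removing its id from opsToProcess keeps it from being picked up again by a later search.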

class DecomposeGradSum : public popart::DecomposeSum

Public Functions

inline std::size_t getId() const override

inline std::string getName() const override

Public Static Functions

static std::size_t id()

class DecomposeLoops : public popart::Transform 

Transform that generically decomposes/unrolls loop iterations to:

Unroll LoopOp iterations in general
Arrange IO Ops to enable overlap between IO and compute tiles
Arrange Ops PipelineStages to enable overlap between PipelineStages

If we want to unroll a loop by a factor of 2, each Op that existed in the loop needs 3 instances, denoted as 0, 1 and 2, one per apparent iteration. If we want to unroll such that iterations can partially overlap (IO and compute overlap), we can’t generally, for all operations, place 0 before the loop, 2 after loop and 1 during the loop (see skewed unrolling below), because this would not lead to overlap between either pipeline stages or IO and compute operations.

Rather, we classify Ops (see DecomposeLoopOpTypeEnum), according to their data, topological dependencies and the tile set they are running on, into one of the categories. The available categories depend on the DecomposeLoopModel implementation. We can then shuffle the operations to before, during and after the loop accordingly. Note that every operation is cloned 2 extra times (for an unroll factor of 2), but the original operation in the loop remains.

However, the “apparent iteration” (iteration that the Op instance corresponds to in the LoopOp before unrolling) has changed.

The number of apparent iterations in total is always the unroll factor (counting all iterations before and after the loop) plus one iteration for the loop itself:

num_apparent_iterations = unroll_factor + 1

In loop iteration n, the Ops (depending on classification) now correspond to iterations i (0), i+1 (1) and i+2 (2) respectively. The Ops unrolled before the loop process iterations 0 (0) and 1 (1). The Ops unrolled after the loop process iterations n-1 (1) and n (2), where (0), (1) and (2) correspond to the cloned operations.

As an example for apparent iteration: Before unrolling, there is an operation in a loop (denoted as {}): { Op }

If we unroll by a factor of 2, the operation is cloned into the parent graph twice, and there are different possible arrangements, depending on how we skew the unrolling:

a.) { Op } - Op0 - Op1

In this case:

Op: unrollIndex -1, apparent iteration 0, before loop: no
Op0: unrollIndex 0, apparent iteration 1, before loop: no
Op1: unrollIndex 1, apparent iteration 2, before loop: no

(use case example: if Op is a HostStoreOp that should do overlapped IO with compute (such as a MatMulOp))

b.) Op0 - { Op } - Op1

In this case:

Op: unrollIndex -1, apparent iteration 1, before loop: no
Op0: unrollIndex 0, apparent iteration 0, before loop: yes
Op1: unrollIndex 1, apparent iteration 2, before loop: no

(use case example: if Op is a MatMulOp that should do overlapped compute with IO (such as HostLoadOp and HostStoreOp))

c.) Op0 - Op1 - { Op }

In this case:

Op: unrollIndex -1, apparent iteration 2, before loop: no
Op0: unrollIndex 0, apparent iteration 0, before loop: yes
Op1: unrollIndex 1, apparent iteration 1, before loop: yes

(use case example: if Op is a HostLoadOp that should do overlapped IO with compute (such as a MatMulOp))

Use case example:

HostLoadOp0 HostLoadOp1 { HostLoadOp  }
            MatMulOp0   { MatMulOp    } MatMulOp1
                        { HostStoreOp } HostStoreOp0 HostStoreOp1
            ^^^^^^^^^^^   ^^^^^^^^^^^   ^^^^^^^^^^^^
            overlap       overlap       overlap

{ } denotes the LoopOp

Where the data dependencies are:

HostLoadOp0 -> MatMulOp0 -> HostStoreOp
HostLoadOp1 -> MatMulOp -> HostStoreOp0
HostLoadOp -> MatMulOp1 -> HostStoreOp1

This skew is controlled by the decomposition model (see DecomposeLoopOpTypeEnum for details). If the model is unrolling pipeline stages, for example, each stage will be skewed differently (see DecomposeLoopPipelineModel).
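The load, compute and store use case above can be simulated to see which apparent iteration each op instance processes. The sketch below is an illustration only: it assumes an unroll factor of 2 and that the trip count of the remaining loop is the original count minus the unroll factor, and it lists overlapped ops sequentially. The function name skewedSchedule is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns "OpName(apparentIteration)" entries in execution order for a
// HostLoad -> MatMul -> HostStore loop unrolled by a factor of 2.
inline std::vector<std::string> skewedSchedule(int numIterations) {
  assert(numIterations >= 2); // at least as many iterations as the factor
  auto event = [](const char *op, int iteration) {
    return std::string(op) + "(" + std::to_string(iteration) + ")";
  };
  std::vector<std::string> order;

  // Before the loop: two loads and the first matmul.
  order.push_back(event("HostLoadOp0", 0));
  order.push_back(event("HostLoadOp1", 1));
  order.push_back(event("MatMulOp0", 0));

  // Inside the loop: trip i handles apparent iterations i+2, i+1 and i.
  for (int i = 0; i < numIterations - 2; ++i) {
    order.push_back(event("HostLoadOp", i + 2));
    order.push_back(event("MatMulOp", i + 1));
    order.push_back(event("HostStoreOp", i));
  }

  // After the loop: the last matmul and the last two stores.
  order.push_back(event("MatMulOp1", numIterations - 1));
  order.push_back(event("HostStoreOp0", numIterations - 2));
  order.push_back(event("HostStoreOp1", numIterations - 1));
  return order;
}
```

Every apparent iteration is still loaded, multiplied and stored exactly once, and each store follows the matmul that produced its data, which matches the data dependencies listed above.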

Public Functions

inline DecomposeLoops()

inline ~DecomposeLoops() override

virtual bool apply(Graph &graph) const final

Decomposes all LoopOps in the graph using the standard model of loop decomposition (which is DecomposeLoopOverlapModel())

Parameters: graph – Graph containing the LoopOp to decompose
Returns: true if apply is successful. An error will be thrown if not.

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

void decomposeLoop(Graph &graph, LoopOp *loopOp, const DecomposeLoopModel &model) const

Decompose a loop with a custom DecomposeLoopModel.

Parameters

graph – graph containing the LoopOp to decompose
loopOp – LoopOp to decompose
model – DecomposeLoopModel to apply

Public Static Functions

static std::size_t id()

static bool isComputeOp(Op *op)

Check if an Op should be classified as compute.

The condition is that the operation is on compute tiles.

Parameters: op – Op to check
Returns: true if it is a Compute Op

static bool isIOOp(Op *op)

Checks if an Op is an IO operation.

The condition is that the operation is one of HostLoadOp, HostStoreOp, RemoteLoadOp, RemoteStoreOp, MultiExchangeOp.

Parameters: op – Op to check
Returns: true if it is an IO Op

static bool isComputeLikeIOOp(std::set<ExchangeStrategy> computeLikeStrategies, Op *op)

Checks if an Op is classified as IO and executes on IO tiles, but should still be handled like a compute operation (that is, classified, unrolled and scheduled as DecomposeLoopOpTypeEnum::Compute) rather than as an IO operation that should overlap with compute (classified as DecomposeLoopOpTypeEnum::IoBeforeCompute or DecomposeLoopOpTypeEnum::IoAfterCompute).

Operations should be handled like compute instead of IO operations when they are not required to overlap with compute.

Parameters

computeLikeStrategies – ExchangeStrategy that should be considered as compute
op – Op to check

Returns

True if it is a compute Op

class DynamicOpTransform : public popart::Transform 

Public Functions

inline DynamicOpTransform()

inline ~DynamicOpTransform() override

bool apply(Graph &graph) const final

inline std::size_t getId() const final

void transferProperties(Op *from, Op *to) const

void inplace(Op *from) const

inline std::string getName() const final

Public Static Functions

static std::size_t id()

class EnsureFp32LossScale : public popart::Transform 

Public Functions

inline EnsureFp32LossScale()

inline ~EnsureFp32LossScale() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

bool isPassThroughOp(Op *op) const

For deciding whether to continue graph traversal from op’s outputs, or to terminate the traversal at this op.

Parameters: op – The op.
Returns: True if the op has a single input, and all its outputs are of the same type as the input.
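The return condition above translates directly into a predicate over input and output types. The sketch below uses toy types rather than PopART classes; ToyDataType, ToyOp and isPassThrough are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

enum class ToyDataType { Float16, Float32 };

// Toy op: only the data types of its inputs and outputs are modelled.
struct ToyOp {
  std::vector<ToyDataType> inputTypes;
  std::vector<ToyDataType> outputTypes;
};

// True if the op has exactly one input and every output has that input's
// data type.
inline bool isPassThrough(const ToyOp &op) {
  if (op.inputTypes.size() != 1) {
    return false;
  }
  const ToyDataType inType = op.inputTypes.front();
  return std::all_of(op.outputTypes.begin(), op.outputTypes.end(),
                     [inType](ToyDataType t) { return t == inType; });
}
```

Under this model a reshape-like op passes through, while a cast or a two-input add terminates the traversal.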

FromLossScaleTraversalOps traverseFromLossScaleTensor(const Graph &graph) const

Traverse the graph from the loss scale tensor.

We ‘pass through’ single-input ops that do not combine the loss scale (or a descendant of it) with an activation tensor.
Otherwise we terminate the traversal. We refer to these terminal ops as ‘mixed precision loss grad op’ (or MPLGO) candidates.

Parameters: graph – The graph to be traversed.
Returns: A pair containing the list of pass-through ops and MPLGO candidates.

bool shouldApply(const Graph &graph) const

Run the checks to see if the transform should be applied.

Parameters: graph – The graph that the checks are run on.
Returns: True if the checks pass.

void upCastTensor(Op *op, InIndex index) const

Upcast fp16 tensor at input index index to op to fp32.

This is done by disconnecting the input tensor, inserting a CastOp, and re-connecting the output tensor of the CastOp at index.

Parameters

op – The op whose input is to be upcast.
index – The input index to op at which the tensor is to be upcast.

void downCastTensor(Tensor *tensor) const

Downcast fp32 tensor to fp16.

This is done by disconnecting it from its consumers, inserting a CastOp, and re-connecting the output tensor of the CastOp to the consumers.

Parameters: tensor – The tensor to be downcast.

Public Static Functions

static std::size_t id()

static bool isMixedPrecisionLossGradOp(Op *op)

To return true, the op’s implementation must be able to handle mixed precision maths.

We have no good way to know this programmatically at the point of running this transform, so we hard code this information here.

Parameters: op – The op to check for an implementation that is known to support mixed precision inputs.
Returns: True if it has an implementation known to support mixed precision inputs.

static Tensor *getLossScaleInputTensor(Op *op)

Only to be called on an op for which a call to isMixedPrecisionLossGradOp returns true.

Parameters: op – An MPLGO candidate whose loss scale tensor (or descendant thereof) you want to find.
Returns: The input tensor.

class ExplicitRecompute : public popart::Transform 

Explicit recomputation is a transformation that clones forward-pass operations marked for recomputation and adds the clones to the backward pass.

Consider a fragment of the training graph before the explicit recomputation transform, where one gradient operation (CheckpointOp1Grad) requires a value from the forward pass (RecomputeOp1) which is considered for recomputation:

(where CheckpointOp* is an op with op->settings.recomputeType == RecomputeType::Checkpoint and RecomputeOp* is an op with op->settings.recomputeType == RecomputeType::Recompute)

By marking these ops as ‘recompute’, the output of RecomputeOp1 does not need to remain live until the recomputation of CheckpointOp1Grad. In other words, the memory used to store this tensor is freed for allocation of other tensors as soon as RecomputeOp1’s output is read during the computation of CheckpointOp1. How does this work in practice?

After the transform, the graph fragment will look like:

Every operation marked as Recompute will be cloned and added to the backward pass, while all Checkpoint operations will remain connected as-is.

In pipelining, every copy operation between pipeline stages is (required to be) checkpointed (in order to not cause data dependencies between stages running in parallel), while everything else is recomputed. The user can choose to checkpoint more, but not recompute more (with pipelining).

The alternative, in the case of implicit recomputation, is to not transform the graph at the IR level, and to use these recomputation settings to affect the Ir lowering. In this case, the poplar::program::Sequences that correspond to the lowered RecomputeOps are added once to the main program as scheduled in the forward pass, and then again directly preceding the poplar::program::Sequence of the CheckpointOp1Grad. See the FindRequiredRecomputes class in irlowering.cpp.

Public Functions

inline ExplicitRecompute()

inline ~ExplicitRecompute() override

bool apply(Graph &graph) const final

inline std::size_t getId() const final

inline std::string getName() const final

Public Static Functions

static std::size_t id()

class HostIOSetup : public popart::Transform 

Public Functions

inline HostIOSetup(int pass_)

inline ~HostIOSetup() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id(int)

class InferPipelineStages : public popart::Transform 

Public Functions

inline InferPipelineStages()

inline ~InferPipelineStages() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class InplaceAccumulateGradPartialsIntoOptimizerAccumTensor : public popart::Transform 

Replaces an accumulation tree consumed by an AccumulateOp (which has its own accumulator tensor), with an accumulation tree directly on the AccumulateOp’s accumulator tensor, thereby removing one allocation from the graph (the accumulation tree’s original accumulation tensor).

More precisely:

Becomes:

    A
    |
  accum  pW0
     \   /
   Accumulate
       |
      dW1  pW1
        \  /
     Accumulate
         |
       accum’
         |
         B

See below comment for more discussion of the conditions required to be able to perform this transform.

The primary use case of this is a decomposed grad sum whose addition tree is fed into an AccumulateOp as part of the optimiser step.

Public Functions

InplaceAccumulateGradPartialsIntoOptimizerAccumTensor()

~InplaceAccumulateGradPartialsIntoOptimizerAccumTensor() final

bool apply(Graph &graph) const final

inline std::size_t getId() const final

inline std::string getName() const final

Public Static Functions

static std::size_t id()

class InterIpuCopy : public popart::Transform 

Public Functions

inline InterIpuCopy()

inline ~InterIpuCopy() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class IoComputeTileCopy : public popart::Transform 

Public Functions

inline IoComputeTileCopy()

inline ~IoComputeTileCopy() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class MainLoops : public popart::Transform 

Public Functions

inline MainLoops()

inline ~MainLoops() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

static inline std::string getStepGraphName()

Return the name of the step subgraph.

The step subgraph is the body of the LoopOp stepLoop. The stepLoop is run when session.run(...) is called, and runs batchesPerStep times (that is, the trip_count of the loop equals batchesPerStep). A step thus constitutes a call to session.run(...). Because a call to session.run(...) involves a call to engine.run(), which is expensive and involves returning to the host for more data, batchesPerStep should be as large as possible.

See https://github.com/onnx/onnx/blob/master/docs/Operators.md#Loop for details about the loop operator

Returns: The name of the step graph

static inline std::string getAccumulationGraphName()

Return the name of the gradient accumulation subgraph.

The gradient accumulation subgraph is the body of the LoopOp accumLoop. The accumLoop runs accumulationFactor times (that is, the trip_count of the loop equals accumulationFactor) and accumulates the gradients for each pass. These accumulated gradients are used to calculate the weight update.

See https://github.com/onnx/onnx/blob/master/docs/Operators.md#Loop for details about the loop operator

Returns: The name of the accumulation graph

static Graph &getInnerLoopSubgraph(Ir &ir)

Helper function for accessing the subgraph of the inner loop.

The inner loop depends on the values of accumulationFactor and batchesPerStep. The inner loop equals:

The mainGraph if accumulationFactor = 1 and batchesPerStep = 1
The accumulationGraph if accumulationFactor > 1 and batchesPerStep = 1
The stepGraph if accumulationFactor = 1 and batchesPerStep > 1
The accumulationGraph if accumulationFactor > 1 and batchesPerStep > 1

Warning

This function should only be used after the transform has been applied, that is, after apply() has been called.

Note

innerLoop and outerLoop are represented by different graphs only when accumulationFactor > 1 and batchesPerStep > 1. In that case the outerLoop repeats the innerLoop.

Returns: The inner loop subgraph

static const Graph &getInnerLoopSubgraph(const Ir &ir)

static Graph &getOuterLoopSubgraph(Ir &ir)

Helper function for accessing the subgraph of the outer loop.

The outer loop depends on the values of accumulationFactor and batchesPerStep. The outer loop equals:

The mainGraph if accumulationFactor = 1 and batchesPerStep = 1
The accumulationGraph if accumulationFactor > 1 and batchesPerStep = 1
The stepGraph if accumulationFactor = 1 and batchesPerStep > 1
The stepGraph if accumulationFactor > 1 and batchesPerStep > 1

Warning

This function should only be used after the transform has been applied, that is, after apply() has been called.

Note

innerLoop and outerLoop are represented by different graphs only when accumulationFactor > 1 and batchesPerStep > 1. In that case the outerLoop repeats the innerLoop.

Returns: The outer loop subgraph

static LoopOp *getInnerLoopOp(Ir &ir)

class MergeAllVarUpdates : public popart::MergeVarUpdates 

Public Functions

inline MergeAllVarUpdates()

inline ~MergeAllVarUpdates() override

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class MergeAuto : public popart::MergeVarUpdates 

Subclassed by popart::MergeLooseThreshold, popart::MergeTightThreshold

Public Functions

int64_t getThresholdMemory(const Graph&) const

class MergeLooseThreshold : public popart::MergeAuto 

Public Functions

inline MergeLooseThreshold()

inline ~MergeLooseThreshold() override

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

int64_t getMemToPlayWithAtPeak(const Graph&) const

Public Static Functions

static std::size_t id()

class MergeTightThreshold : public popart::MergeAuto 

Public Functions

inline MergeTightThreshold()

inline ~MergeTightThreshold() override

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class MergeCollectivesTransform : public popart::Transform 

A transform for merging multiple compatible collective operations into a single larger collective operation.

Ops are only merged if they appear in contiguous order in the schedule.

Currently supported collective types:

ReplicatedAllReduceOp

Public Functions

inline MergeCollectivesTransform()

inline ~MergeCollectivesTransform() override

bool apply(Graph &graph) const override

std::vector<Op*> applyToOps(Graph &graph, const std::set<OpId> includeOps) const

inline std::size_t getId() const override

inline std::string getName() const override

template<typename BaseType> bool collectiveOpCheck(BaseType *A, BaseType *B) const

Confirm that two collective ops of the same BaseType use the same collective operation (for example, ADD or MUL). If the BaseType does not require a collective operation (for example, gather), return true.

Parameters

A – the first op
B – the second op

Returns

true if A and B use the same collective operation or both use none

template<typename MultiOpType, typename BaseType> Op *attemptToMergeOnOp(BaseType *baseOp, std::vector<Op*>::iterator &schedulePos, std::vector<Op*> &opSchedule) const

Given a collective operation, attempt to merge it with other compatible collective ops which are tied (in the schedule) to the current op.

Parameters

baseOp – a collective op that should be merged
opSchedule – the schedule of all (collective) ops in the graph

Returns

A pointer to the constructed op

template<typename MultiOpType, typename BaseType> std::unique_ptr<MultiOpType> constructMultiOp(BaseType *baseOp, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet, std::vector<BaseType*> matchingOps) const

Constructs a new MultiOpType which will replace the baseOp and all matching ops.

Parameters

baseOp – is the operation to be replaced
outInfoFromBaseOps – is the output information for each output tensor collected from the ops with which base op will be merged.
inputVirtualGraphIdAndTileSet – the input virtual graph and tile set information collected from the ops that will be merged
outputVirtualGraphIdAndTileSet – the output virtual graph and tile set information collected from the ops that will be merged
matchingOps – the vector of matching ops

Returns

a unique pointer to the new multi-collective op

Public Static Functions

static std::size_t id()

class MergeCopies : public popart::Transform 

Public Functions

inline MergeCopies()

inline ~MergeCopies() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class MergeDuplicateOps : public popart::Transform 

Public Functions

inline MergeDuplicateOps()

inline ~MergeDuplicateOps() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class MergeExchange : public popart::Transform 

Public Functions

inline MergeExchange()

inline ~MergeExchange() override

bool apply(Graph &graph) const override

std::vector<Op*> applyToOps(Graph &graph, const std::set<OpId> include_ops) const

inline std::size_t getId() const override

inline std::string getName() const override

Public Static Functions

static std::size_t id()

class MergeLoops : public popart::Transform 

Public Functions

inline MergeLoops()

inline ~MergeLoops() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class MergeVarUpdates : public popart::Transform 

Subclassed by popart::MergeAllVarUpdates, popart::MergeAuto

Public Types

using PartitionId = std::string

using PartitionMap = std::map<PartitionId, std::vector<VarUpdateStartEnd>>

Public Functions

PartitionId getPartitionId(Op *op) const

virtual bool apply(Graph&) const final

PartitionMap getLargestGroupTargetsMap(const Graph&) const

class OverlapIO : public popart::Transform 

Public Functions

inline OverlapIO()

inline ~OverlapIO() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

static std::map<ExchangeStrategy, std::set<PipelineStage>> overlapIORequired(Ir &ir)

Check what level of ExchangeStrategy is required with overlapped IO.

Each pipeline stage can contain IO operations that belong to any of the strategies defined in the ExchangeStrategy enum. This will then inform how the IO operations of each pipeline stage have to be unrolled.

Parameters: ir – IR to check for overlapped IO settings
Returns: Map of required exchange strategies and pipeline stages in which exchanges occur. The set of stages will be empty if the ExchangeStrategy is not set on the InputSettings or AnchorReturnType of an input or output respectively. HostLoad and HostStore operations inserted by the HostIOSetup transform will inherit the ExchangeStrategy from InputSettings or AnchorReturnType respectively.

class Pipeline : public popart::Transform 

Public Functions

inline Pipeline()

inline ~Pipeline() override

virtual bool apply(Graph &graph) const final

Checks if the pipelining settings are valid and applies either implicit or explicit pipelining transforms to the graph.

Parameters: graph – top-level IR graph (main graph) for implicit pipelining, pipeline loop subgraph for explicit pipelining
Returns: true if the transformation has changed the graph

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

bool addDynamicStashAndRestoreOps(Graph &graph) const

Add all required dynamic update and dynamic slice operations to the graph, which link forward and recompute/backward stages together via stashes. Only works for explicit pipelining.

Parameters: graph – Pipeline loop subgraph
Returns: true if successful; raises an error otherwise

bool contiguateIpuCopies(Graph &graph) const

Add required IpuCopyOps to ensure that within the pipelined execution, no copies between non-contiguous pipeline stages occur.

Parameters: graph – Pipeline loop subgraph
Returns: true if successful; raises an error otherwise

int getStashSize(const Ir &ir, PipelineStage stashStage, PipelineStage maxRestoreStage) const

Calculate the required stash size.

Parameters

ir – The current IR
stashStage – The stage in which the stash is updated
maxRestoreStage – The last stage in which the stash is restored

Returns

Required number of stash entries

Public Static Functions

static std::size_t id()

static bool checkIsFullRecompute(Graph &graph)

static bool checkIsFullCheckpoint(Graph &graph)

static bool inplaceRestoreRequiredForRecompute(Op *op)

Implicit pipelining and implicit recompute only! Test if the (implicit) recompute logic requires an inplace restored version of a forward ActGrad tensor (from the stash)

Parameters: op – the Op to check if it is convertible to RestoreInplaceOp and is required for (implicit) recompute
Returns: True if the inplace restore is required

static bool inplaceRecomputationConflict(Op *op, InIndex in, OutIndex out)

Implicit pipelining and implicit recompute only! Check if implicit recompute is in conflict with implicit pipelining when restoring a forward ActGrad tensor inplace.

Parameters

op – the Op to check
in – input index of the Op
out – output index of the Op

Returns

true if there is an inplace overwriting conflict

static void setFinalFwdStageRecomputation(Graph &graph)

Implicit pipelining and implicit recompute only! This annotation pass will try to set the Ops between the topologically final Checkpoints and the loss to NOT be recomputed.

This avoids a program where operations are run twice in a row with no benefit to liveness.

Parameters: graph – top-level IR graph (main graph)

static void checkOpsPipelineStage(Graph &graph)

Check and adjust pipeline stage annotations on operations.

Parameters: graph – Graph on which to check pipeline stages

static std::map<PipelineStage, PipelineStage> withStages(const Ir &ir)

Check which stages should be executed with which other stage.

Parameters: ir – IR from which to read the pipeline stages
Returns: Map of pipeline stages to which stage to execute with in sequence.

class PreAutomaticLossScale : public popart::Transform 

A transform that annotates tensors in the forward graph, so that their gradients can be tracked in automatic loss scaling.

This transform reads a list of user-provided tensor IDs in the forward graph and inserts AutoLossScaleProxyOps after them (see example below). Later in the lowering process, the Autodiff transform will place the corresponding AutoLossScaleProxyGradOps in the backward graph, marking the tensor locations in the graph, for which to track gradients.

Example graph before applying the transform:

    A -- MulOp -- C
    B -'

Example graph after applying the transform with toTrackTensors = ["A", "C"]:

    A -- AlsProxyOp -- A* -- MulOp -- C -- AlsProxyOp -- C*
    B ---------------------'

It is important to apply the AutomaticLossScale transform after PreAutomaticLossScale and Autodiff to remove all AutoLossScaleProxyOps and AutoLossScaleProxyGradOps.

Public Functions

inline PreAutomaticLossScale()

inline ~PreAutomaticLossScale() override

virtual bool apply(Graph &graph) const final

Annotate tensors in the forward graph, so that their gradients can be found and tracked in automatic loss scaling.

See class documentation for details.

Parameters

graph – The graph to transform.

Throws

error – if the user provides an empty list to automaticLossScalingSettings.toTrackTensors.
error – if any of the tensor IDs in automaticLossScalingSettings.toTrackTensors don’t exist in the graph.

Returns

true if there was a change to the graph, false otherwise.

inline virtual std::size_t getId() const final

virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class Prune : public popart::Transform 

Public Functions

inline Prune()

inline ~Prune() override

bool apply(Graph &graph) const override

inline std::size_t getId() const override

inline std::string getName() const override

Public Static Functions

static std::size_t id()

class RandomSetup : public popart::Transform 

Public Functions

inline RandomSetup()

inline ~RandomSetup() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

static bool hasRandomSeed(const Ir &ir)

static bool requiresRandomSeed(const Ir &ir)

static TensorId getStreamedSeedTensorId()

class RemoteSetup : public popart::Transform 

Public Functions

inline RemoteSetup()

inline ~RemoteSetup() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

static void getRemoteArgMapping(Graph &graph, RemoteArgOpMap&, RemoteOpArgMap&, RemoteArgBufferMap&)

class SerializeMatMuls : public popart::Transform 

Public Functions

inline SerializeMatMuls()

inline ~SerializeMatMuls() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class StochasticRounding : public popart::Transform 

Public Functions

inline StochasticRounding()

inline ~StochasticRounding() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

class StreamingMemory : public popart::Transform 

Public Functions

inline StreamingMemory(int pass_)

inline ~StreamingMemory() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id(int)

class SubgraphOutline : public popart::Transform 

Class for creating functionally equivalent subgraphs from SubgraphableOpClusters, and replacing instances of SubgraphableOpClusters with calls to these subgraphs.

Further down the stack, this allows for code-reuse, which results in a lower memory footprint for the compiled graph.

Public Functions

inline SubgraphOutline()

inline ~SubgraphOutline() override

virtual bool apply(Graph &graph) const final

inline virtual std::size_t getId() const final

inline virtual std::string getName() const final

Public Static Functions

static std::size_t id()

static Graph &createSubgraph(const std::vector<SubgraphableOpCluster> instances, Ir &ir, std::map<Op*, int> &index_map, std::string subgraphId = "call")

Create a subgraph from a set of identical op clusters.

Parameters

instances – A set of SubgraphableOpClusters that can be replaced by a call to the same subgraph. All SubgraphableOpCluster instances must be functionally equivalent.
ir – The IR.
index_map – An empty map, passed by reference. Used to map from ops in the new subgraph to their corresponding indices in the first SubgraphableOpCluster instance. Required as input argument to ‘replaceWithCallOp’.
subgraphId – The returned subgraph’s id.

Returns

A Graph that is functionally equivalent to each SubgraphableOpCluster instance.

static Graph &createEmptySubgraph(const SubgraphableOpCluster &instance, Ir &ir, std::string subgraphId, const std::map<InIndex, OutIndex> &identityInputToOutputIndiciesMapping, const std::map<OutIndex, float> &outputIndiciesAndValues, AliasModel &aliasModel)

Create an ‘empty’ subgraph from an op cluster.

Parameters

instance – A SubgraphableOpCluster used as a template for building the ‘empty’ subgraph, in which input and output tensors can be connected via nops and output tensors can be set to default values.
ir – The IR.
subgraphId – The returned subgraph’s id.
identityInputToOutputIndiciesMapping – Specifies the connections of inputs to outputs via nop operations in the ‘empty’ subgraph. Each pair must have the same shape and type.
outputIndiciesAndValues – Map of pairs of output indices and values.

Returns

A low-compute Graph that stands in for the op when it is not executed.

static void setSubgraphOpSettingsFromClusterInstance(Op *op, const SubgraphableOpCluster &instance)

static Op *replaceWithCallOp(const SubgraphableOpCluster &instance, Graph &subgraph, const std::map<Op*, int> &index_map, AliasesMap &aliasesMap)

Replace a cluster of ops with a call to a subgraph.

Parameters

instance – The SubgraphableOpClusters instance to be replaced.
subgraph – The subgraph, a call to which is to replace the instance.
index_map – Used to map from ops in the new subgraph to their corresponding indices in the first SubgraphableOpCluster instance.
aliasesMap – AliasesMap with alias information for instance’s graph.

Returns

The replacement CallOp’s pointer.

static Op *replaceWithEmptyElseBranchIfOp(const SubgraphableOpCluster &instance, Graph &subgraph, Graph &emptySubgraph, const std::map<Op*, int> &index_map, AliasesMap &aliasesMap, Tensor *flag)

Replace an op with an IfOp.

The op is moved to the first branch of the IfOp. The second branch is for low-intensity compute, which passes input tensors to outputs or provides default output tensors.

Parameters

instance – The SubgraphableOpCluster instance that holds the op to be replaced.
subgraph – The ‘then’ branch subgraph, which contains the op.
emptySubgraph – The ‘else’ branch subgraph, used for low-intensity compute.
index_map – Used to map from ops in the new subgraph to their corresponding indices in the first SubgraphableOpCluster instance.
aliasesMap – AliasesMap with alias information for instance’s graph.
flag – a Tensor deciding which branch should be used.

Returns

The replacement IfOp’s pointer.

#include <popart/bwdgraphinfo.hpp>

struct BwdGraphInfo

A data structure that captures the result of applying autodiff to a graph.

Public Functions

bool operator==(const BwdGraphInfo &rhs) const: Equality operator.

Public Members

GraphId bwdGraphId: A newly constructed backward graph.

ExpectedConnections expectedInputs: Expected connection details for each of bwdGraph’s inputs.

ExpectedConnections expectedOutputs: Expected connection details for each of bwdGraph’s outputs.

enum class popart::ExpectedConnectionType

The type of tensor expected to connect to a graph input or output.

Values:

enumerator Fwd = 0: A tensor from a forward graph.

enumerator FwdGrad = 1: The gradient of a tensor from a forward graph.

struct ExpectedConnection

Description of tensor expected to connect to graph input or output.

Public Functions

bool operator==(const ExpectedConnection &rhs) const: Equality operator.

Public Members

TensorId fwdId: TensorId in the fwdGraph.

ExpectedConnectionType type: Either fwdId or getGradId(fwdId).

14.11. Utility classes

14.11.1. Graph

#include <popart/graphutils.hpp>

using popart::graphutils::CallStack = std::vector<Op*>: CallStack representation.

using popart::graphutils::TensorAndCallStack = std::pair<Tensor*, CallStack>

14.11.2. Region

#include <popart/region.hpp>

14.11.3. Error handling

#include <popart/error.hpp>

enum class popart::ErrorSource

Values:

enumerator popart = 0

enumerator popart_internal

enumerator poplar

enumerator poplibs

enumerator unknown

class error : public runtime_error 

Exception class for popart.

Subclassed by popart::internal_error, popart::memory_allocation_err, popart::runtime_error

Public Functions

template<typename ...Args> inline explicit error(const char *s, const Args&... args)

Variadic constructor for error which allows the user to use a fmt string for the message.

throw error("This is an error reason {}", 42);

template<typename ...Args> inline explicit error(const std::string &s, const Args&... args)

template<typename ...Args> inline explicit error(ErrorUid uid, const char *s, const Args&... args)

template<typename ...Args> inline explicit error(ErrorUid uid, const std::string &s, const Args&... args)

const std::string &stackreport() const

inline ErrorUid uid() const

class internal_error : public popart::error 

Exception class specific to internal errors. This should be used like an assert, for states that the user should not have been able to create.

Public Functions

template<typename ...Args> inline explicit error(const char *s, const Args&... args)

Variadic constructor for error which allows the user to use a fmt string for the message.

throw error("This is an error reason {}", 42);

template<typename ...Args> inline explicit error(const std::string &s, const Args&... args)

template<typename ...Args> inline explicit error(ErrorUid uid, const char *s, const Args&... args)

template<typename ...Args> inline explicit error(ErrorUid uid, const std::string &s, const Args&... args)

class memory_allocation_err : public popart::error 

Subclassed by popart::popx::devicex_memory_allocation_err

Public Functions

inline memory_allocation_err(const std::string &info)

virtual std::unique_ptr<memory_allocation_err> clone() const = 0

virtual std::string getSummaryReport() const = 0

virtual std::string getProfilePath() const = 0

class runtime_error : public popart::error 

Exception class specific to errors that occur when running a model.

For example, this error could be thrown when a user-implemented IStepIO callback doesn’t return any data.

NOTE: This is different from a C++ runtime error.

Public Functions

template<typename ...Args> inline explicit error(const char *s, const Args&... args)

Variadic constructor for error which allows the user to use a fmt string for the message.

throw error("This is an error reason {}", 42);

template<typename ...Args> inline explicit error(const std::string &s, const Args&... args)

template<typename ...Args> inline explicit error(ErrorUid uid, const char *s, const Args&... args)

template<typename ...Args> inline explicit error(ErrorUid uid, const std::string &s, const Args&... args)

class devicex_memory_allocation_err : public popart::memory_allocation_err 

Public Functions

devicex_memory_allocation_err(const devicex_memory_allocation_err &rhs)

devicex_memory_allocation_err(const poplar::graph_memory_allocation_error &e, const poplar::OptionFlags &_reportOptions)

std::unique_ptr<memory_allocation_err> clone() const

std::string getSummaryReport() const

std::string getProfilePath() const

14.11.4. Debug context

#include <popart/debugcontext.hpp>

class DebugContext

Public Functions

DebugContext(SourceLocation loc = SourceLocation::Current())

DebugContext(const char *name, SourceLocation loc = SourceLocation::Current())

DebugContext(std::string name, SourceLocation loc = SourceLocation::Current())

DebugContext(const DebugInfo &debugInfo, std::string name = "", SourceLocation loc = SourceLocation::Current())

DebugContext(const DebugNameAndId &debugNameAndId, std::string name = "", SourceLocation loc = SourceLocation::Current())

DebugContext(DebugContext&&)

DebugContext(const DebugContext&)

~DebugContext()

std::string getPathName() const

class DebugInfo

Subclassed by popart::OnnxOpDebugInfo, popart::OnnxVariableDebugInfo, popart::OpDebugInfo, popart::TensorDebugInfo

Public Types

enum class SerializationFormat

Values:

enumerator JSON: Serialise in JSON format.

enumerator CBOR: Serialise in CBOR format.

Public Functions

DebugInfo(const DebugContext &debugContext, const std::string &layer)

DebugInfo &operator=(const DebugInfo&) = delete

DebugInfo(const DebugInfo&) = delete

virtual ~DebugInfo()

DebugId getId() const

std::string getPathName() const

bool setValue(std::string name, ProfileValue value)

Public Static Functions

static void initializeStreamer(const std::string &fileName, const SerializationFormat &format = SerializationFormat::CBOR)

static void closeStreamer()

class OnnxOpDebugInfo : public popart::DebugInfo 

Public Functions

OnnxOpDebugInfo(const DebugContext &debugContext, const Node &node)

OnnxOpDebugInfo &operator=(const OnnxOpDebugInfo&) = delete

OnnxOpDebugInfo(const OnnxOpDebugInfo&) = delete

virtual ~OnnxOpDebugInfo() = default

class OnnxVariableDebugInfo : public popart::DebugInfo 

Public Functions

OnnxVariableDebugInfo(const DebugContext &dc, const ONNX_NAMESPACE::TensorProto &proto)

OnnxVariableDebugInfo(const DebugContext &dc, const ONNX_NAMESPACE::ValueInfoProto &proto)

OnnxVariableDebugInfo(const DebugContext &dc, const ONNX_NAMESPACE::ValueInfoProto &proto, const TensorInfo &ti)

OnnxVariableDebugInfo &operator=(const OnnxVariableDebugInfo&) = delete

OnnxVariableDebugInfo(const OnnxVariableDebugInfo&) = delete

virtual ~OnnxVariableDebugInfo() = default

class OpDebugInfo : public popart::DebugInfo 

Public Functions

OpDebugInfo(const DebugContext &debugContext, const Op &_op)

virtual ~OpDebugInfo()

OpDebugInfo &operator=(const OpDebugInfo&) = delete

OpDebugInfo(const OpDebugInfo&) = delete

void finalize()

class TensorDebugInfo : public popart::DebugInfo 

Public Functions

TensorDebugInfo(const DebugContext &debugContext, const TensorId &tenid, const TensorInfo &info, const TensorType &tt)

TensorDebugInfo(const DebugContext &debugContext, const TensorId &tenid, const TensorType &tt)

TensorDebugInfo &operator=(const TensorDebugInfo&) = delete

TensorDebugInfo(const TensorDebugInfo&) = delete

virtual ~TensorDebugInfo() = default

14.11.5. Attributes

#include <popart/attributes.hpp>

class Attributes

Wrapper around the container of ONNX_NAMESPACE::AttributeProtos of a Node.

Provides faster and cleaner reads of values from keys (strings) than ONNX_NAMESPACE::AttributesProto.

Public Types

using Ints = std::vector<int64_t>: The types of attributes as defined in the ONNX spec.

using Int = int64_t

using Floats = std::vector<float>

using Float = float

using Strings = std::vector<std::string>

using String = std::string

using Graphs = std::vector<ONNX_NAMESPACE::GraphProto>

using Graph = ONNX_NAMESPACE::GraphProto

Public Functions

Attributes(const NodeAttributes&)

Attributes() = default

const std::vector<std::string> &getNames() const

onnxAttPtr at(const std::string &name) const

void append(std::stringstream &ss, std::string prefix = "") const

template<typename T> void setIfPresent(T&, const std::string &key) const

template<typename T> void set(T&, const std::string &key) const

bool hasAttribute(const std::string &key) const

void takeAttribute(const std::string &key, const Attributes &attributes): Take an attribute identified by key from the given Attributes object.

template<typename UnaryPredicate> inline Attributes filter(UnaryPredicate p) const: Take the set of attributes that match the given predicate.

template<typename T> T getAttribute(const std::string &key, const T &defaultValue) const

Attributes::Graphs getAllGraphAttributes() const

template<typename T> T getAttribute(const std::string &key) const

template<typename T> void setAttribute(const std::string &key, T&)

template<> Attributes filter(const char *key) const

template<> Attributes filter(std::string key) const

template<> void setIfPresent(std::vector<int64_t>&, const std::string &key) const

template<> void setIfPresent(int64_t&, const std::string &key) const

template<> void setIfPresent(bool &v, const std::string &key) const

template<> void setIfPresent(std::string&, const std::string &key) const

template<> void setIfPresent(float&, const std::string &key) const

template<> void set(std::vector<int64_t> &vs, const std::string &key) const

template<> void set(std::vector<float> &vs, const std::string &key) const

template<> void set(std::vector<std::string> &vs, const std::string &key) const

template<> void set(float &v, const std::string &key) const

template<> void set(int64_t &v, const std::string &key) const

template<> Attributes::Ints getAttribute(const std::string &key, const Attributes::Ints &defaultValue) const

template<> Attributes::Int getAttribute(const std::string &key, const Attributes::Int &defaultValue) const

template<> Attributes::String getAttribute(const std::string &key, const Attributes::String &defaultValue) const

template<> Attributes::Float getAttribute(const std::string &key, const Attributes::Float &defaultValue) const

template<> Attributes::Ints getAttribute(const std::string &key) const

template<> void setAttribute(const std::string &key, Attributes::Ints&)

template<> void setAttribute(const std::string &key, Attributes::Int&)

template<> void setAttribute(const std::string &key, Attributes::String&)

14.11.6. Void data

#include <popart/voiddata.hpp>

class ConstVoidData

A class to point to constant data.

Public Functions

ConstVoidData() = default

ConstVoidData(const void *data_, const TensorInfo &info_)

inline bool storesData() const

void store(std::vector<char> &&d, const TensorInfo &i)

Public Members

const void *data = nullptr

TensorInfo info

class MutableVoidData

A class to point to non-constant data.

Public Members

void *data = nullptr

TensorInfo info

14.11.7. Input shape information

#include <popart/inputshapeinfo.hpp>

class InputShapeInfo

Class that contains what is known about the input tensors (as TensorInfo objects) in the IR prior to compilation.

This knowledge can sometimes be compiled into the IR, and for certain backends it is even required. For example, the IPU requires the shapes of all Stream Tensors.

Public Functions

InputShapeInfo() = default: Default constructor for the InputShapeInfo class.

void add(TensorId, const TensorInfo&)

Add the identifier and TensorInfo object for a tensor to the InputShapeInfo object.

Parameters

TensorId – The identifier of the tensor for which information is being added.
TensorInfo – The tensor information to be added.

const TensorInfo &get(TensorId) const

Get the information of a tensor.

Parameters: TensorId – The identifier of the tensor for which to get the tensor information.

bool has(TensorId) const

Check if the InputShapeInfo object contains information for a tensor.

Parameters: TensorId – The identifier of the tensor to check.
Returns: If true, the InputShapeInfo object contains information for the tensor. If false, the InputShapeInfo object does not contain information for the tensor.

std::vector<TensorId> getAllTensorIds() const

Get all unique tensor identifiers of tensors in the InputShapeInfo object.

Returns: Vector of tensor identifiers.

inline const std::map<TensorId, TensorInfo> &getInfos() const

Get all information contained in the InputShapeInfo object.

Returns: Map of tensor identifiers and the corresponding tensor information.

14.11.8. Profiling

#include <popart/liveness.hpp>

class LivenessAnalyzer

Public Types

using PendingCopies = std::vector<LivenessNode>

Public Functions

LivenessAnalyzer(const Ir *ir_, const SubgraphCopyingStrategy *subgraphCopyingStrat)

void apply()

int64_t getGlobalSchedulePosition(CallStack callStack) const

inline size_t getOpScheduleSize() const

inline const LivenessNode &getOpScheduleAt(int64_t scheduleIndex) const

inline const std::vector<Op*> &getGraphOpSchedule(GraphId id) const

inline const std::vector<int64_t> &getScheduleIndices(Op *op) const

inline const std::vector<int64_t> &getScheduleIndices(Tensor *t) const

inline const std::vector<int64_t> &getScheduleIndices(TensorId tid) const

inline const std::vector<int64_t> &getCallSiteLinksAt(int64_t scheduleIndex) const

inline const std::vector<int64_t> &getCallSiteLinksInvAt(int64_t scheduleIndex) const

inline const std::vector<Op*> &getGraphCallSites(GraphId id) const

int64_t getContextStartIndex(ExecutionContext context) const

int64_t getContextEndIndex(ExecutionContext context) const

#include <popart/subgraphpartitioner.hpp>

class SubgraphPartitioner

When lowering CallOps, we would previously copy all tensors from the call site (the CallOp’s input tensors) to the subgraph’s input tensors, do the call and then copy the subgraph’s output tensors back to the call site’s output tensors:

Copy(caller_in_1, subgraph_in_1)
Copy(caller_in_2, subgraph_in_2)
Call(subgraph)
Copy(subgraph_out_1, caller_out_1)
Copy(subgraph_out_2, caller_out_2)

With this approach, both subgraph_in_1 and subgraph_in_2 are live during the call. This can be suboptimal: in some cases, some subgraph inputs are not required until later in the subgraph, and copying them later would reduce the memory required. Analogously, some subgraph outputs may be ready to copy well before the end of the subgraph, and it may be advantageous to do this copy early. This is especially true for subgraphs that deal with multiple inputs/outputs in sequence.

To that end, graphs now support lowering over multiple “subgraph parts”, which allows CallOps that call these graphs to copy inputs later and outputs earlier. Essentially, each graph is ‘split’ over multiple PopART fragments / Poplar sequences so that any parent graph that calls it can copy inputs or outputs between the parts.

The scheduling of copies for subgraph ops is already modelled by the LivenessAnalyzer, and the partitioning is based on this model. This class simply interprets the LivenessAnalyzer’s schedule to determine how to split subgraphs into parts.
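The interleaving of copies and subgraph parts described above can be illustrated with a small, self-contained sketch. It does not use PopART: the enum mirrors the CallOpPartType values listed below, while Step, makeInterleavedSchedule, numSubgraphParts and inputsCopiedBeforePart are hypothetical names introduced only for this illustration. The sketch assumes a subgraph with two inputs and two outputs that has been split into three parts.

```cpp
#include <cassert>
#include <vector>

// Mirrors the CallOpPartType values documented for SubgraphPartitioner.
enum class CallOpPartType {
  Undefined = 0,
  CopyInput,
  CopyOutput,
  CopyModified,
  CallSubgraphPart
};

// Hypothetical stand-in for one step in the lowering of a CallOp.
// `index` is an input index, an output index or a subgraph part index,
// depending on `type`.
struct Step {
  CallOpPartType type;
  int index;
};

// Hypothetical schedule: inputs are copied as late as possible and outputs
// as early as possible, with the subgraph split into three parts.
std::vector<Step> makeInterleavedSchedule() {
  return {
      {CallOpPartType::CopyInput, 0},        // Copy(caller_in_1, subgraph_in_1)
      {CallOpPartType::CallSubgraphPart, 0}, // needs only the first input
      {CallOpPartType::CopyInput, 1},        // Copy(caller_in_2, subgraph_in_2)
      {CallOpPartType::CallSubgraphPart, 1},
      {CallOpPartType::CopyOutput, 0},       // first output is copied out early
      {CallOpPartType::CallSubgraphPart, 2},
      {CallOpPartType::CopyOutput, 1},
  };
}

// Count the subgraph parts that the schedule calls.
int numSubgraphParts(const std::vector<Step> &schedule) {
  int count = 0;
  for (const Step &step : schedule) {
    if (step.type == CallOpPartType::CallSubgraphPart) {
      ++count;
    }
  }
  return count;
}

// Count the inputs copied in before the given subgraph part is called.
// Returns -1 if the schedule never calls that part.
int inputsCopiedBeforePart(const std::vector<Step> &schedule, int partIndex) {
  int copied = 0;
  for (const Step &step : schedule) {
    if (step.type == CallOpPartType::CallSubgraphPart &&
        step.index == partIndex) {
      return copied;
    }
    if (step.type == CallOpPartType::CopyInput) {
      ++copied;
    }
  }
  return -1;
}

void exampleUsage() {
  const std::vector<Step> schedule = makeInterleavedSchedule();
  assert(numSubgraphParts(schedule) == 3);
  // Only one input has been copied in when the first part runs.
  assert(inputsCopiedBeforePart(schedule, 0) == 1);
  assert(inputsCopiedBeforePart(schedule, 1) == 2);
}
```

In this toy schedule, the second input only becomes live after the first subgraph part has run, which is the memory saving the description above refers to.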

Public Types

enum class CallOpPartType

Enum type for CallOpPart types.

Values:

enumerator Undefined = 0

enumerator CopyInput

enumerator CopyOutput

enumerator CopyModified

enumerator CallSubgraphPart

using CallOpSchedule = std::vector<std::tuple<CallOpPart, SubgraphPartIndex>>

Public Functions

SubgraphPartitioner() = default: Default constructor.

virtual ~SubgraphPartitioner() = default: Default destructor.

virtual void apply()

Prepare the results.

Raises an error if the IR or the LivenessAnalyzer has not been set.

virtual void setIr(const Ir*): Set the IR dependency to use.

virtual void setLivenessAnalyzer(const LivenessAnalyzer*): Set the LivenessAnalyzer dependency to use.

virtual int getNumSubgraphParts(const Graph&) const

Interpret the liveness analysis and work out how many subgraph parts a graph needs to lower all fragments between input/output copies.

Raises an error if apply() has not been run.

virtual SubgraphPartIndex getOpSubgraphPartBegin(Op*) const

Interpret the liveness analysis and work out which subgraph part an op is in, based on the copying of inputs/outputs of the subgraph the op is in.

For ops that spread over multiple subgraph parts (that is, CallOps), this returns the first such part. Raises an error if apply() has not been run.

virtual SubgraphPartIndex getOpSubgraphPartEnd(Op*) const

Interpret the liveness analysis and work out the index that is one larger than the last subgraph part an op is in, based on the copying of inputs/outputs of the subgraph the op is in.

For ops that spread over multiple subgraph parts (that is, CallOps), this returns the index one past the last such part. Raises an error if apply() has not been run.

virtual CallOpSchedule getCallOpSchedule(CallOp*) const

Interpret the liveness analysis results and work out how a CallOp is broken down over various subgraph parts.

The result is a vector of pairs of CallOp ‘parts’ and the ‘subgraph parts’ they should be lowered in.

Public Static Functions

static bool isPartitionable(const Graph &graph)

Returns true for a graph if we support it being ‘broken’ into multiple subgraph parts.

The main graph does not support this. Subgraphs that are called by any op that is not a CallOp also do not support this.

class CallOpPart

A class to represent a part of a CallOp.

Public Members

CallOpPartType type

InIndex inIndex

OutIndex outIndex

SubgraphPartIndex subgraphPartIndex

#include <popart/aliaszerocopy.hpp>

class AliasZeroCopy

Public Functions

AliasZeroCopy(const Ir *ir, const LivenessAnalyzer *analyzer)

void apply()

void removePostIRAliases(Tensor*)

std::set<Tensor*, PTensorCmp> getPostIRAliases(Tensor*) const

std::set<Tensor*, PTensorCmp> getTensorsWithPostIRAliases() const

std::set<Tensor*, PTensorCmp> getProposedAliasedTensors(std::set<Tensor*, PTensorCmp> tensors, bool fullyAliased) const

std::set<Tensor*, PTensorCmp> getActiveAliasedTensors(std::set<Tensor*, PTensorCmp> tensors, bool fullyAliased) const

void activateAlias(Tensor *ta, Tensor *tb)

bool nodeRequired(Op *op, OpStatus status, int index) const

bool opRequired(Op*) const

bool copyInputRequired(Op*, InIndex) const

bool copyLoopCarriedRequired(Op*, InIndex) const

bool copyModifiedRequired(Op*, InIndex) const

bool copyOutputRequired(Op*, OutIndex) const

void printLivenessIntervals(std::set<Tensor*, PTensorCmp> tensors, ProducerInterval producerInterval)

Intervals getLivenessIntervals(Tensor*, ProducerInterval)

Intervals getCandidateLivenessIntervals(Tensor*, ProducerInterval = ProducerInterval::Enforce, bool forceUpdateCache = false)

Public Static Functions

static std::size_t id()

static bool doOverlap(const Intervals &aIntervals, const Intervals &bIntervals)

class Intervals

Public Functions

Intervals()

Intervals(const Intervals &other)

~Intervals()

void insert(int64_t s, int64_t e)

bool empty() const

Intervals operator&(const Intervals &other) const

Intervals &operator=(const Intervals &other)

Intervals &operator+=(const Intervals &other)

bool operator==(const Intervals &other) const

bool operator!=(const Intervals &other) const

Friends

friend std::ostream &operator<<(std::ostream &os, const Intervals&)

enum popart::liveness::ProducerInterval

Values:

enumerator Enforce = 0

enumerator Ignore

14.11.9. Task information

#include <popart/taskid.hpp>

class TaskId

A class describing an IR-to-poplar lowering task.

This is a class that is cheap to construct. We construct and compare TaskIds a lot in irlowering.cpp so it pays to make these cheap operations. Note that previously TaskId was a std::string and creating a TaskId typically involved some string manipulation, meaning heap memory may be involved. Comparing strings for equality or ordering strings is also typically not constant-time.
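The cost argument above can be sketched in isolation. The following is a minimal illustration, not the PopART implementation: SketchTaskType, SketchTaskId and makeStringTaskId are hypothetical names, and the sketch keeps only an enum and an integer op ID. It shows that an identifier built from such fields is compared field by field with a fixed number of operations, whereas the string-based alternative must first be assembled and is then compared character by character.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical, reduced set of task types for this sketch.
enum class SketchTaskType { FromOpTask = 0, InitTensorTask, Undefined };

// Hypothetical identifier made of an enum and an integer. Constructing it
// needs no heap allocation, and comparisons touch a fixed number of fields.
struct SketchTaskId {
  SketchTaskType type = SketchTaskType::Undefined;
  int opId = -1;

  bool operator==(const SketchTaskId &rhs) const {
    return type == rhs.type && opId == rhs.opId;
  }

  // Lexicographic ordering on (type, opId), so the ID can be a map key.
  bool operator<(const SketchTaskId &rhs) const {
    return std::tie(type, opId) < std::tie(rhs.type, rhs.opId);
  }
};

// The string-based alternative: building the ID involves string
// concatenation, which may allocate heap memory.
std::string makeStringTaskId(const std::string &prefix, int opId) {
  return prefix + "_" + std::to_string(opId);
}

void exampleUsage() {
  const SketchTaskId a{SketchTaskType::FromOpTask, 3};
  const SketchTaskId b{SketchTaskType::FromOpTask, 7};
  assert(a < b);
  assert(!(a == b));
  assert(makeStringTaskId("fromOpTask", 3) == "fromOpTask_3");
}
```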

Public Types

enum class Type

TaskId type.

Values:

enumerator AnchorStreamToHostTask = 0

enumerator AnchorSumTask

enumerator AnchorToHostTask

enumerator FromHostTask

enumerator FromHostUpdateTask

enumerator FromOpTask

enumerator InitBatchCounterTensorsTask

enumerator InitRngSeedsTask

enumerator InitRandomSeedTask

enumerator InitRngStateTensorTask

enumerator InitTensorTask

enumerator PipelinedCopyTask

enumerator RandomSeedToHostTask

enumerator RngStateFromHostTask

enumerator RngStateToHostTask

enumerator SetInitTensorValTask

enumerator StreamFromHostTask

enumerator UpdateBatchCountTask

enumerator WeightStreamToHostTask

enumerator WeightToHostTask

enumerator Undefined

Public Functions

TaskId()

explicit TaskId(Type type)

TaskId(Type, const TensorId &tensorId)

TaskId(Type type, const OpId &opId, const OperatorIdentifier &opIdentifier)

TaskId(Type type, const OpId &opId, const OperatorIdentifier &opIdentifier, const OpxGrowPartId &opxGrowPartId)

TaskId(Type type, nonstd::optional<TensorId> tensorId, nonstd::optional<OpId> opId, nonstd::optional<OperatorIdentifier> opIdentifier, nonstd::optional<OpxGrowPartId> opxGrowPartId)

bool empty() const

bool operator<(const TaskId &rhs) const

bool operator==(const TaskId &rhs) const

inline const nonstd::optional<TensorId> &getTensorId() const

inline const Type &getType() const

14.11.10. Type definitions

namespace onnx

namespace google

namespace protobuf

namespace popart

namespace view

Typedefs

using Regions = std::vector<Region>

using RegMap = std::function<Regions(const Region&)>

using LowBounds = std::vector<int64_t>

using UppBounds = std::vector<int64_t>

Typedefs

using Shape = std::vector<int64_t>: The dimensions of a tensor, equivalent to numpy.shape.

using Rank = int: Rank of a tensor. That is, the number of indices.
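As a small illustration of how Shape and Rank relate, the sketch below redeclares the two aliases locally and derives the rank and the element count from a shape. The helper names rankOf and numElements are hypothetical and are not part of PopART.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Local copies of the aliases documented above.
using Shape = std::vector<std::int64_t>;
using Rank = int;

// The rank is the number of dimensions, that is, the number of indices
// needed to address one element.
Rank rankOf(const Shape &shape) { return static_cast<Rank>(shape.size()); }

// The number of elements is the product of all dimensions. An empty shape
// (a scalar) has exactly one element.
std::int64_t numElements(const Shape &shape) {
  return std::accumulate(shape.begin(), shape.end(), std::int64_t{1},
                         std::multiplies<std::int64_t>());
}

void exampleUsage() {
  const Shape shape{2, 3, 4};
  assert(rankOf(shape) == 3);
  assert(numElements(shape) == 24);
}
```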

typedef std::string TensorId: Label put on a tensor to distinguish it from the others in the graph.

using DnfTensorIds = std::vector<std::set<TensorId>>

using OpName = std::string: Name of the instance of the operator.

using OpDomain = std::string

Specifies who created the operator.

Part of domain.type:version used as an Op identifier by ONNX (https://github.com/onnx/onnx/blob/master/docs/Versioning.md)

using OpType = std::string

Specifies the type of an operator.

Part of domain.type:version used as an Op identifier by ONNX (https://github.com/onnx/onnx/blob/master/docs/Versioning.md)

using OpVersion = unsigned

Specifies the version of the operator.

Part of domain.type:version used as an Op identifier by ONNX (https://github.com/onnx/onnx/blob/master/docs/Versioning.md)
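The three aliases above combine into the domain.type:version identifier. The sketch below shows one way to format such an identifier. The helper formatOpIdentifier is hypothetical, and the values used ("ai.onnx", "Conv", 11) are only illustrative.

```cpp
#include <cassert>
#include <string>

// Local copies of the aliases documented above.
using OpDomain = std::string;
using OpType = std::string;
using OpVersion = unsigned;

// Hypothetical helper that joins the three parts as domain.type:version.
std::string formatOpIdentifier(const OpDomain &domain, const OpType &type,
                               OpVersion version) {
  return domain + "." + type + ":" + std::to_string(version);
}

void exampleUsage() {
  assert(formatOpIdentifier("ai.onnx", "Conv", 11) == "ai.onnx.Conv:11");
}
```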

using OpId = int: Label put on an operator to distinguish it from the others in the graph.

using ReturnPeriod = int

using ReplicaIndex = int: The index of a replica.

using SubgraphIndex = int: The index of a subgraph for an Op.

using SubgraphPartIndex = int: The index of the subgraph part.

using OpxGrowPartId = int: Identifies a part of an Opx grow function.

using InIndex = int: The position at which a tensor is input by an Op.

using OutIndex = int: The position at which a tensor is output by an Op.

using CollectiveBalancedReorderId = int: The identifier of the collective balanced host rearrangement.

using ReplicatedTensorShardingIndices = std::set<std::pair<std::set<InIndex>, std::set<OutIndex>>>: The set of indices that have to be replica sharded together, and the outputs that will be replica sharded as a result.

using ReplicatedTensorShardingIndicesIndex = int: The position in ReplicatedTensorShardingIndices for which to get the ReplicatedTensorShardingGroup.

using ReplicatedTensorShardingGroupId = int: The unique integer id for a ReplicatedTensorShardingGroup.

using PipelineCycle = int64_t

using VGraphId = int64_t

using PipelineStage = int64_t

using ExecutionPhase = int64_t

using BatchSerializedPhase = int64_t

using StashIndex = int64_t

using RemoteBufferId = int64_t

using RemoteBufferIndex = int64_t

using RandomReferenceId = int64_t

using ConvInputs = std::vector<TensorId>

using ConvDilations = std::vector<int64_t>

using ConvGroup = int64_t

using ConvPads = std::vector<int64_t>

using ConvStrides = std::vector<int64_t>

using ConvTruncs = std::vector<int64_t>

using MultiConvInputs = std::vector<ConvInputs>

using MultiConvDilations = std::vector<ConvDilations>

using MultiConvGroups = std::vector<ConvGroup>

using MultiConvPads = std::vector<ConvPads>

using MultiConvStrides = std::vector<ConvStrides>

using TensorInterval = std::pair<size_t, size_t>

using TensorIntervalList = std::vector<TensorInterval>

using onnxAttPtr = const ONNX_NAMESPACE::AttributeProto*

using NodeAttributes = google::protobuf::RepeatedPtrField<ONNX_NAMESPACE::AttributeProto>

using OnnxTensors = std::map<TensorId, ONNX_NAMESPACE::TensorProto>

using Node = ONNX_NAMESPACE::NodeProto

using OnnxTensorPtrs = std::map<TensorId, const ONNX_NAMESPACE::TensorProto*>

using OpsBeforeKey = std::map<Op*, std::vector<Op*>, POpCmp>

using IsReplicaEqual = bool

using ReplEqInputMap = std::map<InIndex, IsReplicaEqual>

using ReplEqOutputMap = std::map<InIndex, IsReplicaEqual>

using ReplEqModifiedInputMap = ReplEqInputMap 

using ReplEqFun = std::function<std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap>(const ReplEqInputMap&)>

using ReplEqGraphFuns = std::function<ReplEqFun(const Graph*)>

Enums

enum StochasticRoundingMethod

Used to describe the stochastic rounding which is applied to the output(s) of an Op.

See also docs/notes/ir/attributes/stochasticroundingmethod.md

Values:

enumerator DifferingSeeds = 1

Apply stochastic rounding with a replica-local seed.

That is, stochastic rounding performed by an Op on one replica is nominally different to stochastic rounding performed by the same Op on another replica. Use this setting for Ops where you want to apply stochastic rounding but you cannot meet the condition of StochasticRoundingMethod::IdenticalSeeds. For example, this setting can be useful for gradient accumulation steps.

enumerator IdenticalSeeds = 2

Apply stochastic rounding with an RNG state (the value of poplar::getHwSeeds) that is identical across replicas.

Use this option on, e.g., the weight update step to ensure that the weight tensor on each replica has stochastic rounding applied to it in the same way and there is no weight drift.

REQUIREMENT: The ability to provide an RNG state (the value of poplar::getHwSeeds) that is identical on each replica relies on all Ops that use this setting to behave in a way that does not violate this property for Ops that follow it. More formally, you must only apply this setting to Ops for which you can guarantee that if the RNG state is the same across replicas before the Op is executed then the RNG state is still the same on all replicas after the Op is done executing. A typically sufficient (but not necessary) condition is that all input tensors of the Op have the same value across replicas.
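The requirement above can be illustrated with a toy host-side simulation. This is a sketch under stated assumptions, not how Poplar implements stochastic rounding: std::mt19937 stands in for the hardware RNG state, and stochasticRound and roundOnReplica are hypothetical helpers. The sketch shows that two replicas that start from the same RNG state and see the same input values produce the same rounded values and finish in the same RNG state.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stochastic rounding to an integer grid: round up with a
// probability equal to the fractional part, otherwise round down.
double stochasticRound(double x, std::mt19937 &rng) {
  const double lower = std::floor(x);
  const double fraction = x - lower;
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  return unit(rng) < fraction ? lower + 1.0 : lower;
}

// Apply stochastic rounding to every value, advancing the replica's RNG.
std::vector<double> roundOnReplica(const std::vector<double> &values,
                                   std::mt19937 &rng) {
  std::vector<double> rounded;
  rounded.reserve(values.size());
  for (double v : values) {
    rounded.push_back(stochasticRound(v, rng));
  }
  return rounded;
}

void exampleUsage() {
  const std::vector<double> weights{0.25, 1.5, 2.75, 3.1};

  // Identical RNG state on both replicas before the Op runs.
  std::mt19937 replica0(42);
  std::mt19937 replica1(42);

  const std::vector<double> out0 = roundOnReplica(weights, replica0);
  const std::vector<double> out1 = roundOnReplica(weights, replica1);

  // Same inputs and same starting state: same results, so no weight drift,
  // and the RNG state is still identical after the Op.
  assert(out0 == out1);
  assert(replica0 == replica1);
}
```

If the two generators were seeded differently, the rounded values would no longer be guaranteed to agree, which corresponds to the DifferingSeeds setting.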

using popart::FwdGraphToBwdGraphInfo = std::map<GraphId, BwdGraphInfo>: Mapping from fwdGraph to info on the bwdGraph.

using popart::popx::PreparedCopyTensors = std::map<InIndex, PreparedCopyTensor>

using popart::popx::PreparedTensorInfos = std::vector<PreparedTensorInfo>

14.11.11. Enums

enum class popart::AccumulationType

Values:

enumerator Add = 0

enumerator DampenedAdd

enumerator DampenedAddSquare

enumerator DecayAdd

enumerator DecayAddSquare

enumerator MovingAverage

enumerator MovingAverageSquare

enumerator Infinity

enumerator Mean

enum class popart::ActivationFunction

Values:

enumerator Sigmoid = 0

enumerator Relu

enumerator Tanh

enumerator Gelu

enumerator GeluErf

enumerator Swish

enumerator Softmax

enumerator SoftmaxStable

enumerator SoftmaxScaled

enumerator N

enumerator Invalid

enum class popart::AutoPad

Values:

enumerator NOTSET = 0

enumerator SAME_UPPER

enumerator SAME_LOWER

enumerator VALID

enum class popart::CollectiveOperator

Values:

enumerator Add = 0

enumerator Mean

enumerator Mul

enumerator Min

enumerator Max

enumerator LogicalAnd

enumerator LogicalOr

enumerator SquareAdd

enumerator Local

enumerator N

enum class popart::DeviceSelectionCriterion

Controls how to select an available IPU.

Values:

enumerator First = 0: Select the first device available. (Default).

enumerator Random: Select a device randomly from those available.

enum class popart::InitType

Values:

enumerator NoInit = 0

enumerator Zero

enum class popart::MatMulPartialsType

Values:

enumerator HALF

enumerator FLOAT

enum class popart::ResizeCoordinateTransformationMode

Values:

enumerator HalfPixel

enumerator PytorchHalfPixel

enumerator AlignCorners

enumerator Asymmetric

enumerator TfCropAndResize

enumerator N

enum class popart::ResizeMode

Values:

enumerator Nearest

enumerator Linear

enumerator Cubic

enumerator N

enum class popart::ResizeNearestMode

Values:

enumerator RoundPreferFloor

enumerator RoundPreferCeil

enumerator Floor

enumerator Ceil

enumerator Pytorch

enumerator N

enum class popart::ScatterReduction

Values:

enumerator Sum = 0

enumerator Max

enumerator Min

enumerator Mul

enumerator None

enum class popart::TensorRemapType

Enum describing how the tensor layout should be remapped during the forward and backward pass (backward pass remapping requires the Op to exist in the IR before autodiff).

Values:

enumerator FwdBwdReverse = 0: Remap the tensor in the forward pass, reverse-apply the remapping in the backward pass.

enumerator FwdBwd: Remap the tensor in the forward pass and backward pass independently.

enumerator Fwd: Only remap the tensor in the forward pass, use identity for the backward pass.

14.11.12. Structs

struct BranchInfo

Public Functions

BranchInfo(const GraphId&, const std::map<int, int> inputIndicesMap, const std::map<int, int> outputIndicesMap)

Public Members

GraphId graphId

std::map<int, int> inputIndicesMap

std::map<int, int> outputIndicesMap

struct ClonedGraphMaps

Struct of maps that map cloned Op and Tensor Ids back to the original, and vice-versa.

Public Members

std::map<OpId, OpId> opIdMap

std::map<TensorId, TensorId> tensorIdMap

struct ConvParameters

Public Members

DataType type

int64_t batchSize

int64_t numInChannelsPerGroup

int64_t numOutChannelsPerGroup

int64_t numGroups

Shape inputShape

Shape kernelShape

struct popart::ConvParameters::Input inputTransformation

struct popart::ConvParameters::Input kernelTransformation

struct popart::ConvParameters::Output outputTransformation

struct Input

Public Members

std::vector<int64_t> lowerTruncation

std::vector<int64_t> upperTruncation

std::vector<int64_t> dilation

std::vector<int64_t> lowerPadding

std::vector<int64_t> upperPadding

std::vector<bool> flip

struct Output

Public Members

std::vector<int64_t> lowerTruncation

std::vector<int64_t> upperTruncation

std::vector<int64_t> stride

std::vector<int64_t> lowerPadding

std::vector<int64_t> upperPadding

struct OpxInAndOutIndex

Public Functions

inline OpxInAndOutIndex(const Opx *opx_, InIndex inIndex_, OutIndex outIndex_)

inline OpxInAndOutIndex(const Opx *opx_)

OpxInAndOutIndex() = default

inline bool operator==(const OpxInAndOutIndex &rhs) const

Public Members

const Opx *opx

InIndex inIndex

OutIndex outIndex

bool isDelegate

struct PTensorCmp

Public Functions

bool operator()(const Tensor *const &a, const Tensor *const &b) const

struct ReplicatedTensorShardingOpInfo

Struct that describes which inputs/outputs of an Op belong to the sharding group.

Regular operations typically belong to only one sharding group. However, subgraphing operations (CallOp, LoopOp) and MultiExchangeOp can belong to multiple sharding groups, depending on the input and output indices.

Public Functions

inline ReplicatedTensorShardingOpInfo()

inline ReplicatedTensorShardingOpInfo(OpId id_, std::set<InIndex> inIndices_, std::set<OutIndex> outIndices_)

bool operator<(ReplicatedTensorShardingOpInfo const &rhs) const

Public Members

OpId id: Unique ID of the operator.

std::set<InIndex> inIndices: Input indices belonging to the sharding group.

std::set<OutIndex> outIndices: Output indices belonging to the sharding group.

14.11.13. Other classes

template<typename T, uint32_t V = 0> class BasicOptional

A temporary solution to removing boost::optional from certain header files. This class is an incomplete replacement of boost::optional (and std::optional).

template parameter T: the type which will optionally be stored
template parameter V: has no effect, but enables compiler errors when two objects of type T should not be compared

Public Functions

inline BasicOptional() noexcept: Construct an unset BasicOptional<T>

inline BasicOptional(T t): Create a set BasicOptional<T> from a value.

BasicOptional(const BasicOptional<T, V> &rhs) = default

BasicOptional<T, V> &operator=(const BasicOptional<T, V>&) = default

inline BasicOptional<T, V> &operator=(const T &t)

inline const T &operator*() const &: Get a constant reference to the value.

inline T &operator*() &: Get a reference to the value.

inline explicit operator bool() const

Return true if set.

Can be used as:

BasicOptional<Foo> foo(6);
if (foo) {
  *foo = 7;
}

inline void reset() noexcept
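The role of the template parameter V can be shown with a minimal, self-contained sketch. TaggedOptional below is a hypothetical, simplified re-implementation written for this illustration (it requires T to be default-constructible) and is not the PopART class. SketchPipelineStage and SketchVGraphId are likewise hypothetical aliases. The point is that two instantiations with the same T but different V are distinct types, so mixing them up is rejected by the compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical simplified optional with an otherwise unused tag parameter V.
template <typename T, std::uint32_t V = 0> class TaggedOptional {
public:
  TaggedOptional() noexcept : isSet(false), value() {}
  TaggedOptional(T t) : isSet(true), value(t) {}

  explicit operator bool() const { return isSet; }
  const T &operator*() const & { return value; }
  T &operator*() & { return value; }
  void reset() noexcept { isSet = false; }

private:
  bool isSet;
  T value;
};

// Same stored type, different tags: two unrelated types.
using SketchPipelineStage = TaggedOptional<std::int64_t, 1>;
using SketchVGraphId = TaggedOptional<std::int64_t, 2>;

static_assert(!std::is_same<SketchPipelineStage, SketchVGraphId>::value,
              "different tags give different types");
static_assert(
    !std::is_convertible<SketchPipelineStage, SketchVGraphId>::value,
    "a pipeline stage cannot be passed where a virtual graph ID is expected");

void exampleUsage() {
  SketchPipelineStage stage(6);
  if (stage) {
    *stage = 7;
  }
  assert(*stage == 7);

  stage.reset();
  assert(!stage);
}
```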

class ExchangeDescriptor

Class describing an external exchange from IPUs.

Public Functions

ExchangeDescriptor(ExchangeDirection direction, TensorId id, OptionalVGraphId vgid, TileSet tileSet, int numInputs, int numOutputs, bool inplace)

Create an ExchangeDescriptor for a host exchange.

Parameters

direction – Load (from host) or Store (to host)
id – Host stream tensor ID
vgid – Virtual graph for the exchange
tileSet – Tile set for the exchange
numInputs – Number of tensor inputs expected
numOutputs – Number of tensor outputs expected
inplace – If the output of the exchange should alias the input during Load

ExchangeDescriptor(ExchangeDirection direction, RemoteBufferId id, OptionalVGraphId vgid, TileSet tileSet, int numInputs, int numOutputs, bool inplace)

Create an ExchangeDescriptor for a remote exchange.

Parameters

direction – Load (from host) or Store (to host)
id – Remote buffer id
vgid – Virtual graph for the exchange
tileSet – Tile set for the exchange
numInputs – Number of tensor inputs expected
numOutputs – Number of tensor outputs expected
inplace – If the output of the exchange should alias the input during Load

ExchangeDescriptor(ExchangeDirection direction, GraphId id, TileSet destination, CodeMemoryType destinationType)

Create an ExchangeDescriptor for an External code copy op.

Parameters

direction – Load (from host) or Store (to host)
id – GraphId of the graph to load.
destination – The destination TileSet to load to.
destinationType – The destination memory type to load to.

inline const ExchangeDirection &getDirection() const

inline bool isRemoteExchange() const

inline bool isHostExchange() const

inline bool isCodeCopy() const

Returns true if this exchange descriptor is associated with a code copy operation.

Returns: true if it is associated with a code copy op, false otherwise.

inline const RemoteBufferId &getRemoteBufferId() const

inline void setRemoteBufferId(RemoteBufferId id)

inline const TensorId &getHostStreamTensorId() const

inline const OptionalGraphId &getGraphToLoadId() const

GraphId of the graph which this op will load code for.

Returns: The ID of the graph to load.

inline OptionalCodeMemoryType getDestinationCodeMemoryType() const

Get the destination location that the code will be sent to, if this is an ExchangeDescriptor for a RemoteCodeLoadOp.

Returns: OptionalCodeMemoryType, one of:

Buffer - Stored in non-executable buffer memory.
ExecutableMemory - Stored in executable memory.

const std::string getResourceId() const

Get an identifier representing which resource (landing pad tensor) this exchange will be using.

Returns: Resource identifier

inline OptionalVGraphId getVGraphID() const

inline TileSet getTileSet() const

inline int getNumInputs() const

inline int getNumOutputs() const

inline bool isInplace() const

class GraphId

Public Functions

GraphId() = delete

GraphId(const std::string&)

bool operator<(const GraphId&) const

bool operator==(const GraphId&) const

bool operator!=(const GraphId&) const

const std::string &str() const

Public Static Functions

static const GraphId &root()

class LeakyReluOpBaseAttributes

Subclassed by popart::LeakyReluGradOp, popart::LeakyReluInplaceOp, popart::LeakyReluOp

Public Functions

inline LeakyReluOpBaseAttributes(float _alpha)

inline float getAlpha() const

class MultiConvOptions

Public Functions

MultiConvOptions(const std::map<std::string, std::string> sessionConvOptions, const Attributes &attr)

std::map<std::string, std::string> getConvOptions(int convIndex) const

std::map<std::string, std::string> getGlobalOptions() const

Public Members

std::vector<float> availableMemoryProportions

std::vector<std::string> partialsTypes

nonstd::optional<std::string> planType

nonstd::optional<int> perConvReservedTiles

nonstd::optional<float> cycleBackOff

std::vector<int64_t> enableConvDithering

class OpEquivIdCreator : public popart::OpSerialiserBase 

Public Functions

OpEquivIdCreator(const Op*)

void appendAttribute(const std::string&, nonstd::optional<int64_t>) override

void appendAttribute(const std::string&, nonstd::optional<float>) override

void appendAttribute(const std::string&, nonstd::optional<double>) override

void appendAttribute(const std::string&, const std::map<TensorId, uint64_t>) override

void appendForwardOp(const Op*) override

std::string str()

template<> void appendAttr(const TensorIndexMap &tmap)

class OpJsonSerialiser : public popart::OpSerialiserBase 

Public Functions

OpJsonSerialiser(const Op*, std::stringstream &ss_)

void appendAttribute(const std::string&, nonstd::optional<int64_t>) override

void appendAttribute(const std::string&, nonstd::optional<float>) override

void appendAttribute(const std::string&, nonstd::optional<double>) override

void appendAttribute(const std::string&, const std::map<TensorId, uint64_t>) override

void appendForwardOp(const Op*) override

class OpSerialiser : public popart::OpSerialiserBase 

Public Functions

OpSerialiser(const Op*, std::stringstream &ss_)

void appendAttribute(const std::string&, nonstd::optional<int64_t>) override

void appendAttribute(const std::string&, nonstd::optional<float>) override

void appendAttribute(const std::string&, nonstd::optional<double>) override

void appendAttribute(const std::string&, const std::map<TensorId, uint64_t>) override

void appendForwardOp(const Op*) override

class OpSerialiserBase

Subclassed by popart::OpEquivIdCreator, popart::OpJsonSerialiser, popart::OpSerialiser

Public Functions

inline virtual ~OpSerialiserBase()

void appendAttribute(const std::string&, float)

void appendAttribute(const std::string&, double)

void appendAttribute(const std::string&, int)

void appendAttribute(const std::string&, int64_t)

void appendAttribute(const std::string&, uint32_t)

void appendAttribute(const std::string&, uint64_t)

void appendAttribute(const std::string&, const std::string&)

void appendAttribute(const std::string&, const std::vector<float>&)

void appendAttribute(const std::string&, const std::vector<double>&)

void appendAttribute(const std::string&, const std::vector<int64_t>&)

void appendAttribute(const std::string&, const Scope&)

void appendAttribute(const std::string&, bool)

virtual void appendAttribute(const std::string&, nonstd::optional<int64_t>) = 0

virtual void appendAttribute(const std::string&, nonstd::optional<float>) = 0

virtual void appendAttribute(const std::string&, nonstd::optional<double>) = 0

virtual void appendAttribute(const std::string&, const std::map<TensorId, uint64_t>) = 0

template<typename T, uint32_t V> inline void appendAttribute(const std::string &key, const BasicOptional<T, V> &value)

template<typename T> inline void appendAttribute(const std::string &key, const T &value)

virtual void appendForwardOp(const Op*) = 0

class PriTaskDependency

Public Functions

PriTaskDependency(TaskId taskId, DependencyType type)

PriTaskDependency(std::set<TaskId> taskIds, DependencyType type)

inline DependencyType getType() const

inline bool satisfiedBy(TaskId taskId) const

inline const std::set<TaskId> &getTaskIds() const

bool operator==(PriTaskDependency const &rhs) const

class ReplicaEqualAnalysisProxy

Interface for object passed to Op::fwdPropagateIsReplicaEqual.

Public Functions

virtual ReplEqModifiedInputMap getModifiedInputMapFromAliases(const Op *op, const ReplEqOutputMap &replEqOpOutputMap) const = 0

Work out replica-equal values for modified inputs. A modified input is set to replica-equal if and only if the Op has an output that is an alias of that input, the output contains all elements of the input, and the output is deemed replica-equal.

If this does not hold, the modified input is assumed not to be replica-equal.

NOTE: An Op could modify an input to a replica-equal value in a way that this implementation does not detect, but the rule generally holds for the Ops supported at the time of writing.

Parameters

op – The Op to get the replica-equal values for modified inputs for.
replEqOpOutputMap – The Op’s replica-equal output values.

Returns

A mapping containing replica-equal values for modified inputs.
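
The rule described for getModifiedInputMapFromAliases can be illustrated with a minimal, self-contained sketch. All types and names below (AliasInfo, modifiedInputMapFromAliases, and the map typedefs) are hypothetical stand-ins chosen for illustration; they are not the PopART definitions, and the real implementation derives alias information from the Op itself.

```cpp
#include <map>
#include <vector>

// Hypothetical stand-ins for illustration only; not the PopART types.
using InIndex                = int;
using OutIndex               = int;
using ReplEqOutputMap        = std::map<OutIndex, bool>;
using ReplEqModifiedInputMap = std::map<InIndex, bool>;

// Describes one output that aliases an input of the same Op.
struct AliasInfo {
  InIndex in;
  OutIndex out;
  bool coversWholeInput; // true if the output contains all input elements
};

// A modified input is replica-equal only if some output aliases the whole
// input and that output is itself replica-equal. Otherwise it is assumed
// not to be replica-equal.
ReplEqModifiedInputMap
modifiedInputMapFromAliases(const std::vector<InIndex> &modifiedInputs,
                            const std::vector<AliasInfo> &aliases,
                            const ReplEqOutputMap &replEqOutputs) {
  ReplEqModifiedInputMap result;
  for (InIndex in : modifiedInputs) {
    result[in] = false; // conservative default
    for (const AliasInfo &alias : aliases) {
      if (alias.in != in || !alias.coversWholeInput) {
        continue;
      }
      auto found = replEqOutputs.find(alias.out);
      if (found != replEqOutputs.end() && found->second) {
        result[in] = true;
        break;
      }
    }
  }
  return result;
}
```

In this sketch, an input whose only alias covers part of its elements stays marked as not replica-equal, which mirrors the conservative default described above.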

virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqualThroughGraph(const Graph *graph, const ReplEqInputMap &replEqGraphInputMap) = 0

A method that can be called to work out how replica-equal values for graph inputs propagate to replica-equal values for graph outputs.

NOTE: Graphs never copy-modify input tensors, although Ops that call graphs might (like CallOp, LoopOp).

Parameters

graph – The graph to propagate replica-equal values through.
replEqGraphInputMap – The replica-equal values for the graph’s inputs.

Returns

A tuple containing a ReplEqOutputMap that describes replica-equal values for the graph’s outputs and a ReplEqModifiedInputMap that describes the final replica-equal values of the graph’s inputs.

inline virtual ~ReplicaEqualAnalysisProxy()

class ReplicatedTensorShardingTracer

Class that traces the graph and finds all tensors that:

1. Are replicated tensor sharded.
2. Have the same meta-shape describing the tensor shape before sharding.
3. Use the same collective balanced reorder (CBR) when lowered to Poplar.
4. Share the same elementwise compatible tensor layout by virtue of 2 and 3.

Public Functions

ReplicatedTensorShardingTracer(const Ir &ir_)

Instantiate the tracer and trace.

Parameters

ir_ – IR to operate on
startTensors_ – Tensors to trace from

bool hasGroup(const ReplicatedTensorShardingOpInfo &opInfo) const

Check if the Op associated with the opId has a replicated tensor sharding group.

Parameters: opInfo – OpId and input/output indices
Returns: True if there is a group associated with the opId

bool hasGroup(const TensorId &tensorId) const

Check if the tensor associated with the tensorId has a replicated tensor sharding group.

Parameters: tensorId – TensorId
Returns: True if there is a group associated with the tensorId

const ReplicatedTensorShardingGroup &getGroup(const ReplicatedTensorShardingOpInfo &opInfo) const

Get the replicated tensor sharding group associated with the opId.

Parameters: opInfo – OpId and input/output indices
Returns: Associated replicated tensor sharding group.

const ReplicatedTensorShardingGroup &getGroup(const TensorId &tensorId) const

Get the replicated tensor sharding group associated with the tensorId.

Parameters: tensorId – TensorId
Returns: Associated replicated tensor sharding group.

void trace(const std::set<Tensor*, PTensorCmp> &startTensors)

Traverse the graph to trace out operators and tensors belonging to the same replicated tensor sharding group.

Parameters: startTensors – Tensors to start tracing from.
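
As a rough illustration of tracing from a set of start tensors, the following sketch performs a breadth-first traversal over a toy adjacency map. ToyTensorId, ToyGraph and traceGroup are hypothetical names introduced here; the sketch assumes every edge already represents an Op through which sharding propagates, whereas the real tracer works on the PopART IR and decides this per Op.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; not PopART types.
using ToyTensorId = std::string;
// Maps a tensor to the tensors reachable through one sharding-preserving Op.
using ToyGraph = std::map<ToyTensorId, std::vector<ToyTensorId>>;

// Collect every tensor reachable from the start tensors.
std::set<ToyTensorId> traceGroup(const ToyGraph &graph,
                                 const std::set<ToyTensorId> &startTensors) {
  std::set<ToyTensorId> group(startTensors.begin(), startTensors.end());
  std::queue<ToyTensorId> frontier;
  for (const ToyTensorId &tensor : startTensors) {
    frontier.push(tensor);
  }
  while (!frontier.empty()) {
    ToyTensorId current = frontier.front();
    frontier.pop();
    auto found = graph.find(current);
    if (found == graph.end()) {
      continue; // no outgoing edges
    }
    for (const ToyTensorId &next : found->second) {
      // insert() reports whether the tensor is newly visited.
      if (group.insert(next).second) {
        frontier.push(next);
      }
    }
  }
  return group;
}
```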

class TensorLocationInfo

Public Functions

inline void setRemote(bool remote_)

inline bool isRemote() const

inline void setSharded(bool sharded_)

inline bool isSharded() const

inline void setRemoteBufferInfo(RemoteBufferId rbId, RemoteBufferIndex index)

inline const std::pair<RemoteBufferId, RemoteBufferIndex> getRemoteBufferInfo() const

inline bool operator==(const TensorLocationInfo &rhs) const

class InputCreatorCandidate : public popart::popx::ICreatorCandidate 

Public Functions

InputCreatorCandidate(InIndex index_, const Opx *opx_, std::vector<OpxInAndOutIndex> pathFromInput_, int64_t scheduleIndex_)

InputCreatorCandidate() = default

~InputCreatorCandidate() override = default

std::pair<poplar::Tensor, ViewChangers> createInput(const poplar::DebugNameAndId &dnai) override

DnfTensorIds mustExistBeforeCreate() override

double getMaxCreatorPriority() const override

int64_t getNumElems() const override

inline InIndex getIndex() const

inline const Opx *getOpx() const

inline std::vector<std::vector<OpxInAndOutIndex>> getPathsFromInput() final

inline void setPathFromInput(const std::vector<OpxInAndOutIndex> &value)

std::pair<poplar::Tensor, ViewChangers> unwind(poplar::Tensor) override

std::vector<popart::view::Region> unwind(popart::view::Region) override

std::vector<popart::view::Region> unwind() override

std::string str() override

inline int64_t getScheduleIndex() const final

class InputMultiCreatorCandidate : public popart::popx::ICreatorCandidate 

Public Functions

InputMultiCreatorCandidate()

~InputMultiCreatorCandidate() override = default

std::pair<poplar::Tensor, ViewChangers> createInput(const poplar::DebugNameAndId &dnai) override

DnfTensorIds mustExistBeforeCreate() override

double getMaxCreatorPriority() const override

int64_t getNumElems() const override

std::string str() override

bool addCreatorCandidate(ICreatorCandidatePtr)

std::vector<std::vector<OpxInAndOutIndex>> getPathsFromInput() final

std::pair<poplar::Tensor, ViewChangers> unwind(poplar::Tensor) override

std::vector<popart::view::Region> unwind(popart::view::Region) override

std::vector<popart::view::Region> unwind() override

int64_t getScheduleIndex() const final

class IsInfx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

IsInfx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class IsNaNx : public popart::popx::ElementWiseUnaryOpx 

Public Functions

IsNaNx(Op*, Devicex*)

void grow(poplar::program::Sequence&) const final

class ViewChanger

Subclassed by popart::popx::ReplicatedGatherInScatterOutViewChanger, popart::popx::ReplicatedGatherOutScatterInViewChanger

Public Functions

inline virtual ~ViewChanger()

inline virtual poplar::Tensor apply(poplar::Tensor tensor) const

inline virtual bool containsAllDataRegions() const

inline virtual bool operator==(const ViewChanger &rhs) const

inline virtual bool operator!=(const ViewChanger &rhs) const

class ViewChangers

Public Functions

ViewChangers()

ViewChangers(std::vector<std::shared_ptr<ViewChanger>> viewChangers_)

poplar::Tensor apply(poplar::Tensor tensor) const

inline bool empty() const

bool operator==(const ViewChangers &rhs) const

bool operator!=(const ViewChangers &rhs) const

class ReplicatedGatherInScatterOutViewChanger : public popart::popx::ViewChanger 

Public Functions

inline ReplicatedGatherInScatterOutViewChanger(int64_t nelms_, ReplicatedTensorShardingGroupId group_)

inline poplar::Tensor apply(poplar::Tensor tensor) const final

inline bool containsAllDataRegions() const final

inline bool operator==(const ViewChanger &rhs) const final

class ReplicatedGatherOutScatterInViewChanger : public popart::popx::ViewChanger 

Public Functions

inline ReplicatedGatherOutScatterInViewChanger(const gcl::CollectiveBalancedReorder *cbr_, ReplicatedTensorShardingGroupId group_)

inline poplar::Tensor apply(poplar::Tensor tensor) const final

inline bool operator==(const ViewChanger &rhs) const final

class Reader

A class that facilitates the deserialization process.

It reads serialized streams so that PopART state can be restored. For more information on which components are deserialized, refer to the Writer class.

Public Functions

Reader(const std::vector<std::shared_ptr<std::istream>> &in_vec)

Constructs Reader class object.

Parameters: in_vec – Vector of source streams from which a PopEF file will be read.

Reader(Reader &&reader): Move constructor.

~Reader(): Default destructor.

size_t readExecutableHash() const

Returns: The executable hash or 0 if the stream contains corrupted data.

bool containsPoplarExecutable() const

Returns: True if the stream contains a Poplar executable.

bool containsExecutable() const

Returns: True if the stream contains a PopART executable.

bool containsPopefMetadata()

Returns: True if the stream contains PopEF metadata.

poplar::Executable deserializePoplarExecutable() const

Deserializes the Poplar executable from the executable blob that is part of a PopEF file.

Returns: Poplar executable.

std::unique_ptr<popart::popx::Executablex> deserializeExecutable(popart::Ir &ir, popart::popx::IrLowering &lowering) const

Load a PopART executable from a PopEF file.

Parameters

ir – Object which some of the deserialized data will be written to.
lowering – Object which some of the deserialized data will be written to.

Returns

PopART executable.

Public Static Functions

static nonstd::optional<size_t> checkFileForValidPoplarExecutable(const std::string &filePath)

Check that a PopART executable can be loaded from a PopEF file.

Parameters: filePath – The full path to the PopEF file.
Returns: The hash of the PopART IR if an executable could be loaded; otherwise an empty nonstd::optional<size_t>.
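
One way a caller might consume an optional hash like the one returned here is sketched below. This is an assumption about usage, not something the API prescribes: canReuseCachedExecutable is a hypothetical helper, and std::optional (C++17) is used in place of nonstd::optional so that the example is self-contained.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical helper for illustration only.
// fileHash stands in for the result of the file check: it is empty when no
// executable could be loaded, and holds the stored IR hash otherwise.
bool canReuseCachedExecutable(const std::optional<std::size_t> &fileHash,
                              std::size_t currentIrHash) {
  // Reuse the cached file only if it is loadable and was built from an IR
  // with the same hash as the current one.
  return fileHash.has_value() && *fileHash == currentIrHash;
}
```

Checking has_value() before dereferencing keeps the "file not loadable" case distinct from a hash mismatch.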
