14. PopART C++ API
This chapter describes the PopART C++ API.
14.1. Sessions
#include <popart/session.hpp>
-
class Session
Session is a runtime instance that provides an interface for executing ONNX graphs on IPU hardware.
Subclassed by popart::InferenceSession, popart::TrainingSession
Public Functions
-
std::vector<uint32_t> getRNGState()
Get state of the random number generator.
-
void setRNGState(const std::vector<uint32_t>)
Set state of the random number generator.
-
void setRandomSeed(uint64_t seedValue)
Set the value of the random number generator seed.
This method explicitly seeds all random operations. Additionally, this method derives a new state for the random number generator (RNG) from the seed and sets it on the device. This RNG state is used to resolve stochastic rounding. Note that to deterministically store and restore the combined random state for a session, do the following:
C++:
// Store random state (session s0).
auto seed = s0.getRandomSeed();
auto rngState = s0.getRNGState();

// Restore random state (session s1).
s1.setRandomSeed(seed); // <-- affects RNG state, order important
s1.setRNGState(rngState);
Python:
# Store random state (session s0).
seed = s0.getRandomSeed()
rngState = s0.getRNGState()

# Restore random state (session s1).
s1.setRandomSeed(seed)  # <-- affects RNG state, order important
s1.setRNGState(rngState)
- Parameters
seedValue – The value of the seed.
-
uint64_t getRandomSeed()
Get the value of the random number generator seed.
Calling setRandomSeed() with this value (at a later stage) reinstates the random state logic that seeds random operations.
- Returns
The value used to seed current random operations.
-
void compileAndExport(const std::string &filename)
Compile the graph and export it to a file.
This method will first create a poplar::Graph and compile the poplar::Executable. Next, it will export the executable and PopART metadata to the file. The exported file will be in the PopEF format. This means that the file can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.
This method automatically creates folders as needed if filename is located in a folder which does not exist.
- Parameters
filename – The name of the file where the compiled executable and metadata will be saved.
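For illustration, a minimal sketch of exporting a compiled executable, assuming session is an already constructed popart::InferenceSession and the output path is hypothetical:
// A minimal sketch: compile and export in one call.
// "model.popef" is a hypothetical output path.
try {
  session->compileAndExport("model.popef");
} catch (const popart::error &e) {
  std::cerr << "Compile and export failed: " << e.what() << std::endl;
}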
-
void compileAndExport(std::ostream &out)
Compile the graph and export it to a stream.
This method will first create a poplar::Graph and compile the poplar::Executable. Next, it will export the executable and PopART metadata to the stream. The data will be streamed in the PopEF format. This means that the data can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.
- Parameters
out – The stream that the compiled executable and metadata will be written to.
-
void saveExecutableToFile(const std::string &filename)
Save a compiled graph to a file.
The file will be in the PopEF format. This means that the file can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.
This method automatically creates folders as needed if filename is located in a folder which does not exist.
- Parameters
filename – The name of the file where the compiled executable and metadata will be saved.
- Pre
prepareDevice() must have been called.
-
void saveExecutableToStream(std::ostream &out)
Save a compiled graph to a stream.
The data will be streamed in the PopEF format. This means that the file can be used to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information.
- Parameters
out – The stream that the compiled executable and metadata will be written to.
- Pre
prepareDevice() must have been called.
-
void saveExecutable(const std::string &path, bool savePopartMetadata = true, bool saveVariables = true)
Save a compiled graph with additional data to a file.
PopART is able to save its state after the model compilation is complete, so that it can be restored at a later time. To make this possible, it is necessary to save the following elements:
a serialised Poplar executable,
its associated metadata,
tensor data blobs if model parameters have not been frozen (refer to SessionOptions::constantWeights for more information),
a PopART-specific opaque blob to store information only relevant to PopART. This is needed to restore the PopART state.
The file will be in the PopEF format. This means that the file can be used to restore the state of the PopART program without recompiling the graph, or to run inference using the Triton Inference Server with the Graphcore Triton backend. See the Poplar Triton Backend User Guide for more information. If you want to analyze the file structure saved by this function, refer to the PopEF dump tool.
- Parameters
path – The name of the file or directory where the compiled executable, metadata and variables will be saved. If you specify a path to a directory, the function will write the data to the file “<path>/executable.popef”. If the file exists, the function will overwrite the old data with the new data.
savePopartMetadata – If you do not need the option to restore the PopART state later, you can set this flag to false to reduce the disk space taken up by the file.
saveVariables – If you do not need to save the variables (tensors) state, you can set this flag to false, for example if you want to save them later or in a different location. The function will save data consistent with the variables contained within the model.
- Pre
prepareDevice() must have been called.
-
void saveVariables(const std::string &path)
Save all variables to a file.
The function will save data consistent with the variables contained within the model.
The file will be in the PopEF format. If you want to analyze the tensors saved by this function, refer to the PopEF dump tool.
- Parameters
path – The name of the file or directory where the variables will be saved. If you specify a path to a directory, the function will write the data to the file “<path>/variables.popef”. If the file exists, the function will overwrite the old data with the new data.
- Pre
prepareDevice() must have been called.
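As a usage sketch, the executable and the variables can be saved separately; this assumes session is a popart::Session on which prepareDevice() can be called, and the checkpoint directory is hypothetical:
session->prepareDevice();
// Save the executable and PopART metadata, but defer the variable data.
session->saveExecutable("checkpoints", /*savePopartMetadata=*/true,
                        /*saveVariables=*/false);
// Later, for example after training, save the variables on their own
// (writes "checkpoints/variables.popef" when given a directory).
session->saveVariables("checkpoints");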
-
void checkInplacingAmbiguity() const
Check for potential inplacing ambiguities.
This method creates an AliasModel object for each graph and runs the Poprithms ambiguity checker on it.
Throws an error if the graph has an inplacing ambiguity and prompts the user to check the inplacing.
See poprithms::memory::inplace::Graph::AmbiguityStatus on the Poprithms GitHub repository for more on what constitutes an ambiguity.
-
void loadExecutableFromFile(const std::string &filename)
Load the compiled executable and metadata from a file.
The file must have been created with compileAndExport(const std::string).
- Parameters
filename – The name of the file to load the executable and metadata from.
-
void loadExecutableFromStream(std::shared_ptr<std::istream> in)
Load the compiled executable and metadata from a stream.
The stream must have been created with compileAndExport(std::ostream).
- Parameters
in – The shared pointer to the stream to load the executable from.
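A minimal sketch of restoring a previously exported executable, assuming the session was constructed with the same model and options used at export time, and the file name is hypothetical:
// Load the executable exported earlier with compileAndExport().
session->loadExecutableFromFile("model.popef");
// prepareDevice() will then use the loaded executable instead of recompiling.
session->prepareDevice();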
-
void prepareDevice(bool loadEngine = true)
Prepare the network for execution.
This will create the poplar::Graph and poplar::Engine.
- Parameters
loadEngine – If true, load the engine and connect the streams once the device is ready.
-
void loadEngineAndConnectStreams()
Load the engine on the device and connect the streams.
This will set up the poplar::Streams.
Note: This call is optional. The engine will implicitly be loaded on the device when required.
-
void weightsFromHost()
Copy weights from the host to the device.
-
void buffersFromHost()
Copy buffers from the host to the device.
-
void weightsToHost()
Copy the weights from the device to the host stream memory.
-
uint64_t getCycleCount(std::string id = "")
Copy the cycle count tensor from the device to the host.
- Parameters
id – The identifier of the cycle count tensor.
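For example, a sketch of reading the cycle count, assuming SessionOptions::instrumentWithHardwareCycleCounter was enabled before the session was constructed and at least one step has been run:
popart::SessionOptions opts;
opts.instrumentWithHardwareCycleCounter = true;
// ... construct the session with opts, prepare the device, run a step ...
uint64_t cycles = session->getCycleCount();
std::cout << "Main program took " << cycles << " cycles" << std::endl;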
-
void connectStreamToCallback(const std::string &streamHandle, std::function<void(void*)> callback, unsigned index = 0)
Connect a Poplar stream with a callback.
The callback will be called whenever the stream is to be read from, or has been written to, by the device. The memory location will only be valid for reading or writing for the duration of the callback.
- Parameters
streamHandle – The name of the stream to connect to.
callback – The callback to be called whenever the stream is to be read or was written to by the device.
index – The replica index to connect to, when using replicated graphs. Default=0.
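A hedged sketch of connecting a callback; the stream handle "h2d_stream" is hypothetical and would normally come from the lowered graph:
session->connectStreamToCallback("h2d_stream", [](void *ptr) {
  // ptr is only valid for the duration of this callback.
  auto *data = static_cast<float *>(ptr);
  data[0] = 1.0f; // Fill the buffer before the device consumes it.
});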
-
void connectStream(const std::string &streamHandle, void *buffer)
Connect a Poplar stream with a fixed location in memory.
Each time data is copied to the stream, this location will be read and each time data is copied from the stream, this location will be written.
- Parameters
streamHandle – The handle of the stream to connect to.
buffer – The pointer to the memory location.
-
void connectHostFunction(const std::string &functionHandle, std::function<void(const void*const*, size_t, void*const*, size_t)> callback, unsigned index = 0)
Connect a host function to a callback.
The callback takes two arguments, which point to the locations in memory for each of the function’s input and output arguments, respectively. During a host function call, first the device transfers the input data to the host, then the callback is invoked, and finally the output data is copied back to the device. The memory pointed to by the callback arguments must only be accessed during the duration of the callback.
- Parameters
functionHandle – The name of the host function.
callback – The function to be called whenever new input data is available.
index – The replica index to connect to, when using replicated graphs. Default=0.
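A sketch of a host function callback matching the documented signature; the function handle "my_host_fn" is hypothetical:
session->connectHostFunction(
    "my_host_fn",
    [](const void *const *inputs, size_t numInputs,
       void *const *outputs, size_t numOutputs) {
      // Trivial example: copy the first element of input 0 to output 0.
      const auto *in = static_cast<const float *>(inputs[0]);
      auto *out = static_cast<float *>(outputs[0]);
      out[0] = in[0];
    });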
-
void run(IStepIO &stepIO, std::string debugName = "")
Run one step.
Read input data from the addresses in stepIO.in. Write the output data to the addresses in stepIO.out.
- Parameters
stepIO – The input and output data.
debugName – A debug string to identify this run in logs.
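For illustration, a minimal sketch of one step, assuming an input tensor "input", an anchored output "output" and hypothetical shapes; popart::NDArrayWrapper (from popart/ndarraywrapper.hpp) wraps raw host buffers as the IArray objects that StepIO expects:
const int64_t batchSize = 16;
std::vector<float> inData(batchSize * 784, 0.0f);
popart::NDArrayWrapper<float> inWrap(inData.data(), {batchSize, 784});
std::map<popart::TensorId, popart::IArray &> inputs = {{"input", inWrap}};

std::vector<float> outData(batchSize * 10, 0.0f);
popart::NDArrayWrapper<float> outWrap(outData.data(), {batchSize, 10});
std::map<popart::TensorId, popart::IArray &> anchors = {{"output", outWrap}};

popart::StepIO stepio(inputs, anchors);
session->run(stepio, "step0");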
-
void run(std::string programHandle, IStepIO &stepIO, std::string debugName = "")
Run one step of a custom program.
Read input data from the addresses in stepIO.in. Write the output data to the addresses in stepIO.out.
- Parameters
programHandle – The handle of the custom program to run.
stepIO – The input and output data.
debugName – A debug string to identify this run in logs.
-
void updateExternallySavedTensorLocations(const std::string &fromLocation, const std::string &toLocation)
Update the tensor locations of tensors in the session’s ONNX model.
A new file will be created at this point, and written to when the ONNX model is saved with a subsequent call to modelToHost().
- Parameters
fromLocation – All externally saved tensors with location fromLocation will have their location updated to toLocation.
toLocation – The updated tensor locations. This must not already exist.
-
void modelToHost(const std::string &fn)
Write the current model to an ONNX file.
- Parameters
fn – The path to file. The path can be absolute or relative. If you plan to run your program in multiple processes simultaneously, you should avoid possible race conditions by writing to different files, for example by using temporary files.
-
TensorInfo getInfo(TensorId) const
Get the tensor information for a tensor.
- Parameters
TensorId – The identifier of the tensor to get the tensor information for.
- Returns
The tensor information for the tensor.
-
bool hasInfo(TensorId) const
Check whether a tensor has information.
- Parameters
TensorId – The identifier of the tensor to get the tensor information for.
- Returns
true if the tensor with identifier TensorId has tensor information and false if not.
-
std::set<TensorId> getAllTensorIds() const
Returns the ids of all tensors in the model.
- Pre
prepareDevice() must have been called.
-
std::string getSummaryReport(bool resetProfile = true) const
Retrieve the summary report from the poplar::Engine.
The options which were passed to the Session constructor will influence the information in the report.
This method may only be called after prepareDevice() has been called.
- Parameters
resetProfile – If true, resets the execution profile. Default: true.
- Returns
A string containing the report.
-
std::string getSerializedGraph() const
Retrieve the serialized graph from the poplar::Engine.
A JSON format report is produced.
This method may only be called after prepareDevice() has been called.
- Returns
A string containing the serialized graph.
-
pva::Report getReport() const
Retrieve the graph report from the poplar::Engine.
The options which were passed to the Session constructor will influence the information in the report.
This method may only be called after prepareDevice() has been called.
- Returns
The PopVision Analysis report object.
-
void resetHostWeights(const std::string &model, const bool ignoreWeightsInModelWithoutCorrespondingHostWeight = false)
Reset weights with weights in an ONNX model.
Note that the only differences between the ONNX model and the current model must be the weights. No other differences are allowed.
This method only updates the weights on the host. weightsFromHost() must be called after this method to update the weights on the device.
- Parameters
model – An ONNX model protobuf, or the name of a file containing an ONNX model protobuf.
ignoreWeightsInModelWithoutCorrespondingHostWeight – If true, do not throw an error if there are initializers in the ONNX model without corresponding initializer tensor(s) in the session’s IR.
-
void readWeights(const IWeightsIO &weightsIo)
Read the weights from the host stream memory and write to the host.
This method may only be called after weightsToHost() has been called.
- Parameters
weightsIo – The weight data that is read from the host stream memory is written to the addresses in weightsIo.out.
-
void writeWeights(const IWeightsIO &weightsIo)
Write the weights from the host to the IR tensor memory.
This method may only be called after weightsFromHost() has been called.
- Parameters
weightsIo – The weight data is written to the addresses in weightsIo.out.
-
std::string serializeIr(IrSerializationFormat format)
Serialize the IR graph to a string.
- Parameters
format – The format to use for serializing.
-
inline const popx::IrLowering &getIrLowering() const
Get the IR lowering associated with the Session.
-
inline const popx::Executablex &getExecutable() const
Get the executable associated with the Session.
-
void broadcastWeights(int rootRank = 0)
Broadcasts the weights from the PopRun instance with index rootRank to all other instances.
- Parameters
rootRank – The index of the PopRun instance from which the weights should be broadcasted.
-
void updateEngineCache()
Update cacheEntries from the engine cache directory and update ir::hashMatched_ with the updated cacheEntries.
-
void setDeviceInfo(std::shared_ptr<DeviceInfo> deviceInfo)
Set the DeviceInfo of the Session.
14.1.1. Training session
#include <popart/session.hpp>
-
class TrainingSession : public popart::Session
TrainingSession is a runtime instance that provides an interface for executing ONNX graphs on IPU hardware with training provided by optimizing a loss tensor using an optimizer and automatic differentiation (backpropagation).
Public Functions
-
~TrainingSession() override
Destructor for the TrainingSession class.
-
void updateOptimizerFromHost(const Optimizer *optimizer)
Update the optimizer from the host.
This method updates the optimizer and the associated hyperparameters but not the optimizer state tensors.
NOTE: The optimizer parameter has to be compatible with the optimizer passed to the TrainingSession constructor. For example, you cannot call this function with an SGD1 optimizer if you created the session with an SGD0 optimizer. This is because it is not possible to change the IR after a session has been constructed.
- Parameters
optimizer – A pointer to a popart::Optimizer.
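For example, a sketch of changing the learning rate between steps, assuming the session was created with a non-const popart::SGD optimizer:
// The optimizer type must match the one passed to the constructor.
popart::SGD newOptimizer({{"defaultLearningRate", {0.001f, false}}});
session->updateOptimizerFromHost(&newOptimizer);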
-
void copyFromRemoteBuffer(const std::string &buffer, void *w, int repeat_index, unsigned replication_index = 0)
Copy from a remote buffer into a user buffer.
This can be useful when we run larger models with host side reductions since HEXOPT is currently limited to 128 MB.
- Parameters
buffer – The name of the remote buffer to copy from.
w – Pointer to a user buffer to copy to.
repeat_index – The index in the remote buffer to copy from.
replication_index – The replicated graph index when using replicated graphs. Default=0.
-
void copyToRemoteBuffer(void *w, const std::string &buffer, int repeat_index, unsigned replication_index = 0)
Copy from a user buffer to a remote buffer.
This can be useful when we run larger models with host side reductions since HEXOPT is currently limited to 128 MB.
- Parameters
w – Pointer to a user buffer to copy from.
buffer – The remote buffer to copy to.
repeat_index – The index in the remote buffer to copy to.
replication_index – The replicated graph index when using replicated graphs. Default=0.
Public Static Functions
-
static std::unique_ptr<TrainingSession> createFromIr(std::unique_ptr<Ir> ir, std::shared_ptr<DeviceInfo> deviceInfo, const std::string name = "training")
Create a session for training from an IR.
- Parameters
ir – The IR to create the session from.
deviceInfo – The type of device that this session uses.
name – The name of this training session. Default: “training”.
-
static std::unique_ptr<TrainingSession> createFromOnnxModel(const std::string &model, const DataFlow &dataFlow, const TensorId &loss, const Optimizer &optimizer, std::shared_ptr<DeviceInfo> deviceInfo, const InputShapeInfo &inputShapeInfo = InputShapeInfo(), const SessionOptions &userOptions = SessionOptions(), const Patterns &patterns = Patterns(), const std::string name = "training")
Create a session for training from an ONNX model.
- Parameters
model – An ONNX model protobuf, or the name of a file containing an ONNX model protobuf.
dataFlow – Configuration for the data feeds and fetches.
loss – The identifier of the final scalar loss tensor for training.
optimizer – The name of an optimizer to use when training.
deviceInfo – The type of device that this session uses.
inputShapeInfo – (Optional) The sizes and dtypes of the input tensors. This is used to specify the sizes of the input tensors in the case that the ONNX model does not include this information. The Poplar graph programming framework uses statically allocated memory buffers and so it needs to know the size of tensors before the compilation. Default: InputShapeInfo().
userOptions – (Optional) The user configuration options for the Session class. Default: SessionOptions().
patterns – (Optional) A user-selected set of graph transformation patterns which will be applied to the graph. If this is not specified, a default set of optimisation transformations will be applied. Default: Patterns().
name – (Optional) The name of this training session. Default: “training”.
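Putting these parameters together, a hedged end-to-end construction sketch; the model file, loss tensor ID, anchor name and learning rate are hypothetical:
auto dataFlow = popart::DataFlow(
    /*batchesPerStep=*/1, {{"output", popart::AnchorReturnType("All")}});
auto device =
    popart::DeviceManager::createDeviceManager().acquireAvailableDevice(1);
auto optimizer = popart::ConstSGD(0.01f);

auto session = popart::TrainingSession::createFromOnnxModel(
    "model.onnx", dataFlow, "loss", optimizer, device);
session->prepareDevice();
session->weightsFromHost(); // Copy the initial weights to the device.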
14.1.2. Inference session
#include <popart/session.hpp>
-
class InferenceSession : public popart::Session
InferenceSession is a runtime instance that provides an interface for executing ONNX graphs on IPU hardware, without any automatic differentiation (backpropagation) or optimization.
Public Functions
-
~InferenceSession() override
Destructor for the InferenceSession class.
-
void popxlSetEngineIsLoaded(bool isLoaded)
Public Static Functions
-
static std::unique_ptr<InferenceSession> createFromIr(std::unique_ptr<Ir> ir, std::shared_ptr<DeviceInfo> deviceInfo, const std::string name = "inference")
Create a session for inference from an IR.
- Parameters
ir – The IR to create the session from.
deviceInfo – The type of device that this session uses.
name – The name of this inference session. Default: “inference”.
-
static std::unique_ptr<InferenceSession> createFromOnnxModel(const std::string &model, const DataFlow &dataFlow, std::shared_ptr<DeviceInfo> deviceInfo, const InputShapeInfo &inputShapeInfo = InputShapeInfo(), const SessionOptions &userOptions = SessionOptions(), const Patterns &patterns = Patterns(), const std::string name = "inference")
Create a session for inference from an ONNX model.
- Parameters
model – An ONNX model protobuf, or the name of a file containing an ONNX model protobuf.
dataFlow – Configuration for the data feeds and fetches.
deviceInfo – The type of device that this session uses.
inputShapeInfo – (Optional) The sizes and dtypes of the input tensors. This is used to specify the sizes of the input tensors in the case that the ONNX model does not include this information. The Poplar graph programming framework uses statically allocated memory buffers and so it needs to know the size of tensors before the compilation. Default: InputShapeInfo().
userOptions – (Optional) The user configuration options for the Session class. Default: SessionOptions().
patterns – (Optional) A user-selected set of graph transformation patterns which will be applied to the graph. If this is not specified, a default set of optimisation transformations will be applied. Default: Patterns().
name – (Optional) The name of this inference session. Default: “inference”.
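Similarly, a minimal inference construction sketch with a hypothetical model file and anchor; the CPU device here is a convenience for functional testing:
auto dataFlow = popart::DataFlow(
    /*batchesPerStep=*/1, {{"output", popart::AnchorReturnType("All")}});
auto device = popart::DeviceManager::createDeviceManager().createCpuDevice();

auto session = popart::InferenceSession::createFromOnnxModel(
    "model.onnx", dataFlow, device);
session->prepareDevice();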
14.1.3. Session options
#include <popart/sessionoptions.hpp>
-
enum class popart::AccumulateOuterFragmentSchedule
Enum type that determines how the operations in the accumulate outer fragment will be scheduled across virtual graphs (only relevant to pipelined modes).
Values:
-
enumerator Scheduler = 0
Don’t add additional constraints and let the scheduler work it out.
-
enumerator Serial
Add constraints that ensure ops are executed in virtual graph ID order.
-
enumerator OverlapCycleOptimized
Try and parallelise ops with different virtual graph IDs as much as possible.
-
enumerator OverlapMemoryOptimized
Try and parallelise ops with different virtual graph IDs but avoid certain steps that are costly in terms of memory usage.
-
enum class popart::AutodiffStitchStrategy
Enum type representing a strategy to ensure a backward graph’s inputs are either inputs of the forward graph, outputs of the forward graph or gradients of outputs of the forward graph.
Strategies may expose tensors that would otherwise have been internal to the forward graph as outputs of this forward graph.
Values:
-
enumerator RecomputeMinimal = 0
Recompute any backward graph inputs associated with non-gradient forward graph tensors that are neither inputs nor outputs in the forward graph.
-
enumerator RecomputeAllNonInputs
Recompute any backward graph inputs associated with non-gradient forward graph tensors that are not inputs in the forward graph.
-
enumerator AddFwdOutputs
For backward graph inputs associated with non-gradient forward graph tensors that are neither inputs nor outputs in the forward graph, add them as outputs to the forward graph.
-
enumerator SafeAddFwdOutputs
Like AutodiffStitchStrategy::AddFwdOutputs except that those backward graph inputs that can’t be stitched with AutodiffStitchStrategy::AddFwdOutputs (that is, by adding outputs to the forward graph) are stitched using the AutodiffStitchStrategy::RecomputeMinimal strategy instead.
This means that this is a safe strategy to use as an Autodiff default.
-
enumerator N
The number of AutodiffStitchStrategy values.
-
enum class popart::BatchSerializationBatchSchedule
Enum type that describes how to change the batch serialisation subgraph schedule before outlining.
Note
This setting is experimental and may change.
Values:
-
enumerator Scheduler = 0
Don’t encourage any particular scheduling for ops within batch subgraphs (leave it to the scheduler) but tell the scheduler to schedule subgraphs in sequence.
-
enumerator Isomorphic
Encourage all ops within batch subgraphs to be scheduled identically and for each subgraph to be scheduled in sequence (good for outlineability).
-
enumerator OverlapOnIo
Attempt to put the remote load op for batch N+1 right after the compute phase of batch N.
-
enumerator OverlapOnCompute
Attempt to put the remote load op for batch N+1 right before the compute phase of batch N.
-
enumerator N
The number of BatchSerializationBatchSchedule values.
-
enum class popart::BatchSerializationMethod
Enum type that describes how to apply the batch serialization.
Note
This setting is experimental and may change.
Values:
-
enumerator UnrollDynamic = 0
Unroll the batch with dynamic slicing.
-
enumerator UnrollStatic
Unroll the batch with static slicing.
-
enumerator Loop
Loop over the batch dimension.
-
enumerator N
The number of BatchSerializationMethod values.
-
enum class popart::BatchSerializationTransformContext
Enum type that describes when to apply batch serialization.
Note
This setting is experimental and may change.
Values:
-
enumerator Fwd = 0
Apply batch serialisation before growing the backward pass.
-
enumerator Bwd
Apply batch serialisation after growing the backward pass.
-
enumerator N
The number of BatchSerializationTransformContext values.
-
enum class popart::ExecutionPhaseIOSchedule
Enum type to specify when to load tensors.
Values:
-
enumerator Preload = 0
Preload tensors in previous phase for use in current phase.
-
enumerator OnDemand
Load tensors just before they are required.
-
enumerator N
The number of ExecutionPhaseIOSchedule values.
-
enum class popart::ExecutionPhaseSchedule
Enum type to specify the order of processing optimizer operations for different weights of the same execution phase.
The steps for phased execution are:
Copy to IO tiles if necessary.
Run collective operations if necessary.
Load optimizer state.
Update optimizer state.
Apply optimizer.
Store updated tensor if necessary.
Values:
-
enumerator Interleaving = 0
Process above steps for one weight at a time (for example: 123456, 123456, 123456).
The scheduler may interleave these steps.
-
enumerator Batch
Process above steps for all weights together, in a way that maximises overlap potential between compute and exchange (for example: 333, 111, 222, 444, 555, 666).
-
enumerator BatchClusteredIO
Process above steps for all weights together, in a way that maximises overlap potential between compute and exchange, and maximises stream copy merges by keeping RemoteLoad/RemoteStore operations clustered (for example: 333, 111, 222, 444, 555, 666).
-
enumerator N
The number of ExecutionPhaseSchedule values.
-
enum class popart::GradientTensorTrackingMethod
Enum type to specify the method for selecting gradient tensors whose statistics are to be tracked for the AutomaticLossScale transform.
Values:
-
enumerator AllNonViewChangingGradientTensors = 0
Track all gradients of non-view-changing gradient tensors.
-
enumerator ConvAndMatmulGradients
Track all gradients of inputs to MatMul and Convolution ops.
-
enumerator GradientsOfUserSpecifiedTensors
Track gradients of user-specified tensors.
-
enumerator N
The number of GradientTensorTrackingMethod values.
-
enum class popart::Instrumentation
Enum type used to specify an instrumentation type.
Values:
-
enumerator Outer = 0
Outer loop instrumentation, graph over all IPUs.
-
enumerator Inner
Inner loop instrumentation, graph per IPU.
-
enumerator N
The number of Instrumentation values.
-
enum class popart::IrSerializationFormat
Enum type used to specify a serialization format.
Values:
-
enumerator JSON
JavaScript Object Notation (JSON).
-
enum class popart::MeanReductionStrategy
Enum type that specifies when to divide by a mean reduction factor, when doing mean reduction over a sequence of tensors \(t_1, t_2, ..., t_k\).
Values:
-
enumerator Running = 0
Keep the reduction buffer as the mean of the tensors accumulated so far.
If \(t_1, ..., t_f\) has just been processed, the current accumulator \(s\) is the mean of these values, and the next accumulator update is \(s = \frac{f}{f+1} * s + \frac{1}{f+1} * t_{f+1}\) to keep \(s\) a running mean.
This strategy guarantees \(s \le \max(t_1, ..., t_k)\) throughout the accumulation, therefore it will not overflow, but it is generally slower than MeanReductionStrategy::Post.
-
enumerator Post
Keep the reduction buffer as the running sum, and divide once by \(k\) at the end of the accumulation.
This strategy will generally be faster than MeanReductionStrategy::Running, but is prone to overflow (especially when using fp16).
-
enumerator N
The number of MeanReductionStrategy values.
-
enum class popart::MergeVarUpdateType
Enum type used to specify which VarUpdateOp ops to merge.
Values:
-
enumerator None = 0
Do not merge VarUpdateOp ops.
-
enumerator All
Merge all VarUpdateOp ops into as few groups as possible.
This is a good choice when memory is not a constraint.
-
enumerator AutoLoose
Merge into groups while attempting not to increase maximum variable liveness, and also attempting not to slice tensor variables in a way that would require them to be processed by different VarUpdateOp ops.
-
enumerator AutoTight
Merge into groups, so that VarUpdateOp ops process tensors of exactly SessionOptions::mergeVarUpdateMemThreshold in size.
-
enumerator N
The number of MergeVarUpdateType values.
-
enum class popart::RecomputationType
Enum type to specify which ops to recompute in the backward pass when doing auto-recomputation.
Values:
-
enumerator None = 0
No ops are recomputed (Default).
-
enumerator Standard
Recompute using an algorithm that picks checkpoints to try to minimise maximum liveness.
-
enumerator NormOnly
Only Norm ops (+ non-linearities, if following) are recomputed.
-
enumerator Pipeline
Recompute all forward pipeline stages.
-
enumerator RecomputeAll
Recompute all ops.
-
enumerator N
The number of RecomputationType values.
-
enum class popart::SubgraphCopyingStrategy
Enum type that describes how copies for inputs and outputs for subgraphs are lowered.
Currently this only affects subgraphs associated with CallOp ops.
Values:
-
enumerator OnEnterAndExit = 0
Copy all inputs before the start of the subgraph, copy all outputs after all ops in the subgraph.
With this strategy, subgraphs will always map to a single Poplar function.
-
enumerator JustInTime
Copy inputs just before they are consumed and copy outputs as soon as they are produced.
With this strategy, subgraphs may be lowered into multiple Poplar functions.
-
enumerator N
The number of SubgraphCopyingStrategy values.
-
enum class popart::SyntheticDataMode
Enum type used to specify the data source for input tensors.
Values:
-
enumerator Off = 0
Use real data.
-
enumerator Zeros
Input tensors are initialised to all zeros.
-
enumerator RandomNormal
Input tensors are initialised with a random normal distribution ~N(0,1).
-
enumerator RandomUniform
Input tensors are initialised with a uniform distribution.
-
enumerator N
The number of SyntheticDataMode values.
-
enum class popart::VirtualGraphMode
Enum type used to specify a virtual graph mode.
Values:
-
enumerator Off = 0
Virtual graphs are not enabled.
-
enumerator Manual
User must set the popart::Op::virtualGraph attribute on all ops.
-
enumerator Auto
Use the AutoVirtualGraph transform.
-
enumerator ExecutionPhases
Virtual graphs are tied to execution phases.
-
enumerator N
The number of VirtualGraphMode values.
-
struct AccumulateOuterFragmentSettings
A structure containing accumulate outer fragment settings.
Public Functions
-
AccumulateOuterFragmentSettings() = default
-
inline AccumulateOuterFragmentSettings(AccumulateOuterFragmentSchedule schedule_, const std::vector<int> &excludedVirtualGraphs_)
Constructor for AccumulateOuterFragmentSettings.
- Parameters
schedule_ – Indicate how to schedule the accumulate outer fragment. This setting is experimental and may change. Default: AccumulateOuterFragmentSchedule::Serial
excludedVirtualGraphs_ – Indicate which virtual graph IDs PopART should explicitly avoid parallelising. This setting is experimental and may change.
Public Members
-
AccumulateOuterFragmentSchedule schedule = AccumulateOuterFragmentSchedule::Serial
Indicate how to schedule the accumulate outer fragment.
Note
This setting is experimental and may change.
-
std::vector<int> excludedVirtualGraphs = {}
Indicate which virtual graph IDs PopART should explicitly avoid parallelising.
Note
This setting is experimental and may change.
-
struct AutodiffSettings
The settings for the Autodiff transform.
Public Functions
-
AutodiffSettings() = default
Default constructor for the AutodiffSettings struct.
-
inline AutodiffSettings(AutodiffStitchStrategy stitchStrategy_)
Constructor for the AutodiffSettings struct.
- Parameters
stitchStrategy_ – The strategy to ensure a backward graph’s inputs are either inputs of the forward graph, outputs of the forward graph or gradients of outputs of the forward graph. Default: AutodiffStitchStrategy::RecomputeAllNonInputs.
Public Members
-
AutodiffStitchStrategy stitchStrategy = AutodiffStitchStrategy::RecomputeAllNonInputs
The strategy PopART should use to ensure that all graph inputs of a backward graph are available as either inputs or outputs of the forward graph or gradients of outputs of the forward graph.
Note
This is an experimental option and may change.
-
struct AutomaticLossScalingSettings
A structure containing user configuration for automatic loss scaling settings.
Note
Automatic loss scaling is in preview. It is well tested and enabled in some of our example applications, but may not behave as expected in all models. Recommendation: if your model with automatic loss scaling enabled does not converge or triggers a compilation error, then you will need to set the loss scale manually.
Public Functions
-
AutomaticLossScalingSettings() = default
Default constructor for AutomaticLossScalingSettings.
-
AutomaticLossScalingSettings(bool enabled_, const nonstd::optional<std::vector<TensorId>> &toTrackTensors_, float binEdgeLocation_, float thresholdUpperCountProportion_, int updatePeriod_, GradientTensorTrackingMethod gradientTensorTrackingMethod_)
Constructor for AutomaticLossScalingSettings.
- Parameters
enabled_ – Indicate whether to keep track (true) or not (false) of the distribution of gradient tensor elements over the floating point range. Default: false.
toTrackTensors_ – An optional list of model tensor names, for which gradient statistics will be collected. If not set, the gradients of all tensors produced by default operations (matmul, conv) will be used.
binEdgeLocation_ – The location of the bin edge as a proportion of the absolute numerical range of the tracked gradient tensor elements, in the range [0, 1]. 0 represents the smallest representable value, and 1 the maximum. This is the single bin edge of the histogram that is an input to the loss scale updater algorithm. Default: 0.125.
thresholdUpperCountProportion_ – The proportion of the elements in the upper bin above which the loss scale is increased, and below which the loss scale is decreased. Should be in the range [0, 1]. Default: 1e-7.
updatePeriod_ – Indicate how often the loss scale update factor should be updated with respect to optimizer steps. Default: 1
gradientTensorTrackingMethod_ – The method for selecting gradient tensors whose statistics are to be tracked. Default: GradientTensorTrackingMethod::AllNonViewChangingGradientTensors.
-
std::size_t hash() const
Public Members
-
bool enabled = false
-
float binEdgeLocation = 0.125f
-
float thresholdUpperCountProportion = 1e-7
-
int updatePeriod = 1
-
GradientTensorTrackingMethod gradientTensorTrackingMethod = GradientTensorTrackingMethod::AllNonViewChangingGradientTensors
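As a usage sketch, these settings are normally set on SessionOptions before session construction (the member name automaticLossScalingSettings is assumed from the PopART headers):
popart::SessionOptions opts;
opts.automaticLossScalingSettings.enabled = true;
opts.automaticLossScalingSettings.binEdgeLocation = 0.125f;
opts.automaticLossScalingSettings.gradientTensorTrackingMethod =
    popart::GradientTensorTrackingMethod::ConvAndMatmulGradients;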
-
struct BatchSerializationSettings
A structure containing batch serialization settings.
Public Functions
-
BatchSerializationSettings() = default
Default constructor for BatchSerializationSettings.
-
BatchSerializationSettings(int factor_, bool concatOnVirtualGraphChange_, bool concatOnExecutionPhaseChange_, bool concatOnPipelineStageChange_, BatchSerializationTransformContext transformContext_ = BatchSerializationTransformContext::Fwd, BatchSerializationMethod method_ = BatchSerializationMethod::UnrollDynamic, BatchSerializationBatchSchedule batchSchedule_ = BatchSerializationBatchSchedule::Isomorphic)
Constructor for BatchSerializationSettings.
- Parameters
factor_ – The number of compute batches to split operations into. Default: 0.
concatOnVirtualGraphChange_ – Indicate to break batch serialization chains (true) when the virtual graph changes (by concatenating the compute batches to the local batch). Default: true.
concatOnExecutionPhaseChange_ – Indicate to break batch serialization chains (true) when the execution phase changes (by concatenating the compute batches to the local batch). Default: true.
concatOnPipelineStageChange_ – Indicate to break batch serialization chains (true) when the pipeline stage changes (by concatenating the compute batches to the local batch). Default: true.
transformContext_ – An experimental value to control when batch serialization is applied. Default: BatchSerializationTransformContext::Fwd.
method_ – An experimental value to control how batch serialization is applied. Default: BatchSerializationMethod::UnrollDynamic.
batchSchedule_ – An experimental value that changes how operations are scheduled. Default: BatchSerializationBatchSchedule::Isomorphic.
Public Members
-
int factor = 0
The number of compute batches to split operations into.
-
bool concatOnVirtualGraphChange = true
Break batch serialization chains when the virtual graph changes (by concatenating the compute batches to the local batch).
-
bool concatOnExecutionPhaseChange = true
Break batch serialization chains when the execution phase changes (by concatenating the compute batches to the local batch).
-
bool concatOnPipelineStageChange = true
Break batch serialization chains when the pipeline stage changes (by concatenating the compute batches to the local batch).
-
BatchSerializationTransformContext transformContext = BatchSerializationTransformContext::Fwd
Experimental value to control when batch serialization is applied.
-
BatchSerializationMethod method = BatchSerializationMethod::UnrollDynamic
Experimental value to control how batch serialization is applied.
-
BatchSerializationBatchSchedule batchSchedule = BatchSerializationBatchSchedule::Isomorphic
Experimental value that changes how operations are scheduled.
-
struct ExecutionPhaseSettings
A structure containing ExecutionPhase settings.
Public Functions
-
ExecutionPhaseSettings() = default
Default constructor for ExecutionPhaseSettings.
-
inline ExecutionPhaseSettings(int phases_, bool stages_, ExecutionPhaseIOSchedule weightIOSchedule_, ExecutionPhaseIOSchedule activationIOSchedule_, ExecutionPhaseIOSchedule optimizerStateIOSchedule_, ExecutionPhaseIOSchedule accumulatorIOSchedule_, ExecutionPhaseSchedule schedule_)
Constructor for ExecutionPhaseSettings.
- Parameters
phases_ – The number of execution phases for the whole model. Default=0.
stages_ – The number of overlapping stages:
1: Parallel streaming memory, default for 1 IPU per replica.
2: PingPong between 2 IPUs, default for 2 or more IPUs per replica (Default).
weightIOSchedule_ – The execution phase IO schedule for weight tensors. Default: ExecutionPhaseIOSchedule::Preload.
activationIOSchedule_ – The execution phase IO schedule for activation and gradient tensors. Default: ExecutionPhaseIOSchedule::Preload.
optimizerStateIOSchedule_ – The execution phase IO schedule for optimizer state tensors. Default: ExecutionPhaseIOSchedule::OnDemand.
accumulatorIOSchedule_ – The execution phase IO schedule for accumulator tensors. Default: ExecutionPhaseIOSchedule::Preload.
schedule_ – An experimental value that changes how operations are scheduled. Default: ExecutionPhaseSchedule::Interleaving.
Public Members
-
int phases = 0
Number of ExecutionPhases for the whole model.
-
int stages = 2
Number of overlapping stages.
1: Parallel streaming memory, default for 1 IPU per replica.
2: PingPong between 2 IPUs, default for 2 or more IPUs per replica.
-
ExecutionPhaseIOSchedule weightIOSchedule = ExecutionPhaseIOSchedule::Preload
The execution phase IO schedule for weight tensors.
-
ExecutionPhaseIOSchedule activationIOSchedule = ExecutionPhaseIOSchedule::Preload
The execution phase IO schedule for activation and gradient tensors.
-
ExecutionPhaseIOSchedule optimizerStateIOSchedule = ExecutionPhaseIOSchedule::OnDemand
-
ExecutionPhaseIOSchedule accumulatorIOSchedule = ExecutionPhaseIOSchedule::Preload
-
struct ReplicatedCollectivesSettings
A structure containing settings for replicated collective operations.
Public Functions
-
ReplicatedCollectivesSettings(bool prepareScheduleForMergingCollectives = false, bool mergeAllReduceCollectives = false, bool mergeReduceScatterCollectives = false, bool mergeAllGatherCollectives = false)
Constructor for the ReplicatedCollectivesSettings struct.
- Parameters
prepareScheduleForMergingCollectives – Insert constraints into the schedule such that collectives which can be merged occur one right after the other. true to insert constraints, false otherwise. Default: false.
mergeAllReduceCollectives – Identify allreduce operations which can be scheduled at the same time, and perform them as one larger operation to better utilize the bandwidth between replicas. true to identify operations, false otherwise. Default: false.
-
std::size_t hash() const
Public Members
-
bool prepareScheduleForMergingCollectives = false
-
bool mergeAllReduceCollectives = false
-
bool mergeReduceScatterCollectives = false
Identifies reduce-scatter operations which can be scheduled at the same time, and performs them as one larger operation so as to better utilize the bandwidth between replicas.
-
bool mergeAllGatherCollectives = false
Identifies allgather operations which can be scheduled at the same time, and performs them as one larger operation so as to better utilize the bandwidth between replicas.
-
struct SessionOptions
A structure containing user configuration options for the Session class.
Public Functions
-
inline bool explicitPipeliningEnabled() const
Enable explicit pipelining.
Determined from the values of enablePipelining, useHostCopyOps and enableExplicitMainLoops.
-
inline bool implicitPipeliningEnabled() const
Enable implicit pipelining.
Determined from the values of enablePipelining, useHostCopyOps and enableExplicitMainLoops.
-
inline void enableExplicitIR(bool enable)
Enable explicit representations in the IR (code paths).
Enabled if true, otherwise not.
-
bool shouldDelayVarUpdates() const
-
int64_t getGlobalReplicationFactor() const
Get the global replication factor.
- Returns
If enableDistributedReplicatedGraphs is true, then return globalReplicationFactor.
If enableReplicatedGraphs is true, then return replicatedGraphCount.
Otherwise, return 1.
-
unsigned getAccumulationFactor() const
Get the gradient accumulation factor.
Throws an error if gradient accumulation is not enabled (enableGradientAccumulation is false) and the factor (accumulationFactor) is set to >1.
- Returns
The accumulation factor.
-
bool autoRecomputationEnabled() const
Returns true if auto-recomputation is enabled, false otherwise.
-
inline SessionOptions()
Constructor for SessionOptions.
Public Members
-
std::string logDir
A directory for log traces to be written into.
-
std::set<std::string> dotChecks = {}
When to write .dot files during IR construction.
-
int firstDotOp = 0
The ops written to the .dot file will be a part of the schedule, controlled by firstDotOp and finalDotOp.
In particular, it will be [max(0, firstDotOp), min(N ops in IR, finalDotOp)).
-
int finalDotOp = 10000
See firstDotOp.
-
bool dotOpNames = false
Enable inclusion of the op name in the .dot file (the op type is always exported).
Enabled when true. Default: false.
-
bool exportPoplarComputationGraph = false
Enable export of Poplar computational graph.
Enabled when true. Default: false.
-
bool exportPoplarVertexGraph = false
Enable export of Poplar vertex graph.
Enabled when true. Default: false.
-
bool separateCallOpPdfs = true
Enable creation of separate PDFs for each subgraph when generating PDFs of IR graphs.
Enabled when true. Default: true.
-
bool enableOutlining = true
Enable outlining.
This identifies and extracts repeated parts of the computational graph into subgraphs. Enabled when true. Default: true.
-
bool enableOutliningCopyCostPruning = true
Enable inclusion of the cost of copying cached sections in the outlining cost model.
Enabled when true. Default: true.
-
float outlineThreshold = 1.0f
Specify the incremental value that a sub-graph requires, relative to its nested sub-graphs (if any), to be eligible for outlining.
A high threshold results in fewer sub-graphs being outlined, a negative value results in all being outlined. The gross value of a sub-graph is the sum of its constituent ops’ Op::getSubgraphValue() values. To disable outlining, it is better to set enableOutlining to false than to set this value to infinity. The default value of 1.0f results in all high value operations such as convolution being cached, but standalone low value operations such as ReLU will not be.
Default: 1.0f.
-
float outlineSequenceBreakCost = 10000.0f
Specify the penalty applied to outlining potential sub-graphs if the sub-graph to be created breaks up a sequence of operations that are more efficient (for example for overlapping compute and exchange) when outlined together.
Default: 10000.0f.
-
SubgraphCopyingStrategy subgraphCopyingStrategy = SubgraphCopyingStrategy::OnEnterAndExit
Specify how copies for inputs and outputs for subgraphs are lowered.
Setting this value to SubgraphCopyingStrategy::JustInTime may save memory at the cost of fragmenting subgraphs into multiple Poplar functions. This may be particularly useful when a number of weight updates are outlined in one subgraph, as it may prevent multiple weight tensors from being live at the same time inside the subgraph.
Default: SubgraphCopyingStrategy::OnEnterAndExit.
-
RecomputationType autoRecomputation = RecomputationType::None
Enable recomputation of operations in the graph in the backward pass.
This will reduce model size at the cost of computation cycles.
Default: RecomputationType::None (no recomputation).
-
MergeVarUpdateType mergeVarUpdate = MergeVarUpdateType::None
Enable merging of VarUpdates into groups of VarUpdates, by flattening and concatenating variable tensors and updating tensors.
Default: MergeVarUpdateType::None (no merging).
-
int64_t mergeVarUpdateMemThreshold = 1000000
Specify the memory threshold for VarUpdateOp merging algorithms.
The MergeVarUpdateType::AutoLoose and MergeVarUpdateType::AutoTight VarUpdateOp merging algorithms have a threshold on the total memory of variable tensors to merge for updating. Defined as total memory in bytes.
Default: 1000000.
-
int64_t looseThresholdAtPeak = 8000
Specify the threshold at peak used in the calculation of the absolute threshold in the MergeVarUpdateType::AutoLoose VarUpdateOp merging algorithm.
min(mergeVarUpdateMemThreshold, liveAtPeak - liveCurrently + looseThresholdAtPeak)
where:
liveAtPeak is an estimate of the maximum live memory of the computation; and
liveCurrently is an estimate of the live memory where the threshold is being used to determine whether to schedule or postpone a VarUpdateOp.
Default: 8000.
-
bool rearrangeAnchorsOnHost = true
Enable rearrangement (in memory) of anchor tensors to be done on the host.
Before anchor tensors are streamed from device to host, they are not necessarily arranged in memory as required when they are to be copied from host stream to host. This can be done on the device or on the host.
Default: true (rearrangement done on host to save memory, but often at the expense of cycles, especially for larger anchor tensors).
-
bool rearrangeStreamsOnHost = false
Enable rearrangement (in memory) of stream tensors to be done on the host.
Before stream tensors are streamed from host to device, they are not necessarily arranged in memory as required when they are to be copied from host stream to device. This can be done on the device or on the host.
Default: false (rearrangement done on device).
-
bool enablePrefetchDatastreams = true
Enable prefetching for input data streams.
Poplar will speculatively read data for a stream before it is required in order to allow the ‘preparation’ of the data to occur in parallel with compute. Enabled when true. Default: true.
-
unsigned defaultBufferingDepth = 1
Specify the default buffering depth value used for streams that are not re-arranged on the host.
For tensors that are rearranged on the host, a buffering depth of 1 will always be used. This default value can be overridden via bufferingDepthMap.
-
unsigned defaultPrefetchBufferingDepth = initialDefaultPrefetchBufferingDepthValue
- Deprecated:
This session option name has been deprecated and will be removed in a future release.
-
std::map<TensorId, unsigned> bufferingDepthMap
This mapping can be used to set stream-specific buffering depths.
The buffering depth could be thought of as being the size of a circular buffer that feeds data to and from Poplar. A buffering depth greater than 1 may improve the performance due to increased parallelisation but comes at the cost of increasing the memory footprint. Streams for tensors that have no entry in this map will default to 1 (if a tensor is rearranged on host) or defaultBufferingDepth (if a tensor is not rearranged on host). Specifying a tensor that gets rearranged on host in this map will throw an error.
-
std::map<TensorId, unsigned> prefetchBufferingDepthMap
- Deprecated:
This session option name has been deprecated and will be removed in a future release.
-
bool enableNonStableSoftmax = false
Enable the non-stable softmax Poplar function.
By default, the stable softmax Poplar function is used. The input tensor to softmax, \(x\), is preprocessed by subtracting \(max(x)\) from each element before computing the exponentials, ensuring numerical stability. If the inputs to the softmax operations are small enough to not cause overflow when computing the exponential, then the non-stable version can be enabled instead, to increase the speed.
Default: false (not enabled).
-
bool enableReplicatedGraphs = false
Enable replication of graphs. Default: false (not enabled).
-
bool enableGradientAccumulation = false
Enable gradient accumulation. Default: false (not enabled).
-
ReductionType accumulationAndReplicationReductionType = ReductionType::Sum
Specify how gradients are reduced when using gradient accumulation and graph replication.
Default: ReductionType::Sum.
-
MeanReductionStrategy meanAccumulationAndReplicationReductionStrategy = MeanReductionStrategy::Post
Specify when to divide by a mean reduction factor when accumulationAndReplicationReductionType is set to ReductionType::Mean.
Default: MeanReductionStrategy::Post.
-
int64_t replicatedGraphCount = 1
Specify the number of model replications.
If enableReplicatedGraphs is true, replicatedGraphCount will set the number of model replications. For example, if the model uses 1 IPU, a replicatedGraphCount of 2 will use 2 IPUs. If the model is pipelined across 4 IPUs, a replicatedGraphCount of 4 will use 16 IPUs in total. Therefore, the number of IPUs requested must be a multiple of replicatedGraphCount. If the training is done across multiple instances of the program then replicatedGraphCount is the number of replicas for this instance.
-
int64_t accumulationFactor = 1
Specify the number of micro-batches to accumulate before applying the varUpdate.
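For instance, a sketch combining graph replication with gradient accumulation in the session options:
popart::SessionOptions opts;
opts.enableReplicatedGraphs = true;
opts.replicatedGraphCount = 2;   // Needs twice the IPUs of a single replica.
opts.enableGradientAccumulation = true;
opts.accumulationFactor = 4;     // Accumulate 4 micro-batches per weight update.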
-
VirtualGraphMode virtualGraphMode = VirtualGraphMode::Off
Specify how to place ops on virtual graphs to achieve model parallelism, either manually using model annotations, or automatically.
Default: VirtualGraphMode::Off.
-
std::vector<float> virtualGraphSplitRatios
Specify split ratios when VirtualGraphMode::Auto is enabled.
These values represent the split ratios for each device, and each value is in the range (0, 1).
For example, to uniformly split the whole graph across 4 IPUs, the value should be [0.25, 0.25, 0.25, 0.25].
-
bool enablePipelining = false
Enable pipelining of virtual graphs. Default: false (not enabled).
-
SyntheticDataMode syntheticDataMode = SyntheticDataMode::Off
Specify whether to use real or synthetic data to initialize input tensors.
Streaming to/from the host is only enabled for SyntheticDataMode::Off which indicates that real data is being used.
Default: SyntheticDataMode::Off.
-
bool instrumentWithHardwareCycleCounter = false
Add instrumentation to the program to count the number of device cycles (of a single tile, on a single IPU) that the main program takes to execute.
Expect this to have a small detrimental impact on performance.
-
std::set<Instrumentation> hardwareInstrumentations = {Instrumentation::Outer}
-
bool disableGradAccumulationTensorStreams = false
Disable saving of weight gradient tensors off the device.
If true, the weight gradient tensors are not saved off the device when devicex.weightsFromHost() is called.
Note
This option is overridden if syntheticDataMode is not SyntheticDataMode::Off.
Note
Weight gradient tensors that are also optimiser tensors will only be disabled if both disableGradAccumulationTensorStreams and disableOptimizerStateTensorStreams are true.
-
bool disableOptimizerStateTensorStreams = false
Disable streaming of optimizer tensors.
If true, streaming of optimizer tensors is disabled. This setting can be used to conserve memory if you are not interested in checkpointing the optimizer state.
Note
Weight gradient tensors that are also optimiser tensors will only be disabled if both disableGradAccumulationTensorStreams and disableOptimizerStateTensorStreams are true.
-
bool compileEngine = true
Setting to only build the Poplar graph but not compile it.
If false, the backend will build the Poplar graph but not compile it into an Engine. In this case, no execution can be performed, and nothing can be transferred to the device. API calls which retrieve information from the graph building stage, such as tile mapping introspection, can still be used.
-
bool constantWeights = true
Specify an optimization for an inference session to have constant weights.
Set this option to false in order to change the weights with a call to Session::resetHostWeights() after the session has been prepared. This option has no effect on a training session.
Default: true.
-
bool enableEngineCaching = false
Enable Poplar executable caching.
The file is saved to the location defined with cachePath. The file will be in the PopEF format. This means that it can be used to run inference using the Triton Inference Server because Graphcore provides a backend to it. See the Poplar Triton Backend User Guide for more information.
Default: false (not enabled).
-
bool enableVariablesCaching = true
Enable variable caching.
This means that the caching process will save variables as additional PopEF blobs to the file location defined with cachePath. If PopART requires data for variables (during the cache reading process), they will be automatically read from the cache file.
Note: turning this off allows a PopART Session to optimise the host memory it consumes during model runtime. Specifically, weightsToHost() can write directly to the IR tensor data buffers. If the option were on, this would not be safe and the session would have to create separate buffers to write the fetched data to.
Default: true (enabled).
-
std::string cachePath = "session_cache"
Folder to save the poplar::Executable to.
-
bool enableFloatingPointChecks = false
Enable the throwing of exceptions when floating point errors occur.
Default: false (not enabled).
-
bool enableStochasticRounding = false
Enable stochastic rounding.
PopART will set the Poplar engine option target.deterministicWorkers to true if this option is set, and to false if it is not set. Adding a value for “target.deterministicWorkers” to SessionOptions::engineOptions overrides this behaviour.
Default: false (not enabled).
-
bool _enableRngStateManagement = false
-
ExecutionPhaseSettings executionPhaseSettings
Configuration settings for execution phases.
-
AccumulateOuterFragmentSettings accumulateOuterFragmentSettings
Configuration setting for operations in the accumulate outer fragment.
-
bool explicitRecomputation = false
Enable explicit recomputation.
Default: false (not enabled).
-
NumIOTiles numIOTiles
Number of IPU tiles dedicated to IO.
-
bool aliasZeroCopy = false
Enable zero-copy for subgraphs.
-
BatchSerializationSettings batchSerializationSettings
Configuration setting for batch serialization.
-
AutodiffSettings autodiffSettings
Configuration settings for the autodiff transform.
-
bool delayVarUpdates = true
Option to delay variable updates as much as possible.
-
bool scheduleNonWeightUpdateGradientConsumersEarly = false
-
bool enableFullyConnectedPass = true
Enable the global fullyConnectedPass option for matmuls.
See also poplin::matMul(poplar::Graph, poplar::Tensor, poplar::Tensor, poplar::program::Sequence, poplar::Type, poplar::DebugContext, poplar::OptionFlags, matmul::PlanningCache).
-
bool enableSerializedMatmuls = true
Enable/disable the serializing of matmuls.
-
std::string partialsTypeMatMuls
Set the partials type globally for matmuls.
Can be overridden individually with Builder.setPartialsType(). Valid values are `"float"` and `"half"`. By default, this is not set, so no global partials type is imposed.
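A minimal sketch of imposing a global partials type, using the values documented above:
C++:
popart::SessionOptions opts;
opts.partialsTypeMatMuls = "half";  // valid values are "float" and "half"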
-
bool enableStableNorm = false
If `true`, computes the mean first and subtracts the activations from it before computing the variance.
The implementation with this flag set to `true` is slower than when set to `false`. The stable version requires the first order moment to be estimated and applied to the sample set before the second order central moment is calculated.
-
std::map<std::string, std::string> engineOptions
Poplar engine options.
-
std::map<std::string, std::string> convolutionOptions
Poplar convolution options.
-
std::map<std::string, std::string> lstmOptions
Poplar LSTM options.
-
std::map<std::string, std::string> matmulOptions
Poplar matmul options.
-
std::map<std::string, std::string> reportOptions
Poplar reporting options.
-
std::map<std::string, std::string> gclOptions
GCL options.
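These maps are forwarded to the corresponding Poplar and GCL components as string option flags. A hedged sketch; the option names shown are Poplar library options and may vary between SDK versions:
C++:
popart::SessionOptions opts;
opts.engineOptions["debug.instrument"] = "true";              // Poplar engine option
opts.convolutionOptions["availableMemoryProportion"] = "0.6"; // poplin convolution option
opts.matmulOptions["availableMemoryProportion"] = "0.6";      // poplin matmul option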
-
ExperimentalSettings experimentalSettings
Configuration setting for custom transform applier.
-
std::vector<std::string> customCodelets
List of codelet files (with file extension) to be added to the Poplar graph.
See the Poplar documentation for poplar::Graph for more information.
-
std::vector<TensorId> updatableNamedBuffers
List of model named buffers that can be updated with call to copyNamedBuffersToDevice().
This allows updating just a subset of the model weights, instead of all of them as happens with a copyWeightsToDevice() call.
-
std::string customCodeletCompileFlags
Compile flags for the custom codelets.
For example, `-g` to generate debug info. See the Poplar documentation for poplar::Engine for more information.
-
double timeLimitScheduler = 1e9
The maximum allowed time (in seconds) that can be spent searching for a good graph schedule before a solution must be returned.
-
int64_t swapLimitScheduler = static_cast<int64_t>(1e9)
The maximum number of improving steps allowed by the scheduling algorithm before a solution must be returned.
-
std::string serializedPoprithmsShiftGraphsDir = {}
The directory to serialize Poprithms graphs to.
PopART uses Poprithms for scheduling PopART graphs. The Poprithms graphs created for scheduling can be optionally serialised (written to file). If `serializedPoprithmsShiftGraphsDir` is empty, then the graphs will not be serialised. The names of the serialization files will be `poprithms_shift_graph_i.json` for the lowest non-existing values of `i`. The directory must already exist; PopART will not create it.
-
std::string kahnTieBreaker = "greedy"
Specify which method is used to control how ops are scheduled.
The initial scheduling is done with Kahn’s algorithm. When several ops are free to be scheduled, this controls which method is used.
Options are described in the Poprithms KahnTieBreaker enum.
-
size_t transitiveClosureOptimizationThreshold = {100000}
Specify the transitive closure optimization threshold.
The transitive closure optimization pass can significantly accelerate the scheduler. It does not, in general, affect the final schedule returned. It is run between initialization with Kahn's algorithm and the shifting swaps. The transitive closure optimization pass is O(nOps^2) and so should not be used for extremely large graphs. If a graph is above this threshold, the transitive closure optimization pass is not run.
-
bool decomposeGradSum = false
Enable replacement of single sums of partial gradients with a tree of additions.
This can reduce max liveness at the cost of extra cycles. A typical use case for this would be if a large weight tensor is used as an input to many operations.
Default: `false` (not enabled).
-
ReplicatedCollectivesSettings replicatedCollectivesSettings
Control the behavior of different collective operations.
-
bool enableDistributedReplicatedGraphs = false
Enable training with Poplar replicated graphs across multiple PopART instances.
Default: `false` (not enabled).
-
int64_t globalReplicationFactor = 1
The total number of replicas in a multi-instance, replicated-graph training session (this should be left as the default value (1) if distributed replicated graphs are disabled).
This value includes local replication.
-
int64_t globalReplicaOffset = 0
The first replica index that this PopART instance is running.
-
bool groupHostSync = false
Specify to group the streams from the host to the device at the beginning of the schedule, and the streams from the device to the host at the end of the schedule.
This trades off memory usage for speed.
When `true`, tensors will stay live for longer.
Default: `false` (not enabled).
Note: This setting has no effect when useHostCopyOps is enabled (`true`).
-
bool strictOpVersions = true
Enable strict op version checks.
Strict op version checks will throw an error if the exact version of an op required for the model opset is not supported. Turning this check off will cause PopART to fall back to the latest implementation of the op that is supported.
Default: `true` (enabled).
Warning: Turning off these checks may cause undefined behaviour.
-
bool opxAliasChecking = false
Enable running Opx checks to verify that IR tensor aliasing information corresponds to the lowered Poplar tensor aliasing.
Default: `false` (not enabled).
-
bool opxModifyChecking = false
Enable running Opx checks to verify that IR tensor modification information corresponds to the lowered Poplar tensor modifications.
Default: `false` (not enabled).
-
bool useHostCopyOps = false
Enable use of IR graph operations for data and anchor streams.
Default: `false` (not enabled).
-
bool enableEfficientOverlapIOTopoCons = false
Enable simplified and equivalent overlap IO topological constraints.
Suppose there are N bins in each of the three stages (8 bins before the loop, 7 inside the loop and 6 after the loop), and L ops in each bin. The vanilla implementation of overlap IO creates topological constraints with complexity O(N*N*L*L).
To make sure InitOps in each step are scheduled before HostLoadOps, it is sufficient to keep the topological constraints within each bin and to schedule the last op of each bin Bin0 before the first op of the bin Bin1 that follows it. The total complexity O(N*N*L*L) is then reduced to O(N*L).
Default: `false` (not enabled).
-
bool enableLoadAndOffloadRNGState = false
Enable load and offload of device RNG state from host.
Default: `false` (not enabled).
-
TensorLocationSettings activationTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}
Tensor location settings for activation/gradient tensors.
-
TensorLocationSettings weightTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}
Tensor location for weight tensors.
-
TensorLocationSettings optimizerStateTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}
Tensor location for optimizer state tensors.
-
TensorLocationSettings accumulatorTensorLocationSettings = TensorLocationSettings{TensorLocation(), 2, 8192}
Tensor location for gradient accumulator tensors.
-
std::map<TensorId, TensorLocation> tensorLocationSettingsOverride
Override tensor location for specific tensors by setting tensor locations for specific tensor ID values.
-
AutomaticLossScalingSettings automaticLossScalingSettings
Settings to enable and configure the automatic loss scaling behaviour when training.
Note
Automatic loss scaling is in preview. It is well tested and enabled in some of our example applications, but may not behave as expected in all models. Recommendation: if your model with automatic loss scaling enabled does not converge or triggers a compilation error, then you will need to set the loss scale manually.
-
DeveloperSettings developerSettings
Settings for developers to configure testing and benchmarking.
-
bool enableSupportedDataTypeCasting = true
Enable casting to supported data types.
If enabled (`true`), casts any tensor of unsupported data types to supported data types when lowering to Poplar. Currently, this implies casting:
INT64 -> INT32
UINT64 -> UINT32
The cast will throw an error for incompatible data types and over/underflows, and will warn about narrowing casts.
Default: `true` (enabled).
-
bool enableExplicitMainLoops = false
Enable explicit main loop transformation, and disable implicit training loops.
Note
This will be deprecated and enabled by default.
-
bool groupNormStridedChannelGrouping = false
Enable fast math mode for group norms.
Group norms have a fast math mode which changes the implementation to run faster on the IPU but, as a consequence, is incompatible with other implementations (for example, when running trained weights on the host). The default (`false`) is to use the correct, but slightly slower, mode.
-
std::function<void(int, int)> compilationProgressLogger
Callback function used to indicate PopART compilation progress.
The function should not block. All calls to the callback function will be made from the main thread so blocking in the callback will block compilation from progressing.
If this logger is not set then compilation progress will be printed on the info channel.
- Param int
The progress value.
- Param int
The maximum value for the progress.
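A minimal sketch of a non-blocking progress logger, assuming a percentage rendering is acceptable:
C++:
#include <iostream>

popart::SessionOptions opts;
opts.compilationProgressLogger = [](int progress, int total) {
  // Called from the main thread; must not block.
  std::cerr << "Compiling: " << (100 * progress) / total << "%\n";
};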
-
int compilationProgressTotal = 100
Total progress ticks until compilation complete.
-
bool enableMergeExchange = true
Enable merging remote and host IO operations to facilitate IO overlap.
`true` to enable, otherwise `false`.
Default: `true` (enabled).
-
bool ensureFp32LossScaleTensor = false
Ensure that the loss scale tensor is fp32 and that this is combined with fp16 activations as late as possible to produce the first fp16 activation gradients.
This makes it possible to choose a loss scale value greater than max(fp16). This is also recommended when automatic loss scaling is enabled. Only compatible with models that have an fp16 loss scale tensor.
`true` ensures that the loss scale tensor is fp32.
Default: `false` (not enabled).
-
bool enableInplaceAmbiguityChecking = false
Enable creation of an `AliasModel` object for each graph and run the Poprithms ambiguity checker on it.
This throws an error if the graph has a potential inplacing ambiguity.
See `poprithms::memory::inplace::Graph::AmbiguityStatus` for more information on what constitutes an ambiguity.
If set to `true`, an `AliasModel` object is created for each graph and the Poprithms ambiguity checker is run on it. No ambiguity checking is performed if this option is set to `false` (default). However, inplace fallbacks will occur if necessary.
-
bool createImplicitPipeliningFwdOnlyProgram = false
- Deprecated:
Create a custom program containing the forward pipeline only.
-
bool throwIfLog2ScaleTensorNotInRange = true
If set to `true`, throw a Poplar error if any fused ops that consume a log2 scale tensor receive a log2 scale tensor value not in the integer range [-32, 32).
If set to `false`, no error is thrown. However, note that this may lead to undefined behaviour if the value of the log2 scale is outside the range.
-
bool enableConstantFoldingOfMultipleConsumers = true
If set to `false`, disable constant folding on ops if any input has multiple consumers.
Default: `true` (enabled).
-
bool useLoopCandidateCreator = false
Use the loop candidate creator for a constant if one exists.
Default: `false` (not enabled).
-
bool stashAllTensorsInferencePipeline = false
Stash all tensors when pipelining an inference model.
Default: `false` (not enabled).
-
struct ExperimentalSettings
Public Members
-
std::map<std::string, std::vector<std::string>> customTransformApplierSettings
Custom transform applier settings.
Enable inserting a custom sequence of transforms at predefined checkpoints. Multiple checkpoint names and transform names can be passed for different model configurations.
The predefined checkpoint names are:
FWD0: Initial IR immediately after lowering from ONNX to the IR.
FWD1: After the pre-alias patterns have been applied to FWD0.
BWD0: After growing the backward pass (including the optimiser step). Note this happens before optimiser decomposition, so the optimiser will appear as a single special op rather than the many ops that implement it.
PREALIAS: After pre-alias transforms have been applied to BWD0.
MAINLOOPS: After the MainLoops transform has been applied. This transform adds explicit loop ops to the IR for device iterations (batches per step) and gradient accumulation.
FINAL: The final IR after preparation.
The transform names are defined by PopART and users.
For example, to execute 'Transform A' and 'Transform B' at the 'Fwd0' checkpoint and execute 'Transform C' at the 'Fwd1' checkpoint:
{ "Fwd0": [ "Transform A", "Transform B" ], "Fwd1": [ "Transform C" ] }
Note
This setting is experimental for inference and may change.
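The JSON-like example above can be expressed directly in C++ as follows; the transform names are hypothetical placeholders:
C++:
popart::SessionOptions opts;
opts.experimentalSettings.customTransformApplierSettings = {
    {"Fwd0", {"Transform A", "Transform B"}},
    {"Fwd1", {"Transform C"}}};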
-
bool createHostTransferableTensorWithOffset = false
Accumulate the bytes of the created tensors and rotate the start tile of the next tensor to balance the tile mapping.
This is especially useful when there are many small input tensors: enabling the option avoids mapping them all to tile 0.
Default: `false` (not enabled).
-
class NumIOTiles
A wrapper class for the SessionOptions::numIOTiles option that permits any int value and has an ‘unassigned’ state.
Public Functions
-
NumIOTiles()
Constructor.
-
NumIOTiles(int numIOTiles)
Constructor.
- Parameters
numIOTiles – The number of IPU tiles dedicated to IO.
-
bool operator==(const int &rhs) const
Compare with int.
-
operator int() const
Auto convert to int.
-
NumIOTiles &operator=(const int &x)
Assign value using int.
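Because of these conversion operators, the option can be read and written like a plain int. A minimal sketch:
C++:
popart::SessionOptions opts;
opts.numIOTiles = 32;           // dedicate 32 tiles per IPU to IO
int ioTiles = opts.numIOTiles;  // implicit conversion back to int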
-
inline bool explicitPipeliningEnabled() const
-
struct TensorLocationSettings
A structure containing user configuration for cache/offloading settings.
Public Functions
-
TensorLocationSettings() = default
Constructor.
-
TensorLocationSettings(TensorLocation location_, int minElementsForOffChip_ = 2, int minElementsForReplicatedTensorSharding_ = 8192)
Constructor.
- Parameters
location_ – The tensor location information.
minElementsForOffChip_ – The minimum number of elements below which offloading won’t be considered.
minElementsForReplicatedTensorSharding_ – The minimum number of elements necessary for replicated tensor sharding.
-
TensorLocationSettings(TensorStorage storage_, int minElementsForOffChip_ = 2, int minElementsForReplicatedTensorSharding_ = 8192)
Constructor.
- Parameters
storage_ – The tensor storage information.
minElementsForOffChip_ – The minimum number of elements below which offloading won’t be considered.
minElementsForReplicatedTensorSharding_ – The minimum number of elements necessary for replicated tensor sharding.
Public Members
-
TensorLocation location = TensorLocation()
The default tensor location for this tensor type.
-
int minElementsForOffChip = 2
The minimum number of elements below which offloading won’t be considered.
-
int minElementsForReplicatedTensorSharding = 8192
The minimum number of elements below which replicated tensor sharding won't be considered.
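A hedged sketch of offloading optimizer state to streaming memory with replicated tensor sharding, using the constructors documented above:
C++:
popart::SessionOptions opts;
opts.optimizerStateTensorLocationSettings = popart::TensorLocationSettings(
    popart::TensorLocation(popart::TensorStorage::OffChip,
                           popart::ReplicatedTensorSharding::On),
    2,      // minElementsForOffChip
    8192);  // minElementsForReplicatedTensorSharding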
#include <popart/variablesettings.hpp>
-
class VariableSettings
A class to dictate the behaviour of variables, and reductions of such, across multiple graphs.
Public Functions
-
void verify()
Runs a test to check whether the VariableSettings are invalid, and throws an error if so.
-
ReplicaGrouping getReplicaGrouping(unsigned numReplicas) const
- Parameters
numReplicas – The number of replicas in the IR this is used in.
- Returns
the ReplicaGrouping domain of this VariableSettings.
-
bool isUsingCommGroup() const
- Returns
whether the VariableSettings were initialised using a CommGroup or a stride.
-
CommGroupType getCommGroupType() const
- Returns
the CommGroupType. The value of this is invalid if VariableSettings::isUsingCommGroup returns false.
-
unsigned getStride() const
- Returns
the stride. The value of this is invalid if VariableSettings::isUsingCommGroup returns true.
-
unsigned getGroupSize() const
- Returns
the replica group size.
-
inline VariableRetrievalMode getRetrievalMode() const
- Returns
the VariableRetrievalMode retrievalMode of this VariableSettings.
-
VariableSettings()
“Default” constructor, defaults CommGroup to [All, 0] and retrievalMode to OnePerGroup.
-
VariableSettings(VariableRetrievalMode retrievalMode_)
Defaults CommGroup to [All, 0].
-
VariableSettings(CommGroup sharedVariableDomain_, VariableRetrievalMode retrievalMode_)
Entirely custom VariableSettings.
-
VariableSettings(unsigned stride, unsigned groupSize)
-
VariableSettings(unsigned stride, unsigned groupSize, VariableRetrievalMode retrievalMode)
-
unsigned numReplicasReturningVariable(unsigned replicaCount) const
Calculate the number of replicas that will return this variable.
- Parameters
replicaCount – Number of global replicas.
- Returns
Number of variables returned.
-
unsigned getGroupCount(unsigned replicaCount) const
- Parameters
replicaCount – The replicationFactor of the graph.
- Returns
The number of groups given the replicaFactor and the VariableSettings.
-
unsigned getStride(unsigned replicaCount) const
- Parameters
replicaCount – The replicationFactor of the graph.
- Returns
The stride between each member of a group.
-
unsigned getRealGroupSize(unsigned replicaCount) const
Because CommGroups don't have a defined group size if the type is All or None, this function returns a group size that is always accurate, based on the number of replicas.
- Parameters
replicaCount – The replication factor
- Returns
The actual number of replicas in a group
-
unsigned getGroupRepresentative(unsigned group) const
Get the default first member of a group.
- Parameters
group – The group to return the representative for.
- Returns
The representative replica of this group.
-
Shape shapeOnReplica(Shape full_shape, unsigned replicaCount, const TensorId name) const
In certain cases, the shape ONNX reads holds an extra outer dimension, where the outer dimension represents the number of returned replica variables.
This function takes an ONNX full shape and removes the outer dimension safely (that is, it checks whether the outer dimension matches the expected outer dimension). It is a convenience function to avoid duplicate code.
- Parameters
full_shape – The shape as presented by Onnx.
replicaCount – The local replication factor, used to calculate the return factor.
name – The TensorId of the function, used to give good error feedback.
- Returns
The shape of the data on the replica.
-
Shape shapeOnHost(Shape replica_shape, unsigned replicaCount) const
Takes the shape of a tensor on a replica and returns its full ONNX shape.
This is the inverse operation to shapeOnReplica.
- Parameters
replica_shape – The shape of the data on a replica.
replicaCount – The local replication factor, used to calculate the return factor.
- Returns
The shape as presented by Onnx.
-
std::vector<std::vector<std::int64_t>> groups(unsigned replicaCount) const
This function returns a vector of vectors, where each inner vector contains all the replica IDs of the replicas that share a variable domain, given the VariableSettings and the replicaCount.
- Parameters
replicaCount – The local replication factor
- Returns
A vector of vectors, such that `groups.at(a).at(b)` is member number b of group a, `groups.size()` is the number of groups, and `groups.at(a).size()` is the size of group a.
-
bool operator==(const VariableSettings &other) const
Compare two variable-settings.
- Parameters
other – VariableSettings to compare these settings to.
- Returns
True if all internal elements are the same
-
bool operator!=(const VariableSettings &other) const
Compare two variable-settings.
- Parameters
other – VariableSettings to compare these settings to.
- Returns
False if all internal elements are the same
-
enum class popart::VariableRetrievalMode
Enum type that describes how to retrieve variables from the replicas.
Each replica is in a group defined by the `VariableSettings::sharedVariableDomain`. Replicas within a group have variables initialized with the same values.
Values:
-
enumerator OnePerGroup = 0
Returns one variable per group (defined by the `VariableSettings::sharedVariableDomain` `CommGroup`); automatically returns the first replica of each group, where first means the one with the lowest replica ID.
-
enumerator AllReduceReplicas
As OnePerGroup, but performs an AllReduce among the replicas in the same group according to `VariableSettings::sharedVariableDomain`. CURRENTLY UNSUPPORTED.
-
enumerator AllReplicas
Returns the weights of all replicas.
#include <popart/commgroup.hpp>
-
class CommGroup
Class to specify sub-groups of replicas.
Examples of derived sub-groups:
IPU-link domain sub-rack: `type == Consecutive && replicaGroupSize == 64/replica-size/N` where `N` is a power of two and `replicaGroupSize > 1`.
Complete IPU-link domain / full rack: `type == Consecutive && replicaGroupSize == 64/replica-size`.
Using GW-links only: `type == Orthogonal && replicaGroupSize == numberOfIpuLinkDomains`.
Public Functions
-
CommGroup()
Default CommGroup constructor.
Sets `type` to CommGroupType::All and `replicaGroupSize` to 0.
-
inline CommGroup(CommGroupType type, unsigned groupSize)
Construct CommGroup.
- Parameters
type – The replica group type.
groupSize – The replica group size.
-
explicit CommGroup(const ReplicaGrouping &grouping)
Construct CommGroup from a ReplicaGrouping.
- Parameters
grouping – The replica grouping.
Public Members
-
CommGroupType type = CommGroupType::All
Replica group type.
-
unsigned replicaGroupSize = 0
Replica group size.
-
enum class popart::CommGroupType
PopART equivalent of GCL CommGroupType.
Each of these enumeration constants has a corresponding GCL CommGroupType value.
Values:
-
enumerator All = 0
All replicas viewed as one group, replica group size is ignored.
-
enumerator Consecutive
Groups are consecutive in replicas.
If there are N replicas denoted `{0, ... N-1}` and the group size is `k`, then there are `N/k` groups of size `k`, as `{0, 1, ... k-1}, {k, ... 2k-1} ... {N-k, ... N-1}`.
-
enumerator Orthogonal
Groups are sliced orthogonal to the replica ordering.
If there are `N` replicas denoted `{0, ... N-1}` and the group size is `k`, then there are `m = N/k` groups of size `k`, as `{0, m, 2m, ...}, {1, m+1, 2m+1, ...} ... {m-1, 2m-1, ... N-1}`.
-
enumerator None
Each replica is in its own group; the replica group size is ignored.
-
enumerator N
Number of values.
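To make the grouping concrete, a minimal sketch that splits 8 replicas into 4 consecutive pairs and returns one weight copy per pair:
C++:
// Groups {0,1}, {2,3}, {4,5}, {6,7} when the replication factor is 8.
popart::CommGroup pairs(popart::CommGroupType::Consecutive, 2);
popart::VariableSettings settings(pairs,
                                  popart::VariableRetrievalMode::OnePerGroup);
unsigned returned = settings.numReplicasReturningVariable(8);  // 4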
14.2. Data input and output (IStepIO)
#include <popart/istepio.hpp>
-
class IStepIO
An abstract base class through which input and output data is passed to a Session (see Session::run).
Data is passed via buffers. In the case of buffers returned by IStepIO::in, PopART reads from these buffers. In the case of IStepIO::out, PopART writes to these buffers. The IStepIO::inComplete() and IStepIO::outComplete() functions are called by PopART to signal it is done with an input or output buffer.
An IStepIO implementation should conceptually implement a rolling queue of active buffers for each input and output tensor. Every successful call to IStepIO::in should yield a new data buffer for PopART to read from and add it to the head of the conceptual queue. Conversely, every call to IStepIO::inComplete() should be taken to mean that the buffer at the tail-end of the queue is no longer being used by PopART. This buffer is removed from the conceptual queue.
Note that an IStepIO::in call with the `prefetch` flag set is only considered successful when it returns data. Output works analogously to input.
The expected total number of input (or output) buffers that are ‘completed’ for a tensor in one Session::run call is `bps` \(\times\) SessionOptions::accumulationFactor \(\times\) SessionOptions::replicatedGraphCount, where `bps` is the number of batches per call to Session::run (this is a value captured by the DataFlow instance passed to the Session instance). Note, however, that there may be additional ‘incomplete’ calls to IStepIO::in and IStepIO::out.
Furthermore, the number of input (or output) buffers that may be ‘incomplete’ at a given time for a given tensor should not normally be more than SessionOptions::bufferingDepth \(\times\) SessionOptions::replicatedGraphCount, but this bound is not guaranteed.
EXAMPLE: Suppose a session is configured such that the total expected number of input buffers is 6 and these are input buffers for a tensor with ID `t` with 100 elements. The associated input calls in IStepIO may look like this if SessionOptions::bufferingDepth is 3:
in("t", 100, false) -> Give buffer[0] to PopART.
in("t", 100, true) -> Give buffer[1] to PopART.
in("t", 100, true) -> Give buffer[2] to PopART.
inComplete("t", 100) -> buffer[0] is no longer required and can be reused.
in("t", 100, true) -> Give buffer[3] to PopART.
inComplete("t", 100) -> buffer[1] is no longer required and can be reused.
in("t", 100, true) -> Give buffer[4] to PopART.
inComplete("t", 100) -> buffer[2] is no longer required and can be reused.
in("t", 100, true) -> Give buffer[5] to PopART.
inComplete("t", 100) -> buffer[3] is no longer required and can be reused.
in("t", 100, true) -> No data available, return nullptr.
inComplete("t", 100) -> buffer[4] is no longer required and can be reused.
inComplete("t", 100) -> buffer[5] is no longer required and can be reused.
Subclassed by popart::StepIOCallback, popart::StepIOGeneric< ARRAY_TYPE, ACCESSOR_TYPE, ArrayInfoT >, popart::StepIOGeneric< IArray, StepIONS::IArrayAccessor, IArray & >
Public Functions
-
virtual ConstVoidData in(TensorId id, int64_t numElements, bool prefetch, const bool isBroadcast = false) = 0
Request a new input data buffer.
The memory in this buffer is available for use in PopART until the corresponding inComplete() call.
Note: Failing to provide a valid data buffer will result in a runtime failure if `prefetch` is set to `false`.
- Parameters
id – The ID of the tensor to return data for.
numElements – The number of elements in the tensor.
prefetch – If set to `true`, the inability to provide data is not considered an error. If `false`, it is considered an error if no data can be provided.
- Returns
The input buffer for this tensor (or nullptr on failure) returned as a ConstVoidData object.
-
virtual void inComplete(TensorId id, int64_t numElements, const bool isBroadcast = false) = 0
Notify the user (running a PopART program) that a previously retrieved input data buffer is no longer used by PopART.
- Parameters
id – The ID of the tensor to return data for.
numElements – The number of elements in the tensor.
-
virtual MutableVoidData out(TensorId id, int64_t numElements) = 0
Request a new output data buffer.
The memory in this buffer is available for use in PopART until the corresponding outComplete() call, and will be modified in-place.
Note
Failing to provide a valid data buffer will result in a runtime failure.
- Parameters
id – The ID of the tensor to return data for.
numElements – The number of elements in the tensor.
- Returns
The output buffer for this tensor returned as a MutableVoidData object.
-
inline virtual void outComplete(TensorId)
Notify the user (running a PopART program) that a previously retrieved output data buffer is no longer used by PopART.
- Parameters
id – The ID of the tensor the data buffer was for.
-
inline void enableRuntimeAsserts(bool b)
Enable or disable runtime asserts.
If runtime asserts are enabled, then a check that the input and output buffers have the correct number of elements is performed. As Session.run() is called multiple times during a user’s session, the check is only performed in the first call to Session.run(), under the assumption that the user is unlikely to change the size of buffers between runs.
- Parameters
b – The setting to enable runtime asserts (
true
) or disable runtime asserts (false
).
-
inline bool runtimeAssertsEnabled() const
Check if runtime asserts are enabled.
- Returns
`true` if runtime asserts are enabled, otherwise `false`.
-
virtual void assertNumElements(const popx::Executablex&) const = 0
Check number of elements.
This check is performed when runtimeAssertsEnabled() is `true`.
- Parameters
Executablex – The executable against which to check that the input and output buffers have the correct number of elements.
#include <popart/stepio.hpp>
-
class StepIO : public popart::StepIOGeneric<IArray, StepIONS::IArrayAccessor, IArray&>
Class to provide a Session object with input and output data.
-
class StepIOCallback : public popart::IStepIO
Class that implements the IStepIO interface using user-provided callback functions.
The IStepIO interface contains a number of pure virtual member functions through which PopART receives buffers to read data from and buffers to write data to. StepIOCallback inherits from IStepIO and implements those member functions by delegating the logic to the callback functions passed in the constructor. This gives the user full control as to how data buffers are provisioned.
See IStepIO for more details on the expected behaviour of the callbacks.
Public Types
-
using InputCallback = std::function<ConstVoidData(TensorId, bool)>
Callable object that implements IStepIO::in().
-
using InputCompleteCallback = std::function<void(TensorId)>
Callable object that implements IStepIO::inComplete().
-
using OutputCallback = std::function<MutableVoidData(TensorId)>
Callable object that implements IStepIO::out().
-
using OutputCompleteCallback = std::function<void(TensorId)>
Callable object that implements IStepIO::outComplete().
Public Functions
-
inline StepIOCallback(InputCallback inputCallback, InputCompleteCallback inputCompleteCallback, OutputCallback outputCallback, OutputCompleteCallback outputCompleteCallback)
Construct a StepIOCallback object.
- Parameters
inputCallback – The callback function the constructed StepIOCallback instance will use when IStepIO::in() is called. See IStepIO for details on how to implement this method.
inputCompleteCallback – The callback function the constructed StepIOCallback instance will use when IStepIO::inComplete() is called. See IStepIO for details on how to implement this method.
outputCallback – The callback function the constructed StepIOCallback instance will use when IStepIO::out() is called. See IStepIO for details on how to implement this method.
outputCompleteCallback – The callback function the constructed StepIOCallback instance will use when IStepIO::outComplete() is called. See IStepIO for details on how to implement this method.
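A hedged sketch of constructing a StepIOCallback from lambdas. The host buffers, tensor shape and data type are hypothetical; real code must keep each buffer valid until the matching complete callback fires:
C++:
#include <popart/stepio.hpp>
#include <vector>

std::vector<float> inBuf(100), outBuf(100);  // hypothetical host buffers

popart::StepIOCallback stepio(
    // IStepIO::in(): hand PopART a buffer to read from.
    [&](popart::TensorId id, bool prefetch) -> popart::ConstVoidData {
      popart::ConstVoidData d;
      d.data = inBuf.data();
      d.info = popart::TensorInfo(popart::DataType::FLOAT, {100});
      return d;
    },
    // IStepIO::inComplete(): the input buffer may now be reused.
    [](popart::TensorId id) {},
    // IStepIO::out(): hand PopART a buffer to write into.
    [&](popart::TensorId id) -> popart::MutableVoidData {
      popart::MutableVoidData d;
      d.data = outBuf.data();
      d.info = popart::TensorInfo(popart::DataType::FLOAT, {100});
      return d;
    },
    // IStepIO::outComplete(): the output data is ready to consume.
    [](popart::TensorId id) {});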
-
inline virtual void assertNumElements(const popx::Executablex&) const
Check number of elements.
This check is performed when IStepIO::runtimeAssertsEnabled() is `true`.
- Parameters
Executablex – The executable against which to check that the input and output buffers have the correct number of elements.
-
virtual ConstVoidData in(TensorId id, int64_t numElements, bool prefetch, bool) final
This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the `inputCallback` parameter passed to the constructor.
This function should not be called directly.
-
virtual void inComplete(TensorId id, int64_t numElements, bool) final
This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the `inputCompleteCallback` parameter passed to the constructor.
This function should not be called directly.
-
virtual MutableVoidData out(TensorId id, int64_t numElements) final
This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the `outputCallback` parameter passed to the constructor.
This function should not be called directly.
-
virtual void outComplete(TensorId id) final
This function is called by PopART when a StepIOCallback instance is passed to Session::run() and will internally call the `outputCompleteCallback` parameter passed to the constructor.
This function should not be called directly.
-
class IWeightsIO
A virtual class for accessing pointers to the data required to perform a training step.
Subclassed by popart::WeightsIO
Public Functions
-
virtual ~IWeightsIO() = default
Destructor for IWeightsIO.
-
virtual bool contains(TensorId) const = 0
Check if the WeightsIO instance contains the weights for a specific tensor.
- Parameters
TensorId – The ID of the tensor to look for weights for.
- Returns
`true` if the WeightsIO instance contains weights for the tensor, `false` otherwise.
-
virtual MutableVoidData weight(TensorId) const = 0
Retrieve weights for a specific tensor.
- Parameters
TensorId – The ID of the tensor to retrieve weights for.
- Returns
The weights.
-
class WeightsIO : public popart::IWeightsIO
Class representing weights.
Public Functions
-
virtual bool contains(TensorId) const final
Check if the WeightsIO instance contains the weights for a specific tensor.
- Parameters
TensorId – The ID of the tensor to look for weights for.
- Returns
`true` if the WeightsIO instance contains weights for the tensor, `false` otherwise.
-
virtual MutableVoidData weight(TensorId) const final
Retrieve weights for a specific tensor from the WeightsIO object.
- Parameters
TensorId – The ID of the tensor to retrieve weights for.
- Returns
The weights.
-
void insert(TensorId, MutableVoidData)
Insert weights for a specific tensor into the WeightsIO object.
- Parameters
TensorId – The ID of the tensor to insert weights for.
MutableVoidData – The weights to insert.
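A hedged sketch of reading trained weights back to the host; the tensor ID and shape are hypothetical, and the commented Session calls show the usual way the object is consumed:
C++:
#include <vector>

std::vector<float> hostWeights(512);  // hypothetical destination buffer

popart::MutableVoidData data;
data.data = hostWeights.data();
data.info = popart::TensorInfo(popart::DataType::FLOAT, {512});

popart::WeightsIO weightsIo;
weightsIo.insert("model.weight", data);  // hypothetical tensor ID

// session.weightsToHost();        // copy weights from the device
// session.readWeights(weightsIo); // fill hostWeights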
-
struct IArrayAccessor
Structure to help with accessing the data in IArray objects.
Public Static Functions
-
static inline void *getDataPointer(IArray &array)
Get pointer to the data.
- Parameters
array – The IArray object.
- Returns
A pointer to the data contained in the IArray object.
-
static inline size_t getArraySize(const IArray &array)
Get the number of data elements.
- Parameters
array – The IArray object.
- Returns
The number of data elements.
-
static inline DataType getArrayDataType(IArray &array)
Get the data type of the data.
- Parameters
array – The IArray object.
- Returns
The data type of the data.
#include <popart/stepio_generic.hpp>
-
template<typename ARRAY_TYPE, typename ACCESSOR_TYPE, typename ArrayInfoT>
class StepIOGeneric : public popart::IStepIO
Subclassed by popart::StepIO
Public Functions
-
inline void assertNumElements(const popx::Executablex &exe) const final
-
inline TensorInfo getTensorInfo(ARRAY_TYPE &array) const
-
template<typename T>
inline T get(TensorId id, std::map<TensorId, ArrayInfo> &M, int64_t numElements, bool advance_, std::string mapName)
-
template<typename T>
inline void advance(TensorId id, std::map<TensorId, ArrayInfo> &M, int64_t numElements, std::string mapName)
-
inline ConstVoidData in(TensorId id, int64_t numElements, bool, bool) final
-
inline MutableVoidData out(TensorId id, int64_t numElements) final
-
struct ArrayInfo
#include <popart/iarray.hpp>
-
class IArray
Subclassed by popart::NDArrayWrapper< T >
14.3. Tensors
#include <popart/tensor.hpp>
-
class Tensor : public popart::Vertex
Public Functions
-
Tensor(TensorId, TensorType, Graph&, const DebugContext& = {})
-
Tensor(TensorId, VariableSettings, Graph&, const DebugContext& = {})
-
Tensor(TensorId, TensorType, VariableSettings, Graph&, const DebugContext& = {})
-
inline std::string str() const final
-
TensorType tensorType() const
-
std::string tensor_type() const
-
void setTensorType(TensorType)
-
inline ReplicatedStreamMode getReplicatedStreamMode() const
-
inline void setReplicatedStreamMode(const ReplicatedStreamMode &mode)
-
void setTensorLocationInfo(TensorLocation&, std::pair<RemoteBufferId, RemoteBufferIndex> &remoteBufferInfo)
-
std::set<PipelineStage> getPipelineStages() const
-
bool hasProducer() const
-
bool isGraphInput() const
-
bool isGraphOutput() const
-
bool isLoopInput() const
-
bool isImplicitLoopInput() const
-
bool isExplicitLoopInput() const
-
bool isLoopTripCounter() const
-
bool isUnmodifiable() const
-
bool isCheckpointTensor() const
-
bool isImplicitRecomputeTensor() const
-
bool isRestoreInplaceTensor() const
-
bool idIncludesPrefix(const std::vector<std::string>&) const
-
bool isOptimizerTensor() const
-
bool isRemoteArgTensor() const
-
bool isRandomSeedTensor() const
-
bool isOptimizerStateTensor() const
-
bool isAccumulatorTensor() const
-
bool isHostLoadTensor() const
Is this tensor produced by a HostLoad Op or MultiExchangeOp with HostLoad descriptor?
- Returns
true if the producer is a HostLoad Op or a MultiExchangeOp with a HostLoad descriptor, false otherwise.
-
bool isWeightTensor() const
-
bool isAnchored() const
-
bool isRootAnchor() const
-
bool hasTensorData() const
-
TensorData *tensorData()
-
const TensorData *tensorData() const
-
bool anyAliasFor(std::function<bool(Tensor*)> predicate, const AliasModel &popMem) const
-
void setTensorDataFromCopyOf(const void *src, std::size_t size)
-
void setTensorDataFromViewOf(void *src, std::size_t size)
-
void setTensorDataByEmplaceOf(std::vector<char> &&data)
-
void setTensorData(const TensorData &td)
-
void setTensorData(TensorData &&td)
-
bool hasVirtualGraphId() const
-
VGraphIdAndTileSet getVirtualGraphIdAndTileSet(std::set<OpId> &visited) const
-
VGraphIdAndTileSet getVirtualGraphIdAndTileSetUnsafe() const
-
VGraphIdAndTileSet getVirtualGraphIdAndTileSetUnsafe(std::set<OpId> &visited) const
-
int getBatchAxis() const
-
bool consumersAllPreLoss() const
-
bool isModified(bool considerLoopInput = true) const
Check if any of the consumers modify this tensor.
- Parameters
considerLoopInput – If explicit loop inputs should be considered as being modified. If false, only operations modifying the tensor inplace will be considered.
- Returns
True if the tensor is modified, otherwise false.
-
bool isAliased() const
Check if any of the consumers alias this tensor.
- Returns
True if the tensor is aliased to any output, otherwise false.
-
std::set<Op*, POpCmp> getInplaceModifiers() const
Find operations that modify a tensor.
- Returns
All operations that (directly and indirectly) modify this tensor.
-
std::set<Op*, POpCmp> getInplaceModifiersFor(const AliasModel *popMem) const
Find operations that modify a tensor with the given poprithm graph.
- Returns
All operations that (directly and indirectly) modify this tensor.
-
std::vector<char> getDataViaGraphTraversal() const
-
inline void setVariableUpdateType(VariableUpdateType type)
Member functions inherited from the former subclass VariableTensor.
-
inline VariableUpdateType getVariableUpdateType() const
-
inline VariableSettings getVariableSettings() const
- Returns
The VariableSettings of this Variable
-
std::vector<int64_t> returnedShape(unsigned replicationFactor)
Returns the shape necessitated by IO.
- Parameters
replicationFactor – The replication factor
- Returns
The shape of the tensor, considering replica groups.
-
void verifyMutableVoidInfo(const TensorInfo mutableVoidInfo, unsigned replicationFactor)
Check that the info of a mutableVoidData object matches the expectations set by the TensorInfo and VariableSettings.
Throws an error if there is a mismatch.
- Parameters
mutableVoidInfo – The TensorInfo of the MutableVoidData object with the same ID as this tensor.
replicationFactor – The replication factor of this instance.
-
void setPreparedVGraphIdAndTileSet()
Set the preparedVGraphIdAndTileSet.
Public Members
-
Consumers consumers
-
TensorInfo info
-
TensorLocationInfo tensorLocationInfo
-
InputSettings inputSettings
-
enum class popart::TensorType
Values:
-
enumerator ActGrad = 0
-
enumerator Const
-
enumerator Stream
-
enumerator Unknown
-
enumerator Variable
-
enumerator N
-
enum class popart::VariableUpdateType
Values:
-
enumerator None = 0
-
enumerator Gradient
-
enumerator Copy
#include <popart/tensorinfo.hpp>
-
enum class popart::DataType
There is a one-to-one correspondence between `popart::DataType` and `ONNX_NAMESPACE::TensorProto_DataType`, which is equivalent to `decltype(ONNX_NAMESPACE::TensorProto().data_type())`.
Values:
-
enumerator UINT8 = 0
-
enumerator INT8
-
enumerator FLOAT8_143
-
enumerator FLOAT8_152
-
enumerator UINT16
-
enumerator INT16
-
enumerator INT32
-
enumerator INT64
-
enumerator UINT32
-
enumerator UINT64
-
enumerator BOOL
-
enumerator FLOAT
-
enumerator FLOAT16
-
enumerator BFLOAT16
-
enumerator DOUBLE
-
enumerator COMPLEX64
-
enumerator COMPLEX128
-
enumerator STRING
-
enumerator UNDEFINED
-
class DataTypeInfo
-
class TensorInfo
Public Functions
-
TensorInfo(DataType, const Shape&)
Create a TensorInfo object based on data type and shape.
- Parameters
data_type – The data type.
shape – The actual shape of the tensor.
-
TensorInfo(DataType data_type, const Shape &shape, const Shape &meta_shape)
Create a TensorInfo object based on data type, shape and meta shape.
- Parameters
data_type – The data type.
shape – The actual shape of the tensor.
meta_shape – The meta shape of the tensor, which can for example be used to store the original tensor shape before replicated tensor sharding was applied.
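A minimal sketch of constructing a TensorInfo and querying it with the accessors documented below:
C++:
popart::TensorInfo info(popart::DataType::FLOAT16, popart::Shape{64, 128});
int64_t n = info.nelms();   // 8192 elements
int64_t b = info.nbytes();  // 16384 bytes (2 bytes per FLOAT16 element)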
-
TensorInfo(std::string data_type, std::string shape)
-
explicit TensorInfo(const ONNX_NAMESPACE::TensorProto&)
-
explicit TensorInfo(const ONNX_NAMESPACE::TypeProto&)
-
void set(const ONNX_NAMESPACE::TensorProto&)
-
void set(const ONNX_NAMESPACE::TypeProto&)
-
TensorInfo() = default
-
std::vector<size_t> shape_szt() const
-
inline int64_t nelms() const
-
int64_t nbytes() const
-
inline int64_t dim(int i) const
-
inline std::vector<int> strides(const std::vector<long> &shape)
Get the strides of the tensor, that is the number of bytes to step in each dimension when traversing an array in memory.
See https://numpy.org/doc/stable/reference/generated/numpy.ndarray.strides.html
- Parameters
shape – The on-host ONNX shape of a tensor. This is different from this->shape(), which gives the on-replica shape of a tensor.
- Returns
std::vector<int> The strides vector.
-
const std::string &data_type() const
-
const std::string &data_type_lcase() const
-
void append(std::ostream&) const
-
bool isSet() const
-
bool operator==(const TensorInfo&) const
-
bool operator!=(const TensorInfo&) const
-
ONNX_NAMESPACE::TypeProto getOnnxTypeProto() const
-
const DataTypeInfo *getDataTypeInfo() const
Public Static Functions
-
static std::string npOutDataTypeExceptionMessage(const TensorInfo &i0, const TensorInfo &i1, const std::string &debugName)
#include <popart/tensorindex.hpp>
-
class TensorIndexMap
Public Functions
-
TensorIndexMap() = default
-
~TensorIndexMap()
-
void erase(int)
-
void clear()
-
bool hasIndex(int) const
-
const std::map<Tensor*, std::vector<int>, PTensorCmp> &indicesMap() const
-
int n() const
-
void append(std::stringstream&, std::string prefix, int max_id_length) const
-
void setInfoIfIndex(const TensorInfo&, int index)
-
int maxIdLength() const
-
int minIndex() const
-
int maxIndex() const
#include <popart/tensorlocation.hpp>
-
enum class popart::ReplicatedTensorSharding
Enum type to specify whether to shard tensors over replicas.
Values:
-
enumerator Off = 0
Don’t shard tensors over replicas.
-
enumerator On = 1
Do shard tensors over replicas.
-
enumerator N = 2
Number of values.
-
class TensorLocation
Class that describes the memory characteristics of one or multiple tensors.
See also: SessionOptions.
Public Functions
-
TensorLocation()
Equivalent to calling TensorLocation(TensorStorage::Undefined, TileSet::Compute, TileSet::Compute, ReplicatedTensorSharding::Off)
-
TensorLocation(TensorStorage storage)
Equivalent to calling TensorLocation(storage, TileSet::Compute, TileSet::Compute, ReplicatedTensorSharding::Off)
-
TensorLocation(TensorStorage storage, ReplicatedTensorSharding replicatedTensorSharding)
Equivalent to calling TensorLocation(storage, TileSet::Compute, TileSet::Compute, replicatedTensorSharding)
-
TensorLocation(TensorStorage storage, ReplicatedTensorSharding replicatedTensorSharding, CommGroup shardingDomain)
Equivalent to calling TensorLocation(storage, TileSet::Compute, TileSet::Compute, replicatedTensorSharding, shardingDomain)
-
TensorLocation(TensorStorage storage, TileSet loadTileSet, TileSet storageTileSet, ReplicatedTensorSharding replicatedTensorSharding)
Construct a TensorLocation from parameters.
- Parameters
storage – The memory location of the tensor(s).
loadTileSet – The tiles through which the tensor(s) are loaded onto the chip.
storageTileSet – The tiles on which the tensor(s) are stored.
replicatedTensorSharding – Whether to apply replicated tensor sharding.
-
TensorLocation(TensorStorage storage, TileSet loadTileSet, TileSet storageTileSet, ReplicatedTensorSharding replicatedTensorSharding, CommGroup shardingDomain)
Construct a TensorLocation from parameters.
- Parameters
storage – The memory location of the tensor(s).
loadTileSet – The tiles through which the tensor(s) are loaded onto the chip.
storageTileSet – The tiles on which the tensor(s) are stored.
replicatedTensorSharding – Whether to apply replicated tensor sharding.
shardingDomain – GCL communication group across which to shard the tensor. Perpendicular replicas will not shard, and reduce gradients normally (via AllReduce). Defaults to sharding across all replicas.
-
TensorLocation(std::vector<int64_t> serialized)
-
bool operator==(const TensorLocation &rhs) const
-
bool operator!=(const TensorLocation &rhs) const
-
std::vector<int64_t> serialize() const
-
bool isRemote() const
Public Members
-
TensorStorage storage
The memory location of the tensor(s).
-
ReplicatedTensorSharding replicatedTensorSharding
Whether to apply replicated tensor sharding (RTS) or not.
-
enum class popart::TensorStorage
Enum type that determines where a tensor is stored.
Values:
-
enumerator OnChip = 0
Store the tensor in on-chip memory.
-
enumerator OffChip = 1
Store the tensor in streaming memory.
-
enumerator N = 2
Number of values.
-
enum class popart::TileSet
Enum type to specify a set of tiles.
Values:
-
enumerator Compute = 0
The set of tiles designated for compute operations.
-
enumerator IO = 1
The set of tiles designated for IO operations.
-
enumerator Undefined = 2
Undefined (no) tile set.
-
enumerator N = 3
Number of values.
14.4. Optimizers
#include <popart/optimizer.hpp>
-
class Optimizer
Interface for describing an Optimizer and, internally, how to grow the optimiser step for each weight.
The end-user facing interface constructed by the user to describe what kind of optimiser to use.
Then also used internally by the Ir to grow the optimiser step for each weight.
Stores OptimizerValues for optimizer parameters like learning rate, loss scaling, etc.
See also
OptimizerValue.
Optimizer stores the values for each weight - they can have different values. There is a “default” for all weights, then you can specify specific values for specific weights. This is encapsulated by an OptimizerValueMap, which is a sparse map from weight to value, with unspecified values implying the default.
See also
OptimizerValueMap.
At runtime, the user can dynamically update the Optimizer, e.g. by setting new OptimizerValues. validReplacement determines whether the new Optimizer is interchangeable with the one the Ir was built for. For example, trying to replace an SGD Optimizer with an Adam Optimizer would throw.
Subclassed by popart::Adam, popart::Adaptive, popart::SGD
Public Functions
-
virtual ~Optimizer() = default
The Optimizer class has a two-part initialisation: the constructor, used by the end-user, and setFactorsFromOptions, called by the Ir to finish initialisation once all the relevant information is available during Ir preparation.
Some key methods used by the Ir to grow optimiser step for each weight are createOp, getInputIds, optimizerInputs.
If the OptimizerValue is const, no Ir tensor for that value is created and the VarUpdateOp created for that weight will not have the optional input for that tensor. The Opx of the VarUpdateOp will emit poplar code that uses the provided value directly.
If the OptimizerValue is not const, an Ir tensor for that value is created and the VarUpdateOp created for that weight will have the optional input for that tensor. The tensor will be a stream tensor, so that it can be updated later from the host. The tensor will be streamed an initial value equal to the OptimizerValue's value.
It is common for Optimizer implementations to make use of “compound scalars”. Take for example the SGD0 weight update equation:
w <- w * (1 - lr * (1 - dm) * wd) - g * (lr * (1 - dm) / ls)
where w is the weights and g is the grads. lr, dm, wd and ls are the “atomic scalars”. These are the scalars/hyperparameters of the Optimizer that the user can set using OptimizerValues, as described above.
Multiple atomic scalars appear in expressions together, and will be operated on together before being used by an Op that also consumes a tensor (in this case the weights or grads). For SGD0, they can be grouped as follows:
w <- w * {1 - lr * (1 - dm) * wd} - g * {lr * (1 - dm) / ls}
         ^^^^^^^^^^^^^^^^^^^^^^^^       ~~~~~~~~~~~~~~~~~~~~
         weight decay scale factor 0    scaled learning rate 0
We call wdsf0 and slr0 the “compound scalars”.
We can statically precompute the OptimizerValues for these compound scalars using the OptimizerValues of the atomic scalars. This makes the Ir simpler, as we now have only:
w <- w * wdsf0 - g * slr0
The CompoundScalarHelpers are used to precompute the compound scalar values.
If any of the composite atomic scalars are non-const, the compound scalar is non-const.
See also
compoundscalarhelper.hpp
-
Optimizer(OptimizerValue lossScaling, const std::vector<ClipNormSettings> &clipNormSettings, const DebugContext &debugContext)
-
virtual OptimizerType type() const = 0
-
virtual std::string type_s() const = 0
-
virtual std::vector<TensorId> getInputIds(const Tensor &weight) const = 0
Returns the TensorIds of the input tensors to the VarUpdateOp this optimiser will create for the given `weight`.
Specifically, the TensorId at index i will be the id of the input tensor at InIndex i of the VarUpdateOp. If the input is an OptimizerValue and it is const, then "" will be returned; otherwise the relevant reserved prefix for that OptimizerValue will be used, followed by the weight id. The prefixes are defined in tensornames.hpp, for example `reservedDefaultWeightDecayScaleFactor0Prefix` or `reservedSpecificScaledLearningRate1Prefix` (note there are different prefixes depending on whether the weight has a specific or default value for that OptimizerValue).
-
virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const = 0
-
inline const OptimizerValue &lossScaling() const
-
inline float getLossScalingVal() const
-
float getFinalLossScalingVal() const
-
virtual void setFactorsFromOptions(const SessionOptions&)
-
bool gradientAccumulationEnabled() const
-
bool meanReductionEnabled() const
-
bool postMeanAccumulationEnabled() const
-
bool postMeanReplicationEnabled() const
-
int64_t getReplicatedGraphCount() const
-
int64_t getAccumulationFactor() const
-
bool meanGradientAccumulationEnabled() const
-
inline const std::vector<ClipNormSettings> &getClipNormSettings() const
-
virtual bool hasSpecific() const = 0
-
virtual size_t hash() const
-
inline DebugContext getDebugContext() const
-
enum class popart::OptimizerType
Types of optimizers.
Values:
-
enumerator SGD = 0
-
enumerator Adam
-
enumerator Adaptive
-
enumerator NTYPES
-
enum class popart::OptimizerReductionType
Reduction mode when doing data-parallel training over replicated graphs.
Depending on the optimizer used and its configuration, this option describes how the reduction of gradients over replicas will occur. For example, directly on the gradient, on the gradient accumulator, or on the momentum. See the documentation of individual optimizers for more information.
Values:
-
enumerator None = 0
No replicated graph reduction.
-
enumerator GradReduce
Gradient reduction (every iteration, after a weight’s gradient is produced)
-
enumerator AcclReduce
Momentum reduction (SGD1, after the gradient accumulation loop, if applicable)
-
enumerator AccumReduce
Accumulator reduction (Adam/SGD2 + gradient accumulation, after the gradient accumulation loop)
#include <popart/optimizervalue.hpp>
-
class OptimizerValue
A class used to represent values of hyper parameters.
Public Functions
-
OptimizerValue() = default
Equivalent to OptimizerValue(0, false).
-
inline OptimizerValue(float v)
Equivalent to OptimizerValue(v, true).
-
inline OptimizerValue(float v, bool c)
Constructor.
- Parameters
v – The current value of the hyper parameter.
c – A boolean flag to indicate whether the parameter will remain at this value forever (`true`) or may change over time (`false`).
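A minimal sketch of the const/non-const distinction: a non-const value can be replaced at runtime, while a const value is baked into the compiled program:
C++:
popart::OptimizerValue learningRate(0.1f, false);  // may be updated from the host
popart::OptimizerValue weightDecay(1e-4f, true);   // fixed for the model's lifetime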
-
inline OptimizerValue(std::pair<float, bool> x)
-
inline float val() const
-
inline bool isConst() const
-
void validReplacement(const OptimizerValue &rhs) const
-
bool operator==(const OptimizerValue &rhs) const
#include <popart/optimizervaluemap.hpp>
-
class OptimizerValueMap
Public Functions
-
inline OptimizerValueMap(OptimizerValue g)
-
OptimizerValue get(const TensorId &id) const
-
void insertSpecific(const TensorId&, OptimizerValue)
-
inline bool hasSpecific() const
-
inline OptimizerValue getDefault() const
-
void validReplacement(const OptimizerValueMap &rhs) const
-
inline const std::map<TensorId, OptimizerValue> &getSpecifics() const
14.4.1. Stochastic Gradient Descent (SGD)
#include <popart/clipnormsettings.hpp>
-
class ClipNormSettings
A data structure used to represent a maximum value constraint on one or more weights.
This is passed to the optimizer on construction.
Public Functions
-
ClipNormSettings(const std::vector<TensorId> &weightIds_, float maxNorm_)
DEPRECATED: This will be removed in a future release.
Constructor.
- Parameters
weightIds_ – The weight tensor IDs that this constraint applies to.
maxNorm_ – The maximum permissible value.
-
float getMaxNorm() const
-
bool operator==(const ClipNormSettings&) const
-
bool operator!=(const ClipNormSettings &other) const
Public Static Functions
-
static ClipNormSettings clipWeights(const std::vector<TensorId> &weightIds_, float maxNorm_)
-
static ClipNormSettings clipAllWeights(float maxNorm_)
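A minimal sketch using the static helper to clip the norm of all weights, then handing the settings to an optimizer (the other optimizer arguments are elided):
C++:
auto clip = popart::ClipNormSettings::clipAllWeights(1.0f);
std::vector<popart::ClipNormSettings> clipNorms{clip};
// popart::SGD sgd(..., clipNorms, ...);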
#include <popart/sgd.hpp>
-
class SGD : public popart::Optimizer
Stochastic Gradient Descent (SGD) optimizer.
Like any optimizer implementation, this class is responsible for updating each weight tensor ( \(w\)) in the model using the gradient ( \(g\)) of the loss function with respect to the weight as calculated during the backwards pass.
The SGD optimizer has the following state for each weight:
velocity ( \(v\))
The SGD optimizer has the following hyper parameters:
learning rate ( \(\text{lr}\))
momentum ( \(\text{mm}\))
weight decay ( \(\text{wd}\))
dampening ( \(\text{dm}\))
velocity scaling ( \(\text{vs}\))
loss scaling ( \(\text{ls}\))
nesterov
clip norm settings
The values of these parameters can be shared between all weights but some can be overridden with weight-specific values (see SGD::insertSpecific). Hyper parameters are captured using OptimizerValue objects and therefore can be either a constant value or a non-constant value that can be adjusted by the user.
In the following we will describe how this optimizer updates a weight using a gradient. In the context of this description, the gradient is the value of the gradient after any gradient accumulation has been performed and after the application of a loss scaling factor to the gradient has been corrected for.
When the optimizer needs to update a weight, \(w\), using a gradient, \(g\), it first updates the optimizer state as follows:
\[ v' := v * \text{mm} + (1 - \text{dm}) * (g + \text{wd} * w) \text{ \ . } \]
Following the update of the optimizer state, the optimizer uses said state to update the weight:
if nesterov is True:
\[ g' := g + \text{wd} * w + \text{mm} * v' \text{ \ . } \]
\[ w' := w - \text{lr} * g' \text{ \ . } \]
else:
\[ w' := w - \text{lr} * v' \text{ \ . } \]
In addition to the above, the velocity scaling hyper parameter is a scaling factor that can provide improved numerical stability by ensuring the values stored in the optimizer state, \(v\), are scaled by this value. When using this parameter, PopART will automatically deal with the artificially scaled velocity value during the weight update (and other hyper parameters do not need to be adjusted).
In addition, the loss scaling hyper parameter is similar in nature to the velocity scaling parameter. It is a scaling value that is applied to the loss gradient at the start of the backwards pass and, at the end of the backwards pass, this scaling is reversed by multiplying the gradients for each weight with the inverse of the loss scaling value prior to updating the optimizer state. Using loss scaling can also improve numerical stability in some cases.
Finally, it is possible to add clip norm settings for this optimizer. These clip norms compute the L2 norm for a group of weights and adds a scalar term to the weight update that effectively divides it by the norm (or a constant value that is provided as part of the clip norm, which ever is greater).
See the SGD notes in optimizer.hpp for a more detailed and comprehensive derivation of the SGD optimizer step in PopART.
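For illustration, a minimal sketch of constructing this optimizer with the OptimizerValue-based constructor documented below; the hyper parameter values are arbitrary, and the boolean in each OptimizerValue marks the value as constant (true) or adjustable at runtime (false):
#include <popart/sgd.hpp>

popart::SGD sgd(
    /*defaultLearningRate=*/{0.01f, false},  // adjustable after construction
    /*defaultWeightDecay=*/{1e-4f, true},    // constant
    /*defaultMomentum=*/{0.9f, true},
    /*defaultDampening=*/{0.0f, true},
    /*defaultVelocityScaling=*/{1.0f, true},
    /*lossScaling=*/{128.0f, true});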
Subclassed by popart::ConstSGD
Public Functions
-
SGD(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultMomentum, OptimizerValue defaultDampening, OptimizerValue defaultVelocityScaling, OptimizerValue lossScaling, OptimizerValue nesterov, const std::vector<ClipNormSettings> &clipNormSettings = {}, SGDAccumulatorAndMomentum sgdAccMm = SGDAccumulatorAndMomentum::Combined, DataType accumType = DataType::UNDEFINED, DataType accl1Type = DataType::UNDEFINED, const DebugContext &debugContext = {})
Constructor.
See also
SGDAccumulatorAndMomentum. Defaults to SGDAccumulatorAndMomentum::Combined.
- Parameters
defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultMomentum – The momentum value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultDampening – The dampening value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultVelocityScaling – The velocity scaling value to use for weights for which no weight-specific hyper parameters have been inserted.
lossScaling – The loss scaling value to use.
nesterov – Option to enable Nesterov momentum. Defaults to false.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
sgdAccMm – The implementation strategy to use when gradient accumulation and/or momentum are used, otherwise ignored.
accumType – The DataType of the accum tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
accl1Type – The DataType of the accl1 tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
debugContext – Optional debug context.
-
SGD(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultMomentum, OptimizerValue defaultDampening, OptimizerValue defaultVelocityScaling, OptimizerValue lossScaling, const std::vector<ClipNormSettings> &clipNormSettings = {}, SGDAccumulatorAndMomentum sgdAccMm = SGDAccumulatorAndMomentum::Combined, DataType accumType = DataType::UNDEFINED, DataType accl1Type = DataType::UNDEFINED, const DebugContext &debugContext = {})
Constructor.
See also
SGDAccumulatorAndMomentum. Defaults to SGDAccumulatorAndMomentum::Combined.
- Parameters
defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultMomentum – The momentum value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultDampening – The dampening value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultVelocityScaling – The velocity scaling value to use for weights for which no weight-specific hyper parameters have been inserted.
lossScaling – The loss scaling value to use.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
sgdAccMm – The implementation strategy to use when gradient accumulation and/or momentum are used, otherwise ignored.
accumType – The DataType of the accum tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
accl1Type – The DataType of the accl1 tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
debugContext – Optional debug context.
-
SGD(const std::map<std::string, std::pair<float, bool>> ¶ms, const std::vector<ClipNormSettings> &clipNormSettings = {}, SGDAccumulatorAndMomentum sgdAccMm = SGDAccumulatorAndMomentum::Combined, DataType accumType = DataType::UNDEFINED, DataType accl1Type = DataType::UNDEFINED, const DebugContext &debugContext = {})
Constructor.
EXAMPLE:
SGD({{"defaultLearningRate", {0.02, false}}, {"defaultMomentum", {0.6, true}}});
See also
SGDAccumulatorAndMomentum. Defaults to SGDAccumulatorAndMomentum::Combined.
This will create an SGD optimizer which has a constant momentum of 0.6 and a changeable learning rate initially of 0.02. All OptimizerValues not present in the map will take values from the `getUnset*` functions.
- Parameters
params – A parameter map where the keys are one or more of "defaultLearningRate", "defaultWeightDecay", "defaultMomentum", "defaultDampening", "defaultVelocityScaling", "lossScaling" or "nesterov". The map's values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter because default values will be used where parameters are missing.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
sgdAccMm – The implementation strategy to use when gradient accumulation and/or momentum are used, otherwise ignored.
accumType – The DataType of the accum tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
accl1Type – The DataType of the accl1 tensor, when gradient accumulation is used and sgdAccMm = SGDAccumulatorAndMomentum::Separate, otherwise ignored. Only FLOAT, FLOAT16 and UNDEFINED are supported. Defaults to UNDEFINED. If UNDEFINED, the same type as the weights will be used. If accumType is FLOAT16 and accl1Type is FLOAT, this parameter causes accum to be upcasted before being passed to the op that updates accl1.
debugContext – Optional debug context.
-
inline SGD()
Default constructor. Creates SGD with default scalars (equivalent to the getUnset<scalar>() methods) and the other default parameters of the main constructor.
-
~SGD() = default
-
inline virtual OptimizerType type() const final
-
inline virtual std::string type_s() const final
-
inline SGDAccumulatorAndMomentum getSGDAccumulatorAndMomentum() const
-
virtual std::unique_ptr<Op> createOp(const Tensor &weight, Graph&) const final
Returns the VarUpdateOp for the given `weight`.
If there is no gradient accumulation or momentum, this will be an SGD0VarUpdateOp. Else, if `getSGDAccumulatorAndMomentum() == ::Combined`, this will be an SGD1ComboOp; else, if `getSGDAccumulatorAndMomentum() == ::Separate`, an SGD2ComboOp.
The required compound scalar OptimizerValues for the VarUpdateOp will be computed and passed to the Op. See the SGD notes above this class for how they are derived. Recall that if non-const, the VarUpdateOp will take an input Tensor for the compound scalar.
See also
Optimizer::createOp
The OptimizerReductionType of the Op is derived as follows:
No replication => None
Replication, no grad acc => GradReduce
Replication, grad acc, SGD1 => AcclReduce
Replication, grad acc, SGD2 => AccumReduce
See the SGD notes above this class for why this is.
If SGD2, the DataType of the accum and accl1 tensors passed to the SGD2ComboOp will be as set in the SGD constructor. Recall DataType::UNDEFINED means use the same as the weight.
An SGD1ComboOp will later be decomposed by the SGD1Decompose pattern into a series of Ops and Tensors that implement the SGD1 optimiser step.
An SGD2ComboOp will later be decomposed by the SGD2Decompose pattern into a series of Ops and Tensors that implement the SGD2 optimiser step.
-
virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const final
smm1 and wdsf0 have the same data type as the `weight`. Everything else
-
float getStoredValue(const TensorId &optId) const
Tensor “opt” has an id, based on which it matches a compound scalar that this object can compute from the atomic scalars.
-
void insertSpecific(const TensorId &weight, OptimizerValue learningRate, OptimizerValue weightDecay, OptimizerValue momentum, OptimizerValue dampening, OptimizerValue velocityScaling, OptimizerValue nesterov)
Insert a weight-specific set of hyper parameters.
- Parameters
weight – The TensorId of the weight.
learningRate – The learning rate value to use for this specific weight.
weightDecay – The weight decay value to use for this specific weight.
momentum – The momentum value to use for this specific weight.
dampening – The dampening value to use for this specific weight.
velocityScaling – The velocity scaling value to use for this specific weight.
nesterov – Option to enable Nesterov momentum. Defaults to false.
-
void insertSpecific(const TensorId &weight, const std::map<std::string, std::pair<float, bool>> ¶ms)
Insert a weight-specific set of hyper parameters.
- Parameters
weight – The TensorId of the weight.
params – A parameter map where keys are one of "learningRate", "weightDecay", "momentum", "dampening", or "velocityScaling" and the map's values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.
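Continuing the construction sketch above, a hedged example of using this overload (the tensor ID "weight_0" is a hypothetical placeholder; key names follow the list above):
// Give one weight its own constant learning rate and momentum; the
// remaining hyper parameters keep the optimizer-wide defaults.
sgd.insertSpecific("weight_0",
                   {{"learningRate", {0.001f, true}},
                    {"momentum", {0.8f, true}}});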
-
virtual bool hasSpecific() const final
-
inline const OptimizerValueMap &learningRates() const
-
inline const OptimizerValueMap &weightDecays() const
-
inline const OptimizerValueMap &momentums() const
-
inline const OptimizerValueMap &dampenings() const
-
inline const OptimizerValueMap &velocityScalings() const
-
inline const OptimizerValueMap &nesterov() const
-
virtual size_t hash() const
Public Static Functions
-
static inline OptimizerValue getUnsetLearningRate()
Default learning rate value.
-
static inline OptimizerValue getUnsetWeightDecay()
Default weight decay value.
-
static inline OptimizerValue getUnsetMomentum()
Default momentum value.
-
static inline OptimizerValue getUnsetDampening()
Default dampening value.
-
static inline OptimizerValue getUnsetVelocityScaling()
Default velocity scaling value.
-
static inline OptimizerValue getUnsetLossScaling()
Default loss scaling value.
-
static inline OptimizerValue getUnsetNesterov()
Default nesterov.
-
static SGD fromDefaultMap(const std::map<std::string, OptimizerValue>&, const DebugContext &debugContext = {})
-
class ConstSGD : public popart::SGD
Stochastic Gradient Descent (SGD) optimizer with constant learning rate, weight decay, loss scaling and clip norm settings (and default values for momentum, dampening and velocity scaling).
NOTE: See SGD for the detailed meaning of these parameters.
NOTE: This class exists for backwards compatibility with the Python API and may be removed at some point in the future.
Public Functions
-
inline ConstSGD(float learningRate, float weightDecay = 0, float lossScaling = 1, const std::vector<ClipNormSettings> &clipNormSettings = {})
Constructor.
- Parameters
learningRate – A constant learning rate.
weightDecay – A constant weight decay value.
lossScaling – A constant loss scaling value.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
14.4.2. Adam, AdaMax & Lamb
#include <popart/adam.hpp>
-
enum class popart::AdamMode
Enum type describing the mode of an Adam optimizer instance.
Values:
-
enumerator Adam = 0
Adam or AdamW mode, depending on weight decay setting (see Kingma & Ba, 2015 and Loshchilov & Hutter, 2018).
-
enumerator AdaMax
AdaMax mode.
-
enumerator Lamb
Lamb mode (see You et al., 2020).
-
enumerator LambNoBias
Like Lamb but without bias correction.
-
class Adam : public popart::Optimizer
AdamW, Lamb and AdaMax optimizer implementation.
Like any optimizer implementation, this class is responsible for updating each weight tensor ( \(w\)) in the model using the gradient ( \(g\)) of the loss function with respect to the weight as calculated during the backwards pass.
The optimizer has the following state for each weight:
first-order momentum ( \(m\))
second-order momentum ( \(v\))
time step ( \(t\))
The optimizer has the following hyper parameters:
learning rate ( \(\text{lr}\))
weight decay ( \(\text{wd}\))
beta1 ( \(\beta_1\))
beta2 ( \(\beta_2\))
epsilon ( \(\epsilon\))
loss scaling ( \(\text{ls}\))
maximum weight norm ( \(\text{mwn}\))
The values of these parameters can be shared between all weights but some can be overridden with weight-specific values (see Adam::insertSpecific). Hyper parameters are captured using OptimizerValue objects and therefore can be either a constant value or a non-constant value that can be adjusted by the user.
The values of AdamMode and WeightDecayMode passed to the constructor determine how weights are updated (see below).
In the following we will describe how this optimizer updates a weight using a gradient. In the context of this description, the gradient is the value of the gradient after any gradient accumulation has been performed and after the loss scaling factor applied to the gradient has been corrected for.
When the optimizer needs to update a weight, \(w\), using a gradient, \(g\), it first computes a term \(g_\text{tmp}\), which is effectively \(g\) with L2 regularization applied if the WeightDecayMode is set to WeightDecayMode::L2Regularization, as follows:
\[\begin{split} g_\text{tmp} := \left\{\begin{aligned} g & \text{ \; (Decay) } \\ (g + \text{wd} * w) & \text{ \; (L2Regularization) \; . } \\ \end{aligned}\right.\\ \end{split}\]Secondly, the optimizer updates the optimizer state as follows:
\[\begin{split} m' &:= \beta_1 * m + (1 - \beta_1) * g_\text{tmp} \\ v' &:= \left\{\begin{aligned} \beta_2 * v + (1 - \beta_2) * g_\text{tmp}^2 & \text{ \; (Adam/AdamNoBias) } \\ \beta_2 * v + (1 - \beta_2) * g_\text{tmp}^2 & \text{ \; (Lamb/LambNoBias) } \\ \text{max}(\beta_2 * v, |g_\text{tmp}|) & \text{ \; (AdaMax) } \\ \end{aligned}\right.\\ t' &:= t + 1 \\ \end{split}\]Next, it computes the following terms:
\[\begin{split} m_\text{tmp} &:= \left\{\begin{aligned} m' & \text{ \; (AdamNoBias/LambNoBias) } \\ \frac{m'}{(1 - \beta_1^{t'})} & \text{ \; (Adam/Lamb/AdaMax) } \\ \end{aligned}\right.\\ v_\text{tmp} &:= \left\{\begin{aligned} v' & \text{ \; (AdamNoBias/LambNoBias) } \\ \frac{v'}{(1 - \beta_2^{t'})} & \text{ \; (Adam/Lamb/AdaMax) } \\ \end{aligned}\right.\\ u_\text{tmp} &:= \left\{\begin{aligned} \frac{m_\text{tmp}}{(\sqrt{v_\text{tmp}} + \epsilon)} + \text{wd} * w &\text{ \; (Decay) } \\ \frac{m_\text{tmp}}{(\sqrt{v_\text{tmp}} + \epsilon)} &\text{ \; (L2Regularization) } \\ \end{aligned}\right. \end{split}\]Finally, the optimizer updates the weight as follows:
\[\begin{split} w' := \left\{\begin{aligned} w - \text{lr} * u_\text{tmp} &\text{ \; (Adam/AdamNoBias/AdaMax) } \\ w - \biggl(\frac{\text{min}(\lVert{w}\rVert, \text{mwn})}{\lVert{u_\text{tmp}}\rVert}\biggr) * \text{lr} * u_\text{tmp} &\text{ \; (Lamb/LambNoBias) } \\ \end{aligned}\right. \end{split}\]In addition to the above, the loss scaling hyper parameter is similar in nature to the velocity scaling parameter. It is a scaling value that is applied to the loss gradient at the start of the backwards pass and, at the end of the backwards pass, this scaling is reversed by multiplying the gradients for each weight with the inverse of the loss scaling value prior to updating the optimizer state. Using loss scaling can also improve numerical stability of the gradient calculations. If scaledOptimizerState is enabled then the lossScaling will not be removed before updating the optimizer state. This can improve the numerical stability when accl1_type is set to FLOAT16.
NOTE: The maximum weight norm is referred to as \(\phi\) in You et al., 2020.
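As an illustration, a hedged sketch of constructing a Lamb optimizer with the full constructor documented below (all values are arbitrary; the boolean in each OptimizerValue marks it as constant or adjustable):
#include <popart/adam.hpp>

popart::Adam lamb(
    /*defaultLearningRate=*/{0.006f, false},
    /*defaultWeightDecay=*/{0.01f, true},
    /*defaultBeta1=*/{0.9f, true},
    /*defaultBeta2=*/{0.999f, true},
    /*defaultEps=*/{1e-6f, true},
    /*lossScaling=*/{512.0f, false},
    /*maxWeightNorm=*/{10.0f, true}, // the \(\phi\) of You et al., 2020
    popart::AdamMode::Lamb,
    popart::WeightDecayMode::Decay,
    /*accumType=*/popart::DataType::FLOAT16,
    /*accl1Type=*/popart::DataType::FLOAT,
    /*accl2Type=*/popart::DataType::FLOAT);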
Public Functions
-
virtual bool hasSpecific() const final
-
Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, OptimizerValue maxWeightNorm, AdamMode adamMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})
Constructor.
- Parameters
defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultBeta1 – The beta1 value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultBeta2 – The beta2 value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultEps – The epsilon value to use for weights for which no weight-specific hyper parameters have been inserted.
lossScaling – The loss scaling value to use.
maxWeightNorm – The maxWeightNorm value to use.
adamMode – The AdamMode value to use.
weightDecayMode – The WeightDecayMode value to use.
accumType – Data type to use for gradient accumulation.
accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.
accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
scaledOptimizerState – Experimental Option. Does not remove lossScaling before updating the optimizer state. This should have no effect on the update equation. However, it does ensure a more numerically stable implementation when accl1_type is set to DataType::FLOAT16. Note: When loading a model that includes initialised optimizer state, ensure that accl1 and accl2 are scaled by lossScaling and lossScaling^2 respectively.
debugContext – Optional debug context.
-
Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, AdamMode adamMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})
-
Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, OptimizerValue maxWeightNorm, AdamMode adamMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})
-
Adam(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultBeta1, OptimizerValue defaultBeta2, OptimizerValue defaultEps, OptimizerValue lossScaling, AdamMode adamMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})
-
Adam(const std::map<std::string, std::pair<float, bool>> ¶ms, AdamMode adamMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, const std::vector<ClipNormSettings> &clipNormSettings = {}, bool scaledOptimizerState = false, const DebugContext &debugContext = {})
Constructor.
EXAMPLE:
Adam({{"defaultLearningRate", {0.02, false}}, {"defaultBeta1", {0.9, true}}, {"defaultBeta2", {0.999, true}}}, AdamMode::Adam, WeightDecayMode::Decay, DataType::FLOAT, DataType::FLOAT, DataType::FLOAT);
- Parameters
params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultBeta1", "defaultBeta2", "defaultEps", "lossScaling" or "maxWeightNorm", and the map's values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.
adamMode – The AdamMode value to use.
weightDecayMode – The WeightDecayMode value to use.
accumType – Data type to use for gradient accumulation.
accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.
accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.
clipNormSettings – A vector of ClipNormSettings (this can be used to set maximum values for weights).
scaledOptimizerState – Experimental Option. Does not remove lossScaling before updating the optimizer state. This should have no effect on the update equation. However, it does ensure a more numerically stable implementation when accl1_type is set to DataType::FLOAT16. Note: When loading a model that includes initialised optimizer state, ensure that accl1 and accl2 are scaled by lossScaling and lossScaling^2 respectively.
debugContext – Optional debug context.
-
~Adam() = default
-
inline virtual OptimizerType type() const final
-
inline virtual std::string type_s() const final
-
virtual std::vector<TensorId> getInputIds(const Tensor &weight) const final
The names of the inputs for the VarUpdateOp for the Variable Tensor “weight”.
In the returned vector, an empty string (“”) is used as a placeholder for constant inputs.
-
virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const final
The names and infos of the optimizer tensors.
-
float getStoredValue(const TensorId &optId) const
Tensor “opt” has an id, based on which it matches a compound scalar which this object can compute from the atomic scalars.
-
void insertSpecific(const TensorId &weight, OptimizerValue learningRate, OptimizerValue weightDecay, OptimizerValue beta1, OptimizerValue beta2, OptimizerValue eps, OptimizerValue mwn)
Insert a weight-specific set of hyper parameters.
- Parameters
weight – The TensorId of the weight.
learningRate – The learning rate value to use for this specific weight.
weightDecay – The weight decay value to use for this specific weight.
beta1 – The beta1 value to use for this specific weight.
beta2 – The beta2 value to use for this specific weight.
eps – The epsilon value to use for this specific weight.
mwn – The max weight norm value to use for this specific weight.
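For example, assuming an Adam instance named adam and a hypothetical weight tensor ID (each argument is an OptimizerValue of the form {value, isConst}):
adam.insertSpecific("embedding/weight",
                    /*learningRate=*/{0.001f, false},
                    /*weightDecay=*/{0.0f, true},
                    /*beta1=*/{0.9f, true},
                    /*beta2=*/{0.999f, true},
                    /*eps=*/{1e-8f, true},
                    /*mwn=*/{10.0f, true});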
-
void setStep(int64_t step)
-
void insertSpecific(const TensorId &weight, const std::map<std::string, std::pair<float, bool>> ¶ms)
Insert a weight-specific set of hyper parameters.
- Parameters
weight – The TensorId of the weight.
params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultBeta1", "defaultBeta2", "defaultEps", "lossScaling" or "maxWeightNorm" and the map's values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.
-
inline const OptimizerValueMap &learningRates() const
-
inline const OptimizerValueMap &weightDecays() const
-
inline const OptimizerValueMap &beta1s() const
-
inline const OptimizerValueMap &beta2s() const
-
inline const OptimizerValueMap &epss() const
-
inline const OptimizerValueMap &maxWeightNorms() const
-
inline const WeightDecayMode &getWeightDecayMode() const
-
inline bool useScaledOptimizerState() const
-
virtual size_t hash() const final
-
virtual void setFactorsFromOptions(const SessionOptions&) final
Public Static Functions
-
static inline OptimizerValue getUnsetLearningRate()
Default learning rate value.
-
static inline OptimizerValue getUnsetWeightDecay()
Default weight decay value.
-
static inline OptimizerValue getUnsetBeta1()
Default beta1 value.
-
static inline OptimizerValue getUnsetBeta2()
Default beta2 value.
-
static inline OptimizerValue getUnsetEps()
Default epsilon value.
-
static inline OptimizerValue getUnsetLossScaling()
Default loss scaling value.
-
static inline OptimizerValue getUnsetMaxWeightNorm()
Default maximum weight norm value.
-
static Adam fromDefaultMap(const std::map<std::string, OptimizerValue>&, AdamMode adamMode_, WeightDecayMode decayMode_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, const DebugContext &debugContext = {})
14.4.3. AdaDelta, RMSProp & AdaGrad
#include <popart/adaptive.hpp>
-
enum class popart::AdaptiveMode
Enum class representing a type of adaptive optimizer.
Values:
-
enumerator AdaGrad = 0
AdaGrad optimizer.
-
enumerator RMSProp
RMSProp optimizer.
-
enumerator CenteredRMSProp
CenteredRMSProp optimizer.
-
enumerator AdaDelta
AdaDelta optimizer.
-
class Adaptive : public popart::Optimizer
AdaDelta, RMSProp and AdaGrad optimizer implementation.
Like any optimizer implementation, this class is responsible for updating each weight tensor ( \(w\)) in the model using the gradient ( \(g\)) of the loss function with respect to the weight as calculated during the backwards pass.
The optimizer has the following state for each weight:
first-order momentum ( \(v_1\))
second-order momentum ( \(v_2\)) (only for CenteredRMSProp/AdaDelta)
third-order momentum ( \(v_3\))
The optimizer has the following hyper parameters:
learning rate ( \(\text{lr}\))
weight decay ( \(\text{wd}\))
alpha ( \(\alpha\))
momentum ( \(\text{m}\))
epsilon ( \(\epsilon\))
loss scaling ( \(\text{ls}\))
The values of these parameters can be shared between all weights but some can be overridden with weight-specific values (see Adaptive::insertSpecific). Hyper parameters are captured using OptimizerValue objects and therefore can be either a constant value or a non-constant value that can be adjusted by the user.
The values of AdaptiveMode and WeightDecayMode passed to the constructor determine how weights are updated (see below).
In the following we will describe how this optimizer updates a weight using a gradient. In the context of this description, the gradient is the value of the gradient after any gradient accumulation has been performed and after the loss scaling factor applied to the gradient has been corrected for.
When the optimizer needs to update a weight, \(w\), using a gradient, \(g\), it first computes a term \(g_\text{tmp}\), which is effectively \(g\) with L2 regularization applied if the WeightDecayMode is set to WeightDecayMode::L2Regularization, as follows:
\[\begin{split} g_\text{tmp} := \left\{\begin{aligned} g & \text{ \; (Decay) } \\ (g + \text{wd} * w) & \text{ \; (L2Regularization) \; . } \\ \end{aligned}\right.\\ \end{split}\]Secondly, the optimizer updates the optimizer state \(v_1\) as follows:
\[\begin{split} v_1' &:= \left\{\begin{aligned} \alpha * v_1 + (1 - \alpha) * g_\text{tmp}^2 & \text{ \; (RMSProp/AdaDelta) } \\ \alpha * v_1 + (1 - \alpha) * g_\text{tmp}^2 & \text{ \; (CenteredRMSProp) } \\ v_1 + g_\text{tmp}^2 & \text{ \; (AdaGrad) } \\ \end{aligned}\right.\\ \end{split}\]Next, \(v_2\) is updated, but only for CenteredRMSProp:
\[\begin{split} v_2' &:= \alpha * v_2 + (1 - \alpha) * g_\text{tmp} \text{ \; (CenteredRMSProp) } \\ \end{split}\]Next, it computes the update term \(u_\text{tmp}\):
\[\begin{split} u_\text{tmp} &:= \left\{\begin{aligned} \frac{g_\text{tmp}}{\sqrt{v_1'} + \epsilon} & \text{ \; (AdaGrad/RMSProp) } \\ \frac{g_\text{tmp}}{\sqrt{v_1' - v_2'^2} + \epsilon} & \text{ \; (CenteredRMSProp) } \\ \frac{g_\text{tmp} * \sqrt{v_2 + \epsilon}}{\sqrt{v_1' + \epsilon}} & \text{ \; (AdaDelta) } \\ \end{aligned}\right. \end{split}\]Next, \(v_2\) is updated, but only for AdaDelta:
\[\begin{split} v_2' := \alpha * v_2 + (1 - \alpha) * u_\text{tmp}^2 \text{ \; (AdaDelta) } \\ \end{split}\]Next the third momentum is updated for all modes:
\[ v_3' := m * v_3 + u_\text{tmp} \]Finally, the optimizer updates the weight as follows:
\[\begin{split} w' := \left\{\begin{aligned} w - \text{lr} * (v_3' + \text{wd} * w) &\text{ \; (Decay) } \\ w - \text{lr} * v_3' &\text{ \; (L2Regularization) } \\ \end{aligned}\right. \end{split}\]In addition to the above, the loss scaling hyper parameter is similar in nature to the velocity scaling parameter. It is a scaling value that is applied to the loss gradient at the start of the backwards pass and, at the end of the backwards pass, this scaling is reversed by multiplying the gradients for each weight with the inverse of the loss scaling value prior to updating the optimizer state. Using loss scaling can also improve numerical stability in some cases.
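For illustration, a hedged sketch of constructing a centered RMSProp optimizer with the constructor documented below (all values are arbitrary; rmspropTFVariant and debugContext are left at their defaults):
#include <popart/adaptive.hpp>

popart::Adaptive rmsprop(
    /*defaultLearningRate=*/{0.001f, false},
    /*defaultWeightDecay=*/{0.0f, true},
    /*defaultAlpha=*/{0.95f, true},
    /*defaultMomentum=*/{0.9f, true},
    /*defaultEps=*/{1e-8f, true},
    /*lossScaling=*/{1.0f, true},
    popart::AdaptiveMode::CenteredRMSProp,
    popart::WeightDecayMode::Decay,
    /*accumType=*/popart::DataType::FLOAT,
    /*accl1Type=*/popart::DataType::FLOAT,
    /*accl2Type=*/popart::DataType::FLOAT,
    /*accl3Type=*/popart::DataType::FLOAT);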
Public Functions
-
virtual bool hasSpecific() const
-
Adaptive(OptimizerValue defaultLearningRate, OptimizerValue defaultWeightDecay, OptimizerValue defaultAlpha, OptimizerValue defaultMomentum, OptimizerValue defaultEps, OptimizerValue lossScaling, AdaptiveMode adaptiveMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, DataType accl3Type, bool rmspropTFVariant = false, const DebugContext &debugContext = {})
Constructor.
- Parameters
defaultLearningRate – The learning rate value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultWeightDecay – The weight decay value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultAlpha – The alpha value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultMomentum – The momentum value to use for weights for which no weight-specific hyper parameters have been inserted.
defaultEps – The epsilon value to use for weights for which no weight-specific hyper parameters have been inserted.
lossScaling – The loss scaling value to use.
adaptiveMode – The AdaptiveMode value to use.
weightDecayMode – The WeightDecayMode value to use.
accumType – Data type to use for gradient accumulation.
accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.
accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.
accl3Type – Data type to use for tensor that stores third-order momentum optimizer state.
debugContext – Optional debug context.
-
Adaptive(const std::map<std::string, std::pair<float, bool>> ¶ms, AdaptiveMode adaptiveMode, WeightDecayMode weightDecayMode, DataType accumType, DataType accl1Type, DataType accl2Type, DataType accl3Type, bool rmspropTFVariant = false, const DebugContext &debugContext = {})
Constructor.
EXAMPLE:
Adaptive({{"defaultLearningRate", {0.02, false}}, {"defaultAlpha", {0.99, true}}}, AdaptiveMode::RMSProp, WeightDecayMode::Decay, DataType::FLOAT, DataType::FLOAT, DataType::FLOAT, DataType::FLOAT);
- Parameters
params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultAlpha", "defaultMomentum", "defaultEps" or "lossScaling", and the map's values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.
adaptiveMode – The AdaptiveMode value to use.
weightDecayMode – The WeightDecayMode value to use.
accumType – Data type to use for gradient accumulation.
accl1Type – Data type to use for tensor that stores first-order momentum optimizer state.
accl2Type – Data type to use for tensor that stores second-order momentum optimizer state.
accl3Type – Data type to use for tensor that stores third-order momentum optimizer state.
debugContext – Optional debug context.
-
~Adaptive() = default
-
inline virtual OptimizerType type() const final
-
inline virtual std::string type_s() const final
-
virtual std::vector<TensorId> getInputIds(const Tensor &weight) const final
The names of the inputs for the VarUpdateOp for the Variable Tensor “weight”.
In the returned vector, an empty string (“”) is used as a placeholder for constant inputs.
-
virtual std::vector<std::tuple<TensorId, TensorInfo>> getOptimizerInputs(const Tensor &weight) const final
The names and infos of the optimizer tensors.
-
float getStoredValue(const TensorId &optId) const
Tensor “opt” has an id, based on which it matches a compound scalar which this object can compute from the atomic scalars.
-
void insertSpecific(const TensorId &weight, OptimizerValue learningRate, OptimizerValue weightDecay, OptimizerValue alpha, OptimizerValue momentum, OptimizerValue eps)
Insert a weight-specific set of hyper parameters.
- Parameters
weight – The TensorId of the weight.
learningRate – The learning rate value to use for this specific weight.
weightDecay – The weight decay value to use for this specific weight.
alpha – The alpha value to use for this specific weight.
momentum – The momentum value to use for this specific weight.
eps – The epsilon value to use for this specific weight.
-
void setStep(int64_t step)
-
void insertSpecific(const TensorId &weight, const std::map<std::string, std::pair<float, bool>> ¶ms)
Insert a weight-specific set of hyper parameters.
- Parameters
weight – The TensorId of the weight.
params – A parameter map where keys are one of "defaultLearningRate", "defaultWeightDecay", "defaultAlpha", "defaultMomentum", "defaultEps" or "lossScaling" and the map's values are pairs of floats and booleans representing OptimizerValue constructor arguments. The map does not have to specify each hyper parameter as default values will be used where parameters are missing.
-
inline const OptimizerValueMap &learningRates() const
-
inline const OptimizerValueMap &weightDecays() const
-
inline const OptimizerValueMap &alphas() const
-
inline const OptimizerValueMap &momentums() const
-
inline const OptimizerValueMap &epss() const
-
virtual size_t hash() const
Public Static Functions
-
static inline OptimizerValue getUnsetLearningRate()
Default learning rate value.
-
static inline OptimizerValue getUnsetWeightDecay()
Default weight decay value.
-
static inline OptimizerValue getUnsetAlpha()
Default alpha value.
-
static inline OptimizerValue getUnsetMomentum()
Default momentum value.
-
static inline OptimizerValue getUnsetEps()
Default epsilon value.
-
static inline OptimizerValue getUnsetLossScaling()
Default loss scaling value.
-
static Adaptive fromDefaultMap(const std::map<std::string, OptimizerValue>&, AdaptiveMode adaptiveMode_, WeightDecayMode decayMode_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, DataType accl3Type_, const DebugContext &debugContext = {})
14.5. Builder
#include <popart/builder.hpp>
-
class Builder
A builder interface for creating ONNX graphs.
ONNX defines a specification for describing graphs and serialising them as protobuf files. This class provides a builder interface for creating such a graph.
Note, in ONNX, all Ops belong to an “Opset”. The Builder itself does not have methods for creating Ops in the ONNX graph, but instead has accessors to Opsets, like AiGraphcoreOpset1, which contain the methods for creating Ops in the graph.
Public Functions
-
Builder &createSubgraphBuilder()
Create a builder for a graph which is nested inside this builder’s graph.
-
TensorId addInputTensor(const TensorInfo &tensorInfo, const popart::DebugContext &debugContext = {})
Add a new input tensor to the model.
- Parameters
tensorInfo – The shape and data type of the input tensor.
debugContext – Optional debug information.
- Returns
The tensor id of the input tensor.
-
TensorId addInputTensor(const std::string &dataType, const Shape &shape, const popart::DebugContext &debugContext = {})
Add a new input tensor to the model.
- Parameters
dataType – The data type of the input tensor.
shape – The shape of the input tensor.
debugContext – Optional debug information.
- Returns
The tensor id of the input tensor.
-
TensorId addInputTensor(const TensorInfo &tensorInfo, const InputSettings &settings, const popart::DebugContext &debugContext = {})
Add a new input tensor to the model.
- Parameters
tensorInfo – The shape and data type of the input tensor.
settings – Settings for `TileSet` and `ExchangeStrategy`.
debugContext – Optional debug information.
- Returns
The tensor id of the input tensor.
-
TensorId addInputTensor(const std::string &dataType, const Shape &shape, const InputSettings &settings, const popart::DebugContext &debugContext = {})
Add a new input tensor to the model.
- Parameters
dataType – The data type of the input tensor.
shape – The shape of the input tensor.
settings – Settings for `TileSet` and `ExchangeStrategy`.
debugContext – Optional debug information.
- Returns
The tensor id of the input tensor.
-
TensorId addUntypedInputTensor(const popart::DebugContext &debugContext = {})
Add a new input tensor without a type or shape to the model.
- Parameters
debugContext – Optional debug information.
- Returns
The tensor id of the input tensor.
-
void addInputTensorFromParentGraph(const TensorId &tensorId)
Add a new named input tensor (from the parent graph) to the model.
- Parameters
tensorId – The identifier string of the input tensor. This identifier must already exist in the name scope of the parent `GraphProto` and must appear topologically before this sub-graph.
-
TensorId addInitializedInputTensor(const ConstVoidData &initData, const popart::DebugContext &debugContext = {})
Add a new pre-initialized input tensor to the model.
- Parameters
initData – The initial data of the input tensor.
debugContext – Optional debug information.
- Returns
The tensor id of the input tensor.
-
TensorId addInitializedInputTensor(const ConstVoidData &initData, const VariableSettings &variableSettings, const popart::DebugContext &debugContext = {})
Add a new pre-initialized input tensor to the model.
- Parameters
initData – The initial data of the input tensor.
variableSettings – The settings that determine how variables are retrieved from replicas.
debugContext – Optional debug information.
- Returns
The tensor id of the input tensor.
-
void addOutputTensor(const TensorId &arg0)
Add an output tensor from a node in the graph into the list of output tensors.
- Parameters
arg0 – The tensor id of the output tensor to be added.
-
inline AiOnnxOpset6 aiOnnxOpset6()
Return the builder interface for ai.onnx opset 6.
-
inline AiOnnxOpset7 aiOnnxOpset7()
Return the builder interface for ai.onnx opset 7.
-
inline AiOnnxOpset8 aiOnnxOpset8()
Return the builder interface for ai.onnx opset 8.
-
inline AiOnnxOpset9 aiOnnxOpset9()
Return the builder interface for ai.onnx opset 9.
-
inline AiOnnxOpset10 aiOnnxOpset10()
Return the builder interface for ai.onnx opset 10.
-
inline AiOnnxOpset11 aiOnnxOpset11()
Return the builder interface for ai.onnx opset 11.
-
inline AiOnnxMlOpset1 aiOnnxMlOpset1()
Return the builder interface for ai.onnx.ml opset 1.
-
inline AiGraphcoreOpset1 aiGraphcoreOpset1()
Return the builder interface for ai.graphcore opset 1.
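Putting the pieces above together, a hedged sketch of building a small graph; the shapes, tensor names, and the choice of opset 11 are illustrative only:
#include <vector>
#include <popart/builder.hpp>

// Builders are created through the static factory function.
auto builder = popart::Builder::create();
auto aiOnnx  = builder->aiOnnxOpset11();

// A 1x4 input tensor.
popart::TensorInfo inputInfo{"FLOAT", std::vector<int64_t>{1, 4}};
popart::TensorId x = builder->addInputTensor(inputInfo, "x");

// A 4x2 weight tensor, initialised with constant host data.
std::vector<float> wData(4 * 2, 0.1f);
popart::TensorInfo wInfo{"FLOAT", std::vector<int64_t>{4, 2}};
popart::ConstVoidData wCVData{wData.data(), wInfo};
popart::TensorId w = builder->addInitializedInputTensor(wCVData, "w");

// Ops are created through the opset accessor, not the Builder itself.
popart::TensorId y = aiOnnx.relu({aiOnnx.matmul({x, w})});
builder->addOutputTensor(y);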
-
std::vector<TensorId> customOp(const OperatorIdentifier &opid, int opsetVersion, const std::vector<TensorId> &inputs, const unsigned numOutputs, const std::map<std::string, popart::any> &attributes, const DebugContext &debugContext = {})
Return the output tensors from a custom op added to the model.
- Parameters
opid – The id of the operator.
opsetVersion – The version of the opset.
inputs – A vector of the tensor ids of the input tensors.
numOutputs – The number of output tensors.
attributes – The map of attributes and their values to be added.
debugContext – Optional debug information.
- Returns
The output tensors.
-
void customOp(const OperatorIdentifier &opid, int opsetVersion, const std::vector<TensorId> &inputs, const std::vector<TensorId> &outputs, const std::map<std::string, popart::any> &attributes, const DebugContext &debugContext = {})
Add a custom op to the model.
- Parameters
opid – The id of the operator.
opsetVersion – The version of the opset.
inputs – A vector of the tensor ids of the input tensors.
outputs – The tensor ids of the output tensors.
attributes – The map of attributes and their values to be added.
debugContext – Optional debug information.
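A hedged sketch of calling a custom op through this interface; the domain ("custom.ops"), op name ("ScaledAdd"), attribute ("scale") and the input tensor ids a and b are all hypothetical, and the op itself must have been registered with PopART separately:
#include <map>
#include <popart/builder.hpp>

// OperatorIdentifier is (domain, op type, version).
const popart::OperatorIdentifier myOpId("custom.ops", "ScaledAdd", 1);

// Attribute values are held as popart::any.
std::map<std::string, popart::any> attributes = {{"scale", 0.5f}};

// One output; the returned vector holds its tensor id.
std::vector<popart::TensorId> outs =
    builder->customOp(myOpId, /*opsetVersion=*/1, {a, b},
                      /*numOutputs=*/1, attributes);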
-
template<class T>
inline TensorId reshape_const(T &t, const std::vector<TensorId> &args, const std::vector<int64_t> &shape, const std::string &name = {})
Add a constant and reshape a tensor using the provided domain.
- Parameters
t – The builder interface.
args – The tensor ids of the tensors to be updated.
shape – The shape information to be used.
name – (Optional) The name of the updated tensor. Default: None.
- Returns
The tensor id of the updated tensor.
-
inline void outputTensorLocation(const TensorId &nodeOutputName, TensorLocation value)
Set a value for the output tensor location attribute.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – The location of the tensor.
-
inline void recomputeOutput(const TensorId &nodeOutputName, RecomputeType value)
Enable recomputation of the output of the node in the backward pass.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – (Optional) The type of the recompute.
-
inline void recomputeOutputInBackwardPass(const TensorId &nodeOutputName, RecomputeType value = RecomputeType::Recompute)
Enable or disable recomputation of the output of the node in the backward pass.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – (Optional) The type of the recompute. Default: `RecomputeType::Recompute`.
-
inline void recomputeOutputInBackwardPass(const std::set<TensorId> &nodeOutputNames, RecomputeType value = RecomputeType::Recompute)
Enable or disable recomputation of the output of the node in the backward pass.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node.
value – (Optional) The type of the recompute. Default: `RecomputeType::Recompute`.
-
inline bool getRecomputeOutputInBackwardPass(const TensorId &nodeOutputName)
Check if a node will have its output recomputed in the backward pass.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node used to find the node in the ONNX model.
- Returns
`true` if the output will be recomputed; `false` otherwise.
-
inline bool getRecomputeOutputInBackwardPass(const std::set<TensorId> &nodeOutputNames)
Check if a node will have its output recomputed in the backward pass.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
`true` if the output will be recomputed; `false` otherwise.
-
std::vector<TensorId> checkpointOutput(const std::vector<TensorId> &nodeOutputNames)
Add checkpoint operations to the model.
This is the same as an identity op but RecomputeType is `Checkpoint` by default. Use this to checkpoint a subset of an operation’s output tensors.
- Parameters
nodeOutputNames – The tensors to checkpoint.
- Returns
The checkpointed tensors.
-
inline void virtualGraph(const TensorId &nodeOutputName, int64_t value = 0)
Set the virtual graph that computes the given node.
Applies when creating a graph for a multi-IPU configuration.
- Parameters
nodeOutputName – Name of the output tensor of the ONNX node.
value – The index of the virtual graph that computes this node. Default=0.
-
inline void executionPhase(const TensorId &nodeOutputName, int64_t value = 0)
Set the execution phase that computes the given node.
Applies when creating a graph for a multi-IPU configuration.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – The execution phase that computes this node. Default=0.
-
inline void pipelineStage(const TensorId &nodeOutputName, int64_t value)
Set the value on the pipeline stage attribute.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – The value to be set.
-
inline void pipelineStage(const std::set<TensorId> &nodeOutputNames, int64_t value)
Set the value on the pipeline stage attribute.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node.
value – The value to be set.
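These annotations are typically applied to a node's output right after creating it. A hedged sketch, continuing the builder example earlier in this section (the virtual graph and stage indices are illustrative):
// Place the matmul on virtual graph 0 / pipeline stage 0 and the
// relu on virtual graph 1 / pipeline stage 1.
popart::TensorId h = aiOnnx.matmul({x, w});
builder->virtualGraph(h, 0);
builder->pipelineStage(h, 0);

popart::TensorId out = aiOnnx.relu({h});
builder->virtualGraph(out, 1);
builder->pipelineStage(out, 1);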
-
inline void excludePatterns(const TensorId &nodeOutputName, const std::vector<std::string> &patternNames)
Set the patterns to be excluded.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node.
patternNames – The vector of pattern names to be excluded.
-
inline void excludePatterns(const std::set<TensorId> &nodeOutputNames, const std::vector<std::string> &patternNames)
Set the patterns to be excluded.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node.
patternNames – The vector of pattern names to be excluded.
-
inline void setSerializeMatMul(const std::set<TensorId> &nodeOutputNames, std::string mode, int64_t factor, bool keep_precision)
Set the settings for matmuls that should be serialized.
This option will split a matmul into separate smaller matmuls that will be executed in series. This will also serialize the grad operations during training.
- Parameters
nodeOutputNames – The tensor ids of the output matmul tensors of the ONNX node.
mode – The dimension of the matmul to serialize on. Options are: ‘input_channels’, ‘output_channels’, ‘reducing_dim’, ‘none’.
factor – The number of serialised matmuls. This must be a factor of the dimensions to serialise on.
-
void setPartialsType(const TensorId &nodeOutputName, const std::string partialsType)
Set the partials type for the given node.
This is used in the convolution op.
- Parameters
nodeOutputName – Name of the output tensor of the ONNX node.
partialsType – The type for the partials. Options are: `FLOAT` or `HALF`.
-
void setEnableConvDithering(const TensorId &nodeOutputName, int64_t value)
Enable convolution dithering.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node.
value – The value to enable convolution dithering. This should be 1 to enable convolution dithering and 0 otherwise.
-
std::string getPartialsType(const TensorId &nodeOutputName)
Get the partials type for the given node.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node.
- Returns
The partials type.
-
inline void setInplacePreferences(const TensorId &nodeOutputName, const std::map<OpType, float> &prefs)
-
void setAvailableMemoryProportion(const TensorId &nodeOutputName, const float availableMemoryProportion)
Set the available memory proportion for the given node.
This is used in the convolution op.
See also
Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for some practical examples of using `availableMemoryProportion`.
- Parameters
nodeOutputName – Name of the output tensor of the ONNX node.
availableMemoryProportion – The available memory proportion [0, 1).
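For example, to let the planner of a convolution use at most 40% of tile memory for temporary values (convOut is a hypothetical output tensor id of a conv node, continuing the builder sketch):
builder->setAvailableMemoryProportion(convOut, 0.4f);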
-
void setAvailableMemoryProportion(const std::set<TensorId> &nodeOutputNames, const float availableMemoryProportion)
Set the available memory proportion for the given node.
This is used in the convolution op.
See also
Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for some practical examples of using `availableMemoryProportion`.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
availableMemoryProportion – The available memory proportion [0, 1).
-
void setAttribute(const std::string &attribute, popart::any value)
Set the value of an attribute that will be set on all subsequent operations.
- Parameters
attribute – The name of the attribute to set.
value – The value to set on the attribute.
-
popart::any getAttribute(const std::string attribute) const
Get an attribute that has been set for all subsequent operations.
- Parameters
attribute – The name of the attribute to get.
- Returns
The attribute.
-
bool hasAttribute(const std::string &attribute) const
Check if an attribute exists.
- Parameters
attribute – The name of the attribute to check.
- Returns
`true` if the attribute exists; `false` otherwise.
-
void clearAttribute(const std::string &attribute)
Unset an attribute that will be set on all subsequent operations.
- Parameters
attribute – The name of the attribute to unset.
-
bool hasAttribute(const std::string &attribute)
Check if an attribute is set.
- Parameters
attribute – The name of the attribute to check.
- Returns
`true` if the attribute is set; `false` otherwise.
-
popart::any getAttribute(const std::string &attribute)
Get the attribute value.
- Parameters
attribute – The name of the attribute.
- Returns
The value of the attribute.
-
int64_t getPipelineStage() const
Get the pipeline stage attribute.
- Returns
The pipeline stage.
-
int64_t getExecutionPhase() const
Get the execution phase attribute.
- Returns
The execution phase.
-
int64_t getVirtualGraph() const
Get the virtual graph attribute.
- Returns
The virtual graph.
-
inline void virtualGraph(const std::set<TensorId> &nodeOutputNames, int64_t value = 0)
Set the virtual graph that computes the given node.
Applies when creating a graph for a multi-IPU configuration.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
value – The index of the virtual graph that computes this node.
-
inline void executionPhase(const std::set<TensorId> &nodeOutputNames, int64_t value = 0)
Set the execution phase.
Applies when creating a graph for a multi-IPU configuration.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
value – The index of the virtual graph that computes this node.
-
void addNodeAttribute(const std::string &attributeName, const int64_t &attributeValue, const std::set<TensorId> &nodeOutputNames)
Add an attribute to the ONNX node which is uniquely identified by the output tensors.
This function will throw an exception if it cannot find the unique node or if the attribute already exists.
- Parameters
attributeName – The name of the attribute to add.
attributeValue – An `int64_t` value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
void addNodeAttribute(const std::string &attributeName, const std::vector<int64_t> &attributeValue, const std::set<TensorId> &nodeOutputNames)
Add an attribute to the ONNX node which is uniquely identified by the output tensors.
This function will throw an exception if it cannot find the unique node or if the attribute already exists.
- Parameters
attributeName – The name of the attribute to add.
attributeValue – A `std::vector<int64_t>` value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
void addNodeAttribute(const std::string &attributeName, const float &attributeValue, const std::set<TensorId> &nodeOutputNames)
Add an attribute to the ONNX node which is uniquely identified by the output tensors.
This function will throw an exception if it cannot find the unique node or if the attribute already exists.
- Parameters
attributeName – The name of the attribute to add.
attributeValue – A `float` value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
void addNodeAttribute(const std::string &attributeName, const std::vector<float> &attributeValue, const std::set<TensorId> &nodeOutputNames)
Add an attribute to the ONNX node which is uniquely identified by the output tensors.
This function will throw an exception if it cannot find the unique node or if the attribute already exists.
- Parameters
attributeName – The name of the attribute to add.
attributeValue – The `std::vector<float>` value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
void addNodeAttribute(const std::string &attributeName, const std::string &attributeValue, const std::set<TensorId> &nodeOutputNames)
Add an attribute to the ONNX node which is uniquely identified by the output tensors.
This function will throw an exception if it cannot find the unique node or if the attribute already exists.
- Parameters
attributeName – The name of the attribute to add.
attributeValue – A `std::string` value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
void addNodeAttribute(const std::string &attributeName, const char *attributeValue, const std::set<TensorId> &nodeOutputNames)
Add an attribute to the ONNX node which is uniquely identified by the output tensors.
This function will throw an exception if it cannot find the unique node or if the attribute already exists.
- Parameters
attributeName – The name of the attribute to add.
attributeValue – A C-style string (`const char*`) value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
void addNodeAttribute(const std::string &attributeName, const std::vector<std::string> &attributeValue, const std::set<TensorId> &nodeOutputNames)
Add an attribute to the ONNX node which is uniquely identified by the output tensors.
This function will throw an exception if it cannot find the unique node or if the attribute already exists.
- Parameters
attributeName – The name of the attribute to add.
attributeValue – A `std::vector<std::string>` value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
void addNodeAttribute(const std::string &attributeName, const bool attributeValue, const std::set<TensorId> &nodeOutputNames)
Add an attribute to the ONNX node which is uniquely identified by the output tensors.
This function will throw an exception if it cannot find the unique node or if the attribute already exists.
- Parameters
attributeName – The name of the attribute to add.
attributeValue – A bool value of the attribute to add.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
void addNodeAttribute(const std::string &attributeName, const ConstVoidData &attributeValue, const std::set<TensorId> &nodeOutputNames)
Add an attribute to the ONNX node which is uniquely identified by the output tensors.
This function will throw an exception if it cannot find the unique node or if the attribute already exists.
- Parameters
attributeName – The name of the attribute to add.
attributeValue – A constant tensor initializer.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
bool nodeHasAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)
Check whether the ONNX node has an attribute set.
This function will throw an exception if it cannot find the unique node.
- Parameters
attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
`true` if the node has an attribute set; `false` otherwise.
-
int64_t getInt64NodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)
Get the value of an attribute for the ONNX node where the value is an int64_t.
This function will throw an exception if it cannot find the unique node, if the attribute does not exist, or if it has not been set to the int64_t type.
- Parameters
attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
Value of the attribute.
-
std::vector<int64_t> getInt64VectorNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)
Get the value of an attribute for the ONNX node where the value is a std::vector<int64_t>.
This function will throw an exception if it cannot find the unique node, if the attribute does not exist, or if it has not been set to the std::vector<int64_t> type.
- Parameters
attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
Value of the attribute.
-
float getFloatNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)
Get the value of an attribute for the ONNX node where the value is a float.
This function will throw an exception if it cannot find the unique node, if the attribute does not exist, or if it has not been set to the float type.
- Parameters
attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
Value of the attribute.
-
std::vector<float> getFloatVectorNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)
Get the value of an attribute for the ONNX node where the value is a std::vector<float>.
This function will throw an exception if it cannot find the unique node or if the attribute does not exist.
- Parameters
attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
Value of the attribute.
-
std::string getStringNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)
Get the value of an attribute for the ONNX node where the value is a string.
This function will throw an exception if it cannot find the unique node, if the attribute does not exist, or if it has not been set to the std::string type.
- Parameters
attributeName – The name of the attribute for which the value is required.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
Value of the attribute.
-
std::vector<std::string> getStringVectorNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)
Get the value of an attribute for the ONNX node where the value is a vector of strings.
This function will throw an exception if it cannot find the unique node or if the attribute does not exist.
- Parameters
attributeName – The name of the attribute for which the value is required.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
Value of the attribute.
-
bool getBoolNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)
Get the value of an attribute for the ONNX node where the value is a boolean.
This function will throw an exception if it cannot find the unique node or if the attribute does not exist.
- Parameters
attributeName – The name of the attribute for which the value is required.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
Value of the attribute.
-
void removeNodeAttribute(const std::string &attributeName, const std::set<TensorId> &nodeOutputNames)
Remove an attribute from the ONNX node.
This function will throw an exception if it cannot find the unique node or if the attribute does not exist.
- Parameters
attributeName – The name of the attribute to find.
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
-
std::vector<std::string> getAllNodeAttributeNames(const std::set<TensorId> &nodeOutputNames)
Get all the attribute names from the ONNX node.
This function will throw an exception if it cannot find the unique node.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
The attribute names associated with the ONNX node.
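As a minimal sketch of the attribute round trip using the functions above (the opset call and tensor names are illustrative assumptions, not part of this API's contract):

// Requires <popart/builder.hpp>. Attach a custom attribute to a node,
// identified by its output tensor ids, then read it back.
auto builder = popart::Builder::create();
auto aiOnnx = builder->aiOnnxOpset9();
popart::TensorInfo info{"FLOAT", std::vector<int64_t>{2}};
auto a = builder->addInputTensor(info);
auto b = builder->addInputTensor(info);
auto o = aiOnnx.add({a, b});
builder->addNodeAttribute("my::attr", int64_t(42), {o});
if (builder->nodeHasAttribute("my::attr", {o})) {
  int64_t v = builder->getInt64NodeAttribute("my::attr", {o}); // v == 42
}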
-
inline int64_t getVirtualGraph(const TensorId &nodeOutputName)
Get the index of the virtual graph that computes this node.
This applies to a multi-IPU system.
This function will throw an exception if the virtual graph has not been set in the current scope.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node used to find the node in the ONNX model.
- Returns
The virtual graph associated with the ONNX node.
-
inline int64_t getVirtualGraph(const std::set<TensorId> &nodeOutputNames)
Get the index of the virtual graph that computes this node based on multiple output tensors.
This applies to a multi-IPU system.
This function will throw an exception if the virtual graph has not been set in the current scope.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
The virtual graph associated with the ONNX node.
-
inline int64_t getExecutionPhase(const TensorId &nodeOutputName)
Get the execution phase for a single output tensor.
This only applies to a multi-IPU system.
This function will throw an exception if the execution phase has not been set in the current scope.
- Parameters
nodeOutputName – The tensor id of the output tensor of the ONNX node used to find the node in the ONNX model.
- Returns
The execution phase associated with the ONNX node.
-
inline int64_t getExecutionPhase(const std::set<TensorId> &nodeOutputNames)
Get the execution phase for a set of output tensors.
This only applies to a multi-IPU system.
This function will throw an exception if the execution phase has not been set in the current scope.
- Parameters
nodeOutputNames – The tensor ids of the output tensors of the ONNX node used to find the node in the ONNX model.
- Returns
The execution phase associated with the ONNX node.
-
std::string getModelProto(bool humanReadable = false) const
Retrieve the ONNX serialized ModelProto.
- Parameters
humanReadable – If true, return a human readable text representation of the model, otherwise use a binary format.
- Returns
A serialized ONNX ModelProto.
-
void saveModelProto(const std::string &fn)
Save the builder’s ONNX ModelProto to a file, after validating it.
- Parameters
fn – The name of the file to write the ONNX model protobuf to.
-
void saveInitializersExternally(const std::vector<TensorId> &ids, const std::string &fn)
Save tensor data externally.
The model data cannot exceed 2GB, the maximum size of a Protobuf message. To avoid this limit, the tensor data of large ONNX models can be saved separately.
- Parameters
ids – The names of tensors for which data is to be saved externally.
fn – The name of a file containing the binary tensor data. This can be an absolute or relative path. If a relative path, when the ONNX model is saved, external tensor data will be written to a path relative to the current working directory.
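A minimal sketch of saving initialiser data externally (the file names and weight tensor are illustrative):

// Keep large initialiser data outside the ONNX protobuf.
std::vector<float> weights(512, 0.0f);
popart::TensorInfo wInfo{"FLOAT", std::vector<int64_t>{512}};
auto builder = popart::Builder::create();
auto w = builder->addInitializedInputTensor({weights.data(), wInfo});
builder->saveInitializersExternally({w}, "weights.bin");
builder->saveModelProto("model.onnx"); // tensor data is read from weights.bin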
-
std::vector<TensorId> getInputTensorIds() const
Return a list of ONNX graph input tensor ids.
- Returns
A vector of input tensor ids.
-
std::vector<TensorId> getOutputTensorIds() const
Return a list of ONNX graph output tensor ids.
- Returns
A vector of output tensor ids.
-
std::vector<TensorId> getValueTensorIds() const
Return a list of ONNX graph value tensor ids.
These tensors are stored in the value_info section of the ONNX GraphProto structure.
- Returns
A vector of value tensor names.
-
std::vector<TensorId> getTrainableTensorIds() const
Return a list of ONNX graph initialized tensor ids.
These tensors are stored in the initializer section of the ONNX GraphProto structure.
- Returns
A vector of names of initialized tensors.
-
bool hasValueInfo(const TensorId &id) const
Check if a tensor has value info.
A tensor may not have value info if it does not exist in the model or if shape inference has failed.
- Returns
true if the tensor has value info; false otherwise.
-
std::vector<int64_t> getTensorShape(const TensorId id)
Return an ONNX graph tensor shape, from either the input, output, or value_info lists in GraphProto.
- Parameters
id – The id of the tensor for which dimensions are required.
- Returns
A vector of the tensor dimensions.
-
bool isInitializer(const TensorId id) const
Check if the ONNX tensor is in the initializer list of GraphProto.
- Parameters
id – A tensor id.
- Returns
true if the tensor is in the initializer list; false otherwise.
-
std::string getTensorDtypeString(const TensorId id)
Return an ONNX graph tensor type as a lower case string, from either the input, output, or value_info lists in GraphProto.
- Parameters
id – The id of the tensor for which the type is required.
- Returns
A lower case string of the tensor data type.
-
DataType getTensorDataType(const TensorId id)
Return a tensor type from either the input, output, or value_info lists in GraphProto.
- Parameters
id – The id of the tensor for which the type is required.
- Returns
The data type of the tensor.
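For example, a short introspection sketch using the methods above (tensor names illustrative):

auto builder = popart::Builder::create();
popart::TensorInfo info{"FLOAT16", std::vector<int64_t>{4, 8}};
auto x = builder->addInputTensor(info);
std::vector<int64_t> shape = builder->getTensorShape(x); // {4, 8}
std::string dtype = builder->getTensorDtypeString(x);    // "float16"
bool init = builder->isInitializer(x);                   // false: x is a stream input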
-
void pushNameScope(const std::string &name)
Push a name onto the name scope stack.
The names of tensors and nodes added to the ONNX graph will be prefixed with a concatenation of the names in the name scope stack.
- Parameters
name – The name to be pushed onto the name scope stack.
-
void popNameScope()
Remove the last entry in the name scope stack.
-
std::string getNameScope(const std::string &name = "") const
Get the current name scope stack using the default delimiter.
- Parameters
name – (Optional) A string to concatenate to the end of the stack.
- Returns
A string of the concatenated name scope stack.
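A small sketch of how the scope stack prefixes names (the "/" delimiter shown in the comments is an assumption; the opset call is illustrative):

auto builder = popart::Builder::create();
auto aiOnnx = builder->aiOnnxOpset9();
popart::TensorInfo info{"FLOAT", std::vector<int64_t>{2}};
auto x = builder->addInputTensor(info);
builder->pushNameScope("encoder");
builder->pushNameScope("layer0");
// Tensors and nodes created here get names like "encoder/layer0/...".
auto y = aiOnnx.relu({x});
std::string scope = builder->getNameScope(); // e.g. "encoder/layer0/"
builder->popNameScope();
builder->popNameScope();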
-
void setGraphName(const std::string &name)
Set a graph name.
- Parameters
name – The string to name the graph.
-
void setParent(Builder *parent)
Set the parent graph of this builder.
- Parameters
parent – The builder to set as the parent of this builder.
-
inline bool hasParent() const
Check if this builder represents a subgraph.
- Returns
true if the builder represents a subgraph; false otherwise.
-
void embedReplicationFactor(int replicationFactor)
Embed the value of the replication factor into the ONNX model.
If the replication factor is not present in the model, it should be interpreted as 1.
- Parameters
replicationFactor – The replication factor.
Public Static Functions
-
static std::unique_ptr<Builder> createFromOnnxModel(const std::string &modelProtoOrFilename)
Create a builder which loads a serialized ONNX ModelProto into the builder and validates it.
- Parameters
modelProtoOrFilename – Either an ONNX model protobuf, or the name of a file containing an ONNX model protobuf.
-
Builder &createSubgraphBuilder()
-
class Ir
Public Types
Public Functions
-
poprithms::logging::TimePartitionLogger &timePartitionLogger() const
This logger can be used as follows:

void foo() {
  auto timer = timePartitionLogger().scopedStopwatch("In foo");
  if (cond0()) {
    return;
  }
  bar();
  return;
}

When the method timePartitionLoggerStr() (see below) is called, there will be a line with “In foo” summarizing the time between the construction and destruction of timer, above. Something like:

In foo      : 0.03 [s] : 30 %
In bar      : 0.02 [s] : 10 %
unaccounted : 0.05 [s] : 50 %
total       : 0.10 [s] : 100 %

In the case where there are multiple timers which exist concurrently, only the most recently constructed one will accumulate time. This means that the most nested scope is the one which will accumulate time.
For more information, see the poprithms SwitchingTimePartitionLogger class.
- Returns
An object used to track and summarize where wall clock time is spent in PopART compilation. This object is used to partition time into different components (scheduling, outlining, poplar Graph construction, etc.).
-
std::string timePartitionLoggerStr() const
-
Ir()
-
~Ir()
-
inline uint64_t getId() const
-
void setOnnxModel(const ONNX_NAMESPACE::ModelProto &model)
-
inline bool hasOnnxModel() const
Check if there’s an ONNX model in the IR.
This is true if the IR has been created from an ONNX model or using the Builder.
- Returns
true if there is an ONNX model; false otherwise.
-
void setUserOptions(const SessionOptions &flags)
-
void setInputShapeInfo(const InputShapeInfo &info)
-
inline const InputShapeInfo &getInputShapeInfo() const
-
void ensureOptimizerTensorCreated(const TensorId &optId, const TensorInfo &info, const DebugContext &debugContext = {})
-
void setDeviceInfo(DeviceInfo&)
-
const DeviceInfo *getDeviceInfo() const
-
void removeIsolatedTensors(bool retainUsedIOTensors = false, bool retainAllIOTensors = false, bool retainVarTensors = false, bool retainConstTensors = false)
-
void removeIsolatedGraphs()
-
void setExecutionMode(const ExecutionMode &mode)
-
inline bool isTraining() const
-
inline bool isTesting() const
-
void logIr() const
-
void prepare(const IrBundle &bundle, const HashesMap &cacheEntries = {}, size_t hashSeed = 0u)
Prepare the IR based on the IrBundle configuration.
If engine caching is enabled then the IR hash which is based on the IrBundle and the forward graph will be compared to a saved file. If the hash matches then the rest of the Ir preparation will be skipped.
- Parameters
bundle – The bundle to prepare.
cacheEntries – The engine cache.
hashSeed – The seed with which to initiate the IR hash. This hash should incorporate non-IR factors that could affect the compilation, such as engine options and session options.
-
void finalizeOpDebugInfo()
-
inline bool isPrepared() const
-
inline bool hashMatched() const
-
ONNX_NAMESPACE::ModelProto step(int n)
-
void addAdditionalModelProtoTensors()
-
inline bool additionalModelProtoTensorsHaveBeenAdded() const
-
inline const std::set<Tensor*, PTensorCmp> &getAdditionalModelProtoTensors() const
-
inline std::set<Tensor*, PTensorCmp> &getAdditionalModelProtoTensors()
-
void append(std::stringstream&) const
-
void serialise(SerialiseFormat format, std::stringstream &ss, bool useScheduler = true) const
-
std::map<TensorId, std::vector<Tensor*>> getHostLoadTensors() const
Return a map from the original input tensor ID (used to identify streams) to the tensors produced by the associated HostLoadOps.
-
std::map<TensorId, std::vector<Tensor*>> getHostStoreTensors() const
Return a map from the original anchor tensor ID (used to identify streams) to the tensors consumed by the associated HostStoreOps.
-
std::vector<Op*> opsOfType(const OperatorIdentifier &opid) const
-
bool isConsumedByOpOfType(TensorId tid, const OperatorIdentifier &opid)
-
std::vector<Op*> getOpSchedule(const OpsBeforeKey&, RequireOptimalSchedule ros) const
-
bool isSchedulable(const OpsBeforeKey&) const
-
bool virtualGraphsEnabled() const
-
SyntheticDataMode syntheticDataMode() const
-
bool useSyntheticData() const
-
const ONNX_NAMESPACE::ModelProto &getModel() const
- Throws
error – if there is no Onnx model.
- Returns
const reference to the Onnx model.
-
std::vector<TensorId> getModelInputIds() const
- Returns
The id of every input tensor of the ONNX model. If there is no ONNX model, an empty vector is returned.
-
void setExternalTensorDataInfo(TensorId, const ONNX_NAMESPACE::TensorProto&)
Set the Onnx TensorProto of the given tensor in the Onnx ModelProto.
- Throws
error – if this Ir has no Onnx model.
-
inline const SessionOptions &getSessionOptions() const
-
inline SessionOptions &getSessionOptions()
-
inline void setSessionName(const std::string name)
-
inline const std::string getSessionName() const
-
std::vector<TensorId> getTensorIds(TensorType) const
-
Op *getOp(OpId opId) const
Returns the Op if it exists in any graph.
Throws an error if the Op could not be found.
-
Tensors &getMainGraphTensors()
-
const Tensors &getMainGraphTensors() const
-
void validateAnchors() const
-
ExecutionMode getExecutionMode() const
-
bool canInfer() const
-
bool canTrain() const
-
bool hasConstructedBackwards() const
-
bool hasDecomposedOptimizers() const
-
bool containsInitialisers() const
-
void constructForwards()
-
void constructBackwards()
-
void registerInputTensors()
-
void updateVertices()
-
void unsetAllVirtualGraphIds()
-
void applyUpdateInplacePrioritiesForIpu()
-
void confirmConstIds() const
-
void confirmNoReservedIds() const
-
int getDefaultOpsetVersion(const std::string &domain) const
-
unsigned getNumVirtualGraphIds() const
-
int getOpSetVersionFromModel(const std::string &domain) const
-
inline bool autoRecomputationEnabled() const
-
bool hasReplicatedTensorSharding() const
-
bool hasOverlappedIO() const
-
inline void setRequiresRandomSeed()
-
inline bool getRequiresRandomSeed() const
-
RandomReferenceId getAndIncrementRandomReferenceId()
-
TensorId getOrSetRandomReferenceTensor(RandomReferenceId, TensorId)
-
void mergeRandomReferenceIds(std::set<RandomReferenceId>&)
-
void setRemoteBufferInfo(RemoteBufferId, RemoteBufferInfo)
-
const RemoteBufferInfo getRemoteBufferInfo(RemoteBufferId) const
-
const std::map<RemoteBufferId, RemoteBufferInfo> getAllRemoteBufferInfos() const
-
inline void setExecutionPhasesReady()
-
inline bool getExecutionPhasesReady() const
-
PipelineStage getNumPipelineStages() const
-
PipelineInfo pipelineInfo() const
-
void setMainGraphPathFromLoss()
-
void verifyTensorInfos() const
Verifies that all tensors have valid TensorInfos.
-
void setIsPrepared()
Marks the Ir as “prepared”.
This means the Ir is now ready to be lowered. Failing to do this before lowering the Ir will result in an error. The schedule of all graphs will be fixed by calling this. Modifying the graphs after the IR is prepared will result in an error.
-
PipelineStage getFinalLossPipelineStage() const
Get the pipeline stage containing the final loss (the last forward pipeline stage).
- Returns
The pipeline stage containing the final loss.
-
PipelineStage getMaxPipelineStage() const
Get the max pipeline stage that will exist after the backward pass has been added to the graph.
- Returns
The maximum pipeline stage of the graph.
-
inline const decltype(graphs) &getGraphs() const
-
size_t getHash() const
-
void computeHash(size_t hashSeed)
-
size_t getIrBundleHash() const
-
void setIrBundleHash(size_t)
-
ClonedGraphMaps cloneGraph(GraphId originalGraphId, GraphId newGraphId)
Clone a graph.
The OpIds and TensorIds will differ between the original and the cloned graph. Hence a map between the old OpIds and cloned OpIds will be returned. The new graph can be obtained with ir.getGraph(newGraphId).
Warning
Does not support cloning of the main graph.
- Parameters
originalGraphId – The id of the graph to clone.
newGraphId – The id of the cloned graph.
- Returns
A struct of maps between the OpIds and TensorIds in the original and new graphs.
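A minimal sketch of cloning a (non-main) graph, assuming ir is a popart::Ir that already contains a graph with id "subgraphA" (the ids are illustrative):

popart::GraphId originalId("subgraphA");
popart::GraphId clonedId("subgraphA_clone");
auto maps = ir.cloneGraph(originalId, clonedId);
popart::Graph &clone = ir.getGraph(clonedId);
// maps relates OpIds and TensorIds in "subgraphA" to those in the clone.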
-
bool applyPreAliasPattern(const PreAliasPattern*, Graph&)
Public Static Functions
-
static bool usingEngineCache(const SessionOptions&, const DeviceInfo*)
-
class Graph
Public Types
Public Functions
-
~Graph()
-
Graph() = delete
-
const std::set<int64_t> getAllVirtualGraphIds(bool includeInvalid) const
-
const std::map<int64_t, int> getVirtualGraphCounts() const
-
Op *getOp(OpId opId) const
Return a pointer to the Op if it exists.
Throws an error if the Op could not be found.
See also
getOpUnsafe
-
Op *getOpUnsafe(OpId opId) const
Returns a pointer to the Op if it exists, or nullptr otherwise.
See also
getOp
-
const Tensors &getTensors() const
-
Tensors &getTensors()
-
void addVarInit(const TensorId &name, const TensorInfo &info, const void *src, const DebugContext &debugContext)
Add a variable to this graph with the provided properties.
- Parameters
name – The name of the variable.
info – The tensor info to create the variable with, including shape and data type.
src – The data to initialise the tensor with.
debugContext – The debug context to assist with debugging.
-
void addVarInit(const TensorId &name, const TensorInfo &info, const void *src, const VariableSettings &vs, const DebugContext &debugContext)
As per addVarInit, but passing a VariableSettings object to allow for grouped replicas.
See also
addVarInit(const TensorId &, const TensorInfo &, const void *, const DebugContext &)
- Parameters
name – The name of the variable.
info – The tensor info to create the variable with, including shape and data type.
src – The data to initialise the tensor with.
vs – The VariableSettings to use.
debugContext – The debug context to assist with debugging.
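A short sketch of adding a variable (values illustrative; graph is assumed to be a popart::Graph obtained from the Ir):

// Create a 4-element FLOAT variable initialised to 1.0.
std::vector<float> init(4, 1.0f);
popart::TensorInfo vInfo{popart::DataType::FLOAT, popart::Shape{4}};
graph.addVarInit("myVar", vInfo, init.data(), {"myVar"});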
-
void addConstInit(const TensorId&, const TensorInfo&, const void*, const DebugContext&)
-
void addStream(const TensorId&, const TensorInfo&, const DebugContext&)
-
void constructFromOnnxGraph(const ONNX_NAMESPACE::GraphProto &onnx_graph)
-
template<typename OP, typename ...Args>
OP *createConnectedOp(const std::map<InIndex, TensorId> &in, const std::map<OutIndex, TensorId> &out, Args&&... args)
-
void setVarUpdateConstraints()
-
void setConvFlipWeightConstraints()
-
std::vector<Op*> getOpSchedule(const OpsBeforeKey&, RequireOptimalSchedule requireOptimalSchedule) const
-
void freezeSchedule(const OpsBeforeKey &gCons)
-
bool isSchedulable(const OpsBeforeKey&, bool respectExecutionPhases = false) const
-
bool hasUserRecomputeOps() const
-
InIndex getInputIndex(TensorId id) const
Get the index of the graph input with a specific id.
If the id is not a valid input id then an error will be raised.
- Parameters
id – Tensor name to find the index for.
- Returns
The input index for the specified id, if it exists.
-
void addInput(const InIndex &index, const TensorId &id, const TensorInfo &info, bool overwrite)
Add a graph input at a specific index in the list.
- Parameters
index – Force the input to be at the specified index in the graph.
id – The tensor name to create and connect.
info – The tensor info.
overwrite – If true, overwrites any existing input at the index; otherwise, moves all other inputs up by one position.
-
void addInput(const TensorId &id, const TensorInfo &info)
Add a graph input to the end of the list.
- Parameters
id – The tensor name to create and connect.
info – The tensor info.
-
TensorId addInput(const TensorInfo&)
-
void markAsOutput(const OutIndex &index, const TensorId &id, bool overwrite)
Mark a graph tensor as graph output at a specific index in the list.
- Parameters
index – Force the output to be at the specified index in the graph.
id – The tensor in the graph to mark as output.
overwrite – If true, overwrites any existing output at the index; otherwise, moves all other outputs up by one position.
-
void markAsOutput(const TensorId &id)
Mark a graph tensor as graph output at the end of the list.
- Parameters
id – The tensor in the graph to mark as output.
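A sketch of wiring subgraph inputs and outputs (assuming graph is a popart::Graph that already contains an op producing the tensor "out"; the names are illustrative):

popart::TensorInfo inInfo{popart::DataType::FLOAT, popart::Shape{2, 2}};
graph.addInput("in", inInfo);  // appended to the end of the input list
graph.markAsOutput("out");     // appended to the end of the output list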
-
void replaceTensor(const TensorId &oldId, const TensorId &newId)
Replace oldId with newId on any consumers.
Both tensors need to exist.
- Parameters
oldId – The tensor to disconnect from consumers and graph outputs.
newId – The tensor to connect to consumers and graph outputs.
-
inline const std::string &getGraphId() const
-
std::string getGraphString() const
-
void copyFrom(const Graph &other, CopyInputMarkings copyInputMarkings = CopyInputMarkings::Yes, CopyOutputMarkings copyOutputMarkings = CopyOutputMarkings::Yes)
-
std::pair<bool, std::vector<Op*>> getDirectViewChain(Tensor *from, Tensor *to)
Find a chain of view-changing ops in the graph from “from” to “to” (if one exists) and return a vector of ops such that op1(op2(…opN(in))) = out for {op1, op2, …, opN}.
If no such chain exists, returns {false, {}}.
- Parameters
from – The tensor to start at
to – The tensor to finish at
- Returns
A std::pair<bool, std::vector<Op*>> of the ops along the chain, in order: the first element of the pair is a bool indicating whether the path exists; the second is the vector of ops in order from ‘from’ to ‘to’. Given the ops are 1-in-1-out, this will also be in schedule order.
-
void setOnnxToOnnx(std::unique_ptr<onnxpasses::IOnnxToOnnx>)
Set the object which will perform the ONNX -> ONNX transformation, which happens early on in the Graph constructor.
The default object, which is used if this method is not called, is an instance of the onnxpasses::Canonnxalizer class, which performs a set of required transformations, such as decomposing ASinh into more basic Nodes.
-
void finalizeSchedule()
Finalizes the graph schedule.
The schedule cannot change after this has been called. Calling finalizeSchedule() multiple times results in an error.
-
inline void removeIsolatedTensors(bool retainUsedIOTensors = false, bool retainAllIOTensors = false, bool retainVarTensors = false, bool retainConstTensors = false)
-
inline bool canBeRecursivelyAutodiffed() const
If this graph X is called in graph Y, when applying autodiff to Y, is it safe to autodiff X?
-
inline void setCanBeRecursivelyAutodiffed(bool value)
Public Static Attributes
-
static const int64_t NoVGraph
-
class AiOnnxMlOpset1 : public popart::DomainOpSet
Class that represents the AI ONNX ML opset.
Public Functions
-
inline AiOnnxMlOpset1(std::unique_ptr<BuilderImpl> &impl_)
Constructor for the AiOnnxMlOpset1 class.
- Parameters
impl_ – A pointer to an implementation of the Builder class.
-
class AiGraphcoreOpset1 : public popart::DomainOpSet
Class that represents the AI Graphcore opset.
Public Functions
-
inline AiGraphcoreOpset1(std::unique_ptr<BuilderImpl> &impl_)
Constructor for the AiGraphcoreOpset1 class.
- Parameters
impl_ – A pointer to an implementation of the Builder class.
-
TensorId copyvarupdate(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Copies a tensor to an initialised tensor (variable).
This is used to update an initialised tensor (a variable created using addInitializedInputTensor()) which retains its value between iterations, by setting the value to the value of another tensor (the updater). The purpose is to manually update the tensor in use cases for variables other than trained parameters (weights) or tensors used by other ops.
- Parameters
args – A vector of the input tensor ids containing the tensor to be updated, tensor, and the tensor containing the values for the update, updater, as [tensor, updater].
debugContext – Optional debug information.
- Returns
An alias to the updated variable: to ensure correct ordering, use this alias for any op which should operate on the updated variable.
-
std::vector<TensorId> batchnormalization(const std::vector<TensorId> &args, unsigned num_outputs, float epsilon = 1e-05f, float momentum = 0.9f, const popart::DebugContext &debugContext = {})
Add a batch normalization operation to the model.
This version uses N-1 as the population size for calculating the running variance (as PyTorch BatchNorm1d does), whereas the ONNX version uses N.
- Parameters
args – A list of input tensor ids.
num_outputs – The number of output tensor ids.
epsilon – The ‘epsilon’ attribute.
momentum – The ‘momentum’ attribute.
debugContext – Optional debug information.
- Returns
A list of normalized output tensors
-
std::vector<TensorId> groupnormalization(const std::vector<TensorId> &args, int64_t num_groups, float epsilon = 1e-05f, const DebugContext &debugContext = {})
Add a group normalization operation to the model.
This is a Poplar extension.
The group will be created from a strided input.
- Parameters
args – A vector of input tensor ids for input data x, scale scale, and bias bias, as [x, scale, bias].
].num_groups – The number of groups to separate the channels into.
epsilon – The epsilon value to use to avoid division by zero.
debugContext – Optional debug information.
- Returns
A vector of output tensor ids for output data y, the mean mean and the variance var, as [y, mean, var].
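A usage sketch (shapes illustrative; the channel count must be divisible by num_groups):

auto builder = popart::Builder::create();
auto aiGraphcore = builder->aiGraphcoreOpset1();
popart::TensorInfo xInfo{"FLOAT", std::vector<int64_t>{8, 16}};
popart::TensorInfo pInfo{"FLOAT", std::vector<int64_t>{16}};
auto x = builder->addInputTensor(xInfo);
auto scale = builder->addInputTensor(pInfo);
auto bias = builder->addInputTensor(pInfo);
auto outs = aiGraphcore.groupnormalization({x, scale, bias}, /*num_groups=*/4);
builder->addOutputTensor(outs[0]); // outs = [y, mean, var]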
-
std::vector<TensorId> multiconv(const MultiConvInputs &tensors, const MultiConvDilations &dilations = {}, const MultiConvDilations &inDilations = {}, const MultiConvPads &pads = {}, const MultiConvPads &outPads = {}, const MultiConvStrides &strides = {}, const std::vector<float> &availableMemoryProportions = {}, const std::vector<std::string> &partialsTypes = {}, const nonstd::optional<std::string> planType = nonstd::nullopt, const nonstd::optional<int> perConvReservedTiles = nonstd::nullopt, const nonstd::optional<float> cycleBackOff = nonstd::nullopt, const std::vector<int64_t> enableConvDithering = {}, const DebugContext &debugContext = {})
Add a multi-convolution operation to the model.
Using this multi-convolution API ensures that the convolutions are executed in parallel on the device.
Functionally, a multi-convolution is equivalent to a series of single convolutions. Using this multi-convolution API is always equivalent to calling the single-convolution API (conv) once for each argument.
For example, calling:
A0 = conv({X0, W0, B0})
A1 = conv({X1, W1})
is functionally equivalent to calling:
{A0, A1} = multiconv({{X0, W0, B0}, {X1, W1}}).
It is possible that any two convolutions cannot be executed in parallel due to topological constraints. For example, the following:
B = conv({A, W0});
C = B + A;
D = conv({C, W1});
cannot be converted to:
{B, D} = multiconv({{A, W0}, {C, W1}}).
Note that it is not possible to create such a cycle by adding a multi-convolution with this API.
Calls to multiconv() are mapped to poplar::poplin::multiconv::convolution().
All input vectors must be either empty, or equal in length to the number of convolutions. Note that groups for each convolution are automatically inferred from the shapes of the data and weight inputs.
See also
Optimising Temporary Memory Usage for Convolutions and Matmuls on the IPU for some practical examples of using availableMemoryProportion.
- Parameters
tensors – A list of tensor ids of the input tensors for data, weights and biases, as [data, weight, bias] for each convolution. bias is optional.
dilations – The dilations attributes for each convolution.
inDilations – The input dilations attributes for each convolution.
pads – The pads for each convolution.
outPads – The output padding for each convolution.
strides – The strides for each convolution.
availableMemoryProportions – The available memory proportions per convolution, each [0, 1).
partialsTypes – The partials type per convolution.
planType – Run convolutions in parallel or series.
perConvReservedTiles – The number of tiles to reserve per convolution when planning.
cycleBackOff – Cycle back-off proportion, [0, 1).
enableConvDithering – Enable convolution dithering per convolution. If true, then convolutions with different parameters will be laid out from different tiles in an effort to improve tile balance in models.
debugContext – Optional debug information.
- Returns
A vector of tensor ids of the output tensor from each convolution.
-
TensorId subsample(const std::vector<TensorId> &args, const std::vector<int64_t> &strides, const DebugContext &debugContext = {})
Add a sub-sample operation to the model.
This is a Poplar extension.
If multiple tensors are provided, the strides will be applied to them all.
- Parameters
args – A vector of tensor ids to sub-sample.
strides – The strides to use.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId printtensor(const std::vector<TensorId> &args, int64_t print_gradient = 1, const DebugContext &debugContext = {}, const std::string &title = {}, const int summariseThreshold = 1000, const int edgeItems = 3, const int maxLineWidth = 75, const int digits = 8, const int floatFormat = 0, const char separator = ' ', const char openBracket = '[', const char closeBracket = ']')
Add a print tensor operation to the model.
This is a Poplar extension.
- Parameters
args – A vector of tensor ids to print.
print_gradient – Indicates whether the gradient tensor(s) associated with the input tensor(s) are also printed. If 1, the gradient tensor(s) are also printed, otherwise the gradient tensor(s) are not printed.
debugContext – Optional debug information.
title – An optional title to print.
summariseThreshold – (default 1000) If the number of elements of the tensor exceeds this threshold, the output will be summarised. Only the edge elements will be displayed, with an ellipsis indicating skipped elements. A value of 0 will disable summarisation.
edgeItems – (default 3) The number of edge elements to include at the beginning and end when summarisation is enabled.
maxLineWidth – (default 75) Lines longer than this limit will be split across multiple lines. A value of 0 will disable line splitting.
digits – (default 8) The number of digits to display. For integers this limit can be exceeded if any number is large enough. For floating points this does not include the exponent. The number of digits is used in conjunction with an analysis of the tensor to determine the width of each element, so that all elements are aligned when printed. A value of 0 disables this analysis and each element will be printed in an unaligned format.
floatFormat – (default 0=auto) Determines the floating point format to use: 0=auto, 1=fixed, 2=scientific, 3=none. Automatic mode determines the appropriate format based on the data. If digits==0 this option is disregarded and the floatFormat is set to none.
separator – (default space) The character used to delineate values.
openBracket – (default square bracket) The character used to open a tensor.
closeBracket – (default square bracket) The character used to close a tensor.
- Returns
The tensor id of the result tensor.
-
TensorId nop(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a no-op operation to the model.
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId normalize_image(const std::vector<TensorId> &args, float scale, const DebugContext &debugContext = {})
Normalize image and pad it from 3 channels to 4 channels.
The input channel must be in the last dimension.
- Parameters
args – The image, offsets and scales input tensors, as required by Poplibs.
scale – The scale to apply.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId scale(const std::vector<TensorId> &args, float scale, const DebugContext &debugContext = {})
Add a scale operation to the model.
This is a Poplar extension.
- Parameters
args – A vector of input tensor ids.
scale – The scale to apply.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId scaledadd(const std::vector<TensorId> &args, float scale0, float scale1, const DebugContext &debugContext = {})
Add a scaled add operation to the model.
The scaled add operation takes the form:
X = scale0 * T0 + scale1 * T1
where scale0 is the scale factor to be applied to tensor T0 and scale1 is the scale factor to be applied to tensor T1.
- Parameters
args – A vector of input tensor ids: [T0, T1, scale0, scale1].
scale0 – The scale to apply (if no scale0 tensor is supplied).
scale1 – The scale to apply (if no scale1 tensor is supplied).
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
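A usage sketch computing X = 0.5 * T0 + 2.0 * T1 with attribute scales (tensor names illustrative):

auto builder = popart::Builder::create();
auto aiGraphcore = builder->aiGraphcoreOpset1();
popart::TensorInfo info{"FLOAT", std::vector<int64_t>{4}};
auto t0 = builder->addInputTensor(info);
auto t1 = builder->addInputTensor(info);
auto x = aiGraphcore.scaledadd({t0, t1}, /*scale0=*/0.5f, /*scale1=*/2.0f);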
-
std::vector<TensorId> lstm(const std::vector<TensorId> &args, int64_t outputFullSequence, const DebugContext &debugContext = {})
-
TensorId gelu(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a GELU operation to the model.
This is a Poplar extension.
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId geluerf(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add an accurate GELU (ERF instead of TANH) operation to the model.
- Parameters
args – A vector of input tensor IDs.
debugContext – Optional debug information.
- Returns
The tensor ID of the result tensor.
-
TensorId detach(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a detach operation to the model.
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId depthtospace(const std::vector<TensorId> &args, int64_t blocksize, const std::string &mode = "DCR", const DebugContext &debugContext = {})
Add a depth-to-space operation to the model.
This allows DepthToSpace_11 to be targeted from earlier opsets.
The purpose of a depth-to-space operation, also known as pixel shuffling, is to rearrange data from the depth (channels) dimension into the spatial (width and height) dimensions. It is an efficient means of learning upsampling alongside mixing convolution with bilinear interpolation and using transpose convolution.
- Parameters
args – A vector containing a single tensor id of the input tensor of shape [N, C, H, W], where N is the batch axis, C is the channel or depth, H is the height and W is the width.
blocksize – The size of the blocks to be moved. If the input is [N, C, H, W] and the blocksize is B, the output will be [N, C/(B*B), H*B, W*B].
mode – Specifies how the data is rearranged:
”DCR” (default): depth-column-row order
”CRD”: column-row-depth order
debugContext – Optional debug information.
- Returns
A tensor which is a rearrangement of the input tensor.
-
TensorId round(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a rounding operation to the model.
This allows Round_11 to be targeted from earlier opsets.
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId init(Attributes::Ints shape, Attributes::Int data_type, Attributes::Int init_type, Attributes::Int batch_axis, const DebugContext &debugContext = {})
Add an init operation to the model.
- Parameters
shape – The shape of the tensor to initialise.
data_type – The data type to initialise tensor with. The value is the integer attribute taken from the DataType enum.
init_type – The mode of the tensor initialisation. The value is the integer attribute taken from the InitType enum.
batch_axis – Batch axis specifies the axis that the batches are split along and is a literal integer.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId init(Attributes::Ints shape, Attributes::Int data_type, Attributes::Int init_type, const DebugContext &debugContext = {})
Add an init operation to the model.
- Parameters
shape – The shape of the tensor to initialise.
data_type – The data type to initialise tensor with. The value is the integer attribute taken from the DataType enum.
init_type – The mode of the tensor initialisation. The value is the integer attribute taken from the InitType enum.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId dynamicslice(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, Attributes::Int noOverlap, const DebugContext &debugContext = {})
Add a dynamic slice operation to the model.
Creates a new slice tensor, slice, at offset position, offset, in a tensor, tensor. For example:
slice = tensor[offset]
- Parameters
args – A vector of input tensor ids: [tensor, offset].
axes – The axes along which to slice.
sizes – The size of the slice along each axis.
noOverlap – Indicates whether the slice regions overlap or not. If 1, slice regions do not overlap, otherwise they do overlap.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
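A sketch taking one row of an [8, 4] tensor at a runtime offset (the names and the UINT32 offset type are illustrative assumptions):

auto builder = popart::Builder::create();
auto aiGraphcore = builder->aiGraphcoreOpset1();
popart::TensorInfo tInfo{"FLOAT", std::vector<int64_t>{8, 4}};
popart::TensorInfo oInfo{"UINT32", std::vector<int64_t>{1}};
auto tensor = builder->addInputTensor(tInfo);
auto offset = builder->addInputTensor(oInfo);
// Slice size 1 along axis 0; noOverlap=1 promises disjoint slice regions.
auto slice = aiGraphcore.dynamicslice({tensor, offset}, {0}, {1}, 1);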
-
TensorId dynamicupdate(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, Attributes::Int noOverlap, const DebugContext &debugContext = {})
Add a dynamic update operation to the model.
Creates a copy of a tensor, tensor, and updates the elements of the copied tensor at offset position, offset, with the elements contained in the slice tensor, slice. For example:
out = tensor
out[offset] = slice
- Parameters
args – A vector of input tensor ids: [tensor, offset, slice].
axes – The axes along which to update.
sizes – The size of the slice along each axis.
noOverlap – Indicates whether the updates overlap or not. If 1, the updates do not overlap, otherwise they do overlap.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId dynamiczero(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, const DebugContext &debugContext = {})
Add a dynamic zero operation to the model.
Creates a copy of a tensor, tensor, with a slice tensor at offset position, offset, set to zero. For example:
out = tensor
out[offset] = 0.0
- Parameters
args – A vector of input tensor ids: [tensor, offset].
axes – The axes along which to zero elements.
sizes – The size of the slice along each axis.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId dynamicadd(const std::vector<TensorId> &args, Attributes::Ints axes, Attributes::Ints sizes, const DebugContext &debugContext = {})
Add a dynamic add operation to the model.
Creates a copy of a tensor, tensor, with a slice tensor, slice, added at an offset position, offset. For example:
out = tensor
out[offset] += slice
- Parameters
args – A vector of input tensor ids: [tensor, offset, slice].
].axes – The axes along which to add the slice.
sizes – The size of the slice along each axis.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId sequenceslice(const std::vector<TensorId> &args, Attributes::Int zeroUnused, const DebugContext &debugContext = {})
Slice a 2D tensor based on offsets.
The outermost dimension is sliced. For the following:
source is the source tensor.
destination is the destination tensor.
N is the number of elements to copy.
sourceOffset is the first element read from the source tensor.
destinationOffset is the first element written to in the destination tensor.
Then, for each entry in N, sourceOffset and destinationOffset:
destination[destinationOffset:destinationOffset+N][...] = source[sourceOffset:sourceOffset+N][...]
Entries after the first N==0 may be ignored. Unreferenced elements of destination are zeroed if zeroUnused is set. The same output element should not be written by multiple inputs.
source and destination must have rank greater than or equal to 2. The outer dimension is sliced; the product of the inner dimensions must match. sourceOffset, destinationOffset and N must be 1-dimensional and of the same size. For example:
N = [1, 1, 1]
sourceOffset = [0, 2, 4]
destinationOffset = [0, 1, 2]
- Parameters
args – A vector of input tensor ids for the following tensors: [source, destination, N, sourceOffset, destinationOffset].
zeroUnused – Determines whether to zero unreferenced destination elements. If 1, the unreferenced elements are zeroed, otherwise they are not zeroed.
debugContext – Optional debug information.
-
std::vector<TensorId> call(const std::vector<TensorId> &args, unsigned num_outputs, const Builder &callee, const DebugContext &debugContext = {})
Add a call operation to the model.
This is a Poplar extension, to expose manual code re-use to the builder.
- Parameters
args – A vector of input tensor ids.
num_outputs – The number of output tensors of the subgraph.
callee – The subgraph to call into.
debugContext – Optional debug information.
- Returns
A vector of tensors; the subgraph outputs.
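A sketch of building a subgraph with createSubgraphBuilder() and calling it (the ReLU body and names are illustrative):

auto builder = popart::Builder::create();
auto aiGraphcore = builder->aiGraphcoreOpset1();
popart::TensorInfo info{"FLOAT", std::vector<int64_t>{2}};
auto x = builder->addInputTensor(info);
// Build the callee: one input, one output.
popart::Builder &sub = builder->createSubgraphBuilder();
auto subIn = sub.addInputTensor(info);
sub.addOutputTensor(sub.aiOnnxOpset9().relu({subIn}));
// Call the subgraph, expecting one output.
auto outs = aiGraphcore.call({x}, /*num_outputs=*/1, sub);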
-
TensorId replicatedallreduce(const std::vector<TensorId> &args, const nonstd::optional<std::vector<int64_t>> &commGroup = nonstd::nullopt, const DebugContext &debugContext = {})
DEPRECATED: Add a replicated allreduce operation to the model.
This is a Poplar extension, to expose manual code re-use to the builder.
- Parameters
args – A vector of input tensor ids to reduce across.
commGroup – GCL CommGroup parameter.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId replicatedallreduce(const std::vector<TensorId> &args, const nonstd::optional<CollectiveOperator> &collectiveOperator = nonstd::nullopt, const nonstd::optional<CommGroup> &commGroup = nonstd::nullopt, const DebugContext &debugContext = {})
Add a replicated allreduce operation to the model.
This is a Poplar extension, to expose manual code re-use to the builder.
- Parameters
args – A vector of input tensor ids to reduce across.
collectiveOperator – A Graphcore Communication Library (GCL) collective operator.
commGroup – A GCL CommGroup parameter.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId replicatedreducescatter(const std::vector<TensorId> &args, const nonstd::optional<CollectiveOperator> &collectiveOperator = nonstd::nullopt, const nonstd::optional<CommGroup> &commGroup = nonstd::nullopt, const DebugContext &debugContext = {})
Add a replicated reduce-scatter operation to the model.
This is a Poplar extension, to expose manual code re-use to the builder.
- Parameters
args – A vector of input tensor ids to reduce across.
collectiveOperator – A Graphcore Communication Library (GCL) collective operator.
commGroup – A GCL CommGroup parameter.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId l1loss(const std::vector<TensorId> &args, const float lambda, const ReductionType reduction = ReductionType::Mean, const DebugContext &debugContext = {})
Add an l1 loss operation to the model.
Calculates the mean absolute error between each element in the input and a zero target.
- Parameters
args – A vector of input tensor ids.
lambda – The scale factor of the L1 loss.
reduction – The type of reduction to perform on the individual losses.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId nllloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const nonstd::optional<int> ignoreIndex = nonstd::nullopt, bool inputIsLogProbability = false, const DebugContext &debugContext = {})
Add a negative log-likelihood loss operation to the model.
Calculates the negative log likelihood (NLL) loss given a probability tensor over classes, and a target tensor containing class labels.
- Parameters
args – A vector of input tensor ids: the probability tensor and the target tensor.
reduction – The type of reduction to perform on the individual losses.
ignoreIndex – Optional class index to ignore in loss calculation.
inputIsLogProbability – If true, the input tensor contains log-probabilities; otherwise, raw probabilities. Default = false.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId identityloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const DebugContext &debugContext = {})
Add an identity loss operation to the model.
Calculates the loss using the identity operator.
- Parameters
args – A vector of input tensor ids.
reduction – The type of reduction to perform on the individual losses.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId tensorremap(const std::vector<TensorId> &args, Attributes::Int remap_type, const DebugContext &debugContext = {})
Add a tensor remap operation to the model.
Changes the tensor layout to conform to the downstream consumers, which means the consumers can read the tensor without having to rearrange it.
- Parameters
args – The tensor id of the tensor to remap. This is a single tensor that should be copied to a new tensor with a tensor layout conforming to the downstream consumer.
remap_type – The type of remap to perform on the forward/backward pass. Backward pass remapping requires the op to exist in the IR before autodiff. The value is the integer attribute value of the enum TensorRemapType.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId ctcloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const unsigned blank = 0, const std::string &outDataType = "UNDEFINED", const bool zeroInfinity = false, const DebugContext &debugContext = {})
Add a connectionist temporal classification (CTC) loss operation to the model.
With maximum input length T, batch size N, number of classes C and maximum target length S, this op calculates the CTC loss for a logarithmised probabilities tensor with shape [T, N, C], a class target tensor with shape [N, S], an input lengths tensor [N] and a target lengths tensor [N].
Note that C includes a blank class (default=0). The probabilities tensor is padded as required. Target sequences are also padded and are populated with values less than or equal to C, not including the blank class, up to their respective target lengths. Note that target lengths cannot exceed input lengths.
- Parameters
args – A vector of input tensor ids [log_probs, targets, input_lengths, target_lengths].
reduction – The type of reduction to perform on the individual losses.
blank – The integer representing the blank class.
outDataType – The data type of the output tensors. Default = UNDEFINED.
zeroInfinity – If true, infinite losses and the associated gradients are zeroed-out. Default = false.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
std::vector<TensorId> _ctcloss(const std::vector<TensorId> &args, const ReductionType reduction = ReductionType::Mean, const unsigned blank = 0, const std::string &outDataType = "UNDEFINED", const bool zeroInfinity = false, const DebugContext &debugContext = {})
-
std::vector<TensorId> ctcbeamsearchdecoder(const std::vector<TensorId> &args, unsigned blank = 0, unsigned beamWidth = 100, unsigned topPaths = 1, const DebugContext &debugContext = {})
Add a connectionist temporal classification (CTC) beam search decoder operation to the model.
Calculate the most likely topPaths labels and their probabilities given the input logProbs with lengths dataLengths.
- Parameters
args – A vector of input tensor ids. These are [logProbs, dataLengths], where logProbs is of shape [maxTime, batchSize, numClasses], and dataLengths is of shape [batchSize].
blank – The integer representing the blank class.
beamWidth – The number of beams to use when decoding.
topPaths – The number of most likely decoded paths to return, must be less than or equal to beamWidth.
debugContext – Optional debug information.
- Returns
The names of the result tensors. These are [labelProbs, labelLengths, decodedLabels], where labelProbs is of shape [batchSize, topPaths], labelLengths is of shape [batchSize, topPaths], and decodedLabels is of shape [batchSize, topPaths, maxTime].
-
TensorId shapeddropout(const std::vector<TensorId> &args, const std::vector<int64_t> &shape, float ratio = 0.5f, const DebugContext &debugContext = {})
Add a shaped dropout operation to the model.
Applies a shaped dropout to the input tensor. This operator requires a shape parameter that is used to define the shape of the dropout mask so that strongly correlated features in the input tensor can be preserved. The provided shape must be broadcastable to the input tensor. Note that this operation targets the poprand library function of the same name.
- Parameters
args – A vector of input tensor ids.
shape – The shape of dropout mask. This must be broadcastable to the input.
ratio – The probability of dropping an input feature. Default = 0.5.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId atan2(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add an atan2 operation to the model.
Returns the element-wise angle theta as a tensor. For \( -\pi < \theta \le \pi \), such that for two input tensors \(x\) and \(y\) and given \( r \ne 0 \), then \( x = r \cos\theta \) and \( y = r \sin\theta \), element-wise.
In the case of \( x > 0 \), \( \theta = \arctan(y/x) \).
- Parameters
args – A vector of input tensor ids: [y, x].
].debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId expm1(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add an expm1 operation to the model.
This calculates the element-wise exponential of the input tensor and subtracts one: \( exp(x) - 1 \).
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId log1p(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a log1p operation to the model.
This calculates the element-wise logarithm of the input tensor plus one: \( log(x + 1) \).
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId reshape(const TensorId &arg, const Attributes::Ints &shape, const DebugContext &debugContext = {})
Add a reshape operation to the model.
This reshapes an input tensor. This reshape takes the target shape as an attribute instead of a tensor input as for the ONNX reshape op.
- Parameters
arg – The tensor id of the input tensor.
shape – The shape of the output tensor. The output tensor must contain the same number of elements as the input tensor.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId fmod(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add an fmod operation to the model.
This is equivalent to the C fmod function. The result has the same sign as the dividend.
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
- Returns
Computes the element-wise remainder of division. The remainder has the same sign as the dividend.
-
TensorId remainder(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a remainder operation to the model.
This is equivalent to Python’s modulo operator %. The result has the same sign as the divisor.
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
- Returns
Computes the element-wise remainder of division. The remainder has the same sign as the divisor.
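A short sketch contrasting the two sign conventions (values illustrative): for a = -7 and b = 3, fmod yields -1 (sign of the dividend) while remainder yields 2 (sign of the divisor).

auto builder = popart::Builder::create();
auto aiGraphcore = builder->aiGraphcoreOpset1();
popart::TensorInfo info{"INT32", std::vector<int64_t>{1}};
auto a = builder->addInputTensor(info); // e.g. holds -7
auto b = builder->addInputTensor(info); // e.g. holds 3
auto f = aiGraphcore.fmod({a, b});      // -1: same sign as the dividend
auto r = aiGraphcore.remainder({a, b}); //  2: same sign as the divisor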
-
TensorId reverse(const std::vector<TensorId> &args, const std::vector<int64_t> &dimensions, const DebugContext &debugContext = {})
Add a reverse operator to the model.
This reverses or flips the tensor along the specified dimensions.
- Parameters
args – A vector of input tensor ids.
dimensions – The dimensions along which to reverse the tensor. If this is empty then this is equivalent to the identity operator.
debugContext – Optional debug information.
- Returns
The tensor id of the reversed tensor.
-
TensorId slice(const std::vector<TensorId> &args, const std::vector<int64_t> &ends, const std::vector<int64_t> &starts, const std::vector<int64_t> &axes = std::vector<int64_t>(), const popart::DebugContext &debugContext = {})
Add a slice to the model.
This version of slice uses the starts, ends and axes attributes rather than tensor inputs. This reduces the number of ops, as constant tensors are treated as ops while attributes are not.
- Parameters
args – A vector of input tensor ids.
ends – The ends attribute.
starts – The starts attribute.
axes – The axes attribute.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
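A sketch slicing rows [0, 2) of a [4, 3] tensor with the attribute form (names illustrative; note the argument order is ends before starts):

auto builder = popart::Builder::create();
auto aiGraphcore = builder->aiGraphcoreOpset1();
popart::TensorInfo info{"FLOAT", std::vector<int64_t>{4, 3}};
auto x = builder->addInputTensor(info);
auto y = aiGraphcore.slice({x}, /*ends=*/{2}, /*starts=*/{0}, /*axes=*/{0}); // shape [2, 3]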
-
TensorId packedDataBlock(const std::vector<TensorId> &args, const std::vector<int64_t> &maxSequenceLengths, int64_t resultSize, int64_t callbackBatchSize, const Builder &callback, const DebugContext &debugContext = {})
Add a packedDataBlock operator to the model.
Unpack packed sequences of data and call the callback function on the unpacked sequences.
- Parameters
args – A vector of input tensor ids.
maxSequenceLengths – The maximum length of a sequence in each of the data inputs.
resultSize – The size of the first dimension of the result tensor.
callbackBatchSize – The number of batches to pass to the callback.
callback – The callback function.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
void abort(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add an abort operation to the model.
The operation can be conditional or unconditional.
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
-
TensorId bitwisenot(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a bitwise NOT operation to the model.
The operation computes the bitwise NOT of an integer tensor.
- Parameters
args – An input tensor of type integer.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId bitwiseand(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a bitwise AND operation to the model.
The operation computes the bitwise AND of two integer tensors.
- Parameters
args – Two broadcastable input tensors of type integer.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId bitwiseor(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a bitwise OR operation to the model.
The operation computes the bitwise OR of two integer tensors.
- Parameters
args – Two broadcastable input tensors of type integer.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId bitwisexor(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a bitwise XOR operation to the model.
The operation computes the bitwise XOR of two integer tensors.
- Parameters
args – Two broadcastable input tensors of type integer.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId bitwisexnor(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a bitwise XNOR operation to the model.
The operation computes the bitwise XNOR of two integer tensors.
- Parameters
args – Two broadcastable input tensors of type integer.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
std::vector<TensorId> reducemedian(const std::vector<TensorId> &args, const nonstd::optional<std::vector<int64_t>> &axes = nonstd::nullopt, int64_t keepdims = 1, const DebugContext &debugContext = {})
Add a reducemedian operation to the model.
This method computes the median values along the specified axes. In the case of an even number of elements, the lower of the two medians is selected. By default, the input tensor is reduced over all axes. Additionally, the operation returns the indices of the found median values in the reduction axis. If reduction is performed over multiple axes, the indices are "flattened" over the reduced axes, similar to numpy.ndarray.flat. The index may not be the first occurrence of the median value found in the input tensor.
- Parameters
args – A vector with a single input tensor id.
axes – The axes over which the reduction is performed.
keepdims – If 1, the result tensors are of equal size as the input, but with reduction axes of size 1. Otherwise, the reduction axes are squeezed and the result tensors have fewer dimensions compared to the input. Default = 1.
debugContext – Optional debug information.
- Returns
The names of the two result tensors, one for median values and one for indices.
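A minimal usage sketch (assuming a Builder named builder and a 2-D input tensor id input):
C++:
auto opset = builder.aiGraphcoreOpset1();
// Median over axis 1, keeping the reduced dimension as size 1.
// outs[0] holds the median values, outs[1] the (flattened) indices.
std::vector<popart::TensorId> outs =
    opset.reducemedian({input}, std::vector<int64_t>{1}, /*keepdims=*/1);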
-
TensorId groupedgather(const std::vector<TensorId> &args, Attributes::Int axis = 0, Attributes::Int group_size = 1, const DebugContext &debugContext = {})
-
TensorId groupedscatterreduce(const std::vector<TensorId> &args, Attributes::Int axis_size, Attributes::Int axis = -1, ScatterReduction reduction = ScatterReduction::Sum, Attributes::Int group_size = 1, Attributes::Int enable_index_broadcast = 1, const DebugContext &debugContext = {})
Add a grouped scatterreduce operation to the model.
Reduces all the values from the source tensor src at the indices specified along the given axis by index, for each group. In some frameworks this is also known as a split-apply-combine operation, as well as a reduce or aggregate by key. In this analogy the src input is the data we are splitting and the indices define the groups for the reduction operation.
In pseudocode the operator can be expressed as:
for g in range(group_size):
    for i in range(axis_size):
        output[g][i] = reduce(src[g][index == i])
where the looping over output indices is implicitly handled by poplar.
- Parameters
args – A vector of tensor ids as [src, index, initial_values]. initial_values is optional and if omitted the output will be initialised based on the selected reduction type. For example, a tensor of zeros is used to initialise the output tensor for ScatterReduction::Sum.
axis_size – The size of the reduced axis.
axis – The axis to reduce along. Default = -1.
reduction – The type of reduction to apply. Default = ScatterReduction::Sum.
group_size – The number of groups to reduce. Default = 1.
enable_index_broadcast – If 1, index will be broadcast to match the data tensor size; otherwise (0) its size will remain unchanged. Default = 1.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId scatterreduce(const std::vector<TensorId> &args, Attributes::Int axis_size, Attributes::Int axis = -1, ScatterReduction reduction = ScatterReduction::Sum, Attributes::Int enable_index_broadcast = 1, const DebugContext &debugContext = {})
Add a scatterreduce operation to the model.
Reduces all the values from the source tensor src at the indices specified along the given axis by index. In some frameworks this is also known as a split-apply-combine operation, as well as a reduce or aggregate by key. In this analogy the src input is the data we are splitting and the indices define the groups for the reduction operation.
In pseudocode the operator can be expressed as:
for i in range(axis_size):
    output[i] = reduce(src[index == i])
where the looping over output indices is implicitly handled by poplar.
- Parameters
args – A vector of tensor ids as [src, index, initial_values]. initial_values is optional and if omitted the output will be initialised based on the selected reduction type. For example, a tensor of zeros is used to initialise the output tensor for ScatterReduction::Sum.
axis_size – The size of the reduced axis.
axis – The axis to reduce along. Default = -1.
reduction – The type of reduction to apply. Default = ScatterReduction::Sum.
enable_index_broadcast – If 1, index will be broadcast to match the data tensor size; otherwise (0) its size will remain unchanged. Default = 1.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
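A minimal usage sketch (assuming a Builder named builder, a data tensor id src, an integer index tensor id index, and a bucket count axisSize):
C++:
auto opset = builder.aiGraphcoreOpset1();
// Sum rows of src into axisSize buckets selected by index along axis 0.
popart::TensorId out = opset.scatterreduce(
    {src, index}, /*axis_size=*/axisSize, /*axis=*/0,
    popart::ScatterReduction::Sum);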
-
TensorId swish(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a swish operation to the model.
The operation computes the swish activation function, also known as the SiLU activation.
- Parameters
args – A vector with a single input tensor id.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId incrementmod(const std::vector<TensorId> &args, Attributes::Float increment, Attributes::Float modulus, const DebugContext &debugContext = {})
Add an incrementmod operation to the model.
The operation is of the form y = (x + increment) % modulus.
- Parameters
args – A vector with a single input tensor id.
increment – A scalar increment.
modulus – A scalar modulus.
debugContext – Optional debug information.
- Returns
The tensor id of the result tensor.
-
TensorId bucketize(const std::vector<TensorId> &args, Attributes::Int right = 0, const DebugContext &debugContext = {})
Add a bucketize operation to the model.
The operation returns the indices of the buckets to which each value in the input tensor belongs. The ranges of each bucket are defined by the boundaries tensor. The returned index satisfies the following rules:
right == 1: boundaries[i-1] <= input[m][n]…[l][x] < boundaries[i]
right == 0: boundaries[i-1] < input[m][n]…[l][x] <= boundaries[i]
- Parameters
args – A vector of tensor IDs containing [input, boundaries], where input is an N-D tensor or a scalar containing the search values, and boundaries is a 1-D tensor defining the ranges of the buckets. This must contain a monotonically increasing sequence.
right – If 0 (default), the right boundary of each bucket is closed; if 1, the left boundary is closed, as in the rules above.
debugContext – Optional debug information.
- Returns
The tensor ID of the result tensor. The result tensor has the same size and shape as the input tensor.
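The two settings of right correspond to the two flavours of binary search. The following host-side sketch mirrors the rules above for a single value (a reference for the semantics, not the PopART kernel):
C++:
#include <algorithm>
#include <vector>

int bucketIndex(float x, const std::vector<float> &boundaries, bool right) {
  // right == 1: first i with boundaries[i] >  x, so boundaries[i-1] <= x < boundaries[i].
  // right == 0: first i with boundaries[i] >= x, so boundaries[i-1] <  x <= boundaries[i].
  auto it = right ? std::upper_bound(boundaries.begin(), boundaries.end(), x)
                  : std::lower_bound(boundaries.begin(), boundaries.end(), x);
  return static_cast<int>(it - boundaries.begin());
}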
-
std::vector<TensorId> sort(const std::vector<TensorId> &args, Attributes::Int axis = -1, Attributes::Int descending = 0, Attributes::Int stable = 0, const popart::DebugContext &debugContext = {})
Add a sort operation to the model.
- Parameters
args – A vector with a single input tensor id.
axis – The dimension to sort along.
descending – If 1 then the elements are sorted in descending order by value.
stable – If 1 then the sorting routine becomes stable, preserving the order of equivalent elements.
- Returns
A vector of (values, indices) is returned, where the values are the sorted values and indices are the indices of the elements in the original input tensor.
-
TensorId nearbyint(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a nearby int rounding operation to the model.
Rounds the floating-point argument to an integer value in floating-point format.
- Parameters
args – A vector of input tensor ids.
debugContext – Optional debug information.
- Returns
The normalized output tensor id.
-
std::vector<TensorId> splinebasis(const std::vector<TensorId> &args, Attributes::Int degree = 1, const DebugContext &debugContext = {})
Add a splinebasis operation to the model.
The operation returns two outputs: coefficients for the B-spline basis functions and weight indices for each spline coefficient.
- Parameters
args – A vector of tensor IDs containing [pseudo, kernel_size, is_open_spline], where pseudo is a 2-D tensor with pseudo coordinates, of shape [numEdges * numDims]; kernel_size is a 1-D tensor containing the kernel size at each dimension of the edge pseudo coordinates; and is_open_spline is a 1-D tensor that, for each dimension, encodes whether an open or a closed B-spline basis function must be used.
degree – The degree of the B-spline basis function.
- Returns
The basis and weightIndex tensors, both of shape [numEdges * numSplines]. basis contains the coefficients for the B-spline basis functions. weightIndex contains the weight indices for each spline.
-
TensorId splineweighting(const std::vector<TensorId> &args, const DebugContext &debugContext = {})
Add a splineweighting operation to the model.
The operation returns features weighted by a continuous B-spline kernel function.
- Parameters
args – A vector of tensor IDs containing [input, weight, basis, weightIndex], where input is a 2-D tensor (size: [numEdges * numInputChannels]) with input features; weight is a 3-D tensor (size: [numEdges * numInputChannels * numOutputChannels]) containing weights for the B-spline functions; basis is a 2-D tensor (size: [numEdges * numSplines]) of the coefficients for the B-spline basis functions, produced by the splinebasis op; and weightIndex is a 2-D tensor (size: [numEdges * numSplines]) of the weight indices produced by the splinebasis op.
debugContext – Optional debug information.
- Returns
A tensor of shape [numEdges * numOutputChannels] containing features weighted by a continuous B-spline kernel function.
-
inline AiGraphcoreOpset1(std::unique_ptr<BuilderImpl> &impl_)
#include <popart/scope.hpp>
-
class Scope
14.6. Data flow
#include <popart/dataflow.hpp>
-
enum class popart::AnchorReturnTypeId
Class that defines the identifiers for the return type of the anchor tensors.
An anchor tensor is a tensor that the user wants returned after a call to Session::run(). Each call to Session::run() results in batchesPerStep x accumulationFactor x replicationFactor anchor tensor values being computed. The samples associated with each computation are called a micro batch. The dimensions are user-specified with the following parameters:
batchesPerStep is the number of batches per step and the value is obtained from the DataFlow object.
accumulationFactor is the gradient accumulation factor and the value is defined by SessionOptions::accumulationFactor.
replicationFactor is the number of replicas and the value is defined by SessionOptions::replicatedGraphCount.
This enum type describes the strategy with which the micro batch values for anchor tensors (or their summaries) are written to the IStepIO instance passed to Session::run.
NOTE: Anchors are essentially what TensorFlow calls "fetches".
Values:
-
enumerator Final = 0
Only return the tensor value for the last micro batch of the Session::run call for each replica.
The buffer shape required for this anchor in IStepIO is [replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).
-
enumerator EveryN
Return the tensor value for every N-th global batch for each replica and for all accumulation steps in that global batch.
Note that the value of N is captured by AnchorReturnType.
The buffer shape required for this anchor in IStepIO is [batchesPerStep / N, accumulationFactor, replicationFactor, <anchorTensorShape>] (with dimensions of size 1 removed).
-
enum class popart::ExchangeStrategy
Enum type to specify an exchange strategy.
JustInTime:
.- outer loop ------------.
| .- inner loop --------. |
| | load-compute-store | |
| '--------------------' |
'-------------------------'
OverlapInnerLoop:
Boxes denote subgraphs / subgraph Ops / loops
Inputs/outputs are loop carried in order
.- outer loop ----------------------------------------.
|                  .- inner loop -.                   |
| load - compute - | - store      |                   |
|           load - | - compute -- | - store           |
|                  |   load ----- | - compute - store |
|                  '--------------'                   |
'-----------------------------------------------------'
          ^^^^^^^       ^^^^^^^        ^^^^^^^
          overlap       overlap        overlap
OverlapLoops:
Boxes denote subgraphs / subgraph Ops / loops
Numbers on boxes are matching subgraph/loop inputs and outputs
Overlap indicators indicate compute and load/store pairs overlapping in time; with this strategy, loads and stores overlap with compute both within the inner loop and across the boundary between the inner and outer loops.
OverlapStep: Not supported yet
Values:
-
enumerator JustInTime = 0
Copy tensor when required.
-
enumerator OverlapInnerLoop = 1
Preload values in previous inner loop iteration for the next iteration.
-
enumerator OverlapLoops = 2
Preload values in the previous loop iteration for the next iteration (implies OverlapInnerLoop).
-
enumerator OverlapStep = 3
Preload values in the previous host training step for next step (implies OverlapLoops) - not supported yet.
-
enumerator N = 4
Number of values.
-
class AnchorReturnType
Class that captures an AnchorReturnTypeId value and, when the value is AnchorReturnTypeId::EVERYN, the associated N value. The constructor takes std::string values and converts them as appropriate.
Public Functions
-
AnchorReturnType()
Default constructor for the AnchorReturnType class.
-
AnchorReturnType(std::string artString, TileSet tileSet = TileSet::Compute, ExchangeStrategy exchangeStrategy = ExchangeStrategy::JustInTime)
Constructor for the AnchorReturnType class.
NOTE: Attempting to construct an AnchorReturnType for AnchorReturnTypeId::EVERYN using this constructor will result in an error. Use AnchorReturnType(std::string, int, TileSet, ExchangeStrategy), which also specifies the return period.
- Parameters
artString – The string to convert to an AnchorReturnTypeId value. The following values are acceptable (case insensitive):
"final" = AnchorReturnTypeId::FINAL
"all" = AnchorReturnTypeId::ALL
"sum" = AnchorReturnTypeId::SUM
tileSet – (Optional) The type of the tile set. Default: TileSet::Compute.
exchangeStrategy – (Optional) The overlap strategy (between IO and compute) for anchor tensors. Default: ExchangeStrategy::JustInTime.
-
AnchorReturnType(std::string artString, int returnPeriod, TileSet tileSet = TileSet::Compute, ExchangeStrategy exchangeStrategy = ExchangeStrategy::JustInTime)
Constructor for the AnchorReturnType class.
- Parameters
artString – The string to convert to an AnchorReturnTypeId value. The following values are acceptable (case insensitive):
"final" = AnchorReturnTypeId::FINAL
"everyn" = AnchorReturnTypeId::EVERYN (requires returnPeriod)
"all" = AnchorReturnTypeId::ALL
"sum" = AnchorReturnTypeId::SUM
returnPeriod – The value of N in the case of AnchorReturnTypeId::EVERYN.
tileSet – (Optional) The type of the tile set. Default: TileSet::Compute.
exchangeStrategy – (Optional) The overlap strategy (between IO and compute) for anchor tensors. Default: ExchangeStrategy::JustInTime.
-
inline const std::string &str() const
Get the string representation of the AnchorReturnTypeId value.
-
inline const ExchangeStrategy &exchangeStrategy() const
Get the type of overlap strategy.
-
class DataFlow
This class specifies parameters for host-device data streams.
The parameters are used to control the amount of input data processed in each step, that is, each Session::run call. The parameters also determine how data is returned to the user.
See also
AnchorReturnType, AnchorReturnTypeId.
Public Functions
-
DataFlow()
Default constructor.
This constructor sets batchesPerStep to 0 and does not have any anchor tensors.
-
DataFlow(int batchesPerStep)
Construct a DataFlow instance without anchor tensors.
- Parameters
batchesPerStep – The number of global batches to run in the inference or training session for each call to Session::run before returning control to the caller.
-
DataFlow(int batchesPerStep, const AnchorReturnTypeMap &anchorMap)
Construct a DataFlow instance with anchor tensors.
- Parameters
batchesPerStep – The number of global batches to run in the inference or training session for each call to Session::run before returning control to the caller.
anchorMap – A mapping from output tensor TensorId to AnchorReturnType indicating the strategy with which to write the anchor tensor values to the IStepIO object provided to Session::run.
-
DataFlow(int batchesPerStep, const std::vector<TensorId> anchorTensorIds, const AnchorReturnType &anchorReturnType = AnchorReturnType("All"))
Construct a DataFlow instance with anchor tensors.
- Parameters
batchesPerStep – The number of global batches to run in the inference or training session for each call to Session::run before returning control to the caller.
anchorTensorIds – The tensor ID of anchor tensors.
anchorReturnType – The strategy with which to write anchor tensor values to the IStepIO object provided to Session::run.
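As a minimal sketch combining the constructors above (the anchor tensor ids "out" and "loss" are assumed to exist in the model):
C++:
#include <popart/dataflow.hpp>

popart::DataFlow makeDataFlow() {
  // Return only the final micro batch of "out" and every 4th global
  // batch of "loss", with 8 batches per step.
  popart::AnchorReturnTypeMap anchors = {
      {"out", popart::AnchorReturnType("Final")},
      {"loss", popart::AnchorReturnType("EveryN", 4)}};
  return popart::DataFlow(/*batchesPerStep=*/8, anchors);
}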
-
inline void setBatchesPerStep(const int batchesPerStep)
Set the value for batchesPerStep.
-
class InputSettings
Class that describes the TileSet, ExchangeStrategy, and ReplicatedStreamMode used for an input tensor.
Public Functions
-
InputSettings()
Constructor for the InputSettings class.
-
InputSettings(TileSet tileSet, ExchangeStrategy exchangeStrategy)
Constructor for the InputSettings class.
- Parameters
tileSet – The type of the tile set.
exchangeStrategy – The overlap strategy (between IO and compute) for anchor tensors.
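For example, a sketch of settings that place an input's streams on IO tiles and overlap its loads with compute:
C++:
#include <popart/dataflow.hpp>

// Stream this input via IO tiles, overlapping loads with compute.
popart::InputSettings ioOverlap(popart::TileSet::IO,
                                popart::ExchangeStrategy::OverlapInnerLoop);
// ioOverlap can then be passed when adding the input to the model,
// for example via Builder::addInputTensor (assumed usage).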
-
InputSettings(ReplicatedStreamMode replicatedStreamMode)
Constructor for the InputSettings class.
- Parameters
replicatedStreamMode – The mode used for the replicated stream.
-
inline const ExchangeStrategy &exchangeStrategy() const
Get the type of overlap strategy.
-
inline ReplicatedStreamMode replicatedStreamMode() const
Get the mode of the replicated stream.
-
inline void setTileSet(TileSet tileSet)
Set the type of the tile set.
- Parameters
tileSet – The type of the tile set.
-
inline void setExchangeStrategy(ExchangeStrategy exchangeStrategy)
Set the overlap strategy (between IO and compute).
- Parameters
exchangeStrategy – The overlap strategy.
-
inline void setReplicatedStreamMode(ReplicatedStreamMode streamMode)
Set the mode used for the replicated stream.
- Parameters
streamMode – The mode used for the replicated stream.
-
using popart::AnchorReturnTypeMap = std::map<TensorId, AnchorReturnType>
#include <popart/replicatedstreammode.hpp>
14.7. Device manager
#include <popart/devicemanager.hpp>
-
enum class popart::DeviceType
Defines the type of device to use for graph compilation and execution.
Values:
-
enumerator IpuModel = 0
Use the Poplar IPU Model for graph compilation and execution.
The IPU Model will simulate the behaviour of the IPU hardware. It will not completely implement every aspect of a real IPU. (Default).
-
enumerator Cpu
Use CPU for graph compilation and execution.
-
enumerator Ipu
Use IPU for graph execution.
-
enumerator OfflineIpu
Compile graph for later execution.
This can be done even if IPUs are not present. Offline graph compilation is also useful for verifying memory constraints.
-
enumerator Sim
[For Graphcore internal use only] Use a simulator for graph compilation and execution.
-
enum class popart::DeviceConnectionType
Controls when to connect to the IPU (if at all).
Values:
-
enumerator Always = 0
Attach to the IPU from the start (Default).
-
enumerator OnDemand
Wait until the compilation is complete and the executable is ready to be run before attaching to the IPU.
-
enumerator Never
Never try to attach to an IPU.
This is useful for offline compilation (DeviceType::OfflineIpu). Trying to run an executable will throw an error.
-
enum class popart::SyncPattern
Controls synchronisation in multi-IPU systems.
Values:
-
enumerator Full = 0
Require all IPUs to synchronise on every communication between IPUs or between IPUs and host (Default).
-
enumerator SinglePipeline
Allow IPUs to synchronise with the host independently, without having to synchronise with each other.
This permits any one IPU to perform host IO while other IPUs are processing data.
-
enumerator ReplicaAndLadder
Allow an IPU group to communicate with the host without requiring synchronisation between groups.
This permits multiple IPU groups to alternate between performing host IO and computation.
-
class DeviceInfo
Represents a specific device.
Subclassed by popart::popx::DevicexInfo, popart::popx::DevicexOfflineIpuInfo
Public Functions
-
DeviceInfo(DeviceType _type, DeviceConnectionType _connectionType, const poplar::OptionFlags &_flags)
Constructor for the DeviceInfo class.
- Parameters
_type – The type of the device.
_connectionType – The setting for when to connect to the device, if at all.
_flags – A set of Poplar option/value string flags.
-
virtual ~DeviceInfo()
Destructor for DeviceInfo.
-
virtual bool attach() = 0
Attach to the device.
- Returns
true if successfully attached to the device, false otherwise.
-
virtual void detach() = 0
Detach from the device.
-
virtual bool isAttached() const = 0
Check if attached to the device.
- Returns
true if attached to the device, false otherwise.
-
inline DeviceType getType() const
Get the type of the device.
- Returns
The type of the device.
-
inline DeviceConnectionType getConnectionType() const
Get the setting for when to connect to the device.
- Returns
The setting for when to connect to the device.
-
std::string toString() const
Return a description of the device.
-
virtual int getId() const = 0
Get the device id.
-
virtual std::vector<int> getChildIds() const = 0
Get the child device IDs.
The value returned by getId() for a multi-IPU device is a 'parent ID' and does not relate to the IDs of the devices it comprises. This function, in the case of real devices, uses the Poplar API to work out which single-IPU device IDs it relates to. In the case of replication, a device includes all IPUs involved, so a 2-IPU model with 2x replication would expect to have 4 child IDs returned here.
-
virtual std::string getVersion() const = 0
Get the version of the software on the IPU.
-
virtual int getNumIpus() const = 0
Get the number of IPUs in the device.
-
virtual int getTilesPerIPU() const = 0
Get the number of tiles per IPU.
-
virtual int getNumWorkerContexts() const = 0
Get the number of worker contexts per tile.
-
virtual std::string getIpuVersion() const = 0
Get the IPU version.
-
virtual std::vector<unsigned> getDriverIds() const = 0
Get the version of the drivers on the IPU.
-
inline virtual bool canCompileOffline() const
Get whether the device supports offline compilation.
- Returns
true if the device supports offline compilation, false otherwise.
-
const poplar::OptionFlags &getOptionFlags() const
-
void setOnDemandAttachTimeout(const unsigned seconds)
Set timeout (in seconds) for trying to attach to a device.
If unable to attach to a device on the first try, the DeviceManager instance will periodically try to attach to the device until successfully attached or this timeout is reached.
Note
This only applies when trying to attach with DeviceConnectionType::OnDemand.
- Parameters
seconds – The timeout (in seconds) for trying to attach to the device.
-
inline const unsigned &getOnDemandAttachTimeout() const
Get timeout (in seconds) for trying to attach to a device.
- Returns
The timeout (in seconds) for trying to attach to the device.
-
bool tryAttachUntilTimeout()
Periodically try to attach to the device until either the attach timeout is reached or successfully attached.
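A sketch of the on-demand attach flow, acquiring without attaching and then waiting for hardware (DeviceManager is documented below):
C++:
#include <popart/devicemanager.hpp>

auto &dm = popart::DeviceManager::createDeviceManager();
// Acquire a device without attaching to it yet.
auto device = dm.acquireAvailableDevice(
    /*numIpus=*/1, /*tilesPerIPU=*/0, popart::SyncPattern::Full,
    popart::DeviceConnectionType::OnDemand);
device->setOnDemandAttachTimeout(60); // keep retrying for up to 60 s
if (!device->tryAttachUntilTimeout()) {
  // Still not attached after the timeout; handle the failure here.
}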
-
bool isHwCompatible() const
-
void writeToDeviceAccessLog(const std::string &event, const std::map<std::string, std::string> &auxKeyVals = {})
Log an event for device debugging purposes.
This event will get logged to the file location defined by the environment variable POPART_LOG_DEVICE_ACCESS_IN_TESTS, if it is set.
- Parameters
event – A text description of the event to be written to the log.
auxKeyVals – Optional additional parameters to log.
-
class DevicexInfo : public popart::DeviceInfo
Subclassed by popart::popx::DevicexCpuInfo, popart::popx::DevicexIpuInfo, popart::popx::DevicexIpuModelInfo, popart::popx::DevicexSimInfo
Public Functions
-
inline DevicexInfo(popart::DeviceType _type, popart::DeviceConnectionType _connectionType, poplar::Device &_device, const poplar::OptionFlags &_flags)
-
~DevicexInfo() override
-
bool attach() override
-
void detach() override
-
inline int getNumIpus() const override
-
inline int getTilesPerIPU() const override
-
inline int getNumWorkerContexts() const override
-
inline std::vector<unsigned> getDriverIds() const override
-
inline std::string getIpuVersion() const override
-
inline bool isAttached() const override
-
class DevicexCpuInfo : public popart::popx::DevicexInfo
-
class DevicexIpuInfo : public popart::popx::DevicexInfo
Public Functions
-
inline DevicexIpuInfo(popart::DeviceConnectionType _dct, int _id, poplar::Device &_device, const poplar::OptionFlags &_flags)
-
inline int getId() const override
-
std::vector<int> getChildIds() const override
-
std::string getVersion() const override
-
inline bool canCompileOffline() const override
-
class DevicexIpuModelInfo : public popart::popx::DevicexInfo
-
class DevicexSimInfo : public popart::popx::DevicexInfo
-
class DevicexOfflineIpuInfo : public popart::DeviceInfo
Public Functions
-
inline DevicexOfflineIpuInfo(poplar::Target &_target, const poplar::OptionFlags &_flags)
-
inline bool attach() override
-
inline void detach() override
-
inline int getId() const override
-
inline std::vector<int> getChildIds() const override
-
inline std::string getVersion() const override
-
inline int getNumIpus() const override
-
inline int getTilesPerIPU() const override
-
inline int getNumWorkerContexts() const override
-
inline std::string getIpuVersion() const override
-
inline std::vector<unsigned> getDriverIds() const override
-
inline bool canCompileOffline() const override
-
inline bool isAttached() const override
-
class DeviceManager
A class to manage devices.
Public Functions
-
DeviceManager(const DeviceManager&) = default
-
~DeviceManager() = default
-
void registerDeviceProvider(DeviceProvider *provider)
Register a device provider.
- Parameters
provider – The device provider to be registered with the device manager.
Get the list of all devices that satisfy the specified criteria.
- Parameters
devices – The list of devices.
requiredNumIPUs – The number of IPUs required.
syncPattern – The setting for when to synchronise in a multi-IPU system.
type – The type of the device to use for compilation and execution.
connectionType – The setting for when to connect to the device.
requiredTilesPerIPU – The number of tiles per IPU required.
-
std::vector<std::shared_ptr<DeviceInfo>> enumerateDevices(SyncPattern pattern = SyncPattern::Full, int numIpus = 1, DeviceType deviceType = DeviceType::Ipu, DeviceConnectionType connectionType = DeviceConnectionType::Always, int tilesPerIPU = 0)
Get the list of all devices with the required criteria.
- Parameters
pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
numIpus – The number of IPUs required. (Default: 1).
deviceType – The type of the device required. (Default: DeviceType::Ipu).
connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).
tilesPerIPU – The number of tiles per IPU required. (Default: 0).
- Returns
The list of devices with the required criteria.
-
std::shared_ptr<DeviceInfo> getDevice(SyncPattern syncPattern = SyncPattern::Full, uint32_t deviceManagerId = 0, DeviceConnectionType connectionType = DeviceConnectionType::Always)
Get a device with the required criteria.
- Parameters
syncPattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
deviceManagerId – The ID of the requested device. (Default: 0)
connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).
- Returns
A device, which can be used with a session. If no device is acquired, a nullptr is returned.
-
std::shared_ptr<DeviceInfo> tryAcquireAvailableDevice(int numIpus = 1, int tilesPerIPU = 0, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always, DeviceSelectionCriterion selectionCriterion = DeviceSelectionCriterion::First)
Finds an available hardware device, with the specified number of IPUs.
This method will attach to the device if connectionType is equal to DeviceConnectionType::Always. This method is suitable when polling for an available device when resources are constrained.
- Parameters
numIpus – The number of IPUs on the device (Default: 1).
tilesPerIPU – The number of tiles per IPU. An input of 0 will match any number. (Default: 0).
pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
connectionType – The setting for when to connect to the device. (Default: DeviceConnectionType::Always).
selectionCriterion – The method for selecting a device from the list of valid selections. (Default: DeviceSelectionCriterion::First).
- Returns
A device, which can be used with a session. If no device is acquired, a nullptr is returned.
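A polling sketch that falls back to the IPU Model when no hardware device is free:
C++:
#include <popart/devicemanager.hpp>
#include <map>
#include <memory>

std::shared_ptr<popart::DeviceInfo> getDeviceOrModel() {
  auto &dm = popart::DeviceManager::createDeviceManager();
  // Returns nullptr rather than throwing when nothing is available.
  auto device = dm.tryAcquireAvailableDevice(/*numIpus=*/1);
  if (!device) {
    std::map<std::string, std::string> opts = {{"numIPUs", "1"}};
    device = dm.createIpuModelDevice(opts);
  }
  return device;
}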
-
std::shared_ptr<DeviceInfo> acquireAvailableDevice(int numIpus = 1, int tilesPerIPU = 0, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always, DeviceSelectionCriterion selectionCriterion = DeviceSelectionCriterion::First)
Finds an available hardware device, with a certain number of IPUs.
This method will attach to the device if connectionType is equal to DeviceConnectionType::Always. Throws an error if fewer than numIpus IPUs are available.
- Parameters
numIpus – The number of IPUs on the device. (Default: 1).
tilesPerIPU – The number of tiles per IPU. An input of 0 will match any number. (Default: 0).
pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
connectionType – The connection type, for deciding when to attach to the device.
selectionCriterion – How to select a device from the list of valid selections.
- Returns
A device, which can be used with a session.
-
std::shared_ptr<DeviceInfo> tryAcquireDeviceById(int id, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always)
Allocates the hardware device by ID.
This ID can be found by running gc-info -l. This method will try to attach to the device if connectionType is equal to DeviceConnectionType::Always. This method is suitable when polling for an available device when resources are constrained.
- Parameters
id – The ID of the IPU to be used.
pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
connectionType – The connection type, for deciding when to attach to the device. (Default: DeviceConnectionType::Always).
- Returns
A device, which can be used with a session. If no device is acquired, a nullptr is returned.
-
std::shared_ptr<DeviceInfo> acquireDeviceById(int id, SyncPattern pattern = SyncPattern::Full, DeviceConnectionType connectionType = DeviceConnectionType::Always)
Allocates the hardware device by ID.
This ID can be found by running gc-info -l. This method will attach to the device if connectionType is equal to DeviceConnectionType::Always.
- Parameters
id – The ID of the IPU to be used.
pattern – The setting for when to synchronise in a multi-IPU system. (Default: SyncPattern::Full).
connectionType – The connection type, for deciding when to attach to the device. (Default: DeviceConnectionType::Always).
- Returns
A device, which can be used with a session.
-
std::shared_ptr<DeviceInfo> createHostDevice(DeviceType type, const std::map<std::string, std::string> &options)
Create a simulated device on the host for testing purposes.
- Parameters
type – The type of device to simulate.
options – The configuration settings for the host device.
- Returns
The requested device for testing purposes.
-
std::shared_ptr<DeviceInfo> createCpuDevice()
Create a simulated CPU device for testing purposes.
- Returns
A simulated CPU device.
-
std::shared_ptr<DeviceInfo> createIpuModelDevice(const std::map<std::string, std::string> &options)
Create a simulated IpuModel device for testing purposes.
The following options are supported:
numIPUs: The number of IPUs to simulate (Default: 1).
tilesPerIPU: The number of tiles per IPU (Default: defaultFewTiles).
compileIPUCode: Indicate whether or not to compile real IPU code for modelling.
- Parameters
options – Configuration settings for the IPU Model.
- Returns
A device.
-
std::shared_ptr<DeviceInfo> createSimDevice(const std::map<std::string, std::string> &options)
-
std::shared_ptr<DeviceInfo> createOfflineIPUDevice(const std::map<std::string, std::string> &options)
Create a simulated OfflineIpu device for testing purposes.
This resembles an IPU and is used for offline compilation.
The following options are supported:
numIPUs: The number of IPUs to compile for.
tilesPerIPU: The number of tiles per IPU (Default: defaultManyTiles).
ipuVersion: The IPU architecture (Default: "ipu2").
syncPattern: The setting for synchronisation in a multi-IPU system.
- Parameters
options – Configuration settings for the OfflineIpu device.
- Returns
A simulated OfflineIpu device.
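A sketch of creating an offline-compilation target for a 4-IPU system, using the options listed above:
C++:
std::map<std::string, std::string> opts = {{"numIPUs", "4"},
                                           {"ipuVersion", "ipu2"}};
auto offline =
    popart::DeviceManager::createDeviceManager().createOfflineIPUDevice(opts);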
-
std::shared_ptr<DeviceInfo> createOfflineIpuFromDeviceInfo(const DeviceInfo &deviceInfo)
Create a simulated OfflineIpu device from the description of another device.
- Parameters
deviceInfo – The device to create an OfflineIpu version of.
- Returns
An OfflineIpu device.
-
std::shared_ptr<DeviceInfo> createOfflineIpuFromSystemString(const std::string &system, uint32_t numIpus)
Create a simulated OfflineIpu device from the name of a system.
- Parameters
system – The name of the system to create an OfflineIpu version of.
numIpus – The number of IPUs. Providing 0 corresponds to all IPUs in the system.
- Returns
An OfflineIpu device.
-
void setOnDemandAttachTimeout(const unsigned seconds)
If unable to attach to a device on the first try, the attach timeout set here is the length of time (in seconds) that the DeviceManager will keep trying to attach.
Note: This only takes effect when trying to attach with DeviceConnectionType::OnDemand.
- Parameters
seconds – The attach timeout in seconds.
Public Static Functions
-
static DeviceManager &createDeviceManager()
Accessor for the device manager.
- Returns
A reference to the DeviceManager instance.
-
class DeviceProvider
The interface for device providers which are registered with the device manager.
Subclassed by popart::popx::DevicexManager
Public Functions
-
inline virtual ~DeviceProvider()
Destructor for DeviceProvider.
-
virtual std::shared_ptr<DeviceInfo> getDevice(SyncPattern syncPattern, unsigned deviceManagerId, DeviceConnectionType connectionType) = 0
Get a device that satisfies the specified criteria.
Throws an error if the connection type is DeviceConnectionType::Never.
- Parameters
syncPattern – The setting for synchronisation on multi-IPU systems.
deviceManagerId – The ID of the requested device.
connectionType – The setting for when to connect to the device.
- Returns
A device that satisfies the specified criteria.
Get the list of all devices that satisfy the specified criteria.
- Parameters
devices – The list of devices.
requiredNumIPUs – The number of IPUs required.
syncPattern – The setting for when to synchronise in a multi-IPU system.
type – The type of the device to use for compilation and execution.
connectionType – The setting for when to connect to the device.
requiredTilesPerIPU – The number of tiles per IPU required.
-
virtual std::shared_ptr<DeviceInfo> createHostDevice(DeviceType type, const std::map<std::string, std::string> &options, SyncPattern syncPattern = SyncPattern::Full) = 0
Create a host device for testing.
- Parameters
type – The type of the device to use for compilation and execution.
options – The configuration for the created device. See createCpuDevice(), createIpuModelDevice(), createOfflineIPUDevice() and createSimDevice() for more information about
options
.syncPattern – The setting for when to synchronise in a multi-IPU system.
- Returns
The device for use in testing.
-
virtual std::shared_ptr<DeviceInfo> createOfflineIpuFromDeviceInfo(const DeviceInfo &deviceInfo) = 0
-
virtual std::shared_ptr<DeviceInfo> createOfflineIpuFromSystemString(const std::string &system, uint32_t numIpus) = 0
-
class DevicexManager : public popart::DeviceProvider
Public Functions
-
DevicexManager()
-
std::shared_ptr<DeviceInfo> getDevice(SyncPattern syncPattern, uint32_t deviceManagerId, DeviceConnectionType connectionType) override
-
std::shared_ptr<popart::DeviceInfo> createHostDevice(popart::DeviceType type, const std::map<std::string, std::string> &options, SyncPattern syncPattern = SyncPattern::Full) override
-
std::shared_ptr<DeviceInfo> createOfflineIpuFromDeviceInfo(const DeviceInfo &deviceInfo) override
-
std::shared_ptr<DeviceInfo> createOfflineIpuFromSystemString(const std::string &system, uint32_t numIpus) override
#include <popart/popx/devicex.hpp>
-
class Devicex
Public Functions
-
const IrLowering &lowering() const
-
IrLowering &lowering()
-
~Devicex()
-
void prepare()
-
void weightsFromHost()
-
void buffersFromHost()
-
void remoteBufferWeightsFromHost(const bool isUpdate = false)
-
void optimizerFromHost()
-
void setRandomSeedFromHost()
-
uint64_t getRandomSeedToHost()
-
void setRngStateFromHost()
-
std::vector<uint32_t> getRngStateToHost()
-
void setRngStateValue(const std::vector<uint32_t>)
-
std::map<std::string, std::vector<uint64_t>> cycleCountTensorToHost()
-
void weightsToHost()
-
void remoteBufferWeightsToHost()
-
void weightsToHost(const std::map<TensorId, MutableVoidData>&)
-
void popxlWeightsToTensorData()
Copy data from the device, to the host buffers, to the tensor.tensorData() buffers. Will not run a WeightsToHost program if the weights are already in sync with the IPU. After WeightsToHost, marks the weights as in sync with the IPU.
-
void popxlMarkHostWeightsOutOfSync()
Mark the d2hWeightBuffers as out of sync with the IPU.
-
void popxlMarkHostWeightsInSync()
Mark the d2hWeightBuffers as in sync with the IPU.
-
bool popxlAreHostWeightsInSync()
Check whether all the weights are in sync with the IPU.
-
void readWeights(const IWeightsIO &dst)
-
void writeWeights(const IWeightsIO &src)
-
std::string getSummaryReport(bool resetProfile = true) const
-
std::string getSerializedGraph() const
-
bool isEngineLoaded() const
-
void setEngineIsLoaded(bool isLoaded)
-
void connectRandomSeedStream()
-
void connectRngStateStream()
-
void connectStreamToCallback(const std::string &streamHandle, std::function<void(void*)> callback, unsigned index)
-
void connectStream(const std::string &streamHandle, void *host_buffer)
-
void connectHostFunction(const std::string &functionHandle, std::function<void(const void*const*, size_t, void*const*, size_t)> callback, unsigned index)
-
void copyFromRemoteBuffer(const PopStreamId buffer, void *w, int repeat_index, unsigned replication_index = 0)
-
void copyToRemoteBuffer(void *w, const PopStreamId buffer, int repeat_index, unsigned replication_index = 0)
-
unsigned getReplicationFactor() const
-
unsigned getAccumulationFactor() const
-
unsigned getGlobalReplicaOffset() const
-
unsigned getGlobalReplicationFactor() const
-
bool isReplicatedGraph() const
-
inline const DeviceInfo *getDeviceInfo() const
-
inline DeviceInfo *getDeviceInfo()
-
inline bool prepareHasBeenCalled() const
-
void loadEngineAndConnectStreams()
-
void serializeExecutable(std::ostream &out, bool serializePopartMetadata, bool serializeTensorData)
-
void serializeExecutable(const std::string &path, bool serializePopartMetadata, bool serializeTensorData)
-
void serializeTensorData(const std::string &path)
Public Members
-
poplin::PlanningCache convCache
-
poplin::matmul::PlanningCache matmulCache
-
bool prePlanConvolutions = true
-
bool prePlanMatMuls = true
Friends
- friend class serialization::WriterImpl
-
class Executablex
Public Functions
-
Executablex(IrLowering &ir_lowering_)
-
Executablex(IrLowering &ir_lowering_, std::unordered_map<TensorId, std::unique_ptr<Tensor>> &&tensorMap, std::map<TensorId, CollectiveBalancedReorderId> &&cbrIdMap, std::map<CollectiveBalancedReorderId, gcl::CollectiveBalancedHostRearrangement> &&cbrMap)
-
IrLowering &lowering()
-
const IrLowering &lowering() const
-
inline bool isDeserialized() const
-
bool shouldSerialize()
-
std::vector<TensorId> getTensorIds(TensorType)
-
void setRandomSeedValue(uint64_t value)
-
void resetWeights(const ONNX_NAMESPACE::ModelProto &modelProto, const bool ignoreWeightsInModelWithoutCorrespondingIrWeight = false)
-
inline const SessionOptions &getSessionOptions() const
-
const gcl::CollectiveBalancedHostRearrangement &getCollectiveBalancedHostRearrangement(const TensorId &id) const
-
const std::map<CollectiveBalancedReorderId, gcl::CollectiveBalancedHostRearrangement> getCollectiveBalancedHostRearrangements() const
-
const std::map<TensorId, CollectiveBalancedReorderId> getCollectiveBalancedHostRearrangementIds() const
-
std::string getCachePath(const std::string &cacheDir) const
-
void updateOptimizerTensors()
Public Static Functions
-
static std::unique_ptr<Executablex> createFromLoweredIr(IrLowering &ir_lowering_)
-
static std::unique_ptr<Executablex> createFromStream(IrLowering &ir_lowering_, std::unordered_map<TensorId, std::unique_ptr<Tensor>> &&tensorMap, std::map<TensorId, CollectiveBalancedReorderId> &&cbrIdMap, std::map<CollectiveBalancedReorderId, gcl::CollectiveBalancedHostRearrangement> &&cbrMap)
#include <popart/popx/irlowering.hpp>
-
class IrLowering
Public Types
-
using FunctionBuffers = std::vector<std::pair<const poplar::Function, poplar::FunctionBuffer>>
Public Functions
-
virtual ~IrLowering()
-
bool tryInitTensorByPostIRAliasing(TensorId dstId, RequireParallelWritable requireParallelWritable, const ViewChangers &viewChangers)
-
inline const std::vector<std::string> &getCycleCountIds() const
-
inline void setCycleCountIds(const std::vector<std::string> &ids)
-
inline const PopTensors &tensors() const
-
inline PopTensors &tensors()
-
inline const PopPrograms &progs() const
-
inline PopPrograms &progs()
-
void instrumentWithHardwareCycleCounter(poplar::program::Sequence&, int64_t tileId = 0, std::string id = "")
-
void prepareGraph()
-
poplar::Executable getExecutable(const ProfileCacher &ProfileCacher)
-
std::string getPoplarGraphDebugName()
-
std::string getSerializedGraph() const
-
PriTaskDependency taskWhichCreates(TensorId) const
-
unsigned getReplicationFactor() const
-
unsigned getAccumulationFactor() const
-
unsigned getGlobalReplicaOffset() const
-
unsigned getGlobalReplicationFactor() const
-
bool isReplicatedGraph() const
-
bool containsFragment(const Graph &graph, SubgraphPartIndex subgraphPart) const
-
void createFragment(const Graph &graph, SubgraphPartIndex subgraphPart)
-
poplar::Function &getFragmentFunction(const Graph &graph, SubgraphPartIndex subgraphPart)
-
void addFunctionBuffers(const GraphId gid, poplar::FunctionBufferMappingType fbmt)
Add a vector of pairs {f, buffer} for a given graph id, FunctionBufferMappingType pair.
This is enough for an [Internal|External]CodeCopy op to move code from the buffer into the function. Note that the subgraph partitioner may have split this into multiple functions, so we require a vector of these for each graph.
- Parameters
gid – The graph id to add the functions and buffers for.
fbmt – The FunctionBufferMappingType to add the vector for.
-
inline FunctionBuffers getFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)
Get the Function Buffers for the given GraphId and FunctionBufferMappingType.
Wrapper around the corresponding PopPrograms function.
- Parameters
gid – The GraphId to lookup.
fbmt – The FunctionBufferMappingType to lookup.
- Returns
FunctionBuffers the vector of functions and buffers.
-
inline bool hasFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)
Returns true if a functionBuffer vector exists for the given graphId / FunctionBufferMappingType.
Wrapper around the corresponding PopPrograms function.
- Parameters
gid – The graph id to lookup.
fbmt – The FunctionBufferMappingType to lookup.
- Returns
true if pairs exist, false otherwise.
-
std::vector<ICreatorCandidatePtr> getCreatorEndpoints(const Tensor *tensor, bool excludeEndpointsFromPath = true, bool includeDeadends = false) const
-
std::vector<ICreatorCandidatePtr> getTensorCreators(const Tensor *tensor, bool dependencyFree) const
-
poplar::Tensor getConst(poplar::Graph &graph, const poplar::Type &type, const std::vector<size_t> &shape, double val, const poplar::DebugContext &dc = {})
-
inline const ReplicatedTensorShardingBundle &getReplicatedTensorShardingBundle() const
-
inline ReplicatedTensorShardingBundle &getReplicatedTensorShardingBundle()
-
poplar::Tensor getScalarVariable(poplar::Graph &graph, const poplar::Type &type, const poplar::DebugContext &dc = {})
-
inline LinearMapper &getLinearMapper()
-
inline InitTensorOffsetMap &getInitTensorOffsetMap()
-
inline const liveness::LivenessAnalyzer *getLivenessAnalyzer() const
-
inline const liveness::SubgraphPartitioner *getSubgraphPartitioner() const
-
inline liveness::AliasZeroCopy *getAliasZeroCopy() const
-
inline const DeviceInfo *getDeviceInfo() const
-
std::string getContextOpString(ExecutionContext context, const std::vector<TaskId> &taskOrder) const
-
inline bool prepareGraphHasBeenCalled() const
-
inline bool getOuterLoopFragEmpty() const
-
inline bool usingCachedExecutable() const
-
poplar::DataStream &insertGradientStoreStream(TensorId, TensorInfo, poplar::Graph&)
-
poplar::DataStream &insertGradientLoadStream(TensorId, TensorInfo, poplar::Graph&)
-
poplar::DataStream &insertWeightLoadStream(TensorId, TensorInfo, poplar::Graph&)
-
inline ExchangeBundle &getExchangeBundle()
Get the exchange bundle containing stream and remote buffer data structures.
- Returns
Exchange bundle
-
inline const ExchangeBundle &getExchangeBundle() const
Get the exchange bundle containing stream and remote buffer data structures.
- Returns
Exchange bundle
-
inline const std::map<TensorId, poplar::DataStream> &getFromHostStreams() const
-
inline const std::map<TensorId, poplar::DataStream> &getToHostAnchorStreams() const
-
inline const std::map<TensorId, poplar::DataStream> &getToHostWeightStreams() const
-
inline void setProgramHandleIndexMap(const std::map<std::string, unsigned> &programHandleIndexMap_)
-
inline const std::map<std::string, unsigned> &getProgramHandleIndexMap() const
Public Members
-
poplar::OptionFlags pooling_options
-
poplar::OptionFlags lstmOptions
-
poplar::OptionFlags matmulOptions
-
poplar::OptionFlags gclOptions
-
poplar::OptionFlags engineOptions
-
poplar::OptionFlags reportOptions
Public Static Functions
-
static std::string cycleCountStreamId(std::string id)
-
static void removeNonDependencyFreeCreators(std::vector<ICreatorCandidatePtr> &candidates)
-
static PopStreamId h2dId(TensorId)
-
static PopStreamId d2hId(TensorId, bool isAnchorStream)
-
static PopStreamId gradientStoreStreamId(TensorId id)
-
static PopStreamId gradientLoadStreamId(TensorId id)
-
static PopStreamId weightLoadStreamId(TensorId id)
#include <popart/popx/poptensors.hpp>
-
class PopTensors
Public Functions
-
const ViewChangers &getViewChangers(TensorId)
-
void setViewChangers(TensorId, const ViewChangers &viewChangers)
#include <popart/popx/popprograms.hpp>
-
class PopPrograms
Class for managing the complete set of programs that a Devicex can run.
A program in this context is an instance of the poplar::Program class, which represents a control program that executes operations on the graph. The state std::vector<poplar::program::Sequence> seqs contains all these programs, and is populated during IrLowering. The programs are passed to poplar::compileGraph to construct the executable (see IrLowering::getExecutable()).
Public Types
-
enum ProgramIndex
Values:
-
enumerator WeightsFromHost = 0
-
enumerator OptimizerFromHost
-
enumerator RandomSeedFromHost
-
enumerator RandomSeedToHost
-
enumerator RngStateFromHost
-
enumerator Program
-
enumerator RngStateToHost
-
enumerator WeightsToHost
-
enumerator CycleCountTensorToHost
-
enumerator CustomProgramsStart
-
enumerator N
-
enum class ProgramFragmentIndex
Values:
-
enumerator StreamWeightsFromHost = 0
-
enumerator StreamOptimizerFromHost
-
enumerator RandomSeedFromHost
-
enumerator RandomSeedToHost
-
enumerator RngStateFromHost
-
enumerator Init
-
enumerator PreForward
-
enumerator Forward
-
enumerator Backward
-
enumerator VarUpdateFromAccumulator
-
enumerator RngStateToHost
-
enumerator WeightsToHost
-
enumerator ToHostFinalCopy
-
enumerator CycleCountTensorToHost
-
enumerator N
-
enum class PipelineFragmentId
Values:
-
enumerator ToDeviceStream = 0
-
enumerator Main
-
enumerator ToHostStream
-
using FunctionBuffers = std::vector<std::pair<const poplar::Function, poplar::FunctionBuffer>>
Public Functions
-
PopPrograms(IrLowering *ir_lowering_p_)
-
poplar::program::Sequence &programFragment(PopPrograms::ProgramFragmentIndex)
-
bool containsFragment(const Graph &graph, SubgraphPartIndex subgraphPart) const
-
void createFragment(const Graph &graph, SubgraphPartIndex subgraphPart)
-
std::vector<poplar::Function> &getFragmentFunctions(const Graph &graph, poplar::Graph &poplarGraph)
-
poplar::Function &getFragmentFunction(const Graph &graph, SubgraphPartIndex subgraphPart, poplar::Graph &poplarGraph)
-
bool hasBeenRecomputed(OpId, ExecutionPhase) const
-
void recordRecomputed(OpId, ExecutionPhase)
-
std::string getStrFromPipelineFragmentId(PipelineFragmentId) const
-
poplar::program::Sequence &pipelineFragment(PipelineStage, PipelineFragmentId, const std::string &desc)
-
poplar::program::Sequence &pipelineToDeviceStreamFragment(PipelineStage pipelineStage, const std::string &desc)
-
poplar::program::Sequence &pipelineMainFragment(PipelineStage, const std::string &desc)
-
poplar::program::Sequence &pipelineToHostStreamFragment(PipelineStage, const std::string &desc)
-
void addPipelineCycle(PipelineInfo pInfo, PipelineCycle pCycle, poplar::program::Sequence &sq, std::ostringstream &ss) const
-
void addFunctionBuffers(const GraphId gid, poplar::FunctionBufferMappingType fbmt)
Add a vector of pairs {f, buffer} for a given graph id.
This is enough for an [Internal|External]CodeCopy op to move code from the buffer into the function. Note that the subgraph partitioner may have split this into multiple functions, so we require a vector of these for each graph.
- Parameters
gid – The graph id to add the functions and buffers for.
fbmt – The FunctionBufferMappingType to add the vector for.
-
inline FunctionBuffers getFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)
Get the Function Buffers for the given GraphId and FunctionBufferMappingType.
- Parameters
gid – The GraphId to lookup.
fbmt – The FunctionBufferMappingType to lookup.
- Returns
FunctionBuffers the vector of functions and buffers.
-
inline bool hasFunctionBuffer(const GraphId gid, poplar::FunctionBufferMappingType fbmt)
Returns true if a functionBuffer vector exists for the given graphId and FunctionBufferMappingType.
- Parameters
gid – The graph id to lookup.
fbmt – The FunctionBufferMappingType to lookup.
- Returns
true if pairs exist, false otherwise.
-
unsigned addCustomProgram(const poplar::program::Program &program)
Add a custom program.
- Parameters
program – The program to add.
- Returns
The index of the popart/poplar program.
-
void createPipelineFunctions()
Turn pipeline sequences into callable pipeline functions.
Public Members
-
IrLowering *ir_lowering_p
Public Static Attributes
-
static const std::unordered_map<popef::ProgramFlow::ProgramIndexType, std::string> commonPrograms
#include <popart/popx/inittensor.hpp>
-
class ICreatorCandidate
Subclassed by popart::popx::InputCreatorCandidate, popart::popx::InputMultiCreatorCandidate
Public Functions
-
ICreatorCandidate()
-
virtual ~ICreatorCandidate() = default
-
virtual std::pair<poplar::Tensor, ViewChangers> createInput(const poplar::DebugNameAndId &dnai) = 0
-
virtual DnfTensorIds mustExistBeforeCreate() = 0
-
virtual double getMaxCreatorPriority() const = 0
-
virtual int64_t getNumElems() const = 0
-
virtual std::vector<std::vector<OpxInAndOutIndex>> getPathsFromInput() = 0
-
virtual std::string str() = 0
-
virtual int64_t getScheduleIndex() const = 0
Public Static Functions
-
static bool greaterThan(ICreatorCandidatePtr, ICreatorCandidatePtr)
#include <popart/popx/replicatedtensorshardingbundle.hpp>
-
class ReplicatedTensorShardingBundle
Helper class to bundle all replicated tensor sharding related lowering information together.
Public Functions
-
ReplicatedTensorShardingBundle(const Ir &ir)
Construct an empty replicated tensor sharding bundle. This creates the ReplicatedTensorShardingTracer with the IR object.
- Parameters
ir – The IR to create the ReplicatedTensorShardingTracer with.
-
bool hasCollectiveBalancedReorder(const TensorId &tensorId) const
Check whether a tensor has an associated CollectiveBalancedReorder.
- Parameters
tensorId – The TensorId to check.
- Returns
True if the tensor has an associated CollectiveBalancedReorder.
-
std::shared_ptr<gcl::CollectiveBalancedReorder> getCollectiveBalancedReorder(const TensorId &tensorId) const
Get the associated CollectiveBalancedReorder of a tensor.
Throws an error if the tensor does not have one.
- Parameters
tensorId – The TensorId to return the CollectiveBalancedReorder for.
- Returns
Shared pointer to the associated CollectiveBalancedReorder.
-
const gcl::CollectiveBalancedHostRearrangement &getCollectiveBalancedHostRearrangement(const TensorId &tensorId) const
Get the host rearrangement method of a tensor.
This can be applied to the host-side tensor data to rearrange the data before upload to, or after download from, the IPU.
- Parameters
tensorId – The TensorId to return the CBR host rearrangement for.
- Returns
The CBR host rearrangement method.
-
void setCollectiveBalancedReorder(const TensorId &tensorId, CollectiveBalancedReorderId cbrId)
Associate an existing CollectiveBalancedReorder with a tensor.
- Parameters
tensorId – The TensorId to associate the CollectiveBalancedReorder with.
cbrId – Identifier of an existing, registered CollectiveBalancedReorder obtained by registerCollectiveBalancedReorder.
Register a new collective balanced reorder method.
- Parameters
cbr – The GCL CollectiveBalancedReorder to register.
- Returns
The registered ID for the CollectiveBalancedReorder.
-
inline const std::map<CollectiveBalancedReorderId, std::shared_ptr<gcl::CollectiveBalancedReorder>> &getCollectiveReorders() const
- Returns
The map of all registered CollectiveBalancedReorder objects, keyed by CollectiveBalancedReorderId.
-
inline const ReplicatedTensorShardingTracer &getReplicatedTensorShardingTracer() const
- Returns
Tracer to resolve replicated tensor sharding groups
-
inline ReplicatedTensorShardingTracer &getReplicatedTensorShardingTracer()
- Returns
Tracer to resolve replicated tensor sharding groups
-
inline const std::map<TensorId, CollectiveBalancedReorderId> &getCollectiveReorderIds() const
Get the mapping to resolve which CollectiveBalancedReorder has to be applied to a tensor to restore the original data order.
- Returns
Mapping of all tensors and their associated CollectiveBalancedReorderId.
#include <popart/popx/linearmapper.hpp>
-
class LinearMapper
14.8. Ops
14.8.1. Op definition for PopART IR
#include <popart/op.hpp>
-
class Op : public popart::Vertex
Parent class for the concrete Op implementations. The poplar implementation which the op represents can be found in the corresponding popx::Opx class, and will be lowered to poplar.
Subclassed by popart::AbortOp, popart::AbsGradOp, popart::AdaDeltaUpdaterOp, popart::AdamUpdaterOp, popart::AddBiasOp, popart::AllReduceOp, popart::ArgExtremaOp, popart::AveragePoolGradOp, popart::BaseOnnxRNNGradOp, popart::BaseOnnxRNNOp, popart::BasePadOp, popart::BaseSliceOp, popart::BaseSortOp, popart::BatchNormGradOp, popart::BatchNormOp, popart::BinaryComparisonOp, popart::BoundaryOp, popart::BucketizeOp, popart::CastOp, popart::CastThenPow2ScaleOp, popart::CollectivesBaseOp, popart::ConcatGradOp, popart::ConcatOp, popart::ConvFlipWeightsOp, popart::ConvTransposeOp, popart::CoshOp, popart::CtcBeamSearchDecoderOp, popart::CtcGradOp, popart::CumSumGradOp, popart::CumSumOp, popart::DynamicBaseOp, popart::ElementWiseBinaryBaseOp, popart::ElementWiseBinaryGradOp, popart::ElementWiseNonLinearUnaryGradOp, popart::ElementWiseUnaryBooleanOp, popart::ElementWiseUnaryOp, popart::ExchangeBaseOp, popart::ExpandGradOp, popart::ExpandOp, popart::ExpGradOp, popart::Expm1GradOp, popart::GatherGradOp, popart::GatherOp, popart::GetRandomSeedOp, popart::GlobalAveragePoolGradOp, popart::GlobalAveragePoolOp, popart::GlobalMaxPoolGradOp, popart::GlobalMaxPoolOp, popart::GroupNormGradOp, popart::GroupNormOp, popart::HasReceptiveFieldOp, popart::HistogramOp, popart::IdentityLossGradOp, popart::IfOp, popart::InitOp, popart::InstanceNormGradOp, popart::InstanceNormOp, popart::InternalCodeCopyOp, popart::IoTileCopyOp, popart::IpuCopyOp, popart::L1GradOp, popart::LambSquareOp, popart::LeakyReluGradOp, popart::LogSoftmaxGradOp, popart::LossOp, popart::LossScaleUpdateOp, popart::LRNGradOp, popart::LRNOp, popart::MatMulBaseOp, popart::MaxPoolGradOp, popart::ModifyRandomSeedOp, popart::MultiConvBaseOp, popart::MultiConvDataGradBaseOp, popart::MultiConvWeightsGradBaseOp, popart::NllGradOp, popart::NlllWithSoftmaxGradDirectOp, popart::NormalizeImageOp, popart::OnehotGradOp, popart::OnehotOp, popart::PackedDataBlockOp, popart::ParameterizedOp< TDerivedOp, TOpParams >, popart::PlaceholderOp, popart::PopartLSTMGradOp, popart::PopartLSTMOp, popart::Pow2ScaleThenCastOp, popart::ReduceGradOp, popart::ReduceOp, popart::ReluGradOp, popart::ReshapeBaseOp, popart::ResizeOp, popart::RestoreOp, popart::ReverseBaseOp, popart::RMSPropUpdaterOp, popart::RoiAlignGradOp, popart::RoiAlignOp, popart::ScaledAddOp, popart::ScatterDataGradOp, popart::ScatterReduceGradOp, popart::ScatterReduceOp, popart::ScatterUpdateGradOp, popart::SequenceSliceOp, popart::SGD1NesterovOp, popart::ShapeOrLikeOp, popart::SigmoidGradOp, popart::SoftmaxGradDirectOp, popart::SoftmaxGradOp, popart::SplineBasisOp, popart::SplineWeightingOp, popart::SplitGradOp, popart::SplitOp, popart::SqrtGradOp, popart::StashOp, popart::SubgraphOp, popart::SubsampleBaseOp, popart::SubsampleGradOp, popart::SyncOp, popart::TanhGradOp, popart::TensorRemapOp, popart::TileOp, popart::TopKGradOp, popart::TransposeBaseOp, popart::UpsampleOp, popart::VariadicGradOp, popart::VariadicOp, popart::VarUpdateOp, popart::WhereOp, popart::WhereXGradOp, popart::WhereYGradOp
Public Functions
-
inline const Settings &getSettings() const
Get the settings associated with the op.
- Returns
The op settings.
-
virtual Settings getInSettings(InIndex) const
Return suitable settings for an op inserted before the input to an existing op.
- Parameters
InIndex – The input index before which the op is inserted.
- Returns
The settings for the op inserted before the input index.
-
virtual Settings getOutSettings(OutIndex) const
Return suitable settings for an op inserted after the output to an existing op.
- Parameters
OutIndex – The output index after which the op is inserted.
- Returns
The settings for the op inserted after the output index.
-
Settings adjustInSettings(InIndex, Op::Settings) const
Adjust the settings to be suitable as input at the input index.
- Parameters
InIndex – The input index where the settings are to be applied.
Settings – The settings to be adjusted.
- Returns
Adjusted settings suitable for input at the input index.
-
Settings adjustOutSettings(OutIndex, Op::Settings) const
Adjust the settings to be suitable as output at an output index.
- Parameters
OutIndex – The output index where the settings are to be applied.
Settings – The settings to be adjusted.
- Returns
Adjusted settings suitable for output at the output index.
-
const OptionalVGraphId getOptionalVGraphId() const
Get the ID of the optional virtual graph.
- Returns
The ID of the optional virtual graph.
-
VGraphId getVirtualGraphId() const
Get the ID of the virtual graph.
- Returns
The ID of the virtual graph.
-
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex) const
Get virtual graph ID and tile set associated with an input index.
- Parameters
InIndex – The input index.
- Returns
The virtual graph ID and tile set at the input index.
-
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex) const
Get virtual graph ID and tile set associated with an output index.
- Parameters
OutIndex – The output index.
- Returns
The virtual graph ID and tile set at the output index.
-
virtual VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const
Get virtual graph ID and tile set associated with an input index.
- Parameters
InIndex – The input index.
visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.
- Returns
The virtual graph ID and tile set at the input index.
-
virtual VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const
Get virtual graph ID and tile set associated with an output index.
- Parameters
OutIndex – The output index.
visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.
- Returns
The virtual graph ID and tile set at the output index.
-
void setVirtualGraphId(const OptionalVGraphId)
Set a virtual graph ID for the op.
- Parameters
OptionalVGraphId – The ID of the virtual graph to set on this op.
-
bool hasVirtualGraphId() const
Check if the op has a virtual graph ID set.
- Returns
true
if the op has a virtual graph ID set,false
otherwise.
-
void setPipelineStage(OptionalPipelineStage)
Set a pipeline stage for the op.
- Parameters
OptionalPipelineStage – The pipeline stage to be set for the op.
-
bool hasPipelineStage() const
Check if the op has a pipeline stage set.
- Returns
true
if the op has a pipeline stage set,false
otherwise.
-
PipelineStage getPipelineStage() const
Get the pipeline stage that has been set for the op.
- Returns
The pipeline stage that has been set for the op.
-
OptionalPipelineStage getOptionalPipelineStage() const
Get the optional pipeline stage.
- Returns
The optional pipeline stage that has been set for the op.
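As a brief illustration, the following hedged sketch places an op on a virtual graph and pipeline stage (op is assumed to be a valid Op*, and the session is assumed to use virtual graphs and pipelining):
C++:
// OptionalVGraphId and OptionalPipelineStage construct implicitly from
// plain integer IDs, so the setters can be called with literals.
op->setVirtualGraphId(0);
op->setPipelineStage(1);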
-
const OptionalExecutionPhase getOptionalExecutionPhase() const
Get the optional execution phase.
- Returns
The optional execution phase that has been set for the op.
-
virtual ExecutionPhase getExecutionPhase() const
Get the execution phase that has been set for the op.
- Returns
The execution phase that has been set for the op.
-
void setExecutionPhase(const OptionalExecutionPhase)
Set the execution phase for the op.
- Parameters
OptionalExecutionPhase – The execution phase to be set for the op.
-
bool hasExecutionPhase() const
Check if the op has an execution phase set.
- Returns
true
if the op has an execution phase set,false
otherwise.
-
const OptionalBatchSerializedPhase getOptionalBatchSerializedPhase() const
Get the optional batch serialized phase.
- Returns
The optional batch serialized phase that has been set for the op.
-
virtual BatchSerializedPhase getBatchSerializedPhase() const
Get the batch serialized phase.
- Returns
The batch serialized phase that has been set for the op.
-
void setBatchSerializedPhase(const OptionalBatchSerializedPhase)
Set the batch serialized phase.
- Parameters
OptionalBatchSerializedPhase – The batch serialized phase to be set for the op.
-
bool hasBatchSerializedPhase() const
Check if the op has a batch serialization phase set.
- Returns
true
if the op has a batch serialization phase set, otherwisefalse
.
-
const OptionalStochasticRoundingMethod getOptionalStochasticRoundingMethod() const
Get the optional stochastic rounding method.
- Returns
The optional stochastic rounding method that has been set for the op.
-
virtual StochasticRoundingMethod getStochasticRoundingMethod() const
Get the stochastic rounding method.
- Returns
The stochastic rounding method that has been set for the op.
-
void setStochasticRoundingMethod(const OptionalStochasticRoundingMethod)
Set the optional stochastic rounding method.
- Parameters
OptionalStochasticRoundingMethod – The optional stochastic rounding method to be set for the op.
-
bool hasStochasticRoundingMethod() const
Check if the op has a stochastic rounding method set.
- Returns
true
if the op has a stochastic rounding method set, otherwisefalse
.
-
bool isExcludedFromPattern(const Pattern*) const
Check if the op is excluded from a pattern.
- Returns
true
if the op is excluded from a pattern,false
otherwise.
-
inline virtual int getInBatchAxis(InIndex) const
Get the batch axis for the input index.
- Returns
The batch axis for the input index.
-
inline virtual int getOutBatchAxis(OutIndex) const
Get the batch axis for the output index.
- Returns
The batch axis for the output index.
-
void inheritPlacementAttributes(bool inheritSerializations, AliasModel &aliasModel)
Helper function to set an op’s placement attributes by inheriting them from other ops in the graph.
The attributes that are set include:
Execution context.
Pipeline stage.
Execution phase.
Virtual graph ID.
Batch serial phase (optional).
- Parameters
inheritSerializations – The indicator to enable or disable the batch serialization phase.
true
enables the batch serialization phase andfalse
disables it.aliasModel – An AliasModel object containing alias info for this op’s graph.
-
inline Graph &getGraph()
Get the graph associated with the op.
- Returns
The graph associated with the op.
-
inline const Graph &getGraph() const
Get the graph associated with the op.
- Returns
The graph associated with the op.
-
inline const Scope &getScope() const
Get the scope associated with the op.
- Returns
The scope associated with the op.
-
inline void setScope(const Scope &scope)
Set the scope associated with the op.
- Parameters
scope – The scope to be set for the op.
-
inline const std::string &getName() const
Get the name of the op.
- Returns
The name of the op.
-
inline void setName(const std::string &name)
Set the name of the op.
- Parameters
name – The name to be set for the op.
-
inline const OpDebugInfo &getDebugInfo() const
Get the debug info of the op.
- Returns
The debug info for the op.
-
virtual bool isNorm() const
Checks if the op is a norm op.
- Returns
true
if the op is a norm op,false
otherwise.
-
bool isElementWiseUnary() const
Checks if the op is an element-wise unary op.
- Returns
true
if the op is an element-wise unary op,false
otherwise.
-
virtual bool canBeReplacedByIdentity() const
Check if the op can be replaced by the identity op.
- Returns
true
if the op can be replaced by the identity op,false
otherwise.
-
Op(const OperatorIdentifier &_opid, const Op::Settings &settings)
Constructor of the
Op
class.- Parameters
_opid – The operator identifier specifying domain:type:version, minimum and maximum number of input tensors and number of output tensors.
settings – The general op settings such as graph, name and scope.
-
virtual ~Op()
Destructor.
-
std::string str() const final
Return the op ID.
-
std::string debugName() const
Return the op name that is used for debug and profiling.
-
void createAndConnectOutTensor(OutIndex, TensorId)
Create an ActGrad (output) tensor and connect it to this op’s output.
- Parameters
OutIndex – The output index that the output tensor should be connected to.
TensorId – The tensor ID of the tensor to be converted to an output tensor.
-
void append(std::stringstream &ss) const
Append this op to a stream.
- Parameters
ss – The stream to append the op to.
-
void toJSON(std::stringstream &ss) const
Convert this op to JSON format and append it to a stream.
- Parameters
ss – The stream to append the JSON-serialised op to.
-
int64_t memOfOutputs() const
Return the total memory used by all output tensors.
-
inline virtual std::set<InIndex> optionalInputs() const
Return the input indices of all optional inputs to the op.
-
void defaultConnectInTensor(InIndex, TensorId)
Connect a tensor to an input index.
This method updates the input and updates consumers of the tensor with the tensor ID.
- Parameters
InIndex – The input index to connect the tensor to.
TensorId – The tensor ID of the tensor to connect.
-
virtual void connectInTensor(InIndex index, TensorId tensorId)
Connect existing tensor to input index.
- Parameters
index – The input index at which to connect the tensor.
tensorId – The ID of the existing tensor.
-
virtual void connectInTensor(InIndex inIndex, TensorId tensorId, VGraphId vgid)
Connect an existing tensor to an index with the source virtual graph.
- Parameters
inIndex – The input index at which to connect the tensor.
tensorId – The ID of the existing tensor.
vgid – The virtual graph on which the existing tensor resides.
-
void connectInTensorDispatch(InIndex inIndex, TensorId tensorId)
Connect an existing tensor at an index with the source virtual graph.
Dispatcher to resolve issues with templated inheritance overloads. This will automatically derive the virtual graph ID of the input when required.
- Parameters
inIndex – The input index at which to connect the tensor.
tensorId – The ID of the existing tensor.
-
void connectInTensorLike(const Op *other, InIndex index, TensorId tenId)
Connects the input tensor analogously to another op.
This is useful when cloning graphs or ops, because it avoids having to check if the op requires special considerations when connecting inputs.
IpuCopyOp is currently the only op where this applies; otherwise, a source virtual graph has to be specified when connecting it:
void connectInTensor(InIndex, TensorId, uint64_t sourceIpu);
- Parameters
other – An op of the same type as the current op, from which to copy how the tensor at the corresponding index should be connected.
index – The input index to connect.
tenId – The ID of the tensor to connect.
-
void connectOutTensor(OutIndex, TensorId)
Connect existing tensor to output index.
- Parameters
index – The output index at which to connect the tensor.
tensorId – The ID of the existing tensor.
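For example, a hedged sketch of manually inserting an op into a graph and wiring its tensors (graph, inId and outId are assumed to exist; IdentityOp is used purely for illustration):
C++:
// Create the op, connect an existing input tensor, create and connect
// the output tensor, then infer the output TensorInfo.
Op *op = graph.createOp<IdentityOp>(Onnx::Operators::Identity_1,
                                    Op::Settings(graph, "identity"));
op->connectInTensor(0, inId);
op->createAndConnectOutTensor(0, outId);
op->setup();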
-
void disconnectInTensor(Tensor *tensor)
Disconnect an input tensor from the op.
- Parameters
tensor – The tensor to disconnect.
-
virtual void disconnectInTensor(InIndex, Tensor *tensor)
Disconnect an input tensor from the op at a specific input index.
- Parameters
tensor – The tensor to disconnect.
InIndex – The index of the input tensor in the op.
-
void disconnectInTensor(InIndex)
Disconnect an input tensor from the input index.
- Parameters
InIndex – The input index to disconnect the tensor from.
-
void disconnectOutTensor(Tensor *tensor)
Disconnect an output tensor from the op.
- Parameters
tensor – The tensor to disconnect.
-
void disconnectAllInputs()
Disconnect all input tensors from the op.
-
void disconnectAllOutputs()
Disconnect all output tensors from the op.
-
const std::string &name() const
Return the op name.
-
virtual void setup()
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
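For instance, a minimal sketch of a setup() override for an op whose single output has the same shape and type as its single input (MyOp is hypothetical):
C++:
void MyOp::setup() {
  // Propagate the input's TensorInfo (shape and type) to the output.
  outInfo(0) = inInfo(0);
}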
-
void finalizeDebugInfo()
Finalize DebugInfo.
This method is called once after Ir::prepare() has completed.
-
virtual void setCalledSubgraphGradInfo(const FwdGraphToBwdGraphInfo &calledGraphsGradInfo)
Set information about the gradient graphs for this op’s called subgraphs.
If the op has called subgraphs, then this method will get called prior to
getGradOps()
to provide the op with the information it needs to call the grad version of the called subgraphs.- Parameters
calledGraphsGradInfo – The mapping between the forward graph and information on the gradient graph.
-
virtual std::vector<std::unique_ptr<Op>> getGradOps()
Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.
There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.
The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.
Throws an error if this op is already a gradient op.
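To illustrate the usual pattern, a hedged sketch follows (MyOp and MyGradOp are hypothetical):
C++:
std::vector<std::unique_ptr<Op>> MyOp::getGradOps() {
  std::vector<std::unique_ptr<Op>> upops;
  upops.emplace_back(std::make_unique<MyGradOp>(*this));
  return upops;
}

const std::vector<GradInOutMapper> &MyGradOp::gradInputInfo() const {
  // Input 0 of the grad op is the gradient of MyOp's output 0.
  static const std::vector<GradInOutMapper> inInfo = {
      {0, 0, GradOpInType::GradOut}};
  return inInfo;
}

const std::map<int, int> &MyGradOp::gradOutToNonGradIn() const {
  // Output 0 of the grad op is the gradient of MyOp's input 0.
  static const std::map<int, int> outInfo = {{0, 0}};
  return outInfo;
}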
-
virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const
Return the variants of this op (if any) which can modify / alias the inputs at the given indices.
This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.
-
virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const
Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().
- Parameters
OperatorIdentifier – The operator identifier of the op to be instantiated.
- Returns
An instance of the required op.
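A hedged sketch of the two overrides working together (MyOpInplace and its operator identifier are hypothetical):
C++:
std::vector<std::tuple<OperatorIdentifier, float>>
MyOp::inplacePriorityDefault() const {
  // One in-place variant; larger values indicate higher preference.
  return {{Onnx::CustomOperators::MyOpInplace, 10.0f}};
}

std::unique_ptr<Op>
MyOp::getInplaceVariant(const OperatorIdentifier &operatorId) const {
  if (operatorId == Onnx::CustomOperators::MyOpInplace) {
    return std::make_unique<MyOpInplace>(*this);
  }
  // Defer to the base class, which raises an error for unknown identifiers.
  return Op::getInplaceVariant(operatorId);
}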
-
virtual void growAliasModel(AliasModel &aliasModel) const
For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.
The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.
To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
- Pre
All input tensors of this op have mappings in aliasModel before the call to growAliasModel.
- Post
All output tensors of this op have mappings in aliasModel after the call to growAliasModel.
-
virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel &aliasModel, OperatorIdentifier) const
Translate a PopART inplacing proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
OperatorIdentifier – The operator identifier to translate to the AliasModel equivalent.
- Returns
A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is a input index.
-
virtual view::Regions modifies(InIndex) const
Return the input region which this op modifies (for inplace ops).
- Parameters
InIndex – The input index.
- Returns
The regions which this op modifies.
-
virtual view::Regions uses(InIndex) const
Return the input region which this op uses.
- Parameters
InIndex – The input index.
- Returns
The regions which this op uses.
-
virtual view::Regions aliases(InIndex, OutIndex) const
Return the input region which the op output will alias (for inplace and view-changing ops).
See also
For more information on views, refer to the IPU Programmer’s Guide.
- Parameters
InIndex – The input index.
OutIndex – The output index.
- Returns
The regions which the output will alias.
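For example, a hedged sketch of an in-place op reporting that it modifies, and aliases to its output, the whole of its input (MyOpInplace is hypothetical):
C++:
view::Regions MyOpInplace::modifies(InIndex index) const {
  // The entire input region is written in place.
  return {view::Region::getFull(inShape(index))};
}

view::Regions MyOpInplace::aliases(InIndex in, OutIndex) const {
  // The output is a full alias of the input.
  return {view::Region::getFull(inShape(in))};
}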
-
virtual view::RegMap fwdRegMap(InIndex, OutIndex) const
Map regions of the input tensor at the input index to the regions of the output tensor at the output index that these input regions alias.
- Parameters
InIndex – The op input index.
OutIndex – The op output index.
-
virtual view::RegMap bwdRegMap(InIndex, OutIndex) const
Map regions of the output tensor at the output index to the regions of the input tensor at the input index that these output regions alias.
- Parameters
InIndex – The op input index.
OutIndex – The op output index.
-
virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const
Determine whether output tensors are guaranteed to have an equal value across all replicas.
This means that they are “replica equal”. The check is based on information about the replica equal status of input tensors (and the same for any inputs that are modified by the op).
The default implementation sets each output tensor as being replica-equal if and only if all tensor inputs are replica-equal. For modified inputs, the default is to assume it is replica-equal only if there is an output that is deemed replica-equal that fully aliases all elements of the input. This default implementation is not correct for all ops. Ops that need a specialized implementation should override this virtual function.
- Parameters
aliasModel – An alias model object.
inputMap – A map that stores, for each input, whether the inputs are data-equivalent over all replicas.
proxy – A helper object passed in by the replica-equal analysis.
- Returns
A tuple comprising:
a mapping from output index to a replica-equal status with an entry for each output tensor.
a vector of input indices for inputs that were modified by the op to a value that is not replica-equal.
-
bool doesAlias() const
Check if any input tensor aliases any output tensor.
- Returns
true
if any input tensor aliases any output tensor, otherwisefalse
.
-
inline bool isOutplace() const
Check if this is an outplace op.
This means that no input tensor aliases any output tensor.
- Returns
true
if this is an outplace op, otherwisefalse
.
-
bool doesAlias(InIndex inIndex, OutIndex outIndex) const
Check that the input tensor at an input index aliases the output tensor at an output index.
- Returns
true
if the input tensor atinIndex
aliases the output tensor atoutIndex
,false
otherwise.
-
bool modifies() const
Check if op modifies a tensor at any index.
- Returns
true
if the op modifies a tensor at any index, otherwisefalse
.
-
bool modifiesIndex(InIndex in) const
Check if an op modifies a tensor at a specific index.
- Parameters
in – The input index to check.
- Returns
true
if the op modifies the tensor,false
otherwise.
-
bool overwritesTensor(Tensor *t) const
Check if an op overwrites a tensor.
- Parameters
t – The tensor to check.
- Returns
true
if it overwrites the tensor,false
otherwise.
-
bool modifiesTensor(Tensor *t) const
Check if an op modifies a tensor.
- Parameters
t – The tensor to check.
- Returns
true
if it modifies the tensor,false
otherwise.
-
inline virtual bool isInplaceViewChange() const
Check if this is an inplace op that changes a view.
Examples of inplace ops that change views are:
ReshapeInplaceOp
IdentityInplaceOp
TransposeInplaceOp.
See also
For more information on views, refer to the IPU Programmer’s Guide.
- Returns
true
if this is a view changing inplace op,false
otherwise.
-
inline virtual bool isOutplaceViewChange() const
Check if this is an outplace op that changes a view.
Examples of outplace ops that change views are:
ReshapeOp
IdentityOp
TransposeOp.
See also
For more information on views, refer to the IPU Programmer’s Guide.
- Returns
true
if this is a view changing outplace op, otherwisefalse
.
-
virtual int getNonGradInIndex(int gradOpOutIndex) const
Return the index in the non-grad op which has an output edge-gradient tensor in the matching grad op.
This method throws an error if the op this is called on is not a grad op.
- Parameters
gradOpOutIndex – The index at which the grad op has an output of an edge-gradient tensor.
- Returns
The index in the non-grad op containing the input tensor corresponding to the edge-gradient tensor in the grad op output.
-
virtual const std::vector<GradInOutMapper> &gradInputInfo() const
Get the mapping between input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.
This method throws an error if the op this is called on is not a grad op.
- Returns
The mapping between input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.
-
virtual const std::map<int, int> &gradOutToNonGradIn() const
Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.
This method throws an error if the op this is called on is not a grad op.
-
virtual std::unique_ptr<Op> clone() const = 0
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
template<typename T>
inline bool isConvertibleTo() const
-
virtual bool isLossOp() const
Check if this is a LossOp op, for example NllOp or L1Op.
Note
The op SumOp which adds the losses together is not a LossOp.
- Returns
true
if this is a LossOp op,false
otherwise.
-
virtual bool isIpuCopyOp() const
Check if this is an IpuCopyOp op.
- Returns
true
if this is an IpuCopyOp op,false
otherwise.
-
virtual bool copiesOptimizerTensors() const
Check if this copies only optimizer tensors from one IPU to another.
- Returns
true
if this op copies only optimizer tensors from one IPU to another,false
otherwise.
-
virtual bool isOptimizerOp() const
Check if op is part of the optimizer.
-
bool isGradientClippingOp() const
Check if op is a part of gradient clipping.
-
virtual bool requiresRandomSeed() const
Check if the op requires a random seed.
This is set to false by default and should be overridden and set to true if an IPU random seed tensor is required by the op. If so, it will be connected to inTensor(getSeedInIndex()) by the IR process.
- Returns
true
if the op requires a random seed,false
otherwise.
-
virtual InIndex getSeedInIndex() const
Get the input index at which the op receives its random seed tensor.
This applies to ops for which requiresRandomSeed() returns true; the seed tensor is connected at this index by the IR process.
- Returns
The input index at which the random seed tensor is connected.
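A brief hedged sketch of opting in to a random seed (MyRandomOp is hypothetical):
C++:
bool MyRandomOp::requiresRandomSeed() const {
  // The IR process will connect a seed tensor at getSeedInIndex().
  return true;
}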
-
bool hasInput(InIndex index) const
Check if the op has an input at the input index.
- Returns
true
if the op has an input at the input index, otherwisefalse
.
-
bool hasOutput(OutIndex index) const
Check if the op has an output at the output index.
- Returns
true
if the op has an output at the output index, otherwisefalse
.
-
Tensor *inTensor(InIndex index)
Get the input tensor at the input index.
- Parameters
index – The input index.
- Returns
The tensor at the input index.
-
const Tensor *inTensor(InIndex index) const
Get the input tensor at the input index.
- Parameters
index – The input index.
- Returns
The tensor at the input index.
-
Tensor *outTensor(OutIndex index)
Get the output tensor at the output index.
- Parameters
index – The output index.
- Returns
The tensor at the output index.
-
const Tensor *outTensor(OutIndex index) const
Get the output tensor at the output index.
- Parameters
index – The output index.
- Returns
The tensor at the output index.
-
TensorId inId(InIndex index)
Get the ID of the input tensor at the input index.
- Parameters
index – The input index.
- Returns
The tensor ID of the tensor at the input index.
-
const TensorId inId(InIndex index) const
Get the ID of the input tensor at the input index.
- Parameters
index – The input index.
- Returns
The tensor ID of the tensor at the input index.
-
TensorId outId(OutIndex index)
Get the ID of the output tensor at the output index.
- Parameters
index – The output index.
- Returns
The tensor ID of the tensor at the output index.
-
const TensorId outId(OutIndex index) const
Get the ID of the output tensor at the output index.
- Parameters
index – The output index.
- Returns
The tensor ID of the tensor at the output index.
-
TensorInfo &inInfo(InIndex index)
Get the info of the input tensor at the input index.
- Parameters
index – The input index.
- Returns
The tensor info of the tensor at the input index.
-
const TensorInfo &inInfo(InIndex index) const
Get the info of the input tensor at the input index.
- Parameters
index – The input index.
- Returns
The tensor info of the tensor at the input index.
-
TensorInfo &outInfo(OutIndex index)
Get the info of the output tensor at the output index.
- Parameters
index – The output index.
- Returns
The tensor info of the tensor at the output index.
-
const TensorInfo &outInfo(OutIndex index) const
Get the info of the output tensor at the output index.
- Parameters
index – The output index.
- Returns
The tensor info of the tensor at the output index.
-
const Shape &inShape(InIndex index) const
Get the shape info of the input tensor at the input index.
- Parameters
index – The input index.
- Returns
The shape info of the tensor at the input index.
-
const Shape &outShape(OutIndex index) const
Get the shape info of the output tensor at the output index.
- Parameters
index – The output index.
- Returns
The shape info of the tensor at the output index.
-
size_t inTensorCount() const
Get the number of input tensors of this op.
- Returns
The number of input tensors this op has.
-
size_t outTensorCount() const
Get the number of output tensors of this op.
- Returns
The number of output tensors this op has.
-
Rank inRank(InIndex index) const
Get the rank of the input tensor at the input index.
- Parameters
index – The input index.
- Returns
The rank of the tensor at the input index.
-
Rank outRank(OutIndex index) const
Get the rank of the output tensor at the output index.
- Parameters
index – The output index.
- Returns
The rank of the tensor at the output index.
-
InIndex inIndex(Tensor*) const
Get the input index of the tensor.
- Parameters
Tensor – The input tensor.
- Returns
The input index of the tensor in the op.
-
OutIndex outIndex(Tensor*) const
Get the output index of the tensor.
- Parameters
Tensor – The output tensor.
- Returns
The output index of the tensor in the op.
-
virtual void appendAttributes(OpSerialiserBase&) const
Append attributes when serialising the op to a stream.
This is used for debugging and also to generate the PopART IR hash. This hash is used to determine whether a Poplar cache can be reused so it is important that op attributes which may alter the Poplar compilation are appended to this stream. If this method is overridden, then it must also call the base class method.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
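For instance, a hedged sketch of appending an attribute that affects the Poplar compilation (the float member myFactor is hypothetical):
C++:
void MyOp::appendOutlineAttributes(OpSerialiserBase &os) const {
  // Always call the base class method first.
  Op::appendOutlineAttributes(os);
  os.appendAttribute("myFactor", myFactor);
}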
-
virtual void appendMore(OpSerialiserBase&) const
Append additional attributes to the stream.
This method should be overridden if the derived class has additional attributes.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
Shape prettyNpOut(const Shape &s0, const Shape &s1) const
Calculate the NumPy broadcast shape for two shapes.
This will throw an error if the broadcast is not aligned. The error will have operator context. Note: If the replicated tensor sharding meta-shape is required, use prettyNpOut with TensorInfo instead.
- Parameters
s0 – The first shape.
s1 – The second shape.
- Returns
The NumPy-like broadcasted output shape.
-
TensorInfo prettyNpOut(const TensorInfo &i0, const TensorInfo &i1, bool checkDataType = true) const
Calculate the NumPy broadcast shape for two shapes.
This will throw an error if the broadcast is not aligned. The error will have operator context.
- Parameters
i0 – The info for the first tensor containing shape and meta-shape.
i1 – The info for the second tensor containing shape and meta-shape.
checkDataType – Check that the data types are identical. If
true
, check that the data types are identical and throw an error if they are not. Iffalse
, do not check that data types are identical.
- Returns
The NumPy-like broadcast output info containing the correct shape and meta-shape. The data type is taken from i0.
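As an example, a hedged sketch of a binary op's setup() using the TensorInfo overload (MyBinaryOp is hypothetical):
C++:
void MyBinaryOp::setup() {
  // NumPy rules broadcast, for example, {2, 1, 4} with {3, 1} to {2, 3, 4};
  // incompatible shapes raise an error with operator context.
  outInfo(0) = prettyNpOut(inInfo(0), inInfo(1));
}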
-
virtual std::vector<const Graph*> getCalledGraphs() const
Get all graphs that this op may call during its execution.
- Returns
A vector of all graphs that this op may call during its execution.
-
std::vector<GraphId> getCalledGraphIds() const
Get the IDs of all graphs that this op may call during its execution.
- Returns
A vector of IDs of all graphs that this op may call during its execution.
-
SubgraphIndex getCalledGraphIndex(const GraphId &id) const
Get the index in the op where the graph is called.
- Parameters
id – The ID of the called graph.
- Returns
The index at which the graph is called.
-
virtual InIndex opInToSubgraphInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const
Get the input index for the subgraph corresponding to the op input index.
- Parameters
subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).
inIndex – The input index in the op.
- Returns
The input index in the subgraph that corresponds to the input index in the op, or -1 if the op input index is not used by the subgraph.
-
virtual InIndex subgraphInToOpInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const
Get the input index for the op corresponding to the subgraph input index.
- Parameters
subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).
inIndex – The input index in the subgraph.
- Returns
The input index in the op that corresponds to the input index in the subgraph, or -1 if the subgraph input index is not used by the op.
-
virtual OutIndex opOutToSubgraphOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const
Get the output index for the subgraph corresponding to the op output index.
- Parameters
subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).
outIndex – The output index in the op.
- Returns
The output index in the subgraph that corresponds to the output index in the op, or -1 if the op output index is not used by the subgraph.
-
virtual OutIndex subgraphOutToOpOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const
Get the output index for the op corresponding to the subgraph output index.
- Parameters
subgraphIndex – The index of the subgraph from the set of subgraphs called by this op (returned by getCalledGraphs()).
outIndex – The output index in the subgraph.
- Returns
The output index in the op that corresponds to the output index in the subgraph, or -1 if the subgraph output index is not used by the op.
-
virtual std::set<OutIndex> opInToOpOutIndex(InIndex in) const
Get the set of outputs to visit based on the input index (for graph traversal).
- Parameters
in – The input index used to determine the set of outputs to visit.
- Returns
The set of outputs to visit based on the input index.
-
virtual std::set<InIndex> opOutToOpInIndex(OutIndex out) const
Get the set of inputs to visit based on the output index (for graph traversal).
- Parameters
out – The output index used to determine the set of inputs to visit.
- Returns
The set of inputs to visit based on the output index.
-
std::string getSubgraphEquivId(const std::map<std::string, popart::any> &externalAttrs = {}) const
Get a string that represents the equivalence class that this op belongs to.
This is used, for example by transforms, to determine if two ops are the same. If and only if two ops return the same equivalence ID then those ops can be considered of the same equivalence class.
- Parameters
externalAttrs – Additional attributes by which to distinguish this op. The value types must be one of: float, double, int, int64_t, uint32_t, uint64_t, std::string, std::vector<float>, std::vector<double>, std::vector<int64_t>, popart::Scope, bool, nonstd::optional<int64_t>, nonstd::optional<float>, nonstd::optional<double> or std::map<TensorId, uint64_t>. We use this to add, for example replica-equalness properties to the equivalence ID, which is a property that is calculated on-the-fly as opposed to stored in the op.
- Returns
The equivalence ID.
-
std::map<fwtools::subgraph::InIndex, SubgraphInSig> getSubgraphInputs() const
Get all the producer ops of the tensors consumed at the input index.
- Returns
A map of producer ops for the tensors consumed at the input index.
-
std::map<fwtools::subgraph::OutIndex, OpSet> getSubgraphOutputs() const
Get all the consumer ops of the tensors produced at the output index.
- Returns
A map of consumer ops for the tensors produced at the output index.
-
virtual float getSubgraphValue() const = 0
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
-
inline float getHighSubgraphValue() const
Return the high subgraph value.
-
inline float getLowSubgraphValue() const
Return the low subgraph value.
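A minimal sketch of typical overrides (both op classes are hypothetical):
C++:
// Inexpensive ops usually return the low bounding value, expensive
// ops the high one, steering the outlining algorithm accordingly.
float MyCheapOp::getSubgraphValue() const { return getLowSubgraphValue(); }
float MyConvLikeOp::getSubgraphValue() const { return getHighSubgraphValue(); }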
-
virtual float calcAutoVirtualGraphCost(std::set<int> &inputs_seen)
Get approximate cost of activations between forward and backward graphs.
-
virtual bool isOutlineable() const
Check if op can be outlined.
If this method returns
false
, it will mean that any possible subgraph that this op is part of will not be cached.- Returns
true
if the op can be outlined,false
otherwise. Default:true
.
-
virtual bool hasSideEffect() const
Check if the op has any effect that is not captured by the (modification of) input or output tensors, such as modifying the state of the IPU or host system.
- Returns
true
if the op has side effects,false
otherwise. Default:false
.
-
virtual bool canRecompute() const
Check if the op can be recomputed.
To recompute an op means to clone it to produce the same output. The function checks the safeness of recompute in the context of explicit recompute. It may still be unsafe for implicit recompute.
- Returns
true
if the op can be recomputed,false
otherwise. Default: hasSideEffect().
-
bool inputsUnmodifiable() const
Check if any input indices are unmodifiable or alias an unmodifiable tensor.
- Returns
true
if any connected variable tensor for all input indices has a non-empty alias chain and is unmodifiable,false
otherwise.
-
bool consumesGraphOutput() const
Check if op consumes the outputs of the graph.
- Returns
true
if op consumes graph outputs,false
otherwise.
-
bool producesGraphOutput() const
Check if op produces the outputs of the graph.
- Returns
true
if op produces graph outputs,false
otherwise.
-
bool inputUnmodifiable(InIndex in) const
Check if the input index is unmodifiable or aliases an unmodifiable tensor.
- Parameters
in – The input index to check.
- Returns
true
if any connected variable tensor has a non-empty alias chain and is unmodifiable,false
otherwise.
-
bool inputUnmodifiableFor(InIndex in, const AliasModel *popMem) const
Check if the input index is unmodifiable or aliases an unmodifiable tensor, according to the given poprithms alias model.
- Parameters
in – The input index to check.
popMem – The AliasModel to use for the check.
- Returns
true
if any connected variable tensor has a non-empty alias chain and is unmodifiable,false
otherwise.
-
bool hasAliasedModifiers(OutIndex out) const
Check if output is modified by any consumer.
- Parameters
out – The output index to check.
- Returns
true
if any consumer of any aliased tensor downstream modifies a non-empty region,false
otherwise.
-
bool hasAliasedModifiersFor(OutIndex out, const AliasModel *popMem) const
Check if the output is modified by any consumer, according to the given poprithms alias model.
- Parameters
out – The output index to check.
popMem – The AliasModel to use for the check.
- Returns
true
if any consumer of any aliased tensor downstream modifies a non-empty region,false
otherwise.
-
bool isParentOf(const Op*) const
Check if this op is a parent of another op.
An op is a parent of another op if and only if the other op is its child.
- Parameters
Op – The op that is being checked.
- Returns
true
if this op is a parent of the given op,false
otherwise.
-
bool isChildOf(const Op*) const
Check if this op is a child of another op.
An op is a direct child of another op if it consumes any of the tensors the other op produces.
- Parameters
Op – The op that is being checked.
- Returns
true
if this op is a child of the given op,false
otherwise.
-
virtual bool canShard() const
Check if the operation can be sharded into multiple operations.
- Returns
true
if the operation can be sharded,false
otherwise.
-
virtual ReductionType getShardReductionType(OutIndex index) const
Get the reduction type to apply after sharding, if the output shape does not change.
- Parameters
index – The output index at which to determine the reduction type.
- Returns
The reduction type.
-
inline virtual float getShardRescaleFactor(Op *const shardedOp, OutIndex index) const
Get the scale factor to apply after sharding, if required.
- Parameters
shardedOp – The sharded op.
index – The output index at which to determine the scale factor.
- Returns
The scale factor. Default:1.0.
-
std::map<TensorId, std::vector<TensorId>> shard(const std::map<TensorId, std::vector<TensorId>> &inputs)
Shard an operation into multiple operations according to the new, already sharded input tensors.
- Parameters
inputs – The sharded input tensors.
- Returns
The sharded output tensors.
-
ShardingPlan shard(const ShardingPlan plan)
Create an output sharding plan from sharding an op.
The sharding plan also contains the individual input/output shards of an operation. When sharding an operation, the new plan is updated with the resulting sharded tensors.
- Parameters
plan – The input sharding.
- Returns
The plan after sharding the operation containing the resulting sharded tensors.
-
virtual void configureShardedOp(Op *const shardedOp, const Settings *const settings_) const
Configure a sharded op.
- Parameters
shardedOp – The sharded op to be configured.
settings_ – The settings to apply to the sharded op.
-
virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const
Return which inputs and outputs are replicated tensor sharding pairs.
-
virtual void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, CommGroup shardingDomain)
Configure the op for replicated tensor sharding at specific indices.
- Parameters
indices – The indices at which to configure the op for replicated tensor sharding.
shardingDomain – The type and size of the replica group specified by a CommGroup object.
-
virtual void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, const ReplicaGrouping &grouping)
Configure the op for replicated tensor sharding at specific indices.
- Parameters
indices – The indices at which to configure the op for replicated tensor sharding.
grouping – The stride and size of the replica group specified by a ReplicaGrouping object.
-
void transferBaseProperties(Op *to)
Transfer the base properties from this op to another op.
- Parameters
to – The op to transfer the base properties to.
-
Op *getPrecedingOp(InIndex inIndex)
Get the producer op of the input tensor at the input index.
- Parameters
inIndex – The index at which the input tensor is produced.
- Returns
The op which produces the input tensor at the input index.
-
Op *getFollowingOp(OutIndex outIndex = 0)
Get the op that consumes an output tensor at an output index.
This will throw an error if there is more than one consumer op.
- Parameters
outIndex – The index at which the output tensor is consumed.
- Returns
The op which consumes the output tensor at the output index.
-
std::vector<Op*> getFollowingOps(OutIndex outIndex = 0)
Get all ops that consume an output tensor at an output index.
- Parameters
outIndex – The index at which the output tensor is consumed.
- Returns
A vector of ops which consume the output tensor at the output index.
-
template<typename T>
inline T *getPrecedingOp(InIndex inIndex)
Get the producer op of the input tensor at the input index.
This will throw an error if the producer op cannot be converted to type
T
.- Parameters
inIndex – The index at which the input tensor is produced.
- Returns
The op, converted to type
T
, which produces the input tensor at the input index.
-
template<typename T>
inline T *getFollowingOp(OutIndex outIndex = 0)
Get the op that consumes an output tensor at an output index.
This will throw an error if there is more than one consumer op, or if the consumer op cannot be converted to type
T
.- Parameters
outIndex – The index at which the output tensor is consumed.
- Returns
The op, converted to type
T
, which consumes the output tensor at the output index.
-
template<typename T>
inline std::vector<T*> getFollowingOps(OutIndex outIndex = 0)
Get all ops that consume an output tensor at an output index.
This will throw an error if not all of the consumer ops can be converted to type
T
.- Parameters
outIndex – The index at which the output tensor is consumed.
- Returns
A vector of ops, converted to type
T
, which consume the output tensor at the output index.
-
bool isPipelineIpuCopyOp() const
Check if the op is of the class IpuCopyOp that copies between pipeline stages.
- Returns
true
if op is of the class IpuCopyOp and copies between pipeline stages,false
otherwise.
Public Members
-
std::unique_ptr<TensorIndexMap> input
-
std::unique_ptr<TensorIndexMap> output
-
OperatorIdentifier opid
-
bool pruneable = true
-
OpDebugInfo debugInfo
-
struct Settings
Structure to capture the settings for the op.
Public Functions
-
inline Settings(Graph &graph_, const std::string &name_)
Constructor for the Settings structure.
- Parameters
graph_ – The graph the op belongs to.
name_ – The name of the op.
-
inline Settings(Graph &graph_, const std::string &name_, const Scope &scope_)
Constructor for the Settings structure.
- Parameters
graph_ – The graph the op belongs to.
name_ – The name of the op.
scope_ – The scope of the op.
-
inline Settings(Graph &graph_, const std::string &name_, const Scope &scope_, const uint64_t parentId_)
Constructor for the Settings structure.
- Parameters
graph_ – The graph the op belongs to.
name_ – The name of the op.
scope_ – The scope of the op.
parentId_ – The ID of the debug info.
-
inline Settings(Graph &graph_, const std::string &name_, const uint64_t parentId_)
Constructor for the Settings structure.
- Parameters
graph_ – The main graph.
name_ – The name of the op.
parentId_ – The ID of the debug info.
-
inline Settings copy(const std::string &new_name)
Create a copy of the current settings with a new name.
- Parameters
new_name – The name of the new settings.
- Returns
A copy of the current settings with the new name.
-
virtual void setFromAttributes(const Attributes &attributes)
Append the optional attributes to the Settings structure depending on whether the attribute has been set in the ONNX model.
- Parameters
attributes – The attributes to be added to the Settings structure.
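To illustrate, a hedged sketch of constructing settings and an op with them (graph is assumed to be a popart::Graph obtained from the IR; the construction path is illustrative):
C++:
// Settings carry the graph, name and scope that the op is built with.
Op::Settings settings(graph, "my_add");
auto addOp = std::make_unique<AddOp>(Onnx::Operators::Add_7, settings);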
Public Members
-
std::string name = ""
-
RecomputeType recomputeType = RecomputeType::Undefined
-
OptionalTensorLocation tensorLocation
-
std::vector<std::tuple<std::string, float>> inplacePriorityVeto
-
std::unordered_set<std::string> excludePatterns
-
OptionalVGraphId vgraphId
-
OptionalPipelineStage pipelineStage
-
OptionalExecutionPhase executionPhase
-
OptionalBatchSerializedPhase batchSerializedPhase
-
OptionalStochasticRoundingMethod stochasticRoundingMethod
-
ExecutionContext executionContext = {ExecutionContext::Normal}
-
double schedulePriority = {0.0}
-
std::map<std::string, std::string> extraOutlineAttributes
-
uint64_t debugInfoId = {0}
-
bool optimizerOp = {false}
-
bool gradientClippingOp = {false}
-
class GradInOutMapper
Class that represents the mapping between the indices of the input tensors to the gradient operation and the indices of these same tensors in the non-gradient operation.
Public Functions
-
GradInOutMapper(InIndex iGrad_, int iNonGrad_, GradOpInType)
Constructor for the GradInOutMapper class.
- Parameters
iGrad_ – The index of the input tensor to the gradient operation.
iNonGrad_ – The index of the gradient operation input tensor as it is indexed in the non-gradient operation.
GradOpInType – The type of the input tensor to the gradient operation.
-
bool operator==(const GradInOutMapper &rhs) const
Check if the current GradInOutMapper object is equal to another GradInOutMapper object.
- Parameters
rhs – A GradInOutMapper object to be compared to the current object.
- Returns
true
if objects are equal,false
otherwise.
-
enum class popart::ReductionType
Define the reduction operation to use over a sequence of tensors.
The two use-cases for this enum type are:
denoting how to reduce individual losses produced by a LossOp over a minibatch (specified by the LossOp reduction parameter)
denoting how to reduce weight gradients over a number of replicas when gradient accumulation is enabled (specified by the global session option SessionOptions::accumulationAndReplicationReductionType).
Values:
-
enumerator Sum = 0
Sum the input values and do not scale the output (Default).
-
enumerator Mean
Take the mean of the input values.
-
enumerator NoReduction
Do not reduce the input values.
Keep them stacked into a single tensor. So values \(t_1, ..., t_k\) get collected into a tensor \([t_1, ..., t_k]\).
-
enumerator N
The number of ReductionType values.
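For example, a hedged sketch of selecting the reduction when adding an L1 loss through the Builder (builder is assumed to be a popart::Builder*, and out a TensorId):
C++:
// Reduce the per-sample L1 losses to their mean over the minibatch.
auto loss = builder->aiGraphcoreOpset1().l1loss(
    {out}, 0.1f, ReductionType::Mean);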
#include <popart/operatoridentifier.hpp>
-
struct OperatorIdentifier
Subclassed by popart::AiGraphcoreOpIdV1
Public Functions
-
inline OperatorIdentifier(const OpDomain &_domain, const OpType &_type, OpVersion _version, NumInputs inputs = {}, int outputs = 0)
-
inline bool operator==(const OperatorIdentifier &rhs) const
-
inline bool operator!=(const OperatorIdentifier &rhs) const
-
inline bool operator<(const OperatorIdentifier &rhs) const
-
struct NumInputs
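A hedged sketch of defining an identifier for a custom op (the domain and type are illustrative):
C++:
// domain:type:version, with one input and one output.
const static OperatorIdentifier myOpId("custom.ops", "MyOp", 1, {1, 1}, 1);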
#include <popart/tensorlocation.hpp>
#include <popart/basicoptionals.hpp>
-
using popart::OptionalTensorLocation = BasicOptional<TensorLocation, 9>
-
using popart::OptionalVGraphId = BasicOptional<VGraphId, 2>
-
using popart::OptionalPipelineStage = BasicOptional<PipelineStage, 3>
-
using popart::OptionalExecutionPhase = BasicOptional<ExecutionPhase, 5>
-
using popart::OptionalBatchSerializedPhase = BasicOptional<BatchSerializedPhase, 7>
-
using popart::OptionalStochasticRoundingMethod = BasicOptional<StochasticRoundingMethod, 10>
-
using popart::OptionalDataType = BasicOptional<DataType, 0>
#include <popart/opmanager.hpp>
-
class OpDefinition
Public Types
-
struct Attribute
Public Functions
-
inline Attribute(std::string regex)
Public Members
-
std::string supportedValuesRegex
-
struct Input
-
struct Output
-
struct Attribute
-
class OpCreatorInfo
Public Functions
-
inline OpCreatorInfo(const OperatorIdentifier &_opid, const Op::Settings &_settings, const Attributes &_attributes, const std::vector<TensorId> &_inputIds, const std::vector<TensorId> &_outputIds)
-
inline bool hasInputIds() const
-
inline bool hasOutputIds() const
-
TensorData *getInputTensorData(int index) const
-
TensorInfo &getInputTensorInfo(int index) const
-
bool hasInputTensor(int index) const
-
std::string debugName() const
-
class OpManager
Public Types
-
using OpFactoryFunc = std::function<std::unique_ptr<Op>(const OpCreatorInfo&)>
-
using ComplexOpFactoryFunc = std::function<Op*(const OpCreatorInfo&, Graph &graph)>
Public Functions
-
OpManager() = default
Public Static Functions
-
static Attributes getAttributesFromAnyMap(std::map<std::string, popart::any> attributes)
-
static std::unique_ptr<Op> createOp(const OpDomain &domain, const OpType &type, const int opsetVersion, Graph &graph, const std::string &name = "", const Scope &scope = {}, const Attributes &_attr = {}, const std::vector<TensorId> &inputIds = {}, const std::vector<TensorId> &outputIds = {})
-
static std::unique_ptr<Op> createOp(const OperatorIdentifier &opid, Graph &graph, const std::string &name = "", const Attributes &_attr = {})
-
static std::unique_ptr<Op> createOpWithInputs(const OperatorIdentifier &opid, Graph &graph, const std::string &name, const Attributes &_attr, const std::vector<TensorId> &inIds)
-
static const std::vector<OperatorIdentifier> getSupportedOperations(bool includePrivate)
-
static const std::vector<OperatorIdentifier> getUnsupportedOperations(int opsetVersion)
-
static const OpDefinitions getSupportedOperationsDefinition(bool includePrivate)
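As an illustration, a hedged sketch of the usual registration pattern that makes an op constructible through OpManager::createOp (MyOp and its identifier MyOp_1 are hypothetical):
C++:
namespace {
static OpDefinition::DataTypes T = {DataType::FLOAT16, DataType::FLOAT};

static OpDefinition myOpDef({OpDefinition::Inputs({{"X", T}}),
                             OpDefinition::Outputs({{"Y", T}}),
                             OpDefinition::Attributes({})});

static OpCreator<MyOp> myOpCreator(
    OpDefinitions({{Onnx::CustomOperators::MyOp_1, myOpDef}}),
    [](const OpCreatorInfo &info) {
      return std::unique_ptr<Op>(new MyOp(info.opid, info.settings));
    },
    true);
} // namespace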
-
class OpInfo
Public Functions
-
inline OpInfo(const OperatorIdentifier &_id, bool _isPublic, const OpDefinition &_details, OpFactoryFunc _f1)
-
inline OpInfo(const OperatorIdentifier &_id, bool _isPublic, const OpDefinition &_details, ComplexOpFactoryFunc _f2)
-
OpFactoryFunc &getSimpleFactory()
-
ComplexOpFactoryFunc &getComplexFactory()
-
bool hasComplexFactory()
-
enum class popart::RecomputeType
Define the type of recomputation.
Values:
-
enumerator Undefined = 0
Default value if RecomputeType has not been set.
-
enumerator Checkpoint
Do not recompute. Outputs from the op are kept from the forward pass.
-
enumerator Recompute
Recompute operation.
-
enumerator Recomputed
For explicit recomputation, this marks a cloned operation that had RecomputeType::Recompute set.
After cloning, the original op is changed to RecomputeType::Checkpoint, and the cloned op is changed to Recomputed.
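For example, a hedged sketch of marking an existing op for recomputation (op is assumed to be a valid Op*):
C++:
// Recompute this op's outputs in the backward pass instead of
// stashing them; settings is a public member of Op.
op->settings.recomputeType = RecomputeType::Recompute;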
-
enum class popart::ExecutionContext
Define the type of the execution context.
Values:
-
enumerator Normal = 0
Run the forward and backward passes (Default).
-
enumerator AccumulateOuterFragment
Used to run the AccumulateOps after the gradient accumulation loop completes.
-
enumerator WeightsFromHostFragment
Used to transfer weights from host to device.
-
enumerator WeightsToHostFragment
Used to download weights from the device to the host.
-
enumerator OptimizerFromHostFragment
Used to stream the optimizer state from the host.
-
enumerator Subgraph
Program fragment used for subgraph-specific operations.
-
enum class popart::GradOpInType
Define the relationship between the input tensors of a gradient operation and the corresponding non-gradient operation.
Values:
-
enumerator In = 0
Indicates that the input tensor to the gradient operation is an input tensor of the non-gradient operation (Default).
-
enumerator Out
Indicates that the input tensor to the gradient operation is an output tensor of the non-gradient operation.
-
enumerator GradOut
Indicates that the input tensor to the gradient operation is an output gradient tensor of the non-gradient operation.
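These enumerators are used when a grad op declares where its inputs come from. A minimal sketch (MyGradOp is hypothetical; each GradInOutMapper takes the grad op's input index, the non-grad op's tensor index, and a GradOpInType):
C++:
const std::vector<popart::GradInOutMapper> &MyGradOp::gradInputInfo() const {
  static const std::vector<popart::GradInOutMapper> inInfo = {
      {0, 0, popart::GradOpInType::In},      // forward input 0
      {1, 0, popart::GradOpInType::Out},     // forward output 0
      {2, 0, popart::GradOpInType::GradOut}  // gradient of forward output 0
  };
  return inInfo;
}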
#include <popart/op/varupdate.hpp>
-
class VarUpdateOp : public popart::Op
Base class used to define PopART ops that update variable tensors.
Subclassed by popart::AccumulatorScaleOp, popart::VarUpdateWithUpdaterOp
Public Functions
-
VarUpdateOp(const OperatorIdentifier&, const Op::Settings&)
-
virtual std::unique_ptr<Op> clone() const override = 0
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
virtual view::Regions aliases(InIndex in, OutIndex) const override
Return the input region which the op output will alias (for inplace and view-changing ops).
See also
For more information on views, refer to the IPU Programmer’s Guide.
- Parameters
InIndex – The input index.
OutIndex – The output index.
- Returns
The regions which the output will alias.
-
virtual view::Regions modifies(InIndex) const override
Return the input region which this op modifies (for inplace ops).
- Parameters
InIndex – The input index.
- Returns
The regions which this op modifies.
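For a typical var-update-style op, both aliases() and modifies() report the whole of input 0. A hedged sketch (MyVarUpdateLikeOp is hypothetical; it assumes view::Region::getFull / view::Region::getEmpty and the Op shape accessors behave as in current PopART):
C++:
popart::view::Regions MyVarUpdateLikeOp::modifies(popart::InIndex index) const {
  // The variable input (index 0) is updated in place; other inputs
  // are only read.
  if (index == 0) {
    return {popart::view::Region::getFull(inShape(index))};
  }
  return {popart::view::Region::getEmpty(inRank(index))};
}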
-
inline virtual bool isOptimizerOp() const override
Check if op is part of the optimizer.
-
virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
Return which inputs and outputs are replicated tensor sharding pairs.
-
virtual void growAliasModel(AliasModel&) const override
For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.
The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.
To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
- Pre
All input tensors of this op have mappings in aliasModel before the call to growAliasModel().
- Post
All output tensors of this op have mappings in aliasModel after the call to growAliasModel().
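A minimal sketch of such an override (MyVarUpdateLikeOp is hypothetical; it relies on the AliasModel::insertUnaryModifier0 helper documented later in this chapter):
C++:
void MyVarUpdateLikeOp::growAliasModel(AliasModel &aliasModel) const {
  // Register the poprithms equivalent of this op: an open aliasGate at
  // input index 0 followed by a modify, mapped back to this op.
  aliasModel.insertUnaryModifier0(*this);
}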
-
class AccumulatorScaleOp : public popart::VarUpdateOp
Multiplies a tensor in place by an OptimizerValue factor.
As with other ops that consume OptimizerValues, it will only have an input tensor for the factor if the OptimizerValue is not const.
If the factor is const and 0, it will directly zero the input tensor.
Subclassed by popart::AccumulatorZeroOp
Public Functions
-
AccumulatorScaleOp(const OptimizerValue factor_, const Op::Settings&)
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
inline const OptimizerValue &getFactor() const
-
inline virtual float getSubgraphValue() const override
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
-
class AccumulatorZeroOp : public popart::AccumulatorScaleOp
An AccumulatorScaleOp with a factor of 0, so zeroes the input tensor.
-
class VarUpdateWithUpdaterOp : public popart::VarUpdateOp
Subclassed by popart::AccumulateBaseOp, popart::AdamComboOp, popart::AdamVarUpdateOp, popart::AdaptiveComboOp, popart::CopyVarUpdateOp, popart::ScaledVarUpdateOp, popart::SGD0ComboOp, popart::SGD0VarUpdateOpBase, popart::SGD1AcclUpdateOp, popart::SGD1VarUpdateOp, popart::SGDMComboBaseOp
Public Functions
-
VarUpdateWithUpdaterOp(const OperatorIdentifier &opid, const Op::Settings &settings_)
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
-
class AccumulateBaseOp : public popart::VarUpdateWithUpdaterOp
Subclassed by popart::AccumulateOp, popart::RescaleAccumulateOp, popart::SparseAccumulateOp
Public Functions
-
AccumulateBaseOp(const OperatorIdentifier &opid, AccumulationType type_, OptimizerValue factor_, const Op::Settings&)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const override
-
inline AccumulationType getAccumulationType() const
-
inline const OptimizerValue &getFactor() const
-
class AccumulateOp : public popart::AccumulateBaseOp
Public Functions
-
AccumulateOp(AccumulationType type, OptimizerValue factor, const Op::Settings&)
-
class RescaleAccumulateOp : public popart::AccumulateBaseOp
The same as AccumulateOp, but it also includes a rescale factor that allows the accumulator to be rescaled at the same time.
Public Functions
-
RescaleAccumulateOp(AccumulationType type_, OptimizerValue factor_, const Op::Settings&)
-
class SparseAccumulateOp : public popart::AccumulateBaseOp
Say you have: w -> Gather -> x.
In the backward pass you have: dW <- GatherGrad <- x
and when the optimiser step is grown:

  dW <- GatherGrad <- x
   \
    Accumulate -> accum’
   /
  accum

GatherGrad is essentially a scatter. Then we Accumulate the resultant dW on accum. This involves creating an extra dW tensor, so instead we can do:

        x
        |
        v
  accum -> SparseAccumulate -> accum’

where SparseAccumulate can, in one operation and without extra space, accumulate the slices of x into accum as required.
The input tensor at getOriginalVarToUpdateInIndex() is an optional input. It can be used when two different views of the weight are consumed in the forward pass (by ops that will be autodiffed), and one of those ops is a Gather, thus requiring a SparseAccumulate in the weight update step.
We connect getOriginalVarToUpdateInIndex() to the other view of the weight than the one this SparseAccumulate is for. Then, SparseAccumulateOpx will clone that tensor (and its layout) when creating accum.
You probably do not need this outside of the TiedGatherPattern.
See also
SparseAccumulateOpx::createInputTensor for further motivation of why it does this.
Public Functions
-
SparseAccumulateOp(AccumulationType type, const OptimizerValue &factor, unsigned axis, const Op::Settings&)
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
inline virtual std::set<InIndex> optionalInputs() const override
Return the input indices of all optional inputs to the op.
-
unsigned getAxis() const
-
class AdamComboOp : public popart::VarUpdateWithUpdaterOp
Public Functions
-
AdamComboOp(OptimizerValue initialLr, OptimizerValue initialWd, OptimizerValue initialB1, OptimizerValue initialB2, OptimizerValue initialEps, OptimizerValue initialLs, OptimizerValue mwn, OptimizerValue initialGs, AdamMode mode_, WeightDecayMode decayMode_, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, bool scaledOptimizerState_, const Op::Settings&)
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
Public Members
-
const OptimizerValue initLr
-
const OptimizerValue initWd
-
const OptimizerValue initB1
-
const OptimizerValue initB2
-
const OptimizerValue initEps
-
const OptimizerValue initLs
-
const OptimizerValue initMwn
-
const OptimizerValue initGs
-
const WeightDecayMode decayMode
-
const bool withGradAccum
-
const OptimizerReductionType reductionType
-
const bool scaledOptimizerState
-
class AdamVarUpdateOp : public popart::VarUpdateWithUpdaterOp
Public Functions
-
AdamVarUpdateOp(OptimizerValue initLr, OptimizerValue mwn, const Op::Settings&)
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
-
class AdaptiveComboOp : public popart::VarUpdateWithUpdaterOp
Public Functions
-
AdaptiveComboOp(OptimizerValue initialLr, OptimizerValue initialWd, OptimizerValue initialA, OptimizerValue initialM, OptimizerValue initialEps, OptimizerValue initialLs, OptimizerValue initialGs, AdaptiveMode mode_, WeightDecayMode decayMode_, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, DataType accl2Type_, DataType accl3Type_, bool rmspropTFVariant_, const Op::Settings&)
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
Public Members
-
const OptimizerValue initLr
-
const OptimizerValue initWd
-
const OptimizerValue initA
-
const OptimizerValue initM
-
const OptimizerValue initEps
-
const OptimizerValue initLs
-
const OptimizerValue initGs
-
const AdaptiveMode mode
-
const WeightDecayMode decayMode
-
const bool withGradAccum
-
const OptimizerReductionType reductionType
-
const bool rmspropTFVariant
-
class CopyVarUpdateOp : public popart::VarUpdateWithUpdaterOp
Public Functions
-
CopyVarUpdateOp(const OperatorIdentifier&, const Op::Settings&)
-
inline float getSubgraphValue() const final
-
class SGD0ComboOp : public popart::VarUpdateWithUpdaterOp
A single Op that encapsulates all the information needed to describe an SGD0 optimiser step.
The “0” in the name signifies that there is no optimizer state (note that a gradient accumulation tensor may still be required).
The “Combo” in the name signifies that this Op will later be decomposed into many Ops and Tensors that actually implement the optimiser step; in this case, by the SGD0Decompose pattern.
See also
SGD for the definition of what SGD0 is.
See also
SGD0Decompose for the definition of this decomposition.
Public Functions
-
SGD0ComboOp(OptimizerValue initialSwd, OptimizerValue initialSlr, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, const Op::Settings&)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual std::set<InIndex> optionalInputs() const override
Return the input indices of all optional inputs to the op.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
inline virtual float getSubgraphValue() const override
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
Public Members
-
OptimizerValue initSlr0
-
OptimizerValue initWdsf0
-
const bool withGradAccum
-
const OptimizerReductionType reductionType
-
class SGD0VarUpdateOpBase : public popart::VarUpdateWithUpdaterOp
Subclassed by popart::SGD0VarUpdateOp
Public Functions
-
SGD0VarUpdateOpBase(const OperatorIdentifier &_opid, OptimizerValue initialSlr0, OptimizerValue initialWdsf0, const Op::Settings &settings_)
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
class SGD0VarUpdateOp : public popart::SGD0VarUpdateOpBase
Public Functions
-
SGD0VarUpdateOp(OptimizerValue initialSlr0, OptimizerValue initialWdsf0, const Op::Settings&)
-
float getSubgraphValue() const final
-
class SGD1AcclUpdateOp : public popart::VarUpdateWithUpdaterOp
Performs the part of the SGD1 velocity update equation that is pre-computed for the next time step after the weight update of the current time step.
Let:
v be the input at getVarToUpdateInIndex()
g be the input at getUpdaterInIndex()
Then this op performs: v <- v * smm1 + swd1 * g
See also
SGD for how this is derived and the definitions of smm1 and swd1.
Subclassed by popart::SGD2PartialAcclUpdateOp
Public Functions
-
SGD1AcclUpdateOp(OptimizerValue initSmm1, OptimizerValue initSwd1, const Op::Settings&)
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
inline virtual float getSubgraphValue() const final
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
-
class SGD2PartialAcclUpdateOp : public popart::SGD1AcclUpdateOp
This Op is by design exactly equivalent to an SGD1AcclUpdateOp.
Any logic based on an SGD1AcclUpdateOp, like transform code or lowering into Opx, can be applied to an SGD2PartialAcclUpdateOp. This includes the OperatorIdentifier being Onnx::CustomOperators::SGD1AcclUpdateOp.
For SGD2, the entire v update equation could be done in one op (see the equation derivation in optimizer.hpp); however, we reuse the SGD1AcclUpdateOp and AccumulateOp to implement the equation in two steps.
Public Functions
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
SGD1AcclUpdateOp(OptimizerValue initSmm1, OptimizerValue initSwd1, const Op::Settings&)
-
SGD1AcclUpdateOp(OptimizerValue initSmm1, OptimizerValue initSwd1, OperatorIdentifier opid, const Op::Settings&)
-
class SGD1VarUpdateOp : public popart::VarUpdateWithUpdaterOp
Performs the SGD1 weight update equation.
Let:
w be the input at getVarToUpdateInIndex()
g be the input at getUpdaterInIndex()
Then this op performs: w <- w - slr1 * g
See also
SGD for how this is derived and the definition of slr1.
Subclassed by popart::SGD2VarUpdateOp
Public Functions
-
SGD1VarUpdateOp(OptimizerValue initSlr1, const Op::Settings&)
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
inline virtual float getSubgraphValue() const final
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
Public Members
-
const OptimizerValue initSlr1
-
class SGD2VarUpdateOp : public popart::SGD1VarUpdateOp
This Op is by design exactly equivalent to an SGD1VarUpdateOp.
Any logic based on an SGD1VarUpdateOp, like transform code or lowering into Opx, can be applied to an SGD2VarUpdateOp. This includes the OperatorIdentifier being Onnx::CustomOperators::SGD1VarUpdate.
-
class SGDMComboBaseOp : public popart::VarUpdateWithUpdaterOp
Subclassed by popart::SGD1ComboOp, popart::SGD2ComboOp
Public Functions
-
SGDMComboBaseOp(const OperatorIdentifier &opid, OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerReductionType reductionType_, const Op::Settings&)
-
SGDMComboBaseOp(const OperatorIdentifier &opid, OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerValue initialMm, OptimizerValue initialWd, OptimizerValue initialNgsf, OptimizerValue initialNdsf, OptimizerReductionType reductionType_, const Op::Settings&)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const override
Public Members
-
const OptimizerValue initSmm1
-
const OptimizerValue initDpsf1
-
const OptimizerValue initSwd1
-
const OptimizerValue initSlr1
-
OptimizerValue initMm
-
OptimizerValue initWd
-
OptimizerValue initNgsf
-
OptimizerValue initNdsf
-
const OptimizerReductionType reductionType
-
bool nesterov
-
class SGD1ComboOp : public popart::SGDMComboBaseOp
A single Op that encapsulates all the information needed to describe an SGD1 optimiser step.
The “1” in the name signifies that only one extra optimiser tensor (the accl tensor) is required.
The “Combo” in the name signifies that this Op will later be decomposed into many Ops and Tensors that actually implement the optimiser step; in this case, by the SGD1Decompose pattern.
See also
SGD for the definition of what SGD1 is.
See also
SGD1Decompose for the definition of this decomposition.
Public Functions
-
SGD1ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerReductionType reductionType_, const Op::Settings&)
-
SGD1ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerValue initialMm, OptimizerValue initialWd, OptimizerValue initialNgsf1, OptimizerValue initialNdsf1, OptimizerReductionType reductionType_, const Op::Settings&)
-
class SGD2ComboOp : public popart::SGDMComboBaseOp
A single Op that encapsulates all the information needed to describe an SGD2 optimiser step.
The “2” in the name signifies that two extra optimiser tensors (the accum and accl1 tensors) may be required.
The “Combo” in the name signifies that this Op will later be decomposed into many Ops and Tensors that actually implement the optimiser step; in this case, by the SGD2Decompose pattern.
See also
SGD for the definition of what SGD2 is.
See also
SGD2Decompose for the definition of this decomposition.
Public Functions
-
SGD2ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, const Op::Settings&)
-
SGD2ComboOp(OptimizerValue initialSmm1, OptimizerValue initialDpsf1, OptimizerValue initialSwd1, OptimizerValue initialSlr1, OptimizerValue initialMm, OptimizerValue initialWd, OptimizerValue initialNgsf2, OptimizerValue initialNdsf2, bool withGradAccum_, OptimizerReductionType reductionType_, DataType accumType_, DataType accl1Type_, const Op::Settings&)
-
class ScaledVarUpdateOp : public popart::VarUpdateWithUpdaterOp
Public Functions
-
ScaledVarUpdateOp(OptimizerValue initLr, OptimizerValue initWd, bool lrInUpdater, const Op::Settings&)
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
#include <popart/alias/aliasmodel.hpp>
-
class AliasModel
A container for the poprithms::memory::inplace::Graph which corresponds to a PopART Graph.
It contains the poprithms Graph, and mappings between PopART Tensors and Ops, and their poprithms equivalents.
Public Types
-
using PoprithmsTensorId = poprithms::memory::inplace::TensorId
-
using PoprithmsOpId = poprithms::memory::inplace::OpId
Public Functions
-
AliasModel()
-
~AliasModel() = default
-
void insertTensor(const PoprithmsTensorId &poprithmsTensor, const Tensor &popartTensor)
Register that a poprithms Tensor and a popart Tensor correspond to each other.
In addition to registering the Tensor correspondence, the Ops which produce the respective Tensors are registered to be corresponding.
- Parameters
poprithmsTensor – The Tensor in the poprithms Graph.
popartTensor – The Tensor in the PopART Graph.
-
void insertOp(PoprithmsOpId, OpId)
Register that a poprithms Op and a popart Op correspond.
Note that multiple poprithms Ops can correspond to a single popart Op.
-
void insertUnaryModifier0(const Op &op)
This method performs the following steps:
(1) inserts an aliasGate which is open at index 0
(2) appends a modify to the output of the aliasGate created in (1)
(3) registers that op.output(0) matches the output of (2)
(4) registers that the poprithms ops created at (1) and (2) correspond to #op.
- Parameters
op – A PopART Op, which might have multiple inputs, and whose output is a modifies alias of its input at index 0.
-
void insertUnaryModifier(const Op&, InIndex)
As per insertUnaryModifier0, but the input index may be different from 0.
-
void insertBinaryModifier(const Op &op)
This method performs the following steps:
(1) inserts an aliasGate whose inputs are the 2 poprithms Tensors corresponding to the 2 inputs of #op. The alias gate is open at the index which #op aliases through, if any.
(2) appends a modify to the output of the aliasGate created at (1)
(3) registers that the poprithms ops (1) and (2) correspond to #op.
Diagrammatically, for the PopART Op:

  input0 ... input1
       \     /
         op
         |
      output0

This method creates the following poprithms subgraph:

  input0 ... input1
       \     /
     aliasGate
         |
       modify
         |
      output0
- Parameters
op – A PopART Op with 2 inputs.
-
void insertNG2aryModifier(const Op &op, unsigned int numInputs)
This method is the same as insertBinaryModifier, except that it allows more than 2 inputs.
- Parameters
op – A PopART Op with 2 or more inputs.
numInputs – The number of inputs.
-
void insertViewChange(PoprithmsTensorId viewChangeOut, const Tensor &t, bool isOutplace)
This method performs the following steps:
(1) adds an aliasGate whose (unique) input is viewChangeOut,
(2) registers that the output of the aliasGate corresponds to the PopART Tensor #t.
(3) registers that the creator of t (if there is any) corresponds to 2 poprithms ops: the creator of viewChangeOut and the aliasGate created at (1).
- Parameters
viewChangeOut – This is a Tensor which is the output of a view changing Op, such as reshape and dimShuffle.
t – This PopART Tensor is the output of the corresponding PopART view changing Op.
isOutplace – This boolean determines if the AliasGate created at (1) should be open or closed. If isOutplace is true, then the AliasGate will be closed.
-
void update(OpId oldId, OpId newId)
Replace all appearances of #oldId in all maps between PopART and poprithms, with #newId.
This is useful when, for example, an Op is replaced in the PopART Graph during the inplacing transformation.
-
TensorId getTensorId(const PoprithmsTensorId &id) const
- Returns
The TensorId corresponding to a poprithms TensorId.
-
bool contains(const PoprithmsTensorId&) const
-
PoprithmsTensorId getPoprithmsTensorId(const TensorId &id) const
- Returns
The poprithms TensorId corresponding to a TensorId.
-
OpId getOpId(PoprithmsOpId) const
- Returns
The OpId corresponding to a poprithms OpId.
-
bool contains(PoprithmsOpId) const
-
PoprithmsOpId getGate(OpId opId) const
- Returns
The ID of the AliasGate in the poprithms Graph, which corresponds to the PopART Op #opId. If no such AliasGate exists, an error is thrown.
-
std::vector<PoprithmsOpId> getAll(OpId) const
- Returns
The poprithms OpIds which correspond to a PopART OpId. It is possible for 1 PopART Op to correspond to multiple poprithms Ops.
Public Members
-
poprithms::memory::inplace::Graph g
The poprithms Graph.
Public Static Attributes
-
static constexpr int loadFactor = 0.5
The load factor used for hash map containers.
#include <popart/op/ipucopy.hpp>
-
class IpuCopyOp : public popart::Op
Public Functions
-
IpuCopyOp(const OperatorIdentifier &_opid, VGraphId _destIpu, const Op::Settings &settings_)
-
void setup() final
-
const SourceIpuMap &getSourceIpus() const
-
const SourceTensorMap &getSourceTensors() const
-
void setSourceIpus(const SourceIpuMap sourceIpus)
-
void setSourceTensors(const SourceTensorMap sourceTensors)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
bool isOutlineable() const override
-
bool isIpuCopyOp() const final
-
bool copiesOptimizerTensors() const final
-
std::string getFromToStr() const
-
inline bool canShard() const override
-
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex index, std::set<OpId> &visited) const override
-
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex index, std::set<OpId> &visited) const override
14.8.2. Op definition for Poplar implementation
#include <popart/popx/opx.hpp>
-
class Opx
Subclassed by popart::popx::AbortOpx, popart::popx::AdaDeltaUpdaterOpx, popart::popx::AdamUpdaterOpx, popart::popx::AddBiasDataGradOpx, popart::popx::AddBiasOpx, popart::popx::AllReduceOpx, popart::popx::ArgExtremaOpx, popart::popx::AsinGradOpx, popart::popx::AtanGradOpx, popart::popx::BaseConcatOpx, popart::popx::BaseExpandOpx, popart::popx::BasePadOpx, popart::popx::BaseSliceOpx, popart::popx::BaseSortOpx, popart::popx::BaseWhereOpx, popart::popx::BinaryComparisonOpx, popart::popx::Bucketizex, popart::popx::CastOpx, popart::popx::CastThenPow2ScaleOpx, popart::popx::ClipGradOpx, popart::popx::CollectivesBaseOpx, popart::popx::ConcatGradOpx, popart::popx::ConvFlipWeightsGradOpx, popart::popx::CtcBeamSearchDecoderOpx, popart::popx::CtcGradOpx, popart::popx::CtcOpx, popart::popx::CumSumGradOpx, popart::popx::CumSumOpx, popart::popx::DynamicSliceOpx, popart::popx::DynamicUpdateOpx, popart::popx::DynamicZeroOpx, popart::popx::ElementWiseBinaryOpx, popart::popx::ElementWiseUnaryOpx, popart::popx::EluGradOpx, popart::popx::ExchangeBaseOpx, popart::popx::ExpandGradOpx, popart::popx::GatherBaseOpx, popart::popx::GatherGradOpx, popart::popx::GeluErfGradOpx, popart::popx::GeluGradOpx, popart::popx::GetRandomSeedOpx, popart::popx::GRUGradOpx, popart::popx::GRUOpx, popart::popx::HardSigmoidGradOpx, popart::popx::HistogramOpx, popart::popx::IdentityInplaceOpx, popart::popx::IdentityLossGradOpx, popart::popx::IdentityLossOpx, popart::popx::IfOpx, popart::popx::InitOpx, popart::popx::IoTileCopyOpx, popart::popx::IpuCopyOpx, popart::popx::L1GradOpx, popart::popx::L1Opx, popart::popx::LambSquareOpx, popart::popx::LeakyReluGradOpx, popart::popx::LossScaleUpdateOpx, popart::popx::LRNGradOpx, popart::popx::LRNOpx, popart::popx::LSTMGradOpx, popart::popx::LSTMOpx, popart::popx::MatMulOpx, popart::popx::MaxArgGradOpx, popart::popx::MaxOpx, popart::popx::MeanArgGradOpx, popart::popx::MinArgGradOpx, popart::popx::MinOpx, popart::popx::ModifyRandomSeedOpx, popart::popx::MultiConvBaseOpx, popart::popx::MultiConvWeightsGradBaseOpx, popart::popx::NllGradOpx, popart::popx::NlllWithSoftmaxGradDirectOpx, popart::popx::NllOpx, popart::popx::NopOpx, popart::popx::NormalizeImageOpx, popart::popx::NormOpx, popart::popx::OnehotGradOpx, popart::popx::OnehotOpx, popart::popx::PopartLSTMOpxBase< LSTMOP >, popart::popx::Pow2ScaleThenCastOpx, popart::popx::PrintTensorOpx, popart::popx::RandomNormalOpx, popart::popx::RandomUniformOpx, popart::popx::ReduceL1GradOpx, popart::popx::ReduceL1Opx, popart::popx::ReduceL2GradOpx, popart::popx::ReduceL2Opx, popart::popx::ReduceLogSumExpGradOpx, popart::popx::ReduceLogSumExpOpx, popart::popx::ReduceLogSumGradOpx, popart::popx::ReduceLogSumOpx, popart::popx::ReduceMaxGradOpx, popart::popx::ReduceMaxOpx, popart::popx::ReduceMeanGradOpx, popart::popx::ReduceMeanOpx, popart::popx::ReduceMedianGradOpx, popart::popx::ReduceMedianOpx, popart::popx::ReduceMinGradOpx, popart::popx::ReduceMinOpx, popart::popx::ReduceProdGradOpx, popart::popx::ReduceProdOpx, popart::popx::ReduceSumGradOpx, popart::popx::ReduceSumOpx, popart::popx::ReduceSumSquareGradOpx, popart::popx::ReduceSumSquareOpx, popart::popx::ReluGradOpx, popart::popx::ReshapeBaseOpx, popart::popx::ResizeGradOpx, popart::popx::ResizeOpx, popart::popx::RestoreBaseOpx< Derived >, popart::popx::ReverseBaseOpx, popart::popx::RMSPropUpdaterOpx, popart::popx::RNNGradOpx, popart::popx::RNNOpx, popart::popx::RoiAlignGradOpx, popart::popx::RoiAlignOpx, popart::popx::ScaledAddOpx, popart::popx::ScatterDataGradOpx, popart::popx::ScatterReduceGradOpx, 
popart::popx::ScatterReduceOpx, popart::popx::ScatterUpdateGradOpx, popart::popx::SeluGradOpx, popart::popx::SequenceSliceInplaceOpx, popart::popx::SequenceSliceOpx, popart::popx::SGD1NesterovOpx, popart::popx::ShapedDropoutOpx, popart::popx::ShrinkGradOpx, popart::popx::SinhGradOpx, popart::popx::SoftmaxGradDirectOpx, popart::popx::SoftPlusGradOpx, popart::popx::SoftSignGradOpx, popart::popx::SplineBasisx, popart::popx::SplineWeightingx, popart::popx::SplitOpx, popart::popx::StashOpx, popart::popx::SubgraphOpx, popart::popx::SubsampleGradOpx, popart::popx::SubsampleInplaceOpx, popart::popx::SubsampleOpx, popart::popx::SumArgGradOpx, popart::popx::SumOpx, popart::popx::SwishGradOpx, popart::popx::SyncOpx, popart::popx::TanhGradOpx, popart::popx::TanhOpx, popart::popx::TensorRemapOpx, popart::popx::ThresholdedReluGradOpx, popart::popx::TileGradOpx, popart::popx::TileOpx, popart::popx::TopKGradOpx, popart::popx::TransposeInplaceOpx, popart::popx::TransposeOpx, popart::popx::VarUpdateOpx, popart::popx::WhereXGradOpx, popart::popx::WhereYGradOpx, popart::popx::ZerosOpx, popart::popx::PopartLSTMOpxBase< PopartLSTMGradOp >, popart::popx::PopartLSTMOpxBase< PopartLSTMOp >, popart::popx::RestoreBaseOpx< RestoreInplaceOpx >, popart::popx::RestoreBaseOpx< RestoreOpx >
Public Functions
-
virtual ~Opx()
-
virtual poplar::Tensor createInputTensor(popart::InIndex index, const poplar::DebugNameAndId &dnai) const
-
virtual DnfTensorIds mustExistBeforeCreateDNF(int index0) const
-
poplar::Tensor cloneNcopy(poplar::program::Sequence&, const poplar::Tensor&, const std::string name = "") const
-
int64_t getVirtualGraphId() const
-
const ViewChangers &getInViewChangers(InIndex index) const
-
void setOutViewChangers(OutIndex index, const ViewChangers &changers) const
-
const TensorInfo &inInfo(InIndex) const
-
const TensorInfo &outInfo(OutIndex) const
-
template<class OP>
inline void verifyOp(Op *op, const OperatorIdentifier &opid)
-
template<class OP>
inline void verifyOp(Op *op, std::vector<OperatorIdentifier> opids)
-
poplar::Tensor getConst(const poplar::Type &type, const std::vector<size_t> &shape, double val, const std::string &name) const
-
poplar::Graph &inGraph(InIndex in) const
Return the virtual graph associated with input at index in.
- Parameters
in – the input index
- Returns
the corresponding poplar virtual graph
-
virtual std::set<OpxGrowPartId> getInGrowPartIds(Tensor *inTensor) const
-
virtual OpxGrowPartId getOutGrowPartId(Tensor *outTensor) const
-
virtual ViewChangers getCreatorViewChangers(InIndex index) const
-
virtual void growPart(OpxGrowPartId id) const
-
const poplar::DebugNameAndId getDebugNameAndId(const std::string name = "", poplar::SourceLocation loc = poplar::SourceLocation::Current()) const
-
poplar::DebugContext debugContext(const std::string name = "", poplar::SourceLocation loc = poplar::SourceLocation::Current()) const
-
virtual PreparedTensorInfos getOutputsToPrepare() const
-
virtual PreparedTensorInfos getInputsToPrepare() const
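A custom op is lowered to Poplar by subclassing Opx. A hedged sketch (MyOp, MyOpx and MyOpId are hypothetical; grow(), getInTensor() and setOutTensor() are assumed to be inherited Opx facilities, and popops::mapInPlace is a Poplar library call):
C++:
#include <popops/ElementWise.hpp>

class MyOpx : public popart::popx::Opx {
public:
  MyOpx(popart::Op *op, popart::popx::Devicex *devicex)
      : popart::popx::Opx(op, devicex) {
    // Check that we were constructed with the op we expect.
    verifyOp<MyOp>(op, MyOpId);
  }

  void grow(poplar::program::Sequence &prog) const final {
    // Copy the input so the original tensor is left unmodified.
    auto out = cloneNcopy(prog, getInTensor(0));
    // out <- out + 1, elementwise, on the copy.
    popops::mapInPlace(graph(),
                       popops::expr::Add(popops::expr::_1,
                                         popops::expr::Const(1.0f)),
                       {out}, prog, debugContext("addOne"));
    setOutTensor(0, out);
  }
};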
14.8.3. Available Ops (Op class)
-
struct AiGraphcoreOpIdV1 : public popart::OperatorIdentifier
-
class AbsGradOp : public popart::Op
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline virtual float getSubgraphValue() const final
-
class AbsOp : public popart::ElementWiseUnaryOp
-
class AdaDeltaUpdaterOp : public popart::Op
Public Functions
-
AdaDeltaUpdaterOp(OptimizerValue eps, const Op::Settings&)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
-
inline bool isOptimizerOp() const override
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final
Public Members
-
const OptimizerValue initEps
-
class AdamUpdaterOp : public popart::Op
Public Functions
-
AdamUpdaterOp(AdamMode mode_, OptimizerValue wd, OptimizerValue b1, OptimizerValue b2, OptimizerValue eps, const Op::Settings&)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
-
inline bool isOptimizerOp() const override
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final
Public Members
-
const OptimizerValue initWd
-
const OptimizerValue initB1
-
const OptimizerValue initB2
-
const OptimizerValue initEps
-
class AddArg0GradOp : public popart::ReduceSumOp
-
class AddArg1GradOp : public popart::ReduceSumOp
-
class AddBiasBiasGradOp : public popart::ReduceSumOp
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class AddBiasDataGradOp : public popart::IdentityOp
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class AddBiasInplaceOp : public popart::AddBiasOp
Public Functions
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
class AddBiasOp : public popart::Op
Subclassed by popart::AddBiasInplaceOp
Public Functions
-
AddBiasOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
-
void setup() final
-
inline float getSubgraphValue() const final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
-
void growAliasModel(AliasModel&) const override
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
class AddLhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp
-
class AddRhsInplaceOp : public popart::ElementWiseBinaryInplaceRhsOp
-
class AllReduceGradOp : public popart::AllReduceOp
Public Functions
-
AllReduceGradOp(CollectiveOperator op_, std::vector<int64_t> ipus_, const bool identicalInputs_, const bool identicalGradInputs_, const Op::Settings &settings_)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class AllReduceOp : public popart::Op
Subclassed by popart::AllReduceGradOp
Public Functions
-
AllReduceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, std::vector<int64_t> ipus_, const Op::Settings &settings_)
-
AllReduceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, std::vector<int64_t> ipus_, const bool identicalInputs_, const bool identicalGradInputs_, const Op::Settings &settings_)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
bool canBeReplacedByIdentity() const override
-
inline float getSubgraphValue() const override
-
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex index, std::set<OpId> &visited) const override
-
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex index, std::set<OpId> &visited) const override
-
inline CollectiveOperator getReduceOp() const
-
inline bool getIdenticalInputs() const
-
inline std::vector<int64_t> getIpus() const
-
class AndOp : public popart::BinaryComparisonOp
-
class ArgExtremaOp : public popart::Op
Subclassed by popart::ArgMaxOp, popart::ArgMinOp
Public Functions
-
ArgExtremaOp(const OperatorIdentifier &_opid, int64_t axis, int64_t keepdims, const Op::Settings &settings)
-
void setup() final
-
int64_t getKeepDims() const
-
int64_t getAxis() const
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
class ArgMaxOp : public popart::ArgExtremaOp
-
class ArgMinOp : public popart::ArgExtremaOp
-
class AsinGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class AsinInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class AsinOp : public popart::ElementWiseUnaryOp
Public Functions
-
AsinOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class Atan2Arg0GradOp : public popart::ElementWiseBinaryArg0GradOp
-
class Atan2Arg1GradOp : public popart::ElementWiseBinaryArg1GradOp
-
class Atan2LhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp
-
class AtanGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class AtanInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class AtanOp : public popart::ElementWiseUnaryOp
Public Functions
-
AtanOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class AutoLossScaleProxyGradOp : public popart::AutoLossScaleProxyOp
Public Functions
-
AutoLossScaleProxyGradOp(const AutoLossScaleProxyOp &fwdOp)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class AutoLossScaleProxyOp : public popart::ElementWiseUnaryOp
Subclassed by popart::AutoLossScaleProxyGradOp
-
class AveragePoolGradOp : public popart::Op
Public Functions
-
AveragePoolGradOp(const AveragePoolOp&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getSubgraphValue() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
class AveragePoolOp : public popart::HasReceptiveFieldOp
Public Functions
-
AveragePoolOp(const OperatorIdentifier &_opid, int64_t _countIncludePad, const std::vector<int64_t> &_kernelShape, const HasReceptiveFieldOp::ReceptiveOpAttributes &attributes, const Op::Settings &settings_)
-
int64_t getNOutChans() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
bool canBeReplacedByIdentity() const override
-
class BaseOnnxRNNGradOp : public popart::Op
Subclassed by popart::GRUGradOp, popart::LSTMGradOp, popart::RNNGradOp
Public Functions
-
BaseOnnxRNNGradOp(const OperatorIdentifier &_opid, const BaseOnnxRNNOp &fwd_op)
-
void setup() override
-
const std::vector<GradInOutMapper> &gradInputInfo() const override
-
const std::map<int, int> &gradOutToNonGradIn() const override
-
bool hasLastHiddenStateGradInput() const
-
bool hasFullHiddenStateGradInput() const
-
inline float getSubgraphValue() const final
Public Members
-
const bool hasBiasesInput
-
const bool hasInitialHInput
-
const unsigned batch_size
-
const unsigned input_size
-
const unsigned max_seq_length
-
const unsigned num_directions = 1
-
class BaseOnnxRNNOp : public popart::Op
Subclassed by popart::GRUOp, popart::LSTMOp, popart::RNNOp
Public Functions
-
BaseOnnxRNNOp(const OperatorIdentifier &_opid, nonstd::optional<int64_t> hidden_size, const Op::Settings &settings_)
-
int64_t getMaxSeqLength() const
-
int64_t getBatchSize() const
-
int64_t getInputSize() const
-
int64_t getHiddenSize() const
-
virtual int64_t getNumDirections() const
-
void checkHiddenSize() const
-
bool hasBiasesInput() const
-
bool hasInitialHInput() const
-
bool hasSeqLenInput() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
inline virtual std::string getName() const
-
inline nonstd::optional<int64_t> getHiddenSizeAttribute() const
-
class BasePadOp : public popart::Op
Subclassed by popart::BasePadOutplaceOp, popart::PadInplaceOp
Public Functions
-
BasePadOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_pads, const std::vector<unsigned> &_flips, float value_, const std::string &_mode, const Op::Settings &settings_)
-
bool padSizeZero() const
-
inline float getSubgraphValue() const final
-
std::vector<int64_t> padDimensions() const
-
inline int64_t getLowerPadding(size_t dim) const
-
inline int64_t getUpperPadding(size_t dim) const
-
inline const std::string &getMode() const
-
inline float getPadValue() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
void setup() final
-
inline int64_t getRank() const
-
std::vector<Slice> getSlices() const
-
inline std::vector<std::ptrdiff_t> getLowerPadding() const
-
inline std::vector<std::ptrdiff_t> getUpperPadding() const
-
inline const std::vector<int64_t> &getPads() const
-
inline const std::vector<unsigned> &getFlips() const
-
void growAliasModel(AliasModel&) const override
-
class BasePadOutplaceOp : public popart::BasePadOp
Subclassed by popart::PadOp, popart::SliceGradOp
Public Functions
-
BasePadOutplaceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_pads, const std::vector<unsigned> &_flips, float value_, const std::string &_mode, const Op::Settings &settings_)
-
inline bool canBeReplacedByIdentity() const override
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
-
class BaseSliceOp : public popart::Op
Subclassed by popart::SliceInplaceOp, popart::SliceOp
Public Functions
-
BaseSliceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const std::vector<int64_t> &steps_, const Op::Settings &settings_)
-
void growAliasModel(AliasModel&) const override
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline const std::vector<int64_t> &getStarts() const
-
inline const std::vector<int64_t> &getEnds() const
-
inline const std::vector<int64_t> &getAxes() const
-
inline const std::vector<int64_t> &getSteps() const
-
inline void setStarts(const std::vector<int64_t> &x)
-
inline void setEnds(const std::vector<int64_t> &x)
-
inline void setAxes(const std::vector<int64_t> &x)
-
inline void setSteps(const std::vector<int64_t> &x)
-
std::array<std::vector<int64_t>, 2> getLowerUpper() const
-
std::vector<Slice> getSlices(std::vector<int64_t> input_shape) const
-
std::vector<Slice> getSlices() const
-
std::vector<int64_t> getPads() const
-
std::vector<unsigned> getFlips() const
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
Public Members
-
int unwindConcatDim = 0
-
class BaseSortOp : public popart::Op
Subclassed by popart::TopKOp
Public Functions
-
BaseSortOp(const OperatorIdentifier &_opid, int64_t axis, const Op::Settings &settings)
-
int64_t getAxis() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
Public Static Functions
-
static inline int getInIndex()
-
class BatchNormGradOp : public popart::Op
Public Functions
-
BatchNormGradOp(const BatchNormOp&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getEpsilon() const
-
inline int64_t getSpatial() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
class BatchNormOp : public popart::Op
Public Functions
-
BatchNormOp(const OperatorIdentifier &_opid, float _epsilon, float _momentum, int64_t _spatial, bool _unbiased_variance, const Op::Settings &settings)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline float getEpsilon() const
-
inline float getMomentum() const
-
inline int64_t getSpatial() const
-
inline bool useUnbiasedVariance() const
-
inline bool isTraining() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline bool isNorm() const override
-
class BinaryComparisonOp : public popart::Op
Subclassed by popart::AndOp, popart::EqualOp, popart::GreaterOp, popart::LessOp, popart::OrOp
-
class BinaryConstScalarOp : public popart::ElementWiseUnaryOp
A unary Op, which performs a binary operation (Mul, Div, etc) between its single input tensor and a scalar, whose value is stored as an Op attribute.
The input index (0 or 1) of the tensor and scalar are controlled by the scalarInIndex attribute.
Some examples, where T is the input tensor of this Op:
[value = 2, opType = “Div”, scalarInIndex = 1]: T / 2.0
[value = 4, opType = “Pow”, scalarInIndex = 0]: 2.0 ** T
[value = 0.2, opType = “Add”, scalarInIndex = 0]: 0.2 + T
[value = 100, opType = “Sub”, scalarInIndex = 1]: T - 100.
Public Functions
-
inline BinaryConstScalarOp(const OperatorIdentifier &x, float value, Type t, int64_t index, const Op::Settings &settings)
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual std::vector<std::unique_ptr<Op>> getGradOps() final
Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.
There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.
The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.
Throws an error if this op is already a gradient op.
-
inline float value() const
-
inline int64_t scalarInIndex() const
-
class BitwiseBinaryOp : public popart::ElementWiseBinaryOp
-
class BitwiseNotOp : public popart::ElementWiseUnaryOp
-
class BucketizeOp : public popart::Op
Public Functions
-
BucketizeOp(const OperatorIdentifier &opid, bool right, const Op::Settings &settings)
-
void setup() override
-
float getSubgraphValue() const override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
bool isRight() const noexcept
-
class CallGradOp : public popart::CallOp
Public Functions
-
CallGradOp(CallOp &fwdOp, Graph &bwdGraph, const std::vector<GradInOutMapper> &gradInInfo_, const std::map<int, int> &gradOutInfo_)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class CallOp : public popart::SubgraphOp
Subclassed by popart::CallGradOp
Public Functions
-
CallOp(const OperatorIdentifier&, Graph &callee, const Op::Settings &settings)
-
CallOp(const OperatorIdentifier&, Graph &callee, const std::vector<int> &modifiedInputsViaAttrs, const Op::Settings &settings)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase &os) const override
-
inline float getSubgraphValue() const final
-
inline void growAliasModel(AliasModel &m) const override
-
class CastOp : public popart::Op
Subclassed by popart::CastGradOp
Public Functions
-
CastOp(const OperatorIdentifier &_opid, DataType _to, const Op::Settings &settings)
-
void setup() override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
inline ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
-
bool canBeReplacedByIdentity() const override
-
class CeilInplaceOp : public popart::OneWayUnaryInPlaceOp
-
class CeilOp : public popart::OneWayUnaryOp
Public Functions
-
CeilOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class ClipGradOp : public popart::ClipOp
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
class ClipInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class ClipOp : public popart::ElementWiseUnaryOp
Subclassed by popart::ClipGradOp
Public Functions
-
ClipOp(const OperatorIdentifier &_opid, float min_, float max_, const Op::Settings &settings_)
-
inline void setClipMin(float value)
-
float getClipMin() const
-
inline void setClipMax(float value)
-
float getClipMax() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
bool canBeReplacedByIdentity() const override
-
class CollectivesBaseOp : public popart::Op
Subclassed by popart::MultiCollectiveBaseOp, popart::ReplicatedAllGatherOp, popart::ReplicatedAllReduceOp, popart::ReplicatedReduceScatterOp
Public Functions
-
CollectivesBaseOp(const OperatorIdentifier &_opid, CommGroup group, const Op::Settings &settings_)
-
CollectivesBaseOp(const OperatorIdentifier &_opid, const ReplicaGrouping &grouping, const Op::Settings &settings_)
-
void setReplicaGrouping(const ReplicaGrouping &grouping)
-
const ReplicaGrouping &getReplicaGrouping() const
-
virtual int64_t getCommSize() const
Number of replicas the collective communicates across.
This will be used to create a CollectiveBalanceReorder in lowering to improve the tile mapping when using RTS.
-
void appendOutlineAttributes(OpSerialiserBase &os) const override
-
inline virtual bool isConfigureOutputForReplicatedTensorSharding() const
Check the replicated tensor sharding (RTS) mode. Collective operations set up for RTS are allowed to scramble the data element order of the input (AllGather) / output (ReduceScatter) tensor so that the tensor layouts minimize inter-tile exchanges.
As a consequence, the RTS sharded tensor does not follow the original data order and can only be used in elementwise, RTS-enabled operations, such as optimizers, where all inputs consumed are rearranged in the same way.
- Returns
True if this operation is configured for replicated tensor sharding.
-
class ConcatGradOp : public popart::Op
Public Functions
-
ConcatGradOp(const ConcatInplaceOp &op, InIndex input)
-
void setup() override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
int64_t getAxis() const
-
int64_t getStart() const
-
int64_t getEnd() const
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
inline ReductionType getShardReductionType(OutIndex index) const override
-
class ConcatInplaceOp : public popart::ConcatOp
Public Functions
-
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
class ConcatOp : public popart::Op
Subclassed by popart::ConcatInplaceOp
Public Functions
-
ConcatOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings &settings)
-
void setup() final
-
int64_t getAxis() const
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
bool canBeReplacedByIdentity() const override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
void growAliasModel(AliasModel&) const override
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
class ConvDataGradOp : public popart::MultiConvDataGradBaseOp
Public Functions
-
inline int numConvs() const override
-
inline const ConvParameters &getParameters() const
-
class ConvFlipWeightsGradOp : public popart::ConvFlipWeightsOp
Public Functions
-
ConvFlipWeightsGradOp(const ConvFlipWeightsGradOp&) = default
-
ConvFlipWeightsGradOp(const ConvFlipWeightsOp &convFlipWeightsOp)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class ConvFlipWeightsOp : public popart::Op
Subclassed by popart::ConvFlipWeightsGradOp
Public Functions
-
ConvFlipWeightsOp(const ConvFlipWeightsOp&) = default
-
ConvFlipWeightsOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
~ConvFlipWeightsOp() override
-
void setup() final
-
inline const ConvParameters &getParameters() const
-
inline void setParameters(const ConvParameters &p)
-
inline bool getGroupReshape() const
-
inline void setGroupReshape(bool reshape)
-
inline float getSubgraphValue() const final
-
void appendOutlineAttributes(OpSerialiserBase &os) const final
-
inline void setConvOptions(const MultiConvOptions &opts)
-
inline const MultiConvOptions &getMultiConvOptions() const
-
inline std::map<std::string, std::string> getConvOptions() const
-
class ConvOp : public popart::MultiConvBaseOp
Public Functions
-
ConvOp(const OperatorIdentifier &_opid, const Settings &settings_, std::vector<int64_t> strides, std::vector<int64_t> pads, std::vector<int64_t> dilations, int64_t group, const AutoPad &padType, const MultiConvOptions &convOpts)
-
void setup() final
-
inline int numConvs() const final
-
inline int64_t getGroups() const
-
inline void setGroup()
-
inline int64_t getNInChans() const
-
inline int64_t getNOutChans() const
-
inline ConvParameters getParameters() const
-
void restoreAttributesFromParams(const std::vector<ConvParameters>&) override
-
bool isPow2ScaledConv() const
Returns true if and only if the inputs to the op constitute a valid set of inputs for a fused (float8) convolution.
-
class ConvTransposeOp : public popart::Op
Public Functions
-
ConvTransposeOp(const OperatorIdentifier &_opid, const Settings &settings_, std::vector<int64_t> strides, std::vector<int64_t> pads, std::vector<int64_t> dilations, int64_t group, const AutoPad &padType, std::vector<int64_t> outputPadding, Shape outputShape, const MultiConvOptions &convOpts)
-
void setup() final
-
inline float getSubgraphValue() const final
-
bool isPow2ScaledConvTranspose() const
Public Members
-
std::vector<int64_t> strides
-
std::vector<int64_t> dilations
-
int64_t group
-
const MultiConvOptions convOpts
-
ConvParameters params
-
class ConvWeightsGradOp : public popart::MultiConvWeightsGradBaseOp
Public Functions
-
ConvWeightsGradOp(const ConvWeightsGradOp&) = default
-
inline int numConvs() const final
-
inline const ConvParameters &getParameters() const
-
class CosGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class CosOp : public popart::ElementWiseUnaryOp
Public Functions
-
CosOp(const OperatorIdentifier &_opid, const Op::Settings&)
Public Static Functions
-
static OperatorIdentifier getOpId(const Ir &ir)
-
class CtcBeamSearchDecoderOp : public popart::Op
Public Functions
-
CtcBeamSearchDecoderOp(const popart::OperatorIdentifier &_opid, unsigned _blankClass, unsigned _beamWidth, unsigned _topPaths, const popart::Op::Settings &settings_)
-
void setup() final
-
void appendAttributes(popart::OpSerialiserBase &os) const override
-
void appendOutlineAttributes(popart::OpSerialiserBase &os) const override
-
float getSubgraphValue() const final
-
bool requiresRandomSeed() const override
-
inline unsigned getBlankClass() const
-
inline unsigned getBeamWidth() const
-
inline unsigned getTopPaths() const
-
inline unsigned getMaxTime() const
-
inline unsigned getBatchSize() const
-
inline unsigned getNumClasses() const
-
class CtcGradOp : public popart::Op
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline ReductionType getReductionType() const
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline bool canShard() const override
-
inline bool getEnableReducedClassesInLabel() const
-
class CtcOp : public popart::LossOp
Public Functions
-
CtcOp(const OperatorIdentifier &_opid, const ReductionType reduction, const unsigned blank, const bool zeroInfinity, const Op::Settings &settings_, const bool enableReducedClassesInLabel, const DataType outDataType = DataType::UNDEFINED)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline unsigned getBlank() const
-
inline bool getZeroInfinity() const
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
-
unsigned getBatchSize() const
-
unsigned getMaxInputLength() const
-
unsigned getMaxTargetLength() const
-
unsigned getNumClasses() const
-
inline bool canShard() const override
-
inline bool getEnableReducedClassesInLabel() const
-
class DetachOp : public popart::ElementWiseUnaryOp
Subclassed by popart::DetachInplaceOp
Public Functions
-
DetachOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
inline bool isIdentity() const final
-
inline bool isOutplaceViewChange() const override
-
class DivArg0GradOp : public popart::ElementWiseBinaryArg0GradOp
-
class DivArg1GradOp : public popart::ElementWiseBinaryArg1GradOp
-
class DropoutBaseOp : public popart::RandomBaseOp
Subclassed by popart::DropoutOp, popart::ShapedDropoutOp
-
class DropoutOp : public popart::DropoutBaseOp
Subclassed by popart::DropoutGradOp
Public Functions
-
DropoutOp(const OperatorIdentifier &_opid, float ratio_, const Op::Settings &settings_)
-
void setup() override
-
bool canBeReplacedByIdentity() const override
-
void appendAttributes(OpSerialiserBase &os) const override
-
inline void setOutputMask(bool v)
-
inline bool getOutputMask() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline void setReferenceId(RandomReferenceId id)
-
inline RandomReferenceId getReferenceId() const
-
class DropoutGradOp : public popart::DropoutOp
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const override
-
const std::map<int, int> &gradOutToNonGradIn() const override
-
class DynamicAddInplaceOp : public popart::DynamicTernaryBaseInplaceOp
Public Functions
-
DynamicAddInplaceOp(const DynamicAddOp &dynamicAddOp)
-
DynamicAddInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
class DynamicAddOp : public popart::DynamicTernaryBaseOp
Public Functions
-
DynamicAddOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
-
class DynamicBaseOp : public popart::Op
Dynamic Base Op.
Base class for operators acting on a run-time selectable slice of a tensor.
The word “dynamic” refers to the fact that the index can be specified at run time, where index is the second tensor argument of this operator as specified in graphcoreoperators.hpp. The axes attribute specifies along which axes the tensor should be sliced. The sizes attribute specifies the size of the slices.
A slice along an axis can be defined by the tuple (start, stop, step), where:
start is equal to index for the respective axis
stop is equal to index + size for the respective axis
step is equal to 1
Limitations (assuming we would like to slice a tensor A with dimensions (4, 3)):
A step other than 1 is not supported (i.e. A[::2,:] is not supported)
Negative slicing is not supported (i.e. A[:-1,:] is not supported)
A stop greater than the size of the axis is not supported (i.e. A[:5,:] is not supported)
Example: Given a tensor A with shape (3, 2, 4, 5), if we specify axes = {1, 3} (i.e. we slice the first and third axes, counting from 0), the operator will operate on A[:, index[0]:(index[0]+size[0]), :, index[1]:(index[1]+size[1])]. If we instead specify axes = {0, 1, 3}, the operator will operate on A[index[0]:(index[0]+size[0]), index[1]:(index[1]+size[1]), :, index[2]:(index[2]+size[2])].
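For illustration, here is a hedged Builder-API sketch of the semantics above. The AiGraphcoreOpset1::dynamicslice call and all tensor names are assumptions made for this example rather than part of the class interface documented here.
C++:
#include <cstdint>
#include <vector>

#include <popart/builder.hpp>

// A minimal sketch: slice a (3, 2, 4, 5) tensor along axes {1, 3}
// with sizes {1, 2}; the offsets are chosen at run time through the
// "index" input tensor.
void buildDynamicSlice() {
  auto builder     = popart::Builder::create();
  auto aiGraphcore = builder->aiGraphcoreOpset1();

  popart::TensorInfo aInfo{"FLOAT", std::vector<int64_t>{3, 2, 4, 5}};
  popart::TensorInfo idxInfo{"UINT32", std::vector<int64_t>{2}};
  auto A     = builder->addInputTensor(aInfo);
  auto index = builder->addInputTensor(idxInfo);

  // Equivalent to A[:, index[0]:index[0]+1, :, index[1]:index[1]+2].
  auto out = aiGraphcore.dynamicslice({A, index},
                                      /*axes=*/{1, 3},
                                      /*sizes=*/{1, 2},
                                      /*noOverlap=*/1);
  builder->addOutputTensor(out);
}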
Subclassed by popart::DynamicBinaryBaseOp, popart::DynamicSliceBaseOp, popart::DynamicSlicePadGradOp
Public Functions
-
DynamicBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() override
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
inline virtual float getSubgraphValue() const final
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values, retrieved by getHighSubgraphValue() (for expensive ops such as Conv), and low bounding values, retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
-
inline const std::vector<int64_t> &getAxes() const
-
inline void setAxes(const std::vector<int64_t> &x)
-
inline const std::vector<int64_t> &getSizes() const
-
inline void setSizes(const std::vector<int64_t> &x)
-
inline bool isNotOverlapping() const
-
TensorInfo createOutInfo() const
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
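Taken together, the methods above are the surface a typical op subclass overrides. Below is a hedged sketch with a hypothetical op class; MySliceLikeOp is not a PopART class and assumes only the Op interface documented here.
C++:
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

#include <popart/op.hpp>
#include <popart/opserialiser.hpp>

// Hypothetical op, used only to illustrate the overrides above.
class MySliceLikeOp : public popart::Op {
public:
  MySliceLikeOp(const popart::OperatorIdentifier &opid_,
                std::vector<int64_t> axes_,
                const popart::Op::Settings &settings_)
      : popart::Op(opid_, settings_), axes(std::move(axes_)) {}

  // Required: return a copy of the op.
  std::unique_ptr<popart::Op> clone() const override {
    return std::make_unique<MySliceLikeOp>(*this);
  }

  // Required: set the type and shape of every output tensor. Here the
  // output simply mirrors input 0.
  void setup() override { outInfo(0) = inInfo(0); }

  // An inexpensive op reports the low bounding value to the outliner.
  float getSubgraphValue() const override { return getLowSubgraphValue(); }

  // Append everything that changes the op's function, so that only
  // functionally equivalent ops are outlined together.
  void appendOutlineAttributes(popart::OpSerialiserBase &os) const override {
    popart::Op::appendOutlineAttributes(os);
    os.appendAttribute("axes", axes);
  }

private:
  std::vector<int64_t> axes;
};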
-
class DynamicBinaryBaseInplaceOp : public popart::DynamicBinaryBaseOp
Subclassed by popart::DynamicZeroInplaceOp
Public Functions
-
DynamicBinaryBaseInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
class DynamicBinaryBaseOp : public popart::DynamicBaseOp
Dynamic Binary Base Op.
Base class for operators acting on a run-time selectable slice of a tensor. The word “binary” refers to the fact that the operator takes two tensors as input.
See also
DynamicBaseOp for details
Subclassed by popart::DynamicBinaryBaseInplaceOp, popart::DynamicTernaryBaseOp, popart::DynamicUpdateToUpdateGradOp, popart::DynamicZeroGradOp, popart::DynamicZeroOp
Public Functions
-
DynamicBinaryBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
inline const TensorInfo &getUpdateTensorInfo() const
-
virtual void growAliasModel(AliasModel &m) const final
For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.
The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.
To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
- Pre
All input tensors of this op have mappings in aliasModel before the call to growAliasModel.
- Post
All output tensors of this op have mappings in aliasModel after the call to growAliasModel.
-
virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
Translate a PopART inplacing proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.
This method sets a value passed by reference, as opposed to acting as a getter method, so that no Poprithms headers need to be included in this file.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
OperatorIdentifier – The operator identifier to translate to the AliasModel equivalent.
- Returns
A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.
-
class DynamicSliceBaseOp : public popart::DynamicBaseOp
Subclassed by popart::DynamicSliceOp, popart::DynamicUpdateUpdaterGradOp
Public Functions
-
DynamicSliceBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
-
void setup() final
-
TensorInfo createOutInfo() const
-
class DynamicSliceInplaceOp : public popart::DynamicSliceOp
Dynamic Slice Inplace Op.
This Op takes two or three TensorIds as input (as indicated in graphcoreoperators.hpp):
The TensorId of the tensor to slice from.
The (optional) TensorId of the index of the starting point of the slice (see DynamicBaseOp for an explanation).
The TensorId of the tensor to write the slice into (not used in the outplace variant).
The output is the TensorId of the sliced tensor, aliased.
Public Functions
-
DynamicSliceInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
-
DynamicSliceInplaceOp(const DynamicSliceOp&)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
Return the variants of this op (if any) which can modify / alias the inputs at the given indices.
This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.
-
virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().
- Parameters
OperatorIdentifier – The operator identifier of the op to be instantiated.
- Returns
An instance of the required op.
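The two methods above are conventionally implemented as a pair: the first advertises the in-place variants, the second constructs the one that was chosen. A hedged sketch in which MyOp, MyInplaceOp and MyCustomOperators::MyOpInplaceId are hypothetical names:
C++:
#include <memory>
#include <tuple>
#include <vector>

#include <popart/op.hpp>

// Advertise one in-place variant; 10.0f is an arbitrary priority, and
// several variants would be listed in descending order of preference.
std::vector<std::tuple<popart::OperatorIdentifier, float>>
MyOp::inplacePriorityDefault() const {
  return {{MyCustomOperators::MyOpInplaceId, 10.0f}};
}

// Construct the variant selected by the inplacing pass (assumes
// MyInplaceOp can be constructed from MyOp).
std::unique_ptr<popart::Op>
MyOp::getInplaceVariant(const popart::OperatorIdentifier &opid) const {
  if (opid == MyCustomOperators::MyOpInplaceId) {
    return std::make_unique<MyInplaceOp>(*this);
  }
  // The base class implementation raises an error for unknown ids.
  return popart::Op::getInplaceVariant(opid);
}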
-
virtual view::RegMap fwdRegMap(InIndex, OutIndex) const final
Map regions of the input tensor at the input index to the regions of the output tensor at the output index that these input regions alias.
- Parameters
InIndex – The op input index.
OutIndex – The op output index.
-
virtual view::RegMap bwdRegMap(InIndex, OutIndex) const final
Map regions of the output tensor at the output index to the regions of the input tensor at the input index that these output regions alias.
- Parameters
InIndex – The op input index.
OutIndex – The op output index.
-
virtual view::Regions modifies(InIndex) const override
Return the input region which this op modifies (for inplace ops).
- Parameters
InIndex – The input index.
- Returns
The regions which this op modifies.
-
virtual view::Regions aliases(InIndex, OutIndex) const override
Return the input region which the op output will alias (for inplace and view-changing ops).
See also
For more information on views, refer to the IPU Programmer’s Guide.
- Parameters
InIndex – The input index.
OutIndex – The output index.
- Returns
The regions which the output will alias.
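A hedged sketch of how an in-place op commonly answers these region queries, following the pattern of PopART's custom-op examples (MyInplaceOp is hypothetical):
C++:
#include <popart/op.hpp>

// The in-place op modifies, and its output aliases, exactly the
// regions of the input that it uses.
popart::view::Regions MyInplaceOp::modifies(popart::InIndex index) const {
  return uses(index);
}

popart::view::Regions MyInplaceOp::aliases(popart::InIndex in,
                                           popart::OutIndex) const {
  return uses(in);
}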
-
class DynamicSliceOp : public popart::DynamicSliceBaseOp
Dynamic Slice Op.
This Op takes two or three TensorIds as input (as indicated in graphcoreoperators.hpp):
The TensorId of the tensor to slice from.
The (optional) TensorId of the index of the starting point of the slice (see DynamicBaseOp for an explanation).
The TensorId of the tensor to write the slice into (not used in the outplace variant).
The output is the TensorId of the sliced tensor.
Subclassed by popart::DynamicSliceInplaceOp
Public Functions
-
DynamicSliceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual std::vector<std::unique_ptr<Op>> getGradOps() final
Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.
There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.
The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.
Throws an error if this op is already a gradient op.
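A hedged sketch of this machinery with hypothetical MyOp and MyGradOp classes: the forward op returns its grad op, and the grad op declares how its inputs and outputs connect to the forward op's tensors.
C++:
#include <map>
#include <memory>
#include <vector>

#include <popart/op.hpp>

// The forward op creates a single grad op for the backward pass.
std::vector<std::unique_ptr<popart::Op>> MyOp::getGradOps() {
  std::vector<std::unique_ptr<popart::Op>> upops;
  upops.emplace_back(std::make_unique<MyGradOp>(*this));
  return upops;
}

// Grad-op input 0 is the gradient of forward output 0; input 1 is the
// forward op's input 0.
const std::vector<popart::GradInOutMapper> &MyGradOp::gradInputInfo() const {
  static const std::vector<popart::GradInOutMapper> inInfo = {
      {0, 0, popart::GradOpInType::GradOut},
      {1, 0, popart::GradOpInType::In}};
  return inInfo;
}

// Grad-op output 0 is the gradient of forward input 0.
const std::map<int, int> &MyGradOp::gradOutToNonGradIn() const {
  static const std::map<int, int> outInfo = {{0, 0}};
  return outInfo;
}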
-
virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
Return the variants of this op (if any) which can modify / alias the inputs at the given indices.
This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.
-
virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().
- Parameters
OperatorIdentifier – The operator identifier of the op to be instantiated.
- Returns
An instance of the required op.
-
virtual void growAliasModel(AliasModel&) const override
For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.
The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.
To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
- Pre
All input tensors of this op have mappings in aliasModel before the call to growAliasModel.
- Post
All output tensors of this op have mappings in aliasModel after the call to growAliasModel.
-
virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
Translate a PopART inplacing proposal, which replaces an outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.
This method sets a value passed by reference, as opposed to acting as a getter method, so that no Poprithms headers need to be included in this file.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
OperatorIdentifier – The operator identifier to translate to the AliasModel equivalent.
- Returns
A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.
-
class DynamicSlicePadGradOp : public popart::DynamicBaseOp
Public Functions
-
DynamicSlicePadGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
void setup() final
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const override
-
class DynamicTernaryBaseInplaceOp : public popart::DynamicTernaryBaseOp
Subclassed by popart::DynamicAddInplaceOp, popart::DynamicUpdateInplaceOp
Public Functions
-
DynamicTernaryBaseInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
class DynamicTernaryBaseOp : public popart::DynamicBinaryBaseOp
Dynamic Ternary Base Op.
Base class for operators acting on a run-time selectable slice of a tensor. The word “ternary” refers to the fact that the operator takes three tensors as input.
See also
DynamicBaseOp for details
Subclassed by popart::DynamicAddOp, popart::DynamicTernaryBaseInplaceOp, popart::DynamicUpdateOp
Public Functions
-
DynamicTernaryBaseOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
class DynamicUpdateInplaceOp : public popart::DynamicTernaryBaseInplaceOp
Public Functions
-
DynamicUpdateInplaceOp(const DynamicUpdateOp &dynamicUpdateOp)
-
DynamicUpdateInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
class DynamicUpdateOp : public popart::DynamicTernaryBaseOp
Dynamic Update Op.
This class takes three TensorIds as input (as indicated in graphcoreoperators.hpp):
The TensorId of the tensor to be updated.
The TensorId of the index of the starting point of the slice (see DynamicBaseOp for an explanation).
The TensorId to update with (must match dimensions with (index, axes, sizes)).
The output is the TensorId of the updated tensor.
See also
DynamicTernaryBaseOp for details.
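A hedged usage sketch via the Builder API; the AiGraphcoreOpset1::dynamicupdate call and the tensor names are assumptions made for this example.
C++:
#include <cstdint>
#include <vector>

#include <popart/builder.hpp>

// Update a (3, 2, 4, 5) tensor along axis 1 with a slice of size 1,
// at a position chosen at run time through the "index" tensor.
void buildDynamicUpdate() {
  auto builder     = popart::Builder::create();
  auto aiGraphcore = builder->aiGraphcoreOpset1();

  popart::TensorInfo tInfo{"FLOAT", std::vector<int64_t>{3, 2, 4, 5}};
  popart::TensorInfo idxInfo{"UINT32", std::vector<int64_t>{1}};
  popart::TensorInfo sliceInfo{"FLOAT", std::vector<int64_t>{3, 1, 4, 5}};

  auto toUpdate = builder->addInputTensor(tInfo);
  auto index    = builder->addInputTensor(idxInfo);
  auto updater  = builder->addInputTensor(sliceInfo);

  // Writes "updater" into toUpdate[:, index[0]:index[0]+1, :, :].
  auto updated = aiGraphcore.dynamicupdate({toUpdate, index, updater},
                                           /*axes=*/{1},
                                           /*sizes=*/{1},
                                           /*noOverlap=*/1);
  builder->addOutputTensor(updated);
}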
Public Functions
-
DynamicUpdateOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual std::vector<std::unique_ptr<Op>> getGradOps() final
Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.
There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.
The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.
Throws an error if this op is already a gradient op.
-
virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().
- Parameters
OperatorIdentifier – The operator identifier of the op to be instantiated.
- Returns
An instance of the required op.
-
virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
Return the variants of this op (if any) which can modify / alias the inputs at the given indices.
This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.
-
class DynamicUpdateToUpdateGradOp : public popart::DynamicBinaryBaseOp
Public Functions
-
DynamicUpdateToUpdateGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class DynamicUpdateUpdaterGradOp : public popart::DynamicSliceBaseOp
Public Functions
-
DynamicUpdateUpdaterGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class DynamicZeroGradOp : public popart::DynamicBinaryBaseOp
Public Functions
-
DynamicZeroGradOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_)
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class DynamicZeroInplaceOp : public popart::DynamicBinaryBaseInplaceOp
Public Functions
-
DynamicZeroInplaceOp(const DynamicZeroOp &dynamicZeroOp)
-
DynamicZeroInplaceOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings&, TensorInfo updateInInfo_ = TensorInfo())
-
class DynamicZeroOp : public popart::DynamicBinaryBaseOp
Public Functions
-
DynamicZeroOp(const OperatorIdentifier &_opid, std::vector<int64_t> axes_, std::vector<int64_t> sizes_, bool noOverlap_, const Op::Settings &settings_, TensorInfo updateInInfo_ = TensorInfo())
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
-
class ElementWiseBinaryArg0GradOp : public popart::ElementWiseBinaryGradOp
Subclassed by popart::Atan2Arg0GradOp, popart::DivArg0GradOp, popart::FmodArg0GradOp, popart::MulArg0GradOp, popart::PowArg0GradOp
Public Functions
-
inline ElementWiseBinaryArg0GradOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_reduction_axes, const TensorInfo &_forward_op_arg_info, const Op::Settings &_settings)
-
class ElementWiseBinaryArg1GradOp : public popart::ElementWiseBinaryGradOp
Subclassed by popart::Atan2Arg1GradOp, popart::DivArg1GradOp, popart::MulArg1GradOp, popart::PowArg1GradOp, popart::SubtractArg1GradOp
Public Functions
-
inline ElementWiseBinaryArg1GradOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_reduction_axes, const TensorInfo &_forward_op_arg_info, const Op::Settings &_settings)
-
class ElementWiseBinaryBaseOp : public popart::Op
Subclassed by popart::ElementWiseBinaryInplaceLhsOp, popart::ElementWiseBinaryInplaceRhsOp, popart::ElementWiseBinaryOp
Public Functions
-
ElementWiseBinaryBaseOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
-
void setup() override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
void growAliasModel(AliasModel&) const override
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
-
class ElementWiseBinaryGradOp : public popart::Op
Subclassed by popart::ElementWiseBinaryArg0GradOp, popart::ElementWiseBinaryArg1GradOp
Public Functions
-
ElementWiseBinaryGradOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_reduction_axes, const TensorInfo &_forward_op_arg_info, const Op::Settings &_settings)
-
void setup() final
-
inline const std::vector<int64_t> &getReductionAxes() const
-
inline float getSubgraphValue() const final
-
inline const std::map<int, int> &gradOutToNonGradIn() const final
-
inline virtual const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class ElementWiseBinaryInplaceLhsOp : public popart::ElementWiseBinaryBaseOp
Subclassed by popart::AddLhsInplaceOp, popart::Atan2LhsInplaceOp, popart::MulLhsInplaceOp, popart::PowLhsInplaceOp
-
class ElementWiseBinaryInplaceRhsOp : public popart::ElementWiseBinaryBaseOp
Subclassed by popart::AddRhsInplaceOp, popart::MulRhsInplaceOp
-
class ElementWiseBinaryOp : public popart::ElementWiseBinaryBaseOp
Subclassed by popart::ElementWiseNpBroadcastableBinaryWithGradOp< AddArg0GradOp, AddArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< Atan2Arg0GradOp, Atan2Arg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< DivArg0GradOp, DivArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< MulArg0GradOp, MulArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< PowArg0GradOp, PowArg1GradOp >, popart::ElementWiseNpBroadcastableBinaryWithGradOp< SubtractArg0GradOp, SubtractArg1GradOp >, popart::BitwiseBinaryOp, popart::ElementWiseNpBroadcastableBinaryWithGradOp< Arg0GradOp, Arg1GradOp >, popart::FmodOp, popart::PReluOp
Public Functions
-
ElementWiseBinaryOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
void setInplacePriority(const OperatorIdentifier&, float)
-
float getInplacePriority(const OperatorIdentifier&) const
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
class ElementWiseInplaceUnaryOp : public popart::ElementWiseUnaryOp
Subclassed by popart::AsinInplaceOp, popart::AtanInplaceOp, popart::ClipInplaceOp, popart::EluInplaceOp, popart::ExpInplaceOp, popart::Expm1InplaceOp, popart::GeluErfInplaceOp, popart::GeluInplaceOp, popart::HardSigmoidInplaceOp, popart::IncrementModInplaceOp, popart::LeakyReluInplaceOp, popart::Log1pInplaceOp, popart::LogSoftmaxInplaceOp, popart::OneWayUnaryInPlaceOp, popart::ReluInplaceOp, popart::ScaleInplaceOp, popart::SeluInplaceOp, popart::ShrinkInplaceOp, popart::SigmoidInplaceOp, popart::SinhInplaceOp, popart::SoftmaxInplaceOp, popart::SoftPlusInplaceOp, popart::SoftSignInplaceOp, popart::SwishInplaceOp, popart::ThresholdedReluInplaceOp
-
class ElementWiseNonLinearUnaryGradOp : public popart::Op
Subclassed by popart::AsinGradOp, popart::AtanGradOp, popart::CosGradOp, popart::EluGradOp, popart::ErfGradOp, popart::GeluErfGradOp, popart::GeluGradOp, popart::HardSigmoidGradOp, popart::Log1pGradOp, popart::LogGradOp, popart::ReciprocalGradOp, popart::SeluGradOp, popart::ShrinkGradOp, popart::SinGradOp, popart::SinhGradOp, popart::SoftPlusGradOp, popart::SoftSignGradOp, popart::SwishGradOp, popart::ThresholdedReluGradOp
Public Functions
-
ElementWiseNonLinearUnaryGradOp(const OperatorIdentifier &_opid, const ElementWiseUnaryOp &fwdOp)
-
void setup() final
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
template<class Arg0GradOp, class Arg1GradOp>
class ElementWiseNpBroadcastableBinaryWithGradOp : public popart::ElementWiseBinaryOp Subclassed by popart::AddOp, popart::Atan2Op, popart::DivOp, popart::MulOp, popart::PowOp, popart::SubtractOp
-
class ElementWiseUnaryBooleanOp : public popart::Op
Subclassed by popart::IsInf, popart::IsNaN
-
class ElementWiseUnaryOp : public popart::Op
Subclassed by popart::AbsOp, popart::AsinOp, popart::AtanOp, popart::AutoLossScaleProxyOp, popart::BinaryConstScalarOp, popart::BitwiseNotOp, popart::ClipOp, popart::CosOp, popart::DetachOp, popart::ElementWiseInplaceUnaryOp, popart::EluOp, popart::ErfOp, popart::Expm1Op, popart::ExpOp, popart::GeluErfOp, popart::GeluOp, popart::HardSigmoidOp, popart::IdentityOp, popart::IncrementModOp, popart::LeakyReluOp, popart::Log1pOp, popart::LogOp, popart::LogSoftmaxOp, popart::NegateOp, popart::NopOp, popart::NotOp, popart::OneWayUnaryOp, popart::PrintTensorOp, popart::ReciprocalOp, popart::ReluOp, popart::ScaleOp, popart::SeluOp, popart::ShrinkOp, popart::SigmoidOp, popart::SinhOp, popart::SinOp, popart::SoftmaxOp, popart::SoftPlusOp, popart::SoftSignOp, popart::SqrtOp, popart::SquareOp, popart::SwishOp, popart::TanhOp, popart::ThresholdedReluOp
Public Functions
-
ElementWiseUnaryOp(const OperatorIdentifier &_opid, const Op::Settings &_settings)
-
void setup() final
-
inline float getSubgraphValue() const override
-
inline bool canShard() const override
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
-
void growAliasModel(AliasModel&) const override
-
inline virtual bool isIdentity() const
- Returns
true, if and only if (iff) this Op is mathematically equivalent to f(x) = x. This is slightly different from canBeReplacedByIdentity; for example, Detach and Identity have isIdentity overridden to return true, but still return false for canBeReplacedByIdentity.
-
class EluGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class EluInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class EluOp : public popart::ElementWiseUnaryOp
Public Functions
-
EluOp(const OperatorIdentifier &opid, float alpha, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
void appendAttributes(OpSerialiserBase&) const final
-
inline float alpha() const
-
class EqualOp : public popart::BinaryComparisonOp
-
class ErfGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class ErfOp : public popart::ElementWiseUnaryOp
-
class ExchangeBaseOp : public popart::Op
Subclassed by popart::HostBaseOp, popart::MultiExchangeOp, popart::RemoteBaseOp, popart::RemoteCodeLoadOp
Public Functions
-
inline ExchangeBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
-
inline virtual int getNumExchanges() const
-
virtual ExchangeDescriptor getExchangeDescriptor(int index) const = 0
Return the exchange descriptor at the given index.
A MultiExchangeOp can contain multiple descriptors, while RemoteLoad/Store and HostLoad/Store ops contain one each.
contain one each.- Parameters
index – Index of the exchange descriptor to return.
- Returns
The ExchangeDescriptor for the exchange.
-
inline float getSubgraphValue() const final
-
inline bool isOutlineable() const final
-
virtual std::pair<int, int> inIndexToDescriptorIndex(InIndex index) const
Get the descriptor index associated with the input index.
- Parameters
index – The input index.
- Returns
A pair of the descriptor index and the input index relative to that descriptor.
-
virtual std::pair<int, int> outIndexToDescriptorIndex(OutIndex index) const
Get the descriptor index associated with the output index.
- Parameters
index – The output index.
- Returns
A pair of the descriptor index and the output index relative to that descriptor.
-
class ExpGradOp : public popart::Op
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getSubgraphValue() const final
-
class ExpInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class ExpOp : public popart::ElementWiseUnaryOp
Public Functions
-
ExpOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class ExpandGradOp : public popart::Op
Public Functions
-
ExpandGradOp(const ExpandInplaceOp &op)
-
void setup() override
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
inline std::vector<size_t> getXShape()
-
inline float getSubgraphValue() const final
-
class ExpandInplaceOp : public popart::ExpandOp
Public Functions
-
ExpandInplaceOp(const OperatorIdentifier &_opid, const Shape&, const Op::Settings &settings_)
-
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
class ExpandOp : public popart::Op
Subclassed by popart::ExpandInplaceOp
Public Functions
-
inline ExpandOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
ExpandOp(const OperatorIdentifier &_opid, const Shape &_outShape, const Op::Settings &settings)
-
void setup() final
-
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
-
inline bool canBeReplacedByIdentity() const override
-
void growAliasModel(AliasModel&) const override
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
inline float getSubgraphValue() const final
-
class Expm1GradOp : public popart::Op
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getSubgraphValue() const final
-
class Expm1InplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class Expm1Op : public popart::ElementWiseUnaryOp
Public Functions
-
Expm1Op(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class FloorInplaceOp : public popart::OneWayUnaryInPlaceOp
-
class FloorOp : public popart::OneWayUnaryOp
Public Functions
-
FloorOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class FmodArg0GradOp : public popart::ElementWiseBinaryArg0GradOp
-
class FmodOp : public popart::ElementWiseBinaryOp
-
class GRUGradOp : public popart::BaseOnnxRNNGradOp
Gradient operator for GRUOp.
Public Members
-
const unsigned linear_before_reset_attribute
-
class GRUOp : public popart::BaseOnnxRNNOp
This op applies a single-layer GRU with a non-linearity to a batch of input sequences.
The op follows the ONNX specification described in https://github.com/onnx/onnx/blob/main/docs/Operators.md#GRU
Public Functions
-
GRUOp(const OperatorIdentifier &_opid, nonstd::optional<int64_t> hidden_size, const std::string direction, bool linear_before_reset, const Op::Settings &settings_)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual std::vector<std::unique_ptr<Op>> getGradOps() final
Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.
There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.
The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.
Throws an error if this op is already a gradient op.
-
virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
unsigned getNumChannels() const
-
int64_t getNumDirections() const override
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
bool isTraining() const
-
inline virtual bool isOutlineable() const override
Check if op can be outlined.
If this method returns false, any possible subgraph that this op is part of will not be cached.
- Returns
true if the op can be outlined, false otherwise. Default: true.
-
inline std::string getDirectionAttribute() const
-
inline int getLinearBeforeResetAttribute() const
-
class GatherGradOp : public popart::Op
Subclassed by popart::TiedGatherGradOp
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
int64_t getAxis() const
-
int64_t getGroupSize() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
inline nonstd::optional<float> getAvailableMemoryProportion() const
-
inline void setAvailableMemoryProportion(const nonstd::optional<float> v)
-
class GatherOp : public popart::Op
Subclassed by popart::TiedGatherOp
Public Functions
-
GatherOp(const OperatorIdentifier &_opid, int64_t axis_, int64_t group_size_, const Op::Settings &settings_, const nonstd::optional<float> &available_memory_proportion_ = nonstd::nullopt, bool zeroOutOfRangeIndices_ = false)
-
void setup() final
-
int64_t getAxis() const
-
int64_t getGroupSize() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
bool canBeReplacedByIdentity() const override
-
inline float getSubgraphValue() const override
-
inline bool canShard() const override
-
inline nonstd::optional<float> getAvailableMemoryProportion() const
-
inline void setAvailableMemoryProportion(const nonstd::optional<float> v)
-
inline bool zeroOutOfRangeIndices() const
-
class GeluGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class GeluInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class GeluOp : public popart::ElementWiseUnaryOp
Public Functions
-
GeluOp(const OperatorIdentifier &opid, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class GeluErfGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class GeluErfInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class GeluErfOp : public popart::ElementWiseUnaryOp
Public Functions
-
GeluErfOp(const OperatorIdentifier &opid, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class GetRandomSeedOp : public popart::Op
Public Functions
-
GetRandomSeedOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline bool isOutlineable() const final
-
inline void growAliasModel(AliasModel &m) const override
-
class GlobalAveragePoolGradOp : public popart::Op
Public Functions
-
GlobalAveragePoolGradOp(const GlobalAveragePoolOp&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
class GlobalAveragePoolOp : public popart::Op
Public Functions
-
GlobalAveragePoolOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
void setup() override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
class GlobalMaxPoolGradOp : public popart::Op
Public Functions
-
GlobalMaxPoolGradOp(const GlobalMaxPoolOp&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
class GlobalMaxPoolOp : public popart::Op
Public Functions
-
GlobalMaxPoolOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
-
void setup() override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
class GreaterOp : public popart::BinaryComparisonOp
-
class GroupNormGradOp : public popart::Op
Public Functions
-
GroupNormGradOp(const GroupNormOp&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getEpsilon() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
class GroupNormOp : public popart::Op
Public Functions
-
GroupNormOp(const OperatorIdentifier &opid_, int64_t num_groups_, float epsilon_, const Op::Settings &settings)
-
void setup() final
-
inline float getEpsilon() const
-
inline int64_t getNumGroups() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline bool isNorm() const override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
bool canBeReplacedByIdentity() const final
-
class HardSigmoidGradOp : public popart::ElementWiseNonLinearUnaryGradOp
Public Functions
-
HardSigmoidGradOp(const HardSigmoidOp&)
-
void appendAttributes(OpSerialiserBase&) const override
-
inline float getAlpha() const
-
inline float getBeta() const
-
class HardSigmoidInplaceOp : public popart::ElementWiseInplaceUnaryOp
Public Functions
-
HardSigmoidInplaceOp(const HardSigmoidOp&)
-
void appendAttributes(OpSerialiserBase&) const override
-
inline float getAlpha() const
-
inline float getBeta() const
-
class HardSigmoidOp : public popart::ElementWiseUnaryOp
Public Functions
-
HardSigmoidOp(const OperatorIdentifier &opid, float _alpha, float _beta, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
void appendAttributes(OpSerialiserBase&) const override
-
inline float getAlpha() const
-
inline float getBeta() const
-
class HasReceptiveFieldOp : public popart::Op
Subclassed by popart::AveragePoolOp, popart::MaxPoolOp
Public Functions
-
HasReceptiveFieldOp(const OperatorIdentifier &_opid, const HasReceptiveFieldOp::ReceptiveOpAttributes &attributes, const Op::Settings &settings)
-
int getNSpatialDims() const
-
int64_t getBatchSize() const
-
int64_t getNInChans() const
-
std::vector<int64_t> getSpatialD() const
-
std::vector<int64_t> getSpatialO() const
-
void setup() override
-
virtual int64_t getNOutChans() const = 0
-
std::vector<int64_t> lowerPads() const
-
std::vector<int64_t> upperPads() const
-
std::vector<int64_t> lowerOutPads() const
-
std::vector<int64_t> upperOutPads() const
-
std::vector<size_t> spatialD_szt() const
-
std::vector<size_t> spatialK_szt() const
-
std::vector<uint32_t> lowerPads_u32() const
-
std::vector<uint32_t> upperPads_u32() const
-
std::vector<int> lowerPads_i32() const
-
std::vector<int> upperPads_i32() const
-
std::vector<uint32_t> dilations_u32() const
-
std::vector<uint32_t> strides_u32() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
Public Members
-
const std::vector<int64_t> basePads
-
const std::vector<int64_t> baseOutPads
-
const std::vector<int64_t> baseStrides
-
const std::vector<int64_t> baseDilations
-
const std::vector<int64_t> baseInDilations
-
const bool ceilMode
-
struct ReceptiveOpAttributes
Public Functions
-
void setFromAttributes(const Attributes &attributes)
-
class HistogramOp : public popart::Op
Public Functions
-
HistogramOp(const OperatorIdentifier &_opid, const std::vector<float> &levels_, const bool absoluteOfInput_, const Op::Settings &settings_)
-
void setup() final
-
inline float getSubgraphValue() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline std::vector<float> getLevels() const
-
inline bool getAbsoluteOfInput() const
-
class HostBaseOp : public popart::ExchangeBaseOp
Subclassed by popart::HostLoadOp, popart::HostStoreOp
Public Functions
-
inline HostBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, TensorId sid_)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline bool canShard() const final
-
inline bool hasSideEffect() const override
-
class HostLoadInplaceOp : public popart::HostLoadOp
Public Functions
-
HostLoadInplaceOp(const OperatorIdentifier&, const Op::Settings&, TensorId sid_)
-
HostLoadInplaceOp(const HostLoadOp&)
-
void setup() final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
ExchangeDescriptor getExchangeDescriptor(int index) const final
-
class HostLoadOp : public popart::HostBaseOp
Host Load Op: an op to represent the transfer of data from the host to the device.
It uses the existing host to device transfers created when building the IR, but defers the actual poplar::Copy until the op itself runs. This allows the copy to be scheduled as part of the normal op scheduling.
There is a stage in the IR which adds the following ops:
Device: InitOp -> input_prehostload -> HostLoadOp -> input -> etc…
Host: data -> stream
Subclassed by popart::HostLoadInplaceOp
Public Functions
-
HostLoadOp(const OperatorIdentifier&, const Op::Settings&, TensorId sid_)
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() override
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
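For illustration, a minimal sketch of a setup() override, assuming a hypothetical elementwise-style op MyOp (outInfo() and inInfo() are the Op accessors for output and input TensorInfo):
C++:
// Output 0 inherits the shape and type of input 0 (illustrative only).
void MyOp::setup() { outInfo(0) = inInfo(0); }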
-
virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
Determine whether output tensors are guaranteed to have an equal value across all replicas.
This means that they are “replica equal”. The check is based on information about the replica equal status of input tensors (and the same for any inputs that are modified by the op).
The default implementation sets each output tensor as being replica-equal if and only if all tensor inputs are replica-equal. For modified inputs, the default is to assume it is replica-equal only if there is an output that is deemed replica-equal that fully aliases all elements of the input. This default implementation is not correct for all ops. Ops that need a specialized implementation should override this virtual function.
- Parameters
aliasModel – An alias model object.
inputMap – A map that stores, for each input, whether the inputs are data-equivalent over all replicas.
proxy – A helper object passed in by the replica-equal analysis.
- Returns
A tuple comprising:
a mapping from output index to a replica-equal status with an entry for each output tensor.
a vector of input indices for inputs that were modified by the op to a value that is not replica-equal.
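As a rough illustration of the default rule described above, a standalone sketch (a hypothetical helper, not the PopART signature) that marks every output replica-equal if and only if all inputs are:
C++:
#include <vector>

std::vector<bool> defaultReplicaEqual(const std::vector<bool> &inputsEqual,
                                      int numOutputs) {
  bool allEqual = true;
  for (bool e : inputsEqual) {
    allEqual = allEqual && e; // one non-equal input makes all outputs non-equal
  }
  return std::vector<bool>(numOutputs, allEqual);
}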
-
virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
Return the variants of this op (if any) which can modify / alias the inputs at the given indices.
This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples, again in descending order of preference.
-
virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().
- Parameters
OperatorIdentifier – The operator identifier of the op to be instantiated.
- Returns
An instance of the required op.
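A hedged sketch of how these two methods typically work together, for a hypothetical op MyOp with an inplace variant MyOpInplace (the identifier Onnx::CustomOperators::MyOpInplace is a placeholder):
C++:
std::vector<std::tuple<OperatorIdentifier, float>>
MyOp::inplacePriorityDefault() const {
  // A single candidate; higher values are preferred by the inplacing pass.
  return {{Onnx::CustomOperators::MyOpInplace, 10.0f}};
}

std::unique_ptr<Op> MyOp::getInplaceVariant(const OperatorIdentifier &id) const {
  if (id == Onnx::CustomOperators::MyOpInplace) {
    return std::make_unique<MyOpInplace>(*this);
  }
  // Defer to the base class, which reports an error for unknown identifiers.
  return Op::getInplaceVariant(id);
}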
-
virtual void growAliasModel(AliasModel &m) const final
For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.
The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.
To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
- Pre
All input tensors of this op have mappings in aliasModel before the call to growAliasModel().
- Post
All output tensors of this op have mappings in aliasModel after the call to growAliasModel().
-
virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const final
Translate a PopART inplacing proposal.
This replaces an outplace op with an inplace op of type
inplaceId
, into an AliasModel equivalent.This method is defined as a void method which sets a value passed by reference, as opposed to a getter method, so that no Poprithms headers need to be included in this file.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
2 – The operator identifier to translate to the AliasModel equivalent.
- Returns
A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is a input index.
-
ExchangeDescriptor getExchangeDescriptor(int index) const override
-
class HostStoreOp : public popart::HostBaseOp
Public Functions
-
HostStoreOp(const OperatorIdentifier&, const Op::Settings&, TensorId sid_)
-
void setup() final
-
ExchangeDescriptor getExchangeDescriptor(int index) const final
-
class IdentityGradOp : public popart::IdentityOp
Public Functions
-
IdentityGradOp(const IdentityOp &fwdOp)
-
IdentityGradOp(const Settings &settings_)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class IdentityInplaceOp : public popart::IdentityOp
Public Functions
-
IdentityInplaceOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
IdentityInplaceOp(const IdentityOp &concatOp)
-
inline bool isInplaceViewChange() const override
-
class IdentityLossGradOp : public popart::Op
Public Functions
-
IdentityLossGradOp(const IdentityLossOp&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
bool canBeReplacedByIdentity() const override
-
inline ReductionType getReductionType() const
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
class IdentityLossOp : public popart::LossOp
Public Functions
-
IdentityLossOp(const OperatorIdentifier &_opid, const ReductionType &reduction, const Op::Settings &settings_)
-
void setup() final
-
bool canBeReplacedByIdentity() const override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
inline ReductionType getShardReductionType(OutIndex index) const override
-
class IdentityOp : public popart::ElementWiseUnaryOp
Subclassed by popart::AddBiasDataGradOp, popart::IdentityGradOp, popart::IdentityInplaceOp, popart::IfConditionGradOp
Public Functions
-
IdentityOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
inline bool isIdentity() const final
-
inline bool isOutplaceViewChange() const override
-
class IfConditionGradOp : public popart::IdentityOp
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class IfGradOp : public popart::IfOp
Public Functions
-
IfGradOp(const IfOp&, const std::vector<GradInOutMapper> &gradInInfo, const BranchInfo &thenBranchInfo, const BranchInfo &elseBranchInfo)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class IfOp : public popart::Op
Subclassed by popart::IfGradOp
Public Functions
-
IfOp(const OperatorIdentifier&, const BranchInfo &thenBranchInfo, const BranchInfo &elseBranchInfo, const Op::Settings&)
-
void setup() final
-
inline float getSubgraphValue() const final
-
std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
-
virtual InIndex opInToSubgraphInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override
-
virtual InIndex subgraphInToOpInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override
-
virtual OutIndex opOutToSubgraphOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override
-
virtual OutIndex subgraphOutToOpOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override
-
float calcAutoVirtualGraphCost(std::set<int> &inputs_seen) override
-
virtual void setCalledSubgraphGradInfo(const FwdGraphToBwdGraphInfo &calledGraphsGradInfo) override
-
class IncrementModInplaceOp : public popart::ElementWiseInplaceUnaryOp
Increment Modulo Op.
This Op takes one Tensor as input: the tensor to increment (modulo). The output is the tensor x = (x + increment) % modulus.
Attributes:
increment - how much to increment the input tensor by (const scalar)
modulus - the modulo operand (const scalar)
Inplace - the result is mapped back to the input Tensor.
See also
graphcoreoperators.hpp
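A host-side reference for these semantics (an illustrative sketch, not the op implementation):
C++:
#include <cmath>
#include <vector>

// x = (x + increment) % modulus, applied elementwise and in place.
void incrementModInplace(std::vector<double> &x, double increment,
                         double modulus) {
  for (auto &v : x) {
    v = std::fmod(v + increment, modulus); // floating-point modulo
  }
}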
Public Functions
-
IncrementModInplaceOp(const IncrementModOp&)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
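A minimal sketch of such an override, assuming a hypothetical op MyOp whose float attribute alpha affects outlining equivalence:
C++:
void MyOp::appendOutlineAttributes(OpSerialiserBase &os) const {
  Op::appendOutlineAttributes(os);         // keep the base attributes
  os.appendAttribute("alpha", getAlpha()); // hypothetical extra attribute
}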
-
inline double getIncrement() const
-
inline double getModulus() const
-
class IncrementModOp : public popart::ElementWiseUnaryOp
Increment Modulo Op.
This Op takes one Tensor as input: the tensor to increment (modulo). The output is the tensor y = (x + increment) % modulus.
Attributes:
increment - how much to increment the input tensor by (const scalar)
modulus - the modulo operand (const scalar)
See also
graphcoreoperators.hpp
Public Functions
-
IncrementModOp(const OperatorIdentifier &opId, double increment_, double modulus_, const Op::Settings &settings)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
inline double getIncrement() const
-
inline double getModulus() const
-
class InitOp : public popart::Op
Public Functions
-
InitOp(const OperatorIdentifier&, const TensorInfo&, const TensorType&, const InitType&, const Op::Settings&, const int = -1)
-
void setup() final
-
inline TensorInfo getTensorInfo() const
-
inline TensorType getTensorType() const
-
inline float getSubgraphValue() const final
-
inline bool isOutlineable() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline bool canShard() const override
-
class InstanceNormGradOp : public popart::Op
Public Functions
-
InstanceNormGradOp(const InstanceNormOp &fwd_op)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getSubgraphValue() const final
-
class InstanceNormOp : public popart::Op
Public Functions
-
InstanceNormOp(const OperatorIdentifier &_opid, float _epsilon, const Op::Settings &settings)
-
void setup() final
-
inline float getEpsilon() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline bool isNorm() const override
-
inline float getSubgraphValue() const final
-
class IoTileCopyOp : public popart::Op
Public Functions
-
IoTileCopyOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
void setup() final
-
inline float getSubgraphValue() const final
-
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const final
-
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const final
-
inline bool canShard() const override
-
class IsInf : public popart::ElementWiseUnaryBooleanOp
Public Functions
-
IsInf(const OperatorIdentifier &_opid, const Op::Settings&)
Public Static Functions
-
static OperatorIdentifier getOpId(const Ir &ir)
-
class IsNaN : public popart::ElementWiseUnaryBooleanOp
Public Functions
-
IsNaN(const OperatorIdentifier &_opid, const Op::Settings&)
Public Static Functions
-
static OperatorIdentifier getOpId(const Ir &ir)
-
class L1GradOp : public popart::Op
Public Functions
-
L1GradOp(const float lambda_, const ReductionType reduction_, const Op::Settings &settings_)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline float getLambda() const
-
inline ReductionType getReductionType() const
-
inline bool canShard() const override
-
class L1Op : public popart::LossOp
Public Functions
-
L1Op(const OperatorIdentifier &_opid, const float lambda_, const ReductionType reduction_, const Op::Settings &settings_)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline float getLambda() const
-
inline bool canShard() const override
-
inline ReductionType getShardReductionType(OutIndex index) const override
-
class LRNGradOp : public popart::Op
Public Functions
-
void setup() final
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
inline float getAlpha() const
-
inline float getBeta() const
-
inline float getBias() const
-
inline int64_t getSize() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
class LRNOp : public popart::Op
Public Functions
-
LRNOp(const OperatorIdentifier &_opid, float _alpha, float _beta, float _bias, int64_t _size, const Op::Settings &settings_)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline float getAlpha() const
-
inline float getBeta() const
-
inline float getBias() const
-
inline int64_t getSize() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
class LSTMGradOp : public popart::BaseOnnxRNNGradOp
Gradient operator for LSTM op.
Public Functions
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
virtual const std::map<int, int> &gradOutToNonGradIn() const final
Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.
This method throws an error if the op this is called on is not a grad op.
-
bool hasLastCellStateGradInput() const
Public Members
-
const bool hasInitialCInput
-
const std::string fwd_debug_name
-
const ActivationFunction activation
-
const ActivationFunction recurrent_activation
-
class LSTMOp : public popart::BaseOnnxRNNOp
This op applies a single-layer LSTM with a non-linearity to a batch of input sequences.
The op follows the ONNX specification described in https://github.com/onnx/onnx/blob/main/docs/Operators.md#LSTM
Public Functions
-
LSTMOp(const OperatorIdentifier &_opid, nonstd::optional<int64_t> hidden_size, ActivationFunction activation, ActivationFunction recurrent_activation, const Op::Settings &settings_, const nonstd::optional<float> available_memory_proportion_)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual std::vector<std::unique_ptr<Op>> getGradOps() final
Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.
There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.
The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.
Throws an error if this op is already a gradient op.
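A minimal sketch of the gradOutToNonGradIn() override pattern described above, for a hypothetical grad op whose output 0 is the gradient of the non-grad op's input 0:
C++:
#include <map>

const std::map<int, int> &MyGradOp::gradOutToNonGradIn() const {
  // Maps grad-op output index to non-grad-op input index.
  static const std::map<int, int> outToIn = {{0, 0}};
  return outToIn;
}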
-
virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
unsigned getNumChannels() const
-
nonstd::optional<float> getAvailableMemoryProportion() const
-
bool hasInitialCInput() const
-
virtual std::set<InIndex> optionalInputs() const final
Return the input indices of all optional inputs to the op.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
bool isTraining() const
-
inline virtual bool isOutlineable() const override
Check if op can be outlined.
If this method returns false, it will mean that any possible subgraph that this op is part of will not be cached.
- Returns
true if the op can be outlined, false otherwise. Default: true.
-
virtual int getInBatchAxis(InIndex) const override
Get the batch axis for the input index.
- Returns
The batch axis for the input index.
-
virtual int getOutBatchAxis(OutIndex) const override
Get the batch axis for the output index.
- Returns
The batch axis for the output index.
-
inline ActivationFunction getActivation() const
-
inline ActivationFunction getRecurrentActivation() const
-
class LambSquareOp : public popart::Op
Public Functions
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline bool isOptimizerOp() const override
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final
-
void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, CommGroup shardingDomain) final
-
void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, const ReplicaGrouping &grouping) final
-
class LeakyReluGradOp : public popart::Op, public popart::LeakyReluOpBaseAttributes
Public Functions
-
LeakyReluGradOp(const LeakyReluOp&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
void appendAttributes(popart::OpSerialiserBase &os) const override
-
void appendOutlineAttributes(popart::OpSerialiserBase &os) const override
-
inline float getSubgraphValue() const final
-
class LeakyReluInplaceOp : public popart::ElementWiseInplaceUnaryOp, public popart::LeakyReluOpBaseAttributes
Public Functions
-
LeakyReluInplaceOp(const LeakyReluOp&)
-
void appendAttributes(popart::OpSerialiserBase &os) const override
-
void appendOutlineAttributes(popart::OpSerialiserBase &os) const override
-
class LeakyReluOp : public popart::ElementWiseUnaryOp, public popart::LeakyReluOpBaseAttributes
Public Functions
-
LeakyReluOp(const OperatorIdentifier &_opid, float _alpha, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
void appendAttributes(popart::OpSerialiserBase &os) const override
-
void appendOutlineAttributes(popart::OpSerialiserBase &os) const override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class LessOp : public popart::BinaryComparisonOp
-
class LinearVariadicGradOp : public popart::VariadicGradOp
Subclassed by popart::MeanArgGradOp, popart::SumArgGradOp
Public Functions
-
LinearVariadicGradOp(const OperatorIdentifier &_opid, const VariadicOp&, InIndex)
-
inline virtual bool hasScale() const
-
inline virtual float getScale() const
-
class Log1pGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class Log1pInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class Log1pOp : public popart::ElementWiseUnaryOp
Public Functions
-
Log1pOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class LogGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class LogOp : public popart::ElementWiseUnaryOp
-
class LogSoftmaxGradOp : public popart::Op
Public Functions
-
LogSoftmaxGradOp(const LogSoftmaxOp&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline int64_t getAxis() const
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
-
class LogSoftmaxInplaceOp : public popart::ElementWiseInplaceUnaryOp
Public Functions
-
LogSoftmaxInplaceOp(const LogSoftmaxOp&)
-
inline int64_t getAxis() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
class LogSoftmaxOp : public popart::ElementWiseUnaryOp
Public Functions
-
LogSoftmaxOp(const OperatorIdentifier &_opid, int64_t axis, const Op::Settings &settings_)
-
int64_t getAxis() const
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class LoopOp : public popart::SubgraphOp
Public Functions
-
LoopOp(const OperatorIdentifier&, const Op::Settings&, Graph &callee_)
-
LoopOp(const OperatorIdentifier&, const Op::Settings&, Graph &callee_, int numImplicitScanOutputs_)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
inline int getTripCountValue() const
-
inline void setTripCountValue(int value)
-
int getNumExplicitInputs() const
-
int getNumImplicitInputs() const
-
inline int getNumImplicitScanOutputs()
-
inline void setNumImplicitScanOutputs(int numOutputs)
-
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const override
-
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const override
-
void addLoopInput(InIndex index, TensorId tensorId, TensorId subgraphTensorId, bool overwrite)
Add a variadic input to the loop operator.
- Parameters
index – The position at which a Tensor is consumed by the Op.
tensorId – The id of the tensor to add as an input.
subgraphTensorId – Tensor which is going to be created in the subgraph.
overwrite – If true, the original tensor at index will be replaced.
-
inline void growAliasModel(AliasModel &m) const override
Public Static Functions
-
static inline InIndex getMaximumTripCountInIndex()
Indexing on the LoopOp.
- Returns
The LoopOp input index for the maximum number of loop iterations
-
static inline InIndex getTerminationConditionInIndex()
Indexing on the LoopOp.
- Returns
The LoopOp input index specifying the termination condition status
-
static inline InIndex getFirstInputInIndex()
Indexing on the LoopOp.
- Returns
The first regular, user-defined LoopOp input index
-
static inline OutIndex getFirstOutputOutIndex()
Indexing on the LoopOp.
- Returns
The first regular, user-defined LoopOp output index
-
static inline InIndex getLoopGraphIterationInIndex()
Indexing on the body graph.
- Returns
The loop body graph input index specifying the current loop iteration
-
static inline InIndex getLoopGraphTerminationConditionInIndex()
Indexing on the body graph.
- Returns
The loop body graph input index specifying the current termination condition status
-
static inline InIndex getLoopGraphFirstInputInIndex()
Indexing on the body graph.
- Returns
The first regular, user-defined loop body graph input index
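The fixed indices above follow the ONNX Loop signature (maximum trip count, termination condition, then user-defined inputs). A host-side sketch of the control flow they describe, with a hypothetical stand-in body:
C++:
#include <cstdint>
#include <vector>

std::vector<float> loopReference(int64_t maxTripCount, bool keepGoing,
                                 std::vector<float> state) {
  for (int64_t i = 0; i < maxTripCount && keepGoing; ++i) {
    // The body graph consumes (iteration, condition, state...) and produces
    // (condition, state...); this body is illustrative only and assumes a
    // non-empty state.
    for (auto &s : state) {
      s *= 0.5f;
    }
    keepGoing = state.front() > 1e-3f;
  }
  return state;
}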
-
class LossOp : public popart::Op
Subclassed by popart::CtcOp, popart::IdentityLossOp, popart::L1Op, popart::NllOp
Public Functions
-
LossOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const ReductionType reduction_)
-
bool isLossOp() const override
-
inline ReductionType getReductionType() const
Public Static Functions
-
static std::string reductionTypeToString(ReductionType reduction)
-
static ReductionType reductionTypeFromString(std::string reduction)
-
class LossScaleUpdateOp : public popart::Op
Public Functions
-
inline LossScaleUpdateOp(const OperatorIdentifier &_opid, const DataType &updateFactorDType_, const Op::Settings &settings_)
-
void setup() final
-
inline float getSubgraphValue() const final
-
void growAliasModel(AliasModel &m) const override
-
class MatMulBaseGradOp : public popart::MatMulBaseOp
Subclassed by popart::MatMulLhsGradOp, popart::MatMulRhsGradOp
Public Functions
-
MatMulBaseGradOp(const OperatorIdentifier &_opid, const MatMulOp &fwdOp, Phase phase)
-
MatMulBaseGradOp(const MatMulBaseGradOp&) = default
-
~MatMulBaseGradOp() override = default
-
inline float getSubgraphValue() const override
-
class MatMulBaseOp : public popart::Op
The matmul op supports inputs of IR datatype FLOAT8_143 and FLOAT8_152.
Inputs of this type are a special case because they require an additional scalar INT32 tensor input known as the log2Scale. This argument may be used if and only if both matmul operands are one of the FLOAT8_* types.
If the matmul inputs are valid FLOAT8 and log2Scale inputs, then the matmul is considered a ‘pow2 scaled matmul’. A pow2 scaled matmul is an operation of the form
result := A @ B * 2^(log2Scale)
where @ is the matrix multiply op. In this case, the output and partials type must be FLOAT16. Note that the multiplication by 2^(log2Scale) is handled by Poplar and is not listed as an Op in the IR.
Subclassed by popart::MatMulBaseGradOp, popart::MatMulOp
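A host-side reference for the pow2 scaled matmul semantics (an illustrative sketch in plain C++, not the Poplar implementation):
C++:
#include <cmath>
#include <vector>

// C = (A @ B) * 2^log2Scale for row-major A (m x k) and B (k x n).
std::vector<float> pow2ScaledMatMul(const std::vector<float> &A,
                                    const std::vector<float> &B,
                                    int m, int k, int n, int log2Scale) {
  std::vector<float> C(m * n, 0.0f);
  const float scale = std::ldexp(1.0f, log2Scale); // exact power-of-two scale
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (int p = 0; p < k; ++p) {
        acc += A[i * k + p] * B[p * n + j];
      }
      C[i * n + j] = acc * scale;
    }
  }
  return C;
}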
Public Functions
-
MatMulBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const Phase phase_, const nonstd::optional<float> availableMemoryProportion_, const SerialiseSettings &serialization_, const OptionalDataType outputType_, const MatMulPartialsType partialsType_, const bool enableFullyConnectedPass_ = true)
-
MatMulBaseOp(const MatMulBaseOp&) = default
-
~MatMulBaseOp() override = default
-
virtual std::unique_ptr<Op> clone() const override = 0
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
bool useFullyConnectedPass() const
-
inline void setUseFullyConnectedPass(bool b)
-
inline nonstd::optional<float> getAvailableMemoryProportion() const
-
inline void setAvailableMemoryProportion(const nonstd::optional<float> v)
-
inline const SerialiseSettings &getSerialiseSettings() const
-
inline SerialiseSettings &getSerialiseSettings()
-
inline OptionalDataType getOutputType() const
-
virtual void appendOutlineAttributes(OpSerialiserBase &os) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
virtual void appendMore(OpSerialiserBase &os) const override
Append additional attributes to the stream.
This method should be overridden if the derived class has additional attributes.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
inline MatMulPartialsType getPartialsType() const
-
inline void setPartialsType(const MatMulPartialsType &pt)
-
inline virtual bool canShard() const override
Check if the operation can be sharded into multiple operations.
- Returns
true if the operation can be sharded, false otherwise.
-
class MatMulLhsGradOp : public popart::MatMulBaseGradOp
Public Functions
-
MatMulLhsGradOp(const MatMulLhsGradOp&) = default
-
MatMulLhsGradOp &operator=(const MatMulLhsGradOp&) = delete
-
~MatMulLhsGradOp() override = default
-
void setup() final
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class MatMulOp : public popart::MatMulBaseOp
Public Functions
-
MatMulOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const nonstd::optional<float> &availableMemoryProportion, const SerialiseSettings &serialization_, const OptionalDataType &outputType, const MatMulPartialsType &partialsType_ = MatMulPartialsType::FLOAT)
-
~MatMulOp() override = default
-
void setup() final
-
inline void setCanCreateInputs(bool value)
-
inline bool getCanCreateInputs() const
-
inline float getSubgraphValue() const final
-
bool isPow2ScaledMatMul() const
-
class MatMulRhsGradOp : public popart::MatMulBaseGradOp
Public Functions
-
MatMulRhsGradOp(const MatMulRhsGradOp&) = default
-
MatMulRhsGradOp &operator=(const MatMulRhsGradOp&) = delete
-
~MatMulRhsGradOp() override = default
-
void setup() final
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class MaxArgGradOp : public popart::NonLinearVariadicGradOp
-
class MaxOp : public popart::VariadicOp
-
class MaxPoolGradOp : public popart::Op
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
class MaxPoolOp : public popart::HasReceptiveFieldOp
Public Functions
-
MaxPoolOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &kernelShape_, int64_t storageOrder, const HasReceptiveFieldOp::ReceptiveOpAttributes &attributes, const Op::Settings &settings)
-
int64_t getNOutChans() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
bool canBeReplacedByIdentity() const override
-
class MeanArgGradOp : public popart::LinearVariadicGradOp
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
inline bool hasScale() const final
-
inline float getScale() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
class MeanOp : public popart::VariadicOp
-
class MinArgGradOp : public popart::NonLinearVariadicGradOp
-
class MinOp : public popart::VariadicOp
-
class ModifyRandomSeedOp : public popart::Op
Public Functions
-
ModifyRandomSeedOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline bool isOutlineable() const final
-
class MulArg0GradOp : public popart::ElementWiseBinaryArg0GradOp
-
class MulArg1GradOp : public popart::ElementWiseBinaryArg1GradOp
-
class MulLhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp
-
class MulRhsInplaceOp : public popart::ElementWiseBinaryInplaceRhsOp
-
class MultiCollectiveBaseOp : public popart::CollectivesBaseOp
The base class for a multi-collective which performs all-gather, all-reduce, or reduce-scatter operations on lists of tensors by first merging them into a larger tensor.
This improves bandwidth utilization and decreases the number of syncs needed.
Subclassed by popart::MultiReplicatedAllGatherOp, popart::MultiReplicatedAllReduceOp, popart::MultiReplicatedReduceScatterOp
Public Functions
-
MultiCollectiveBaseOp(const OperatorIdentifier &operatorIdentifier, CommGroup commGroup, const Op::Settings &settings, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet)
Constructor for the MultiCollectiveBaseOp.
- Parameters
operatorIdentifier – the identifier for the constructed op
commGroup – all of the inputs will be reduce-scattered across the same communication group
settings – the settings of the op are shared across all inputs
outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedReduceScatterOp for that tensor
inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph
outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph
-
MultiCollectiveBaseOp(const OperatorIdentifier &operatorIdentifier, const ReplicaGrouping &grouping, const Op::Settings &settings, const std::vector<TensorInfo> &outInfoFromBaseOps, const std::vector<VGraphIdAndTileSet> &inputVirtualGraphIdAndTileSet, const std::vector<VGraphIdAndTileSet> &outputVirtualGraphIdAndTileSet)
Constructor for the MultiCollectiveBaseOp.
- Parameters
operatorIdentifier – the identifier for the constructed op
grouping – all of the inputs will be reduce-scattered across the same communication group
settings – the settings of the op are shared across all inputs
outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedReduceScatterOp for that tensor
inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph
outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph
-
virtual std::unique_ptr<Op> clone() const override = 0
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() override
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex in) const
Get virtual graph ID and tile set associated with an input index.
- Parameters
InIndex – The input index.
- Returns
The virtual graph ID and tile set at the input index.
-
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex out) const
Get virtual graph ID and tile set associated with an output index.
- Parameters
OutIndex – The output index.
- Returns
The virtual graph ID and tile set at the output index.
-
virtual VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex in, std::set<OpId> &visited) const override
Get virtual graph ID and tile set associated with an input index.
- Parameters
InIndex – The input index.
visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.
- Returns
The virtual graph ID and tile set at the input index.
-
virtual VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex out, std::set<OpId> &visited) const override
Get virtual graph ID and tile set associated with an output index.
- Parameters
OutIndex – The output index.
visited – The set of labels associated with this operator to distinguish it from other operators in the virtual graph.
- Returns
The virtual graph ID and tile set at the output index.
-
virtual void growAliasModel(AliasModel &m) const override
For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.
The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.
To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
- Pre
All input tensors of this op have mappings in aliasModel before the call to growAliasModel().
- Post
All output tensors of this op have mappings in aliasModel after the call to growAliasModel().
-
class MultiConvBaseOp : public popart::Op
Subclassed by popart::ConvOp, popart::MultiConvOp
Public Functions
-
MultiConvBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, std::vector<int64_t> flatStrides_, std::vector<int64_t> flatPads_, std::vector<int64_t> flatDilations_, const AutoPad &padType_, const MultiConvOptions &convOpts_)
-
void setup() override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
inline virtual int numConvs() const
-
inline int64_t getNSpatialDims(int convIndex) const
-
inline int64_t getGroups(int convIndex) const
-
inline int64_t getNOutChans(int convIndex) const
-
inline int64_t getNInChans(int convIndex) const
-
ConvParameters getParameters(int convIndex) const
-
virtual void restoreAttributesFromParams(const std::vector<ConvParameters>&)
-
inline const MultiConvOptions &getConvOptions() const
-
inline void setConvOptions(const MultiConvOptions &opts)
-
int64_t getCumulativeSpatialDims(int64_t i) const
-
ConvStrides getStrides(int64_t convIndex) const
-
ConvDilations getDilations(int64_t convIndex) const
-
ConvDilations getInDilations(int64_t convIndex) const
Public Static Functions
-
static void appendConvParameterAttributes(const ConvParameters&, const std::string&, OpSerialiserBase&)
-
class MultiConvDataGradBaseOp : public popart::Op
Subclassed by popart::ConvDataGradOp, popart::MultiConvDataGradOp
Public Functions
-
MultiConvDataGradBaseOp(const MultiConvBaseOp&, const OperatorIdentifier&)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
inline const std::vector<GradInOutMapper> &gradInputInfo() const final
-
inline const std::map<int, int> &gradOutToNonGradIn() const final
-
inline const ConvParameters &getParameters(int convIndex) const
-
inline virtual int numConvs() const
-
inline const MultiConvOptions &getConvOptions() const
-
inline void setConvOptions(const MultiConvOptions &opts)
-
inline TensorInfo getDataInfo(int convIndex) const
-
class MultiConvDataGradOp : public popart::MultiConvDataGradBaseOp
Public Functions
-
MultiConvDataGradOp(const MultiConvOp&)
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
class MultiConvOp : public popart::MultiConvBaseOp
Public Functions
-
MultiConvOp(const OperatorIdentifier &_opid, const Settings &settings_, const std::vector<int64_t> &flatStrides_, const std::vector<int64_t> &flatPads_, const std::vector<int64_t> &flatDilations_, const MultiConvOptions &mcOpts_)
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
class MultiConvWeightsGradBaseOp : public popart::Op
Subclassed by popart::ConvWeightsGradOp, popart::MultiConvWeightsGradOp
Public Functions
-
MultiConvWeightsGradBaseOp(const MultiConvBaseOp&, const OperatorIdentifier&)
-
void setup() final
-
inline const std::vector<GradInOutMapper> &gradInputInfo() const final
-
inline const std::map<int, int> &gradOutToNonGradIn() const final
-
inline float getSubgraphValue() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline virtual int numConvs() const
-
inline const ConvParameters &getParameters(int convIndex) const
-
inline const MultiConvOptions &getConvOptions() const
-
class MultiConvWeightsGradOp : public popart::MultiConvWeightsGradBaseOp
Public Functions
-
MultiConvWeightsGradOp(const MultiConvOp&)
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
class MultiExchangeOp : public popart::ExchangeBaseOp
Public Functions
-
MultiExchangeOp(const OperatorIdentifier&, const Op::Settings&, const std::vector<ExchangeDescriptor>)
-
void setup() final
-
virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
int numLoads() const
-
int numStores() const
-
inline bool isRemote(int index)
-
inline void setRemoteBufferId(int index, RemoteBufferId remotebuffer_id)
-
inline RemoteBufferId getRemoteBufferId(int index) const
-
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const final
-
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const final
-
inline void growAliasModel(AliasModel &m) const override
-
inline bool canShard() const final
-
bool hasSideEffect() const final
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
-
inline int getNumExchanges() const final
-
ExchangeDescriptor getExchangeDescriptor(int index) const final
-
std::pair<int, int> inIndexToDescriptorIndex(InIndex index) const override
Map input index to a tuple of integers (a,b) that corresponds to the input associated with index.
That is, the bth input of getExchangeDescriptor(a) corresponds to the input at index.
- Parameters
index – the input index to look up.
- Returns
a pair of integers comprising the index of the descriptor and the index of the input within that descriptor.
-
std::pair<int, int> outIndexToDescriptorIndex(OutIndex index) const override
Map output index to a tuple of integers (a,b) that corresponds to the output associated with index.
That is, the bth output of getExchangeDescriptor(a) corresponds to the output at index.
- Parameters
index – the output index to look up.
- Returns
a pair of integers comprising the index of the descriptor and the index of the output within that descriptor.
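To illustrate the (a,b) mapping described above: if descriptor 0 has two inputs and descriptor 1 has three, input index 3 maps to (1,1). A standalone sketch of such a lookup, assuming descriptors consume contiguous runs of indices (a hypothetical helper, not the PopART implementation):
C++:
#include <utility>
#include <vector>

std::pair<int, int> toDescriptorIndex(const std::vector<int> &countPerDescriptor,
                                      int index) {
  for (int a = 0; a < static_cast<int>(countPerDescriptor.size()); ++a) {
    if (index < countPerDescriptor[a]) {
      return {a, index}; // index is the b-th entry of descriptor a
    }
    index -= countPerDescriptor[a];
  }
  return {-1, -1}; // out of range
}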
-
class MultiReplicatedAllReduceOp : public popart::MultiCollectiveBaseOp
A multi-collective class for performing an all-reduce operation on a list of tensors.
The tensors will be merged into a single large tensor and reduced as one, leading to better bandwidth utilization and fewer syncs between replicas than doing the all-reduce on a per-tensor basis. The class supports mixing in-place and out-place all-reduce operations, but requires that all tensors use the same collective group, i.e. the reduction is over the same replicas. This op is usually constructed in the MergeCollectivesTransform.
Public Functions
-
MultiReplicatedAllReduceOp(CollectiveOperator collectiveOperator, CommGroup commGroup, const Settings &settings, std::vector<bool> modifiesIndexInplace, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet)
Constructor for the MultiReplicatedAllReduceOp.
- Parameters
collectiveOperator – the collective operator is the same for all input tensors
commGroup – all of the inputs will be reduced across the same communications group
settings – the settings of the op are shared across all inputs
modifiesIndexInplace – for each of the inputs, specify whether it should be modified in place
outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedAllReduceOp for that tensor
inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph
outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph
-
MultiReplicatedAllReduceOp(CollectiveOperator collectiveOperator, const ReplicaGrouping &grouping, const Settings &settings, const std::vector<bool> &modifiesIndexInplace, const std::vector<TensorInfo> &outInfoFromBaseOps, const std::vector<VGraphIdAndTileSet> &inputVirtualGraphIdAndTileSet, const std::vector<VGraphIdAndTileSet> &outputVirtualGraphIdAndTileSet)
Constructor for the MultiReplicatedAllReduceOp.
- Parameters
collectiveOperator – the collective operator is the same for all input tensors
grouping – all of the inputs will be reduced across the same communications group
settings – the settings of the op are shared across all inputs
modifiesIndexInplace – for each of the inputs, specify whether it should be modified in place
outInfoFromBaseOps – the output information for each tensor, usually inherited from a ReplicatedAllReduceOp for that tensor
inputVirtualGraphIdAndTileSet – each input tensor has its own associated virtual graph
outputVirtualGraphIdAndTileSet – each output tensor has its own associated virtual graph
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
inline virtual float getSubgraphValue() const final
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. There are high bounding values, retrieved by getHighSubgraphValue() (for expensive ops such as Conv), and low bounding values, retrieved by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
-
inline CollectiveOperator getCollectiveOp() const
Returns the type of the collective operation used in the all-reduce, e.g. addition. The same collective operator is used across all the inputs to be reduced.
-
virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
Return which inputs and outputs are replicated tensor sharding pairs.
-
virtual view::Regions modifies(InIndex index) const override
Return the input region which this op modifies (for inplace ops).
- Parameters
InIndex – The input index.
- Returns
The regions which this op modifies.
-
virtual view::Regions aliases(InIndex in, OutIndex out) const override
Return the input region which the op output will alias (for inplace and view-changing ops).
See also
For more information on views, refer to the IPU Programmer’s Guide.
- Parameters
InIndex – The input index.
OutIndex – The output index.
- Returns
The regions which the output will alias.
-
virtual void growAliasModel(AliasModel &m) const override
For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.
The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.
To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
- Pre
All input tensors of this op have mappings in aliasModel before the call to growAliasModel().
- Post
All output tensors of this op have mappings in aliasModel after the call to growAliasModel().
-
virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
Determine whether output tensors are guaranteed to have an equal value across all replicas.
This means that they are “replica equal”. The check is based on information about the replica equal status of input tensors (and the same for any inputs that are modified by the op).
The default implementation sets each output tensor as being replica-equal if and only if all tensor inputs are replica-equal. For modified inputs, the default is to assume it is replica-equal only if there is an output that is deemed replica-equal that fully aliases all elements of the input. This default implementation is not correct for all ops. Ops that need a specialized implementation should override this virtual function.
- Parameters
aliasModel – An alias model object.
inputMap – A map that stores, for each input, whether the inputs are data-equivalent over all replicas.
proxy – A helper object passed in by the replica-equal analysis.
- Returns
A tuple comprising:
a mapping from output index to a replica-equal status with an entry for each output tensor.
a vector of input indices for inputs that were modified by the op to a value that is not replica-equal.
-
MultiReplicatedAllReduceOp(CollectiveOperator collectiveOperator, CommGroup commGroup, const Settings &settings, std::vector<bool> modifiesIndexInplace, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet)
-
class NearbyIntOp : public popart::OneWayUnaryOp
Public Functions
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class NegateGradOp : public popart::NegateOp
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class NegateOp : public popart::ElementWiseUnaryOp
Subclassed by popart::NegateGradOp
-
class NllGradOp : public popart::Op
Public Functions
-
NllGradOp(const TensorId &lossId, const nonstd::optional<int> ignoreIndex, const ReductionType reduction, const bool inputIsLogProbability, const Op::Settings &settings)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline ReductionType getReductionType() const
-
inline bool hasIgnoreIndex() const
-
inline nonstd::optional<int> getOptionalIgnoreIndex() const
-
int getIgnoreIndex() const
-
inline bool inputIsLogProbability() const
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline bool canShard() const override
-
class NllOp : public popart::LossOp
Public Functions
-
NllOp(const OperatorIdentifier &_opid, const nonstd::optional<int> ignoreIndex, const ReductionType reduction, bool inputIsLogProbability, const Op::Settings &settings_)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline bool hasIgnoreIndex() const
-
inline nonstd::optional<int> getOptionalIgnoreIndex() const
-
int getIgnoreIndex() const
-
inline bool inputIsLogProbability() const
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline bool canShard() const override
-
inline ReductionType getShardReductionType(OutIndex index) const override
-
class NlllWithSoftmaxGradDirectOp : public popart::Op
Public Functions
-
NlllWithSoftmaxGradDirectOp(const nonstd::optional<int> ignoreIndex, const ReductionType reduction, const Op::Settings &settings)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline ReductionType getReductionType() const
-
inline bool hasIgnoreIndex() const
-
inline int getIgnoreIndex() const
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline bool canShard() const override
-
inline ReductionType getShardReductionType(OutIndex index) const override
-
class NonLinearVariadicGradOp : public popart::VariadicGradOp
Subclassed by popart::MaxArgGradOp, popart::MinArgGradOp
Public Functions
-
NonLinearVariadicGradOp(const OperatorIdentifier &_opid, const VariadicOp&, InIndex)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class NopOp : public popart::ElementWiseUnaryOp
-
class NotOp : public popart::ElementWiseUnaryOp
-
class NormalizeImageOp : public popart::Op
Public Functions
-
NormalizeImageOp(const popart::OperatorIdentifier &_opid, const popart::Op::Settings &settings_, float _scale)
-
void setup() override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
inline float getScale() const
-
bool canBeReplacedByIdentity() const override
Public Static Functions
-
static OperatorIdentifier getOpId(const Ir &ir)
-
static inline std::string opName()
-
class OneWayUnaryInPlaceOp : public popart::ElementWiseInplaceUnaryOp
Subclassed by popart::CeilInplaceOp, popart::FloorInplaceOp, popart::NearbyIntInplaceOp, popart::RoundInplaceOp, popart::SignInplaceOp
-
class OneWayUnaryOp : public popart::ElementWiseUnaryOp
Subclassed by popart::CeilOp, popart::FloorOp, popart::NearbyIntOp, popart::RoundOp, popart::SignOp
-
class OnehotGradOp : public popart::Op
Public Functions
-
void setup() override
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
inline int64_t getAxis() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
class OnehotOp : public popart::Op
Public Functions
-
OnehotOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings &settings_)
-
void setup() override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline int64_t getAxis() const
-
inline float getSubgraphValue() const final
-
class OrOp : public popart::BinaryComparisonOp
-
class PReluOp : public popart::ElementWiseBinaryOp
-
class PackedDataBlockOp : public popart::Op
Public Functions
-
PackedDataBlockOp(const OperatorIdentifier&, const std::vector<int64_t> &maxSequenceLengths, int64_t resultSize, int64_t callbackBatchSize, Graph &callback, const Op::Settings&)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
-
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex, std::set<OpId> &visited) const override
-
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex, std::set<OpId> &visited) const override
-
int64_t numCallbackInputs() const
-
int64_t numDataInputs() const
-
int64_t getCallbackIterations() const
-
std::vector<PackedSequences> getPackedInputs()
-
PackedSequences getPackedOutput()
-
inline int64_t getCallbackBatchSize()
-
inline std::vector<int64_t> getMaxSequenceLengths()
-
inline int64_t getMaxSequenceLength(int64_t dataIndex)
-
std::vector<TensorInfo> callbackSequenceInInfos()
-
class PadOp : public popart::BasePadOutplaceOp
Public Functions
-
PadOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &_pads, const std::vector<unsigned> &_flips, float value_, const std::string &_mode, const Op::Settings &settings_)
-
template<typename TDerivedOp, typename TOpParams>
class ParameterizedOp : public popart::Op
Generic base class for simple ops with parameterized attributes.
The aim of this class is to group all the common logic in the implementation of custom ops. In particular, it forces gathering all parameters/attributes into a proper data structure, helping to generalize the rest of the code.
- Template Parameters
TDerivedOp – CRTP template type.
TOpParams – Structure containing the op parameters.
Public Functions
-
inline ParameterizedOp(const popart::OperatorIdentifier &_opid, const ParamsType &_params, const popart::Op::Settings &_settings)
Construct a custom op.
- Parameters
_opid – Operator id (default one if not provided).
_params – Operation parameters.
_settings – Settings.
-
inline ParameterizedOp(const ParamsType &_params, const popart::Op::Settings &_settings)
-
template<typename T>
inline ParameterizedOp(const popart::OperatorIdentifier &_opid, const ParameterizedOp<T, TOpParams> &_op)
Construct a custom op from another op with the same parameters.
Typically, this constructor builds a grad op from a fwd op.
- Template Parameters
T – Op input type.
- Parameters
_opid – Operator identifier (default one if not provided).
_op – Operation to extract setting and parameters from.
-
template<typename T>
inline ParameterizedOp(const ParameterizedOp<T, TOpParams> &_op)
-
inline virtual std::unique_ptr<Op> clone() const override
Clone the operator.
NOTE: this uses the CRTP trick to provide a generic implementation.
- Returns
std::unique_ptr<Op> A unique pointer to the op.
-
inline virtual void appendAttributes(popart::OpSerialiserBase &os) const override
Append attributes when serialising the op to a stream.
This is used for debugging and also to generate the PopART IR hash. This hash is used to determine whether a Poplar cache can be reused so it is important that op attributes which may alter the Poplar compilation are appended to this stream. If this method is overridden, then it must also call the base class method.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
inline virtual void appendOutlineAttributes(popart::OpSerialiserBase &os) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
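For example, an op whose attribute changes the generated code (here a hypothetical float member alpha_) would append it so that two ops differing only in that attribute are not outlined together. A minimal sketch, assuming a custom op MyOp:
C++:
void MyOp::appendOutlineAttributes(popart::OpSerialiserBase &os) const {
  Op::appendOutlineAttributes(os);     // always call the base class method
  os.appendAttribute("alpha", alpha_); // attribute that alters compilation
}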
-
inline virtual float getSubgraphValue() const override
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
-
inline virtual bool requiresRandomSeed() const override
Check if the op requires a random seed.
This is set to false by default and should be overridden and set to true if an IPU random seed tensor is required by the op. If so, it will be connected to inTensor(getSeedInIndex()) by the IR process.
- Returns
true if the op requires a random seed, false otherwise.
Public Static Functions
-
static inline std::unique_ptr<TDerivedOp> createOpFromCreatorInfo(const popart::OpCreatorInfo &info)
Build the op from a PopART OpCreatorInfo data structure.
- Parameters
info – The OpCreatorInfo to use.
- Returns
A unique pointer to the op created.
-
static inline TDerivedOp *createOpInGraph(popart::Graph &graph, const std::map<popart::InIndex, popart::TensorId> &in, const std::map<popart::OutIndex, popart::TensorId> &out, const popart::OperatorIdentifier &opid, const TOpParams ¶ms, const popart::Op::Settings &settings)
Create the custom op connected in a graph.
- Parameters
graph – Graph in which to create and connect the op.
in – Map of input tensor ids (i.e. names).
out – Map of output tensor ids (i.e. names).
opid – PopART operator identifier (default one if not provided).
params – Custom op parameters.
settings – Custom op settings.
- Returns
A pointer to the custom op created (owned by the graph).
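Putting the pieces together, a custom op built on ParameterizedOp might look as follows. This is a sketch only: LeakyReluParams and LeakyReluOp are illustrative names, and the makeFromAttributes / appendAttributes / defaultOperatorId hooks are assumptions suggested by the constructors and factory functions above, not confirmed API:
C++:
// Hypothetical parameter structure gathering the op's attributes.
struct LeakyReluParams {
  float alpha = 0.01f;
  // Assumed hook: build the params from ONNX node attributes.
  static LeakyReluParams makeFromAttributes(const popart::Attributes &attr) {
    LeakyReluParams params;
    params.alpha = attr.getAttribute<popart::Attributes::Float>("alpha", 0.01f);
    return params;
  }
  // Assumed hook: serialise the params for outlining / IR hashing.
  void appendAttributes(popart::OpSerialiserBase &os) const {
    os.appendAttribute("alpha", alpha);
  }
};

class LeakyReluOp
    : public popart::ParameterizedOp<LeakyReluOp, LeakyReluParams> {
public:
  using ParameterizedOp::ParameterizedOp; // inherit the constructors
  // Assumed hook: the default OperatorIdentifier used when none is provided.
  static popart::OperatorIdentifier defaultOperatorId() {
    return popart::OperatorIdentifier{"custom.ops", "LeakyRelu", 1};
  }
  void setup() override { outInfo(0) = inInfo(0); } // same shape/type as input
};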
-
class PopartLSTMGradOp : public popart::Op
Gradient operator for PopartLSTMOp.
Public Functions
-
PopartLSTMGradOp(const PopartLSTMOp&)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
virtual const std::vector<GradInOutMapper> &gradInputInfo() const final
Get the mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.
This method throws an error if the op this is called on is not a grad op.
- Returns
The mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.
-
virtual const std::map<int, int> &gradOutToNonGradIn() const final
Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.
This method throws an error if the op this is called on is not a grad op.
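As an illustration of these two mappings, consider a hypothetical grad op (MyGradOp, not part of the API) that consumes the gradient of the forward op's output 0 plus the forward op's input 0, and produces the gradient of the forward op's input 0:
C++:
// Minimal sketch, assuming MyGradOp derives from popart::Op and declares
// these overrides.
const std::vector<popart::GradInOutMapper> &MyGradOp::gradInputInfo() const {
  static const std::vector<popart::GradInOutMapper> inInfo = {
      {0, 0, popart::GradOpInType::GradOut}, // input 0 = gradient of fwd output 0
      {1, 0, popart::GradOpInType::In}};     // input 1 = fwd input 0
  return inInfo;
}

const std::map<int, int> &MyGradOp::gradOutToNonGradIn() const {
  static const std::map<int, int> outInfo = {
      {0, 0}}; // output 0 = gradient of fwd input 0
  return outInfo;
}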
-
inline virtual float getSubgraphValue() const final
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
-
virtual std::set<InIndex> optionalInputs() const final
Return the input indices of all optional inputs to the op.
-
int64_t getInputSize() const
-
int64_t getMaxSeqLength() const
-
int64_t getBatchSize() const
-
int64_t getHiddenSize() const
-
inline ActivationFunction getActivation() const
-
inline ActivationFunction getRecurrentActivation() const
Public Members
-
const bool outputFullSequence
-
class PopartLSTMOp : public popart::Op
Public Functions
-
PopartLSTMOp(const OperatorIdentifier&, bool outputFullSequence_, const Op::Settings&, const nonstd::optional<float> available_memory_proportion_ = nonstd::nullopt)
-
PopartLSTMOp(const OperatorIdentifier&, bool outputFullSequence_, ActivationFunction activation, ActivationFunction recurrent_activation, const Op::Settings&, const nonstd::optional<float> available_memory_proportion_ = nonstd::nullopt)
-
void setup() final
-
inline float getSubgraphValue() const final
-
bool hasBiasesInput() const
-
bool hasSeqLenInput() const
-
int64_t getMaxSeqLength() const
-
int64_t getBatchSize() const
-
int64_t getInputSize() const
-
int64_t getHiddenSize() const
-
int64_t getNumIntermediates() const
-
nonstd::optional<float> getAvailableMemoryProportion() const
-
inline ActivationFunction getActivation() const
-
inline ActivationFunction getRecurrentActivation() const
Public Members
-
const bool outputFullSequence
-
class PowArg0GradOp : public popart::ElementWiseBinaryArg0GradOp
-
class PowArg1GradOp : public popart::ElementWiseBinaryArg1GradOp
-
class PowLhsInplaceOp : public popart::ElementWiseBinaryInplaceLhsOp
-
class PrintTensorOp : public popart::ElementWiseUnaryOp
Public Functions
-
PrintTensorOp(const OperatorIdentifier&, bool printSelf, bool printGradient, const std::string &title, const Op::Settings&)
-
PrintTensorOp(const OperatorIdentifier&, bool printSelf, bool printGradient, const std::string &title, const PrintTensorFmt &fmt, const Op::Settings&)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void appendOutlineAttributes(OpSerialiserBase &os) const final
-
inline bool canBeReplacedByIdentity() const final
-
inline bool hasSideEffect() const override
-
inline bool shouldPrint() const
-
inline const std::string &getTitle() const
-
inline void setTitle(std::string title_)
-
inline const PrintTensorFmt &getFmt() const
-
class RMSPropUpdaterOp : public popart::Op
Public Functions
-
RMSPropUpdaterOp(OptimizerValue eps, bool TFVariant, const Op::Settings&)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSubgraphValue() const final
-
inline bool isOptimizerOp() const override
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final
-
class RNNGradOp : public popart::BaseOnnxRNNGradOp
Gradient operator for RNNOp.
Public Members
-
const ActivationFunction activation_attribute
-
class RNNOp : public popart::BaseOnnxRNNOp
This op applies a single-layer Elman RNN with a non-linearity to a batch of input sequences.
The op follows the ONNX specification described in https://github.com/onnx/onnx/blob/main/docs/Operators.md#RNN
For each batch element, the following output is computed:
\[ h_t = f(W x_t + b_x + R h_{t-1} + b_h) \]
where:
\(f\) is a supported nonlinearity function
\(W\) is the input weight
\(x_t\) is the t’th element of the input sequence
\(R\) is the recurrence weight matrix
\(h_{t-1}\) is the previous output sequence element. \(h_0\) can be provided by the user
\(b_x\) and \(b_h\) are the input and recurrence biases respectively
The op outputs the full sequence \(h_1, h_2, ...\), as well as the last element of the sequence.
If the biases or \(h_0\) are not set, they are considered to be 0 and are not trained (they are treated as constant 0s in the model).
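The recurrence is easy to see in a plain C++ sketch of a single timestep. This is not PopART API, only the arithmetic the op performs, with \(f = \tanh\) chosen as an example nonlinearity:
C++:
#include <cmath>
#include <cstddef>
#include <vector>

// One Elman RNN step: h_t = f(W x_t + b_x + R h_prev + b_h).
// W is H x I, R is H x H, bx and bh have size H.
std::vector<float> rnnStep(const std::vector<std::vector<float>> &W,
                           const std::vector<std::vector<float>> &R,
                           const std::vector<float> &bx,
                           const std::vector<float> &bh,
                           const std::vector<float> &xt,
                           const std::vector<float> &hPrev) {
  const std::size_t H = R.size();
  std::vector<float> h(H, 0.0f);
  for (std::size_t i = 0; i < H; ++i) {
    float acc = bx[i] + bh[i];
    for (std::size_t j = 0; j < xt.size(); ++j)
      acc += W[i][j] * xt[j];    // W x_t
    for (std::size_t j = 0; j < H; ++j)
      acc += R[i][j] * hPrev[j]; // R h_{t-1}
    h[i] = std::tanh(acc);       // f
  }
  return h;
}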
Public Functions
-
RNNOp(const OperatorIdentifier &_opid, ActivationFunction activation, nonstd::optional<int64_t> hidden_size, const Op::Settings &settings_)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual std::vector<std::unique_ptr<Op>> getGradOps() final
Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.
There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.
The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.
Throws an error if this op is already a gradient op.
-
virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const override
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
virtual int getInBatchAxis(InIndex) const override
Get the batch axis for the input index.
- Returns
The batch axis for the input index.
-
virtual int getOutBatchAxis(OutIndex) const override
Get the batch axis for the output index.
- Returns
The batch axis for the output index.
-
inline virtual bool isOutlineable() const override
Check if the op can be outlined.
If this method returns false, any possible subgraph that this op is part of will not be cached.
- Returns
true if the op can be outlined, false otherwise. Default: true.
-
inline virtual std::string getName() const final
Public Members
-
const ActivationFunction activation_attribute
-
class RandomBaseOp : public popart::ShapeOrLikeOp
Subclassed by popart::DropoutBaseOp, popart::RandomNormalBaseOp, popart::RandomUniformBaseOp
Public Functions
-
RandomBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, const Op::Settings &settings_)
-
inline bool requiresRandomSeed() const final
Public Static Functions
-
static void errorIfSeedIsSet(const Attributes &attr, OperatorIdentifier opid)
-
class RandomNormalBaseOp : public popart::RandomBaseOp
Subclassed by popart::RandomNormalLikeOp, popart::RandomNormalOp
Public Functions
-
RandomNormalBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float mean_, float scale_, const Op::Settings &settings_)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getMean() const
-
inline float getScale() const
-
class RandomNormalLikeOp : public popart::RandomNormalBaseOp
Public Functions
-
RandomNormalLikeOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float mean_, float scale_, const Op::Settings &settings_)
-
void setup() final
-
std::unique_ptr<RandomNormalOp> foldInputTensor(const Op::Settings&) const
-
class RandomNormalOp : public popart::RandomNormalBaseOp
-
class RandomUniformBaseOp : public popart::RandomBaseOp
Subclassed by popart::RandomUniformLikeOp, popart::RandomUniformOp
Public Functions
-
RandomUniformBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float high_, float low_, const Op::Settings &settings_)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getHigh() const
-
inline float getLow() const
-
class RandomUniformLikeOp : public popart::RandomUniformBaseOp
Public Functions
-
RandomUniformLikeOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, float high_, float low_, const Op::Settings &settings_)
-
void setup() final
-
std::unique_ptr<RandomUniformOp> foldInputTensor(const Op::Settings&) const
-
class RandomUniformOp : public popart::RandomUniformBaseOp
-
class ReciprocalGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class ReciprocalOp : public popart::ElementWiseUnaryOp
-
class ReduceGradOp : public popart::Op
Subclassed by popart::ReduceL1GradOp, popart::ReduceL2GradOp, popart::ReduceLogSumExpGradOp, popart::ReduceLogSumGradOp, popart::ReduceMaxGradOp, popart::ReduceMeanGradOp, popart::ReduceMedianGradOp, popart::ReduceMinGradOp, popart::ReduceProdGradOp, popart::ReduceSumGradOp, popart::ReduceSumSquareGradOp
Public Functions
-
ReduceGradOp(const AiGraphcoreOpIdV1 &opid, const ReduceOp &fwdOp, const Shape &backward_shape)
-
void setup() override
-
const std::vector<int64_t> &getAxes() const
-
const std::vector<GradInOutMapper> &gradInputInfo() const override
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
inline float getSubgraphValue() const final
-
class ReduceL1GradOp : public popart::ReduceGradOp
Public Functions
-
ReduceL1GradOp(const ReduceL1Op &fwdOp, const Shape &backward_shape)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class ReduceL2GradOp : public popart::ReduceGradOp
Public Functions
-
ReduceL2GradOp(const ReduceL2Op &fwdOp, const Shape &backward_shape)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class ReduceLogSumExpGradOp : public popart::ReduceGradOp
Public Functions
-
ReduceLogSumExpGradOp(const ReduceLogSumExpOp &fwdOp, const Shape &backward_shape)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class ReduceLogSumGradOp : public popart::ReduceGradOp
Public Functions
-
ReduceLogSumGradOp(const ReduceLogSumOp &fwdOp, const Shape &backward_shape)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class ReduceMaxGradOp : public popart::ReduceGradOp
Public Functions
-
ReduceMaxGradOp(const ReduceMaxOp &fwdOp, const Shape &backward_shape)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class ReduceMeanGradOp : public popart::ReduceGradOp
Public Functions
-
ReduceMeanGradOp(const ReduceMeanOp &fwdOp, const Shape &backward_shape)
-
class ReduceMedianGradOp : public popart::ReduceGradOp
Public Functions
-
ReduceMedianGradOp(const ReduceMedianOp &fwd_op, const Shape &backward_shape)
-
const std::vector<GradInOutMapper> &gradInputInfo() const override
-
class ReduceMinGradOp : public popart::ReduceGradOp
Public Functions
-
ReduceMinGradOp(const ReduceMinOp &fwdOp, const Shape &backward_shape)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class ReduceOp : public popart::Op
Subclassed by popart::ReduceL1Op, popart::ReduceL2Op, popart::ReduceLogSumExpOp, popart::ReduceLogSumOp, popart::ReduceMaxOp, popart::ReduceMeanOp, popart::ReduceMedianOp, popart::ReduceMinOp, popart::ReduceProdOp, popart::ReduceSumOp, popart::ReduceSumSquareOp
Public Functions
-
ReduceOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
-
void setup() override
-
const std::vector<int64_t> &getAxes() const
-
bool getKeepDims() const
-
void setAxes(std::vector<int64_t> value)
-
void setKeepDims(int64_t value)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
bool canBeReplacedByIdentity() const override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
class ReduceProdGradOp : public popart::ReduceGradOp
Public Functions
-
ReduceProdGradOp(const ReduceProdOp &fwdOp, const Shape &backward_shape)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class ReduceSumGradOp : public popart::ReduceGradOp
Public Functions
-
ReduceSumGradOp(const ReduceSumOp &fwdOp, const Shape &backward_shape)
-
class ReduceSumOp : public popart::ReduceOp
Subclassed by popart::AddArg0GradOp, popart::AddArg1GradOp, popart::AddBiasBiasGradOp, popart::SubtractArg0GradOp
-
class ReduceSumSquareGradOp : public popart::ReduceGradOp
Public Functions
-
ReduceSumSquareGradOp(const ReduceSumSquareOp &fwdOp, const Shape &backward_shape)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
class ReduceSumSquareOp : public popart::ReduceOp
Public Functions
-
ReduceSumSquareOp(const OperatorIdentifier &_opid, const nonstd::optional<std::vector<int64_t>> &axes, const int64_t keepdims, const Op::Settings &settings)
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final
-
void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, CommGroup shardingDomain) final
-
void configureForReplicatedTensorSharding(ReplicatedTensorShardingIndices indices, const ReplicaGrouping &grouping) final
-
class ReluInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class ReluOp : public popart::ElementWiseUnaryOp
Public Functions
-
ReluOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class RemoteBaseOp : public popart::ExchangeBaseOp
Subclassed by popart::RemoteLoadOp, popart::RemoteStoreOp
Public Functions
-
inline RemoteBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, RemoteBufferId rbid_)
-
inline virtual RemoteBufferId getRemoteBufferId() const final
-
inline virtual bool canShard() const final
-
inline virtual void setRemoteBufferId(RemoteBufferId remoteBufferId_) final
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
-
class RemoteLoadInplaceOp : public popart::RemoteLoadOp
Remote Load Inplace Op.
See also
RemoteLoadOp for an explanation.
Public Functions
-
RemoteLoadInplaceOp(const OperatorIdentifier&, const Op::Settings&, RemoteBufferId rbid_ = -1UL)
Construct the RemoteLoadInplaceOp.
See the constructor of the parent class for the input parameters.
-
RemoteLoadInplaceOp(const RemoteLoadOp&)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual view::Regions modifies(InIndex) const final
Return the input region which this op modifies (for inplace ops).
- Parameters
InIndex – The input index.
- Returns
The regions which this op modifies.
-
virtual view::Regions aliases(InIndex, OutIndex) const final
Return the input region which the op output will alias (for inplace and view-changing ops).
See also
For more information on views, refer to the IPU Programmer’s Guide.
- Parameters
InIndex – The input index.
OutIndex – The output index.
- Returns
The regions which the output will alias.
-
virtual view::RegMap fwdRegMap(InIndex, OutIndex) const final
Map regions of the input tensor at the input index to the regions of the output tensor at the output index that these input regions alias.
- Parameters
InIndex – The op input index.
OutIndex – The op output index.
-
virtual view::RegMap bwdRegMap(InIndex, OutIndex) const final
Map regions of the output tensor at the output index to the regions of the input tensor at the input index that these output regions alias.
- Parameters
InIndex – The op input index.
OutIndex – The op output index.
-
virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
Return the variants of this op (if any) which can modify / alias the inputs at the given indices.
This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.
-
virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().
- Parameters
OperatorIdentifier – The operator identifier of the op to be instantiated.
- Returns
An instance of the required op.
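In general, an outplace op opts into this mechanism by advertising its in-place variants and constructing them on demand. A minimal sketch with hypothetical MyOp / MyOpInplace classes and a myOpInplaceId identifier (all assumed names, not part of the API):
C++:
std::vector<std::tuple<popart::OperatorIdentifier, float>>
MyOp::inplacePriorityDefault() const {
  // Higher float values indicate a more preferred variant.
  return {{myOpInplaceId, 10.0f}};
}

std::unique_ptr<popart::Op>
MyOp::getInplaceVariant(const popart::OperatorIdentifier &id) const {
  if (id == myOpInplaceId) {
    return std::make_unique<MyOpInplace>(*this);
  }
  return Op::getInplaceVariant(id); // the base class errors on unknown ids
}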
-
ExchangeDescriptor getExchangeDescriptor(int index) const final
-
class RemoteLoadOp : public popart::RemoteBaseOp
Remote Load Op.
Loads a tensor from a remote (off-chip) buffer. The tensor will be loaded from the memory location corresponding to RemoteBufferId, and will be stored in the memory location corresponding to inTensor.
This class takes between one and two TensorIds as inputs (as indicated in graphcoreoperators.hpp):
The TensorId of the inTensor. In the inplace version this will be aliased to the output tensor; in the outplace version this Op will clone the inTensor, then write the loaded data to the clone.
The (optional) TensorId of a 0-rank tensor called offset. If set to a value >= 0, offset specifies the row in the remote buffer from which the tensor will be loaded. If set to -1, RemoteSetup will assign a unique value.
The relationship between offset, RemoteBufferId and RemoteSetup is thoroughly described in RemoteStoreOp.
The output is the TensorId of the loaded tensor.
Subclassed by popart::RemoteLoadInplaceOp
Public Functions
-
RemoteLoadOp(const OperatorIdentifier&, const Op::Settings&, RemoteBufferId rbid_ = -1UL)
Construct the RemoteLoadOp.
Parameters specifically related to this class are listed below. See the constructor of the parent class for the rest of the input parameters.
- Parameters
RemoteBufferId – The id of the remote buffer. Can be any integer. If not specified (or set to -1), RemoteSetup will automatically choose the right buffer. A RemoteBufferId can only be used with tensors of identical shape.
-
virtual std::unique_ptr<Op> clone() const override
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const final
Return which inputs and outputs are replicated tensor sharding pairs.
-
ExchangeDescriptor getExchangeDescriptor(int index) const override
-
virtual std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const override
Return the variants of this op (if any) which can modify / alias the inputs at the given indices.
This function doesn’t check for anchor violations or topological order violations. When there are several ops, they should be returned in descending order of preference. If the op can be replaced by an in-place variant of itself, this method should be overridden to return a vector of <OperatorIdentifier, float> tuples in descending order of preference.
-
virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
Instantiate a particular in-place variant of the op with a specified OperatorIdentifier from the vector returned by inplacePriorityDefault().
- Parameters
OperatorIdentifier – The operator identifier of the op to be instantiated.
- Returns
An instance of the required op.
-
virtual void growAliasModel(AliasModel&) const final
For certain tasks which involve analysing how tensors alias each other, such as inplacing, a poprithms::memory::inplace::Graph that corresponds to this op’s graph is constructed.
The Poprithms graph can then be queried for aliasing information, and can have algorithms run on it.
To construct the Poprithms graph, each PopART op defines what its Poprithms equivalent ops are. This method inserts this op’s poprithms::memory::inplace::Op equivalents into the Poprithms Graph, which is the container popAliaser.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
- Pre
All input tensors of this op have mappings in aliasModel before the call to growAliasModel().
- Post
All output tensors of this op have mappings in aliasModel after the call to growAliasModel().
-
virtual poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const final
Translate a PopART inplacing proposal, which replaces this outplace op with an inplace op of type inplaceId, into an AliasModel equivalent.
This method is defined as a void method which sets a value passed by reference, as opposed to a getter method, so that no Poprithms headers need to be included in this file.
- Parameters
aliasModel – The mapping between this op’s (PopART) graph and the Poprithms graph.
inplaceId – The operator identifier to translate to the AliasModel equivalent.
- Returns
A tuple where the first element corresponds to an alias gate in the AliasModel and the second element is an input index.
-
class RemoteStoreOp : public popart::RemoteBaseOp
Remote Store Op.
Stores a tensor to a remote (off-chip) buffer. This Op is typically used when the user wants to store several different identically shaped tensors to the same remote buffer by specifying the offset (see below).
This class takes between one and two TensorIds as inputs (as indicated in graphcoreoperators.hpp):
The TensorId of the inTensor to copy to remote memory.
The (optional) TensorId of a 0-rank tensor called offset. If set to a value >= 0, offset specifies the row in the remote buffer that the inTensor will be written to (see below for an explanation). If set to -1, RemoteSetup will assign a unique value.
If inTensor is of rank x, the remote buffer of a certain RemoteBufferId will be of rank x+1, where the new dimension (the row) will be of size N.
Op instances with matching RemoteBufferId will outline together, meaning that if multiple different tensors are to be stored under the same remote buffer ID, a different offset value has to be supplied for each tensor.
For using the automatic RemoteSetup configuration, the offset tensor should be a unique constant tensor per inTensor per RemoteBufferId. If the constant offset tensor has value -1, RemoteSetup will assign a unique value; otherwise the supplied offset value will be used. RemoteSetup will call Ir::setRemoteBufferInfo to configure the shape (equal to the inTensor shape) and number of rows (N) in the remote memory.
If not using the automatic RemoteSetup, all offsets and RemoteBufferIds need to be >= 0. Each remote buffer ID then needs to be registered manually with Ir::setRemoteBufferInfo.
This Op does not have any output.
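The row/offset layout can be pictured with a conceptual sketch. This is not the PopART API; it only models how a remote buffer registered under one RemoteBufferId behaves as N rows of identically shaped tensors, with the offset selecting the row:
C++:
#include <cstddef>
#include <vector>

// Conceptual model only. A rank-x tensor is flattened to std::vector<float>
// for brevity; the real remote buffer is rank x+1 with N rows.
struct RemoteBufferModel {
  std::vector<std::vector<float>> rows; // N rows of identically shaped tensors
  void store(std::size_t offset, const std::vector<float> &t) {
    rows.at(offset) = t; // a RemoteStoreOp with this offset overwrites the row
  }
  std::vector<float> load(std::size_t offset) const {
    return rows.at(offset); // a RemoteLoadOp with this offset reads it back
  }
};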
Public Functions
-
RemoteStoreOp(const OperatorIdentifier&, const Op::Settings&, RemoteBufferId rbid_ = -1UL)
Construct the RemoteStoreOp.
Parameters specifically related to this class are listed below. See the constructor of the parent class for the rest of the input parameters.
- Parameters
RemoteBufferId – The id of the remote buffer. Can be any integer. If not specified (or set to -1), RemoteSetup will automatically choose the right buffer. A RemoteBufferId can only be used with tensors of identical shape.
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
inline virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
inline virtual bool hasSideEffect() const override
Check if the op has any effect that is not captured by the (modification of) input or output tensors, such as modifying the state of the IPU or host system.
- Returns
true if the op has side effects, false otherwise. Default: false.
-
virtual ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
Return which inputs and outputs are replicated tensor sharding pairs.
-
ExchangeDescriptor getExchangeDescriptor(int index) const final
-
class ReplicatedAllGatherOp : public popart::CollectivesBaseOp
Public Functions
-
ReplicatedAllGatherOp(const OperatorIdentifier&, CommGroup group, const Op::Settings&)
-
ReplicatedAllGatherOp(const OperatorIdentifier&, CommGroup group, const Op::Settings&, TensorInfo outInfo)
-
ReplicatedAllGatherOp(const OperatorIdentifier&, const ReplicaGrouping &grouping, const Op::Settings&)
-
ReplicatedAllGatherOp(const OperatorIdentifier&, const ReplicaGrouping &grouping, const Op::Settings&, const TensorInfo &outInfo)
-
void setup() final
-
inline float getSubgraphValue() const final
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
-
bool isConfigureOutputForReplicatedTensorSharding() const override
Check RTS mode (see collectives.hpp).
- Returns
True if this operation is configured for replicated tensor sharding.
-
std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
-
const std::vector<GradInOutMapper> &gradInputInfo() const override
-
const std::map<int, int> &gradOutToNonGradIn() const override
-
class ReplicatedAllReduceInplaceOp : public popart::ReplicatedAllReduceOp
Public Functions
-
ReplicatedAllReduceInplaceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, CommGroup group, const Op::Settings &settings_)
-
ReplicatedAllReduceInplaceOp(const OperatorIdentifier &_opid, CollectiveOperator op_, const ReplicaGrouping &grouping, const Op::Settings &settings_)
-
ReplicatedAllReduceInplaceOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
ReplicatedAllReduceInplaceOp(const ReplicatedAllReduceOp&)
-
void setup() final
-
class ReplicatedAllReduceOp : public popart::CollectivesBaseOp
Subclassed by popart::ReplicatedAllReduceInplaceOp
Public Functions
-
ReplicatedAllReduceOp(const OperatorIdentifier&, CollectiveOperator op, CommGroup group, const Op::Settings&)
-
ReplicatedAllReduceOp(const OperatorIdentifier&, CollectiveOperator op, const ReplicaGrouping &grouping, const Op::Settings&)
-
ReplicatedAllReduceOp(const OperatorIdentifier&, const Op::Settings&)
-
virtual std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const override
-
void setup() override
-
inline float getSubgraphValue() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline CollectiveOperator getCollectiveOp() const
-
void growAliasModel(AliasModel&) const override
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
-
std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
-
class ReplicatedReduceScatterOp : public popart::CollectivesBaseOp
Public Functions
-
ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, CommGroup group, bool configureOutputForReplicatedTensorSharding, const Op::Settings&)
-
ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, CommGroup group, const Op::Settings&)
-
ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, const ReplicaGrouping &grouping, bool configureOutputForReplicatedTensorSharding, const Op::Settings&)
-
ReplicatedReduceScatterOp(const OperatorIdentifier&, CollectiveOperator op, const ReplicaGrouping &grouping, const Op::Settings&)
-
ReplicatedReduceScatterOp(const OperatorIdentifier&, const Op::Settings&)
-
void setup() override
-
inline float getSubgraphValue() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline CollectiveOperator getCollectiveOp() const
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
-
bool isConfigureOutputForReplicatedTensorSharding() const override
Check RTS mode (see collectives.hpp).
- Returns
True if this operation is configured for replicated tensor sharding.
-
std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
-
const std::vector<GradInOutMapper> &gradInputInfo() const override
-
const std::map<int, int> &gradOutToNonGradIn() const override
-
class ReshapeBaseOp : public popart::Op
Subclassed by popart::ReshapeInplaceOp, popart::ReshapeOp
Public Functions
-
inline ReshapeBaseOp(const OperatorIdentifier &_opid, const Shape &ots_, const Op::Settings &settings_, bool handleZero_ = true)
-
inline ReshapeBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, bool handleZero_ = true)
-
void setup() final
-
bool canBeReplacedByIdentity() const override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
void growAliasModel(AliasModel&) const override
-
class ReshapeInplaceOp : public popart::ReshapeBaseOp
Public Functions
-
ReshapeInplaceOp(const OperatorIdentifier &_opid, const Shape&, const Op::Settings &settings_)
-
inline std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
inline bool isInplaceViewChange() const override
-
class ReshapeOp : public popart::ReshapeBaseOp
Subclassed by popart::ReshapeGradOp
Public Functions
-
inline ReshapeOp(const OperatorIdentifier &_opid, const Shape &s, const Op::Settings &settings_, bool handleZero = true)
-
inline ReshapeOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, bool handleZero = true)
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
inline bool isOutplaceViewChange() const override
-
class ResizeOp : public popart::Op
Subclassed by popart::ResizeGradOp
Public Functions
-
ResizeOp(const OperatorIdentifier&, const Op::Settings&, ResizeMode, const std::vector<float> &scales)
-
ResizeOp(const OperatorIdentifier&, const Op::Settings&, ResizeMode, const std::vector<float> &scales, ResizeNearestMode nearestMode, ResizeCoordinateTransformationMode)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline ResizeMode getMode() const
-
inline const std::vector<float> &getScales() const
-
inline ResizeNearestMode getNearestMode() const
-
inline ResizeCoordinateTransformationMode getCoordinateTransformationMode() const
-
class RestoreInplaceOp : public popart::RestoreOp
Public Functions
-
RestoreInplaceOp(const OperatorIdentifier&, int64_t stashSize, const Op::Settings&)
-
inline void growAliasModel(AliasModel &m) const override
Public Members
-
bool requiredForRecompute = false
-
class RestoreOp : public popart::Op
Subclassed by popart::RestoreInplaceOp
Public Functions
-
RestoreOp(const OperatorIdentifier&, int64_t stashSize, const Op::Settings&)
-
void setup() final
-
inline float getSubgraphValue() const final
-
inline int64_t getStashSize() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline bool isOutlineable() const override
-
class ReverseBaseOp : public popart::Op
Subclassed by popart::ReverseInplaceOp, popart::ReverseOp
Public Functions
-
inline ReverseBaseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const std::vector<int64_t> &dimensions_)
-
void setup() final
-
bool canBeReplacedByIdentity() const override
-
inline float getSubgraphValue() const final
-
inline std::vector<int64_t> getDimensions() const
-
void growAliasModel(AliasModel&) const override
-
class ReverseInplaceOp : public popart::ReverseBaseOp
-
class ReverseOp : public popart::ReverseBaseOp
Subclassed by popart::ReverseGradOp
Public Functions
-
inline ReverseOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, const std::vector<int64_t> &dimensions_)
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
class RoiAlignGradOp : public popart::Op
Public Functions
-
RoiAlignGradOp(const RoiAlignOp&)
-
virtual void setup()
-
virtual const std::vector<popart::GradInOutMapper> &gradInputInfo() const
-
const std::map<int, int> &gradOutToNonGradIn() const
-
inline float getSubgraphValue() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline float getSpatialScale() const
-
inline uint64_t getSamplingRatio() const
-
inline uint64_t getAlignedHeight() const
-
inline uint64_t getAlignedWidth() const
-
class RoiAlignOp : public popart::Op
Region of Interest (RoI) align operation described in the Mask R-CNN paper.
- Param spatialScale
Multiplicative spatial scale factor to translate ROI coordinates from their input spatial scale to the scale used when pooling, i.e., spatial scale of the input feature map X relative to the input image.
- Param samplingRatio
Number of sampling points in the interpolation grid used to compute the output value of each pooled output bin.
- Param alignedHeight
Pooled output Y’s height.
- Param alignedWidth
Pooled output X’s width.
Public Functions
-
RoiAlignOp(const popart::OperatorIdentifier &_opid, const popart::Op::Settings &settings, const float spatialScale, const uint64_t samplingRatio, const uint64_t alignedHeight, const uint64_t alignedWidth)
-
RoiAlignOp(const RoiAlignOp&) = default
-
RoiAlignOp &operator=(const RoiAlignOp&) = delete
-
~RoiAlignOp() override = default
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() override
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
virtual std::vector<std::unique_ptr<popart::Op>> getGradOps() final
Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.
There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.
The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.
Throws an error if this op is already a gradient op.
-
inline virtual float getSubgraphValue() const final
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are retrieved by getHighSubgraphValue() (for expensive ops such as Conv) and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
Append the op attributes that are relevant for outlining ops.
Ops should override this function if there are additional attributes. Two ops with identical type and outline attributes can be outlined and are supposed to be functionally equivalent.
- Parameters
OpSerialiserBase – The stream to which the attributes should be appended.
-
inline float getSpatialScale() const
-
inline uint64_t getSamplingRatio() const
-
inline uint64_t getAlignedHeight() const
-
inline uint64_t getAlignedWidth() const
-
class RoundInplaceOp : public popart::OneWayUnaryInPlaceOp
-
class RoundOp : public popart::OneWayUnaryOp
Public Functions
-
RoundOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class ScaleGradOp : public popart::ScaleOp
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class ScaleInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class ScaleOp : public popart::ElementWiseUnaryOp
Subclassed by popart::ScaleGradOp
Public Functions
-
ScaleOp(const OperatorIdentifier &_opid, float scale_, const Op::Settings &settings_)
-
inline void setScaleFactor(float value)
-
float getScaleFactor() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
bool canBeReplacedByIdentity() const override
-
class ScaledAddLhsInplaceOp : public popart::ScaledAddOp
Public Functions
-
ScaledAddLhsInplaceOp(const ScaledAddOp&)
-
class ScaledAddOp : public popart::Op
Subclassed by popart::ScaledAddLhsInplaceOp, popart::ScaledAddRhsInplaceOp
Public Functions
-
ScaledAddOp(const OperatorIdentifier &_opid, float scale_0_, float scale_1_, const Op::Settings &settings_)
-
void setup() override
-
inline float getScale0() const
-
inline float getScale1() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
inline bool canShard() const override
-
ReplicatedTensorShardingIndices getReplicatedTensorShardingIndices() const override
-
inline float getSubgraphValue() const override
-
void growAliasModel(AliasModel&) const override
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
class ScaledAddRhsInplaceOp : public popart::ScaledAddOp
Public Functions
-
ScaledAddRhsInplaceOp(const ScaledAddOp&)
-
class ScanOp : public popart::SubgraphOp
Public Functions
-
ScanOp(const OperatorIdentifier &_opid, const Op::Settings &settings_, Graph &callee_, int numScanInputs_, int numImplicitInputs_, std::vector<int64_t> scanInputAxes_, std::vector<int64_t> scanInputDirections_, std::vector<int64_t> scanOutputAxes_, std::vector<int64_t> scanOutputDirections_)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
int getTripCountValue() const
-
inline int getNumScanInputs() const
-
inline int getNumVariables() const
-
inline int getNumImplicitInputs() const
-
inline int getNumScanOutputs() const
-
int64_t getScanInputAxis(int i) const
-
inline bool isScanInputReversed(int i) const
-
int64_t getScanOutputAxis(int i) const
-
inline bool isScanOutputReversed(int i) const
-
int64_t getScanInputDirection(int i) const
-
int64_t getScanOutputDirection(int i) const
-
class ScatterDataGradOp : public popart::Op
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final override
-
const std::map<int, int> &gradOutToNonGradIn() const final override
-
void setup() final override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
float getSubgraphValue() const final override
-
int64_t getAxis() const noexcept
-
nonstd::optional<float> getAvailableMemoryProportion() const noexcept
-
class ScatterOp : public popart::ScatterReduceOp
Public Functions
-
ScatterOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings &settings_, const nonstd::optional<float> &available_memory_proportion_ = nonstd::nullopt)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
class ScatterReduceGradOp : public popart::Op
Public Functions
-
ScatterReduceGradOp(const ScatterReduceOp &op)
-
void setup() final override
-
const std::vector<GradInOutMapper> &gradInputInfo() const final override
-
const std::map<int, int> &gradOutToNonGradIn() const final override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
float getSubgraphValue() const final override
-
int64_t getAxis() const noexcept
-
int64_t getGroupSize() const noexcept
-
ScatterReduction getReduction() const noexcept
-
bool indexBroadcasted() const noexcept
-
bool indexBroadcastEnabled() const noexcept
-
bool hasInitialValues() const noexcept
-
nonstd::optional<float> getAvailableMemoryProportion() const noexcept
-
class ScatterReduceOp : public popart::Op
Subclassed by popart::ScatterOp
Public Functions
-
ScatterReduceOp(const OperatorIdentifier &_opid, int64_t axis_, int64_t axis_size_, ScatterReduction reduction_, int64_t group_size_, bool enable_index_broadcast_, const nonstd::optional<float> &available_memory_proportion_, const Op::Settings &settings_)
-
void setup() final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
float getSubgraphValue() const final override
-
int64_t getAxis() const noexcept
-
int64_t getGroupSize() const noexcept
-
ScatterReduction getReduction() const noexcept
-
bool indexBroadcasted() const noexcept
-
bool indexBroadcastEnabled() const noexcept
-
nonstd::optional<float> getAvailableMemoryProportion() const noexcept
-
void setAvailableMemoryProportion(const nonstd::optional<float> &v)
Public Static Functions
-
static std::string reductionToString(ScatterReduction reduction)
-
static ScatterReduction reductionFromString(const std::string &reductionStr)
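A minimal sketch of round-tripping the reduction mode through these helpers; the header path and the Sum enumerator are assumptions to be checked against your PopART headers:
C++:
#include <popart/op/scatterreduce.hpp>  // assumed header path
#include <string>

void reductionRoundTrip() {
  std::string name = popart::ScatterReduceOp::reductionToString(
      popart::ScatterReduction::Sum);  // Sum enumerator assumed
  popart::ScatterReduction red =
      popart::ScatterReduceOp::reductionFromString(name);
  (void)red;
}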
-
class ScatterUpdateGradOp : public popart::Op
Public Functions
-
void setup() final override
-
const std::vector<GradInOutMapper> &gradInputInfo() const final override
-
const std::map<int, int> &gradOutToNonGradIn() const final override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
float getSubgraphValue() const final override
-
int64_t getAxis() const noexcept
-
nonstd::optional<float> getAvailableMemoryProportion() const noexcept
-
class SeluGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class SeluInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class SeluOp : public popart::ElementWiseUnaryOp
Public Functions
-
SeluOp(const OperatorIdentifier &opid, float _alpha, float _gamma, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
void appendAttributes(OpSerialiserBase&) const override
-
inline float getAlpha() const
-
inline float getGamma() const
-
class SequenceSliceInplaceOp : public popart::SequenceSliceOp
Public Functions
-
SequenceSliceInplaceOp(const OperatorIdentifier&, bool zeroUnused, const Op::Settings&)
-
class SequenceSliceOp : public popart::Op
Subclassed by popart::SequenceSliceInplaceOp
Public Functions
-
SequenceSliceOp(const OperatorIdentifier&, bool zeroUnused, const Op::Settings&)
-
inline float getSubgraphValue() const final
-
void setup() override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
void growAliasModel(AliasModel&) const override
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
Public Members
-
const bool zeroUnused
-
class ShapeOrLikeOp : public popart::Op
Subclassed by popart::RandomBaseOp, popart::ZerosBaseOp
Public Functions
-
ShapeOrLikeOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, const Op::Settings &settings_)
-
inline float getSubgraphValue() const override
-
void validateDataType(DataType dataType, OperatorIdentifier opid)
-
inline const OptionalDataType &getDataType() const
Public Static Functions
-
static OptionalDataType getOptionalDataType(const Attributes &attr, OperatorIdentifier opid)
-
static const OpDefinition::DataTypes &likeSupportedInputTypes()
-
class ShapedDropoutOp : public popart::DropoutBaseOp
Subclassed by popart::ShapedDropoutGradOp
Public Functions
-
ShapedDropoutOp(const OperatorIdentifier &_opid, float ratio_, const Shape &shape_, const Op::Settings &settings_)
-
inline const std::vector<int64_t> &getShape() const
-
void setup() override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
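Shaped dropout applies a dropout mask of the given shape, broadcast to the input; this allows, for example, dropping entire rows or channels at once. A hedged Builder sketch, assuming the aiGraphcoreOpset1().shapeddropout() method with an (args, shape, ratio) argument order:
C++:
#include <popart/builder.hpp>

void buildShapedDropout(popart::Builder &builder,
                        const popart::TensorId &acts) {
  // Drop with ratio 0.1; the mask shape {1, 8} must be
  // broadcastable to the shape of acts.
  popart::TensorId out =
      builder.aiGraphcoreOpset1().shapeddropout({acts}, {1, 8}, 0.1f);
  (void)out;
}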
-
class ShapedDropoutGradOp : public popart::ShapedDropoutOp
Public Functions
-
ShapedDropoutGradOp(const ShapedDropoutOp &fwdOp)
-
const std::vector<GradInOutMapper> &gradInputInfo() const override
-
const std::map<int, int> &gradOutToNonGradIn() const override
-
class ShrinkGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class ShrinkInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class ShrinkOp : public popart::ElementWiseUnaryOp
Public Functions
-
ShrinkOp(const OperatorIdentifier &opid, float lambd, float bias, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float lambd() const
-
inline float bias() const
-
class SigmoidGradOp : public popart::Op
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline float getSubgraphValue() const final
-
class SigmoidInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class SigmoidOp : public popart::ElementWiseUnaryOp
Public Functions
-
SigmoidOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class SignInplaceOp : public popart::OneWayUnaryInPlaceOp
-
class SignOp : public popart::OneWayUnaryOp
Public Functions
-
SignOp(const OperatorIdentifier &_opid, const Op::Settings &settings)
-
inline float getSubgraphValue() const final
Public Static Functions
-
static OperatorIdentifier getOpId(const Ir &ir)
-
class SinGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class SinOp : public popart::ElementWiseUnaryOp
-
class SinhGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class SinhInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class SinhOp : public popart::ElementWiseUnaryOp
Public Functions
-
SinhOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class SliceGradOp : public popart::BasePadOutplaceOp
Public Functions
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
inline bool canShard() const override
-
class SliceInplaceOp : public popart::BaseSliceOp
Public Functions
-
SliceInplaceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const std::vector<int64_t> &steps_, const Op::Settings &settings_)
-
class SliceOp : public popart::BaseSliceOp
Subclassed by popart::PadGradOp
Public Functions
-
SliceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const std::vector<int64_t> &steps_, const Op::Settings &settings_)
-
SliceOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &starts_, const std::vector<int64_t> &ends_, const std::vector<int64_t> &axes_, const Op::Settings &settings_)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class SoftPlusGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class SoftPlusInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class SoftPlusOp : public popart::ElementWiseUnaryOp
Public Functions
-
SoftPlusOp(const OperatorIdentifier &opid, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class SoftSignGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class SoftSignInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class SoftSignOp : public popart::ElementWiseUnaryOp
Public Functions
-
SoftSignOp(const OperatorIdentifier &opid, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class SoftmaxGradDirectOp : public popart::Op
Public Functions
-
SoftmaxGradDirectOp(const TensorId lossId, const nonstd::optional<int> ignoreIndex, const ReductionType reduction, const Op::Settings &settings)
-
void setup() final
-
bool hasNlllFwdOp() const
-
inline float getSubgraphValue() const final
-
inline ReductionType getReductionType() const
-
inline bool hasIgnoreIndex() const
-
inline nonstd::optional<int> getOptionalIgnoreIndex() const
-
inline int getIgnoreIndex() const
-
virtual void appendOutlineAttributes(OpSerialiserBase&) const final
-
class SoftmaxGradOp : public popart::Op
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
int64_t getAxis() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
inline bool canShard() const override
-
class SoftmaxInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class SoftmaxOp : public popart::ElementWiseUnaryOp
Public Functions
-
SoftmaxOp(const OperatorIdentifier &_opid, int64_t axis_, const Op::Settings&)
-
int64_t getAxis() const
-
void setAxis(int64_t)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class SplineBasisOp : public popart::Op
Public Functions
-
SplineBasisOp(const OperatorIdentifier &opid, int degree, const Op::Settings &settings)
-
void setup() override
-
float getSubgraphValue() const override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
unsigned getDegree() const noexcept
-
class SplineWeightingOp : public popart::Op
Public Functions
-
SplineWeightingOp(const OperatorIdentifier &opid, const Op::Settings &settings)
-
void setup() override
-
float getSubgraphValue() const override
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
class SplitOp : public popart::Op
Public Functions
-
SplitOp(const OperatorIdentifier&, int64_t axis_, const std::vector<int64_t> split_, const Op::Settings&)
-
void setup() final
-
inline float getSubgraphValue() const final
-
std::vector<int64_t> getSplitSizes() const
-
inline int64_t getAxis() const
-
inline bool canShard() const override
-
class SqrtGradOp : public popart::Op
Public Functions
-
void setup() final
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
inline float getSubgraphValue() const final
-
class SqrtOp : public popart::ElementWiseUnaryOp
-
class SquareOp : public popart::ElementWiseUnaryOp
-
class StashOp : public popart::Op
Public Functions
-
StashOp(const OperatorIdentifier&, int64_t stashSize_, const Op::Settings&)
-
void setup() final
-
int64_t getStashSize()
-
inline float getSubgraphValue() const final
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline bool isOutlineable() const override
-
class SubgraphOp : public popart::Op
Subclassed by popart::CallOp, popart::LoopOp, popart::ScanOp
Public Functions
-
SubgraphOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
void appendOutlineAttributes(OpSerialiserBase &os) const override
-
std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqual(const AliasModel &aliasModel, const ReplEqInputMap &inputMap, ReplicaEqualAnalysisProxy &proxy) const override
-
VGraphIdAndTileSet getIntrospectionInVirtualGraphId(InIndex index, std::set<OpId> &visited) const override
-
VGraphIdAndTileSet getIntrospectionOutVirtualGraphId(OutIndex index, std::set<OpId> &visited) const override
-
bool hasSideEffect() const override
-
virtual InIndex opInToSubgraphInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override
-
virtual InIndex subgraphInToOpInIndex(SubgraphIndex subgraphIndex, InIndex inIndex) const override
-
virtual OutIndex opOutToSubgraphOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override
-
virtual OutIndex subgraphOutToOpOutIndex(SubgraphIndex subgraphIndex, OutIndex outIndex) const override
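These four methods translate between the op's own input/output indices and those of its called subgraph. As an illustrative sketch only (not the actual CallOp implementation), a subclass whose op inputs line up one-to-one with its subgraph inputs could implement the input mappings as:
C++:
// Identity mapping between op input indices and subgraph input
// indices; a real subclass must account for any offset between the
// op's inputs and the subgraph's inputs.
InIndex opInToSubgraphInIndex(SubgraphIndex, InIndex inIndex) const override {
  return inIndex;
}
InIndex subgraphInToOpInIndex(SubgraphIndex, InIndex inIndex) const override {
  return inIndex;
}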
-
float calcAutoVirtualGraphCost(std::set<int> &inputs_seen) override
-
virtual void setCalledSubgraphGradInfo(const FwdGraphToBwdGraphInfo &calledGraphsGradInfo) override
Public Static Functions
-
static bool existsInOpInputs(std::vector<std::pair<TensorId, TensorInfo>> &opInputs, TensorId &tensorId)
-
class SubsampleBaseOp : public popart::Op
Subclassed by popart::SubsampleInplaceOp, popart::SubsampleOp
Public Functions
-
SubsampleBaseOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &strides_, const Op::Settings &settings_)
-
void setup() override
-
inline std::vector<int64_t> getStrides() const
-
std::vector<uint32_t> strides_u32() const
-
bool strideSizeOne() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
bool canBeReplacedByIdentity() const override
-
void growAliasModel(AliasModel&) const override
-
inline float getSubgraphValue() const final
Public Members
-
std::vector<int64_t> strides
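Subsample keeps every strides[d]-th element along each dimension d of the input. A hedged Builder sketch, assuming the aiGraphcoreOpset1().subsample() builder method:
C++:
#include <popart/builder.hpp>

void buildSubsample(popart::Builder &builder,
                    const popart::TensorId &in) {
  // Keep every 2nd element along both dimensions of a 2D input;
  // these values correspond to the strides member documented above.
  popart::TensorId out =
      builder.aiGraphcoreOpset1().subsample({in}, {2, 2});
  (void)out;
}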
-
class SubsampleGradOp : public popart::Op
Public Functions
-
SubsampleGradOp(const SubsampleBaseOp &fwdOp)
-
void setup() override
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
inline std::vector<int64_t> getStrides() const
-
std::vector<uint32_t> strides_u32() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
class SubsampleInplaceOp : public popart::SubsampleBaseOp
-
class SubsampleOp : public popart::SubsampleBaseOp
Public Functions
-
SubsampleOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &strides_, const Op::Settings &settings_)
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
inline std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
class SubtractArg0GradOp : public popart::ReduceSumOp
-
class SubtractArg1GradOp : public popart::ElementWiseBinaryArg1GradOp
-
class SumArgGradOp : public popart::LinearVariadicGradOp
-
class SumOp : public popart::VariadicOp
-
class SwishGradOp : public popart::ElementWiseNonLinearUnaryGradOp
-
class SwishInplaceOp : public popart::ElementWiseInplaceUnaryOp
-
class SwishOp : public popart::ElementWiseUnaryOp
Public Functions
-
SwishOp(const OperatorIdentifier &opid, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
class TanhOp : public popart::ElementWiseUnaryOp
-
class TensorRemapOp : public popart::Op
Op that creates a new output tensor whose layout is determined by its downstream consumers, and then copies the input tensor to the output tensor.
This can improve tile memory liveness when the unremapped tensor's layout is unsuitable for its downstream consumers. It should only be used when actual issues occur, since remapping clones the tensor and can introduce more rearrangement and data copies than necessary.
Public Functions
-
TensorRemapOp(const OperatorIdentifier&, const TensorRemapType&, const Op::Settings&)
-
TensorRemapOp(const TensorRemapOp&)
-
virtual std::unique_ptr<Op> clone() const final
Return a copy of the op.
This method must be implemented. The compiler throws an error if this method is not implemented.
-
virtual void setup() final
Set the shape and type of the arguments to the op.
This MUST set the type and shape information for all the output TensorInfo objects.
-
inline TensorRemapType getTensorRemapType() const
-
virtual std::vector<std::unique_ptr<Op>> getGradOps() final
Determine the corresponding grad op for each op in the forward graph to automatically generate the backward pass.
There can be a separate gradient op for each input or a single gradient op that generates gradients for all inputs.
The mapping from the index of each output tensor of the gradient op to the index of each input tensor of the non-grad op is configured using the gradOutToNonGradIn() method that should be overridden in the grad op definitions.
Throws an error if this op is already a gradient op.
-
virtual const std::vector<GradInOutMapper> &gradInputInfo() const final
Get the mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.
This method throws an error if the op this is called on is not a grad op.
- Returns
The mapping from input indices in the grad op (for inputs, outputs and grad outputs) to the input indices in the corresponding non-grad op.
-
virtual const std::map<int, int> &gradOutToNonGradIn() const final
Get the mapping between the grad op outputs and the inputs of the corresponding non-grad op.
This method throws an error if the op this is called on is not a grad op.
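To make these two mappings concrete, here is a minimal sketch for a hypothetical unary grad op (MyGradOp is not a PopART class, and this is not TensorRemapOp's actual mapping): grad-op input 0 is the gradient of forward output 0, grad-op input 1 is forward input 0, and grad-op output 0 is the gradient of forward input 0.
C++:
const std::vector<popart::GradInOutMapper> &MyGradOp::gradInputInfo() const {
  static const std::vector<popart::GradInOutMapper> inInfo = {
      {0, 0, popart::GradOpInType::GradOut}, // gradient of fwd output 0
      {1, 0, popart::GradOpInType::In}};     // fwd input 0
  return inInfo;
}

const std::map<int, int> &MyGradOp::gradOutToNonGradIn() const {
  // Grad-op output 0 holds the gradient of forward input 0.
  static const std::map<int, int> outInfo = {{0, 0}};
  return outInfo;
}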
-
inline virtual float getSubgraphValue() const final
Get the subgraph value.
This is used by the outlining algorithm to determine whether or not to outline ops. High bounding values are retrieved by getHighSubgraphValue() (for expensive ops such as Conv), and low bounding values by getLowSubgraphValue() (for inexpensive ops such as Relu).
- Returns
The subgraph value. Default: 0.
-
inline virtual bool isOutlineable() const final
Check if op can be outlined.
If this method returns false, any possible subgraph that this op is part of will not be cached.
- Returns
true if the op can be outlined, false otherwise. Default: true.
-
class ThresholdedReluGradOp : public popart::ElementWiseNonLinearUnaryGradOp
Public Functions
-
ThresholdedReluGradOp(const ThresholdedReluOp&)
-
void appendAttributes(OpSerialiserBase&) const override
-
inline float getAlpha() const
-
class ThresholdedReluInplaceOp : public popart::ElementWiseInplaceUnaryOp
Public Functions
-
ThresholdedReluInplaceOp(const ThresholdedReluOp&)
-
void appendAttributes(OpSerialiserBase&) const override
-
inline float getAlpha() const
-
class ThresholdedReluOp : public popart::ElementWiseUnaryOp
Public Functions
-
ThresholdedReluOp(const OperatorIdentifier &opid, float _alpha, const Op::Settings &settings)
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
void appendAttributes(OpSerialiserBase&) const override
-
inline float getAlpha() const
-
class TiedGatherGradOp : public popart::GatherGradOp
Public Functions
-
TiedGatherGradOp(const TiedGatherOp *fwdOp, int64_t axis)
Public Members
-
const TiedGatherOp *fwdOp
-
class TileOp : public popart::Op
Subclassed by popart::TileGradOp
Public Functions
-
TileOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
TileOp(const OperatorIdentifier &_opid, const std::vector<int64_t> &repeats_, const Shape &outShape_, const Op::Settings &settings_)
-
void setup() final
-
const std::vector<int64_t> &getRepeats() const
-
bool canBeReplacedByIdentity() const override
-
inline float getSubgraphValue() const final
-
class TopKGradOp : public popart::Op
Public Functions
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
int64_t getAxis() const
-
const TensorInfo &getGradOutInfo() const
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
inline float getSubgraphValue() const final
-
inline nonstd::optional<float> getAvailableMemoryProportion() const
-
class TopKOp : public popart::BaseSortOp
Subclassed by popart::SortOp
Public Functions
-
TopKOp(const OperatorIdentifier &_opid, int64_t k, int64_t axis, bool largest, bool sorted, const Op::Settings &settings, const nonstd::optional<float> &available_memory_proportion = nonstd::nullopt)
-
void setup() final
-
int64_t getK() const noexcept
-
bool getLargest() const noexcept
-
bool getSorted() const noexcept
-
bool getStable() const noexcept
-
void appendOutlineAttributes(OpSerialiserBase&) const final
-
inline nonstd::optional<float> getAvailableMemoryProportion() const
-
class TransposeBaseOp : public popart::Op
Subclassed by popart::TransposeInplaceOp, popart::TransposeOp
Public Functions
-
TransposeBaseOp(const OperatorIdentifier &_opid, const Shape &perm_, const Op::Settings &settings_)
-
void setup() final
-
inline float getSubgraphValue() const final
-
std::vector<uint64_t> getPerm_u64() const
-
inline bool canShard() const override
-
void growAliasModel(AliasModel&) const override
-
class TransposeGradOp : public popart::TransposeOp
Public Functions
-
TransposeGradOp(const TransposeOp &fwdOp)
-
const std::vector<GradInOutMapper> &gradInputInfo() const final
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
class TransposeInplaceOp : public popart::TransposeBaseOp
Public Functions
-
TransposeInplaceOp(const OperatorIdentifier &_opid, const Shape&, const Op::Settings &settings_)
-
TransposeInplaceOp(const TransposeOp&)
-
inline bool isInplaceViewChange() const override
-
class TransposeOp : public popart::TransposeBaseOp
Subclassed by popart::TransposeGradOp
Public Functions
-
TransposeOp(const OperatorIdentifier &_opid, const Shape &perm_, const Op::Settings &settings_)
-
void appendOutlineAttributes(OpSerialiserBase&) const override
-
bool canBeReplacedByIdentity() const override
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier &o) const final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel&, OperatorIdentifier) const override
-
inline bool isOutplaceViewChange() const override
-
class UnaryZeroGradOp : public popart::ZerosLikeOp
Public Functions
-
UnaryZeroGradOp(const OperatorIdentifier &opid_, const Op::Settings &settings_)
-
inline const std::vector<GradInOutMapper> &gradInputInfo() const
-
inline const std::map<int, int> &gradOutToNonGradIn() const
-
class VariadicGradOp : public popart::Op
Subclassed by popart::LinearVariadicGradOp, popart::NonLinearVariadicGradOp
Public Functions
-
VariadicGradOp(const OperatorIdentifier &_opid, const VariadicOp&, InIndex)
-
const std::map<int, int> &gradOutToNonGradIn() const final
-
void setup() final
-
inline const TensorInfo &getFwdInputInfo()
-
inline float getSubgraphValue() const final
-
class VariadicOp : public popart::Op
Subclassed by popart::MaxOp, popart::MeanOp, popart::MinOp, popart::SumOp
-
class WhereOp : public popart::Op
Subclassed by popart::WhereLhsInplaceOp, popart::WhereRhsInplaceOp
Public Functions
-
WhereOp(const OperatorIdentifier &_opid, const Op::Settings &settings_)
-
void setup() final
-
std::vector<std::tuple<OperatorIdentifier, float>> inplacePriorityDefault() const final
-
std::unique_ptr<Op> getInplaceVariant(const OperatorIdentifier&) const final
-
inline float getSubgraphValue() const final
-
poprithms::memory::inplace::Proposal mapInplaceProposal(const AliasModel &aliasModel, OperatorIdentifier opId) const override
-
void growAliasModel(AliasModel &m) const override
-
class ZerosBaseOp : public popart::ShapeOrLikeOp
Subclassed by popart::ZerosLikeOp, popart::ZerosOp
Public Functions
-
ZerosBaseOp(const OperatorIdentifier &opid_, const OptionalDataType &dataType_, const Op::Settings &settings_)
-
class ZerosLikeOp : public popart::ZerosBaseOp
Subclassed by popart::UnaryZeroGradOp
Public Functions
-
ZerosLikeOp(const OperatorIdentifier &opid_, const Op::Settings &settings_)
-
void setup() final
-
class ZerosOp : public popart::ZerosBaseOp
Public Functions
-
ZerosOp(const OperatorIdentifier &opid_, const Shape &shape_, const OptionalDataType &dataType_, const Op::Settings &settings_)
-
void setup() final
14.8.4. Available Ops (Opx class)
-
class AbsOpx : public popart::popx::ElementWiseUnaryOpx
-
class AccumulateBaseOpx : public popart::popx::VarUpdateOpx
Subclassed by popart::popx::AccumulateOpx, popart::popx::RescaleAccumulateOpx, popart::popx::SparseAccumulateOpx
Public Functions
-
ViewChangers getCreatorViewChangers(InIndex index) const final
-
class AccumulateOpx : public popart::popx::AccumulateBaseOpx
-
class AccumulatorScaleOpx : public popart::popx::VarUpdateOpx
-
class AdaDeltaUpdaterOpx : public popart::popx::Opx
Public Functions
-
ViewChangers getCreatorViewChangers(InIndex index) const final
-
class AdamVarUpdateOpx : public popart::popx::VarUpdateOpx
-
class AddArg0GradOpx : public popart::popx::ReduceSumOpx
-
class AddArg1GradOpx : public popart::popx::ReduceSumOpx
-
class AddBiasBiasGradOpx : public popart::popx::ReduceSumOpx
-
class AddBiasInplaceOpx : public popart::popx::AddBiasOpx
-
class AddBiasOpx : public popart::popx::Opx
Subclassed by popart::popx::AddBiasInplaceOpx
class AddLhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx
-
class AddOpx : public popart::popx::ElementWiseBinaryOutplaceOpx
-
class AddRhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx
-
class AndOpx : public popart::popx::BinaryComparisonOpx
-
class ArgExtremaOpx : public popart::popx::Opx
Subclassed by popart::popx::ArgMaxOpx, popart::popx::ArgMinOpx
-
class ArgMaxOpx : public popart::popx::ArgExtremaOpx
-
class ArgMinOpx : public popart::popx::ArgExtremaOpx
-
class AsinInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class AsinOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class Atan2LhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx
-
class Atan2Opx : public popart::popx::ElementWiseBinaryOutplaceOpx
-
class AtanInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class AtanOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class BaseConcatOpx : public popart::popx::Opx
Subclassed by popart::popx::ConcatInplaceOpx, popart::popx::ConcatOpx
-
class BaseExpandOpx : public popart::popx::Opx
Subclassed by popart::popx::ExpandInplaceOpx, popart::popx::ExpandOpx
-
class BasePadOpx : public popart::popx::Opx
Subclassed by popart::popx::PadInplaceOpx, popart::popx::PadOpx
-
class BaseSliceOpx : public popart::popx::Opx
Subclassed by popart::popx::SliceInplaceOpx, popart::popx::SliceOpx
-
class BaseSortOpx : public popart::popx::Opx
Subclassed by popart::popx::TopKOpx
-
class BaseWhereOpx : public popart::popx::Opx
Subclassed by popart::popx::WhereLhsInplaceOpx, popart::popx::WhereOpx, popart::popx::WhereRhsInplaceOpx
class BinaryComparisonOpx : public popart::popx::Opx
Subclassed by popart::popx::AndOpx, popart::popx::EqualOpx, popart::popx::GreaterOpx, popart::popx::LessOpx, popart::popx::OrOpx
-
class BitwiseBinaryOpx : public popart::popx::ElementWiseBinaryOpx
-
class BitwiseNotOpx : public popart::popx::ElementWiseUnaryOpx
-
class CallOpx : public popart::popx::SubgraphOpx
Subclassed by popart::popx::CallGradOpx
-
class CastOpx : public popart::popx::Opx
Subclassed by popart::popx::CastGradOpx
-
class CeilInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class CeilOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class ClipInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class ClipOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class CollectivesBaseOpx : public popart::popx::Opx
Subclassed by popart::popx::MultiCollectiveBaseOpx, popart::popx::ReplicatedAllGatherOpx, popart::popx::ReplicatedAllReduceOpx, popart::popx::ReplicatedReduceScatterOpx
Public Functions
-
ReplicatedTensorShardingGroup getCollectiveLinkedGroup(ReplicatedTensorShardingIndicesIndex groupIndex) const
Function to determine which collective Ops need to be in the same collective linked group.
Ops in the same collective linked group need to use the same collective balanced reorder (CBR) so that the layouts of tensors that interact with each other in the graph are compatible.
Scenarios leading to collective Ops belonging to the same group:
The CollectivesBaseOp::getCollectiveLinkedIndex() is connected to the same root tensor (that is, tensor A connects, directly or indirectly, to the getCollectiveLinkedIndex of both a ReduceScatter and an AllGather):
A -> ReduceScatter -> IdentityOp -> AllGather
The RTS-enabled input/output tensors of RTS-enabled collective operations meet in the compute graph:
B -> ReduceScatter -> C -> AllGather -> F -> ReduceScatter -> G
                       \
                        VarUpdateOp
                       /
D -> ReduceScatter -> E
C, E and the VarUpdateOp in this graph are replicated tensor sharded (RTS), so both ReduceScatter Ops and the AllGather Op end up in the same collective linked group. B, D, F and G are not sharded, so the ReduceScatter between F and G can be in a different collective linked group.
The primary motivation for collective linked groups is “folding” multiple RTS tensors together, for example via outlining. Folding in this context is when two operations or tensors that were unique now use the same code or memory, which implies that, for example, tensor layouts need to be identical too. If the graph has three RTS-enabled variables and two of them use the same VarUpdateOp due to outlining, then all RTS-related Ops connected to those two variables must use identical CBR rearrangement. CBR is set in the collective Ops themselves during Opx::unwindTensorLayout, Opx::createInput or Opx::grow, by calling createCollectiveBalancedReorder.
The third variable would use a separate VarUpdateOp, is therefore in a separate collective linked group, and can instantiate its own CBR, even if the tensor shapes match.
getCollectiveLinkedGroup uses the Ops that introduce RTS/CBR (ReduceScatter and AllGather) as a starting point and tracks all associated Ops that propagate RTS with a DFS search on the graph.
- Parameters
groupIndex – The index of the rtsIndices for which to return the collective group.
- Returns
All linked tensors and their connected Ops, used to coordinate the tensor mapping of collective inputs and outputs.
-
gcl::CollectiveBalancedReorder *getCollectiveBalancedReorder(ReplicatedTensorShardingIndicesIndex groupIndex) const
Get the existing CBR.
- Parameters
groupIndex – The index of the rtsIndices for which to return the collective group.
- Returns
Existing CBR for the input/output tensor of the collective Op
-
gcl::CollectiveBalancedReorder *createCollectiveBalancedReorder(poplar::Tensor tensor, ReplicatedTensorShardingIndicesIndex groupIndex) const
Create a new CBR instance for the reference tensor.
- Parameters
tensor – The non-sharded reference tensor.
groupIndex – The index of the rtsIndices for which to return the collective group.
- Returns
New CBR for the input/output tensor of the collective Op
-
class ConcatInplaceOpx : public popart::popx::BaseConcatOpx
-
class ConcatOpx : public popart::popx::BaseConcatOpx
-
class ConvOpx : public popart::popx::MultiConvBaseOpx
Public Functions
-
poplar::Tensor createWeightsInput(const poplar::DebugNameAndId &dnai, int convIndex) const final
-
poplar::Tensor createDataInput(const poplar::DebugNameAndId &dnai, int convIndex) const final
-
class ConvWeightsGradOpx : public popart::popx::MultiConvWeightsGradBaseOpx
-
class CopyVarUpdateOpx : public popart::popx::VarUpdateOpx
class CosOpx : public popart::popx::ElementWiseUnaryOpx
-
class DetachInplaceOpx : public popart::popx::ElementWiseUnaryOpx
-
class DetachOpx : public popart::popx::ElementWiseUnaryOpx
-
class DivOpx : public popart::popx::ElementWiseBinaryOpx
-
class DropoutOpx : public popart::popx::ElementWiseUnaryOpx
-
class DynamicAddInplaceOpx : public popart::popx::DynamicAddOpx
-
class DynamicAddOpx : public popart::popx::DynamicUpdateOpx
Subclassed by popart::popx::DynamicAddInplaceOpx
-
class DynamicSliceInplaceOpx : public popart::popx::DynamicSliceOpx
-
class DynamicSliceOpx : public popart::popx::Opx
Subclassed by popart::popx::DynamicSliceInplaceOpx
class DynamicUpdateInplaceOpx : public popart::popx::DynamicUpdateOpx
-
class DynamicUpdateOpx : public popart::popx::Opx
Subclassed by popart::popx::DynamicAddOpx, popart::popx::DynamicUpdateInplaceOpx
class DynamicZeroInplaceOpx : public popart::popx::DynamicZeroOpx
-
class DynamicZeroOpx : public popart::popx::Opx
Subclassed by popart::popx::DynamicZeroInplaceOpx
class ElementWiseBinaryInplaceOpx : public popart::popx::ElementWiseBinaryOpx
Subclassed by popart::popx::AddLhsInplaceOpx, popart::popx::AddRhsInplaceOpx, popart::popx::Atan2LhsInplaceOpx, popart::popx::MulLhsInplaceOpx, popart::popx::MulRhsInplaceOpx, popart::popx::PowLhsInplaceOpx
-
class ElementWiseBinaryOpx : public popart::popx::Opx
Subclassed by popart::popx::BitwiseBinaryOpx, popart::popx::DivOpx, popart::popx::ElementWiseBinaryInplaceOpx, popart::popx::ElementWiseBinaryOutplaceOpx, popart::popx::FmodOpx, popart::popx::PReluOpx, popart::popx::SubtractOpx
class ElementWiseBinaryOutplaceOpx : public popart::popx::ElementWiseBinaryOpx
Subclassed by popart::popx::AddOpx, popart::popx::Atan2Opx, popart::popx::MulOpx, popart::popx::PowOpx
-
class ElementWiseUnaryInplaceOpx : public popart::popx::ElementWiseUnaryOpx
Subclassed by popart::popx::AsinInplaceOpx, popart::popx::AtanInplaceOpx, popart::popx::CeilInplaceOpx, popart::popx::ClipInplaceOpx, popart::popx::EluInplaceOpx, popart::popx::ExpInplaceOpx, popart::popx::Expm1InplaceOpx, popart::popx::FloorInplaceOpx, popart::popx::GeluErfInplaceOpx, popart::popx::GeluInplaceOpx, popart::popx::HardSigmoidInplaceOpx, popart::popx::IncrementModInplaceOpx, popart::popx::LeakyReluInplaceOpx, popart::popx::Log1pInplaceOpx, popart::popx::LogSoftmaxInplaceOpx, popart::popx::NearbyIntInplaceOpx, popart::popx::ReluInplaceOpx, popart::popx::RoundInplaceOpx, popart::popx::ScaleInplaceOpx, popart::popx::SeluInplaceOpx, popart::popx::ShrinkInplaceOpx, popart::popx::SigmoidInplaceOpx, popart::popx::SignInplaceOpx, popart::popx::SinhInplaceOpx, popart::popx::SoftmaxInplaceOpx, popart::popx::SoftPlusInplaceOpx, popart::popx::SoftSignInplaceOpx, popart::popx::SwishInplaceOpx, popart::popx::ThresholdedReluInplaceOpx
-
class ElementWiseUnaryOpx : public popart::popx::Opx
Subclassed by popart::popx::AbsOpx, popart::popx::BitwiseNotOpx, popart::popx::CosOpx, popart::popx::DetachInplaceOpx, popart::popx::DetachOpx, popart::popx::DropoutOpx, popart::popx::ElementWiseUnaryInplaceOpx, popart::popx::ElementWiseUnaryOutplaceOpx, popart::popx::ErfxGradOpx, popart::popx::ErfxOpx, popart::popx::IdentityGradOpx, popart::popx::IdentityOpx, popart::popx::IsInfx, popart::popx::IsNaNx, popart::popx::LogOpx, popart::popx::LogSoftmaxGradOpx, popart::popx::MeanOpx, popart::popx::NegateGradOpx, popart::popx::NegateOpx, popart::popx::NotOpx, popart::popx::ReciprocalOpx, popart::popx::SigmoidGradOpx, popart::popx::SinOpx, popart::popx::SoftmaxGradOpx, popart::popx::SqrtOpx, popart::popx::SquareOpx
-
class ElementWiseUnaryOutplaceOpx : public popart::popx::ElementWiseUnaryOpx
Subclassed by popart::popx::AsinOpx, popart::popx::AtanOpx, popart::popx::CeilOpx, popart::popx::ClipOpx, popart::popx::EluOpx, popart::popx::Expm1Opx, popart::popx::ExpOpx, popart::popx::FloorOpx, popart::popx::GeluErfOpx, popart::popx::GeluOpx, popart::popx::HardSigmoidOpx, popart::popx::IncrementModOpx, popart::popx::LeakyReluOpx, popart::popx::Log1pOpx, popart::popx::LogSoftmaxOpx, popart::popx::NearbyIntOpx, popart::popx::ReluOpx, popart::popx::RoundOpx, popart::popx::ScaleOpx, popart::popx::SeluOpx, popart::popx::ShrinkOpx, popart::popx::SigmoidOpx, popart::popx::SignOpx, popart::popx::SinhOpx, popart::popx::SoftmaxOpx, popart::popx::SoftPlusOpx, popart::popx::SoftSignOpx, popart::popx::SwishOpx, popart::popx::ThresholdedReluOpx
-
class EluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class EluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class EqualOpx : public popart::popx::BinaryComparisonOpx
-
class ErfxGradOpx : public popart::popx::ElementWiseUnaryOpx
-
class ErfxOpx : public popart::popx::ElementWiseUnaryOpx
-
class ExchangeBaseOpx : public popart::popx::Opx
Subclassed by popart::popx::HostBaseOpx, popart::popx::MultiExchangeOpx, popart::popx::RemoteBaseOpx, popart::popx::RemoteCodeLoadOpx
-
class ExpInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class ExpOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class ExpandInplaceOpx : public popart::popx::BaseExpandOpx
-
class ExpandOpx : public popart::popx::BaseExpandOpx
-
class Expm1InplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class Expm1Opx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class FloorInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class FloorOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class FmodOpx : public popart::popx::ElementWiseBinaryOpx
-
class GatherBaseOpx : public popart::popx::Opx
Subclassed by popart::popx::GatherOpx, popart::popx::TiedGatherOpx
-
class GatherOpx : public popart::popx::GatherBaseOpx
-
class GeluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class GeluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class GeluErfInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class GeluErfOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class GreaterOpx : public popart::popx::BinaryComparisonOpx
-
class HardSigmoidInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class HardSigmoidOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class HostBaseOpx : public popart::popx::ExchangeBaseOpx
Subclassed by popart::popx::HostLoadOpx, popart::popx::HostStoreOpx
-
class HostLoadInplaceOpx : public popart::popx::HostLoadOpx
-
class HostLoadOpx : public popart::popx::HostBaseOpx
Subclassed by popart::popx::HostLoadInplaceOpx
class HostStoreOpx : public popart::popx::HostBaseOpx
-
class IdentityGradOpx : public popart::popx::ElementWiseUnaryOpx
-
class IdentityOpx : public popart::popx::ElementWiseUnaryOpx
-
class IfOpx : public popart::popx::Opx
Subclassed by popart::popx::IfGradOpx
-
class IncrementModInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class IncrementModOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class IpuCopyOpx : public popart::popx::Opx
Public Functions
-
PreparedCopyTensors createPipelinedOutput() const
-
void growPipelined(poplar::program::Sequence&, PreparedCopyTensors) const
-
class LSTMOpx : public popart::popx::Opx
Public Functions
-
popnn::lstm::LstmParams createLSTMParams() const
-
class LeakyReluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class LeakyReluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class LessOpx : public popart::popx::BinaryComparisonOpx
-
class Log1pInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class Log1pOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class LogOpx : public popart::popx::ElementWiseUnaryOpx
-
class LogSoftmaxGradOpx : public popart::popx::ElementWiseUnaryOpx
-
class LogSoftmaxInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class LogSoftmaxOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class LoopOpx : public popart::popx::SubgraphOpx
class MatMulOpx : public popart::popx::Opx
Public Functions
-
~MatMulOpx() override = default
-
void verifyCacheSizeUnchanged(size_t beforeCacheSize) const
Public Static Functions
-
static void appendPoplarOptionsForOp(const MatMulBaseOp &op, poplar::OptionFlags &opts)
-
static void addPartialsType(const MatMulPartialsType &partialsType, poplar::OptionFlags &opts)
-
class MeanOpx : public popart::popx::ElementWiseUnaryOpx
-
class MulLhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx
-
class MulOpx : public popart::popx::ElementWiseBinaryOutplaceOpx
-
class MulRhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx
-
class MultiCollectiveBaseOpx : public popart::popx::CollectivesBaseOpx
A base class for the lowering of different subclasses of MultiCollectiveBaseOp.
Each output tensor can be grown separately.
Subclassed by popart::popx::MultiReplicatedAllGatherOpx, popart::popx::MultiReplicatedAllReduceOpx, popart::popx::MultiReplicatedReduceScatterOpx
Public Functions
-
std::set<OpxGrowPartId> getInGrowPartIds(Tensor *inTensor) const override
Defines which “parts” use a particular input tensor. There are “output->n()” parts in the collective operation: part “i” uses input “i” and the indices tensor at “i + output->n()”. This logic is the same for all collective ops, even in the absence of an indices tensor.
- Parameters
inTensor – the tensor for which to return a part id
-
OpxGrowPartId getOutGrowPartId(Tensor *outTensor) const override
Defines which “part” is responsible for constructing a particular output. There are “output->n()” parts: each part “i” produces output “i”.
- Parameters
outTensor – the tensor for which to return a corresponding part id
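A sketch of the indexing convention described above, illustrative only and not the library's implementation: with n = output->n() outputs, the data input for part i is input i, and its optional indices input is input i + n.
C++:
// Map an input index back to the part that consumes it.
OpxGrowPartId partForInput(InIndex in, int n) {
  return static_cast<OpxGrowPartId>(in < n ? in : in - n);
}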
-
class MultiConvBaseOpx : public popart::popx::Opx
Subclassed by popart::popx::ConvOpx, popart::popx::MultiConvOpx
Public Functions
-
poplar::OptionFlags getConvOptions(int, std::string pass = "") const
-
std::string getFwdPassFlagString() const
-
inline virtual std::vector<poplar::Tensor> convolve(poplar::program::Sequence &prog, const std::vector<poplar::Tensor> &weights) const
-
inline virtual poplar::Tensor createDataInput(const poplar::DebugNameAndId &dnai, int convIndex) const
-
inline virtual poplar::Tensor createWeightsInput(const poplar::DebugNameAndId &dnai, int convIndex) const
-
void verifyCacheSizeUnchanged(size_t beforeCacheSize) const
-
class MultiConvOpx : public popart::popx::MultiConvBaseOpx
Public Functions
-
poplar::Tensor createWeightsInput(const poplar::DebugNameAndId &dnai, int convIndex) const final
-
poplar::Tensor createDataInput(const poplar::DebugNameAndId &dnai, int convIndex) const final
-
class MultiConvWeightsGradBaseOpx : public popart::popx::Opx
Subclassed by popart::popx::ConvWeightsGradOpx, popart::popx::MultiConvWeightsGradOpx
Public Functions
-
poplar::OptionFlags getConvOptions(int convIndex = 0) const
-
void verifyCacheSizeUnchanged(size_t beforeCacheSize) const
-
class MultiConvWeightsGradOpx : public popart::popx::MultiConvWeightsGradBaseOpx
-
class MultiExchangeOpx : public popart::popx::ExchangeBaseOpx
Public Functions
-
std::vector<std::pair<int, int>> getSegments() const
-
std::set<OpxGrowPartId> getInGrowPartIds(Tensor *inTensor) const final
-
OpxGrowPartId getOutGrowPartId(Tensor *outTensor) const final
-
void growPart(OpxGrowPartId id) const final
-
class MultiReplicatedAllReduceOpx : public popart::popx::MultiCollectiveBaseOpx
Lowers the MultiReplicatedAllReduceOp to Poplar by growing each individual output tensor, and performing a to-destination all-reduce on a concatenation of the input tensors.
Mixing in-place and out-of-place all-reduce operations is supported.
Public Functions
-
void growPart(OpxGrowPartId id) const override
-
class NegateGradOpx : public popart::popx::ElementWiseUnaryOpx
-
class NegateOpx : public popart::popx::ElementWiseUnaryOpx
-
class NllOpx : public popart::popx::Opx
Public Static Functions
-
static void flattenAndEncodeOneHot(const Opx &opx, poplar::program::Sequence &prog, const poplar::Tensor &probs, const poplar::Tensor &label, poplar::Tensor &probs2D, poplar::Tensor &label1D, poplar::Tensor &oneHot)
-
static poplar::Tensor applyMaskInPlaceForIgnoredIndex(const Opx &opx, poplar::Tensor t, poplar::Tensor labels, int ignoreIndex, poplar::program::Sequence &prog)
-
static void applyScalingInPlaceForMeanReduction(const Opx &opx, poplar::Tensor t, poplar::Tensor scale, poplar::program::Sequence &prog)
-
static void applyScalingInPlaceForMeanReductionWithIgnoreIndex(const Opx &opx, poplar::Tensor t, poplar::Tensor scale, poplar::Tensor mask, poplar::program::Sequence &prog)
-
static void handleLossGradScaling(const Opx &opx, bool hasIgnoreIndex, int64_t ignoreIndex, bool meanReduce, poplar::Tensor &oneHot, poplar::Tensor &gradIn, poplar::Tensor &label1D, poplar::program::Sequence &prog)
-
class NormOpx : public popart::popx::Opx
Subclassed by popart::popx::BatchNormGradOpx, popart::popx::BatchNormOpx, popart::popx::GroupNormGradOpx, popart::popx::GroupNormOpx, popart::popx::InstanceNormGradOpx, popart::popx::InstanceNormOpx
-
class NotOpx : public popart::popx::ElementWiseUnaryOpx
-
class OrOpx : public popart::popx::BinaryComparisonOpx
-
class PReluOpx : public popart::popx::ElementWiseBinaryOpx
-
class PadInplaceOpx : public popart::popx::BasePadOpx
-
class PadOpx : public popart::popx::BasePadOpx
-
template<typename LSTMOP>
class PopartLSTMOpxBase : public popart::popx::Opx
Subclassed by popart::popx::PopartLSTMGradOpx, popart::popx::PopartLSTMOpx
-
class PowLhsInplaceOpx : public popart::popx::ElementWiseBinaryInplaceOpx
-
class PowOpx : public popart::popx::ElementWiseBinaryOutplaceOpx
-
class ReciprocalOpx : public popart::popx::ElementWiseUnaryOpx
-
class ReduceSumOpx : public popart::popx::Opx
Subclassed by popart::popx::AddArg0GradOpx, popart::popx::AddArg1GradOpx, popart::popx::AddBiasBiasGradOpx, popart::popx::SubtractArg0GradOpx
-
class ReluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class ReluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class RemoteBaseOpx : public popart::popx::ExchangeBaseOpx
Subclassed by popart::popx::RemoteLoadOpx, popart::popx::RemoteStoreOpx
-
class RemoteLoadInplaceOpx : public popart::popx::RemoteLoadOpx
-
class RemoteLoadOpx : public popart::popx::RemoteBaseOpx
Subclassed by popart::popx::RemoteLoadInplaceOpx
class RemoteStoreOpx : public popart::popx::RemoteBaseOpx
-
class ReplicatedAllGatherOpx : public popart::popx::CollectivesBaseOpx
Public Functions
-
ViewChangers getCreatorViewChangers(InIndex index) const final
-
class ReplicatedAllReduceInplaceOpx : public popart::popx::ReplicatedAllReduceOpx
-
class ReplicatedAllReduceOpx : public popart::popx::CollectivesBaseOpx
Subclassed by popart::popx::ReplicatedAllReduceInplaceOpx
class ReplicatedReduceScatterOpx : public popart::popx::CollectivesBaseOpx
Public Functions
-
DnfTensorIds mustExistBeforeCreateDNF(InIndex index0) const final
-
ViewChangers getCreatorViewChangers(InIndex index) const final
-
class RescaleAccumulateOpx : public popart::popx::AccumulateBaseOpx
-
class ReshapeBaseOpx : public popart::popx::Opx
Subclassed by popart::popx::ReshapeInplaceOpx, popart::popx::ReshapeOpx
-
class ReshapeGradOpx : public popart::popx::ReshapeOpx
-
class ReshapeInplaceOpx : public popart::popx::ReshapeBaseOpx
-
class ReshapeOpx : public popart::popx::ReshapeBaseOpx
Subclassed by popart::popx::ReshapeGradOpx
-
template<typename Derived>
class RestoreBaseOpx : public popart::popx::Opx
Base class for restore opxs.
- Template Parameters
Opx – A subclass of RestoreBaseOpx. Must have the type alias OpType defined as the Op that it corresponds to.
-
class ReverseBaseOpx : public popart::popx::Opx
Subclassed by popart::popx::ReverseInplaceOpx, popart::popx::ReverseOpx
-
class ReverseGradOpx : public popart::popx::ReverseOpx
-
class ReverseInplaceOpx : public popart::popx::ReverseBaseOpx
-
class ReverseOpx : public popart::popx::ReverseBaseOpx
Subclassed by popart::popx::ReverseGradOpx
-
class RoundInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class RoundOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class SGD0VarUpdateOpx : public popart::popx::VarUpdateOpx
-
class SGD1AcclUpdateOpx : public popart::popx::VarUpdateOpx
-
class SGD1VarUpdateOpx : public popart::popx::VarUpdateOpx
-
class ScaleInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class ScaleOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
Subclassed by popart::popx::ScaleGradOpx
-
class ScaledAddLhsInplaceOpx : public popart::popx::ScaledAddOpx
-
class ScaledAddOpx : public popart::popx::Opx
Subclassed by popart::popx::ScaledAddLhsInplaceOpx, popart::popx::ScaledAddRhsInplaceOpx
-
class ScaledAddRhsInplaceOpx : public popart::popx::ScaledAddOpx
-
class ScaledVarUpdateOpx : public popart::popx::VarUpdateOpx
-
class ScatterOpx : public popart::popx::ScatterReduceOpx
-
class ScatterReduceOpx : public popart::popx::Opx
Subclassed by popart::popx::ScatterOpx
Public Functions
-
~ScatterReduceOpx()
-
class SeluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class SeluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class ShrinkInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class ShrinkOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class SigmoidGradOpx : public popart::popx::ElementWiseUnaryOpx
-
class SigmoidInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class SigmoidOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class SignInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class SignOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class SinOpx : public popart::popx::ElementWiseUnaryOpx
-
class SinhInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class SinhOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class SliceInplaceOpx : public popart::popx::BaseSliceOpx
-
class SliceOpx : public popart::popx::BaseSliceOpx
Subclassed by popart::popx::PadGradOpx
-
class SoftPlusInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class SoftPlusOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class SoftSignInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class SoftSignOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class SoftmaxGradOpx : public popart::popx::ElementWiseUnaryOpx
-
class SoftmaxInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class SoftmaxOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class SparseAccumulateOpx : public popart::popx::AccumulateBaseOpx
-
class SqrtOpx : public popart::popx::ElementWiseUnaryOpx
-
class SquareOpx : public popart::popx::ElementWiseUnaryOpx
-
class SubgraphOpx : public popart::popx::Opx
Subclassed by popart::popx::CallOpx, popart::popx::LoopOpx
Public Functions
-
PreparedTensorInfos getInputsToPrepare() const override
-
PreparedTensorInfos getOutputsToPrepare() const override
-
class SubtractArg0GradOpx : public popart::popx::ReduceSumOpx
-
class SubtractOpx : public popart::popx::ElementWiseBinaryOpx
-
class SwishInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class SwishOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class ThresholdedReluInplaceOpx : public popart::popx::ElementWiseUnaryInplaceOpx
-
class ThresholdedReluOpx : public popart::popx::ElementWiseUnaryOutplaceOpx
-
class TiedGatherOpx : public popart::popx::GatherBaseOpx
-
class TopKOpx : public popart::popx::BaseSortOpx
-
class TransposeGradOpx : public popart::popx::TransposeOpx
-
class TransposeOpx : public popart::popx::Opx
Subclassed by popart::popx::TransposeGradOpx
-
class VarUpdateOpx : public popart::popx::Opx
Subclassed by popart::popx::AccumulateBaseOpx, popart::popx::AccumulatorScaleOpx, popart::popx::AdamVarUpdateOpx, popart::popx::CopyVarUpdateOpx, popart::popx::ScaledVarUpdateOpx, popart::popx::SGD0VarUpdateOpx, popart::popx::SGD1AcclUpdateOpx, popart::popx::SGD1VarUpdateOpx
-
class WhereLhsInplaceOpx : public popart::popx::BaseWhereOpx
-
class WhereOpx : public popart::popx::BaseWhereOpx
-
class WhereRhsInplaceOpx : public popart::popx::BaseWhereOpx
14.9. Patterns
#include <popart/patterns/patterns.hpp>
-
class Patterns
A class to hold which patterns are enabled and disabled.
Public Functions
-
Patterns(PatternsLevel level)
Constructor for the Patterns class.
- Parameters
level – The pattern set to run.
-
inline Patterns()
Default constructor for the Patterns class.
The pattern set to run is set to PatternsLevel::Default.
-
Patterns(std::vector<std::string> patterns)
Constructor for the Patterns class.
- Parameters
patterns – A vector of pattern names of patterns to be run.
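A minimal construction sketch (the pattern names passed to the vector constructor are illustrative; see getAllPreAliasPatternNames() below for the names available in your build):
C++:
#include <popart/patterns/patterns.hpp>

// Construct with a named pattern level (PatternsLevel::Default is the
// same set that the default constructor selects).
popart::Patterns defaults(popart::PatternsLevel::Default);

// Construct from a list of pattern names (names here are illustrative).
popart::Patterns selected({"PostNRepl", "OpToIdentity"});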
-
bool isPatternEnabled(const std::type_index &t)
Check if a pattern (of class PreAliasPattern) is enabled.
- Parameters
t – The pattern to check.
- Returns
true if the pattern is enabled; false otherwise.
-
bool isPatternEnabled(const std::string &t)
Check if a pattern (not of class PreAliasPattern) is enabled.
- Parameters
t – The name of the pattern to check.
- Returns
true if the pattern is enabled; false otherwise.
-
Patterns &enablePattern(const std::type_index &t, bool v)
Enable or disable a pattern of class PreAliasPattern.
- Parameters
t – The pattern to enable or disable.
v – If true then enable the pattern. If false then disable the pattern.
- Returns
A reference to this Patterns object (to allow chaining).
-
Patterns &enablePattern(const std::string &t, bool v)
Enable or disable a pattern not of class PreAliasPattern.
- Parameters
t – The name of the pattern to enable or disable.
v – If true then enable the pattern. If false then disable the pattern.
- Returns
A reference to this Patterns object (to allow chaining).
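For example (a minimal sketch; the pattern name is illustrative), the string overloads can toggle and query a pattern at runtime:
C++:
popart::Patterns patterns; // PatternsLevel::Default
patterns.enablePattern("SplitGather", false);
if (!patterns.isPatternEnabled("SplitGather")) {
  // The pattern will not be run.
}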
-
bool isInitAccumulateEnabled()
Check if InitAccumulatePattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isPreUniReplEnabled()
Check if PreUniRepl is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isPostNReplEnabled()
Check if PostNRepl is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isSoftMaxGradDirectEnabled()
Check if SoftMaxGradDirect is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isNlllWithSoftMaxGradDirectEnabled()
Check if NlllWithSoftMaxGradDirect is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isSplitGatherEnabled()
Check if SplitGatherPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isOpToIdentityEnabled()
Check if OpToIdentityPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isUpsampleToResizeEnabled()
Check if UpsampleToResizePattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isSubtractArg1GradOpEnabled()
Check if SubtractArg1GradOp is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isMulArgGradOpEnabled()
Check if MulArgGradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isReciprocalGradOpEnabled()
Check if ReciprocalGradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isAtan2Arg0GradOpEnabled()
Check if Atan2Arg0GradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isAtan2Arg1GradOpEnabled()
Check if Atan2Arg1GradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isDivArg0GradOpEnabled()
Check if DivArg0GradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isDivArg1GradOpEnabled()
Check if DivArg1GradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isPowArg0GradOpEnabled()
Check if PowArg0GradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isPowArg1GradOpEnabled()
Check if PowArg1GradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isSinGradOpEnabled()
Check if SinGradOp is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isCosGradOpEnabled()
Check if CosGradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
inline bool isInPlaceEnabled()
Check if InPlace is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
inline bool isUpdateInplacePrioritiesForIpuEnabled()
Check if UpdateInplacePrioritiesForIpu is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isSqrtGradOpEnabled()
Check if SqrtGradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isConvFlipWeightsDoubleFlipEnabled()
Check if ConvFlipWeightsDoubleFlipPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isConvFlipWeightsGradOpEnabled()
Check if ConvFlipWeightsGradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isExpandCastEnabled()
Check if ExpandCastPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isExpGradOpEnabled()
Check if ExpGradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isExpm1GradOpEnabled()
Check if Expm1GradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isLog1pGradOpEnabled()
Check if Log1pGradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isLogGradOpEnabled()
Check if LogGradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isNegativeOneScaleEnabled()
Check if NegativeOneScalePattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isMatMulOpEnabled()
Check if MatMulOp is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isMatMulLhsGradOpEnabled()
Check if MatMulLhsGradOp is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isMatMulRhsGradOpEnabled()
Check if MatMulRhsGradOp is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isRandomNormalLikeOpPatternEnabled()
Check if RandomNormalLikeOp is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isRandomUniformLikeOpPatternEnabled()
Check if RandomUniformLikeOp is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isZerosLikeOpPatternEnabled()
Check if ZerosLikeOp is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isDecomposeBinaryConstScalarEnabled()
Check if DecomposeBinaryConstScalar is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isFmodArg0GradOpEnabled()
Check if FmodArg0GradOpPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isLambSerialisedWeightEnabled()
Check if LambSerialisedWeightPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isTiedGatherEnabled()
Check if TiedGatherPattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
bool isTiedGatherAccumulateEnabled()
Check if TiedGatherAccumulatePattern is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
Patterns &enableInitAccumulate(bool v)
Enable or disable InitAccumulatePattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enablePreUniRepl(bool v)
Enable or disable PreUniRepl.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enablePostNRepl(bool v)
Enable or disable PostNRepl.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableSoftMaxGradDirect(bool v)
Enable or disable SoftMaxGradDirect.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableNlllWithSoftMaxGradDirect(bool v)
Enable or disable NlllWithSoftMaxGradDirect.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableSplitGather(bool v)
Enable or disable SplitGatherPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableOpToIdentity(bool v)
Enable or disable OpToIdentityPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableUpsampleToResize(bool v)
Enable or disable UpsampleToResizePattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableSubtractArg1GradOp(bool v)
Enable or disable SubtractArg1GradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableMulArgGradOp(bool v)
Enable or disable MulArgGradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableReciprocalGradOp(bool v)
Enable or disable ReciprocalGradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableAtan2Arg0GradOp(bool v)
Enable or disable Atan2Arg0GradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableAtan2Arg1GradOp(bool v)
Enable or disable Atan2Arg1GradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableDivArg0GradOp(bool v)
Enable or disable DivArg0GradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableDivArg1GradOp(bool v)
Enable or disable DivArg1GradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enablePowArg0GradOp(bool v)
Enable or disable PowArg0GradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enablePowArg1GradOp(bool v)
Enable or disable PowArg1GradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableSinGradOp(bool v)
Enable or disable SinGradOp.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableCosGradOp(bool v)
Enable or disable CosGradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
inline Patterns &enableInPlace(bool v)
Enable or disable InPlace.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
inline Patterns &enableUpdateInplacePrioritiesForIpu(bool v)
Enable or disable UpdateInplacePrioritiesForIpu.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableSqrtGradOp(bool v)
Enable or disable SqrtGradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableConvFlipWeightsDoubleFlip(bool v)
Enable or disable ConvFlipWeightsDoubleFlipPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableConvFlipWeightsGradOp(bool v)
Enable or disable ConvFlipWeightsGradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableExpGradOp(bool v)
Enable or disable ExpGradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableExpm1GradOp(bool v)
Enable or disable Expm1GradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableLog1pGradOp(bool v)
Enable or disable Log1pGradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableLogGradOp(bool v)
Enable or disable LogGradOpPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableNegativeOneScale(bool v)
Enable or disable NegativeOneScalePattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableMatMulOp(bool v)
Enable or disable MatMulOp.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableMatMulLhsGradOp(bool v)
Enable or disable MatMulLhsGradOp.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableMatMulRhsGradOp(bool v)
Enable or disable MatMulRhsGradOp.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableRandomNormalLikeOpPattern(bool v)
Enable or disable RandomNormalLikeOp.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableRandomUniformLikeOpPattern(bool v)
Enable or disable RandomUniformLikeOp.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableZerosLikeOpPattern(bool v)
Enable or disable ZerosLikeOp.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableDecomposeBinaryConstScalar(bool v)
Enable or disable DecomposeBinaryConstScalar.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableLambSerialisedWeight(bool v)
Enable or disable LambSerialisedWeightPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableTiedGather(bool v)
Enable or disable TiedGatherPattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
Patterns &enableTiedGatherAccumulate(bool v)
Enable or disable TiedGatherAccumulatePattern.
- Parameters
v – If true then enable the pattern. If false then disable the pattern.
-
inline Patterns &enableRuntimeAsserts(bool b)
Enable or disable runtime asserts.
If runtime asserts are enabled, then a check is performed to confirm that all mandatory patterns are enabled.
- Parameters
b – If true then enable runtime asserts. If false then disable runtime asserts.
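Because each of these enable methods returns the Patterns object itself, calls can be chained; a minimal sketch:
C++:
popart::Patterns patterns;
patterns.enableInPlace(true)
    .enablePostNRepl(true)
    .enableSplitGather(false)
    .enableRuntimeAsserts(true);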
-
std::vector<std::unique_ptr<PreAliasPattern>> getPreAliasList()
Get list of patterns to be run before aliasing.
- Returns
A vector of pointers to patterns of class PreAliasPattern.
-
bool operator==(const Patterns &p) const
Equality operator.
- Parameters
p – Pattern to compare to.
- Returns
true if patterns are equal; false otherwise.
-
inline const std::map<std::type_index, bool> &getSettings() const
Get the settings (enabled or disabled) for patterns.
- Returns
Map of which patterns are enabled or disabled, indexed by value of std::type_index.
-
inline bool getInplaceEnabled() const
Check if the pattern InPlace is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
inline bool getUpdateInplacePrioritiesForIpuEnabled() const
Check if the pattern UpdateInplacePrioritiesForIpu is enabled.
- Returns
true if pattern is enabled; false otherwise.
-
inline bool getRuntimeAssertsOn() const
Check if runtime asserts are enabled.
If runtime asserts are enabled, then a check is performed to confirm that all mandatory patterns are enabled.
- Returns
true if runtime asserts are enabled; false otherwise.
Public Static Functions
-
static Patterns create(std::vector<std::string> patterns)
Create a set of patterns to be run.
- Parameters
patterns – A vector of pattern names of patterns to be run.
-
static std::vector<std::string> getAllPreAliasPatternNames()
Get the names of all patterns of class PreAliasPattern, in the same order as getPreAliasList().
- Returns
A vector of the names of all patterns of class PreAliasPattern.
-
static bool isMandatory(Pattern &pattern)
Check if a pattern is mandatory.
Mandatory patterns must be enabled and must be run.
This method throws an error at runtime if the pattern is disabled and enableRuntimeAsserts() is true.
- Parameters
pattern – The pattern to check.
- Returns
true if the pattern is mandatory; false otherwise.
-
static bool isMandatory(std::string &patternName)
Check if a pattern is mandatory.
Mandatory patterns must be enabled and must be run.
This method throws an error at runtime if the pattern is disabled and enableRuntimeAsserts() is true.
- Parameters
patternName – The name of the pattern to check.
- Returns
true if the pattern is mandatory; false otherwise.
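A guarded-disable sketch (the pattern name is illustrative); note that the std::string overload takes a non-const reference, so an lvalue is required:
C++:
popart::Patterns patterns;
std::string name = "PostNRepl"; // illustrative pattern name
if (!popart::Patterns::isMandatory(name)) {
  patterns.enablePattern(name, false);
}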
Friends
-
friend std::ostream &operator<<(std::ostream &os, const Patterns &patterns)
Write a string representation of patterns to an output stream.
- Parameters
os – The output stream that the string representation should be written to.
patterns – The patterns for which the string representation is created.
- Returns
The output stream containing the string representation of the patterns.
-
class PreAliasPattern : public popart::Pattern
Subclassed by popart::AllReduceToIdentityPattern, popart::BinaryGradOpPattern, popart::ContiguateIpuCopyIndicesPattern, popart::ConvDataGradPattern, popart::ConvFlipWeightsDoubleFlipPattern, popart::ConvFlipWeightsGradOpPattern, popart::ConvTransposePattern, popart::CosGradOpPattern, popart::CoshOpPattern, popart::DecomposeBinaryConstScalar, popart::ElementWiseGradOpPattern< GRADOP, DOP >, popart::ExpandCastPattern, popart::ExpGradOpPattern, popart::Expm1GradOpPattern, popart::Fuser, popart::InitAccumulatePattern, popart::LambSerialisedWeightPattern, popart::LikeOpsPattern< L >, popart::Log1pGradOpPattern, popart::LogGradOpPattern, popart::LoopScanOutPattern, popart::LSTMPattern, popart::MatMulGradPattern, popart::MatMulPattern, popart::MulArgGradOpPattern, popart::NlllWithSoftmaxGradDirect, popart::OptimizerDecompose, popart::PackedDataBlockPattern, popart::PadSumPattern, popart::PostNRepl, popart::PreUniRepl, popart::ReciprocalGradOpPattern, popart::RemoveUnnecessaryLossGradCast, popart::ScanToLoopPattern, popart::SequenceExpander, popart::SlicePattern, popart::SplitGatherPattern, popart::SplitOpPattern, popart::SqrtGradOpPattern, popart::SumToAddPattern, popart::TiedGatherAccumulatePattern, popart::TiedGatherPattern, popart::TransposeToIdentityOrReshapePattern, popart::UpsampleToResizePattern, popart::ViewSimplifyPattern
14.9.1. Available patterns
-
class AllReduceToIdentityPattern : public popart::PreAliasPattern
-
class BinaryGradOpPattern : public popart::PreAliasPattern
Subclassed by popart::Atan2Arg0GradOpPattern, popart::Atan2Arg1GradOpPattern, popart::DivArg0GradOpPattern, popart::DivArg1GradOpPattern, popart::FmodArg0GradOpPattern, popart::PowArg0GradOpPattern, popart::PowArg1GradOpPattern, popart::SubtractArg1GradOpPattern
-
class ContiguateIpuCopyIndicesPattern : public popart::PreAliasPattern
-
class ConvDataGradPattern : public popart::PreAliasPattern
-
class ConvFlipWeightsDoubleFlipPattern : public popart::PreAliasPattern
-
class ConvFlipWeightsGradOpPattern : public popart::PreAliasPattern
-
class ConvTransposePattern : public popart::PreAliasPattern
-
class CosGradOpPattern : public popart::PreAliasPattern
-
class CoshOpPattern : public popart::PreAliasPattern
-
class DecomposeBinaryConstScalar : public popart::PreAliasPattern
-
template<class GRADOP, class DOP>
class ElementWiseGradOpPattern : public popart::PreAliasPattern
-
class ExpGradOpPattern : public popart::PreAliasPattern
-
class ExpandCastPattern : public popart::PreAliasPattern
-
class Expm1GradOpPattern : public popart::PreAliasPattern
-
class Fuser : public popart::PreAliasPattern
Subclassed by popart::SoftmaxGradDirect
-
class InitAccumulatePattern : public popart::PreAliasPattern
-
class LSTMPattern : public popart::PreAliasPattern
-
class LambSerialisedWeightPattern : public popart::PreAliasPattern
This pattern finds weights that have been serialised and are updated in slices by the Lamb optimiser.
Transforming:
Slice(W)     U_sliced      }
   | (R1)       | (R2)     }
ReduceScatter   |          } (Optional, to support RTS)
   |            |          }
LambSquare   LambSquare    } x N
   |            |          }
AllReduce    AllReduce     } (Optional, to support RTS)
    \          /           }
   AdamVarUpdate           }
Into:
Slice(W)     U_sliced      }
   |            |          }
ReduceScatter   |          } (Optional, to support RTS)
   |            |          }
LambSquare   LambSquare    } x N
   |            |          }
  Sum          Sum
    \          /
AllReduce    AllReduce     } (Optional, to support RTS)
    \          /           }
   AdamVarUpdate           } x N
A key property of LambSquare is that the square root has not yet been applied to its output, so it is valid to simply sum the outputs.
-
template<class L>
class LikeOpsPattern : public popart::PreAliasPattern
-
class Log1pGradOpPattern : public popart::PreAliasPattern
-
class LogGradOpPattern : public popart::PreAliasPattern
-
class LoopScanOutPattern : public popart::PreAliasPattern
-
class MatMulGradPattern : public popart::PreAliasPattern
Subclassed by popart::MatMulLhsGradPattern, popart::MatMulRhsGradPattern
Public Functions
-
class MatMulPattern : public popart::PreAliasPattern
-
class MulArgGradOpPattern : public popart::PreAliasPattern
-
class NlllWithSoftmaxGradDirect : public popart::PreAliasPattern
-
class OptimizerDecompose : public popart::PreAliasPattern
Subclassed by popart::AdamDecompose, popart::AdaptiveDecompose, popart::SGD0Decompose, popart::SGD1Decompose, popart::SGD2Decompose
-
class PackedDataBlockPattern : public popart::PreAliasPattern
-
class PadSumPattern : public popart::PreAliasPattern
-
class PostNRepl : public popart::PreAliasPattern
-
class PreUniRepl : public popart::PreAliasPattern
-
class ReciprocalGradOpPattern : public popart::PreAliasPattern
-
class RemoveUnnecessaryLossGradCast : public popart::PreAliasPattern
The RemoveUnnecessaryLossGradCast pattern changes
fp32_lossScale -- Cast -- fp16_lossScale -- NllLossGradOp -- fp16_grad
                             fp16_probs ------'
to
fp32_lossScale -- NllLossGradOp -- fp16_grad
   fp16_probs ------'
This corner case can occur in a model with fp16 activations when its fp16 loss scale is anchored for summation and upcast to fp32 in order to prevent overflow. In this case, if the loss scale is >max(fp16), the downcast will clip the loss scale.
Note that even if the loss scale is >max(fp16), the resulting gradients can be within fp16 range. If the resulting gradients are >max(fp16), they will be clipped (unless the user has enabled NaN on overflow).
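As a worked example of the clipping: max(fp16) = 65504, so an fp32 loss scale of 2^17 = 131072, downcast to fp16, saturates at 65504, silently halving the effective loss scale. With the cast removed, NllLossGradOp consumes the fp32 loss scale directly.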
-
class ScanToLoopPattern : public popart::PreAliasPattern
-
class SequenceExpander : public popart::PreAliasPattern
Subclassed by popart::NegativeOneScalePattern, popart::OpToIdentityPattern, popart::SplitGradOpToConcatPattern
-
class SplitGatherPattern : public popart::PreAliasPattern
-
class SplitOpPattern : public popart::PreAliasPattern
-
class SqrtGradOpPattern : public popart::PreAliasPattern
-
class SumToAddPattern : public popart::PreAliasPattern
-
class TiedGatherAccumulatePattern : public popart::PreAliasPattern
-
class TiedGatherPattern : public popart::PreAliasPattern
-
class TransposeToIdentityOrReshapePattern : public popart::PreAliasPattern
-
class UpsampleToResizePattern : public popart::PreAliasPattern
-
class ViewSimplifyPattern : public popart::PreAliasPattern
-
class AdamDecompose : public popart::OptimizerDecompose
Public Functions
-
TensorId rescaleRatio(Graph &graph, AdamComboOp *combo) const
-
class AdaptiveDecompose : public popart::OptimizerDecompose
-
class Atan2Arg0GradOpPattern : public popart::BinaryGradOpPattern
-
class Atan2Arg1GradOpPattern : public popart::BinaryGradOpPattern
-
class DivArg0GradOpPattern : public popart::BinaryGradOpPattern
-
class DivArg1GradOpPattern : public popart::BinaryGradOpPattern
-
class FmodArg0GradOpPattern : public popart::BinaryGradOpPattern
-
class MatMulLhsGradPattern : public popart::MatMulGradPattern
Public Functions
-
class MatMulRhsGradPattern : public popart::MatMulGradPattern
Public Functions
-
class NegativeOneScalePattern : public popart::SequenceExpander
-
class OpToIdentityPattern : public popart::SequenceExpander
-
class PowArg0GradOpPattern : public popart::BinaryGradOpPattern
-
class PowArg1GradOpPattern : public popart::BinaryGradOpPattern
-
class SGD0Decompose : public popart::OptimizerDecompose
Decomposes an SGD0ComboOp into the Ops and Tensors that implement the SGD0 optimiser step it describes.
If gradient accumulation is used, this will create the accum tensor (gradient accumulator). This is not a persistent state tensor, so it will not be added to the Ir’s model proto. The tensor’s id will have the prefix reservedAccumPrefix(). If the tensor has an initialiser in the model proto, the tensor will be initialised with that value. Otherwise, it will be initialised to 0. The DataType of the tensor is as specified in the SGD0ComboOp.
See also SGD.
Recall the SGD0 optimiser step, possibly with gradient accumulation and replication:
(_) for each micro batch
(1)   g = allReduce(g)   [if OptimizerReductionType=GradReduce]
(2)   a += g             [if grad acc]
(_)   [let a := g if not grad acc]
(3) a = allReduce(a)     [if OptimizerReductionType=AccumReduce]
(4) w = (w * wdsf0) - (slr0 * a)
(5) a = 0                [if grad acc]
(1) is implemented by a ReplicatedAllReduceOp.
(2) is implemented by an AccumulateOp.
(3) is implemented by a ReplicatedAllReduceInplaceOp.
(4) is implemented by an SGD0VarUpdateOp.
(5) is implemented by an AccumulatorUpdateOp.
For all the above ops, if they consume a non-const OptimizerValue, then the SGD0ComboOp will have an additional input for that scalar, which will be connected to the new Op.
If gradient accumulation is used, (3)-(5) are put outside the microbatch loop implicitly by setting
op->settings.executionContext = ExecutionContext::AccumulateOuterFragment
Additionally, we will set
op->settings.schedulePriority = 0.0f
op->setExecutionPhases({})
because VarUpdateOps default to the minimum possible schedule priority so they are scheduled last, but this is not desirable for gradient accumulation, so we reset to a neutral priority.
The SGD0ComboOp will be disconnected and erased.
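As a scalar illustration of step (4) only (this models the update arithmetic, not PopART internals; names follow the pseudocode above):
C++:
// Scalar model of the SGD0 var update, step (4).
float w = 1.0f;      // weight
float a = 0.5f;      // (accumulated) gradient
float wdsf0 = 1.0f;  // weight decay scale factor
float slr0 = 0.1f;   // scaled learning rate
w = (w * wdsf0) - (slr0 * a); // w is now 0.95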
-
class SGD1Decompose : public popart::OptimizerDecompose
Decomposes an SGD1ComboOp into the Ops and Tensors that implement the SGD1 optimiser step it describes.
Will create the accl tensor (combined accumulator and first-order momentum). The tensor will be added to the Ir’s model proto, so will be part of the serialised Ir. The tensor’s id will have the prefix reservedAcclPrefix(). If the tensor has an initialiser in the model proto, the tensor will be initialised with that value. Otherwise, it will be initialised to slr1 * w_0, where w_0 is the initial value of w.
See also SGD.
Recall the SGD1 optimiser step, possibly with gradient accumulation and replication:
(_) for each micro batch
(1)   allReduce(g)     [if OptimizerReductionType=GradReduce]
(2)   v += dpsf1 * g
(_)   if enable nesterov momentum: a += g
(3) v = allReduce(v)   [if OptimizerReductionType=AcclReduce]
(_) if enable nesterov momentum: a = allReduce(a)   [if OptimizerReductionType=AcclReduce]
(4) if enable nesterov momentum:
      ils = ndsf * dpsf1
      a = ngsf * (ils * a + wd * w) + mm * v
(_) [let x := g if enable nesterov momentum else v]
(5) w = w - slr1 * x
(6) v = v * smm1 + swd1 * w
See the SGD docs in optimizer.hpp for derivation of the above.
(1) is implemented by a ReplicatedAllReduceOp.
(2) is implemented by an AccumulateOp.
(3) is implemented by a ReplicatedAllReduceInplaceOp.
(4) is implemented by a MulOp and a SGD1NesterovOp.
(5) is implemented by an SGD1VarUpdateOp.
(6) is implemented by an SGD1AcclUpdateOp.
For all the above ops, if they consume a non-const OptimizerValue, then the SGD1ComboOp will have an additional input for that scalar, which will be connected to the new Op.
If gradient accumulation is used, (3), (4), (5) and (6) are put outside the microbatch loop implicitly by setting
op->settings.executionContext = ExecutionContext::AccumulateOuterFragment
Additionally, we will set
op->settings.schedulePriority = 0.0f
because VarUpdateOps default to the minimum possible schedule priority so they are scheduled last, but this is not desirable for gradient accumulation, so we reset to a neutral priority.
The SGD1ComboOp will be disconnected and erased.
Additional topo cons are required to ensure the VarUpdateOps run in the manner described above. We also must transfer the topo cons from the SGD1ComboOp to the new ops without breaking this. To do this:
At the start of apply, add a topo con from (1) to the combo op.
Transfer topo cons from combo to (2). Since (1)/(2) are the first op to run in the optimiser step (the other ops consume (2)’s output so will always run after), this ensures the pre-existing topo cons on combo are respected.
Insert topo con from (5) to (6), to ensure w update happens before the next step’s v update.
-
class SGD2Decompose : public popart::OptimizerDecompose
Decomposes an SGD2ComboOp into the Ops and Tensors that implement the SGD2 optimiser step it describes.
Will create the accl1 tensor (first-order momentum). The tensor will be added to the Ir’s model proto, so will be part of the serialised Ir. The tensor’s id will have the prefix reservedAccl1Prefix(). The tensor will be initialised to 0. The DataType of the tensor is as specified in the SGD2ComboOp.
See also SGD.
If gradient accumulation is used, this will create the accum tensor (gradient accumulator). This is not a persistent state tensor, so it will not be added to the Ir’s model proto. The tensor’s id will have the prefix reservedAccumPrefix(). If the tensor has an initialiser in the model proto, the tensor will be initialised with that value. Otherwise, it will be initialised to 0. The DataType of the tensor is as specified in the SGD2ComboOp.
Recall the SGD2 optimiser step, possibly with gradient accumulation and replication:
(_) for each micro batch
(1)   g = allReduce(g)   [if OptimizerReductionType=GradReduce]
(2)   a += g             [if grad acc]
(_)   [let a := g if not grad acc]
(3) a = allReduce(a)     [if OptimizerReductionType=AccumReduce]
(_) // Note we break the single v update equation into two steps:
(4) v += dpsf1 * a
(5) v = v * smm1 + swd1 * w
(6) if enable nesterov momentum:
      ils = ndsf * dpsf1
      a = ngsf * (ils * a + wd * w) + mm * v
(_) [let x := a if enable nesterov momentum else v]
(7) w = w - slr1 * x
(8) a = 0                [if grad acc]
See the SGD docs in optimizer.hpp for derivation of the above.
(1) is implemented by a ReplicatedAllReduceOp.
(2) is implemented by an AccumulateOp.
(3) is implemented by a ReplicatedAllReduceInplaceOp.
(4) is implemented by an AccumulateOp.
(5) is implemented by an SGD2AcclUpdateOp. Note this is equivalent to an SGD1AcclUpdateOp.
(6) is implemented by a MulOp and a SGD1NesterovOp.
(7) is implemented by an SGD2VarUpdateOp. Note this is equivalent to an SGD1VarUpdateOp.
(8) is implemented by an AccumulatorUpdateOp.
For all the above ops, if they consume a non-const OptimizerValue, then the SGD2ComboOp will have an additional input for that scalar, which will be connected to the new Op.
If gradient accumulation is used, (3)-(8) are put outside the microbatch loop implicitly by setting
op->settings.executionContext = ExecutionContext::AccumulateOuterFragment
Additionally, we will set
op->settings.schedulePriority = 0.0f
op->setExecutionPhases({})
because VarUpdateOps default to the minimum possible schedule priority so they are scheduled last, but this is not desirable for gradient accumulation, so we reset to a neutral priority.
The SGD2ComboOp will be disconnected and erased.
Additional topo cons are required to ensure the VarUpdateOps run in the manner described above. We also must transfer the topo cons from the SGD2ComboOp to the new ops without breaking this. To do this:
Transfer topo cons from combo to (1).
Transfer topo cons from combo to (2).
Insert topo con from (7) to (8) to ensure accum not zeroed until after v update (which consumes it).
Transfer topo cons from combo to (8). Only required if not grad acc.
-
class SplitGradOpToConcatPattern : public popart::SequenceExpander
-
class SubtractArg1GradOpPattern : public popart::BinaryGradOpPattern
14.10. Transforms
#include <popart/transforms/transform.hpp>
-
class Transform
Subclassed by popart::AccumulateOuterFragmentParallelizer, popart::Autodiff, popart::AutomaticLossScale, popart::AutoVirtualGraph, popart::BatchSerialize, popart::ClipWeightGradientsByNorm, popart::ContiguateCollectivesTransform, popart::DecomposeLoops, popart::DecomposeSum, popart::DynamicOpTransform, popart::EnsureFp32LossScale, popart::ExplicitRecompute, popart::HostIOSetup, popart::InferPipelineStages, popart::InplaceAccumulateGradPartialsIntoOptimizerAccumTensor, popart::InterIpuCopy, popart::IoComputeTileCopy, popart::MainLoops, popart::MergeCollectivesTransform, popart::MergeCopies, popart::MergeDuplicateOps, popart::MergeExchange, popart::MergeLoops, popart::MergeVarUpdates, popart::OverlapIO, popart::Pipeline, popart::PreAutomaticLossScale, popart::Prune, popart::RandomSetup, popart::RemoteSetup, popart::SerializeMatMuls, popart::StochasticRounding, popart::StreamingMemory, popart::SubgraphOutline
14.10.1. Available transforms
-
class AccumulateOuterFragmentParallelizer : public popart::Transform
Public Functions
-
AccumulateOuterFragmentParallelizer()
-
virtual ~AccumulateOuterFragmentParallelizer()
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class AutoVirtualGraph : public popart::Transform
Public Functions
-
inline AutoVirtualGraph()
-
inline ~AutoVirtualGraph() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class Autodiff : public popart::Transform
Class responsible for the automatic differentiation (autodiff) transform.
Public Functions
-
bool apply(Graph &graph) const override
Perform automatic differentiation.
Implemented as applyToIr(graph.getIr()).
- Parameters
graph – The autodiff transform is applied to the IR containing the Graph graph.
- Returns
An indication of whether the automatic differentiation has been completed (true) or not (false).
-
virtual bool applyToIr(Ir &ir) const
Perform automatic differentiation.
- Parameters
ir – The IR to apply the autodiff transform to.
- Returns
An indication of whether the automatic differentiation has been completed (true) or not (false).
-
virtual FwdGraphToBwdGraphInfo apply(Ir &ir, const GraphId &fwdGraphId, const nonstd::optional<TensorIds> &gradsProvidedForFwdId, const nonstd::optional<TensorIds> &gradsRequiredForFwdId, const FwdGraphToBwdGraphInfo &calledGraphsGradInfo, AutodiffStitchStrategy stitchStrategy)
Create a backward graph.
Apply createBwdGraph() and stitch() recursively, top-down, to create a backward graph for the forward graph with ID fwdGraphId.
The forward graph being differentiated can call subgraphs. If the autodiff transform has already been applied to the subgraphs and the result stored in calledGraphsGradInfo, then the backward graph that has already been created for the subgraphs will be used. Otherwise, this method will recurse on the subgraphs.
When recursing on a subgraph, this method does not know for which tensors gradients are required. If a null gradsRequiredForFwdId is passed, the autodiff transform will produce gradients for all input tensors.
For control over which gradients are produced for the subgraph, first (manually) call the autodiff transform on the subgraph and pass gradsRequiredForFwdId. Store the resultant BwdGraphInfo in the FwdGraphToBwdGraphInfo map passed to the autodiff call for the forward graph.
NOTE: This method may fail if any required gradient cannot be produced.
- Parameters
ir – The IR to which this transform is applied.
fwdGraphId – The ID of the graph to differentiate.
gradsProvidedForFwdId – An optional list of tensors (normally outputs of the forward graph) for which gradient tensors are available to be used as inputs to the backward graph. If set, the autodiff transform will make the gradients of these forward tensors the first inputs of the backward graph. If not set, the autodiff transform will use whatever gradients of outputs of the forward graph it needs as outputs of the backward graph to allow the backward graph to produce the gradients that are required.
gradsRequiredForFwdId – The IDs of the forward graph tensors for which the backward graph should produce gradients. If set, the backward graph will compute only the gradients of the specified tensors and mark them as outputs. If one of these gradients cannot be computed, it is an error. If not set, the backward graph will produce as many gradients of the forward graph inputs as possible, and mark these all as outputs. If set, but an empty vector is passed, this is an error, as you are requesting that no gradients be computed at all, resulting in an empty graph.
calledGraphsGradInfo – The result of applying the
autodiff
transform to the graphs that are called by subgraph ops in the forward graph. It is a precondition of this method that the graphs provided in this map are stitched.stitchStrategy – The method used to stitch any result of the
autodiff
transform for graphs that are directly or indirectly called by the graph. This stitch strategy will be universally applied to all relevant inputs.
- Returns
An FwdGraphToBwdGraphInfo object that contains BwdGraphInfo for all descended graphs and for which all entries have the following properties:
expectedInputs may contain a tuple (t, ExpectedConnectionType::Fwd) if t is an input or output tensor of the forward graph. Only tensors t in gradsProvidedForFwdId may appear as a tuple (t, ExpectedConnectionType::FwdGrad) in expectedInputs. If gradsProvidedForFwdId is set, the first inputs will match the gradients of gradsProvidedForFwdId, respecting the order.
expectedOutputs may only contain tuples of the type (t, ExpectedConnectionType::FwdGrad) where t is an input tensor of the forward graph. If gradsRequiredForFwdId is set, the expectedOutputs list matches the size and order of gradsRequiredForFwdId exactly. If not set, the list is ordered in the order of the forward graph inputs, although some gradients of the forward graph inputs may be missing.
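A hedged call sketch (ir is an existing popart::Ir; the graph ID and the stitch strategy enumerator are assumptions for illustration, not values confirmed by this document):
C++:
popart::Autodiff autodiff;
popart::FwdGraphToBwdGraphInfo calledGraphsGradInfo; // empty: recurse as needed
auto gradInfo = autodiff.apply(
    ir,
    popart::GraphId("fwd"),                 // assumed graph ID
    nonstd::optional<popart::TensorIds>(),  // gradsProvidedForFwdId unset
    nonstd::optional<popart::TensorIds>(),  // gradsRequiredForFwdId unset
    calledGraphsGradInfo,
    popart::AutodiffStitchStrategy::RecomputeMinimal); // assumed enumerator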
-
virtual BwdGraphInfo createBwdGraph(Ir &ir, const GraphId &fwdGraphId, const nonstd::optional<TensorIds> &gradsProvidedForFwdId, const nonstd::optional<TensorIds> &gradsRequiredForFwdId, const FwdGraphToBwdGraphInfo &calledGraphsGradInfo)
Create backward graph information for a specific subgraph (non-recursive).
This method returns an “unstitched” result. This means that it is not guaranteed that all non-gradient inputs to a backward graph are available as inputs or outputs of the forward graph. This is a precondition for BwdGraphInfo objects used as values in calledGraphsGradInfo, so you must call stitch() on the result before using the result information in an autodiff call.
NOTE: This method may fail if any required gradient cannot be produced.
- Parameters
ir – The IR to which this transform is applied.
fwdGraphId – The ID of the subgraph to differentiate.
gradsProvidedForFwdId – An optional list of tensors (normally outputs of the forward graph) for which gradient tensors are available to be used as inputs to the backward graph. If set, autodiff will make the gradients of these forward tensors the first inputs of the backward graph. If not set, autodiff will use whatever gradients of outputs of the forward graph it needs as outputs of the backward graph to allow the backward graph to produce the gradients that are required.
gradsRequiredForFwdId – The IDs of the forward graph tensors for which the backward graph should produce gradients. If set, the backward graph will compute only the gradients of the specified tensors and mark them as outputs. If one of these gradients cannot be computed, it is an error. If not set, the backward graph will produce as many gradients of the forward graph inputs as possible, and mark all these as outputs. If set, but an empty vector is passed, this is an error, as you are requesting no gradients be computed at all, resulting in an empty graph.
calledGraphsGradInfo – The result of applying autodiff to the graphs that are called by subgraph ops in the forward graph. It is a precondition of this method that the graphs provided in this map are stitched.
- Returns
A BwdGraphInfo object with the following properties:
expectedInputs may contain arbitrary tuples (t, ExpectedConnectionType::Fwd) where t is any tensor in the forward graph (it need not be an input or output). Only tensors t in gradsProvidedForFwdId may appear as a tuple (t, ExpectedConnectionType::FwdGrad) in expectedInputs. If gradsProvidedForFwdId is set, the first inputs will match the gradients of gradsProvidedForFwdId, respecting the order.
expectedOutputs may only contain tuples of the type (t, ExpectedConnectionType::FwdGrad) where t is an input tensor of the forward graph. If gradsRequiredForFwdId is set, the expectedOutputs list matches the size and order of gradsRequiredForFwdId exactly. If not set, the list is ordered in the order of the forward graph inputs, although some gradients of the forward graph inputs may be missing.
-
virtual BwdGraphInfo stitch(Ir &ir, const GraphId &fwdGraphId, const BwdGraphInfo &bwdGraphInfo, AutodiffStitchStrategy stitchStrategy, const nonstd::optional<std::vector<InIndex>> &stitchIndices)
Stitch a forward-backward graph pair.
To stitch a forward-backward graph pair means to make it so that the backward graph no longer has any non-gradient inputs associated with forward graph tensors that are neither inputs nor outputs of the forward graph.
When applying the autodiff transform to a graph, PopART assumes that all input tensors to the gradient ops are either 1) a forward op input, 2) a forward op output, or 3) the gradient of a forward op output. For this to be true for gradient ops of subgraph ops (for example, CallOp and IfOp), typically the backward graphs of those called subgraphs must not have inputs that are associated with non-gradient forward tensors that are neither inputs nor outputs of the forward graph. This is because the inputs and outputs of a forward subgraph typically map to the inputs and outputs of the associated forward op. Similarly, the inputs and outputs of a backward subgraph typically map to the inputs and outputs of the associated gradient op.
For stitch strategies that affect the forward graph’s inputs or outputs, stitch() should also amend all call sites of the forward graph as appropriate. Conversely, for the backward graphs, it is assumed there are no call sites, as it is anticipated that this method is called before parents of the backward graph exist.
NOTE: This method may modify the forward graph, backward graph, or any graphs that call these graphs, depending on the method. It may also raise a popart::error if it is unable to stitch an index.
- Parameters
ir – The IR in the context of which this transformation is applied.
fwdGraphId – The ID of the subgraph to differentiate.
bwdGraphInfo – The data structure describing the backward graph.
stitchStrategy – The method by which to stitch any autodiff result for graphs that are directly or indirectly called by the graph.
stitchIndices – If provided, backward graph input indices not in this list must be ignored and backward graph input indices in this list must be stitched (or an exception raised). If not set, it is up to the stitcher to decide what indices to stitch.
- Throws
popart::error – If it is unable to stitch an index.
- Returns
An updated BwdGraphInfo data structure (with some expectedInputs removed).
-
inline std::size_t getId() const override
Get the ID of the autodiff transform.
-
inline std::string getName() const override
Get the name of the autodiff transform.
Public Static Functions
-
static std::size_t id()
ID of the autodiff transform.
-
class AutomaticLossScale : public popart::Transform
Public Functions
-
inline AutomaticLossScale()
-
inline ~AutomaticLossScale() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
static Op *executeOpNTimesEveryMTimes(Op *op, unsigned n, unsigned m, const std::map<InIndex, OutIndex> &identityInputToOutputIndiciesMapping, const std::map<OutIndex, float> &outputIndiciesAndValues, AliasModel &aliasMode)
When applied to an op, the op will effectively be executed n times in every m executions.
It returns a pointer to an IfOp which either calls an ‘empty’ subgraph, or calls a subgraph containing the op passed as the argument. The ‘empty’ subgraph is meant to be low-intensity compute. It is possible to connect inputs and outputs via nop operations and to set up default values for outputs in the ‘empty’ subgraph.
- Parameters
op – Operator whose execution frequency is modified.
n – Execute the op n times every m times.
m – Execute the op n times every m times.
identityInputToOutputIndiciesMapping – Specifies the connections of inputs to outputs via nop operations in the ‘empty’ subgraph. Each pair must have the same shape and type.
outputIndiciesAndValues – Map of pairs of output indices and values. Note: inplacing and aliasing of inputs are not supported. If the op inplace-modifies or aliases an input, this will no longer be the case in the transformed graph after this method is called.
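A hedged call sketch (all values are illustrative; op is an existing popart::Op*, and AliasModel is assumed to be default-constructible here):
C++:
// Run `op` once in every 4 executions; in the 'empty' subgraph, input 0
// is forwarded to output 0 via a nop.
popart::AliasModel aliasModel;
popart::Op *ifOp = popart::AutomaticLossScale::executeOpNTimesEveryMTimes(
    op, /*n=*/1, /*m=*/4,
    /*identityInputToOutputIndiciesMapping=*/{{0, 0}},
    /*outputIndiciesAndValues=*/{},
    aliasModel);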
-
class BatchSerialize : public popart::Transform
Public Functions
-
inline BatchSerialize(int pass_)
-
inline ~BatchSerialize() override
-
inline std::size_t getId() const final
-
inline std::string getName() const final
Public Static Functions
-
static std::size_t id(int)
-
class ContiguateCollectivesTransform : public popart::Transform
A transform that inserts topological constraints into the graph.
These force collective operations which can potentially be merged to be scheduled contiguously (one right after the other) in the schedule.
Currently supported collective types:
ReplicatedAllReduceOp
ReplicatedReduceScatterOp
ReplicatedAllGatherOp
Public Functions
-
inline ContiguateCollectivesTransform()
-
inline ~ContiguateCollectivesTransform() override
-
inline std::size_t getId() const override
-
inline std::string getName() const override
-
template<typename BaseType>
void processOp(BaseType *baseOp, const std::vector<Op*> &schedule, std::set<Op*, POpCmp> &opsToProcess) const
Processing baseOp involves finding all other collective ops in the graph with which baseOp can be merged, then inserting constraints between the matching ops and baseOp that ensure the ops are scheduled contiguously, one after another.
- Parameters
baseOp – is the Op that should be merged with other collectives
schedule – is a vector of ops sorted in schedule order
opsToProcess – is set of all other collective ops in the graph (which are candidates for merging with base op)
- Returns
void; modifies the graph of baseOp.
Public Static Functions
-
static std::size_t id()
-
template<typename BaseType>
static bool checkCollectiveOp(BaseType *baseOp, BaseType *candidate)
Check whether two ops use the same collective operator.
- Parameters
baseOp – against which to compare the candidate op
candidate – op of the same type as baseOp
- Returns
true, if the two ops use the same collective operator or if neither uses a collective operator
-
template<typename BaseType>
static std::set<BaseType*, POpCmp> lookForMatchingOps(BaseType *baseOp, const std::vector<Op*> &schedule, std::set<Op*, POpCmp> &opsToProcess)
Loop through the ops in the schedule and find those matching baseOp. To avoid merging the same op twice, each candidate must still be in opsToProcess.
- Parameters
baseOp – the op that should be merged with other collectives
schedule – the schedule of the (Collective) ops in the graph
opsToProcess – the (Collective) ops that can still be considered for merging
- Returns
A set of collective ops that can be merged with baseOp.
-
class DecomposeGradSum : public popart::DecomposeSum
Public Functions
-
inline std::size_t getId() const override
-
inline std::string getName() const override
Public Static Functions
-
static std::size_t id()
-
class DecomposeLoops : public popart::Transform
Transform that generically decomposes/unrolls loop iterations to:
Unroll LoopOp iterations in general
Arrange IO Ops to enable overlap between IO and compute tiles
Arrange Ops’ PipelineStages to enable overlap between PipelineStages
If we want to unroll a loop by a factor of 2, each Op that existed in the loop needs 3 instances, denoted as 0, 1 and 2, one per apparent iteration. If we want to unroll such that iterations can partially overlap (IO and compute overlap), we can’t generally, for all operations, place 0 before the loop, 2 after the loop and 1 during the loop (see skewed unrolling below), because this would not lead to overlap between either pipeline stages or IO and compute operations.
Rather, we classify Ops (see DecomposeLoopOpTypeEnum), according to their data and topological dependencies and the tile set they run on, into one of the categories. The available categories depend on the DecomposeLoopModel implementation. We can then shuffle the operations to before, during and after the loop accordingly. Note that every operation is cloned 2 extra times (for an unroll factor of 2), but the original operation in the loop remains.
However, the “apparent iteration” (iteration that the Op instance corresponds to in the LoopOp before unrolling) has changed.
The number of apparent iterations in total is always the unroll factor (counting all iterations before and after the loop) plus one iteration for the loop itself:
num_apparent_iterations = unroll_factor + 1
In loop iteration n, the Ops (depending on classification) now correspond to iterations i (0), i+1 (1) and i+2 (2) respectively. The Ops unrolled before the loop process iterations 0 (0) and 1 (1). The Ops unrolled after the loop process iterations n-1 (1) and n (2), where (0), (1) and (2) correspond to the cloned operations.
As an example for apparent iteration: Before unrolling, there is an operation in a loop (denoted as {}): { Op }
If we unroll by a factor of 2, the operation is cloned into the parent graph twice, and there are different possible arrangements, depending on how we skew the unrolling:
a.) { Op } - Op0 - Op1
In this case:
Op  - unrollIndex -1 - apparent iteration 0 - before loop: no
Op0 - unrollIndex 0  - apparent iteration 1 - before loop: no
Op1 - unrollIndex 1  - apparent iteration 2 - before loop: no
(use case example: if Op is a HostStoreOp that should do overlapped IO with compute (such as a MatMulOp))
b.) Op0 - { Op } - Op1
In this case:
Op  - unrollIndex -1 - apparent iteration 1 - before loop: no
Op0 - unrollIndex 0  - apparent iteration 0 - before loop: yes
Op1 - unrollIndex 1  - apparent iteration 2 - before loop: no
(use case example: if Op is a MatMulOp that should do overlapped compute with IO (such as HostLoadOp and HostStoreOp))
c.) Op0 - Op1 - { Op }
In this case:
Op  - unrollIndex -1 - apparent iteration 2 - before loop: no
Op0 - unrollIndex 0  - apparent iteration 0 - before loop: yes
Op1 - unrollIndex 1  - apparent iteration 1 - before loop: yes
(use case example: if Op is a HostLoadOp that should do overlapped IO with compute (such as a MatMulOp))
Use case example:
HostLoadOp0     HostLoadOp1     { HostLoadOp }
MatMulOp0       { MatMulOp }    MatMulOp1
{ HostStoreOp } HostStoreOp0    HostStoreOp1
^^^^^^^^^^^     ^^^^^^^^^^^     ^^^^^^^^^^^^
overlap         overlap         overlap
{ } denotes the LoopOp
Where the data dependencies are:
HostLoadOp0 -> MatMulOp0 -> HostStoreOp
HostLoadOp1 -> MatMulOp  -> HostStoreOp0
HostLoadOp  -> MatMulOp1 -> HostStoreOp1
This skew is controlled by the decomposition model (see DecomposeLoopOpTypeEnum for details). If the model is unrolling pipeline stages, for example, each stage will be skewed differently (see DecomposeLoopPipelineModel).
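A minimal usage sketch (graph is an existing popart::Graph containing the LoopOps to unroll):
C++:
popart::DecomposeLoops decompose;
// Decomposes all LoopOps in `graph` using the standard
// DecomposeLoopOverlapModel (see apply() below).
bool ok = decompose.apply(graph);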
Public Functions
-
inline DecomposeLoops()
-
inline ~DecomposeLoops() override
-
virtual bool apply(Graph &graph) const final
Decomposes all LoopOps in the graph using the standard model of loop decomposition (which is DecomposeLoopOverlapModel()).
- Parameters
graph – Graph containing the LoopOp to decompose.
- Returns
true if apply is successful. An error will be thrown if not.
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
static bool isComputeOp(Op *op)
Check if an Op should be classified as compute.
The condition is that the operation is on compute tiles.
-
static bool isIOOp(Op *op)
Checks if an Op is an IO operation.
The condition is that the operation is one of HostLoadOp, HostStoreOp, RemoteLoadOp, RemoteStoreOp or MultiExchangeOp.
-
static bool isComputeLikeIOOp(std::set<ExchangeStrategy> computeLikeStrategies, Op *op)
Checks if an Op is classified as IO, and executes on IO tiles, but should still be handled like a compute operation (that is, classified, unrolled and scheduled as DecomposeLoopOpTypeEnum::Compute), instead of as an IO operation that should overlap with compute (classified as DecomposeLoopOpTypeEnum::IoBeforeCompute or DecomposeLoopOpTypeEnum::IoAfterCompute).
Operations should be handled like compute instead of IO operations when they are not required to overlap with compute.
-
class DynamicOpTransform : public popart::Transform
Public Functions
-
inline DynamicOpTransform()
-
inline ~DynamicOpTransform() override
-
inline std::size_t getId() const final
-
inline std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class EnsureFp32LossScale : public popart::Transform
Public Functions
-
inline EnsureFp32LossScale()
-
inline ~EnsureFp32LossScale() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
-
bool isPassThroughOp(Op *op) const
For deciding whether to continue graph traversal from op’s outputs, or to terminate the traversal at this op.
- Parameters
op – The op.
- Returns
True if the op has a single input, and all its outputs are of the same type as the input.
-
FromLossScaleTraversalOps traverseFromLossScaleTensor(const Graph &graph) const
Traverse the graph from the loss scale tensor.
We ‘pass through’ single-input ops that do not combine the loss scale (or a descendant of it) with an activation tensor.
Otherwise we terminate the traversal. We refer to these terminal ops as ‘mixed precision loss grad op’ (or MPLGO) candidates.
- Parameters
graph – The graph to be traversed.
- Returns
A pair containing the list of pass-through ops and MPLGO candidates.
-
bool shouldApply(const Graph &graph) const
Run the checks to see if the transform should be applied.
- Parameters
graph – The graph that the checks are run on.
- Returns
True if the checks pass.
-
void upCastTensor(Op *op, InIndex index) const
Upcast the fp16 tensor at input index index of op to fp32.
This is done by disconnecting the input tensor, inserting a CastOp, and re-connecting the output tensor of the CastOp at index.
- Parameters
op – The op whose input is to be upcast.
index – The input index to op at which the tensor is to be upcast.
Public Static Functions
-
static std::size_t id()
-
static bool isMixedPrecisionLossGradOp(Op *op)
To return true, the op’s implementation must be able to handle mixed precision maths.
We have no good way to know this programmatically at the point of running this transform, so we hard code this information here.
- Parameters
op – The op to check for an implementation that is known to support mixed precision inputs.
- Returns
True if it has an implementation known to support mixed precision inputs.
-
class ExplicitRecompute : public popart::Transform
Explicit recomputation is a transformation that clones forward-pass operations marked for recomputation and adds the clones to the backward pass.
Consider a fragment of the training graph before the explicit recomputation transform, where one gradient operation (CheckpointOp1Grad) requires a value from the forward pass (RecomputeOp1) which is considered for recomputation:
CheckpointOp0
      |
RecomputeOp0
      |
RecomputeOp1 -.
     ...       \
      |         |
CheckpointOp1  CheckpointOp1Grad
     ...        ...
      |          |
     Loss -------'
(where CheckpointOp* is an op with op->settings.recomputeType == RecomputeType::Checkpoint and RecomputeOp* is an op with op->settings.recomputeType == RecomputeType::Recompute)
By marking these ops as ‘recompute’, the output of RecomputeOp1 does not need to remain live until the recomputation of CheckpointOp1Grad. In other words, the memory used to store this tensor is freed for allocation of other tensors as soon as RecomputeOp1’s output is read during the computation of CheckpointOp1. How does this work in practice?
After the transform, the graph fragment will look like:
CheckpointOp0 -.
      |         \
RecomputeOp0   RecomputeOp0Clone
      |         |
RecomputeOp1   RecomputeOp1Clone
     ...        |
      |         |
CheckpointOp1  CheckpointOp1Grad
     ...        ...
      |          |
     Loss -------'
Every operation marked as Recompute will be cloned and added to the backward pass, while all Checkpoint operations remain connected as-is.
In pipelining, every copy operation between pipeline stages is (required to be) checkpointed (in order not to cause data dependencies between stages running in parallel), while everything else is recomputed. The user can choose to checkpoint more, but not to recompute more (with pipelining).
The alternative, in the case of implicit recomputation, is to not transform the graph at the IR level, and to use these recomputation settings to affect the IR lowering. In this case, the poplar::program::Sequences that correspond to the lowered RecomputeOps are added once to the main program as scheduled in the forward pass, and then again directly preceding the poplar::program::Sequence of the CheckpointOp1Grad. See the FindRequiredRecomputes class in irlowering.cpp.
Public Functions
-
inline ExplicitRecompute()
-
inline ~ExplicitRecompute() override
-
inline std::size_t getId() const final
-
inline std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class HostIOSetup : public popart::Transform
Public Functions
-
inline HostIOSetup(int pass_)
-
inline ~HostIOSetup() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id(int)
-
class InferPipelineStages : public popart::Transform
Public Functions
-
inline InferPipelineStages()
-
inline ~InferPipelineStages() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class InplaceAccumulateGradPartialsIntoOptimizerAccumTensor : public popart::Transform
Replaces an accumulation tree consumed by an AccumulateOp (which has its own accumulator tensor), with an accumulation tree directly on the AccumulateOp’s accumulator tensor, thereby removing one allocation from the graph (the accumulation tree’s original accumulation tensor).
More precisely:
    Init
     |
    dW0   pW0
      \   /
 AddLhsInPlace0
       |
      dW1   pW1
        \   /
  AddLhsInPlace1
        |           A
       dW2   accum -'
         \   /
      Accumulate3
           |
         accum'
           |
           B
Becomes:
      A
      |
    accum   pW0
       \   /
    Accumulate
        |
       dW1   pW1
         \   /
     Accumulate
          |
        accum'
          |
          B
See the comment below for more discussion of the conditions required to be able to perform this transform.
The primary use case of this is a decomposed grad sum whose addition tree is fed into an AccumulateOp as part of the optimiser step.
Public Functions
-
InplaceAccumulateGradPartialsIntoOptimizerAccumTensor()
-
~InplaceAccumulateGradPartialsIntoOptimizerAccumTensor() final
-
inline std::size_t getId() const final
-
inline std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class InterIpuCopy : public popart::Transform
Public Functions
-
inline InterIpuCopy()
-
inline ~InterIpuCopy() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class IoComputeTileCopy : public popart::Transform
Public Functions
-
inline IoComputeTileCopy()
-
inline ~IoComputeTileCopy() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class MainLoops : public popart::Transform
Public Functions
-
inline MainLoops()
-
inline ~MainLoops() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
static inline std::string getStepGraphName()
Return the name of the step subgraph.
The step subgraph is the body of the LoopOp stepLoop. The stepLoop is run when session.run(...) is called, and will run batchesPerStep number of times (i.e. the trip_count of the loop equals batchesPerStep). A step thus constitutes a call to session.run(...). As a call to session.run(...) involves a call to engine.run() (which is expensive, and will involve returning to the host for more data), we would like to have as large a batchesPerStep as possible.
See https://github.com/onnx/onnx/blob/master/docs/Operators.md#Loop for details about the loop operator.
- Returns
The name of the step graph
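For context, a hedged sketch of choosing batchesPerStep (and therefore the trip_count of the step loop); the anchor name is illustrative:
C++:
#include <popart/dataflow.hpp>
// The step loop body runs 100 times per session.run(...) call.
auto dataFlow = popart::DataFlow(
    100, {{"output", popart::AnchorReturnType("ALL")}});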
-
static inline std::string getAccumulationGraphName()
Return the name of the gradient accumulation subgraph.
The gradient accumulation subgraph is the body of the LoopOp accumLoop. The accumLoop will run accumulationFactor number of times (i.e. the trip_count of the loop equals accumulationFactor) and will accumulate the gradients for each pass. These accumulated gradients will be used to calculate the weight update.
See https://github.com/onnx/onnx/blob/master/docs/Operators.md#Loop for details about the loop operator.
- Returns
The name of the accumulation graph
-
static Graph &getInnerLoopSubgraph(Ir &ir)
Helper function for accessing the subgraph of the inner loop.
The inner loop depends on the values of accumulationFactor and batchesPerStep. The inner loop equals:
The mainGraph if accumulationFactor = 1 and batchesPerStep = 1
The accumulationGraph if accumulationFactor > 1 and batchesPerStep = 1
The stepGraph if accumulationFactor = 1 and batchesPerStep > 1
The accumulationGraph if accumulationFactor > 1 and batchesPerStep > 1
Warning
Should only be used after the transform has been applied, that is, after apply() has been called.
Note
innerLoop and outerLoop are represented by different graphs only when accumulationFactor > 1 and batchesPerStep > 1. In that case the outerLoop repeats the innerLoop.
- Returns
The inner loop subgraph
-
static Graph &getOuterLoopSubgraph(Ir &ir)
Helper function for accessing the subgraph of the outer loop.
The outer loop depends on the values of accumulationFactor and batchesPerStep. The outer loop equals:
The mainGraph if accumulationFactor = 1 and batchesPerStep = 1
The accumulationGraph if accumulationFactor > 1 and batchesPerStep = 1
The stepGraph if accumulationFactor = 1 and batchesPerStep > 1
The stepGraph if accumulationFactor > 1 and batchesPerStep > 1
Warning
Should only be used after the transform has been applied, that is, after apply() has been called.
Note
innerLoop and outerLoop are represented by different graphs only when accumulationFactor > 1 and batchesPerStep > 1. In that case the outerLoop repeats the innerLoop.
- Returns
The outer loop subgraph
-
class MergeAllVarUpdates : public popart::MergeVarUpdates
Public Functions
-
inline MergeAllVarUpdates()
-
inline ~MergeAllVarUpdates() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class MergeAuto : public popart::MergeVarUpdates
Subclassed by popart::MergeLooseThreshold, popart::MergeTightThreshold
-
class MergeLooseThreshold : public popart::MergeAuto
Public Functions
-
inline MergeLooseThreshold()
-
inline ~MergeLooseThreshold() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class MergeTightThreshold : public popart::MergeAuto
Public Functions
-
inline MergeTightThreshold()
-
inline ~MergeTightThreshold() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class MergeCollectivesTransform : public popart::Transform
A transform for merging multiple compatible collective operations into a single larger collective operation.
Ops are only merged if they appear in contiguous order in the schedule.
Currently supported collective types:
ReplicatedAllReduceOp
Public Functions
-
inline MergeCollectivesTransform()
-
inline ~MergeCollectivesTransform() override
-
inline std::size_t getId() const override
-
inline std::string getName() const override
-
template<typename BaseType>
bool collectiveOpCheck(BaseType *A, BaseType *B) const
Confirm that two collective ops of the same BaseType use the same collective operation, i.e. ADD, MUL, etc. If the BaseType does not require a collective op (gather), return true.
- Parameters
A – the first op
B – the second op
- Returns
true if A and B use the same collective operation, or both use none
-
template<typename MultiOpType, typename BaseType>
Op *attemptToMergeOnOp(BaseType *baseOp, std::vector<Op*>::iterator &schedulePos, std::vector<Op*> &opSchedule) const
Given a collective operation, attempt to merge it with other compatible collective ops which are tied (in the schedule) to the current op.
- Parameters
baseOp – a collective op that should be merged
opSchedule – the schedule of all (collective) ops in the graph
- Returns
A pointer to the constructed op
-
template<typename MultiOpType, typename BaseType>
std::unique_ptr<MultiOpType> constructMultiOp(BaseType *baseOp, std::vector<TensorInfo> outInfoFromBaseOps, std::vector<VGraphIdAndTileSet> inputVirtualGraphIdAndTileSet, std::vector<VGraphIdAndTileSet> outputVirtualGraphIdAndTileSet, std::vector<BaseType*> matchingOps) const
Constructs a new MultiOpType which will replace the baseOp and all matching ops.
- Parameters
baseOp – is the operation to be replaced
outInfoFromBaseOps – is the output information for each output tensor collected from the ops with which base op will be merged.
inputVirtualGraphIdAndTileSet – the input virtual graph and tile set information collected from the ops that will be merged
outputVirtualGraphIdAndTileSet – the output virtual graph and tile set information collected from the ops that will be merged
matchingOps – the vector of matching ops
- Returns
a unique pointer to the new multi-collective op
Public Static Functions
-
static std::size_t id()
-
class MergeCopies : public popart::Transform
Public Functions
-
inline MergeCopies()
-
inline ~MergeCopies() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class MergeDuplicateOps : public popart::Transform
Public Functions
-
inline MergeDuplicateOps()
-
inline ~MergeDuplicateOps() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class MergeExchange : public popart::Transform
Public Functions
-
inline MergeExchange()
-
inline ~MergeExchange() override
-
inline std::size_t getId() const override
-
inline std::string getName() const override
Public Static Functions
-
static std::size_t id()
-
class MergeLoops : public popart::Transform
Public Functions
-
inline MergeLoops()
-
inline ~MergeLoops() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class MergeVarUpdates : public popart::Transform
Subclassed by popart::MergeAllVarUpdates, popart::MergeAuto
Public Types
-
using PartitionId = std::string
-
using PartitionMap = std::map<PartitionId, std::vector<VarUpdateStartEnd>>
Public Functions
-
PartitionId getPartitionId(Op *op) const
-
PartitionMap getLargestGroupTargetsMap(const Graph&) const
-
class OverlapIO : public popart::Transform
Public Functions
-
inline OverlapIO()
-
inline ~OverlapIO() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
static std::map<ExchangeStrategy, std::set<PipelineStage>> overlapIORequired(Ir &ir)
Check what level of ExchangeStrategy is required with overlapped IO.
Each pipeline stage can contain IO operations that belong to any of the strategies defined in the ExchangeStrategy enum. This will then inform how the IO operations of each pipeline stage have to be unrolled.
- Parameters
ir – IR to check for overlapped IO settings
- Returns
Map of required exchange strategies and pipeline stages in which exchanges occur. The set of stages will be empty if the ExchangeStrategy is not set on the InputSettings or AnchorReturnType of an input or output respectively. HostLoad and HostStore operations inserted by the HostIoSetup transform will inherit the ExchangeStrategy from InputSettings or AnchorReturnType respectively.
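A hedged sketch of requesting an overlap-capable exchange strategy on an input, which this transform then acts upon (assumes a builder and an addInputTensor overload accepting InputSettings; names are illustrative):
C++:
// Place the input's host IO on IO tiles, overlapping with the inner loop.
popart::InputSettings settings(popart::TileSet::IO,
                               popart::ExchangeStrategy::OverlapInnerLoop);
popart::TensorInfo info("FLOAT", std::vector<int64_t>{4, 4});
auto x = builder->addInputTensor(info, settings, "x");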
-
class Pipeline : public popart::Transform
Public Functions
-
inline Pipeline()
-
inline ~Pipeline() override
-
virtual bool apply(Graph &graph) const final
Checks if the pipelining settings are valid and applies either implicit or explicit pipelining transforms to the graph.
- Parameters
graph – top-level IR graph (main graph) for implicit pipelining, pipeline loop subgraph for explicit pipelining
- Returns
true if the transformation has changed the graph
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
-
bool addDynamicStashAndRestoreOps(Graph &graph) const
Add all required dynamic update and dynamic slice operations to the graph, which link forward and recompute/backward stages together via stashes. Only works for explicit pipelining.
- Parameters
graph – Pipeline loop subgraph
- Returns
True if successful, will raise error if not
-
bool contiguateIpuCopies(Graph &graph) const
Add required IpuCopyOps to ensure that within the pipelined execution, no copies between non-contiguous pipeline stages occur.
- Parameters
graph – Pipeline loop subgraph
- Returns
True if successful, will raise error if not
-
int getStashSize(const Ir &ir, PipelineStage stashStage, PipelineStage maxRestoreStage) const
Calculate the required stash size.
- Parameters
ir – The current IR
stashStage – The stage in which the stash is updated
maxRestoreStage – The last stage in which the stash is restored
- Returns
Required number of stash entries
Public Static Functions
-
static std::size_t id()
-
static bool inplaceRestoreRequiredForRecompute(Op *op)
Implicit pipelining and implicit recompute only! Test if the (implicit) recompute logic requires an inplace restored version of a forward ActGrad tensor (from the stash).
- Parameters
op – the Op to check if it is convertible to RestoreInplaceOp and is required for (implicit) recompute
- Returns
True if the inplace restore is required
-
static bool inplaceRecomputationConflict(Op *op, InIndex in, OutIndex out)
Implicit pipelining and implicit recompute only! Check if implicit recompute is in conflict with implicit pipelining when restoring a forward ActGrad tensor inplace.
-
static void setFinalFwdStageRecomputation(Graph &graph)
Implicit pipelining and implicit recompute only! This annotation pass will try to set the Ops between the topologically final Checkpoints and the loss to NOT be recomputed.
This avoids a program where operations are run twice in a row with no benefit to liveness.
- Parameters
graph – top-level IR graph (main graph)
-
static void checkOpsPipelineStage(Graph &graph)
Check and adjust pipeline stage annotations on operations.
- Parameters
graph – Graph on which to check pipeline stages
-
static std::map<PipelineStage, PipelineStage> withStages(const Ir &ir)
Check which stages should be executed with which other stage.
- Parameters
ir – IR from which to read the pipeline stages
- Returns
Map of pipeline stages to which stage to execute with in sequence.
-
class PreAutomaticLossScale : public popart::Transform
A transform that annotates tensors in the forward graph, so that their gradients can be tracked in automatic loss scaling.
This transform reads a list of user-provided tensor IDs in the forward graph and inserts AutoLossScaleProxyOps after them (see example below). Later in the lowering process, the Autodiff transform will place the corresponding AutoLossScaleProxyGradOps in the backward graph, marking the tensor locations in the graph, for which to track gradients.
Example graph before applying the transform:
A --- MulOp --- C
B ---'
Example graph after applying the transform with toTrackTensors = ["A", "C"]:
A --- AlsProxyOp --- A* --- MulOp --- C --- AlsProxyOp --- C*
B -------------------------'
It is important to apply the AutomaticLossScale transform after PreAutomaticLossScale and Autodiff to remove all AutoLossScaleProxyOps and AutoLossScaleProxyGradOps.
Public Functions
-
inline PreAutomaticLossScale()
-
inline ~PreAutomaticLossScale() override
-
virtual bool apply(Graph &graph) const final
Annotate tensors in the forward graph, so that their gradients can be found and tracked in automatic loss scaling.
See class documentation for details.
- Parameters
graph – The graph to transform.
- Returns
true if there was a change to the graph.
- Returns
false if there wasn’t a change to the graph.
-
inline virtual std::size_t getId() const final
-
virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class Prune : public popart::Transform
Public Functions
-
inline Prune()
-
inline ~Prune() override
-
inline std::size_t getId() const override
-
inline std::string getName() const override
Public Static Functions
-
static std::size_t id()
-
class SerializeMatMuls : public popart::Transform
Public Functions
-
inline SerializeMatMuls()
-
inline ~SerializeMatMuls() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class StochasticRounding : public popart::Transform
Public Functions
-
inline StochasticRounding()
-
inline ~StochasticRounding() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
class StreamingMemory : public popart::Transform
Public Functions
-
inline StreamingMemory(int pass_)
-
inline ~StreamingMemory() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id(int)
-
class SubgraphOutline : public popart::Transform
Class for creating functionally equivalent subgraphs from SubgraphableOpClusters, and replacing instances of SubgraphableOpClusters with calls to these subgraphs.
Further down the stack, this allows for code-reuse, which results in a lower memory footprint for the compiled graph.
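A rough sketch of the intended flow using the static helpers documented below, assuming a vector instances of functionally equivalent clusters has already been found (the AliasesMap construction here is an assumption):
C++:
// Outline equivalent clusters into one subgraph, then replace each cluster
// with a CallOp that calls the subgraph.
std::map<popart::Op *, int> indexMap;
popart::Graph &subgraph =
    popart::SubgraphOutline::createSubgraph(instances, ir, indexMap, "call_0");
for (const auto &instance : instances) {
  popart::AliasesMap aliasesMap(instance.getGraph()); // assumed constructor
  popart::SubgraphOutline::replaceWithCallOp(instance, subgraph, indexMap,
                                             aliasesMap);
}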
Public Functions
-
inline SubgraphOutline()
-
inline ~SubgraphOutline() override
-
inline virtual std::size_t getId() const final
-
inline virtual std::string getName() const final
Public Static Functions
-
static std::size_t id()
-
static Graph &createSubgraph(const std::vector<SubgraphableOpCluster> instances, Ir &ir, std::map<Op*, int> &index_map, std::string subgraphId = "call")
Create a subgraph from a set of identical op clusters.
- Parameters
instances – A set of SubgraphableOpClusters that can be replaced by a call to the same subgraph. All SubgraphableOpCluster instances must be functionally equivalent.
ir – The IR.
index_map – An empty map, passed by reference. Used to map from ops in the new subgraph to their corresponding indices in the first SubgraphableOpCluster instance. Required as input argument to ‘replaceWithCallOp’.
subgraphId – The returned subgraph’s id.
- Returns
A Graph that is functionally equivalent to each SubgraphableOpCluster instance.
-
static Graph &createEmptySubgraph(const SubgraphableOpCluster &instance, Ir &ir, std::string subgraphId, const std::map<InIndex, OutIndex> &identityInputToOutputIndiciesMapping, const std::map<OutIndex, float> &outputIndiciesAndValues, AliasModel &aliasModel)
Create an ‘empty’ subgraph from an op cluster.
- Parameters
instance – A SubgraphableOpCluster that is used as a template from which we build an 'empty' subgraph, where input and output tensors can be connected via nops and output tensors can be set to default values.
ir – The IR.
subgraphId – The returned subgraph’s id.
identityInputToOutputIndiciesMapping – Specifies the connections of inputs to outputs via nop operations in the ‘empty’ subgraph. Each pair must have the same shape and type.
outputIndiciesAndValues – Map of pairs of output indices and values.
- Returns
A Graph: a low-compute subgraph which stands in for the op when it is not executed.
-
static void setSubgraphOpSettingsFromClusterInstance(Op *op, const SubgraphableOpCluster &instance)
-
static Op *replaceWithCallOp(const SubgraphableOpCluster &instance, Graph &subgraph, const std::map<Op*, int> &index_map, AliasesMap &aliasesMap)
Replace a cluster of ops with a call to a subgraph.
- Parameters
instance – The SubgraphableOpClusters instance to be replaced.
subgraph – The subgraph, a call to which is to replace the
instance
.index_map – Used to map from ops in the new subgraph to their corresponding indices in the first SubgraphableOpCluster instance.
aliasesMap – AliasesMap with alias information for instance’s graph.
- Returns
The replacement CallOp’s pointer.
-
static Op *replaceWithEmptyElseBranchIfOp(const SubgraphableOpCluster &instance, Graph &subgraph, Graph &emptySubgraph, const std::map<Op*, int> &index_map, AliasesMap &aliasesMap, Tensor *flag)
Replace an op with an IfOp.
The op is moved to the first branch of the IfOp. Its second branch is for low intensity compute, which passes input tensors to outputs or provides default output tensors.
- Parameters
instance – The SubgraphableOpClusters instance which holds op to be replaced.
subgraph – if then branch subgraph which contains the op.
emptySubgraph – if else low intensity compute branch subgraph.
index_map – Used to map from ops in the new subgraph to their corresponding indices in the first SubgraphableOpCluster instance.
aliasesMap – AliasesMap with alias information for instance’s graph.
flag – a Tensor deciding which branch should be used.
- Returns
The replacement IfOp’s pointer.
#include <popart/bwdgraphinfo.hpp>
-
struct BwdGraphInfo
A data structure that captures the result of applying autodiff to a graph.
Public Functions
-
bool operator==(const BwdGraphInfo &rhs) const
Equality operator.
-
enum class popart::ExpectedConnectionType
The type of tensor expected to connect to a graph input or output.
Values:
-
enumerator Fwd = 0
A tensor from a forward graph.
-
enumerator FwdGrad = 1
The gradient of a tensor from a forward graph.
-
struct ExpectedConnection
Description of tensor expected to connect to graph input or output.
Public Functions
-
bool operator==(const ExpectedConnection &rhs) const
Equality operator.
Public Members
-
ExpectedConnectionType type
Either fwdId or getGradId(fwdId).
14.11. Utility classes
14.11.1. Graph
#include <popart/graphutils.hpp>
14.11.2. Region
#include <popart/region.hpp>
14.11.3. Error handling
#include <popart/error.hpp>
-
enum class popart::ErrorSource
Values:
-
enumerator popart = 0
-
enumerator popart_internal
-
enumerator poplar
-
enumerator poplibs
-
enumerator unknown
-
class error : public runtime_error
Exception class for popart.
Subclassed by popart::internal_error, popart::memory_allocation_err, popart::runtime_error
Public Functions
-
template<typename ...Args>
inline explicit error(const char *s, const Args&... args)
Variadic constructor for error which allows the user to use a fmt string for the message. For example:
throw error("This is an error reason {}", 42);
-
template<typename ...Args>
inline explicit error(ErrorUid uid, const std::string &s, const Args&... args)
-
const std::string &stackreport() const
-
inline ErrorUid uid() const
-
class internal_error : public popart::error
Exception class specific to internal errors. This should be used like an assert, for states which the user should not have been able to create.
-
class memory_allocation_err : public popart::error
Subclassed by popart::popx::devicex_memory_allocation_err
Public Functions
-
inline memory_allocation_err(const std::string &info)
-
virtual std::unique_ptr<memory_allocation_err> clone() const = 0
-
virtual std::string getSummaryReport() const = 0
-
virtual std::string getProfilePath() const = 0
-
class runtime_error : public popart::error
Exception class specific to errors that occur when running a model.
For example, this error could be thrown when a user-implemented IStepIO callback doesn’t return any data.
NOTE: This is different from a C++ runtime error.
-
class devicex_memory_allocation_err : public popart::memory_allocation_err
Public Functions
-
devicex_memory_allocation_err(const devicex_memory_allocation_err &rhs)
-
devicex_memory_allocation_err(const poplar::graph_memory_allocation_error &e, const poplar::OptionFlags &_reportOptions)
-
std::unique_ptr<memory_allocation_err> clone() const
-
std::string getSummaryReport() const
-
std::string getProfilePath() const
14.11.4. Debug context
#include <popart/debugcontext.hpp>
-
class DebugContext
Public Functions
-
DebugContext(SourceLocation loc = SourceLocation::Current())
-
DebugContext(const char *name, SourceLocation loc = SourceLocation::Current())
-
DebugContext(std::string name, SourceLocation loc = SourceLocation::Current())
-
DebugContext(const DebugInfo &debugInfo, std::string name = "", SourceLocation loc = SourceLocation::Current())
-
DebugContext(const DebugNameAndId &debugNameAndId, std::string name = "", SourceLocation loc = SourceLocation::Current())
-
DebugContext(DebugContext&&)
-
DebugContext(const DebugContext&)
-
~DebugContext()
-
std::string getPathName() const
-
class DebugInfo
Subclassed by popart::OnnxOpDebugInfo, popart::OnnxVariableDebugInfo, popart::OpDebugInfo, popart::TensorDebugInfo
Public Functions
-
DebugInfo(const DebugContext &debugContext, const std::string &layer)
-
virtual ~DebugInfo()
-
DebugId getId() const
-
std::string getPathName() const
-
bool setValue(std::string name, ProfileValue value)
Public Static Functions
-
static void initializeStreamer(const std::string &fileName, const SerializationFormat &format = SerializationFormat::CBOR)
-
static void closeStreamer()
-
class OnnxOpDebugInfo : public popart::DebugInfo
Public Functions
-
OnnxOpDebugInfo(const DebugContext &debugContext, const Node &node)
-
OnnxOpDebugInfo &operator=(const OnnxOpDebugInfo&) = delete
-
OnnxOpDebugInfo(const OnnxOpDebugInfo&) = delete
-
virtual ~OnnxOpDebugInfo() = default
-
class OnnxVariableDebugInfo : public popart::DebugInfo
Public Functions
-
OnnxVariableDebugInfo(const DebugContext &dc, const ONNX_NAMESPACE::TensorProto &proto)
-
OnnxVariableDebugInfo(const DebugContext &dc, const ONNX_NAMESPACE::ValueInfoProto &proto)
-
OnnxVariableDebugInfo(const DebugContext &dc, const ONNX_NAMESPACE::ValueInfoProto &proto, const TensorInfo &ti)
-
OnnxVariableDebugInfo &operator=(const OnnxVariableDebugInfo&) = delete
-
OnnxVariableDebugInfo(const OnnxVariableDebugInfo&) = delete
-
virtual ~OnnxVariableDebugInfo() = default
-
class OpDebugInfo : public popart::DebugInfo
Public Functions
-
OpDebugInfo(const DebugContext &debugContext, const Op &_op)
-
virtual ~OpDebugInfo()
-
OpDebugInfo &operator=(const OpDebugInfo&) = delete
-
OpDebugInfo(const OpDebugInfo&) = delete
-
void finalize()
-
class TensorDebugInfo : public popart::DebugInfo
Public Functions
-
TensorDebugInfo(const DebugContext &debugContext, const TensorId &tenid, const TensorInfo &info, const TensorType &tt)
-
TensorDebugInfo(const DebugContext &debugContext, const TensorId &tenid, const TensorType &tt)
-
TensorDebugInfo &operator=(const TensorDebugInfo&) = delete
-
TensorDebugInfo(const TensorDebugInfo&) = delete
-
virtual ~TensorDebugInfo() = default
14.11.5. Attributes
#include <popart/attributes.hpp>
-
class Attributes
Wrapper around the container of ONNX_NAMESPACE::AttributeProtos of a Node.
Provides faster and cleaner reads of values from keys (strings) than ONNX_NAMESPACE::AttributeProto.
Public Types
-
using Ints = std::vector<int64_t>
The types of attributes as defined in the ONNX spec.
-
using Int = int64_t
-
using Floats = std::vector<float>
-
using Float = float
-
using Strings = std::vector<std::string>
-
using String = std::string
-
using Graphs = std::vector<ONNX_NAMESPACE::GraphProto>
-
using Graph = ONNX_NAMESPACE::GraphProto
Public Functions
-
Attributes(const NodeAttributes&)
-
Attributes() = default
-
const std::vector<std::string> &getNames() const
-
onnxAttPtr at(const std::string &name) const
-
void append(std::stringstream &ss, std::string prefix = "") const
-
bool hasAttribute(const std::string &key) const
-
void takeAttribute(const std::string &key, const Attributes &attributes)
Take an attribute identified by key from the given Attributes object.
-
template<typename UnaryPredicate>
inline Attributes filter(UnaryPredicate p) const
Take the set of attributes that match the given predicate.
-
Attributes::Graphs getAllGraphAttributes() const
-
template<>
Attributes filter(const char *key) const
-
template<>
Attributes filter(std::string key) const
-
template<>
void setIfPresent(std::vector<int64_t>&, const std::string &key) const
-
template<>
void setIfPresent(int64_t&, const std::string &key) const
-
template<>
void setIfPresent(bool &v, const std::string &key) const
-
template<>
void setIfPresent(std::string&, const std::string &key) const
-
template<>
void setIfPresent(float&, const std::string &key) const
-
template<>
void set(std::vector<int64_t> &vs, const std::string &key) const
-
template<>
void set(std::vector<float> &vs, const std::string &key) const
-
template<>
void set(std::vector<std::string> &vs, const std::string &key) const
-
template<>
void set(float &v, const std::string &key) const
-
template<>
void set(int64_t &v, const std::string &key) const
-
template<>
Attributes::Ints getAttribute(const std::string &key, const Attributes::Ints &defaultValue) const
-
template<>
Attributes::Int getAttribute(const std::string &key, const Attributes::Int &defaultValue) const
-
template<>
Attributes::String getAttribute(const std::string &key, const Attributes::String &defaultValue) const
-
template<>
Attributes::Float getAttribute(const std::string &key, const Attributes::Float &defaultValue) const
-
template<>
Attributes::Ints getAttribute(const std::string &key) const
-
template<>
void setAttribute(const std::string &key, Attributes::Ints&)
-
template<>
void setAttribute(const std::string &key, Attributes::Int&)
-
template<>
void setAttribute(const std::string &key, Attributes::String&)
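As a usage illustration, a hedged sketch of reading values with the accessors above, given an Attributes object attrs (attribute names are illustrative):
C++:
// Read an Int attribute, falling back to a default when it is absent.
int64_t axis = attrs.getAttribute<popart::Attributes::Int>("axis", 0);
if (attrs.hasAttribute("pads")) {
  popart::Attributes::Ints pads;
  attrs.set(pads, "pads"); // fill `pads` from the "pads" attribute
}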
14.11.6. Void data
#include <popart/voiddata.hpp>
-
class ConstVoidData
A class to point to constant data.
Public Functions
-
ConstVoidData() = default
-
ConstVoidData(const void *data_, const TensorInfo &info_)
-
inline bool storesData() const
-
void store(std::vector<char> &&d, const TensorInfo &i)
-
class MutableVoidData
A class to point to non-constant data.
14.11.7. Input shape information
#include <popart/inputshapeinfo.hpp>
-
class InputShapeInfo
Class that contains what is known about the input tensors (as TensorInfo objects) in the IR prior to compilation.
This knowledge can sometimes be compiled into the IR, and for certain backends is even required, for example the IPU requires all Stream Tensor shapes.
Public Functions
-
InputShapeInfo() = default
Default constructor for the InputShapeInfo class.
-
void add(TensorId, const TensorInfo&)
Add the identifier and TensorInfo object for a tensor to the InputShapeInfo object.
- Parameters
TensorId – The identifier of the tensor for which information is being added.
TensorInfo – The tensor information to be added.
-
const TensorInfo &get(TensorId) const
Get the information of a tensor.
- Parameters
TensorId – The identifier of the tensor for which to get the tensor information.
-
bool has(TensorId) const
Check if the InputShapeInfo object contains information for a tensor.
- Parameters
TensorId – The identifier of the tensor to check.
- Returns
If true, the InputShapeInfo object contains information for the tensor. If false, the InputShapeInfo object does not contain information for the tensor.
-
std::vector<TensorId> getAllTensorIds() const
Get all unique tensor identifiers of tensors in the InputShapeInfo object.
- Returns
Vector of tensor identifiers.
-
inline const std::map<TensorId, TensorInfo> &getInfos() const
Get all information contained in the InputShapeInfo object.
- Returns
Map of tensor identifiers and the corresponding tensor information.
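A minimal usage sketch (tensor name and shape are illustrative):
C++:
#include <popart/inputshapeinfo.hpp>
#include <popart/tensorinfo.hpp>
popart::InputShapeInfo inputShapeInfo;
inputShapeInfo.add("input", popart::TensorInfo("FLOAT", {1, 3, 224, 224}));
if (inputShapeInfo.has("input")) {
  const popart::TensorInfo &info = inputShapeInfo.get("input");
}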
14.11.8. Profiling
#include <popart/liveness.hpp>
-
class LivenessAnalyzer
Public Types
-
using PendingCopies = std::vector<LivenessNode>
Public Functions
-
LivenessAnalyzer(const Ir *ir_, const SubgraphCopyingStrategy *subgraphCopyingStrat)
-
void apply()
-
int64_t getGlobalSchedulePosition(CallStack callStack) const
-
inline size_t getOpScheduleSize() const
-
inline const LivenessNode &getOpScheduleAt(int64_t scheduleIndex) const
-
inline const std::vector<int64_t> &getCallSiteLinksAt(int64_t scheduleIndex) const
-
inline const std::vector<int64_t> &getCallSiteLinksInvAt(int64_t scheduleIndex) const
-
int64_t getContextStartIndex(ExecutionContext context) const
-
int64_t getContextEndIndex(ExecutionContext context) const
#include <popart/subgraphpartitioner.hpp>
-
class SubgraphPartitioner
When lowering CallOps, we would previously copy all tensors from the call site (the CallOp’s input tensors) to the subgraph’s input tensors, do the call and then copy the subgraph’s output tensors back to the call site’s output tensors:
Copy(caller_in_1, subgraph_in_1)
Copy(caller_in_2, subgraph_in_2)
Call(subgraph)
Copy(subgraph_out_1, caller_out_1)
Copy(subgraph_out_2, caller_out_2)
With this approach, both subgraph_in_1 and subgraph_in_2 are live during the call. This can be suboptimal: in some cases some subgraph inputs may not be required until later in the subgraph, and copying them later would reduce memory usage. Analogously, in some cases some subgraph outputs may be ready to copy well before the end of the subgraph, and it may be advantageous to do this copy early. This is especially true for subgraphs that deal with multiple inputs/outputs in sequence.
To that end, graphs now support lowering over multiple "subgraph parts" to allow CallOps that have these subgraphs as their called graph to copy inputs later and outputs earlier. Essentially, each graph is 'split' over multiple PopART fragments / Poplar sequences so that any parent graph that calls it can do a Copy of inputs or outputs in between.
The scheduling of copies for subgraph ops is already modelled by the LivenessAnalyzer, and we base our partitioning on that model. This class simply interprets the LivenessAnalyzer's schedule and determines how to split subgraphs into parts accordingly.
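For illustration, with partitioned lowering the same call could, hypothetically, be scheduled as:
Copy(caller_in_1, subgraph_in_1)
Call(subgraph, part 0)
Copy(caller_in_2, subgraph_in_2)
Call(subgraph, part 1)
Copy(subgraph_out_1, caller_out_1)
Call(subgraph, part 2)
Copy(subgraph_out_2, caller_out_2)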
Public Types
-
enum class CallOpPartType
Enum type for CallOpPart types.
Values:
-
enumerator Undefined = 0
-
enumerator CopyInput
-
enumerator CopyOutput
-
enumerator CopyModified
-
enumerator CallSubgraphPart
-
using CallOpSchedule = std::vector<std::tuple<CallOpPart, SubgraphPartIndex>>
Public Functions
-
SubgraphPartitioner() = default
Default constructor.
-
virtual ~SubgraphPartitioner() = default
Default destructor.
-
virtual void apply()
Prepare the results.
Errors if the IR or liveness analyzer is not set.
-
virtual void setLivenessAnalyzer(const LivenessAnalyzer*)
Set the LivenessAnalyzer dependency to use.
-
virtual int getNumSubgraphParts(const Graph&) const
Interpret the liveness analysis and work out how many subgraph parts a graph needs to lower all fragments between input/output copies.
Errors if apply was not run.
-
virtual SubgraphPartIndex getOpSubgraphPartBegin(Op*) const
Interpret the liveness analysis and work out what subgraph part an op is in, based on the copying of inputs/outputs of the subgraph the op is in.
For ops that spread over multiple subgraph parts (i.e. CallOps) this returns the first such part. Errors if apply was not run.
-
virtual SubgraphPartIndex getOpSubgraphPartEnd(Op*) const
Interpret the liveness analysis and work out what index is one larger than the last subgraph part an op is in, based on the copying of inputs/outputs of the subgraph the op is in.
For ops that spread over multiple subgraph parts (i.e. CallOps) this returns the last such part. Errors if apply was not run.
-
virtual CallOpSchedule getCallOpSchedule(CallOp*) const
Interpret the liveness analysis results and work out how a CallOp is broken down over various subgraph parts.
The result is a vector of pairs of CallOp ‘parts’ and the ‘subgraph parts’ they should be lowered in.
-
class CallOpPart
A class to represent a part of a CallOp.
#include <popart/aliaszerocopy.hpp>
-
class AliasZeroCopy
Public Functions
-
AliasZeroCopy(const Ir *ir, const LivenessAnalyzer *analyzer)
-
void apply()
-
std::set<Tensor*, PTensorCmp> getPostIRAliases(Tensor*) const
-
std::set<Tensor*, PTensorCmp> getTensorsWithPostIRAliases() const
-
std::set<Tensor*, PTensorCmp> getProposedAliasedTensors(std::set<Tensor*, PTensorCmp> tensors, bool fullyAliased) const
-
std::set<Tensor*, PTensorCmp> getActiveAliasedTensors(std::set<Tensor*, PTensorCmp> tensors, bool fullyAliased) const
-
void printLivenessIntervals(std::set<Tensor*, PTensorCmp> tensors, ProducerInterval producerInterval)
-
Intervals getLivenessIntervals(Tensor*, ProducerInterval)
-
Intervals getCandidateLivenessIntervals(Tensor*, ProducerInterval = ProducerInterval::Enforce, bool forceUpdateCache = false)
-
class Intervals
14.11.9. Task information
#include <popart/taskid.hpp>
-
class TaskId
A class describing an IR-to-poplar lowering task.
This is a class that is cheap to construct. We construct and compare TaskIds a lot in irlowering.cpp so it pays to make these cheap operations. Note that previously TaskId was a std::string and creating a TaskId typically involved some string manipulation, meaning heap memory may be involved. Comparing strings for equality or ordering strings is also typically not constant-time.
Public Types
-
enum class Type
TaskId type.
Values:
-
enumerator AnchorStreamToHostTask = 0
-
enumerator AnchorSumTask
-
enumerator AnchorToHostTask
-
enumerator FromHostTask
-
enumerator FromHostUpdateTask
-
enumerator FromOpTask
-
enumerator InitBatchCounterTensorsTask
-
enumerator InitRngSeedsTask
-
enumerator InitRandomSeedTask
-
enumerator InitRngStateTensorTask
-
enumerator InitTensorTask
-
enumerator PipelinedCopyTask
-
enumerator RandomSeedToHostTask
-
enumerator RngStateFromHostTask
-
enumerator RngStateToHostTask
-
enumerator SetInitTensorValTask
-
enumerator StreamFromHostTask
-
enumerator UpdateBatchCountTask
-
enumerator WeightStreamToHostTask
-
enumerator WeightToHostTask
-
enumerator Undefined
Public Functions
-
TaskId()
-
TaskId(Type type, const OpId &opId, const OperatorIdentifier &opIdentifier)
-
TaskId(Type type, const OpId &opId, const OperatorIdentifier &opIdentifier, const OpxGrowPartId &opxGrowPartId)
-
TaskId(Type type, nonstd::optional<TensorId> tensorId, nonstd::optional<OpId> opId, nonstd::optional<OperatorIdentifier> opIdentifier, nonstd::optional<OpxGrowPartId> opxGrowPartId)
-
bool empty() const
14.11.10. Type definitions
-
namespace onnx
-
namespace popart
-
namespace view
Typedefs
-
using Shape = std::vector<int64_t>
The dimensions of a tensor, equivalent to numpy.shape.
-
using Rank = int
Rank of a tensor. That is, the number of indices.
-
typedef std::string TensorId
Label put on a tensor to distinguish it from the others in the graph.
-
using OpName = std::string
Name of the instance of the operator.
-
using OpDomain = std::string
Specifies who created the operator.
Part of domain.type:version used as an Op identifier by ONNX (https://github.com/onnx/onnx/blob/master/docs/Versioning.md)
-
using OpType = std::string
Specifies the type of an operator.
Part of domain.type:version used as an Op identifier by ONNX (https://github.com/onnx/onnx/blob/master/docs/Versioning.md)
-
using OpVersion = unsigned
Specifies the version of the operator.
Part of domain.type:version used as an Op identifier by ONNX (https://github.com/onnx/onnx/blob/master/docs/Versioning.md)
-
using OpId = int
Label put on an operator to distinguish it from the others in the graph.
-
using ReturnPeriod = int
-
using ReplicaIndex = int
The index of a replica.
-
using SubgraphPartIndex = int
The index of the subgraph part.
-
using OpxGrowPartId = int
Identifies a part of an Opx grow function.
-
using CollectiveBalancedReorderId = int
The identifier of the collective balanced host rearrangement.
-
using ReplicatedTensorShardingIndices = std::set<std::pair<std::set<InIndex>, std::set<OutIndex>>>
The set of indices that have to be replica sharded together, and the outputs that will be replica sharded as a result.
-
using ReplicatedTensorShardingIndicesIndex = int
The position in ReplicatedTensorShardingIndices for which to get the ReplicatedTensorShardingGroup.
-
using ReplicatedTensorShardingGroupId = int
The unique integer id for a ReplicatedTensorShardingGroup.
-
using PipelineCycle = int64_t
-
using VGraphId = int64_t
-
using PipelineStage = int64_t
-
using ExecutionPhase = int64_t
-
using BatchSerializedPhase = int64_t
-
using StashIndex = int64_t
-
using RemoteBufferId = int64_t
-
using RemoteBufferIndex = int64_t
-
using RandomReferenceId = int64_t
-
using ConvDilations = std::vector<int64_t>
-
using ConvGroup = int64_t
-
using ConvPads = std::vector<int64_t>
-
using ConvStrides = std::vector<int64_t>
-
using ConvTruncs = std::vector<int64_t>
-
using MultiConvInputs = std::vector<ConvInputs>
-
using MultiConvDilations = std::vector<ConvDilations>
-
using MultiConvStrides = std::vector<ConvStrides>
-
using TensorInterval = std::pair<size_t, size_t>
-
using TensorIntervalList = std::vector<TensorInterval>
-
using onnxAttPtr = const ONNX_NAMESPACE::AttributeProto*
-
using Node = ONNX_NAMESPACE::NodeProto
-
using IsReplicaEqual = bool
-
using ReplEqInputMap = std::map<InIndex, IsReplicaEqual>
-
using ReplEqOutputMap = std::map<OutIndex, IsReplicaEqual>
-
using ReplEqModifiedInputMap = ReplEqInputMap
-
using ReplEqFun = std::function<std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap>(const ReplEqInputMap&)>
Enums
-
enum StochasticRoundingMethod
Used to describe the stochastic rounding which is applied to the output(s) of an Op.
See also
docs/notes/ir/attributes/stochasticroundingmethod.md
Values:
-
enumerator DifferingSeeds = 1
Apply stochastic rounding with a replica-local seed.
That is, stochastic rounding performed by an Op on one replica is nominally different to stochastic rounding performed by the same Op on another replica. Use this setting for Ops where you want to apply stochastic rounding but you cannot meet the condition of StochasticRoundingMethod::IdenticalSeeds. For example, this setting can be useful for gradient accumulation steps.
-
enumerator IdenticalSeeds = 2
Apply stochastic rounding with an RNG state (the value of poplar::getHwSeeds) that is identical across replicas.
Use this option on, e.g., the weight update step to ensure that the weight tensor on each replica has stochastic rounding applied to it in the same way and there is no weight drift.
REQUIREMENT: The ability to provide an RNG state (the value of poplar::getHwSeeds) that is identical on each replica relies on all Ops that use this setting to behave in a way that does not violate this property for Ops that follow it. More formally, you must only apply this setting to Ops for which you can guarantee that if the RNG state is the same across replicas before the Op is executed then the RNG state is still the same on all replicas after the Op is done executing. A typically sufficient (but not necessary) condition is that all input tensors of the Op have the same value across replicas.
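A hedged sketch of attaching this setting (assumes op is an Op* whose relevant inputs are identical across replicas, and that Op::Settings exposes a stochasticRoundingMethod field):
C++:
op->settings.stochasticRoundingMethod =
    popart::StochasticRoundingMethod::IdenticalSeeds;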
-
namespace view
-
using popart::FwdGraphToBwdGraphInfo = std::map<GraphId, BwdGraphInfo>
Mapping from fwdGraph to info on the bwdGraph.
14.11.11. Enums
-
enum class popart::AccumulationType
Values:
-
enumerator Add = 0
-
enumerator DampenedAdd
-
enumerator DampenedAddSquare
-
enumerator DecayAdd
-
enumerator DecayAddSquare
-
enumerator MovingAverage
-
enumerator MovingAverageSquare
-
enumerator Infinity
-
enumerator Mean
-
enum class popart::ActivationFunction
Values:
-
enumerator Sigmoid = 0
-
enumerator Relu
-
enumerator Tanh
-
enumerator Gelu
-
enumerator GeluErf
-
enumerator Swish
-
enumerator Softmax
-
enumerator SoftmaxStable
-
enumerator SoftmaxScaled
-
enumerator N
-
enumerator Invalid
-
enum class popart::AutoPad
Values:
-
enumerator NOTSET = 0
-
enumerator SAME_UPPER
-
enumerator SAME_LOWER
-
enumerator VALID
-
enum class popart::CollectiveOperator
Values:
-
enumerator Add = 0
-
enumerator Mean
-
enumerator Mul
-
enumerator Min
-
enumerator Max
-
enumerator LogicalAnd
-
enumerator LogicalOr
-
enumerator SquareAdd
-
enumerator Local
-
enumerator N
-
enum class popart::DeviceSelectionCriterion
Controls how to select an available IPU.
Values:
-
enumerator First = 0
Select the first device available. (Default).
-
enumerator Random
Select a device randomly from those available.
-
enum class popart::ResizeCoordinateTransformationMode
Values:
-
enumerator HalfPixel
-
enumerator PytorchHalfPixel
-
enumerator AlignCorners
-
enumerator Asymmetric
-
enumerator TfCropAndResize
-
enumerator N
-
enum class popart::ResizeMode
Values:
-
enumerator Nearest
-
enumerator Linear
-
enumerator Cubic
-
enumerator N
-
enum class popart::ResizeNearestMode
Values:
-
enumerator RoundPreferFloor
-
enumerator RoundPreferCeil
-
enumerator Floor
-
enumerator Ceil
-
enumerator Pytorch
-
enumerator N
-
enum class popart::ScatterReduction
Values:
-
enumerator Sum = 0
-
enumerator Max
-
enumerator Min
-
enumerator Mul
-
enumerator None
-
enum class popart::TensorRemapType
Enum describing how the tensor layout should be remapped during the forward and backward pass (backward pass remapping requires the Op to exist in the IR before autodiff).
Values:
-
enumerator FwdBwdReverse = 0
Remap the tensor in the forward pass, reverse-apply the remapping in the backward pass.
-
enumerator FwdBwd
Remap the tensor in the forward pass and backward pass independently.
-
enumerator Fwd
Only remap the tensor in the forward pass, use identity for the backward pass.
14.11.12. Structs
-
struct BranchInfo
-
struct ClonedGraphMaps
Struct of maps that map cloned Op and Tensor Ids back to the original, and vice-versa.
-
struct ConvParameters
Public Members
-
int64_t batchSize
-
int64_t numInChannelsPerGroup
-
int64_t numOutChannelsPerGroup
-
int64_t numGroups
-
struct popart::ConvParameters::Input inputTransformation
-
struct popart::ConvParameters::Input kernelTransformation
-
struct popart::ConvParameters::Output outputTransformation
-
struct Input
-
struct Output
-
struct OpxInAndOutIndex
Public Functions
-
OpxInAndOutIndex() = default
-
inline bool operator==(const OpxInAndOutIndex &rhs) const
-
struct PTensorCmp
-
struct ReplicatedTensorShardingOpInfo
Struct that describes which inputs/outputs of an Op belong to the sharding group.
Regular operations typically belong to only one sharding group. However, subgraphing operations (CallOp, LoopOp) and MultiExchangeOp can belong to multiple sharding groups, depending on the input and output indices.
14.11.13. Other classes
-
template<typename T, uint32_t V = 0>
class BasicOptional
A temporary solution to removing boost::optional from certain header files. This class is an incomplete replacement of boost::optional (and std::optional).
Template parameter T: the type which will optionally be stored.
Template parameter V: has no effect, but enables compiler errors when two objects of type T should not be compared.
Public Functions
-
inline BasicOptional() noexcept
Construct an unset BasicOptional<T>
-
BasicOptional(const BasicOptional<T, V> &rhs) = default
-
BasicOptional<T, V> &operator=(const BasicOptional<T, V>&) = default
-
inline BasicOptional<T, V> &operator=(const T &t)
-
inline explicit operator bool() const
Return true if set.
Can be used as:
BasicOptional<Foo> foo(6);
if (foo) {
  *foo = 7;
}
-
inline void reset() noexcept
-
class ExchangeDescriptor
Class describing an external exchange from IPUs.
Public Functions
-
ExchangeDescriptor(ExchangeDirection direction, TensorId id, OptionalVGraphId vgid, TileSet tileSet, int numInputs, int numOutputs, bool inplace)
Create an ExchangeDescriptor for a host exchange.
- Parameters
direction – Load (from host) or Store (to host)
id – Host stream tensor ID
vgid – Virtual graph for the exchange
tileSet – Tile set for the exchange
numInputs – Number of tensor inputs expected
numOutputs – Number of tensor outputs expected
inplace – If the output of the exchange should alias the input during Load
-
ExchangeDescriptor(ExchangeDirection direction, RemoteBufferId id, OptionalVGraphId vgid, TileSet tileSet, int numInputs, int numOutputs, bool inplace)
Create an ExchangeDescriptor for a remote exchange.
- Parameters
direction – Load (from host) or Store (to host)
id – Remote buffer id
vgid – Virtual graph for the exchange
tileSet – Tile set for the exchange
numInputs – Number of tensor inputs expected
numOutputs – Number of tensor outputs expected
inplace – If the output of the exchange should alias the input during Load
-
ExchangeDescriptor(ExchangeDirection direction, GraphId id, TileSet destination, CodeMemoryType destinationType)
Create an ExchangeDescriptor for an External code copy op.
- Parameters
direction – Load (from host) or Store (to host)
id – GraphId of the graph to load.
destination – The destination TileSet to load to.
destinationType – The destination memory type to load to.
-
inline const ExchangeDirection &getDirection() const
-
inline bool isRemoteExchange() const
-
inline bool isHostExchange() const
-
inline bool isCodeCopy() const
Returns true if this exchange descriptor is associated with a code copy operation.
- Returns
true If it is associated with a code copy op.
- Returns
false Otherwise.
-
inline const RemoteBufferId &getRemoteBufferId() const
-
inline void setRemoteBufferId(RemoteBufferId id)
-
inline const OptionalGraphId &getGraphToLoadId() const
GraphId of the graph which this op will load code for.
- Returns
The OptionalGraphId in question.
-
inline OptionalCodeMemoryType getDestinationCodeMemoryType() const
Get the destination location the code will be sent to, if this is an ExchangeDescriptor for a RemoteCodeLoadOp.
- Returns
OptionalCodeMemoryType. One of: Buffer (stored in non-executable buffer memory) or ExecutableMemory (stored in executable memory).
-
const std::string getResourceId() const
Get an identifier representing which resource (landing pad tensor) this exchange will be using.
- Returns
Resource identifier
-
inline OptionalVGraphId getVGraphID() const
-
inline int getNumInputs() const
-
inline int getNumOutputs() const
-
inline bool isInplace() const
-
class GraphId
-
class LeakyReluOpBaseAttributes
Subclassed by popart::LeakyReluGradOp, popart::LeakyReluInplaceOp, popart::LeakyReluOp
-
class MultiConvOptions
Public Functions
-
MultiConvOptions(const std::map<std::string, std::string> sessionConvOptions, const Attributes &attr)
-
std::map<std::string, std::string> getConvOptions(int convIndex) const
-
std::map<std::string, std::string> getGlobalOptions() const
-
MultiConvOptions(const std::map<std::string, std::string> sessionConvOptions, const Attributes &attr)
-
class OpEquivIdCreator : public popart::OpSerialiserBase
Public Functions
-
void appendAttribute(const std::string&, nonstd::optional<int64_t>) override
-
void appendAttribute(const std::string&, nonstd::optional<float>) override
-
void appendAttribute(const std::string&, nonstd::optional<double>) override
-
std::string str()
-
template<>
void appendAttr(const TensorIndexMap &tmap)
-
class OpJsonSerialiser : public popart::OpSerialiserBase
-
class OpSerialiser : public popart::OpSerialiserBase
-
class OpSerialiserBase
Subclassed by popart::OpEquivIdCreator, popart::OpJsonSerialiser, popart::OpSerialiser
Public Functions
-
inline virtual ~OpSerialiserBase()
-
void appendAttribute(const std::string&, float)
-
void appendAttribute(const std::string&, double)
-
void appendAttribute(const std::string&, int)
-
void appendAttribute(const std::string&, int64_t)
-
void appendAttribute(const std::string&, uint32_t)
-
void appendAttribute(const std::string&, uint64_t)
-
void appendAttribute(const std::string&, const std::string&)
-
void appendAttribute(const std::string&, const std::vector<float>&)
-
void appendAttribute(const std::string&, const std::vector<double>&)
-
void appendAttribute(const std::string&, const std::vector<int64_t>&)
-
void appendAttribute(const std::string&, bool)
-
virtual void appendAttribute(const std::string&, nonstd::optional<int64_t>) = 0
-
virtual void appendAttribute(const std::string&, nonstd::optional<float>) = 0
-
virtual void appendAttribute(const std::string&, nonstd::optional<double>) = 0
-
template<typename T, uint32_t V>
inline void appendAttribute(const std::string &key, const BasicOptional<T, V> &value)
-
class PriTaskDependency
Public Functions
-
inline DependencyType getType() const
-
bool operator==(PriTaskDependency const &rhs) const
-
class ReplicaEqualAnalysisProxy
Interface for object passed to Op::fwdPropagateIsReplicaEqual.
Public Functions
-
virtual ReplEqModifiedInputMap getModifiedInputMapFromAliases(const Op *op, const ReplEqOutputMap &replEqOpOutputMap) const = 0
Work out replica-equal values for modified inputs. A modified input is marked replica-equal if and only if the Op has an output that is an alias of the modified input, contains all elements of the input, and is itself deemed replica-equal.
If this does not hold, a modified input is assumed to be not replica-equal.
NOTE: It is possible for an Op to modify an input to a replica-equal value in a way that this implementation will not detect, but the assumption holds for the Ops supported at the time of writing.
-
virtual std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap> fwdPropagateIsReplicaEqualThroughGraph(const Graph *graph, const ReplEqInputMap &replEqGraphInputMap) = 0
A method that can be called to work out how replica-equal values for graph inputs propagate to replica-equal values for graph outputs.
NOTE: Graphs never copy-modify input tensors, although Ops that call graphs might (like CallOp, LoopOp).
- Parameters
graph – The graph to propagate replica-equal values through.
replEqGraphInputMap – The replica-equal values for the graph’s inputs.
- Returns
A tuple containing a ReplEqOutputMap that describes replica-equal values for the graph’s outputs and a ReplEqModifiedInputMap that describes the final replica-equal values of the graph’s inputs.
-
inline virtual ~ReplicaEqualAnalysisProxy()
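As a hedged sketch of how this proxy is typically used, a subgraph-calling op can forward the replica-equal values of its inputs through the graph it calls. The op class MySubgraphOp, its calledGraph() helper, and the exact signature of Op::fwdPropagateIsReplicaEqual are assumptions for illustration, not confirmed by this reference.
C++:
// Propagate input replica-equalness through the called subgraph and
// return the resulting output and modified-input maps.
std::tuple<ReplEqOutputMap, ReplEqModifiedInputMap>
MySubgraphOp::fwdPropagateIsReplicaEqual(
    const AliasModel &aliasModel,
    const ReplEqInputMap &inputMap,
    ReplicaEqualAnalysisProxy &proxy) const {
  return proxy.fwdPropagateIsReplicaEqualThroughGraph(&calledGraph(), inputMap);
}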
-
class ReplicatedTensorShardingTracer
Class that traces the graph and finds all tensors that:
1. are replicated tensor sharded,
2. have the same meta-shape describing the tensor shape before sharding,
3. use the same collective balanced reorder (CBR) when lowered to Poplar, and
4. share the same elementwise compatible tensor layout by virtue of 2. and 3.
Public Functions
-
ReplicatedTensorShardingTracer(const Ir &ir_)
Instantiate the tracer and trace.
- Parameters
ir_ – The IR to operate on.
-
bool hasGroup(const ReplicatedTensorShardingOpInfo &opInfo) const
Check if the Op associated with the opId has a replicated tensor sharding group.
- Parameters
opInfo – OpId and input/output indices.
- Returns
True if there is a group associated with the opId.
-
bool hasGroup(const TensorId &tensorId) const
Check if the tensor associated with the tensorId has a replicated tensor sharding group.
- Parameters
tensorId – The TensorId.
- Returns
True if there is a group associated with the tensorId.
-
const ReplicatedTensorShardingGroup &getGroup(const ReplicatedTensorShardingOpInfo &opInfo) const
Get the replicated tensor sharding group associated with the opId.
- Parameters
opInfo – OpId and input/output indices.
- Returns
The associated replicated tensor sharding group.
-
const ReplicatedTensorShardingGroup &getGroup(const TensorId &tensorId) const
Get the replicated tensor sharding group associated with the tensorId.
- Parameters
tensorId – The TensorId.
- Returns
The associated replicated tensor sharding group.
-
void trace(const std::set<Tensor*, PTensorCmp> &startTensors)
Traverse the graph to trace out operators and tensors belonging to the same replicated tensor sharding group.
- Parameters
startTensors – The set of tensors to start tracing from.
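A short usage sketch follows; ir is assumed to be a fully constructed popart::Ir and the tensor name is hypothetical.
C++:
// Tracing happens on construction when the tracer is given an IR.
popart::ReplicatedTensorShardingTracer tracer(ir);

// Query the group a (hypothetical) sharded tensor belongs to.
popart::TensorId tensorId = "Sharded___Weight";
if (tracer.hasGroup(tensorId)) {
  const auto &group = tracer.getGroup(tensorId);
  // Inspect which ops and tensors share the same CBR and layout here.
}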
-
class TensorLocationInfo
Public Functions
-
inline void setRemote(bool remote_)
-
inline bool isRemote() const
-
inline void setSharded(bool sharded_)
-
inline bool isSharded() const
-
inline void setRemoteBufferInfo(RemoteBufferId rbId, RemoteBufferIndex index)
-
inline const std::pair<RemoteBufferId, RemoteBufferIndex> getRemoteBufferInfo() const
-
inline bool operator==(const TensorLocationInfo &rhs) const
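A minimal sketch of recording and querying a tensor's location; the remote buffer id and index values are illustrative.
C++:
popart::TensorLocationInfo info;
info.setRemote(true);   // tensor lives in remote memory
info.setSharded(true);  // and is sharded across replicas
info.setRemoteBufferInfo(/*rbId=*/0, /*index=*/1);

if (info.isRemote()) {
  // Pair of {RemoteBufferId, RemoteBufferIndex}.
  auto bufferInfo = info.getRemoteBufferInfo();
}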
-
class InputCreatorCandidate : public popart::popx::ICreatorCandidate
Public Functions
-
InputCreatorCandidate(InIndex index_, const Opx *opx_, std::vector<OpxInAndOutIndex> pathFromInput_, int64_t scheduleIndex_)
-
InputCreatorCandidate() = default
-
~InputCreatorCandidate() override = default
-
std::pair<poplar::Tensor, ViewChangers> createInput(const poplar::DebugNameAndId &dnai) override
-
DnfTensorIds mustExistBeforeCreate() override
-
double getMaxCreatorPriority() const override
-
int64_t getNumElems() const override
-
inline std::vector<std::vector<OpxInAndOutIndex>> getPathsFromInput() final
-
inline void setPathFromInput(const std::vector<OpxInAndOutIndex> &value)
-
std::string str() override
-
inline int64_t getScheduleIndex() const final
-
class InputMultiCreatorCandidate : public popart::popx::ICreatorCandidate
Public Functions
-
InputMultiCreatorCandidate()
-
~InputMultiCreatorCandidate() override = default
-
std::pair<poplar::Tensor, ViewChangers> createInput(const poplar::DebugNameAndId &dnai) override
-
DnfTensorIds mustExistBeforeCreate() override
-
double getMaxCreatorPriority() const override
-
int64_t getNumElems() const override
-
std::string str() override
-
bool addCreatorCandidate(ICreatorCandidatePtr)
-
std::vector<std::vector<OpxInAndOutIndex>> getPathsFromInput() final
-
int64_t getScheduleIndex() const final
-
class IsInfx : public popart::popx::ElementWiseUnaryOpx
-
class IsNaNx : public popart::popx::ElementWiseUnaryOpx
-
class ViewChanger
Subclassed by popart::popx::ReplicatedGatherInScatterOutViewChanger, popart::popx::ReplicatedGatherOutScatterInViewChanger
-
class ViewChangers
-
class ReplicatedGatherInScatterOutViewChanger : public popart::popx::ViewChanger
Public Functions
-
inline ReplicatedGatherInScatterOutViewChanger(int64_t nelms_, ReplicatedTensorShardingGroupId group_)
-
inline bool containsAllDataRegions() const final
-
inline bool operator==(const ViewChanger &rhs) const final
-
class ReplicatedGatherOutScatterInViewChanger : public popart::popx::ViewChanger
Public Functions
-
inline ReplicatedGatherOutScatterInViewChanger(const gcl::CollectiveBalancedReorder *cbr_, ReplicatedTensorShardingGroupId group_)
-
inline bool operator==(const ViewChanger &rhs) const final
-
class Reader
A class which facilitates the deserialization process.
It allows reading serialized streams in order to restore PopART state. For more information on which components are deserialized, please refer to the Writer class.
Public Functions
-
Constructs a Reader class object.
- Parameters
in – Vector of source streams from which a PopEF file will be read.
-
~Reader()
Default destructor.
-
size_t readExecutableHash() const
- Returns
The executable hash or 0 if the stream contains corrupted data.
-
bool containsPoplarExecutable() const
- Returns
True if the stream contains a Poplar executable.
-
bool containsExecutable() const
- Returns
True if the stream contains a PopART executable.
-
bool containsPopefMetadata()
- Returns
True if the stream contains PopEF metadata.
-
poplar::Executable deserializePoplarExecutable() const
Deserializes the Poplar executable from the executable blob that is part of a PopEF file.
- Returns
Poplar executable.
-
std::unique_ptr<popart::popx::Executablex> deserializeExecutable(popart::Ir &ir, popart::popx::IrLowering &lowering) const
Load a PopART executable from a PopEF file.
- Parameters
ir – Object to which some of the deserialized data will be written.
lowering – Object to which some of the deserialized data will be written.
- Returns
PopART executable.
Public Static Functions
-
static nonstd::optional<size_t> checkFileForValidPoplarExecutable(const std::string &filePath)
Check that a PopART executable can be loaded from a PopEF file.
- Parameters
filePath – The full path to the PopEF file.
- Returns
nonstd::optional<size_t> The hash of the PopART IR if an executable could be loaded, or an empty optional otherwise.
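A hedged sketch of probing a PopEF file before committing to a full load; the file name is illustrative and the Reader class is assumed to be accessible via the namespace used elsewhere in this chapter.
C++:
// Returns the IR hash if the file holds a loadable PopART executable.
auto hash = Reader::checkFileForValidPoplarExecutable("model.popef");
if (hash) {
  // Safe to construct a Reader on the file's stream and call
  // deserializeExecutable() with a matching Ir and IrLowering.
}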