PyTorch for the IPU: User Guide
Version: 3.0.0
1. Introduction
1.1. Data batching
1.2. Parallel and Distributed execution
1.3. Constraints
1.4. Other resources
2. Installation
2.1. Version compatibility
2.2. Using a Python virtual environment
2.3. Setting the environment variables
2.4. Validating the setup
3. From PyTorch to PopTorch
3.1. Preparing your data
3.2. Creating your model
3.2.1. Training
3.2.2. Inference
3.3. The training loop
3.4. Multiple/custom losses
3.5. Optimizers
3.6. Going further
4. Features
4.1. Options
4.1.1. Setting options via config file
4.2. Model wrapping functions
4.2.1. poptorch.trainingModel
4.2.2. poptorch.inferenceModel
4.2.3. poptorch.PoplarExecutor
4.2.4. poptorch.isRunningOnIpu
4.3. Error handling
4.3.1. Recoverable runtime errors
4.3.2. Unrecoverable runtime errors
4.3.3. Application and other errors
4.4. Multi-IPU execution strategies
4.4.1. Annotations
Model partitioning using blocks
poptorch.Stage and poptorch.AutoStage
poptorch.Stage
poptorch.AutoStage
poptorch.Phase
Advanced annotation with strings
4.4.2. Available execution strategies
Pipelined execution
Sharded execution
Phased execution
Serial phased execution
Parallel phased execution
poptorch.Liveness
4.5. Optimizers
4.5.1. Loss scaling
4.5.2. Velocity scaling (SGD combined variant only)
4.5.3. Accumulation types
4.5.4. Constant attributes
4.5.5. Reading and writing optimizer state
4.6. PopTorch ops
4.6.1. poptorch.ctc_beam_search_decoder
4.6.2. poptorch.ipu_print_tensor
4.6.3. poptorch.identity_loss
4.6.4. poptorch.MultiConv
4.6.5. poptorch.nop
4.6.6. poptorch.dynamic_slice
4.6.7. poptorch.serializedMatMul
4.6.8. poptorch.set_available_memory
4.6.9. Miscellaneous functions
4.7. 16-bit float support
4.8. Automatic mixed-precision casting
4.9. PyTorch buffers
4.10. Creating custom ops
4.10.1. Implementing the custom op
4.10.2. Make the op available to PyTorch
4.10.3. Passing attributes to the custom op
4.11. Precompilation and caching
4.11.1. Caching
4.11.2. Precompilation
4.12. Environment variables
4.12.1. Logging level
4.12.2. Profiling
4.12.3. IPU Model
4.12.4. Wait for an IPU to become available
4.12.5. Enable executable caching
5. Efficient data batching
5.1. poptorch.DataLoader
5.2. poptorch.AsynchronousDataAccessor
5.2.1. Rebatching iterable datasets
5.3. poptorch.Options.deviceIterations
5.4. poptorch.Options.replicationFactor
5.5. poptorch.Options.Training.gradientAccumulation
5.6. poptorch.Options.outputMode
6. IPU supported operations
6.1. Torch operations
6.1.1. Tensor operations
Creation ops
Indexing, slicing, joining and mutating ops
Random samplers
6.1.2. Math operations
Pointwise ops
Reduction ops
Comparison ops
Other ops
BLAS and LAPACK Operations
6.2. Torch.nn operations
6.2.1. Containers
6.2.2. Convolution layers
6.2.3. Pooling layers
6.2.4. Padding layers
6.2.5. Activations
6.2.6. Normalization layers
6.2.7. Recurrent layers
6.2.8. Linear layers
6.2.9. Dropout
6.2.10. Sparse layers
6.2.11. Loss functions
6.2.12. Vision Layers
6.3. 16-bit float operations
6.4. 16-bit float migration
6.5. Gradient computation control
7. Debugging your model
7.1. Inspecting tensors
7.2. Anchoring tensors
7.3. Retrieving tensors
7.4. Inspecting optimiser state
8. Efficient IPU I/O
8.1. Prefetch and Multibuffering
8.2. Overlapping compute and I/O
9. Examples
9.1. MNIST example
10. Experimental features
10.1. Distributed execution without PopRun
10.2. torch.nn.CTCLoss
11. Legacy tracing frontend
11.1. Dispatcher support
11.2. Constraints when using tracing
11.3. 16-bit float operations when using tracing
11.3.1. Casting
11.3.2. Creation functions
11.3.3. Normalization
11.4. Automatic mixed-precision casting
11.4.1. Custom casting policies
12. API reference
12.1. Options
12.2. Helpers
12.3. PopTorch Ops
12.4. Model wrapping functions
12.5. Parallel execution
12.6. Optimizers
12.7. Data batching
12.8. Enumerations
12.9. Autocasting
13. Index
14. Legal notices
15. Changelog
15.1. v3.0 (Poplar SDK 3.0)
15.1.1. New features
15.1.2. API changes
15.1.3. Bug Fixes
15.2. v2.6 (Poplar SDK 2.6)
15.2.1. New features
15.2.2. API changes
15.2.3. Bug Fixes
15.3. v2.5 (Poplar SDK 2.5)
15.3.1. New features
15.3.2. API changes
15.3.3. Bug Fixes
15.4. v2.4 (Poplar SDK 2.4)
15.4.1. New features
15.4.2. API changes
15.4.3. Bug Fixes
15.5. v2.3 (Poplar SDK 2.3)
15.5.1. New features
15.5.2. Bug Fixes
15.5.3. API changes
15.6. v2.2 (Poplar SDK 2.2)
15.6.1. New features
15.6.2. API changes
15.7. v2.1 (Poplar SDK 2.1)
15.7.1. New features
15.7.2. API changes
15.7.3. Known issues
15.8. v2.0 (Poplar SDK 2.0)
15.8.1. New features
15.8.2. API changes
15.9. v1.0 (Poplar SDK 1.4)
15.9.1. New features
15.9.2. Known issues
15.10. v0.1 (Poplar SDK 1.3)
15.10.1. New features
Index
_
__call__() (poptorch.PoplarExecutor method)
__init__() (poptorch.AsynchronousDataAccessor method)
__init__() (poptorch.Block method)
__init__() (poptorch.CPU method)
__init__() (poptorch.DataLoader method)
__init__() (poptorch.optim.Adam method)
__init__() (poptorch.optim.AdamW method)
__init__() (poptorch.optim.LAMB method)
__init__() (poptorch.optim.RMSprop method)
__init__() (poptorch.optim.SGD method)
__init__() (poptorch.ParallelPhasedExecution method)
__init__() (poptorch.Phase method)
__init__() (poptorch.PipelinedExecution method)
__init__() (poptorch.SerialPhasedExecution method)
__init__() (poptorch.Stage method)
_DistributedOptions (class in poptorch.options)
_JitOptions (class in poptorch.options)
_PrecisionOptions (class in poptorch.options)
_TensorLocationOptions (class in poptorch.options)
_TrainingOptions (class in poptorch.options)
A
accumulationAndReplicationReductionType() (poptorch.options._TrainingOptions method)
Adam (class in poptorch.optim)
AdamW (class in poptorch.optim)
anchorTensor() (poptorch.Options method)
appendToLocationExcludes() (poptorch.Options method)
AsynchronousDataAccessor (class in poptorch)
attachToDevice() (poptorch.PoplarExecutor method)
autocast (class in poptorch.autocasting)
autocastEnabled() (poptorch.options._PrecisionOptions method)
autocastPolicy() (poptorch.options._PrecisionOptions method)
autoRoundNumIPUs() (poptorch.Options method)
AutoStage (class in poptorch)
availableMemoryProportions() (poptorch.MultiConv method)
B
BeginBlock (class in poptorch)
Block (class in poptorch)
BlockFunction() (in module poptorch)
blocks (poptorch.Stage property)
broadcastBuffers() (poptorch.Options method)
C
Channel (class in poptorch.profiling)
clone() (poptorch.Options method)
combinedBatchSize (poptorch.DataLoader property)
compile() (poptorch.PoplarExecutor method)
compileAndExport() (poptorch.PoplarExecutor method)
Compiler (class in poptorch)
configureProcessId() (poptorch.options._DistributedOptions method)
ConnectionType (class in poptorch)
connectionType() (poptorch.Options method)
copyWeightsToDevice() (poptorch.PoplarExecutor method)
copyWeightsToHost() (poptorch.PoplarExecutor method)
copyWeightsToHostIfNeeded() (poptorch.PoplarExecutor method)
CPU (class in poptorch)
ctc_beam_search_decoder() (in module poptorch)
custom_op (class in poptorch)
cycleBackOff() (poptorch.MultiConv method)
cycleCount() (poptorch.PoplarExecutor method)
D
DataLoader (class in poptorch)
DataLoaderMode (class in poptorch)
defaultOutputMode() (poptorch.Options method)
destroy() (poptorch.PoplarExecutor method)
detachFromDevice() (poptorch.PoplarExecutor method)
deviceIterations() (poptorch.Options method)
disable() (poptorch.options._DistributedOptions method)
disableModuleNamescope() (poptorch.Options method)
Distributed (poptorch.Options property)
dynamic_slice() (in module poptorch)
E
enableConvDithering() (poptorch.MultiConv method)
enableExecutableCaching() (poptorch.Options method)
enableFloatingPointExceptions() (poptorch.options._PrecisionOptions method)
enableProfiling() (poptorch.Options method)
enableStableNorm() (poptorch.Options method)
enableStochasticRounding() (poptorch.options._PrecisionOptions method)
enableSyntheticData() (poptorch.Options method)
execute() (poptorch.CPU method)
F
for_loop() (in module poptorch)
G
getComputeLatency() (poptorch.PoplarExecutor method)
getHostIpuLatency() (poptorch.PoplarExecutor method)
getIpuHostLatency() (poptorch.PoplarExecutor method)
getLatency() (poptorch.PoplarExecutor method)
getPerfCounters() (poptorch.PoplarExecutor method)
getTensorNames() (poptorch.PoplarExecutor method)
gradientAccumulation() (poptorch.options._TrainingOptions method)
H
halfFloatCasting() (poptorch.options._PrecisionOptions method)
HalfFloatCastingBehavior (class in poptorch)
hasMLIRSupportOnPlatform (class in poptorch)
I
identity_loss() (in module poptorch)
inferenceModel() (in module poptorch)
instrument() (poptorch.profiling.Channel method)
ipu() (poptorch.Stage method)
ipu_print_tensor() (in module poptorch)
ipuHardwareIsAvailable() (in module poptorch)
ipuHardwareVersion() (in module poptorch)
ipus() (poptorch.Phase method)
isAttachedToDevice() (poptorch.PoplarExecutor method)
isCompiled() (poptorch.PoplarExecutor method)
isConstant() (poptorch.optim.VariableAttributes method)
isRunningOnIpu() (in module poptorch)
J
Jit (poptorch.Options property)
L
LAMB (class in poptorch.optim)
Liveness (class in poptorch)
load() (in module poptorch)
load_state_dict() (poptorch.PoplarExecutor method)
loadExecutable() (poptorch.PoplarExecutor method)
loadFromFile() (poptorch.Options method)
logCycleCount() (poptorch.Options method)
logDir() (poptorch.Options method)
M
markAsConstant() (poptorch.optim.VariableAttributes method)
markAsVariable() (poptorch.optim.VariableAttributes method)
MatMulSerializationMode (class in poptorch)
MeanReductionStrategy (class in poptorch)
minElementsForOffChip() (poptorch.TensorLocationSettings method)
minElementsForReplicatedTensorSharding() (poptorch.TensorLocationSettings method)
model (poptorch.PoplarExecutor property)
modelName() (poptorch.Options method)
MultiConv (class in poptorch)
MultiConvPlanType (class in poptorch)
N
NameScope (class in poptorch)
nop() (in module poptorch)
numIOTiles() (poptorch.options._TensorLocationOptions method)
numProcesses (poptorch.options._DistributedOptions property)
O
Options (class in poptorch)
options (poptorch.DataLoader property)
options (poptorch.PoplarExecutor property)
OutputMode (class in poptorch)
outputMode() (poptorch.Options method)
OverlapMode (class in poptorch)
P
ParallelPhasedExecution (class in poptorch)
partialsTypes() (poptorch.MultiConv method)
perConvReservedTiles() (poptorch.MultiConv method)
Phase (class in poptorch)
phase() (poptorch.ParallelPhasedExecution method)
phase() (poptorch.SerialPhasedExecution method)
PipelinedExecution (class in poptorch)
planType() (poptorch.MultiConv method)
Policy (class in poptorch.autocasting)
PoplarExecutor (class in poptorch)
Precision (poptorch.Options property)
processId (poptorch.options._DistributedOptions property)
R
randomSeed() (poptorch.Options method)
recomputationCheckpoint() (in module poptorch)
ReductionType (class in poptorch)
registerPersistentData() (poptorch.CPU method)
relaxOptimizerAttributesChecks() (poptorch.Options method)
removeBlocks() (in module poptorch)
replicationFactor() (poptorch.Options method)
RMSprop (class in poptorch.optim)
rng_state (poptorch.PoplarExecutor property)
runningStatisticsAlwaysFloat() (poptorch.options._PrecisionOptions method)
S
save() (poptorch.PoplarExecutor method)
serializedMatMul() (in module poptorch)
SerialPhasedExecution (class in poptorch)
set_available_memory() (in module poptorch)
set_overlap_for_input() (in module poptorch)
set_overlap_for_output() (in module poptorch)
setAccumulatorLocation() (poptorch.options._TensorLocationOptions method)
setActivationLocation() (poptorch.options._TensorLocationOptions method)
setAutomaticLossScaling() (poptorch.options._TrainingOptions method)
setAvailableMemoryProportion() (poptorch.Options method)
setConvolutionDithering() (poptorch.options._TrainingOptions method)
setEnvVarNames() (poptorch.options._DistributedOptions method)
setExecutionStrategy() (poptorch.Options method)
setLogLevel() (in module poptorch)
setMeanAccumulationAndReplicationReductionStrategy() (poptorch.options._TrainingOptions method)
setOptimizer() (poptorch.PoplarExecutor method)
setOptimizerLocation() (poptorch.options._TensorLocationOptions method)
setPartialsType() (poptorch.options._PrecisionOptions method)
setTensorsLiveness() (poptorch.SerialPhasedExecution method)
setWeightLocation() (poptorch.options._TensorLocationOptions method)
SGD (class in poptorch.optim)
ShardedExecution (class in poptorch)
SharingStrategy (class in poptorch)
showCompilationProgressBar() (poptorch.Options method)
sourceLocationExcludes() (poptorch.Options method)
Stage (class in poptorch)
stage() (poptorch.ParallelPhasedExecution method)
stage() (poptorch.PipelinedExecution method)
stage() (poptorch.SerialPhasedExecution method)
stage() (poptorch.ShardedExecution method)
step() (poptorch.optim.LAMB method)
SyncPattern (class in poptorch)
syncPattern() (poptorch.Options method)
T
TensorLocations (poptorch.Options property)
TensorLocationSettings (class in poptorch)
terminate() (poptorch.AsynchronousDataAccessor method)
terminate() (poptorch.DataLoader method)
traceModel() (poptorch.options._JitOptions method)
tracepoint() (poptorch.profiling.Channel method)
Training (poptorch.Options property)
trainingModel() (in module poptorch)
U
useAutoId() (poptorch.Block static method)
useIOTilesToLoad() (poptorch.TensorLocationSettings method)
useIOTilesToStore() (poptorch.TensorLocationSettings method)
useIpuId() (poptorch.Options method)
useIpuModel() (poptorch.Options method)
useOfflineIpuTarget() (poptorch.Options method)
useOnChipStorage() (poptorch.TensorLocationSettings method)
useReplicatedTensorSharding() (poptorch.TensorLocationSettings method)
useSeparateBackwardPhase() (poptorch.ParallelPhasedExecution method)
useSeparateBackwardPhase() (poptorch.SerialPhasedExecution method)
V
VariableAttributes (class in poptorch.optim)