13. Changelog
13.1. v2.3 (Poplar SDK 2.3)
13.1.1. New features
Support for torch.bitwise_and, torch.bitwise_or, torch.bitwise_xor
Support for torch.logical_and, torch.logical_or
Support for K-dimensional NLLLoss and K-dimensional CrossEntropyLoss
Support for non-default affine parameter flags in normalisation ops
Support for torch.Tensor.T
Support for torch.bool in torch.zeros, torch.zeros_like, torch.ones, torch.ones_like
Support for torch.scatter and its in-place variant
Support for in-place modification to buffers on the IPU
Support for taking slices of scalars
Support a version of bilinear upsampling that specifies the intended output size instead of scale factors
Add support for overlapping host IO on inputs via poptorch.set_overlap_for_input() (see the sketch after this list)
Add option for setting the number of IO tiles via numIOTiles in poptorch.Options (required for useIOTilesToLoad() and poptorch.set_overlap_for_input())
Improve PopTorch’s parity with PyTorch’s Softplus
Improve implementation of torch.SiLU by using Poplar’s Swish operator
Additional support for operation overloads
Fix issue where PopTorch recalculated upsampling scales in fp16
Fix issue where the last use of poptorch.set_available_memory would be pruned
Add documentation on available memory proportion to incorporate embeddings and indexing operations
Add documentation on how users can generate debug information
Support replicated tensor sharding when running on multiple processes
Allow selection of a non-constant x input
Support for the enableConvDithering convolution option
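A minimal sketch of how overlapped host IO might be set up, assuming a model wrapped with poptorch.inferenceModel; the tile count, the OverlapDeviceIterationLoop mode and the placement of numIOTiles under opts.TensorLocations are illustrative assumptions rather than recommendations:

    import torch
    import poptorch

    class OverlappedIOModel(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = torch.nn.Linear(128, 10)

        def forward(self, x):
            # Mark the input so host-to-IPU transfers can overlap with compute.
            x = poptorch.set_overlap_for_input(
                x, poptorch.OverlapMode.OverlapDeviceIterationLoop)
            return self.fc(x)

    opts = poptorch.Options()
    # Reserve tiles for IO; both the attribute path and the count are assumptions.
    opts.TensorLocations.numIOTiles(32)

    model = poptorch.inferenceModel(OverlappedIOModel(), opts)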
13.1.2. API changes
Default mean reduction strategies have changed from the deprecated PostAndLoss strategy to Post or Running, based on the optimiser accumulation type
Mean reduction strategy can now be set via poptorch.Options.Training.setMeanAccumulationAndReplicationReductionStrategy (see the sketch after this list)
Add warning that IPU-specific optimiser states cannot be read from the host when calling get_state() on poptorch.optim optimisers
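A hedged sketch of setting the strategy explicitly; the poptorch.MeanReductionStrategy enum name is inferred from the strategy names above and should be treated as an assumption:

    import poptorch

    opts = poptorch.Options()
    # Explicitly choose the Running strategy rather than relying on the
    # default derived from the optimiser accumulation type (enum name assumed).
    opts.Training.setMeanAccumulationAndReplicationReductionStrategy(
        poptorch.MeanReductionStrategy.Running)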
13.2. v2.2 (Poplar SDK 2.2)
13.2.1. New features
Migrated to PyTorch version 1.9.0
Support for torch.roll
Support for torch.clone
Add modelName session option that can be passed to PopART
Support List inputs to a model
Tuples/Lists of constants can now be returned by a model
Add enableProfiling convenience method in poptorch.Options to enable profile report generation
Fix bug with torch.Tensor.repeat when applied to an input during training
Fix bug with aten::to when applied to a constant used as an input to another node
Improved error message when encountering untraceable types during compilation
Support for torch.gather. Please note: this operator is known to cause long compilation times. Consider using a one-hot-based solution instead, or torch.index_select if appropriate (see the sketch after this list).
Using a convolution layer op with the value of padding greater than or equal to kernel_size is now supported.
Support for Poplar recoverable and unrecoverable errors.
Support for torch.flip.
Support for torch.Tensor.new_ones and torch.Tensor.new_zeros
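As a pure-PyTorch illustration of the torch.gather note above, the common case of selecting whole rows can often be expressed with torch.index_select instead; the shapes here are arbitrary examples:

    import torch

    src = torch.arange(12.0).reshape(4, 3)
    rows = torch.tensor([0, 2])

    # Selecting whole rows: index_select typically compiles faster than gather.
    a = torch.index_select(src, dim=0, index=rows)

    # Equivalent gather call for the same row selection.
    b = torch.gather(src, 0, rows.unsqueeze(1).expand(-1, src.shape[1]))
    assert torch.equal(a, b)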
13.2.2. API changes
Removed accumulationReductionType which was deprecated in 2.1 in favour of accumulationAndReplicationReductionType in poptorch.Options.Training
Removed runningVarianceAlwaysFloat which was deprecated in 2.1 and replaced by runningStatisticsAlwaysFloat in poptorch.Options.Precision
13.3. v2.1 (Poplar SDK 2.1)
13.3.1. New features
Support for torch.unbind
Add option to set poptorch.Options using options specified in a config file.
Add mode=poptorch.DataLoaderMode.AsyncRebatched
Support for PopART name scopes via poptorch.NameScope
Add mixed precision automatic casting
Support for torch.cross
Support for torch.functional.one_hot
Support for torch.int8 data types
Support for torch.median
Support for torch.index_select
Support for torch.scatter_add
Add poptorch.Options.Precision.enableFloatingPointExceptions to control floating point exception behavior (see the sketch after this list)
Support for in-place changes to inputs.
Add option to log the number of IPU cycles used in executing the main graph
Support for torch.nn.GRU
Add automatic loss scaling option which can be enabled via poptorch.Options.Training.setAutomaticLossScaling.
Add poptorch.BlockFunction decorator for assigning an existing function to a block.
Add mechanism for inspecting arbitrary tensors
Add custom operator for CTC beam search decoding: poptorch.ctc_beam_search_decoder
Add a separate tensor variant (now default) to the SGD optimiser.
Add a TensorFlow variant to the RMSProp optimiser.
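A hedged sketch combining two of the new options above; both calls are assumed to take a simple boolean and the values shown are illustrative:

    import poptorch

    opts = poptorch.Options()
    # Raise floating point exceptions instead of silently propagating inf/NaN.
    opts.Precision.enableFloatingPointExceptions(True)
    # Opt in to automatic loss scaling.
    opts.Training.setAutomaticLossScaling(True)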
13.3.2. API changes
Removed Options.Popart which was deprecated in v2.0 and replaced with Options._Popart
Removed MultiConvPartialsType which was deprecated in v2.0
Deprecated poptorch.Options.Training.accumulationReductionType in favour of poptorch.Options.Training.accumulationAndReplicationReductionType
Deprecated runningVarianceAlwaysFloat in favour of runningStatisticsAlwaysFloat in poptorch.Options.Precision, as the new option computes both the running mean and variance in FP32 when set to True.
Use of SGD via PyTorch’s or PopTorch’s API now results in use of the new separate tensor variant by default. To revert to the previous default variant, use poptorch.optim.SGD with use_combined_accum=True (see the sketch after this list).
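A small sketch of reverting to the combined-accumulation variant; the model and learning rate are placeholders:

    import torch
    import poptorch

    model = torch.nn.Linear(16, 4)
    # use_combined_accum=True restores the pre-2.1 combined-tensor SGD variant.
    optimizer = poptorch.optim.SGD(model.parameters(), lr=0.01,
                                   use_combined_accum=True)
    # The optimizer is then passed to poptorch.trainingModel as usual.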
13.3.3. Known issues
Using a convolution layer op with the value of padding greater than or equal to kernel_size results in an error when training. Use a constant pad layer to apply the excess padding prior to the convolution instead, as in the sketch below.
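A pure-PyTorch sketch of that workaround; the channel counts, kernel size and padding split are illustrative:

    import torch

    # Instead of nn.Conv2d(8, 8, kernel_size=3, padding=3), where padding >= kernel_size,
    # apply the excess padding explicitly and keep the convolution's own padding small.
    workaround = torch.nn.Sequential(
        torch.nn.ConstantPad2d(1, 0.0),                   # the excess padding
        torch.nn.Conv2d(8, 8, kernel_size=3, padding=2),  # padding < kernel_size
    )
    y = workaround(torch.randn(1, 8, 32, 32))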
13.4. v2.0 (Poplar SDK 2.0)
13.4.1. New features
Support for the following activation functions:
torch.nn.acosh
torch.nn.asinh
torch.nn.atanh
torch.nn.Hardshrink
torch.nn.SiLU
torch.nn.Softplus
torch.nn.Softshrink
torch.nn.Threshold
Support for the following random sampling operations:
torch.bernoulli
torch.distributions.Bernoulli
Experimental support for torch.nn.CTCLoss
Add Adam optimizer
Support for torch.nn.AdaptiveAvgPool1d, torch.nn.AdaptiveAvgPool3d
Migrated to PyTorch version 1.7.1
Support for aten::index, aten::index_put_
Support for torch.zeros_like, torch.ones_like
Allow the user to specify which Optimizer attributes are constant or not.
Allow the user to specify mode=poptorch.DataLoaderMode.Async in the poptorch.DataLoader constructor instead of explicitly creating an AsynchronousDataAccessor
Support for torch.nn.EmbeddingBag
Support for torch.clamp_max and torch.clamp_min
Support for torch.min(tensor, dim=.*, keepdim=.*) and torch.max(tensor, dim=.*, keepdim=.*) overloads.
Support for poptorch.isRunningOnIpu. This function returns True when executing on the IPU and False when executing the model outside IPU scope.
Support for torch.amax and torch.amin
Support for attributes in custom ops.
Support for precompilation and reloading exported executables (poptorch.PoplarExecutor.compileAndExport and poptorch.load); see the sketch after this list.
Support for slices with a variable start index (the slice size must be constant).
Add ipuHardwareVersion function to read the version of the IPU hardware present on the system.
Changed the default targeted IPU version for the model and offline compilation to 2.
Changed the accumulationReductionType(reduction) option to now apply to replication reduction as well
Add environment variable POPTORCH_CACHE_DIR
Support for torch.fmod and torch.remainder
Support for torch.addcdiv
Support for torch.bitwise_not
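A hedged sketch of the precompile-and-reload flow; the filename, model and input shape are placeholders:

    import torch
    import poptorch

    model = torch.nn.Linear(32, 8)
    example_input = torch.randn(4, 32)

    # Compile ahead of time and write the executable to disk.
    inference = poptorch.inferenceModel(model, poptorch.Options())
    inference.compileAndExport("linear.poptorch", example_input)

    # Later (possibly in another process), reload without recompiling.
    restored = poptorch.load("linear.poptorch")
    out = restored(example_input)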
13.4.2. API changes
Deprecated Options.Popart; Options._Popart may be used experimentally (see the sketch below).
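A hedged example of that experimental escape hatch; the option name and value passed to set() are purely illustrative:

    import poptorch

    opts = poptorch.Options()
    # Forward an arbitrary PopART session option by name (experimental).
    opts._Popart.set("enableStochasticRounding", True)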
13.5. v1.0 (Poplar SDK 1.4)
13.5.1. New features
Support for torch.nn.InstanceNorm1d, torch.nn.InstanceNorm2d and torch.nn.InstanceNorm3d
Fixed issue with torch.nn.GroupNorm where only 4-dimensional inputs could be used
Replaced Adam with AdamW optimizer.
Support for the following loss functions:
torch.nn.KLDivLoss
torch.nn.PoissonNLLLoss
torch.nn.HingeEmbeddingLoss
torch.nn.BCEWithLogitsLoss
torch.nn.SmoothL1Loss
torch.nn.SoftMarginLoss
torch.nn.CosineEmbeddingLoss
torch.nn.MarginRankingLoss
torch.nn.TripletMarginLoss
torch.nn.NLLLoss for aten::nll_loss2d
Support for torch.optim.RMSprop optimizer
Support for bool inputs to models
Improved support for half type models and inputs.
Using a mix of float 16 and float 32 inputs is now supported. Please see the documentation for cases in which a model might use different types compared to when run natively with PyTorch.
Support for serialized matrix multiplications (poptorch.serializedMatMul)
Support for the POPTORCH_IPU_MODEL_VERSION environment variable.
Support for torch.cumsum
Support for pipelined / phased / sharded execution.
Add PoplarExecutor.compile() to compile the model without executing it.
Use sphinx-build to generate the documentation.
Use Miniconda as build environment.
Support for torch.meshgrid
Support for torch.cartesian_prod
Optimized torch.matmul implementation, with limitations
Fuses input 0’s batch dimensions with the row dimension to avoid a ReduceSum in its backward pass, for performance reasons
Partial support for torch.einsum
Diagonals and ellipsis notation are unsupported
Support for executable caching: poptorch.Options.enableExecutableCaching()
Add optional title argument to poptorch.ipu_print_tensor
Add len() method to poptorch.AsynchronousDataLoader
Support for LAMB optimizer
Support for recomputationCheckpoint()
Support for torch.tensordot
Support for rounding up the number of IPUs used, to allow models which request a number of IPUs that is not a power of 2: poptorch.Options.autoRoundNumIPUs(True). Note that the extra IPUs are reserved but not used, so it is preferable to design the model to use a number of IPUs that is a power of two (see the sketch after this list).
Support for multi-convolutions with poptorch.MultiConv
Support for PopART batch serialization settings
These can be set via poptorch.Options().Popart.set()
Support for PopVision System Analyser added: tracing can be enabled by setting PVTI_OPTIONS='{"enable":"true"}'
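A hedged sketch combining two of the Options additions above; the cache path is an example and enableExecutableCaching is assumed to take a directory path:

    import poptorch

    opts = poptorch.Options()
    # Cache compiled executables between runs (path is illustrative).
    opts.enableExecutableCaching("./exe_cache")
    # Round the requested IPU count up to the next power of two; the extra
    # IPUs are reserved but left idle.
    opts.autoRoundNumIPUs(True)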
13.5.2. Known issues
Race condition in poptorch.DataLoader when using several workers, resulting in the iteration sometimes finishing one element early. Workaround: set num_workers to 0 or 1.
poptorch.custom_op() doesn’t allow the user to set attributes. Workaround: hardcode the attributes in the custom operation or pass them as regular inputs.
Graphs containing block annotations (poptorch.Block or poptorch.BeginBlock) cannot be exported using torch.save(). Workaround: make a soft copy of the model that doesn’t contain Blocks and use it to save/load the weights. (The weights should be shared between the two models.)
Lists of tensors are not supported as inputs.
Workaround: Use tuples instead.
    # Use a tuple
    assert inference_model((t1, t2))  # instead of [t1, t2]
13.6. v0.1 (Poplar SDK 1.3)
13.6.1. New features
PopTorch now exposes PopART anchor options to choose how much data to return from a model. These are passed into the model wrapper via anchor_mode; the options are Sum, All, Final and EveryN.
Support for batched LSTM and batch first
An Options object can now be passed to poptorch.trainingModel / poptorch.inferenceModel to configure the session and select IPUs (see the sketch at the end of this list)
The ‘profile’ option has been removed; instead, profiling can be enabled by setting the environment variable POPLAR_ENGINE_OPTIONS='{"autoReport.all":"true", "autoReport.directory":"."}'
Support for the POPTORCH_IPU_MODEL and POPTORCH_WAIT_FOR_IPU environment variables.
Support for the torch comparison operations:
torch.eq
torch.ge
torch.gt
torch.le
torch.lt
torch.max
torch.min
torch.ne
torch.isnan
torch.topk
torch.min and torch.max only support (tensor, tensor) and (tensor) overloads. They do not support the (tensor, dim=, keepdim=) overload.
torch.topk only supports sorted=False and largest=True
Automatically synchronise the weights back to the host after using the IPU for training (i.e. no need to explicitly call copyWeightsToHost() anymore)
Support for non-linear activations torch.nn.PReLU and torch.nn.Hardtanh
Support for Adam optimizer.
Support for half type models and inputs.
Models that require operations on input tensors of mixed precision are not currently supported. For example:
    def forward(self, x, y):
        # x: Half
        # y: Float32
        return x + y  # Not supported.
Support for tensor.fill_, torch.full, torch.full_like
Support for user-provided custom operations. See the PopART documentation for information on how to write them. They are exposed by poptorch.custom_op, which takes a list of input tensors, strings for the PopART op name and domain, the domain version, and a list of tensors with the same shape and size as the expected output tensors. This is to ensure the PyTorch trace remains valid: because it traces on the CPU, it won’t actually execute the operation when building the graph.
Support for torch.nn.Conv1D / torch.nn.Conv2D / torch.nn.Conv3D
Support for torch.nn.Upsample (‘nearest’ mode only)
Support for tensor.size
Support for the following random sampling operations:
torch.rand
torch.uniform_
torch.distributions.Uniform
torch.randn
torch.normal
torch.normal_
For repeatable random number generation, use the randomSeed method of poptorch.Options
Support for torch.clamp
Adds poptorch.DataLoader
Adds optimized poptorch.AsynchronousDataAccessor which allows for a dataloader to be offloaded to a background thread asynchronously.
Support for torch.norm
Upgraded from torch 1.5.0 to torch 1.6.0
Experimental support for single host distributed execution
Add torch.where and tensor.masked_fill
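A small sketch of the Options workflow described in this list (passing an Options object to the model wrapper, here with randomSeed for repeatable random numbers); the model, seed and input are placeholders:

    import torch
    import poptorch

    opts = poptorch.Options()
    # Seed the IPU's random number generation for repeatable results.
    opts.randomSeed(42)

    model = torch.nn.Linear(8, 2)
    inference = poptorch.inferenceModel(model, opts)
    result = inference(torch.randn(1, 8))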