5.9. PopART

3.3.0

New features

  • Added support for the largest=false option in the TopK operation (see the TopK sketch after this list).

  • Added the NormalizeImage op, which normalises the data and pads it to four channels on-device after the data has been transferred to the IPU. This can improve performance for subsequent convolution operations.

  • Added the stashAllTensorsInferencePipeline option to the SessionOptions class to enable all tensors to be stashed when running a pipelined model for inference. This may improve performance for certain use cases (see the SessionOptions sketch after this list).

  • Added the virtualGraphSplitRatios option to the SessionOptions class, which allows the result of automatic virtual graph placement to be fine-tuned (also shown in the SessionOptions sketch after this list).

  • Added the SplineBasis and SplineWeighting ops to support the PyTorch Geometric SplineConv operator.

  • Switched uses of the deprecated deviceIteration to deviceIterations.
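
The following is a minimal sketch of using the new largest=false support to select the k smallest values, assuming the Python Builder exposes the opset-11 TopK attributes (axis, largest, sorted) as keyword arguments; the tensor shapes and values are illustrative only.

    import numpy as np
    import popart

    # Request an ONNX opset that defines the TopK "largest" attribute (>= 11).
    builder = popart.Builder(
        opsets={"ai.onnx": 11, "ai.onnx.ml": 1, "ai.graphcore": 1})

    x = builder.addInputTensor(popart.TensorInfo("FLOAT", [1, 8]))
    k = builder.aiOnnx.constant(np.array([3], dtype=np.int64))

    # largest=0 (i.e. largest=false) returns the k smallest elements
    # along the chosen axis instead of the k largest.
    values, indices = builder.aiOnnx.topk([x, k], axis=1, largest=0, sorted=1)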
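
Below is a sketch of setting the two new SessionOptions fields from the Python API; the pipelining and virtual graph settings shown alongside them are the usual companions of these options, and the split ratios are a placeholder assumption rather than a recommended value.

    import popart

    opts = popart.SessionOptions()

    # Stash all tensors when running a pipelined model for inference.
    opts.enablePipelining = True
    opts.stashAllTensorsInferencePipeline = True

    # Fine-tune the result of automatic virtual graph placement.
    # The ratios below are placeholders; check the SessionOptions
    # reference for the accepted format.
    opts.virtualGraphMode = popart.VirtualGraphMode.Auto
    opts.virtualGraphSplitRatios = [0.5, 0.5]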

Bug Fixes

  • Fixed the Loop and Scan ops so that any constant tensors are implicitly added to the subgraphs of these ops as inputs.

  • Fixed the overflow in the Clip operator.

Other improvements

None

Known issues

None

3.2.1

New features

  • Added the environment variable POPART_PRELOAD_POPEF. When this is set to "full-preload", PopART performs a full sequential read of the PopEF file from the filesystem, so that subsequent operations are carried out on the cached file. This makes it viable to store PopEF files on remote storage such as S3 and load them on demand, with later accesses served from the cache (a usage sketch follows).
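
A minimal sketch of enabling the preload behaviour from Python; the variable only needs to be set in the process environment before the PopEF file is loaded.

    import os

    # Read the whole PopEF file sequentially up front so that subsequent
    # accesses are served from the cached copy rather than from remote
    # storage such as S3.
    os.environ["POPART_PRELOAD_POPEF"] = "full-preload"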

3.2.0

New features

  • Removed the dependency on the snap library. The related popart::PopOpx class has been removed; see Compatibility changes for more information.

  • Added support for the mul reduction in the scatterreduce operation.

Bug Fixes

  • Modified the PopART implementation of scatterreduce to match the PyTorch scatter_reduce implementation on the CPU (see the reference sketch after this list).
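
For reference, a small PyTorch snippet showing the CPU scatter_reduce behaviour that the PopART implementation now matches; reduce="prod" corresponds to the mul reduction added in this release.

    import torch

    src = torch.tensor([1.0, 2.0, 3.0, 4.0])
    index = torch.tensor([0, 0, 1, 1])
    out = torch.ones(2)

    # Multiply every src element into the output slot named by index;
    # include_self=True folds the initial values of "out" into the product.
    out.scatter_reduce_(0, index, src, reduce="prod", include_self=True)
    print(out)  # tensor([ 2., 12.])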

Other improvements

  • Added support for grouped gather operations.

  • Added support for grouped scatterreduce operations.

  • Used multiUpdate instead of scatter in the topk gradient implementation, which improves performance for torch.max.

  • Prevented weight duplication during inference to save memory; the weight tensor can be treated as a constant for inference.

  • Added useLoopCandidateCreator for weights shared by loop operators and non-loop operators, in order to optimise the layout.

Known issues

None

Compatibility changes

  • The popart::PopOpx class has been removed. Please change your custom ops to inherit from popart::Opx instead.