5.1. PopTorch

3.2.0

New features

  • Upgraded the supported PyTorch version from 1.13.0 to 1.13.1.

  • Added support for automatic fusion of scatter operations into a grouped scatter operation to improve performance.

  • Added support for batch_sampler in poptorch.DataLoader (see the sketch after this list).

  • Added support for torch.linalg norm operations:

    • torch.linalg.norm: partial support

      2-norm and nuclear norm are unsupported for matrices.

    • torch.linalg.matrix_norm: partial support

      2-norm and nuclear norm are unsupported.

    • torch.linalg.vector_norm: supported

  • Updated support for the latest PyTorch implementation of the torch.linalg.norm op.

  • Added support for torch.Tensor.index_reduce.

  • Added the poptorch.dynamic_update function.

  • Added HeteroData support in DataLoaders.

  • Added support for setting the values of poptorch.Options via an environment variable.
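
As an example of the new batch_sampler support, a minimal sketch is shown below. It assumes default poptorch.Options; the dataset and sampler are illustrative only and are not part of the release.

    import torch
    import poptorch

    # Illustrative dataset and batch sampler; any torch.utils.data sampler can be used.
    dataset = torch.utils.data.TensorDataset(torch.randn(256, 10))
    batch_sampler = torch.utils.data.BatchSampler(
        torch.utils.data.SequentialSampler(dataset), batch_size=16, drop_last=True)

    opts = poptorch.Options()

    # batch_sampler is forwarded to the underlying torch DataLoader, so it is
    # mutually exclusive with batch_size, shuffle, sampler and drop_last.
    loader = poptorch.DataLoader(opts, dataset, batch_sampler=batch_sampler)

    for (batch,) in loader:
        print(batch.shape)  # torch.Size([16, 10])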

Bug Fixes

  • Calling the loadFromFile method twice on the same poptorch.Options object now has well-defined behaviour.

  • Fixed a failure when copying the optimiser state to the host for replica-sharded variables.

  • Fixed a “cannot access the data pointer of a Tensor that doesn’t have storage” error.

  • Fixed the DataLoader rebatched size in asynchronous mode when the batch size is equal to 1.

  • Fixed the implementation of scatter_reduce to match the PyTorch implementation on the CPU.

Other improvements

  • Added torch_scatter to the compatibility table in the PopTorch documentation.

Known issues

None

Compatibility changes

None

3.1.0

New features

  • Upgraded from PyTorch 1.10 to 1.13.

  • Added support for variables being sharded across replicas.

  • poptorch.set_overlap_for_input and poptorch.set_overlap_for_output can now be applied to tuples, lists, and dicts of tensors.

  • PopTorch now catches aten::lstm directly when compiling with the dispatcher for PopART, allowing set_available_memory to work with it.

  • Added support for aten::index_fill_.int_Scalar.

  • Added support for dict inputs.

  • Added support for torch.count_nonzero.

  • Added support for the tanh approximation for GELU.

  • Added support for the torch.scatter_reduce operation (see the sketch after this list).
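
The torch.scatter_reduce support follows the upstream PyTorch semantics. A minimal sketch is shown below, assuming default poptorch.Options; the module and tensor values are illustrative only.

    import torch
    import poptorch

    class ScatterSum(torch.nn.Module):
        def forward(self, src, index):
            out = torch.zeros(4, dtype=src.dtype)
            # Sum-reduce each element of src into out at the position given by index.
            return out.scatter_reduce(0, index, src, reduce="sum")

    model = poptorch.inferenceModel(ScatterSum())
    src = torch.arange(8, dtype=torch.float)
    index = torch.tensor([0, 1, 2, 3, 0, 1, 2, 3])
    print(model(src, index))  # tensor([4., 6., 8., 10.])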

Bug Fixes

  • Fixed clamp_max in cases where the max is large.

  • Fixed shape inference failing on PopART for argsort, GRU and norm ops.

  • Fixed shape inference for strided slices.

  • Fixed casting of groupnorm.

  • Fixed an issue where the alpha and beta arguments were flipped for torch.addmm.

  • Fixed a “not representable” error when using BCEWithLogitsLoss with a dtype of half.

  • Fixed intermittent compilation hang caused by tqdm (progress bar).

Other improvements

  • Fixed in-place modification of slice regions.

  • Documentation typo fixes and clarifications.

  • Improved error message when encountering CPU tensors.

  • Use the IPU DispatchKey instead of the XLA DispatchKey, which means that error messages will now mention IPU rather than XLA.

Known issues

None

Compatibility changes

  • Dropped support for Python 3.6 (in order to upgrade to PyTorch 1.13).

  • Removed support for torch.jit.trace(). For help on migration issues when using the dispatcher frontend, see the Legacy tracing frontend section in the 3.0.0 version of the PyTorch for the IPU: User Guide.

  • Removed support for building on CentOS 7.x.

  • Removed the Autocast API (this was only available when using the tracing frontend).