5.1. PopTorch

3.2.0

New features

  • Upgraded the supported PyTorch version from 1.13.0 to 1.13.1.

  • Added support for automatic fusion of scatter operations into a grouped scatter operation to improve performance.

  • Added support for batch_sampler in poptorch.DataLoader (see the sketch after this list).

  • Added support for torch.linalg norm operations:

    • torch.linalg.norm: partial support

      2-norm and nuclear norm are unsupported for matrices.

    • torch.linalg.matrix_norm: partial support

      2-norm and nuclear norm are unsupported.

    • torch.linalg.vector_norm: supported

  • Updated support for the latest PyTorch implementation of the torch.linalg.norm op.

  • Added support for torch.Tensor.index_reduce.

  • Added the poptorch.dynamic_update function.

  • Added HeteroData support in DataLoaders.

  • Added support for setting the values of poptorch.Options via an environment variable.
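
As an example of the new batch_sampler support, a minimal sketch is shown below. It assumes default poptorch.Options; the dataset and sampler are illustrative only and are not part of the release.

    import torch
    import poptorch

    # Illustrative dataset and batch sampler; any torch.utils.data sampler can be used.
    dataset = torch.utils.data.TensorDataset(torch.randn(256, 10))
    batch_sampler = torch.utils.data.BatchSampler(
        torch.utils.data.SequentialSampler(dataset), batch_size=16, drop_last=True)

    opts = poptorch.Options()

    # batch_sampler is forwarded to the underlying torch DataLoader, so it is
    # mutually exclusive with batch_size, shuffle, sampler and drop_last.
    loader = poptorch.DataLoader(opts, dataset, batch_sampler=batch_sampler)

    for (batch,) in loader:
        print(batch.shape)  # torch.Size([16, 10])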

Bug Fixes

  • Calling the loadFromFile method twice on the same poptorch.Options object now has well-defined behaviour.

  • Fixed a failure when copying the optimiser state to the host for replica-sharded variables.

  • Fixed a “cannot access the data pointer of a Tensor that doesn’t have storage” error.

  • Fixed the DataLoader rebatched size in asynchronous mode when the batch size is equal to 1.

  • Fixed the implementation of scatter_reduce to match the PyTorch implementation on the CPU.

Other improvements

  • Added torch_scatter to the compatibility table in the PopTorch documentation.

Known issues

None

Compatibility changes

None

3.1.0

New features

  • Upgraded from PyTorch 1.10 to 1.13.

  • Added support for variables being sharded across replicas.

  • poptorch.set_overlap_for_input and poptorch.set_overlap_for_output can now be applied to tuples, lists, and dicts of tensors.

  • PopTorch now catches aten::lstm directly when compiling with the dispatcher for PopART, allowing set_available_memory to work with it.

  • Added support for aten::index_fill_.int_Scalar.

  • Added support for dict inputs.

  • Added support for torch.count_nonzero.

  • Added support for the tanh approximation for GELU.

  • Added support for the torch.scatter_reduce operation (see the sketch after this list).
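
The torch.scatter_reduce support follows the upstream PyTorch semantics. A minimal sketch is shown below, assuming default poptorch.Options; the module and tensor values are illustrative only.

    import torch
    import poptorch

    class ScatterSum(torch.nn.Module):
        def forward(self, src, index):
            out = torch.zeros(4, dtype=src.dtype)
            # Sum-reduce each element of src into out at the position given by index.
            return out.scatter_reduce(0, index, src, reduce="sum")

    model = poptorch.inferenceModel(ScatterSum())
    src = torch.arange(8, dtype=torch.float)
    index = torch.tensor([0, 1, 2, 3, 0, 1, 2, 3])
    print(model(src, index))  # tensor([4., 6., 8., 10.])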

Bug Fixes

  • Fixed clamp_max in cases where the max is large.

  • Fixed shape inference failing on PopART for argsort, GRU and norm ops.

  • Fixed shape inference for strided slices.

  • Fixed casting of groupnorm.

  • Fixed an issue where the alpha and beta arguments were flipped for torch.addmm.

  • Fixed a “not representable” error when using BCEWithLogitsLoss with a dtype of half.

  • Fixed intermittent compilation hang caused by tqdm (progress bar).

Other improvements

  • Fixed in-place modification of slice regions.

  • Documentation typo fixes and clarifications.

  • Improved error message when encountering CPU tensors.

  • Use the IPU DispatchKey instead of the XLA DispatchKey, which means that error messages will now mention IPU rather than XLA.

Known issues

None

Compatibility changes

  • Dropped support for Python 3.6 (in order to upgrade to PyTorch 1.13).

  • Removed support for torch.jit.trace(). For help on migration issues when using the dispatcher frontend, see the Legacy tracing frontend section in the 3.0.0 version of the PyTorch for the IPU: User Guide.

  • Removed support for building on CentOS 7.x.

  • Removed the Autocast API (this was only available when using the tracing frontend).