5.1.8. Poplar libraries (PopLibs) changelog

2.6.0+5997

New features

  • Added support for 8-bit floating point numbers in casts, convolutions, matrix multiplications and data ops such as dynamic slice and transpose

  • Extended the convolution planner to support any number of serial splits

  • Added a stable version of the top-k op
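
As an illustration of the last item, assuming "stable" carries its usual sorting meaning (elements with equal values keep their original relative order), the property can be sketched in plain Python. This is not the PopLibs API; `stable_top_k` is a hypothetical helper written only for this example.

```python
def stable_top_k(values, k):
    """Return the k largest entries as (index, value) pairs.

    Hypothetical helper for illustration only (not a PopLibs function).
    Entries with equal values are returned in order of increasing index,
    which is the "stable" property assumed here.
    """
    indexed = list(enumerate(values))
    # Sort by descending value; break ties explicitly by original index.
    ranked = sorted(indexed, key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


if __name__ == "__main__":
    # Three entries share the maximum value 3; a stable top-2 keeps the
    # two that appear first in the input.
    print(stable_top_k([3, 1, 3, 2, 3], 2))
```

With an unstable top-k, any two of the three tied entries could be returned, so results that depend on the returned indices may differ between runs or implementations.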

Bug fixes

  • Fixed a bug in LSTMs where the final cell had an incorrect result when using variable time steps

  • Fixed an incorrect compile-time error when using literals in a map expression

  • Fixed a bug where strided reductions didn’t account for input partials offsets when merging reductions for a single vertex

  • Fixed a bug that prevented upsampling in half precision

Other improvements

  • Added an option to the histogram op to specify output tensor type

  • Improved documentation of the triangular solve ops

  • Extended the range supported by the nx1 convolution kernel

  • Improved the convolution library to elide some copies when the expandDims transform is used

  • Ported more kernels to be of type MultiVertex

  • Added new methods for CTC validation that don’t use beam search

  • Extended the embedding layer planner to serialise embeddings when the temporary memory requirements would not fit within the tile memory

  • Added a method to dump the C++ kernel generated by a map expression

  • Added output allocator methods for elementwise and convolution ops

  • Improved the error messages generated when convolution validation fails

  • Added an option to the convolutions to disable stochastic rounding

  • Improved the PVTI output for convolutions by attaching op-specific metadata to the event

  • Prevented the user from being able to lose precision with the partial type when using the block sparse library

2.5.1

New features

  • Added support for the ROIAlign layer

  • Added support for a stable sort using the new bitonic sort algorithm

  • Extended embedding layer to support groups
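
A bitonic sorting network is not stable by itself, so a stable sort built on one needs some tie-breaking rule. One common approach, sketched below in plain Python, is to compare (key, original index) pairs so that no two elements ever compare equal. This is an assumption made for illustration and is not claimed to be how PopLibs implements it; `bitonic_sort_stable` is a hypothetical name.

```python
def bitonic_sort_stable(records, key=lambda r: r):
    """Sort records in ascending key order using a bitonic network.

    Illustration only (not PopLibs code). Stability comes from comparing
    (key, original index) pairs, so equal keys keep their input order.
    """
    n = len(records)
    # Bitonic networks need a power-of-two length; round up.
    size = 1
    while size < n:
        size *= 2
    # Tag 0 marks real items, tag 1 marks padding that sorts last.
    items = [(0, key(r), i) for i, r in enumerate(records)]
    items += [(1, 0, j) for j in range(n, size)]

    k = 2
    while k <= size:          # length of the bitonic sequences being merged
        j = k // 2
        while j >= 1:         # distance between compared elements
            for i in range(size):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # All items are distinct, so this swaps exactly when
                    # the pair is out of order for the wanted direction.
                    if (items[i] > items[partner]) == ascending:
                        items[i], items[partner] = items[partner], items[i]
            j //= 2
        k *= 2

    # Padding has sorted to the end; keep the real items only.
    return [records[i] for tag, _, i in items if tag == 0]


if __name__ == "__main__":
    data = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]
    print(bitonic_sort_stable(data, key=lambda r: r[0]))
```

The index tie-break costs extra comparison work per element, but it keeps the fixed, data-independent compare-and-swap pattern that makes bitonic networks attractive on parallel hardware.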

Bug fixes

  • Fixed a segfault that could happen for reductions

  • Fixed incorrect documentation of the return type of the random functions

  • Fixed incorrect documentation for building the third-party dependencies in the README

  • Fixed an issue in the CTC planner where it used the wrong memory estimate for the reduction

  • Added DebugContext to the fill operation

Other improvements

  • Optimised the scaled add codelets to utilise interleaved memory

  • Improved support for parallelising a transpose across workers

  • Prevented the partials type from being smaller than the output type in all layers

  • Attached user source location to PopLibs exceptions

  • Optimised the ERF layer

  • Added int32 support to the power elementwise operation

  • Improved MultiSlice when given a single offset

  • Added a default memory proportion to the embedding planner