5.9. PopLibs

3.1.0

New features

Note

The two FP8 datatypes are F143 and F152, where the three digits give the number of bits used to represent the sign, exponent and mantissa respectively.
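
To make the naming concrete, here is a minimal sketch that splits one byte into its three fields for each format. It assumes the fields are packed most-significant bit first in the order the names list them (sign, exponent, mantissa). The function names are hypothetical and are not part of PopLibs. The sketch only extracts bit fields and does not decode a numeric value, which would also depend on the scale held in the tensor metadata.

```python
def split_fp8_fields(byte, exponent_bits, mantissa_bits):
    """Split an 8-bit pattern into (sign, exponent, mantissa) bit fields.

    Assumes the layout [sign | exponent | mantissa], most significant bit first.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    if 1 + exponent_bits + mantissa_bits != 8:
        raise ValueError("sign + exponent + mantissa must total 8 bits")
    mantissa = byte & ((1 << mantissa_bits) - 1)
    exponent = (byte >> mantissa_bits) & ((1 << exponent_bits) - 1)
    sign = byte >> 7
    return sign, exponent, mantissa


def split_f143(byte):
    # F143: 1 sign bit, 4 exponent bits, 3 mantissa bits
    return split_fp8_fields(byte, exponent_bits=4, mantissa_bits=3)


def split_f152(byte):
    # F152: 1 sign bit, 5 exponent bits, 2 mantissa bits
    return split_fp8_fields(byte, exponent_bits=5, mantissa_bits=2)
```

The same byte therefore yields different exponent and mantissa fields depending on which of the two formats it is read as.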

  • poplin:

    • Added an option to use an experimental "expand dims" pre-convolution transformation, which can avoid rearranging inputs in some cases (off by default).

    • Added convolution support for F143 and F152 FP8 inputs and weights.

    • Implemented QR Factorization. This functionality is experimental.

  • popops:

    • Added support for multiple outputs in map expressions.

    • Added the setMetadataTensor function, which combines an integer scale with a constant metadata format for use with F143 and F152 FP8 tensors.

    • Added a new element-wise unary operation, exp2, which computes 2 raised to the power of each element.

  • popsparse:

    • Added support for sparse-dense and dense-sparse matrix multiplication where the sparsity structure of the sparse operand does not change. Only square block sizes of 1, 4, 8 and 16 are supported.
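
The static block-sparse case in the popsparse item above can be pictured with a small reference model. This is a plain-Python sketch and not the popsparse API: the function name, the dictionary representation of the blocks and the argument names are all assumptions made for illustration. The set of dictionary keys plays the role of the sparsity structure that does not change, while the values inside each block are free to vary between calls.

```python
SUPPORTED_BLOCK_SIZES = (1, 4, 8, 16)  # square block sizes named in the release note


def block_sparse_times_dense(blocks, block_size, num_block_rows, dense):
    """Multiply a block-sparse matrix by a dense matrix (reference model only).

    blocks maps (block_row, block_col) to a block_size x block_size list of
    lists. Block positions that are absent from the mapping are treated as zero.
    dense is a list of rows, indexed by the sparse matrix's column index.
    """
    if block_size not in SUPPORTED_BLOCK_SIZES:
        raise ValueError(f"unsupported block size: {block_size}")
    num_cols = len(dense[0])
    result = [[0.0] * num_cols for _ in range(num_block_rows * block_size)]
    for (block_row, block_col), block in blocks.items():
        for i in range(block_size):
            out_row = result[block_row * block_size + i]
            for k in range(block_size):
                value = block[i][k]
                dense_row = dense[block_col * block_size + k]
                for j in range(num_cols):
                    out_row[j] += value * dense_row[j]
    return result
```

Only the non-zero blocks are visited, which is where the saving over a dense multiply comes from. Because the block positions are known in advance, an implementation can plan the work distribution once and reuse it.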

Bug Fixes

  • poplin:

    • Minor optimisation to avoid some expensive target copies, giving a small improvement in release builds and a large improvement in debug builds.

    • Corrected the quarter-input convolution vertex.

  • popops:

    • Fixed the assembler version of the FP8 popops::cast operation, which gave incorrect results in previous releases.

Other improvements

  • poplin:

    • Improved the performance of the triangular solve algorithm.

    • Improved addConstant performance by 10x (20x when broadcasting) when adding large constants. Some models saw a 20-40% reduction in compilation time. Also added stricter value checking to catch unintentional truncation of initial values.

  • popops:

    • Added simple usage documentation for dynamic slice & update.

Known issues

None

Compatibility changes

None

3.0.0

New features

  • poplin:

    • Add new options to the single_conv_layer tool to measure the accuracy of the planner by comparing the planner's estimates with the profiled output.

    • Improve common error messages to identify whether the error came from a convolution or a matrix multiply.

  • popops:

    • Add support to cast between FP8 and FP32.

    • Optimise gather, scatter and other operations which have padding indices.

Bug Fixes

  • popnn:

    • Fixed an incorrect arithmetic expression in the popnn calculation of estimated cycles for nonlinearities.

  • popops:

    • Fixed a bug in popops::multiUpdateMax which returned the wrong value.

  • poprand:

    • Fixed poprand to ensure random numbers are generated within the requested range.
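
For readers unfamiliar with the operation named in the popops fix above, the following is a rough reference model of a "multi update max". It assumes the operation takes the element-wise maximum of each addressed row and its update, as the name suggests. It is plain Python rather than PopLibs code, and the function and argument names are hypothetical.

```python
def multi_update_max(target, updates, indices):
    """Apply target[indices[i]] = max(target[indices[i]], updates[i]) element-wise.

    target is a list of rows and is modified in place. Updates are applied in
    order, so an index that appears more than once accumulates the maximum
    over all of its updates.
    """
    if len(updates) != len(indices):
        raise ValueError("updates and indices must have the same length")
    for row_index, update_row in zip(indices, updates):
        row = target[row_index]
        for j, value in enumerate(update_row):
            if value > row[j]:
                row[j] = value
    return target
```

A model like this can serve as a host-side check: run the same indices and updates through both paths and compare the results.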

Other improvements

  • Rename movz to movnz. Because of an early bug in the ISA, the movnz instruction was incorrectly named movz. The name movz is now deprecated.

  • Poplar vertices can now return void instead of bool. Multiple PopLibs vertices have been updated to return void.

  • poplin:

    • Improve documentation for layerNormStatistics.

    • Add better error reporting of valid types for convolution and matmul calls.

  • popops:

    • Balance work across worker threads for scatters to improve performance.

    • Added an assembler implementation of the casts between FP8 and FP16 that improves performance by a factor of 36.

Known issues

None

Compatibility changes

None