5.9. PopLibs
3.1.0
New features
Note
The two FP8 datatypes are F143 and F152 where the three digits indicate the number of bits used to represent the sign, exponent and mantissa respectively.
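As an illustration of the naming scheme in the note above, the following Python sketch splits an 8-bit pattern into sign, exponent and mantissa fields for either layout. The helper name decode_fp8 is hypothetical and not part of PopLibs. The exponent bias defaults to an IEEE-style value and no bit pattern is treated as infinity or NaN; both are assumptions, since these notes do not specify the bias or the special values used by the hardware.

```python
def decode_fp8(byte, exp_bits, man_bits, bias=None):
    """Decode an 8-bit pattern laid out as sign | exponent | mantissa.

    Hypothetical helper for illustration only. The default bias is an
    assumed IEEE-style value; pass a different bias if the target uses one.
    Special values (infinity, NaN) are deliberately not modelled.
    """
    if 1 + exp_bits + man_bits != 8:
        raise ValueError("sign + exponent + mantissa bits must total 8")
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    if bias is None:
        bias = (1 << (exp_bits - 1)) - 1  # assumption: IEEE-style bias

    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exponent = (byte >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = byte & ((1 << man_bits) - 1)
    fraction = mantissa / (1 << man_bits)

    if exponent == 0:
        # Subnormal range: no implicit leading one.
        magnitude = fraction * 2.0 ** (1 - bias)
    else:
        magnitude = (1.0 + fraction) * 2.0 ** (exponent - bias)
    return sign * magnitude


# F143: 1 sign bit, 4 exponent bits, 3 mantissa bits.
f143_value = decode_fp8(0xBC, exp_bits=4, man_bits=3)
# F152: 1 sign bit, 5 exponent bits, 2 mantissa bits.
f152_value = decode_fp8(0x3C, exp_bits=5, man_bits=2)
```

The same byte generally decodes to different values under the two layouts, which is why an FP8 tensor needs its format recorded alongside the data.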
poplin:
Added an option to use the experimental expand-dims pre-convolution transformation, which can avoid rearranging inputs in some cases (off by default).
Added convolution support for F143 and F152 FP8 inputs and weights.
Implemented QR Factorization. This functionality is experimental.
popops:
Added support for multiple outputs in map expressions.
Added setMetadataTensor function to combine an integer scale with a constant metadata format for use with F143 and F152 FP8 tensors.
Added new element-wise unary operation exp2 (power of 2).
popsparse:
Added support for sparse-dense and dense-sparse matrix multiplication where the sparsity structure of the sparse operand does not change. Only square block sizes of 1, 4, 8 and 16 are supported.
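To illustrate what a fixed sparsity structure with square blocks means, here is a minimal pure-Python sketch of a block-sparse by dense product. It is not the popsparse API: the function name, the dictionary-of-blocks layout and the argument names are all assumptions made for this example. The set of block positions is fixed up front, while the values inside the blocks are free to change.

```python
SUPPORTED_BLOCK_SIZES = (1, 4, 8, 16)  # square block sizes listed in the notes


def block_sparse_dense_matmul(blocks, block_size, num_rows, dense):
    """Multiply a block-sparse matrix by a dense matrix.

    Hypothetical layout: `blocks` maps (block_row, block_col) to a
    block_size x block_size list of lists. Block positions absent from
    the mapping are treated as all zeros. `dense` is a list of rows.
    """
    if block_size not in SUPPORTED_BLOCK_SIZES:
        raise ValueError("unsupported block size: %d" % block_size)

    num_out_cols = len(dense[0])
    out = [[0.0] * num_out_cols for _ in range(num_rows)]

    # Only the stored blocks contribute to the result.
    for (block_row, block_col), block in blocks.items():
        for i in range(block_size):
            out_row = out[block_row * block_size + i]
            for k in range(block_size):
                a = block[i][k]
                dense_row = dense[block_col * block_size + k]
                for j in range(num_out_cols):
                    out_row[j] += a * dense_row[j]
    return out


# A 2x3 sparse matrix with 1x1 blocks at positions (0, 0) and (1, 2).
sparse_blocks = {(0, 0): [[2.0]], (1, 2): [[3.0]]}
dense_rhs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
result = block_sparse_dense_matmul(sparse_blocks, 1, 2, dense_rhs)
```

Because the keys of the mapping never change, the work can be planned once and reused whenever the block values are updated.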
Bug Fixes
poplin:
Minor optimisation to avoid some expensive target copies, giving a small improvement in release builds and a large improvement in debug builds.
Correction to quarter input convolution vertex.
popops:
Fixed the assembler version of the FP8 popops::cast operation, which gave incorrect results in previous releases.
Other improvements
poplin:
Improved performance of the triangular solve algorithm.
Improved addConstant performance by 10x (and 20x when broadcasting) when adding large constants (some models saw a 20-40% compilation time reduction), and added stricter value checking to catch unintentional truncation of initial values.
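The kind of value check described above can be pictured with a short Python sketch. It only illustrates rejecting values that would be silently truncated when stored in an integer type; check_fits is a hypothetical helper, and the actual rules applied by addConstant are not described in these notes.

```python
def check_fits(value, bits, signed):
    """Return `value` as an int if an integer type of the given width can
    hold it exactly; otherwise raise ValueError.

    Hypothetical helper: a sketch of strict value checking, not the
    checks performed by the library.
    """
    # Reject fractional, infinite and NaN inputs instead of rounding them.
    if isinstance(value, float) and not value.is_integer():
        raise ValueError("%r has no exact integer representation" % value)

    if signed:
        lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lowest, highest = 0, (1 << bits) - 1

    as_int = int(value)
    if not lowest <= as_int <= highest:
        raise ValueError(
            "%r is outside the range [%d, %d]" % (value, lowest, highest)
        )
    return as_int


largest_unsigned_byte = check_fits(255, bits=8, signed=False)
```

Without such a check, an out-of-range or fractional initial value would typically be wrapped or rounded with no indication that the stored constant differs from the one requested.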
popops:
Added simple usage documentation for dynamic slice & update.
Known issues
None
Compatibility changes
None
3.0.0
New features
poplin:
Added new options to measure the accuracy of the planner by comparing the planner to the profiled output in the single_conv_layer tool.
Improved common error messages to identify whether the error was for a convolution or a matrix multiply.
popops:
Added support for casting between FP8 and FP32.
Optimised gather, scatter and other operations that have padding indices.
Bug Fixes
popnn:
Fixed an incorrect arithmetic expression in the popnn calculation of estimated cycles for nonlinearities.
popops:
Fixed a bug in popops::multiUpdateMax which returned the wrong value.
poprand:
Fixed Poplar rand to ensure that random numbers are generated within the requested range.
Other improvements
Renamed movz to movnz. Due to an initial bug in the ISA, movnz was incorrectly named as movz. movz is now deprecated.
Poplar vertices can now return void instead of bool. Multiple PopLibs vertices have been updated to return void.
poplin:
Improved documentation for layerNormStatistics.
Added better error reporting of valid types for convolution and matmul calls.
popops:
Balanced work across worker threads for scatters to improve performance.
Added an assembler implementation of the casts between FP8 and FP16 that improves performance by a factor of 36.
Known issues
None
Compatibility changes
None