5.1.9. Graphcore Communication Library (GCL) changelog
2.6.0+5997
New features
Started supporting orthogonal groups using store and forward when GCL_OPTION syncful.useStoreAndForward is set to true
Added toPopopsOperation() in Collective API
Added the possibility to create GraphViz .gv files from crossReplicaCopy() maps when the GCL_CROSS_REPLICA_COPY_GRAPH_PATH environment variable is set, to visualize cross-replica communication patterns
Collective Balanced Reorder API: Added setter for gatheredToRefSlices
Collective Balanced Reorder API: Added a size safe API
Implemented and enabled multi-phase ReduceScatter
The broadcast method is now used for small tensors and small group sizes when the collective method is auto
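As a rough illustration of how the option and environment variable named in the list above might be set before launching a program, here is a minimal Python sketch. It rests on several assumptions that this changelog does not confirm: that syncful.useStoreAndForward is passed through the GCL_OPTIONS environment variable, that GCL_OPTIONS takes a JSON object of option names to string values, and that GCL_CROSS_REPLICA_COPY_GRAPH_PATH takes a filesystem path. The helper build_gcl_env and the path /tmp/gcl_graphs are hypothetical names used only for this example.

```python
import json
import os


def build_gcl_env(graph_path, use_store_and_forward=True, base_env=None):
    """Return a copy of an environment mapping with GCL variables set.

    Hypothetical helper for illustration only; it is not part of GCL.
    """
    env = dict(os.environ if base_env is None else base_env)

    # Assumption: GCL_OPTIONS is a JSON object mapping option names to
    # string values. Check the GCL documentation for the exact format.
    options = {
        "syncful.useStoreAndForward": "true" if use_store_and_forward else "false"
    }
    env["GCL_OPTIONS"] = json.dumps(options)

    # Assumption: the variable takes a filesystem path for the .gv output.
    env["GCL_CROSS_REPLICA_COPY_GRAPH_PATH"] = graph_path
    return env


# Build the environment from an empty base so the result is predictable.
env = build_gcl_env("/tmp/gcl_graphs", base_env={})
```

The returned mapping could then be passed as the env argument of subprocess.run when starting the application, which keeps the settings out of the parent process's environment.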
Bug fixes
Fixed readTileMemory
Fixed CommGroup::size() logging for orthogonal groups
Verify optionals after unidirectional ring calls
Fixed unchecked optional accesses
Avoided exceptions in fromPublic()
Set proper initial value for logical op in TestIO.cpp
Other improvements
Started using multislicing in broadcast AllReduce when group size is greater than 2
Parameterized grain sizes for bidirectional reduceScatter
Added ReduceScatter/AllGather information to DebugNameAndInfo structure
Adjusted ringAllReduce grain size based on which collective method is used
Improved error handling of floating point conversions
Multiple improvements to the documentation
Added dispatcher logging for multi phase collectives
Performance improvements to meetInMiddle in CollectivesProgram
2.5.0
New features
Extended GCL group API to include interleaved groups
Added a broadcast/oneToAll collective
Added handling for GCL_OPTIONS environment variable
Added support for many tensor multi phase reductions
Several latency improvements for GW-Links traffic
Bug fixes
Fixed grain size used in Collective Balanced Reorder API for multi phase AllReduce
Fixed SQUARE_ADD operation for multi phase AllReduce
Fixed uneven use of GW-Links on IPU-POD128 systems
Other improvements
Added syncful.useOptimisedLayout GCL option
Multiple improvements to GCL’s memory footprint
Added support for n-phased cycle counts
Parallelised host side result validation
Relaxed mapping requirements for non-replicated collectives
Exposed concatChunks in the Collectives API
Added guards preventing modifications of input tensor
Added a GCL code example to the Poplar and PopLibs User Guide