Poplar and PopLibs
Helper class to reorder a tensor in a per-tile-balanced fashion such that each replica obtains (for inputs to AllGather or outputs of ReduceScatter) an equally sized 1D tensor with equally sized regions.
#include <CollectiveBalancedReorder.hpp>
Public Member Functions
CollectiveBalancedReorder (poplar::Graph &graph_, poplar::Tensor tensor_, unsigned replicationFactor_, const poplar::DebugNameAndId &dnai_, bool allowElementMap=false, unsigned grainSize=1)
Constructor.
poplar::Tensor createReplicaSlice (const poplar::Type &type)
Create a tensor mapped efficiently over the same tiles as the reference tensor, sized as the result of the reduce scatter and the input of the all gather.
poplar::Tensor createCollectivesTensor (const poplar::Type &type, const std::string &debugPrefix)
Create a tensor mapped efficiently over the same tiles as the reference tensor, sized as the input of the reduce scatter and the result of the all gather.
poplar::Tensor undoRearrangeForCollective (const poplar::Tensor &tensor) const
Reorder tensor back into the expected IR tensor shape and order.
std::vector< std::size_t > getReferenceShape () const
Get the shape of the reference tensor.
const CollectiveBalancedHostRearrangement & getHostRearrangement () const
Get a helper class that allows the rearrangement to be applied on the host.
void zeroPaddingInCollectiveTensor (poplar::Tensor &collectiveTensor, poplar::program::Sequence &prog) const
Zero the padding of a collective-friendly tensor.
Helper class to reorder a tensor in a per-tile-balanced fashion such that each replica obtains (for inputs to AllGather or outputs of ReduceScatter) an equally sized 1D tensor with equally sized regions.
This helper class reduces the memory used by the syncful collective. The reordering process:
gcl::CollectiveBalancedReorder::CollectiveBalancedReorder (poplar::Graph &graph_, poplar::Tensor tensor_, unsigned replicationFactor_, const poplar::DebugNameAndId &dnai_, bool allowElementMap = false, unsigned grainSize = 1)
Constructor.
graph_ | The poplar graph. |
tensor_ | The reference tensor to rearrange. |
replicationFactor_ | The replication factor of the graph. |
dnai_ | Debug name and id. |
allowElementMap | Allow an alternative representation of the host rearrangements. Sometimes it is beneficial to collapse all intervals into a simple 1-to-1 element map. This flag should be set to true in all new code, and is to be deprecated once all frameworks implement serialisation of the newly added elementMap field. |
grainSize | The grain size to use when padding the tensor. |
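The interaction between replicationFactor_ and grainSize can be pictured with a small host-side calculation. The following is only a sketch under an assumption this page does not confirm: that padding rounds a region up so it splits into replicationFactor_ equal parts, each a multiple of grainSize. The function name paddedSize is hypothetical and is not part of GCL.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a GCL function): round `elements` up so that it
// divides into `replicationFactor` equal slices, each a multiple of
// `grainSize`. The real padding rule used by GCL may differ.
std::size_t paddedSize(std::size_t elements, unsigned replicationFactor,
                       unsigned grainSize) {
  assert(replicationFactor > 0 && grainSize > 0);
  const std::size_t quantum =
      static_cast<std::size_t>(replicationFactor) * grainSize;
  // Integer round-up to the next multiple of `quantum`.
  return (elements + quantum - 1) / quantum * quantum;
}
```

Under this assumption, 10 elements with a replication factor of 4 and a grain size of 2 would be padded to 16, giving each replica 4 elements.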
poplar::Tensor gcl::CollectiveBalancedReorder::createCollectivesTensor (const poplar::Type &type, const std::string &debugPrefix)
Create a tensor mapped efficiently over the same tiles as the reference tensor.
The returned tensor has the size of the input of the reduce scatter and of the result of the all gather.
type | The type to use when creating the tensor. |
debugPrefix | The debug prefix. |
poplar::Tensor gcl::CollectiveBalancedReorder::createReplicaSlice (const poplar::Type &type)
Create a tensor mapped efficiently over the same tiles as the reference tensor.
The returned tensor has the size of the result of the reduce scatter and of the input of the all gather.
type | The type to use when creating the tensor. |
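The size relationship between the tensors returned by createCollectivesTensor and createReplicaSlice can be modelled on the host. This is a minimal sketch, assuming each tile's region is padded independently so that it divides evenly across replicas. The struct CollectiveSizes and the function modelSizes are hypothetical names for illustration, not GCL API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model (not GCL API) of the two tensor sizes.
struct CollectiveSizes {
  std::size_t collective;   // reduce-scatter input / all-gather result
  std::size_t replicaSlice; // reduce-scatter result / all-gather input
};

// `elementsPerTile[t]` is the number of reference-tensor elements on tile t.
// Each tile contributes ceil(n / replicationFactor) elements to every
// replica's slice, so all replicas receive equally sized regions per tile.
CollectiveSizes modelSizes(const std::vector<std::size_t> &elementsPerTile,
                           unsigned replicationFactor) {
  assert(replicationFactor > 0);
  std::size_t slice = 0;
  for (std::size_t n : elementsPerTile) {
    slice += (n + replicationFactor - 1) / replicationFactor;
  }
  return {slice * replicationFactor, slice};
}
```

In this model the collective tensor is always exactly replicationFactor times the size of one replica slice, which is the property a reduce scatter or all gather requires.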
const CollectiveBalancedHostRearrangement & gcl::CollectiveBalancedReorder::getHostRearrangement () const
inline
Get a helper class that allows the rearrangement to be applied on the host.
std::vector< std::size_t > gcl::CollectiveBalancedReorder::getReferenceShape () const
inline
Get the shape of the reference tensor.
poplar::Tensor gcl::CollectiveBalancedReorder::undoRearrangeForCollective (const poplar::Tensor &tensor) const
Reorder tensor back into the expected IR tensor shape and order.
tensor | The tensor to rearrange. |
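What undoing a rearrangement means can be shown with a host-side model built on a 1-to-1 element map, the representation mentioned for allowElementMap. This is a sketch under assumptions: the map layout and the function names rearrange and undoRearrange are hypothetical and do not reflect how CollectiveBalancedHostRearrangement stores or applies its data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: elementMap[i] is the position in the padded,
// collective-friendly buffer that holds reference element i.
std::vector<float> rearrange(const std::vector<float> &reference,
                             const std::vector<std::size_t> &elementMap,
                             std::size_t paddedSize) {
  assert(elementMap.size() == reference.size());
  std::vector<float> out(paddedSize, 0.0f); // padding starts as zero
  for (std::size_t i = 0; i < reference.size(); ++i) {
    assert(elementMap[i] < paddedSize);
    out[elementMap[i]] = reference[i];
  }
  return out;
}

// Inverse operation: gather the data elements back into reference order,
// dropping the padding.
std::vector<float> undoRearrange(const std::vector<float> &rearranged,
                                 const std::vector<std::size_t> &elementMap) {
  std::vector<float> out(elementMap.size());
  for (std::size_t i = 0; i < elementMap.size(); ++i) {
    assert(elementMap[i] < rearranged.size());
    out[i] = rearranged[elementMap[i]];
  }
  return out;
}
```

Applying undoRearrange to the output of rearrange with the same map returns the original data, provided the map contains no duplicate positions.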
void gcl::CollectiveBalancedReorder::zeroPaddingInCollectiveTensor (poplar::Tensor &collectiveTensor, poplar::program::Sequence &prog) const
Zero the padding of a collective-friendly tensor.
collectiveTensor | The collective tensor to zero the padding of. |
prog | The sequence to add the zeroing program to. |
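The effect of zeroing the padding can be illustrated on the host. The sketch below assumes a hypothetical element map listing which positions of the padded buffer hold real data, and clears every other position. The function zeroPadding is an illustration only; the GCL method instead adds a program to the given sequence that performs the zeroing on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model (not GCL API): `dataPositions` lists the
// indices of `buffer` that hold real tensor elements. Every other index is
// treated as padding and set to zero.
void zeroPadding(std::vector<float> &buffer,
                 const std::vector<std::size_t> &dataPositions) {
  std::vector<bool> isData(buffer.size(), false);
  for (std::size_t idx : dataPositions) {
    assert(idx < buffer.size());
    isData[idx] = true;
  }
  for (std::size_t i = 0; i < buffer.size(); ++i) {
    if (!isData[i]) {
      buffer[i] = 0.0f;
    }
  }
}
```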