Poplar and PopLibs
This class contains functions and data necessary to rearrange tensors on the host side at runtime.
#include <CollectiveBalancedReorder.hpp>
Public Member Functions | |
CollectiveBalancedHostRearrangement (const CollectiveBalancedHostRearrangement &)=default | |
Defaulted to avoid warnings in deprecation period. | |
CollectiveBalancedHostRearrangement (CollectiveBalancedHostRearrangement &&) noexcept=default | |
Defaulted to avoid warnings in deprecation period. | |
CollectiveBalancedHostRearrangement & | operator= (const CollectiveBalancedHostRearrangement &)=default |
Defaulted to avoid warnings in deprecation period. | |
CollectiveBalancedHostRearrangement & | operator= (CollectiveBalancedHostRearrangement &&) noexcept=default |
Defaulted to avoid warnings in deprecation period. | |
void | rearrangeForCollective (const void *in, void *out, int64_t elemByteSize) const |
Balanced reorder the tensor in a collective-friendly manner (host-side). | |
void | rearrangeForCollective (const void *in, std::size_t inSize, void *out, std::size_t outSize, std::size_t elemByteSize) const |
Balanced reorder the tensor in a collective-friendly manner (host-side). | |
template<typename T > | |
void | rearrangeForCollective (const std::vector< T > &in, std::vector< T > &out, std::size_t elemByteSize=sizeof(T)) const |
Balanced reorder the tensor in a collective-friendly manner (host-side). | |
void | undoRearrangeForCollective (const void *in, void *out, int64_t elemByteSize) const |
Reorder tensor back into the expected IR tensor shape and order (host-side). | |
template<typename T > | |
void | undoRearrangeForCollective (const std::vector< T > &in, std::vector< T > &out, std::size_t elemByteSize=sizeof(T)) const |
Reorder tensor back into the expected IR tensor shape and order (host-side). | |
void | undoRearrangeForCollective (const void *in, std::size_t inSize, void *out, std::size_t outSize, std::size_t elemByteSize) const |
Reorder tensor back into the expected IR tensor shape and order (host-side). | |
size_t | getNumRearrangedTensorElems () const |
Number of elements in the collective balanced (reordered) tensor. | |
void | rearrange (const void *in, void *out, int64_t elemByteSize, bool refToGathered) const |
Host tensor rearrangement routine. | |
unsigned | getReplicationFactor () const |
The graph's replication factor. | |
void | setReplicationFactor (unsigned replicationFactor) |
Set the graph's replication factor. | |
std::size_t | getTotalElementsPerReplica () const |
The total number of elements in one replica's fragment. | |
void | setTotalElementsPerReplica (std::size_t totalElementsPerReplica) |
Set the total number of elements in one replica's fragment. | |
const std::vector< poplar::Interval > & | getGatheredToRefSlices () const |
The mapping from the gathered tensor back to the reference tensor. | |
void | setGatheredToRefSlices (std::vector< poplar::Interval > slices) |
Set the mapping from the gathered tensor back to the reference tensor. | |
const std::vector< uint32_t > & | getElementMap () const |
Simple indices map for mapping individual elements one by one. | |
Public Attributes | |
unsigned | replicationFactor = 0 |
The graph's replication factor. | |
std::size_t | totalElementsPerReplica = 0 |
The total number of elements in one replica's fragment. | |
std::vector< poplar::Interval > | gatheredToRefSlices |
The mapping from the gathered tensor back to the reference tensor. | |
std::vector< uint32_t > | elementMap |
Simple indices map for mapping individual elements one by one. | |
Friends | |
class | CollectiveBalancedReorder |
Lets CollectiveBalancedReorder (CBR) update this object.
This class contains functions and data necessary to rearrange tensors on the host side at runtime.
The separation is made so that the state can be serialized and restored without having to create a poplar::Graph.
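To illustrate why keeping this state free of any poplar::Graph dependency makes it easy to persist, here is a minimal, self-contained sketch. It does not use GCL: HostState, save and load are hypothetical names, the fields only mirror the public attributes listed on this page, and a pair of indices stands in for poplar::Interval. The whitespace-separated text format is an assumption made for the example, not the format GCL or its clients use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical stand-in for the host rearrangement state; not the GCL class.
struct HostState {
  unsigned replicationFactor = 0;
  std::size_t totalElementsPerReplica = 0;
  // Stand-in for std::vector<poplar::Interval>: half-open [begin, end) pairs.
  std::vector<std::pair<std::size_t, std::size_t>> gatheredToRefSlices;
  std::vector<std::uint32_t> elementMap;
};

// Write the state as whitespace-separated integers.
void save(const HostState &s, std::ostream &os) {
  os << s.replicationFactor << ' ' << s.totalElementsPerReplica << ' ';
  os << s.gatheredToRefSlices.size() << ' ';
  for (const auto &iv : s.gatheredToRefSlices)
    os << iv.first << ' ' << iv.second << ' ';
  os << s.elementMap.size() << ' ';
  for (std::uint32_t e : s.elementMap)
    os << e << ' ';
}

// Read the state back in the same order it was written.
HostState load(std::istream &is) {
  HostState s;
  std::size_t n = 0;
  is >> s.replicationFactor >> s.totalElementsPerReplica >> n;
  s.gatheredToRefSlices.resize(n);
  for (auto &iv : s.gatheredToRefSlices)
    is >> iv.first >> iv.second;
  is >> n;
  s.elementMap.resize(n);
  for (auto &e : s.elementMap)
    is >> e;
  return s;
}
```

Because every field is a plain integer or a vector of integers, the round trip needs nothing beyond the standard library.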
const std::vector< uint32_t > & gcl::CollectiveBalancedHostRearrangement::getElementMap | ( | ) | const |
Simple indices map for mapping individual elements one by one.
It is used instead of gatheredToRefSlices for short intervals.
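The relationship between an interval list and a per-element index map can be sketched as follows. This is an illustration only: Range is a hypothetical stand-in for poplar::Interval, expandToElementMap is not a GCL function, and the page does not say how GCL decides when intervals count as short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for poplar::Interval: a half-open range [begin, end).
using Range = std::pair<std::size_t, std::size_t>;

// Flatten a list of ranges into one index per element, in order.
std::vector<std::uint32_t> expandToElementMap(const std::vector<Range> &slices) {
  std::vector<std::uint32_t> map;
  for (const Range &r : slices)
    for (std::size_t i = r.first; i < r.second; ++i)
      map.push_back(static_cast<std::uint32_t>(i));
  return map;
}
```

For example, the ranges [4, 6) and [0, 2) expand to the indices 4, 5, 0, 1. Both forms describe the same mapping; the flat form stores one index per element instead of one record per interval.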
const std::vector< poplar::Interval > & gcl::CollectiveBalancedHostRearrangement::getGatheredToRefSlices | ( | ) | const |
The mapping from the gathered tensor back to the reference tensor.
size_t gcl::CollectiveBalancedHostRearrangement::getNumRearrangedTensorElems | ( | ) | const |
Number of elements in the collective balanced (reordered) tensor.
unsigned gcl::CollectiveBalancedHostRearrangement::getReplicationFactor | ( | ) | const |
The graph's replication factor.
std::size_t gcl::CollectiveBalancedHostRearrangement::getTotalElementsPerReplica | ( | ) | const |
The total number of elements in one replica's fragment.
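CollectiveBalancedHostRearrangement & gcl::CollectiveBalancedHostRearrangement::operator= | ( | CollectiveBalancedHostRearrangement && | ) |
default noexcept
Defaulted to avoid warnings in deprecation period.
CollectiveBalancedHostRearrangement & gcl::CollectiveBalancedHostRearrangement::operator= | ( | const CollectiveBalancedHostRearrangement & | ) |
default
Defaulted to avoid warnings in deprecation period.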
void gcl::CollectiveBalancedHostRearrangement::rearrange | ( | const void * | in, |
void * | out, | ||
int64_t | elemByteSize, | ||
bool | refToGathered | ||
) | const |
Host tensor rearrangement routine.
in | Pointer to the input buffer. |
out | Pointer to the output buffer. |
elemByteSize | The byte size of the elements. |
refToGathered | Whether to rearrange from reference to gathered or the other way. |
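A direction flag such as refToGathered can be pictured with the following self-contained sketch, which moves elements of elemByteSize bytes through an index map. It is not the GCL implementation: rearrangeSketch is a hypothetical function, and the convention that map[g] holds the reference index stored at gathered position g is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical sketch; assumes map[g] is the reference index that ends up
// at gathered index g. The real GCL mapping convention may differ.
void rearrangeSketch(const std::vector<std::uint32_t> &map, const void *in,
                     void *out, std::int64_t elemByteSize, bool refToGathered) {
  const char *src = static_cast<const char *>(in);
  char *dst = static_cast<char *>(out);
  const std::size_t bytes = static_cast<std::size_t>(elemByteSize);
  for (std::size_t g = 0; g < map.size(); ++g) {
    const std::size_t r = map[g];
    if (refToGathered)
      std::memcpy(dst + g * bytes, src + r * bytes, bytes); // reference -> gathered
    else
      std::memcpy(dst + r * bytes, src + g * bytes, bytes); // gathered -> reference
  }
}
```

Calling the function once with the flag set and once with it cleared, on the same map, returns the data to its original order.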
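template<typename T >
void gcl::CollectiveBalancedHostRearrangement::rearrangeForCollective | ( | const std::vector< T > & | in, |
std::vector< T > & | out, | ||
std::size_t | elemByteSize = sizeof(T) | ||
) | const |
inline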
Balanced reorder the tensor in a collective-friendly manner (host-side).
in | Input buffer. |
out | Output buffer. |
elemByteSize | The byte size of the elements. |
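One plausible shape for a typed convenience overload like this is a thin wrapper that forwards the vectors' storage and their sizes in bytes to a raw-pointer routine. The sketch below only illustrates that pattern and is not taken from the GCL sources: rawRoutine and typedWrapper are hypothetical names, and rawRoutine merely copies bytes so that the forwarding can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical raw-pointer routine standing in for a sized void* overload.
// It only copies bytes; a real routine would reorder them.
void rawRoutine(const void *in, std::size_t inSize, void *out,
                std::size_t outSize, std::size_t elemByteSize) {
  assert(inSize % elemByteSize == 0 && inSize <= outSize);
  std::memcpy(out, in, inSize);
}

// Typed wrapper: the element size defaults to sizeof(T) and the buffer
// sizes are passed on in bytes, not in elements.
template <typename T>
void typedWrapper(const std::vector<T> &in, std::vector<T> &out,
                  std::size_t elemByteSize = sizeof(T)) {
  rawRoutine(in.data(), in.size() * sizeof(T), out.data(),
             out.size() * sizeof(T), elemByteSize);
}
```

In this pattern the caller sizes the output vector before the call, since the wrapper never resizes it.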
void gcl::CollectiveBalancedHostRearrangement::rearrangeForCollective | ( | const void * | in, |
std::size_t | inSize, | ||
void * | out, | ||
std::size_t | outSize, | ||
std::size_t | elemByteSize | ||
) | const |
Balanced reorder the tensor in a collective-friendly manner (host-side).
in | Pointer to the input buffer. |
inSize | The size of the in buffer in bytes. |
out | Pointer to the output buffer. |
outSize | The size of the out buffer in bytes. |
elemByteSize | The byte size of the elements. |
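Since inSize and outSize are given in bytes, a caller has to convert element counts before calling. The sketch below shows that arithmetic under one loudly stated assumption: that the rearranged tensor holds replicationFactor times totalElementsPerReplica elements, which this page does not confirm. Both helpers are hypothetical and not part of GCL.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper. Assumes the rearranged tensor holds
// replicationFactor * totalElementsPerReplica elements (not stated here).
std::size_t rearrangedBytes(unsigned replicationFactor,
                            std::size_t totalElementsPerReplica,
                            std::size_t elemByteSize) {
  return static_cast<std::size_t>(replicationFactor) *
         totalElementsPerReplica * elemByteSize;
}

// Returns true when a buffer of outSize bytes can hold the rearranged tensor.
bool outBufferLargeEnough(std::size_t outSize, unsigned replicationFactor,
                          std::size_t totalElementsPerReplica,
                          std::size_t elemByteSize) {
  return outSize >= rearrangedBytes(replicationFactor,
                                    totalElementsPerReplica, elemByteSize);
}
```

With 4 replicas, 3 elements per replica and 2-byte elements, the output buffer would need at least 24 bytes under this assumption.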
void gcl::CollectiveBalancedHostRearrangement::rearrangeForCollective | ( | const void * | in, |
void * | out, | ||
int64_t | elemByteSize | ||
) | const |
Balanced reorder the tensor in a collective-friendly manner (host-side).
in | Pointer to the input buffer. |
out | Pointer to the output buffer. |
elemByteSize | The byte size of the elements. |
void gcl::CollectiveBalancedHostRearrangement::setGatheredToRefSlices | ( | std::vector< poplar::Interval > | slices | ) |
Set the mapping from the gathered tensor back to the reference tensor.
slices | The mapping from the gathered tensor back to the reference tensor. |
void gcl::CollectiveBalancedHostRearrangement::setReplicationFactor | ( | unsigned | replicationFactor | ) |
Set the graph's replication factor.
replicationFactor | The graph's replication factor. |
void gcl::CollectiveBalancedHostRearrangement::setTotalElementsPerReplica | ( | std::size_t | totalElementsPerReplica | ) |
Set the total number of elements in one replica's fragment.
totalElementsPerReplica | The total number of elements in one replica's fragment. |
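template<typename T >
void gcl::CollectiveBalancedHostRearrangement::undoRearrangeForCollective | ( | const std::vector< T > & | in, |
std::vector< T > & | out, | ||
std::size_t | elemByteSize = sizeof(T) | ||
) | const |
inline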
Reorder tensor back into the expected IR tensor shape and order (host-side).
in | Input buffer. |
out | Output buffer. |
elemByteSize | The byte size of the elements. |
void gcl::CollectiveBalancedHostRearrangement::undoRearrangeForCollective | ( | const void * | in, |
std::size_t | inSize, | ||
void * | out, | ||
std::size_t | outSize, | ||
std::size_t | elemByteSize | ||
) | const |
Reorder tensor back into the expected IR tensor shape and order (host-side).
in | Pointer to the input buffer. |
inSize | The size of the in buffer in bytes. |
out | Pointer to the output buffer. |
outSize | The size of the out buffer in bytes. |
elemByteSize | The byte size of the elements. |
void gcl::CollectiveBalancedHostRearrangement::undoRearrangeForCollective | ( | const void * | in, |
void * | out, | ||
int64_t | elemByteSize | ||
) | const |
Reorder tensor back into the expected IR tensor shape and order (host-side).
in | Pointer to the input buffer. |
out | Pointer to the output buffer. |
elemByteSize | The byte size of the elements. |
std::vector<uint32_t> gcl::CollectiveBalancedHostRearrangement::elementMap |
Simple indices map for mapping individual elements one by one.
It is used instead of gatheredToRefSlices for short intervals.