5. PopDist C++ API reference
#include <popdist/backend.hpp>
-
namespace popdist
Functions
-
void registerDefaultBackend()
Automatically registers the default backend for PopDist.
PopDist will try to automatically locate the default backend and register it. An error will be thrown if it could not be found. The default backend for PopDist is based on OpenMPI.
-
void registerBackend(const std::string &file_path_backend)
Registers the provided shared object as the backend for PopDist.
PopDist backends allow for system specific features, such as shared buffers between instances/replicas.
- Parameters
file_path_backend – The file path pointing to the .so file containing the backend.
-
void initializeBackend()
Initialize PopDist backend.
Calls the initialize() function defined by the PopDist backend. Depending on the implementation of the backend, this can trigger erroneous behavior when called multiple times (or simultaneously). We recommend using the Python API (popdist.initializeBackend()) whenever possible.
-
bool isBackendInitialized()
Checks whether the backend has been initialized.
- Returns
true if the backend is initialized and false otherwise.
-
void finalizeBackend()
Finalize PopDist backend.
Calls the finalize() function defined by the PopDist backend. Depending on the implementation of the backend, this can trigger erroneous behavior when called multiple times (or simultaneously). We recommend using the Python API (popdist.finalizeBackend()) whenever possible.
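Because repeated initialize/finalize calls can misbehave, one option is to pair them in a small RAII guard. The following is a minimal sketch, not part of PopDist: ScopedBackend and FakeBackend are hypothetical names, and FakeBackend only stands in for the library so the example compiles on its own. In a real program the three member functions would forward to popdist::isBackendInitialized(), popdist::initializeBackend() and popdist::finalizeBackend().

```cpp
#include <cassert>

// Hypothetical stand-in for the PopDist backend calls (assumption: a real
// adapter would forward to the popdist:: functions documented above).
struct FakeBackend {
  bool initialized = false;
  int initializeCalls = 0;
  int finalizeCalls = 0;
  bool isInitialized() const { return initialized; }
  void initialize() { initialized = true; ++initializeCalls; }
  void finalize() { initialized = false; ++finalizeCalls; }
};

// RAII guard: initializes the backend only if it is not initialized yet,
// and finalizes it only if this guard performed the initialization.
template <typename Backend>
class ScopedBackend {
public:
  explicit ScopedBackend(Backend &backend)
      : backend_(backend), owns_(!backend.isInitialized()) {
    if (owns_) backend_.initialize();
    assert(backend_.isInitialized());
  }
  ~ScopedBackend() {
    if (owns_ && backend_.isInitialized()) backend_.finalize();
  }
  ScopedBackend(const ScopedBackend &) = delete;
  ScopedBackend &operator=(const ScopedBackend &) = delete;

private:
  Backend &backend_;
  bool owns_;
};
```

The guard itself is not thread-safe, so it would need to be created from a single thread per instance.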
-
constexpr const char *defaultCommunicatorId()
A default communicator ID that can be passed to functions accepting a communicator_id argument.
-
void registerCommunicator(const std::set<uint32_t> &participants, const std::string &communicator_id = defaultCommunicatorId())
Register a new communicator that will be used for collectives/synchronization.
This is a collective function used to register new communicators, so all instances must call it with the same parameters. The subset of instances listed in the participants argument can then call collectives or synchronize using the same communicator_id and participants arguments. Instances that are not listed must not participate in that collective or synchronization call. This function only needs to be called when registering communicators for subsets of instances. Collective calls that only need a communicator with a custom communicator_id can pass the new ID directly, without registering it first.
- Parameters
participants – A subset of instances which will use the newly registered communicator. Default value indicates that all instances will use the communicator.
communicator_id – An ID associated with the new communicator.
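Since instances outside the participants set must not join the call, each instance can check its own membership first. The helper below is a hypothetical sketch (shouldParticipate is not a PopDist function). It treats an empty set as "all instances", mirroring the participants = {} default documented for synchronize() and the collectives.

```cpp
#include <cstdint>
#include <set>

// Returns true when `instance` should take part in a call made with the
// given participants set. An empty set is interpreted as "all instances".
bool shouldParticipate(const std::set<uint32_t> &participants,
                       uint32_t instance) {
  return participants.empty() || participants.count(instance) > 0;
}
```

In a real program the instance argument would typically come from popdist::getInstanceIndex(), and the guarded call could be popdist::synchronize(communicator_id, participants).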
-
void synchronize(const std::string &communicator_id = defaultCommunicatorId(), const std::set<uint32_t> &participants = {})
Synchronizes code execution over instances.
Creates a barrier crossing selected instances, halting them until each of them has crossed the barrier. This function is not thread-safe and must be called from one thread per instance only.
- Parameters
communicator_id – An ID of the communicator used for this call.
participants – A subset of instances which participate in the barrier. Default value indicates that all instances participate.
Initializes a shared buffer for a specific instance/replica combination.
This function initializes a raw buffer for each local replica of the instance it is called from. The size of the buffer is provided by the caller. This function is not thread-safe and must be called from one thread per instance only.
- Parameters
buffer_size – Size of the shared buffer.
Writes a provided buffer to the shared buffer.
- Parameters
instance_id – The instance id of the shared buffer to write to.
local_replica_id – The local replica id of the shared buffer to write to.
data – The raw buffer.
data_size – The size of the shared buffer.
Deallocates all shared buffers for the instance it was called from.
Reads the shared buffer for a specific instance/replica combination.
This function reads the shared buffer, copies it to locally accessible memory and returns the memory address to the caller. The size of the buffer is written to the buffer_size parameter.
- Parameters
instance_id – The identifier of the instance.
local_replica_id – The identifier of the local replica.
buffer_size – The size of the buffer in bytes.
- Returns
The memory address of the first byte of the shared buffer.
-
void run(poplar::Engine &engine, uint32_t program_id = 0, const std::string &debug_name = "", const std::string &communicator_id = defaultCommunicatorId(), const std::set<uint32_t> &participants = {})
Calls engine.run() in a synchronized setting.
This function synchronizes selected instances before calling engine.run(). If two instances call popdist::run() with different program identifiers, this will throw an exception. These checks will be skipped when the program is not launched through PopRun. Additionally, this function is thread-safe and can be called from multiple threads within a single instance. Each thread that participates in the same synchronized call needs to use the same communicator_id, but all threads within a single instance need to use a unique communicator_id.
- Parameters
engine – The Poplar engine to run.
program_id – The Poplar program id.
debug_name – The Poplar debug name.
communicator_id – An ID of the communicator used for this call. Must be unique across threads within a single instance.
participants – A subset of instances which participate in the synchronization. Default value indicates that all instances participate.
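One way to satisfy the communicator_id rule for multi-threaded use is to derive the ID from a logical thread index that is identical on every instance. This is a sketch only: threadCommunicatorId is a hypothetical helper, and it assumes that arbitrary strings are accepted as communicator IDs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds an ID that is unique per logical thread within an instance but
// identical across instances for the same thread index.
std::string threadCommunicatorId(const std::string &base,
                                 std::size_t threadIndex) {
  assert(!base.empty());
  return base + "/thread-" + std::to_string(threadIndex);
}
```

Thread 0 on every instance would then pass threadCommunicatorId("run", 0) to popdist::run(), thread 1 would pass threadCommunicatorId("run", 1), and so on.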
#include <popdist/context.hpp>
-
namespace popdist
Functions
-
unsigned getNumTotalReplicas()
Get number of total replicas.
Will try and infer the total number of replicas from environment variables. Will default to 1 if no environment is set.
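The "infer from the environment, otherwise fall back to a default" pattern used by the functions in this header can be sketched as follows. This is an illustration only, not the library's implementation: readUnsignedEnv is a hypothetical helper, and the variable name in the usage note is made up, because this reference does not list the variables PopDist reads.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>

// Reads an unsigned value from the environment variable `name`. Returns
// `fallback` when the variable is unset, empty, or not a base-10 number.
unsigned readUnsignedEnv(const char *name, unsigned fallback) {
  assert(name != nullptr);
  const char *raw = std::getenv(name);
  if (raw == nullptr || !std::isdigit(static_cast<unsigned char>(*raw))) {
    return fallback;
  }
  char *end = nullptr;
  const unsigned long value = std::strtoul(raw, &end, 10);
  if (*end != '\0') return fallback;  // trailing characters: not a number
  return static_cast<unsigned>(value);
}
```

For example, readUnsignedEnv("EXAMPLE_NUM_TOTAL_REPLICAS", 1) yields 1 when that (hypothetical) variable is not set, matching the documented default for getNumTotalReplicas().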
-
unsigned getNumIpusPerReplica()
Get number of IPUs per replica.
Will try and infer the number of IPUs per replica from environment variables. Will default to 1 if no environment is set.
-
bool checkNumIpusPerReplica(unsigned expected)
Check if the number of IPUs per replica in the context matches the expected number.
Will return false if environment variables are set and do not match the given value.
-
bool isUniformReplicasPerInstance()
Gets whether the number of replicas per instance is uniform.
Will try and infer this from environment variables. Will default to false if no environment is set.
-
unsigned getNumLocalReplicas()
Get number of local replicas.
Will try and infer the number of local replicas from environment variables. Will default to 1 if no environment is set.
-
unsigned getReplicaIndexOffset()
Get replica index offset.
Will try and infer the replica index offset from environment variables. Will default to 0 if no environment is set.
-
unsigned getLocalInstanceIndex()
Get local instance index.
The relative index of an instance within a host.
Will try and infer the local instance index from environment variables. Will default to 0 if no environment is set.
-
unsigned getInstanceIndex()
Gets the index of the current instance.
Can only be used with a uniform number of replicas per instance.
- Returns
The index of the current instance.
-
unsigned getNumInstances()
Gets the total number of instances.
Can only be used with a uniform number of replicas per instance.
- Returns
The total number of instances.
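The relationship between these values can be sketched with plain arithmetic. The sketch rests on two assumptions that this reference does not state explicitly: the replica index offset is the global index of the instance's first replica, and every instance holds the same number of replicas. ReplicaLayout and the three functions are hypothetical names. In a real program the fields would be filled from getNumTotalReplicas(), getNumLocalReplicas() and getReplicaIndexOffset().

```cpp
#include <cassert>

// Values that would normally be read from the PopDist context.
struct ReplicaLayout {
  unsigned numTotalReplicas;
  unsigned numLocalReplicas;
  unsigned replicaIndexOffset;
};

// Global index of a replica, given its index local to this instance.
unsigned globalReplicaIndex(const ReplicaLayout &layout,
                            unsigned localReplica) {
  assert(localReplica < layout.numLocalReplicas);
  return layout.replicaIndexOffset + localReplica;
}

// Index of this instance, valid only for a uniform replica distribution.
unsigned instanceIndexOf(const ReplicaLayout &layout) {
  assert(layout.numLocalReplicas > 0);
  return layout.replicaIndexOffset / layout.numLocalReplicas;
}

// Total number of instances, valid only for a uniform distribution.
unsigned numInstancesOf(const ReplicaLayout &layout) {
  assert(layout.numLocalReplicas > 0);
  return layout.numTotalReplicas / layout.numLocalReplicas;
}
```

With 8 replicas in total, 2 per instance and an offset of 4, local replica 1 maps to global replica 5 on instance 2 of 4.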
#include <popdist/collectives.hpp>
-
namespace popdist
-
namespace collectives
Functions
-
void allGather(const void *data, void *destination, size_t num_elements, const poplar::Type &type, const std::string &communicator_id = defaultCommunicatorId(), bool inplace = false, const std::set<uint32_t> &participants = {})
Allgather collective operation.
This function gathers data from selected instances and distributes the result back to the selected instances. Additionally, this operation is thread-safe and can be called from multiple threads within a single instance. Each thread that participates in the same collective needs to use the same communicator_id, but all threads within a single instance need to use a unique communicator_id.
- Parameters
data – Pointer to the first element of the data being gathered.
destination – Pointer to the buffer where the result of the operation is written to.
num_elements – Number of elements in data and destination.
type – Type of the elements in data and destination.
communicator_id – An ID of the communicator used for this call. Must be unique across threads within a single instance. May be reused once the collective operation using it has completed.
inplace – Perform the allGather in place if true and not otherwise. Defaults to false. If true, you should pass nullptr to the data parameter, and the input data of each process must be located inside the destination buffer where that process’s data will be written.
participants – A subset of instances which participate in the collective call. Default value indicates that all instances participate.
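The data movement of an all-gather, and where an instance's own data would sit for the in-place variant, can be illustrated with a single-process simulation. This sketch does not call PopDist. It assumes an MPI-style layout in which every instance contributes the same number of elements and the destination is ordered by instance index, which this reference does not spell out. simulateAllGather and inplaceOffset are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simulates an all-gather: contributions[i] is the data of instance i and
// the returned buffer is what every instance would hold afterwards.
std::vector<int> simulateAllGather(
    const std::vector<std::vector<int>> &contributions) {
  std::vector<int> gathered;
  for (const auto &chunk : contributions) {
    assert(chunk.size() == contributions.front().size());
    gathered.insert(gathered.end(), chunk.begin(), chunk.end());
  }
  return gathered;
}

// Element offset inside the destination buffer at which an instance would
// place its own input before an in-place call (assumed layout).
std::size_t inplaceOffset(std::size_t instance,
                          std::size_t elementsPerInstance) {
  return instance * elementsPerInstance;
}
```

With three instances contributing two elements each, instance 2 would write its input at element offset 4 of the six-element destination buffer.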
-
void allReduceSum(void *data, size_t num_elements, const poplar::Type &type, const std::string &communicator_id = defaultCommunicatorId(), const std::set<uint32_t> &participants = {})
Allreduce collective operation.
This function sums values from selected instances and distributes the result back to the selected instances.
- Parameters
data – Pointer to the first element of the data being reduced.
num_elements – Number of elements in data.
type – Type of the elements in data.
communicator_id – An ID of the communicator used for this call. Must be unique across threads within a single instance. May be reused once the collective operation using it has completed.
participants – A subset of instances which participate in the collective call. Default value indicates that all instances participate.
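The result of the sum reduction can be illustrated without the library: every participating instance ends up with the element-wise sum of all input buffers. The function below is a single-process simulation with a hypothetical name, not a PopDist call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simulates allReduceSum: perInstance[i] is the input buffer of instance i.
// The returned buffer is what each instance would hold after the call.
std::vector<float> simulateAllReduceSum(
    const std::vector<std::vector<float>> &perInstance) {
  assert(!perInstance.empty());
  std::vector<float> sum(perInstance.front().size(), 0.0f);
  for (const auto &buffer : perInstance) {
    assert(buffer.size() == sum.size());
    for (std::size_t i = 0; i < sum.size(); ++i) sum[i] += buffer[i];
  }
  return sum;
}
```

Note that the real function overwrites data in place, so the input values are no longer available after the call.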
-
void broadcast(void *data, size_t num_elements, const poplar::Type &type, const uint32_t root = 0, const std::string &communicator_id = defaultCommunicatorId(), const std::set<uint32_t> &participants = {})
Broadcast collective operation.
This function broadcasts data from the root instance to all other selected instances.
- Parameters
data – Pointer to the first element of the data being broadcast.
num_elements – Number of elements in data.
type – Type of the elements in data.
root – Rank of the instance broadcasting data.
communicator_id – An ID of the communicator used for this call. Must be unique across threads within a single instance. May be reused once the collective operation using it has completed.
participants – A subset of instances which participate in the collective call. Default value indicates that all instances participate.
#include <popdist/popdist_poplar.hpp>
-
namespace popdist
Functions
-
poplar::Graph createGraph(poplar::TargetType targetType, unsigned ipusPerReplica = 0)
Create a Poplar graph that works in the PopDist context.
The created graph will have the appropriate replication factor set. If no context is present (no environment variables are set), then the graph will have no replication factor.
- Parameters
targetType – The required targetType. Unless the context is a simple 1 replica system, this type must be TargetType::IPU.
ipusPerReplica – The expected ipusPerReplica that the graph should be created over. If this does not match the PopDist context then an exception will be thrown. If zero, the value is obtained from the PopDist context.
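The documented handling of ipusPerReplica (zero means "take the value from the context", a mismatch is an error) can be sketched as a small standalone function. resolveIpusPerReplica is a hypothetical name, and std::runtime_error is only a placeholder, since this reference does not say which exception type PopDist throws.

```cpp
#include <cassert>
#include <stdexcept>

// Resolves the IPUs-per-replica value: 0 selects the context value, any
// other value must agree with the context or an exception is thrown.
unsigned resolveIpusPerReplica(unsigned requested, unsigned fromContext) {
  assert(fromContext > 0);
  if (requested == 0) return fromContext;
  if (requested != fromContext) {
    throw std::runtime_error("ipusPerReplica does not match the context");
  }
  return requested;
}
```

In a real program fromContext would correspond to popdist::getNumIpusPerReplica().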
-
void setEngineOptions(poplar::OptionFlags &opt)
Set the Poplar engine options to match the PopDist context.
If no context is present (no environment variables are set) then the option flags are not changed.
- Parameters
opt – The option flags to be updated.
-
poplar::Device getDevice(poplar::TargetType targetType, unsigned ipusPerReplica, const poplar::OptionFlags &opt = {})
Get a device that works in the PopDist context.
If no context is present (no environment variables are set), then a suitable available device is still returned.
- Parameters
targetType – The required targetType. Unless the context is a simple 1 replica system, this type must be TargetType::IPU.
ipusPerReplica – The expected ipusPerReplica that the device should respect. If this does not match the PopDist context then an exception is thrown.
opt – Option flags for the target creation.
-
void prepareParentDevice()
Prepare the current parent device for PopDist execution.
This needs to be called before every engine load in order to reset the IPU state for loading a new executable. PopRun does the initial preparation, so this only needs to be called by the application when loading its second engine onwards.
Only the first instance does the actual preparation (the others do nothing), so it is safe to call this function from all the instances.
Note that all the instances must detach from their respective child devices before calling this function. This can be achieved by detaching and then performing a global synchronization barrier before the call to this function. A barrier may also be needed after the call to ensure that the preparation is complete before attempting to re-attach to the child devices.
All the necessary information is read from PopDist environment variables.
-
void prepareDevice(poplar::Device &device, unsigned ipusPerReplica, unsigned numReplicas)
Prepare the given device for PopDist execution.
All the necessary information must be passed in.
-
unsigned getDeviceId(unsigned ipusPerReplica = 0)
Returns the Poplar ID of the parent device.
- Parameters
ipusPerReplica – The number of IPUs per replica. If 0 is provided, the value will be read from the environment variable set by PopRun.
- Returns
unsigned