5. Managing devices
Model Runtime defines a set of helper classes that create an abstraction of a physical device that represents a group of one or more IPUs and manages the connections to them.
5.1. Device
The Device class is a thin wrapper around the Poplar Device class. It extends poplar::Device with:
The IPU architecture version you have declared (ipuVersion()).
An extra step of binding the device to a Session. If a session is already bound to the Device instance, it is first unloaded from the device (unloadFromDevice()) before the new session is bound.
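The binding rule can be illustrated with a minimal sketch. SketchSession and SketchDevice below are hypothetical stand-ins, not the Model Runtime classes; only the name unloadFromDevice() comes from the text, and the way the bound session is tracked is an assumption made for illustration.

```cpp
// Hypothetical stand-in for a session; not the Model Runtime Session class.
struct SketchSession {
  bool loaded = false;
  // Mirrors the name used in the text; here it only clears a flag.
  void unloadFromDevice() { loaded = false; }
};

// Hypothetical stand-in for a device that tracks one bound session.
class SketchDevice {
public:
  // Bind a session; a previously bound, different session is unloaded first.
  void bind(SketchSession &session) {
    if (bound_ != nullptr && bound_ != &session) {
      bound_->unloadFromDevice();
    }
    bound_ = &session;
  }

  const SketchSession *boundSession() const { return bound_; }

private:
  SketchSession *bound_ = nullptr; // non-owning; assumed to outlive the binding
};
```

In this sketch, binding a second session to the same SketchDevice clears the loaded flag of the first one before the second becomes the bound session.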
Note
The Device class is not intended to be used directly. To create a Device object, use DeviceManager (Section 5.2, Device manager).
5.2. Device manager
The DeviceManager class provides an interface to manage the hardware resources available in your system. From a programming perspective, it is a builder class that creates Device objects.
Note
DeviceManager does not own the Device objects it creates. Full ownership is transferred to the model.
To create a Device object suitable for your model and representing one or more physical IPUs, you can use one of the following DeviceManager methods (see Section 5.2.1, Control which devices to use for more information on restricting which IPUs are visible):
getDevice(std::shared_ptr<popef::Model>, const DeviceWaitConfig &) accepts a Model object and a DeviceWaitConfig object and returns a newly created Device instance.
Note
DeviceWaitConfig is a helper class that stores the configuration of how DeviceManager will wait for the specified hardware to become available. By default, if there are no hardware resources suitable to run the model on, an exception is thrown. More information about DeviceWaitConfig can be found in the description of the Session constructor “wait_config” parameter.
DeviceManager reads the model’s Metadata to figure out the parameters of a suitable Device instance. If there is no suitable hardware available in the system, an exception is thrown.
// Assuming there is a model (popef::Model) prepared/loaded in advance
model_runtime::DeviceManager mgr;
auto device = mgr.getDevice(model); // acquire device suitable for the model
getDevice(int64_t, const poplar::OptionFlags &, const DeviceWaitConfig &, const DeviceConstraints &) is an overloaded version of getDevice. You can use it to specify the requested Device parameters manually (number of IPUs requested, device options, DeviceWaitConfig and DeviceConstraints).
Note
DeviceConstraints is a class that gathers constraints for the requested hardware. Currently, there is only one constraint defined: requiresRemoteBuffersSupport (see: Poplar remote memory buffers).
model_runtime::DeviceManager mgr;
const int64_t number_of_ipus = 1;
const model_runtime::DeviceConstraints constraints{true /*requiresRemoteBuffersSupport*/};
const model_runtime::DeviceWaitConfig wait_config{};
const poplar::OptionFlags options{};
std::shared_ptr<model_runtime::Device> device;
try {
  device = mgr.getDevice(number_of_ipus, options, wait_config, constraints);
} catch (std::runtime_error &err) {
  // catch and handle the exception if thrown
}
tryGetDevice(std::shared_ptr<popef::Model>, const DeviceWaitConfig &) and tryGetDevice(int64_t, const poplar::OptionFlags &, const DeviceWaitConfig &, const DeviceConstraints &) perform the same tasks as the corresponding getDevice methods, but do not throw exceptions in the case of a failure. tryGetDevice also does not check the IPU hardware version.
// Assuming there is a model (popef::Model) prepared and loaded in advance
model_runtime::DeviceManager mgr;
auto device = mgr.tryGetDevice(model); // Acquire device suitable for the model
if (device == nullptr) {
  // No Device object created, handle error
}
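The difference between a throwing getter and its non-throwing counterpart can be sketched without the SDK. The code below uses hypothetical stand-ins (FakeDevice, getFakeDevice, tryGetFakeDevice) and assumes, purely for illustration, that the non-throwing variant maps a failure to a null pointer. It does not describe how DeviceManager is implemented.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a device handle; not the Model Runtime Device class.
struct FakeDevice {
  int numIpus;
};

// Hypothetical throwing getter: raises an exception when the request
// cannot be satisfied by the available IPUs.
std::shared_ptr<FakeDevice> getFakeDevice(int requested, int available) {
  if (requested > available) {
    throw std::runtime_error("no suitable device available");
  }
  return std::make_shared<FakeDevice>(FakeDevice{requested});
}

// Hypothetical non-throwing variant: reports the same failure as nullptr,
// so the caller checks the returned pointer instead of catching.
std::shared_ptr<FakeDevice> tryGetFakeDevice(int requested, int available) {
  try {
    return getFakeDevice(requested, available);
  } catch (const std::runtime_error &) {
    return nullptr;
  }
}
```

Callers that treat a missing device as an expected condition may find the pointer check simpler, while callers that treat it as fatal can let the exception propagate.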
There are other variations of the Device getter methods:
getSpecificDevice(int64_t, std::shared_ptr<popef::Model>, const DeviceWaitConfig &)
getSpecificDevice(int64_t, const poplar::OptionFlags &, const DeviceWaitConfig &)
Specific to these getSpecificDevice methods is their first argument, device_id, which represents the numerical ID of the device.
model_runtime::DeviceManager mgr;
int64_t id;
{
  // getDevice returns a shared pointer to a Device
  auto device = mgr.getDevice(1);
  id = device->device().getId();
} // should have been released when device went out of scope
std::shared_ptr<model_runtime::Device> device;
try {
  device = mgr.getSpecificDevice(id);
} catch (std::runtime_error &err) {
  // catch and handle the exception if thrown
}
The following functions perform the same tasks as the corresponding getSpecificDevice methods, but do not throw exceptions in the case of a failure:
tryGetSpecificDevice(int64_t, std::shared_ptr<popef::Model>, const DeviceWaitConfig &)
tryGetSpecificDevice(int64_t, const poplar::OptionFlags &, const DeviceWaitConfig &)
model_runtime::DeviceManager mgr;
int64_t id;
{
  // getDevice returns a shared pointer to a Device
  auto device = mgr.getDevice(1);
  id = device->device().getId();
} // should have been released when device went out of scope
std::shared_ptr<model_runtime::Device> device = mgr.tryGetSpecificDevice(id);
if (device == nullptr) {
  // no Device object created, handle error
}
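The examples above rely on a device being released when its handle goes out of scope, so that the same ID can be requested again. A minimal sketch of that pattern follows, using a hypothetical SketchPool and a std::shared_ptr with a custom deleter. The pool, its bookkeeping and the use of a plain int as the handle are assumptions for illustration, not the Model Runtime implementation.

```cpp
#include <memory>
#include <set>

// Hypothetical pool that tracks which device IDs are currently attached.
// The pool must outlive every handle it returns.
struct SketchPool {
  std::set<int> inUse;

  // Returns a handle for the given ID, or nullptr if the ID is already taken.
  // The custom deleter marks the ID as free again when the last copy of the
  // handle is destroyed.
  std::shared_ptr<int> acquire(int id) {
    if (inUse.count(id) > 0) {
      return nullptr;
    }
    inUse.insert(id);
    return std::shared_ptr<int>(new int(id), [this](int *p) {
      inUse.erase(*p);
      delete p;
    });
  }
};
```

In this sketch a second acquire for the same ID fails while the first handle is alive and succeeds once that handle has left its scope.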
If you do not have any IPUs available in the system, or for another reason do not want to operate on a physical IPU, you can use the DeviceManager methods createIpuModelDevice and createSmallIpuModelDevice to create Device objects representing simulated IPU models.
Note
Both versions of the createIpuModelDevice function accept the tiles_per_ipu parameter as the last argument. The value of this parameter sets the number of tiles per IPU to be simulated. The larger this value is, the longer the simulated computations take.
model_runtime::DeviceManager mgr;
const int64_t number_of_ipus = 1;
const int64_t ipu_version = 2;
const int64_t tiles_per_ipu = 100;
std::shared_ptr<model_runtime::Device> device =
mgr.createIpuModelDevice(number_of_ipus, ipu_version, tiles_per_ipu);
The createSmallIpuModelDevice methods are shortcuts that create a Device object referring to a “small” IPU model (only 4 tiles per IPU):
model_runtime::DeviceManager mgr;
const int64_t number_of_ipus = 1;
const int64_t ipu_version = 2;
std::shared_ptr<model_runtime::Device> device =
mgr.createSmallIpuModelDevice(number_of_ipus, ipu_version);
5.2.1. Control which devices to use
You can use the IPU_VISIBLE_DEVICES environment variable to control the IPUs that Model Runtime uses to run an application.
You can use this to allocate resources or to restrict an application to a specific IPU. For example, you will get better performance if the CPU, memory and IPU are on the same NUMA node. While numactl can be used in Linux to specify the NUMA node for the CPU and memory, it cannot be used for IPUs, so IPU_VISIBLE_DEVICES can be used for this purpose.
Set the environment variable to a comma-separated list of IPUs. Model Runtime will only select devices in the list. For example, to include IPUs with IDs 1, 3 and 4, you would set:
$ export IPU_VISIBLE_DEVICES=1,3,4
Note
The list order doesn’t matter. Model Runtime just checks if the device is included in the list or not.
If IPU_VISIBLE_DEVICES is not set, then Model Runtime will use any device that meets the model requirements.
If all the devices in IPU_VISIBLE_DEVICES are busy, then the run will fail with a “no available device” exception. This can happen even if other IPUs are available, because they are not included in the IPU_VISIBLE_DEVICES list.
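The selection rule described above (an unset variable allows any device, a set variable acts as an unordered allow-list) can be sketched as follows. parseVisibleDevices and isDeviceVisible are hypothetical helpers written for this illustration. They are not part of Model Runtime, and how the library actually parses the variable is not covered here.

```cpp
#include <cstdint>
#include <set>
#include <sstream>
#include <string>

// Hypothetical result type: 'restricted' is false when the variable is unset.
struct VisibleDevices {
  bool restricted = false;
  std::set<int64_t> ids;
};

// Hypothetical helper: parse a comma-separated ID list such as "1,3,4".
// Pass the result of std::getenv("IPU_VISIBLE_DEVICES"); nullptr means unset.
VisibleDevices parseVisibleDevices(const char *value) {
  VisibleDevices result;
  if (value == nullptr) {
    return result;
  }
  result.restricted = true;
  std::stringstream stream(value);
  std::string token;
  while (std::getline(stream, token, ',')) {
    if (!token.empty()) {
      result.ids.insert(std::stoll(token)); // throws on non-numeric tokens
    }
  }
  return result;
}

// A device is eligible when no list is set, or when its ID is in the list.
// A set is used because the order of the list does not matter.
bool isDeviceVisible(const VisibleDevices &visible, int64_t id) {
  return !visible.restricted || visible.ids.count(id) > 0;
}
```

With the value 1,3,4 this sketch treats devices 1, 3 and 4 as eligible and device 0 as hidden, regardless of the order in which the IDs are written.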
To set the CPU affinity:
Identify the NUMA node you wish to use.
Set the CPU and memory to this NUMA node when launching your application.
$ numactl -N<node-no> -m<node-no> <command>
where <node-no> is the NUMA node that you wish to use and <command> is the application to run. For example, if you wish to use NUMA node 0, run:
$ numactl -N0 -m0 <command>
Identify all IPU devices that are on the NUMA node you wish to use. You can do this with the gc-info command.
$ gc-info -d <device-id> -i | grep numa
where <device-id> is the IPU device ID. For example, run gc-info on device 0 to check if it is on NUMA node 0:
$ gc-info -d 0 -i | grep numa
Note
You will have to do this for each device ID. You can get a list of all device IDs with the gc-inventory command.
Once you have identified the IPUs that are on the NUMA node you wish to use, set the IPU_VISIBLE_DEVICES environment variable to these IPUs.
$ export IPU_VISIBLE_DEVICES=<list of IPUs on the selected NUMA node>
For example, if you have found that IPUs 0, 1 and 3 are on NUMA node 0, then set:
$ export IPU_VISIBLE_DEVICES=0,1,3
This will ensure that the CPU, memory and IPUs are all on the same NUMA node.