10. Model Runtime C++ API reference
10.1. High level API
10.1.1. Device management
-
enum class model_runtime::DeviceWaitStrategy
Defines the different options for waiting for a device to become available.
Values:
-
enumerator NO_WAIT
An exception will be thrown if no IPU device is immediately available.
-
enumerator WAIT_WITH_TIMEOUT
The device manager will wait for a specified amount of time for an IPU device to become available.
The device manager will try to attach to the required device at a specified interval.
-
enumerator WAIT_FOREVER
The device manager will wait until an IPU device is available.
The device manager will try to attach to the required device at a specified interval.
-
struct DeviceWaitConfig
The configuration of how to wait for the expected device.
Public Functions
-
constexpr DeviceWaitConfig() = default
Constructor with default values.
-
inline constexpr DeviceWaitConfig(DeviceWaitStrategy p_strategy, std::chrono::seconds p_timeout = std::chrono::seconds{0}, std::chrono::seconds p_sleepTime = std::chrono::seconds{15})
Constructor with specified device waiting strategy.
- Parameters
p_strategy – [in] The device waiting strategy.
p_timeout – [in] The time in seconds to wait for a device. Only required if p_strategy is DeviceWaitStrategy::WAIT_WITH_TIMEOUT.
p_sleepTime – [in] The time in seconds between attach attempts. Required if p_strategy is DeviceWaitStrategy::WAIT_WITH_TIMEOUT or DeviceWaitStrategy::WAIT_FOREVER.
-
inline explicit constexpr DeviceWaitConfig(std::chrono::seconds p_timeout, std::chrono::seconds p_sleepTime = std::chrono::seconds{15})
Constructor with device waiting strategy DeviceWaitStrategy::WAIT_WITH_TIMEOUT.
This means that the device manager will wait for a finite amount of time, p_timeout, for a device to become available.
- Parameters
p_timeout – [in] The time in seconds to wait for a device.
p_sleepTime – [in] The time in seconds between attach attempts.
Public Members
-
DeviceWaitStrategy strategy = {DeviceWaitStrategy::NO_WAIT}
The device waiting strategy.
-
std::chrono::seconds timeout = {0}
The time in seconds to wait if no device is currently available.
-
std::chrono::seconds sleepTime = {15}
The time in seconds between attach attempts.
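The interaction between the three waiting strategies, the timeout and the sleep time can be sketched as a retry loop. This is only an illustration under stated assumptions: the types in namespace sketch are local stand-ins that mirror the declarations above, attachWithWait is a hypothetical helper, and counting elapsed time by summing the sleep intervals is an assumption rather than documented library behaviour.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <stdexcept>

namespace sketch {

// Local stand-ins mirroring the documented declarations (not the real headers).
enum class DeviceWaitStrategy { NO_WAIT, WAIT_WITH_TIMEOUT, WAIT_FOREVER };

struct DeviceWaitConfig {
  DeviceWaitStrategy strategy = DeviceWaitStrategy::NO_WAIT;
  std::chrono::seconds timeout{0};
  std::chrono::seconds sleepTime{15};
};

// Hypothetical helper: retries try_attach according to the config and
// returns the number of attempts made. Throws if no device is obtained.
// sleep_for is injected so the loop can be exercised without real delays.
inline int attachWithWait(
    const DeviceWaitConfig &config, const std::function<bool()> &try_attach,
    const std::function<void(std::chrono::seconds)> &sleep_for) {
  std::chrono::seconds waited{0};
  int attempts = 0;
  while (true) {
    ++attempts;
    if (try_attach()) return attempts;
    if (config.strategy == DeviceWaitStrategy::NO_WAIT)
      throw std::runtime_error("no device immediately available");
    if (config.strategy == DeviceWaitStrategy::WAIT_WITH_TIMEOUT &&
        waited >= config.timeout)
      throw std::runtime_error("timed out waiting for a device");
    sleep_for(config.sleepTime);  // interval between attach attempts
    waited += config.sleepTime;
  }
}

}  // namespace sketch
```

With WAIT_FOREVER the loop only ends when an attach attempt succeeds, so sleepTime is the only setting that affects it.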
-
struct DeviceConstraints
Requirement for the device that the user wants to connect to.
Public Functions
-
constexpr DeviceConstraints() = default
Constructor with default values.
-
inline explicit constexpr DeviceConstraints(bool p_requiresRemoteBuffersSupport)
Constructor with specified values.
- Parameters
p_requiresRemoteBuffersSupport – [in] If true, the device must support remote buffers. If false, the device does not have to support remote buffers. Default: false.
-
inline constexpr operator bool() const
Check whether the selection of the machine has any constraints.
If true, the device has constraints; if false, it does not.
Public Members
-
bool requiresRemoteBuffersSupport = false
Whether the device has to support remote buffers.
If true, the device must support remote buffers. If false, the device does not have to support remote buffers. Default: false.
-
class Device
Create a device.
This is a wrapper around a Poplar device.
Public Functions
-
Device(poplar::Device device, int64_t ipu_version)
Constructor with specified values.
- Parameters
device – [in] The Poplar device. This is a device that can execute code.
ipu_version – [in] The version number of the IPU.
-
int64_t ipuVersion() const
Get the version of the device.
- Returns
The version number of the device or -1 if unknown.
Protected Functions
-
bool isActiveSession(const Session *session) const
Get whether a session is the active session for this device.
- Returns
True if the session is the active session for this device, false otherwise.
-
void bindToSession(Session *session)
Bind the device to a session.
If the device is bound to a different session then unbind() is called on that session first.
- Parameters
session – [in] The new session to bind the device to.
-
void unbindSession()
Unbind the device from the active session.
Friends
- friend class Session
-
class DeviceManager
Select which device to run on.
Public Functions
-
DeviceManager &operator=(const DeviceManager &other) = delete
-
DeviceManager &operator=(DeviceManager&&) = delete
-
DeviceManager()
Constructor with default values.
-
DeviceManager(const DeviceManager&) = default
Default copy constructor.
-
DeviceManager(DeviceManager&&) = default
Default move constructor.
-
int64_t ipuHardwareVersion()
- Returns
Either:
The version of the IPU on the physical system.
-1 if there is an IPU but the version is unknown.
0 if there is no IPU on the system.
Get a device matching the configuration needed by the given model.
- Parameters
model – [in] A PopEF model. This method gets the number of IPUs and the option flags from the model.
wait_config – [in] The configuration of how to wait for the expected device.
- Returns
The device if attachment is successful, otherwise throw an error.
Try to get a device matching the configuration needed by the given model.
- Parameters
model – [in] A PopEF model. This method gets the number of IPUs and the option flags from the model.
wait_config – [in] The configuration of how to wait for the expected device.
- Returns
The device if attachment is successful, otherwise a nullptr.
-
std::shared_ptr<Device> getDevice(int64_t num_ipus, const poplar::OptionFlags &device_options = {}, const DeviceWaitConfig &wait_config = {}, const DeviceConstraints &constrains = {})
Get a device matching the requested configuration.
- Parameters
num_ipus – [in] The number of IPUs the device must contain.
device_options – [in] The device options.
wait_config – [in] The configuration of how to wait for the expected device.
constrains – [in] The set of constraints that the device must meet.
- Returns
The device if attachment is successful, otherwise throw an error.
-
std::shared_ptr<Device> tryGetDevice(int64_t num_ipus, const poplar::OptionFlags &device_options = {}, const DeviceWaitConfig &wait_config = {}, const DeviceConstraints &constrains = {})
Try to get a device matching the requested configuration.
- Parameters
num_ipus – [in] The number of IPUs the device must contain.
device_options – [in] The device options.
wait_config – [in] The configuration of how to wait for the expected device.
constrains – [in] The set of constraints that the device must meet.
- Returns
The device if attachment is successful, otherwise a nullptr.
Get a specific device matching the requested configuration.
- Parameters
device_id – [in] The ID of the device to acquire.
model – [in] A PopEF model. This method gets the option flags from the model.
wait_config – [in] The configuration of how to wait for the expected device.
- Returns
The device if attachment is successful, otherwise throw an error.
-
std::shared_ptr<Device> getSpecificDevice(int64_t device_id, const poplar::OptionFlags &device_options = {}, const DeviceWaitConfig &wait_config = {})
Get a specific device matching the requested configuration.
- Parameters
device_id – [in] The ID of the device to acquire.
device_options – [in] The device options.
wait_config – [in] The configuration of how to wait for the expected device.
- Returns
The device if attachment is successful, otherwise throw an error.
Try to get a specific device matching the requested configuration.
- Parameters
device_id – [in] The ID of the device to acquire.
model – [in] A PopEF model. This method gets the option flags from the model.
wait_config – [in] The configuration of how to wait for the expected device.
- Returns
The device if attachment is successful, otherwise a nullptr.
-
std::shared_ptr<Device> tryGetSpecificDevice(int64_t device_id, const poplar::OptionFlags &device_options = {}, const DeviceWaitConfig &wait_config = {})
Try to get a specific device matching the requested configuration.
- Parameters
device_id – [in] The ID of the device to acquire.
device_options – [in] The device options.
wait_config – [in] The configuration of how to wait for the expected device.
- Returns
The device if attachment is successful, otherwise a nullptr.
-
std::shared_ptr<Device> createIpuModelDevice(int64_t num_ipus, int64_t ipu_version = 2, int64_t tiles_per_ipu = 0)
Create a model of the device.
- Parameters
num_ipus – [in] The number of IPUs the device must contain.
ipu_version – [in] The target architecture of the IPU.
tiles_per_ipu – [in] The number of tiles per IPU the model will have. If 0, defaults to the number of tiles for the chosen IPU version.
- Returns
The model of the device.
Create a model of the device.
- Parameters
model – [in] A PopEF model. This method gets the number of IPUs and the IPU version from the model.
tiles_per_ipu – [in] The number of tiles per IPU the model will have. If 0, defaults to the number of tiles for the chosen IPU version.
- Returns
The model of the device.
-
std::shared_ptr<Device> createSmallIpuModelDevice(int64_t num_ipus, int64_t ipu_version = 2)
Create a small IPU model of the device.
Small IPU models only have 4 tiles: they are much quicker to create and run than a full size model but can only run small programs.
- Parameters
num_ipus – [in] The number of IPUs the device must contain.
ipu_version – [in] The target architecture of the IPU.
- Returns
The model of the device.
Create a small IPU model of the device.
Small IPU models only have 4 tiles: they are much quicker to create and run than a full size model but can only run small programs.
- Parameters
model – [in] A PopEF model. This method gets the number of IPUs and the IPU version from the model.
- Returns
The model of the device.
10.1.2. Tensor memory representation
-
struct TensorMemoryView
Mutable view to already allocated memory.
-
struct ConstTensorMemoryView
Immutable view to already allocated memory.
Public Functions
-
ConstTensorMemoryView() = default
Default constructor.
-
ConstTensorMemoryView(const TensorMemoryView &other)
Construct an immutable view from a mutable view.
-
ConstTensorMemoryView(const void *data, uint64_t data_size_bytes)
Immutable view to const memory.
- Parameters
data – [in] The pointer to the allocated memory.
data_size_bytes – [in] The size of the memory block, in bytes.
-
struct TensorMemory
Tensor memory manager responsible for allocating, storing, sharing and releasing tensor memory.
Public Functions
-
TensorMemory() = default
Default constructor.
-
TensorMemory(int64_t data_size_bytes)
Allocate user requested memory block and store it in a shared_ptr.
- Parameters
data_size_bytes – [in] The size of the memory block, in bytes.
-
TensorMemoryView getView()
Get the mutable memory view.
-
ConstTensorMemoryView getConstView()
Get the immutable memory view.
-
ConstTensorMemoryView getView() const
Get the immutable memory view.
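The relationship between the owning TensorMemory and its two non-owning views can be sketched as follows. The types in namespace sketch are local stand-ins, not the library's definitions: the member names data and data_size_bytes are taken from the ConstTensorMemoryView constructor parameters above, and the std::vector-backed storage is an assumption made only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

namespace sketch {

// Stand-in for a mutable view: a non-owning pointer plus a size.
struct TensorMemoryView {
  TensorMemoryView(void *d, uint64_t n) : data(d), data_size_bytes(n) {}
  void *data;
  uint64_t data_size_bytes;
};

// Stand-in for an immutable view; implicitly constructible from a mutable one.
struct ConstTensorMemoryView {
  ConstTensorMemoryView(const TensorMemoryView &other)
      : data(other.data), data_size_bytes(other.data_size_bytes) {}
  ConstTensorMemoryView(const void *d, uint64_t n)
      : data(d), data_size_bytes(n) {}
  const void *data;
  uint64_t data_size_bytes;
};

// Stand-in owner: the allocation is held in a shared_ptr, so copies share it.
struct TensorMemory {
  explicit TensorMemory(int64_t data_size_bytes)
      : storage(std::make_shared<std::vector<unsigned char>>(
            static_cast<std::size_t>(data_size_bytes))) {}
  TensorMemoryView getView() {
    return TensorMemoryView(storage->data(), storage->size());
  }
  ConstTensorMemoryView getConstView() const {
    return ConstTensorMemoryView(storage->data(), storage->size());
  }
  std::shared_ptr<std::vector<unsigned char>> storage;
};

// Write a float through the mutable view, then read it back through the
// const view of a copy that shares the same allocation.
inline float roundTrip(float value) {
  TensorMemory memory(sizeof(float));
  TensorMemoryView view = memory.getView();
  assert(view.data_size_bytes == sizeof(float));
  std::memcpy(view.data, &value, sizeof(float));
  TensorMemory alias = memory;
  ConstTensorMemoryView read = alias.getConstView();
  float out = 0.0f;
  std::memcpy(&out, read.data, sizeof(float));
  return out;
}

}  // namespace sketch
```

Because a view does not own the memory, it must not outlive the TensorMemory (or user buffer) it points into.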
10.1.3. Model Runner
-
using model_runtime::InputMemoryView = std::unordered_map<std::string, ConstTensorMemoryView>
Mapping between a tensor name and an immutable memory view.
Used as input to ModelRunner::execute.
-
using model_runtime::OutputMemoryView = std::unordered_map<std::string, TensorMemoryView>
Mapping between a tensor name and a memory view.
Used as output from ModelRunner::execute, when the output memory is allocated and managed by the ModelRunner client.
-
using model_runtime::OutputMemory = std::unordered_map<std::string, TensorMemory>
Mapping between a tensor name and memory.
Used as output from ModelRunner::execute when the output memory is allocated during execution by the library.
-
using model_runtime::OutputFutureMemoryView = std::unordered_map<std::string, std::shared_future<TensorMemoryView>>
Mapping between a tensor name and a future memory view.
Used as output from ModelRunner::executeAsync when the output memory is allocated and managed by the ModelRunner client.
-
using model_runtime::OutputFutureMemory = std::unordered_map<std::string, std::shared_future<TensorMemory>>
Mapping between a tensor name and a future memory.
Used as output from ModelRunner::executeAsync when the output memory is allocated during execution by the library.
-
struct DataDesc
The description of data used by ModelRunner.
Public Functions
-
DataDesc(std::string name, int64_t size_in_bytes, std::vector<int64_t> shape, popef::DataType data_type, bool popef_contains_tensor_data = false)
Create description of input/output data.
- Parameters
name – [in] The name of the input/output tensor.
size_in_bytes – [in] The size of the tensor measured in bytes.
shape – [in] A vector defining the shape of the tensor. The size of the vector is equal to the number of tensor dimensions. Each element of the vector indicates the size of a single dimension.
data_type – [in] The data type of a single tensor element.
popef_contains_tensor_data – [in] If true, the model has a tensor data blob associated with the tensor. If false, the model does not have a tensor data blob associated with the tensor. Default: false.
Public Members
-
std::string name
The name of the input/output tensor.
-
int64_t size_in_bytes
The size of the tensor measured in bytes.
-
std::vector<int64_t> shape
A vector defining the shape of the tensor.
The size of the vector is equal to the number of tensor dimensions. Each element of the vector indicates the size of a single dimension.
-
bool popef_contains_tensor_data
If true, the model has a tensor data blob associated with the tensor.
If false, the model does not have a tensor data blob associated with the tensor.
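The link between shape and size_in_bytes can be illustrated with a small helper. This is a sketch under an assumption: that the byte size of a tensor is the product of its dimensions multiplied by the size of one element. The function name expectedSizeInBytes and the element_size parameter are hypothetical and not part of the API.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

namespace sketch {

// Hypothetical helper: byte size implied by a shape and an element size.
// An empty shape is treated as a scalar (a single element).
inline int64_t expectedSizeInBytes(const std::vector<int64_t> &shape,
                                   int64_t element_size) {
  return std::accumulate(shape.begin(), shape.end(), element_size,
                         std::multiplies<int64_t>());
}

}  // namespace sketch
```

For example, a tensor of shape {2, 3, 4} with 4-byte elements would need 96 bytes under this assumption.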
-
using model_runtime::InputDesc = DataDesc
Description of input data required by ModelRunner.
-
using model_runtime::OutputDesc = DataDesc
Description of output data required by ModelRunner.
-
using model_runtime::ReplicaIdToOutputMemoryView = std::unordered_map<unsigned, OutputMemoryView>
Mapping replicas to the allocated memory defined for the output tensors required by ModelRunner.
-
using model_runtime::ReplicaIdToDevice = std::unordered_map<unsigned, std::shared_ptr<Device>>
Mapping replicas to physical entities that can execute the IPU programs.
-
struct ModelRunnerConfig
ModelRunner configuration options.
Public Members
-
unsigned replication_factor = 1
Number of replicas to be created.
-
bool run_save_programs = false
If true, save programs will be called during ModelRunner instance destruction.
If false, save programs will not be called. Default: false.
-
bool thread_safe = false
If true, the mutex will be locked on each execution call.
If false, the mutex will not be locked. By default the model runner is not thread-safe and each replica has an independent mutex. Default: false.
-
InputMemoryView frozen_inputs = {}
Map of the user-data required by the model during the loading onto hardware phase as well as any data that will be considered as constant during model execution.
It allows for overwriting of tensor data saved inside a PopEF file.
-
ReplicaIdToOutputMemoryView replica_to_save_programs_outputs = {}
Mapping between replica ID and OutputMemoryView.
Allows the user to get data returned by save programs. The PopEF data format allows for the creation of “save programs” that will be executed during ModelRunner instance destruction, if required.
-
ReplicaIdToDevice replica_to_device = {}
Mapping between replica ID and Device.
Allows the user to set a specific device for a given replica. By default, the model runner assigns devices automatically.
-
DeviceWaitConfig device_wait_config = {}
By default, the model runner throws an exception when it is not able to attach to any device required by the given model.
This behavior can be changed by setting a custom DeviceWaitConfig.
-
bool check_package_hash = true
If true, the Poplar hash will be checked before the executable is loaded onto the device.
If false, this check is not done. Default: true.
-
std::chrono::nanoseconds timeout_ns = std::chrono::seconds(5)
Duration in nanoseconds to wait before calling the timeout callback when the IPU is waiting for input data that is not yet available.
If 0, never call the timeout, in other words, wait forever for the data.
-
bool validate_io_params = true
If true, the I/O parameters will be checked when the ModelRunner execute functions are called.
If false, this check is not done. Default: true.
-
class ModelRunner
Inference model abstraction.
The model runner creates a session, manages queues, runs Poplar executable programs as well as allows executing inference models synchronously and asynchronously.
Public Functions
-
ModelRunner(const ModelRunner&) = delete
-
ModelRunner &operator=(const ModelRunner &other) = delete
-
ModelRunner(ModelRunner&&) = default
Default move constructor.
-
ModelRunner &operator=(ModelRunner&&) = default
Default move assignment operator.
-
explicit ModelRunner(const std::string &popef_path, const ModelRunnerConfig &config = ModelRunnerConfig{})
Create a new ModelRunner object.
- Parameters
popef_path – The path to the PopEF file from which the model will be loaded.
config – The model runner configuration.
-
explicit ModelRunner(const std::vector<std::string> &popef_paths, const ModelRunnerConfig &config = ModelRunnerConfig{})
Create a new ModelRunner object.
- Parameters
popef_paths – Paths to PopEF files from which the model will be loaded.
config – The model runner configuration.
Create a new ModelRunner object.
- Parameters
model – The model which will be loaded and run.
config – The model runner configuration.
-
~ModelRunner()
Default destructor.
-
OutputMemory execute(const InputMemoryView &input_data, unsigned replica_id = 0)
Run model synchronously.
Allocate output memory internally.
- Parameters
input_data – [in] The user-allocated tensor buffer for all executable input tensors.
replica_id – [in] The user-selected replica that will execute computations. Must be less than replication_factor provided in ModelRunnerConfig.
- Returns
The output memory allocated by the library for all executable output tensors.
-
void execute(const InputMemoryView &input_data, const OutputMemoryView &output_data, unsigned replica_id = 0)
Run a model synchronously.
The user allocates and passes pointers to output memory.
- Parameters
input_data – [in] The user-allocated tensor buffer for all executable input tensors.
output_data – [in] The user-allocated tensor buffer for all executable output tensors.
replica_id – [in] The user-selected replica that will execute computations. Must be less than replication_factor provided in ModelRunnerConfig.
-
OutputFutureMemory executeAsync(const InputMemoryView &input_data, unsigned replica_id = 0)
Run a model asynchronously.
Allocate output memory internally.
- Parameters
input_data – [in] The user-allocated tensor buffer for all executable input tensors.
replica_id – [in] The user-selected replica that will execute computations. Must be less than replication_factor provided in ModelRunnerConfig.
- Returns
The future result of an asynchronous call for all executable output tensors.
-
OutputFutureMemoryView executeAsync(const InputMemoryView &input_data, const OutputMemoryView &output_data, unsigned replica_id = 0)
Run a model asynchronously.
The user allocates and passes pointers to output memory.
- Parameters
input_data – [in] The user-allocated tensor buffer for all executable input tensors.
output_data – [in] The user-allocated tensor buffer for all executable output tensors.
replica_id – [in] The user-selected replica that will execute computations. Must be less than replication_factor provided in ModelRunnerConfig.
- Returns
The future result of an asynchronous call for all executable output tensors.
-
std::vector<InputDesc> getExecuteInputs() const
Get a description of the input data required in the execute class methods.
- Returns
A vector of DataDesc instances.
-
std::vector<OutputDesc> getExecuteOutputs() const
Get a description of the output data required in the execute class methods.
- Returns
A vector of DataDesc instances.
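The descriptions returned by getExecuteInputs() can be compared with the buffers a caller intends to pass to execute. The following is a minimal sketch with local stand-in types, not the library's own validation: checkInputs is a hypothetical helper, and the two checks it makes (every described tensor is present, and its buffer has the described size) are assumptions about what a caller would want to verify.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

namespace sketch {

// Reduced stand-ins for DataDesc and ConstTensorMemoryView.
struct DataDesc {
  std::string name;
  int64_t size_in_bytes;
};

struct ConstView {
  const void *data;
  uint64_t data_size_bytes;
};

using InputMemoryView = std::unordered_map<std::string, ConstView>;

// Hypothetical helper: returns an empty string when every described input is
// present with the described size, otherwise a message naming the problem.
inline std::string checkInputs(const std::vector<DataDesc> &descs,
                               const InputMemoryView &inputs) {
  for (const auto &desc : descs) {
    const auto it = inputs.find(desc.name);
    if (it == inputs.end()) return "missing input: " + desc.name;
    if (it->second.data_size_bytes !=
        static_cast<uint64_t>(desc.size_in_bytes))
      return "wrong size for input: " + desc.name;
  }
  return "";
}

}  // namespace sketch
```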
-
std::vector<InputDesc> getModelInputs() const
Get a description of all the user-provided input data.
In addition to the data used by the execute calls, it will return a description of all tensors used by the model which must be provided during the phase of loading the model onto the device. The data required for the additional tensors may be included in PopEF files. In this case, the data for the additional tensors is loaded automatically by ModelRunner.
- Returns
A vector of DataDesc instances.
-
std::vector<OutputDesc> getModelOutputs() const
Get a description of all the user-provided output data.
In addition to the data used by the execute calls, it will return a list of descriptions of all tensors used by the model that the loading phase requires (weights tensors as an example). The data for these additional tensors can be included in PopEF files that are loaded automatically by the ModelRunner.
- Returns
The vector of DataDesc instances.
10.2. Low level API
10.2.1. Anchor callback management
-
struct CallbackInfo
Information passed to CallbackFactory.
-
using model_runtime::CallbackHandle = std::function<void(void*)>
The callback function called whenever the device reads from or writes to the stream.
The memory location will only be valid for reading or writing for the duration of the callback.
-
using model_runtime::CallbackFactory = std::function<poplar::StreamCallbackHandle(const CallbackInfo &info)>
Factory to create a callback for the given callback information.
-
enum class model_runtime::PopefDataUsagePolicy
The policy of PopEF TensorData and TensorFeed usage for Anchors callbacks creation.
Values:
-
enumerator USE_POPEF_DATA_IF_ANY = 0
Utilize the TensorData and the TensorFeed stored in the model’s PopEF to implicitly create callbacks for the Anchors, for which the data exists.
-
enumerator USE_USER_DATA
Don’t use the data stored in the PopEF.
Let the user bind their own data source.
-
using model_runtime::PopefDataUsagePredicate = std::function<PopefDataUsagePolicy(const popef::Anchor&)>
PopefDataUsagePredicates are used to control the PopEF’s tensor or feed data usage while creating callbacks for Anchors.
For more information, see the description of anchors in the PopEF User Guide.
-
static const PopefDataUsagePredicate model_runtime::null_popef_data_usage_predicate = {}
-
enum class model_runtime::AnchorCallbackPolicy
Policy to handle anchor callback.
Values:
-
enumerator BIND_USER_CB = 0
Bind user callback to anchor.
-
enumerator BIND_EMPTY_CB
Bind empty (dummy) callback to anchor.
-
enumerator SKIP_CB
Skip binding a callback to the anchor.
-
using model_runtime::AnchorCallbackPredicate = std::function<AnchorCallbackPolicy(const popef::Anchor&)>
AnchorCallbackPredicates are used to control the callback creation policy for an individual Anchor.
For more information, see the description of anchors in the PopEF User Guide.
-
static const AnchorCallbackPredicate model_runtime::null_anchor_callback_predicate = {}
-
namespace predicate_factory
Predefined callback predicates.
A set of basic predicates to control the handling of PopEF data usage or anchor callbacks.
-
template<typename Policy>
class AnchorWithPolicy
The desired anchor with callback handling policy.
- Template Parameters
Policy – The policy to be bound to an anchor.
-
template<typename Policy>
class ProgramsWithPolicy
Program indices with desired callback handling policy.
- Template Parameters
Policy – The policy to be bound to the programsIndexes.
Public Members
-
const std::vector<popef::ProgramFlow::ProgramIndexType> &programsIndexes
A set of indices to named programs.
-
namespace anchor_callbacks
Functions
-
AnchorCallbackPredicate predProgramFlowLoad(const popef::ProgramFlow &flow, AnchorCallbackPolicy accept_policy = AnchorCallbackPolicy::BIND_USER_CB, AnchorCallbackPolicy reject_policy = AnchorCallbackPolicy::BIND_EMPTY_CB)
Callback predicate to filter all anchors owned by any load programs.
- Parameters
flow – [in] The user model PopEF program flow (to read load program numbers from).
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
AnchorCallbackPredicate predProgramFlowMain(const popef::ProgramFlow &flow, AnchorCallbackPolicy accept_policy = AnchorCallbackPolicy::BIND_USER_CB, AnchorCallbackPolicy reject_policy = AnchorCallbackPolicy::BIND_EMPTY_CB)
Callback predicate to filter all anchors owned by the main program.
- Parameters
flow – [in] The user model PopEF program flow (to read main program number from).
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
AnchorCallbackPredicate predProgramFlowSave(const popef::ProgramFlow &flow, AnchorCallbackPolicy accept_policy = AnchorCallbackPolicy::BIND_USER_CB, AnchorCallbackPolicy reject_policy = AnchorCallbackPolicy::BIND_EMPTY_CB)
Callback predicate to filter all anchors owned by any save programs.
- Parameters
flow – [in] The user model PopEF program flow (to read save program numbers from).
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
AnchorCallbackPredicate predProgramNotAssigned(AnchorCallbackPolicy accept_policy = AnchorCallbackPolicy::BIND_USER_CB, AnchorCallbackPolicy reject_policy = AnchorCallbackPolicy::BIND_EMPTY_CB)
Callback predicate to filter all anchors that are not assigned to any programs.
- Parameters
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
AnchorCallbackPredicate predProgramIndexes(const std::vector<popef::ProgramFlow::ProgramIndexType> &program_indexes, AnchorCallbackPolicy accept_policy = AnchorCallbackPolicy::BIND_USER_CB, AnchorCallbackPolicy reject_policy = AnchorCallbackPolicy::BIND_EMPTY_CB)
Callback predicate to filter all anchors owned by any of the programs passed in by the user.
- Parameters
program_indexes – [in] The program indices to filter.
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
AnchorCallbackPredicate predProgramIndexes(const std::vector<ProgramsWithPolicy<AnchorCallbackPolicy>> &accepted_programs_policies, const AnchorCallbackPolicy reject_policy = AnchorCallbackPolicy::BIND_EMPTY_CB)
Callback predicate to apply defined anchor callback handling policies to grouped program indices.
- Parameters
accepted_programs_policies – The program indices with defined anchor callback handling policies.
reject_policy – The anchor rejection policy.
- Returns
The anchor callback predicate.
-
AnchorCallbackPredicate predAnchorsPolicies(const std::vector<AnchorWithPolicy<AnchorCallbackPolicy>> &accepted_anchors_policies, const AnchorCallbackPolicy reject_policy = AnchorCallbackPolicy::BIND_EMPTY_CB)
Callback predicate to apply anchor handling policies.
- Parameters
accepted_anchors_policies – [in] The anchor indices with handling policies.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
template<typename ...Args>
AnchorCallbackPredicate orBind(AnchorCallbackPolicy accept_policy, AnchorCallbackPolicy reject_policy, Args&&... pred)
Disjunction operator.
Combines multiple predicates into one Predicate.
- Parameters
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
pred – [in] The predicates which will be combined by operator.
- Returns
The anchor callback predicate which returns accept_policy when one of the passed predicates returns accept_policy; otherwise it returns reject_policy.
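How a predicate maps an anchor to a policy, and how orBind combines predicates, can be sketched with plain std::function objects. Everything in namespace sketch is a local stand-in: Anchor is a hypothetical, reduced replacement for popef::Anchor (only a name and a list of program indices), and the function bodies are one possible reading of the descriptions above, not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

enum class AnchorCallbackPolicy { BIND_USER_CB = 0, BIND_EMPTY_CB, SKIP_CB };

// Hypothetical, reduced stand-in for popef::Anchor.
struct Anchor {
  std::string name;
  std::vector<int> programs;  // indices of the programs that own this anchor
};

using AnchorCallbackPredicate =
    std::function<AnchorCallbackPolicy(const Anchor &)>;

// Accept anchors owned by any of the given programs.
inline AnchorCallbackPredicate predProgramIndexes(
    std::vector<int> program_indexes, AnchorCallbackPolicy accept_policy,
    AnchorCallbackPolicy reject_policy) {
  return [=](const Anchor &anchor) -> AnchorCallbackPolicy {
    for (int program : anchor.programs) {
      if (std::find(program_indexes.begin(), program_indexes.end(),
                    program) != program_indexes.end())
        return accept_policy;
    }
    return reject_policy;
  };
}

// Accept anchors that are not assigned to any program.
inline AnchorCallbackPredicate predProgramNotAssigned(
    AnchorCallbackPolicy accept_policy, AnchorCallbackPolicy reject_policy) {
  return [=](const Anchor &anchor) -> AnchorCallbackPolicy {
    return anchor.programs.empty() ? accept_policy : reject_policy;
  };
}

// Disjunction: accept when at least one predicate returns accept_policy.
template <typename... Args>
AnchorCallbackPredicate orBind(AnchorCallbackPolicy accept_policy,
                               AnchorCallbackPolicy reject_policy,
                               Args &&...pred) {
  std::vector<AnchorCallbackPredicate> preds{
      AnchorCallbackPredicate(std::forward<Args>(pred))...};
  return [=](const Anchor &anchor) -> AnchorCallbackPolicy {
    for (const auto &p : preds) {
      if (p(anchor) == accept_policy) return accept_policy;
    }
    return reject_policy;
  };
}

}  // namespace sketch
```

Note that in this sketch the inner predicates must be built with the same accept_policy that is passed to orBind, since the combination compares each result against that value.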
-
namespace popef_data_usage
Functions
-
PopefDataUsagePredicate predProgramFlowLoad(const popef::ProgramFlow &flow, PopefDataUsagePolicy accept_policy = PopefDataUsagePolicy::USE_USER_DATA, PopefDataUsagePolicy reject_policy = PopefDataUsagePolicy::USE_POPEF_DATA_IF_ANY)
Callback predicate to filter all anchors owned by any load programs.
- Parameters
flow – [in] The user model PopEF program flow (to read load program numbers from).
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
PopefDataUsagePredicate predProgramFlowMain(const popef::ProgramFlow &flow, PopefDataUsagePolicy accept_policy = PopefDataUsagePolicy::USE_USER_DATA, PopefDataUsagePolicy reject_policy = PopefDataUsagePolicy::USE_POPEF_DATA_IF_ANY)
Callback predicate to filter all anchors owned by the main program.
- Parameters
flow – [in] The user model PopEF program flow (to read main program number from).
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
PopefDataUsagePredicate predProgramFlowSave(const popef::ProgramFlow &flow, PopefDataUsagePolicy accept_policy = PopefDataUsagePolicy::USE_USER_DATA, PopefDataUsagePolicy reject_policy = PopefDataUsagePolicy::USE_POPEF_DATA_IF_ANY)
Callback predicate to filter all anchors owned by any save programs.
- Parameters
flow – [in] The user model PopEF program flow (to read save program numbers from).
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
PopefDataUsagePredicate predProgramNotAssigned(PopefDataUsagePolicy accept_policy = PopefDataUsagePolicy::USE_USER_DATA, PopefDataUsagePolicy reject_policy = PopefDataUsagePolicy::USE_POPEF_DATA_IF_ANY)
Callback predicate to filter all anchors that are not assigned to any programs.
- Parameters
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
PopefDataUsagePredicate predProgramIndexes(const std::vector<popef::ProgramFlow::ProgramIndexType> &program_indexes, PopefDataUsagePolicy accept_policy = PopefDataUsagePolicy::USE_USER_DATA, PopefDataUsagePolicy reject_policy = PopefDataUsagePolicy::USE_POPEF_DATA_IF_ANY)
Callback predicate to filter all anchors owned by any of the programs passed in by the user.
- Parameters
program_indexes – [in] The program indices to filter.
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
PopefDataUsagePredicate predProgramIndexes(const std::vector<ProgramsWithPolicy<PopefDataUsagePolicy>> &accepted_programs_policies, const PopefDataUsagePolicy reject_policy = PopefDataUsagePolicy::USE_POPEF_DATA_IF_ANY)
Callback predicate to apply defined anchor callback handling policies to grouped program indices.
- Parameters
accepted_programs_policies – The program indices with defined anchor callback handling policies.
reject_policy – The anchor rejection policy.
- Returns
The anchor callback predicate.
-
PopefDataUsagePredicate predAnchorsPolicies(const std::vector<AnchorWithPolicy<PopefDataUsagePolicy>> &accepted_anchors_policies, const PopefDataUsagePolicy reject_policy = PopefDataUsagePolicy::USE_POPEF_DATA_IF_ANY)
Callback predicate to apply anchor handling policies.
- Parameters
accepted_anchors_policies – [in] The anchor indices with handling policies.
reject_policy – [in] The anchor rejection policy.
- Returns
The anchor callback predicate.
-
template<typename ...Args>
PopefDataUsagePredicate orBind(PopefDataUsagePolicy accept_policy, PopefDataUsagePolicy reject_policy, Args&&... pred)
Disjunction operator.
Combines multiple predicates into one Predicate.
- Parameters
accept_policy – [in] The anchor acceptance policy.
reject_policy – [in] The anchor rejection policy.
pred – [in] The predicates to be combined by the operator.
- Returns
The anchor callback predicate, which returns accept_policy when one of the passed predicates returns an accept_policy; otherwise it returns reject_policy.
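The disjunction described for orBind() can be illustrated with a minimal, self-contained sketch. Everything in it is a stand-in: Policy, Anchor, Predicate, inProgram and orBindSketch are hypothetical names, not the Model Runtime types, and the sketch assumes that a sub-predicate "accepts" an anchor when it returns the same value as accept_policy.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in types: simplified placeholders, not the Model Runtime definitions.
enum class Policy { USE_USER_DATA, USE_POPEF_DATA_IF_ANY };
struct Anchor {
  std::string name;
  int program; // index of the program owning this anchor (assumption)
};
using Predicate = std::function<Policy(const Anchor &)>;

// Hypothetical predicate: accept anchors owned by a single program.
inline Predicate inProgram(int program) {
  return [program](const Anchor &a) -> Policy {
    return a.program == program ? Policy::USE_USER_DATA
                                : Policy::USE_POPEF_DATA_IF_ANY;
  };
}

// Hypothetical combinator: yields accept_policy if any predicate returns
// accept_policy for the anchor, otherwise yields reject_policy.
template <typename... Preds>
Predicate orBindSketch(Policy accept_policy, Policy reject_policy,
                       Preds &&...preds) {
  std::vector<Predicate> list{Predicate(std::forward<Preds>(preds))...};
  return [accept_policy, reject_policy, list](const Anchor &a) -> Policy {
    for (const Predicate &p : list) {
      if (p(a) == accept_policy) return accept_policy;
    }
    return reject_policy;
  };
}

// Usage: accept anchors owned by program 0 or program 2.
inline void orBindSketchUsage() {
  Predicate p = orBindSketch(Policy::USE_USER_DATA,
                             Policy::USE_POPEF_DATA_IF_ANY,
                             inProgram(0), inProgram(2));
  assert(p(Anchor{"a", 2}) == Policy::USE_USER_DATA);
  assert(p(Anchor{"b", 1}) == Policy::USE_POPEF_DATA_IF_ANY);
}
```

The predicates are stored in a vector rather than expanded with a fold expression so that the sketch also compiles with pre-C++17 compilers.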
-
PopefDataUsagePredicate predProgramFlowLoad(const popef::ProgramFlow &flow, PopefDataUsagePolicy accept_policy = PopefDataUsagePolicy::USE_USER_DATA, PopefDataUsagePolicy reject_policy = PopefDataUsagePolicy::USE_POPEF_DATA_IF_ANY)
-
template<typename Policy>
10.2.2. Queue memory management
-
class IMemoryPool
A common interface for all memory allocators.
Subclassed by model_runtime::RingMemoryPool
-
class RingMemoryPool : public model_runtime::IMemoryPool
Memory pool of fixed size blobs.
Allocate the requested number of blobs at construction time and loop over the blobs every time getMemoryBlob() is called.
Public Functions
-
RingMemoryPool(int64_t blob_size, int64_t num_blobs)
Create a ring memory pool.
It allocates memory of num_blobs * blob_size size under the hood.
- Parameters
blob_size – [in] The size of a single memory blob.
num_blobs – [in] The number of memory blobs.
-
int64_t numBlobs() const
Get the number of memory blobs.
-
virtual int64_t blobSize() const override
Get the size of a memory blob.
-
virtual void *getMemoryBlob() override
Get a pointer to the next blob.
Note
When the end of the memory pool is reached, the iteration starts from the beginning again.
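The wrap-around behaviour of getMemoryBlob() can be pictured with a small stand-in class. SketchRingPool is a hypothetical name and this is not the library implementation; in particular, starting the iteration at the first blob and backing the pool with a single std::vector are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a ring memory pool of fixed-size blobs (not the real class).
class SketchRingPool {
public:
  // Assumes blob_size > 0 and num_blobs > 0.
  SketchRingPool(int64_t blob_size, int64_t num_blobs)
      : blob_size_(blob_size), num_blobs_(num_blobs),
        storage_(static_cast<std::size_t>(blob_size * num_blobs)) {}

  int64_t numBlobs() const { return num_blobs_; }
  int64_t blobSize() const { return blob_size_; }

  // Return the next blob and advance, wrapping to the start at the end.
  void *getMemoryBlob() {
    void *blob = storage_.data() + next_ * blob_size_;
    next_ = (next_ + 1) % num_blobs_;
    return blob;
  }

private:
  int64_t blob_size_;
  int64_t num_blobs_;
  std::vector<uint8_t> storage_; // num_blobs * blob_size bytes
  int64_t next_ = 0;
};

// Usage: with three blobs, the fourth request returns the first blob again.
inline void sketchRingPoolUsage() {
  SketchRingPool pool(16, 3);
  void *first = pool.getMemoryBlob();
  pool.getMemoryBlob();
  pool.getMemoryBlob();
  assert(pool.getMemoryBlob() == first);
}
```

Because blobs are reused in a cycle, a caller of such a pool must be finished with a blob before the pool comes round to it again.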
10.2.3. Queue management
-
template<typename BufferType>
class SpscRingBuffer
A lock-free, fixed-size, single-producer, single-consumer ring buffer implementation.
    auto dst = rb.writeLock();
    if (!rb.isValid()) {
      // dst is not valid: don't dereference it.
      return;
    }
    *dst = obj;
    rb.writeComplete();
Note
writeLock() and readLock() are blocking calls which might return early if the ring buffer gets invalidated, so the user should always use isValid() after locking a buffer to check whether the returned buffer is safe to use or not.
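The lock, check isValid(), then use pattern can be exercised with a simplified stand-in. SketchSpscRing, tryWrite and tryRead are hypothetical names; the sketch spins instead of sleeping, allows only one read-locked buffer at a time, has no timeout callback, and assumes num_buffers is greater than zero. It is meant to show why the validity check is needed, not how the library implements the ring buffer.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified single-producer, single-consumer ring (stand-in, not the API).
template <typename T>
class SketchSpscRing {
public:
  explicit SketchSpscRing(std::size_t num_buffers)
      : slots_(num_buffers), head_(0), tail_(0), valid_(true) {}

  // Spin until a slot is free or the ring is invalidated.
  T *writeLock() {
    while (valid_.load() && tail_.load() - head_.load() == slots_.size()) {}
    return &slots_[tail_.load() % slots_.size()];
  }
  void writeComplete() { tail_.fetch_add(1); }

  // Spin until a written slot exists or the ring is invalidated.
  const T &readLock() {
    while (valid_.load() && head_.load() == tail_.load()) {}
    return slots_[head_.load() % slots_.size()];
  }
  void readComplete() { head_.fetch_add(1); }

  void invalidate() { valid_.store(false); }
  bool isValid() const { return valid_.load(); }

private:
  std::vector<T> slots_;
  std::atomic<std::size_t> head_; // next slot to read
  std::atomic<std::size_t> tail_; // next slot to write
  std::atomic<bool> valid_;
};

// Producer side: returns false if the ring was invalidated while waiting.
template <typename T>
bool tryWrite(SketchSpscRing<T> &rb, const T &obj) {
  T *dst = rb.writeLock();
  if (!rb.isValid()) return false; // dst must not be dereferenced
  *dst = obj;
  rb.writeComplete();
  return true;
}

// Consumer side: returns false if the ring was invalidated while waiting.
template <typename T>
bool tryRead(SketchSpscRing<T> &rb, T &out) {
  const T &src = rb.readLock();
  if (!rb.isValid()) return false; // src must not be used
  out = src;
  rb.readComplete();
  return true;
}

// Usage: after invalidate(), a read on an empty ring returns early.
inline void sketchSpscRingUsage() {
  SketchSpscRing<int> rb(2);
  int v = 0;
  assert(tryWrite(rb, 7));
  assert(tryRead(rb, v) && v == 7);
  rb.invalidate();
  assert(!tryRead(rb, v));
}
```

Without the isValid() check, the final tryRead() in the usage function would copy from a slot that holds no new data.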
Public Types
-
using ReadTimeoutCallback = std::function<void(SpscRingBuffer*)>
Signature of the function to call when readLock() times out.
Public Functions
-
SpscRingBuffer(const SpscRingBuffer &other) = delete
-
SpscRingBuffer(const SpscRingBuffer &&other) noexcept = delete
-
SpscRingBuffer &operator=(const SpscRingBuffer &other) = delete
-
SpscRingBuffer &operator=(SpscRingBuffer &&other) noexcept = delete
-
SpscRingBuffer(std::size_t num_buffers, const std::string &label, ReadTimeoutCallback timeout_cb = nullptr, std::chrono::nanoseconds timeout_ns = std::chrono::nanoseconds::zero())
Create a single-producer, single-consumer ring buffer.
- Parameters
num_buffers – [in] The number of buffers to use in the ring buffer.
label – [in] The debug string to use in printState().
timeout_cb – [in] The function to call when a read times out.
timeout_ns – [in] The duration in nanoseconds to wait before calling the timeout callback when no read input is available. If 0, never call the callback.
-
~SpscRingBuffer()
Default destructor.
-
void write(const BufferType &obj)
Lock the ring buffer, write to it and unlock it.
- Parameters
obj – [in] The object to write into the ring buffer.
-
BufferType *writeLock()
Lock and return a buffer for writing.
Only one buffer can be locked for writing at any time. Calling writeLock() when a buffer is already locked will return the same buffer.
If no buffer is available, then the function will block until either:
An existing buffer becomes available.
The ring buffer is invalidated.
Note
isValid() must be used to determine whether the returned buffer is valid or not.
- Returns
A buffer to write to.
-
void writeComplete()
Unlock the currently write-locked buffer.
- Pre
A buffer must have been locked for writing using writeLock().
- Post
The next time writeLock() is called, a different buffer will be returned.
-
const BufferType &readLock()
Lock a buffer for read access.
If no buffer is available the function will block until either:
A buffer becomes available (writeComplete() is called from another thread)
The ring buffer is invalidated.
Some buffers are read-locked and readReset() is called.
Several buffers can be locked in reading mode and each call to readLock() will return a new buffer.
If timeout_ns is greater than zero and a timeout callback was provided, then once readLock() has been waiting for a buffer for longer than timeout_ns, the callback function is called until a new read buffer becomes available.
Note
isValid() must be used to determine whether the returned buffer is valid or not.
-
void readComplete()
Unlock the oldest read-locked buffer.
- Pre
A buffer must have been locked for reading using readLock().
-
void readReset()
All the buffers currently locked for reading are unlocked and placed back at the front of the reading queue.
-
bool readAvailable() const
Check whether any buffer is available to be read-locked.
- Returns
True if there is at least one buffer available to be read-locked, false if there are no buffers available.
-
void invalidate()
Invalidate the ring buffer.
All calls made after this one become non-blocking.
All the objects returned by calls on an invalidated ring buffer are invalid and should be discarded / ignored.
-
void reset()
Reset all ring buffer values to their initial state.
-
bool isValid() const
Check the state of the ring buffer.
- Returns
True if the ring buffer is in a valid state, or false if it was invalidated.
-
std::string getState(const std::string &prefix) const
Debug function to print the current state of the ring buffer.
- Parameters
prefix – [in] The string to prefix the state with.
- Returns
The current state of the ring buffer.
-
std::size_t numBuffers() const
Return the maximum number of elements the ring buffer can store.
-
const std::string &label() const
Label associated with this ring buffer.
-
class IQueue
Common interface implemented by various queues.
The interface describes the memory requirements of the queue and provides an interface to disconnect the queue from its data source.
Subclassed by model_runtime::InputQueue, model_runtime::OutputQueue
Public Functions
-
virtual ~IQueue() = default
Default destructor.
-
virtual const popef::TensorInfo &tensorInfo() const = 0
Get the shape and data type of a tensor.
Each buffer in the queue has the same type and shape.
- Returns
The structure encapsulating the shape and data type of a tensor.
-
virtual int64_t numBuffers() const = 0
Get the number of buffers this queue can store.
-
virtual void disconnect() = 0
Disconnect the queue ring buffers: no longer wait for data.
All queue ring buffers are invalidated and immediately return from any blocking calls.
Disconnected queues can no longer be used to feed real data.
Typically disconnect() is used at shutdown to feed dummy data to the executable until it returns from its run() method and can be safely destroyed.
-
RingMemoryPool model_runtime::allocateQueueStorage(const IQueue &queue, int64_t extra_buffers = 0)
Allocate a memory pool large enough to back the given queue.
- Parameters
queue – [in] The queue the memory pool will be used to feed.
extra_buffers – [in] The number of extra buffers to allocate in addition to the queue’s requirements.
- Returns
A memory pool.
-
using model_runtime::ReadStartCallback = std::function<void(void)>
Signature of the function called just before the first chunk of data is about to be transferred to Poplar to build a complete input for the executable.
-
using model_runtime::ReadCompleteCallback = ReadStartCallback
Signature of the function called when the data for a complete model input is about to be consumed by the executable.
-
using model_runtime::WriteCompleteCallback = std::function<void(void)>
Signature of the function called after the data has been written.
-
struct InputData
Structure representing queue input data.
Public Members
-
const uint8_t *data = {nullptr}
Pointer to the buffer containing the data to read.
-
int64_t data_size = {0}
Size in bytes of the data.
-
ReadStartCallback readStartCallback = {nullptr}
Optional function to call just before the first chunk of data is about to be fetched to finally build a complete model input.
Note
The callback might be called more than once if the data is prefetched, then discarded and fetched again.
-
ReadCompleteCallback readCompleteCallback = {nullptr}
Optional function to call when the data for a complete model input is about to be consumed by the executable.
Note
The callback might be called more than once if the data is prefetched, then discarded and fetched again.
-
struct OutputData
Structure representing queue output data.
Public Members
-
uint8_t *data = {nullptr}
Where to write the output.
-
int64_t data_size = {0}
Amount of data to write in bytes.
-
WriteCompleteCallback writeCompleteCallback = {nullptr}
Optional function to call after the data has been written.
Note
The callback might be called more than once if the data is prefetched, then discarded and fetched again.
-
using model_runtime::InputRingBuffer = model_runtime::SpscRingBuffer<InputData>
Fixed size, single producer, single consumer ring buffer for input data.
-
using model_runtime::OutputRingBuffer = model_runtime::SpscRingBuffer<OutputData>
Fixed size, single producer, single consumer ring buffer for output data.
-
class InputQueue : public model_runtime::IQueue
Pack or split the data passed by the user to match the amount of data expected by the executable.
For example if the model was compiled to process an input tensor of size [256, 48, 48], which means samples of 48x48 across a batch size of 256, then the application can enqueue 48x48 tensors of any batch size and this queue will take care of either regrouping the inputs into a single run or splitting them across several runs.
Note
It is the user’s responsibility to ensure the data size enqueued is a multiple of a single sample size.
Note
InputQueue cannot be instantiated directly; it is created by QueueManager.
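The regrouping and splitting described above comes down to simple arithmetic on sample counts. The sketch below is only an illustration: RunPlan, planRuns and isWholeSamples are hypothetical helpers, and the 4-byte element size used in the usage function is an assumption (the tensor data type is not stated in the example above).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Result of mapping user batches onto runs of a fixed model batch size.
struct RunPlan {
  int64_t full_runs;       // runs that can be filled completely
  int64_t pending_samples; // samples left over in a partial input
};

// Hypothetical helper: total the enqueued samples and divide by the
// batch size the model was compiled for (assumed to be > 0).
inline RunPlan planRuns(int64_t model_batch,
                        const std::vector<int64_t> &enqueued_batches) {
  int64_t total = 0;
  for (int64_t b : enqueued_batches) total += b;
  return RunPlan{total / model_batch, total % model_batch};
}

// Hypothetical check for the "multiple of a single sample size" rule.
inline bool isWholeSamples(int64_t data_size, int64_t sample_bytes) {
  return sample_bytes > 0 && data_size % sample_bytes == 0;
}

// Usage: a model compiled for [256, 48, 48] fed batches of 100 and 300.
inline void planRunsUsage() {
  const RunPlan plan = planRuns(256, {100, 300});
  assert(plan.full_runs == 1);         // first 256 samples form one run
  assert(plan.pending_samples == 144); // 400 - 256 wait for more data
  const int64_t sample_bytes = 48 * 48 * 4; // assumes 4-byte elements
  assert(isWholeSamples(3 * sample_bytes, sample_bytes));
  assert(!isWholeSamples(100, sample_bytes));
}
```

In this picture, the pending samples are the "partial input" that flush() deals with.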
Public Functions
Create an input queue.
- Parameters
buffer – [in] The target ring buffer connected to a Poplar callback.
info – [in] The buffer description.
-
InputData *enqueueLock()
Lock a buffer for writing.
This is a blocking call and only one buffer can be locked for writing at any time. Calling enqueueLock() when a buffer is already locked will return the same buffer.
- Returns
A pointer to an InputData object to fill.
-
void enqueueComplete()
Unlock the current write-locked buffer.
- Pre
A buffer must have been locked for writing using enqueueLock().
- Post
The next time enqueueLock() is called, a different buffer will be returned.
-
void enqueue(const void *input, int64_t data_size, ReadStartCallback read_start_callback = nullptr, ReadCompleteCallback read_complete_callback = nullptr)
Convenience method to lock a buffer for writing, fill InputData and unlock the buffer.
Note
The callback might be called more than once if the data is prefetched, then discarded and fetched again.
- Parameters
input – [in] The address of the buffer containing the data. The data is not copied to a buffer, so the pointer must remain valid until the data has been used by the ring buffer consumer.
data_size – [in] The size in bytes to use from the input. Must be a multiple of single sample size.
read_start_callback – [in] An optional callback to call when the data starts being read.
read_complete_callback – [in] An optional callback to call when the data read is complete.
-
void flush()
Flush any partial input still being created or enqueue a dummy batch.
-
void reset()
Reset underlying ring buffer.
-
virtual void disconnect() override
Parent interface method.
-
virtual const popef::TensorInfo &tensorInfo() const override
Parent interface method.
-
virtual int64_t numBuffers() const override
Parent interface method.
-
class OutputQueue : public model_runtime::IQueue
Pack or split the data returned by the executable to the application’s batches.
See also
Note
It is the user’s responsibility to ensure the data size enqueued is a multiple of the output sample size.
Note
The batch size used in OutputQueue must match the one enqueued to InputQueue.
Note
OutputQueue cannot be instantiated directly; it is created by QueueManager.
Public Functions
Create an output queue.
- Parameters
buffer – [in] The source ring buffer connected to a Poplar callback.
info – [in] The buffer description.
-
OutputData *enqueueLock()
Blocking call to lock a buffer for reading.
Only one buffer can be locked for reading at any time. Calling enqueueLock() when a buffer is already locked will return the same buffer.
Note
This is a queue of pointers/addresses (not data). The content of the buffer will be overwritten after enqueueComplete() is called.
- Returns
A pointer to the buffer containing the data to be read.
-
void enqueueComplete()
Unlock the current read-locked buffer.
- Pre
A buffer must have been locked for reading using enqueueLock().
- Post
The next time enqueueLock() is called, a different buffer will be returned.
-
void enqueue(void *output, int64_t data_size, WriteCompleteCallback write_complete_callback = nullptr)
Convenience method to lock a buffer for reading, fill OutputData and unlock the buffer.
- Parameters
output – [in] The address of the buffer to be read. The pointer must remain valid until the data has been copied by the ring buffer producer.
data_size – [in] The size in bytes to copy to the output. Must be a multiple of the single sample size.
write_complete_callback – [in] An optional callback to call when the output has been filled.
-
void flush()
Flush any partial output still being created or, if there isn’t any, enqueue a handler for the dummy output produced by the dummy input enqueued by the corresponding InputQueue flush() method.
-
void reset()
Reset underlying ring buffer.
-
virtual void disconnect() override
Parent interface method.
-
virtual const popef::TensorInfo &tensorInfo() const override
Parent interface method.
-
virtual int64_t numBuffers() const override
Parent interface method.
-
using model_runtime::RingSizeMultiplierProdType = std::function<int64_t(const popef::Anchor&)>
A multiplier of the ring size of the producer data type.
A factory function type used to produce a ring size multiplier for a specific popef::Anchor. This kind of functor can be defined by the user to control the size of the QueueManager ring buffer (see the SpscRingBuffer class). For example, for tensors loaded by the program only once (such as weights or other tensors fetched from the host in a Load program), the ring buffer size can be set to 1, as there is no need to prefetch more values from the user.
-
class QueueManager
Create and manage queues related to a session.
Note
Queues currently only work for models with replica == 1.
Public Functions
-
QueueManager(const QueueManager&) = delete
-
QueueManager(QueueManager&&) = delete
-
QueueManager &operator=(const QueueManager &other) = delete
-
QueueManager &operator=(QueueManager&&) = delete
-
~QueueManager() = default
Default destructor.
-
InputQueue &inputQueue(const std::string &name)
Get the input queue of the named tensor or stream.
- Parameters
name – [in] The name of the tensor or stream.
-
OutputQueue &outputQueue(const std::string &name)
Get the output queue of the named tensor or stream.
- Parameters
name – [in] The name of the tensor or stream.
-
void flushAll()
Call flush() on all the queues.
-
void disconnectAll()
Disconnect all the queues from their ring buffers.
-
void resetAll()
Reset all the queues after session stop.
Public Members
-
std::map<std::string, InputQueue> inputs
Map the user input streams to their input queues.
-
std::map<std::string, OutputQueue> outputs
Map the user output streams to their output queues.
-
namespace ring_size_multiplier_factory
Functions
-
RingSizeMultiplierProdType ringSizeMultForProgs(const std::vector<popef::ProgramFlow::ProgramIndexType> &programs, int64_t selected_ring_size_multiplier, int64_t others_ring_size_multiplier = 1)
Factory function returning the selected ring buffer size multiplier for Anchors “owned” by the set of programs (programs that fetch Anchor data from the host).
For the remaining anchors the other value is returned. This function can be passed to the QueueManager constructor to control the sizes of its internal ring buffers created for model anchors.
- Parameters
programs – [in] The programs “owning” the anchors of interest.
selected_ring_size_multiplier – [in] The ring buffer size multiplier for anchors owned by the programs.
others_ring_size_multiplier – [in] The ring buffer size multiplier for remaining anchors. Note: It is preferred to set the ring buffer multiplier to 1 for anchors from the Load or Save ProgramFlow to reduce memory.
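The selection logic described for ringSizeMultForProgs() can be sketched with stand-in types. SketchAnchor, RingSizeMultiplier and ringSizeMultForProgsSketch are hypothetical names, program indices are modelled as plain int, and the assumption is that an anchor is "owned" by a program when that program's index appears in the anchor's program list.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Stand-in for popef::Anchor: a name plus the programs that use it.
struct SketchAnchor {
  std::string name;
  std::vector<int> programs;
};

// Stand-in for RingSizeMultiplierProdType.
using RingSizeMultiplier = std::function<int64_t(const SketchAnchor &)>;

// Hypothetical factory: anchors used by any of `programs` get the selected
// multiplier, all remaining anchors get the other one.
inline RingSizeMultiplier
ringSizeMultForProgsSketch(const std::vector<int> &programs,
                           int64_t selected_multiplier,
                           int64_t others_multiplier = 1) {
  const std::set<int> owned(programs.begin(), programs.end());
  return [owned, selected_multiplier,
          others_multiplier](const SketchAnchor &a) -> int64_t {
    for (int p : a.programs) {
      if (owned.count(p) != 0) return selected_multiplier;
    }
    return others_multiplier;
  };
}

// Usage: deep rings for anchors of programs 1 and 2, single slot otherwise.
inline void ringSizeMultUsage() {
  const RingSizeMultiplier mult = ringSizeMultForProgsSketch({1, 2}, 8);
  assert(mult(SketchAnchor{"input", {2}}) == 8);
  assert(mult(SketchAnchor{"weights", {0}}) == 1);
}
```

Keeping the multiplier at 1 for anchors read only once, such as weights, matches the memory-saving advice in the parameter note above.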
-
RingSizeMultiplierProdType ringSizeMultForMainProgs(const popef::Model &model, int64_t main_ring_size_multiplier, int64_t load_save_ring_size_multiplier = 1)
Factory function returning the selected ring buffer size multiplier for anchors “owned” by the main programs (programs of the main ProgramFlow that fetch Anchor data from the host).
For the remaining anchors (“owned” by Load or Save ProgramFlow) the other value is returned. This function can be passed to the QueueManager constructor to control the sizes of its internal ring buffers created for model Anchor objects.
Note
It is preferred to set the ring buffer multiplier to 1 for anchors from the Load or Save ProgramFlow to reduce memory.
- Parameters
model – [in] A combination of PopEF blobs representing a single model.
main_ring_size_multiplier – [in] The ring buffer size multiplier for anchors owned by the main programs.
load_save_ring_size_multiplier – [in] The ring buffer size multiplier for remaining anchors.
10.2.4. Runtime management
-
struct SessionConfig
Session configuration.
Public Members
-
LaunchPolicy policy = LaunchPolicy::Deferred
Session creation policy that is associated with acquiring a device.
Default: LaunchPolicy::Deferred.
-
PopefDataUsagePredicate pred_tensor_data = null_popef_data_usage_predicate
Predicate for anchor callback.
This controls user callback handling for the anchor. It is not used by default.
-
bool check_package_hash = true
If true, the Poplar hash will be checked before the executable is loaded onto the device.
If false, this check is not done. Default: true.
-
DeviceWaitConfig wait_config = {}
By default Session throws an exception when it is not able to attach any device needed by the given model.
This behavior can be changed by setting a custom device wait config.
-
class Session
Link a model to a device.
Note
If two or more sessions share a device, runLoadPrograms() and runSavePrograms() will implicitly be called when the device gets bound or unbound to this session.
Public Types
-
using ProgIdxType = popef::ProgramFlow::ProgramIndexType
The index for a runnable program available in the executable.
-
using ProgramsAndAnchorsMap = std::map<ProgIdxType, std::vector<const popef::Anchor*>>
Mapping between program index and a vector of popef::Anchor objects appearing in that program that are available in the executable.
Public Functions
-
explicit Session(const std::vector<std::string> &popef_paths, const SessionConfig &config = {})
Create a new Session object.
- Parameters
popef_paths – The paths to PopEF files from which the model will be loaded.
config – The session configuration.
Create a Session object.
- Parameters
model – The model which will be loaded and executed on the IPU.
config – The session configuration.
-
~Session()
Default destructor.
Bind the session to a device and load the executable onto it.
If the session is already bound to a device, this method first unbinds the current device before binding to the new device.
- Parameters
device – [in] The wrapper around a Poplar device.
-
void runLoadPrograms()
Run the programs to copy the data to the device.
Note
This method is implicitly called before the first call to runMainPrograms() after the device was bound to this session.
- Pre
The session must be bound to a device.
-
void runMainPrograms()
Run the main programs.
Note
If the device was last used by a different session, this method first unbinds the device from that session, then binds it to this session and calls runLoadPrograms() before actually running the main programs.
- Pre
The session must be bound to a device.
-
void runSavePrograms()
Run the programs to copy the data back to the host.
Note
This method is implicitly called when the device bound to this session gets unbound.
- Pre
The session must be bound to a device.
-
void runPrograms(const std::vector<ProgIdxType> &progs)
Run your own set of programs.
Each program will run once.
Note
This function is for advanced users who understand what the programs do during execution and what the result is. The order of the programs in the vector matters: programs are run in sequence based on their position in the vector. If you run programs from the main set without first running the programs that load data, you might observe incorrect results. The same applies if you run programs that save data before running programs from the load and main sets.
- Parameters
progs – [in] The set of program indices which you would like to run. Indices need to be present in the loaded popef::Model.
- Pre
The session must be bound to a device.
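The ordering requirement in the note above can be made concrete with a small helper that assembles the program vector in load, main, save order. SketchFlow, ProgIdx and loadMainSaveOrder are hypothetical names, and grouping the indices into three separate lists is an assumption made for this sketch rather than a statement about popef::ProgramFlow.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using ProgIdx = int64_t; // stand-in for Session::ProgIdxType (assumption)

// Stand-in grouping of program indices by purpose.
struct SketchFlow {
  std::vector<ProgIdx> load_progs;
  std::vector<ProgIdx> main_progs;
  std::vector<ProgIdx> save_progs;
};

// Build a vector in which load programs precede main programs, and main
// programs precede save programs. Each program appears once.
inline std::vector<ProgIdx> loadMainSaveOrder(const SketchFlow &flow) {
  std::vector<ProgIdx> progs;
  progs.reserve(flow.load_progs.size() + flow.main_progs.size() +
                flow.save_progs.size());
  progs.insert(progs.end(), flow.load_progs.begin(), flow.load_progs.end());
  progs.insert(progs.end(), flow.main_progs.begin(), flow.main_progs.end());
  progs.insert(progs.end(), flow.save_progs.begin(), flow.save_progs.end());
  return progs;
}

// Usage: one load program, two main programs, one save program.
inline void loadMainSaveOrderUsage() {
  SketchFlow flow;
  flow.load_progs = {0};
  flow.main_progs = {1, 2};
  flow.save_progs = {3};
  const std::vector<ProgIdx> expected = {0, 1, 2, 3};
  assert(loadMainSaveOrder(flow) == expected);
}
```

A vector built this way is the kind of argument runPrograms() expects when a full load, run and save cycle is wanted in a single call.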
-
void unloadFromDevice()
Unload the session from the device it is currently bound to.
- Pre
The session must be bound to a device.
-
void setCallbackForAnchor(const std::string &anchor_handle, CallbackHandle callback)
Set the callback (data source/destination buffer and way of managing it) for popef::Anchor (input/output tensor).
- Parameters
anchor_handle – [in] The anchor handle to which the callback will be assigned. Each popef::Anchor has a unique handle.
callback – [in] The callback to be called whenever the stream is to be read or was written to by the device. This depends on whether the callback is assigned to an input tensor or an output tensor.
- Pre
The session must be bound to a device.
-
void setUserOutputHandler(CallbackFactory factory, const AnchorCallbackPredicate &anchor_callback_predicate = null_anchor_callback_predicate, bool skip_connected = false)
Set up handlers for output tensors.
If the factory returns nullptr for a tensor then the existing callback remains in place.
If the factory returns a callback for a tensor which already had a callback associated with it then the existing callback is discarded and the new one is used instead.
- Parameters
factory – [in] The factory that will be called once per output tensor.
anchor_callback_predicate – [in] The functor controlling user callback binding.
skip_connected – [in] If true, call the factory only for the streams which are not connected. If false, call the factory for all the streams that are required.
- Pre
The session must be bound to a device.
-
void setUserInputHandler(CallbackFactory factory, const AnchorCallbackPredicate &anchor_callback_predicate = null_anchor_callback_predicate, bool skip_connected = false)
Set up handlers for input tensors.
If the factory returns nullptr for a tensor then the existing callback remains in place.
If the factory returns a callback for a tensor which already had a callback associated with it then the existing callback is discarded and the new one is used instead.
- Parameters
factory – [in] The factory that will be called once per input tensor.
anchor_callback_predicate – [in] The functor controlling user callback binding.
skip_connected – [in] If true, call the factory only for the streams which are not connected. If false, call the factory for all the streams that are required.
- Pre
The session must be bound to a device.
-
ProgramsAndAnchorsMap anchorsNotConnectedToCallbacks(const std::vector<ProgIdxType> &progs)
Returns anchors that are not connected to callbacks for particular programs.
- Parameters
progs – [in] The list of program indices.
- Returns
Map, where the key is the program index and value is the vector of anchors that have no linked callbacks for that program.
-
void errorIfAnchorsAreNotConnectedToCallbacks(const std::vector<ProgIdxType> &progs)
Check if all programs have connected all required anchors to the callbacks.
If there are any programs that are not connected to callbacks, then this method throws an error that lists these programs in the error message.
- Parameters
progs – [in] List of program indices.
-
void stop()
Stop the working session.
Send the stop signal to the executable. Disconnect queues if QueueManager is bound. The device will be left in an undefined state and no more programs can be run until reload() is called.
-
void reload()
Load executable again on the bound device.
-
template<typename ...T>
inline QueueManager *createQueueManager(T&&... args)
Create QueueManager.
Session takes full ownership of the created QueueManager object. The lifetime of the created QueueManager object is strictly linked with the Session lifetime.
Parameters should be passed in the same order as in the QueueManager constructors. See the model_runtime::QueueManager class.
- Returns
A pointer for accessing the created QueueManager.
-
std::vector<const popef::Anchor*> getInputAnchors() const
Returns all inputs.
This includes inputs that need a user-defined callback or inputs that already have a callback defined based on data from the popef::Model.
- Returns
A vector of pointers to popef::Anchor objects.