1. Introduction
Model Runtime is a library, built on the Poplar Runtime, that you can use to deploy models on Graphcore IPUs for a wide range of inference applications, such as inference serving backends. It provides classes and functions to load and run models on the IPU, and it uses models stored in the Poplar Exchange Format (PopEF).
Model Runtime provides two interfaces:
- A high-level interface, using ModelRunner, allows you to quickly deploy models. It requires minimal knowledge of the Poplar SDK libraries and the IPU hardware.
- A low-level interface, using sessions, provides more flexibility but requires more knowledge of PopEF files, the Poplar Runtime and the IPU hardware.
Section 2, Model Runtime overview, describes the Model Runtime functionality.
The Model Runtime library has both C++ and Python APIs.
For examples of using Model Runtime, see the Examples appendix.