PopRun is a command line utility to launch distributed applications on Graphcore IPU-POD compute systems. Specifically, PopRun is used to create multiple instances of an application. Each instance can be launched either on a single host server or on multiple host servers within the same IPU-POD, depending on the number of host servers available on the target IPU-POD. Typically, an IPU-POD64 is configured with one, two or four host servers.
On large IPU-POD systems such as an IPU-POD128, PopRun will automatically launch multiple instances on remote host servers. By remote host servers, we mean servers that are physically located in another interconnected IPU-POD. This makes PopRun a powerful tool for running applications at scale.
For an application to benefit from being launched with PopRun, it must be written so that it can take advantage of the additional compute power provided by the many IPUs inside an IPU-POD. Other motivating factors to use PopRun are:
PopRun lets you launch multiple instances of your application on one or more IPU-PODs.
Depending on your application, launching multiple instances may improve its performance.
PopRun lets you perform careful placement of multiple instances on the host server to minimise NUMA effects.
PopRun is required if you wish to scale your application beyond a single IPU-POD64.
The Poplar Distributed Configuration Library (PopDist) provides a set of APIs that you can use to make your application ready for distributed execution. Command line parameters passed to PopRun are exposed to the developer, and you can use them to distribute the input/output data or other parts of the application.
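As a rough illustration of distributing input data between instances, the sketch below splits a list of work items by instance. It does not call PopDist: the function name `shard_for_instance` and its parameters `instance_index` and `num_instances` are hypothetical, and in a real application those two values would be obtained from PopDist rather than passed in by hand. The strided split is one possible scheme, not necessarily the one your application should use.

```python
def shard_for_instance(items, instance_index, num_instances):
    """Return the subset of items that one instance should process.

    Hypothetical helper: instance_index and num_instances stand in for
    values a distributed application would query at run time.
    """
    if num_instances < 1:
        raise ValueError("num_instances must be at least 1")
    if not 0 <= instance_index < num_instances:
        raise ValueError("instance_index must be in [0, num_instances)")
    # Strided split: instance i takes items i, i + n, i + 2n, ...
    return list(items)[instance_index::num_instances]


if __name__ == "__main__":
    # Made-up file names, used only to show the split.
    files = [f"part-{i}.bin" for i in range(6)]
    for index in range(2):
        print(index, shard_for_instance(files, index, 2))
```

With a strided split every item is assigned to exactly one instance, and the shard sizes differ by at most one item when the item count is not a multiple of the instance count.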
Both PopRun and PopDist come bundled with the Poplar SDK.
No additional installation is required.
The PopRun binary is called poprun and can be found in the directory of the Poplar SDK installation.
Before you proceed, make sure you have sourced the script as described in the Getting Started Guide for your IPU system.
1.2. Validating the installation
To validate your Poplar SDK installation and thus make sure that poprun is available, run the following command:
$ poprun --num-instances 2 --num-replicas 2 --offline-mode=1 echo "Hello world!"
If PopRun is set up properly, you should see Hello world! printed twice, once per instance:
Hello world!
Hello world!
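The relationship between the two counts in the command above can be sketched as simple arithmetic. This assumes that --num-replicas is the total number of replicas and that they are shared evenly between the instances, which this section does not state explicitly; the function name `replicas_per_instance` is hypothetical.

```python
def replicas_per_instance(num_replicas, num_instances):
    """Return how many replicas each instance would own.

    Assumption: num_replicas is a total that is split evenly across
    instances, so it must be a multiple of num_instances.
    """
    if num_replicas < 1 or num_instances < 1:
        raise ValueError("both counts must be at least 1")
    if num_replicas % num_instances != 0:
        raise ValueError("num_replicas must be divisible by num_instances")
    return num_replicas // num_instances


if __name__ == "__main__":
    # Values taken from the validation command: 2 replicas, 2 instances.
    print(replicas_per_instance(2, 2))
```

Under that assumption, the validation command gives each of the two instances a single replica.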
As mentioned previously, PopDist requires an external Python package called Horovod. Please consult Section 4, Poplar distributed configuration library (PopDist), for more information.
The following sections will provide more detailed information regarding the various PopRun and PopDist features.