2. About the system
Graphcloud gives you access to IPU-based hardware that has been configured with the necessary software. This means you can easily run your programs created in standard machine-learning frameworks like TensorFlow and PyTorch.
This section describes the hardware you will have access to and how it is configured. In addition, this section describes the software resources that are available on Graphcloud, including the storage available.
2.1. Pod system
You will be allocated your Pod system as an entity called a “virtual Pod” (vPod). The vPod is provisioned as a single Virtual-IPU partition. A Virtual-IPU partition represents a number of IPUs which can communicate with one another. Partitions are isolated from one another: all communication with physically neighbouring devices that are not in the same partition is prohibited.
You will have access to:
A secure Pod system fully isolated by VLAN.
Host servers (the exact number depends on the exact Pod you have chosen).
V-IPU admin & server software (preconfigured).
Local storage (4 TB); this will be shared by all users accessing your Pod system (Section 2.3, Storage).
Read-only access to a shared drive with code examples and public data sets (Section 2.4, Software and datasets).
Docker support as standard on virtualised hosts.
Root access provided on request.
Examples of some Pod systems available on Graphcloud that you can purchase access to:
Bow Pod16 systems (comprised of 4 Bow-2000s and 1 host server)
Bow Pod64 systems (comprised of 16 Bow-2000s and 1 host server)
Bow Pod128 systems (comprised of 32 Bow-2000s and 8 host servers)
Bow Pod256 systems (comprised of 64 Bow-2000s and 16 host servers)
Full pricing details for accessing these systems are available from Cirrascale or from Graphcore sales.
Note
Your vPod instance will always be set up initially with the latest released version of the IPU-M software and Poplar SDK. Thereafter, you need to request an upgrade of this software.
2.2. Access
You access Graphcloud via an SSH connection. You can add multiple users to your vPod instance.
Full details are given in Section 4, Access.
Note
Multiple user jobs can be run at the same time provided that the total number of IPUs requested does not exceed the number of IPUs in the Pod system. This is because we use a reconfigurable Virtual-IPU partition that supports many smaller partition requests within it. Refer to the Monitoring Hardware Quick Start for more information on partitions.
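The capacity rule in this note can be sketched as a simple check. This is an illustration only: the function name and the job sizes are hypothetical, and the actual allocation is handled by the V-IPU software, not by user code.

```python
def fits_in_pod(requested_ipus, pod_size):
    """Return True if all the concurrent jobs can run on the Pod at once."""
    if any(n <= 0 for n in requested_ipus):
        raise ValueError("each job must request at least one IPU")
    # Jobs can run together only if their combined request fits in the Pod.
    return sum(requested_ipus) <= pod_size


# Hypothetical example: several users sharing a Pod with 16 IPUs.
jobs = [4, 4, 8]
print(fits_in_pod(jobs, pod_size=16))        # 4 + 4 + 8 = 16 IPUs, which fits
print(fits_in_pod(jobs + [2], pod_size=16))  # 18 IPUs requested, too many
```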
2.3. Storage
2.3.1. Local storage
You have 4 TB of high-speed storage at /localdata. You should run tests and store data there. This is much faster than the disk used for /mnt/public, which is read-only.
A directory has been created with your username under /localdata. This provides fast access to a large amount of storage. By default, this personal workspace is private to each user.
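Because the 4 TB of local storage is shared by all users of the Pod system, it can be useful to check the free space before writing large files. A minimal sketch using only the Python standard library; the helper name free_space_gb is hypothetical, and the fallback to the current directory is only there so that the snippet also runs outside Graphcloud.

```python
import shutil
from pathlib import Path


def free_space_gb(path="/localdata"):
    """Return the free space, in GB, on the filesystem that holds `path`."""
    # Fall back to the current directory if `path` does not exist.
    target = Path(path) if Path(path).exists() else Path.cwd()
    usage = shutil.disk_usage(target)
    return usage.free / 1e9


print(f"Free space: {free_space_gb():.1f} GB")
```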
2.3.2. Temporary storage
Each user has a tmp subdirectory (for example, /localdata/alice/tmp) which is configured (in the ~/.profile file) to be used as temporary storage. This provides fast storage and avoids the more limited system tmp directory filling up.
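To confirm which temporary directory your Python programs will use, you can query the standard tempfile module. This sketch assumes that the ~/.profile configuration works by setting the TMPDIR environment variable, which is the usual mechanism but is not stated in this guide; the helper name current_temp_dir is hypothetical.

```python
import os
import tempfile


def current_temp_dir():
    """Return the directory Python will use for temporary files."""
    # tempfile caches its choice, so clear the cache to re-read TMPDIR.
    tempfile.tempdir = None
    return tempfile.gettempdir()


print("TMPDIR =", os.environ.get("TMPDIR", "(not set)"))
print("Python temp dir:", current_temp_dir())
```

If the second line does not show a path under /localdata, the profile settings have probably not been loaded in the current shell.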
2.4. Software and datasets
There are several software packages already installed on your vPod instance including:
C++ compilers
Python (versions 2.7 and 3.6)
Graphcore Poplar SDK (in /opt/gc), which includes:
the Poplar Graph Programming Framework
TensorFlow 1, TensorFlow 2 for the IPU
PyTorch for the IPU
PopART
examples
documentation
Section 5, Setup contains more information on how to configure the Poplar SDK before you can run programs.
Note
Your vPod instance will always be set up initially with the latest released version of the Poplar SDK. Thereafter, you need to request an upgrade of this software.
Datasets for use with the Graphcore examples are available in /localdata/datasets. There are README files with more information on using each of these datasets. The same datasets are also on the shared drive at /mnt/public/data, which can be used to restore the data if necessary.
Refer to Section 5.3, Installing the Graphcore examples and tutorials for more information on how to install Graphcore examples.
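Restoring a dataset from the shared drive to local storage can be sketched as follows. This is an illustration under assumptions: the helper name restore_dataset is hypothetical, it assumes each dataset is a directory with the same name in both locations, and it assumes you have write permission for the destination. Note that it deletes the existing local copy before copying.

```python
import shutil
from pathlib import Path


def restore_dataset(name, src_root="/mnt/public/data",
                    dst_root="/localdata/datasets"):
    """Copy one dataset directory from the shared drive to local storage."""
    src = Path(src_root) / name
    dst = Path(dst_root) / name
    if not src.is_dir():
        raise FileNotFoundError(f"no dataset directory at {src}")
    if dst.exists():
        shutil.rmtree(str(dst))  # discard the damaged local copy first
    shutil.copytree(str(src), str(dst))
    return dst
```

For example, restore_dataset("my_dataset") would copy /mnt/public/data/my_dataset to /localdata/datasets/my_dataset, where my_dataset stands for a real dataset directory name.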
Licensing of datasets
Some datasets are provided locally for ease of use. These are made available for non-commercial use only and should not be downloaded from Graphcloud. Please take time to read the full terms & conditions of use and license information which you will find alongside the datasets.
The pre-installed public datasets are available for all users under /localdata/datasets.
2.5. Using your own data
You can upload your own data, within the available disk space.
2.6. Using synthetic data
You can use synthetic data to test model performance without the overhead of data transfer. The documentation for each framework describes how to enable using synthetic data, for example:
enableSyntheticData() in PyTorch. The tutorial on efficient data loading explains how to use synthetic data.
--use_synthetic_data in TensorFlow 2.
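For PyTorch, the option is set on a poptorch.Options object. A minimal sketch, assuming the Poplar SDK has been enabled so that poptorch can be imported; the helper name build_options is hypothetical, and the import is guarded so that the snippet also runs where PopTorch is not installed.

```python
try:
    import poptorch  # only available once the Poplar SDK is enabled
except ImportError:
    poptorch = None


def build_options(use_synthetic_data):
    """Return PopTorch options, or None if PopTorch is not installed."""
    if poptorch is None:
        return None
    opts = poptorch.Options()
    # With synthetic data enabled, no data is transferred between host and IPU.
    opts.enableSyntheticData(use_synthetic_data)
    return opts


options = build_options(True)
print("PopTorch available:", options is not None)
```

The returned options object would then be passed to the PopTorch model wrapper in the usual way. Refer to the PopTorch documentation for details.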