2. About the system

Graphcloud gives you access to IPU-based hardware that has been configured with the necessary software. This means you can run programs written in standard machine-learning frameworks, such as TensorFlow and PyTorch, without installing that software yourself.

This section describes the hardware you will have access to and how it is configured. It also describes the software and storage resources that are available on Graphcloud.

2.1. Pod system

Your Pod system is allocated to you as an entity called a “virtual Pod” (vPod). The vPod is provisioned as a single Virtual-IPU partition, which represents a number of IPUs that can communicate with one another. Partitions are isolated: all communication with physically neighbouring devices that are not in the same partition is prohibited.

You will have access to:

A secure Pod system fully isolated by VLAN.
Host servers (the number depends on the Pod system you have chosen).
V-IPU admin & server software (preconfigured).
Local storage (4 TB); this will be shared by all users accessing your Pod system (Section 2.3, Storage).
Read-only access to a shared drive with code examples and public data sets (Section 2.4, Software and datasets).
Docker support as standard on virtualised hosts.
Root access provided on request.

Examples of some Pod systems available on Graphcloud that you can purchase access to:

Bow Pod₁₆ systems (comprising 4 Bow-2000s and 1 host server)
Bow Pod₆₄ systems (comprising 16 Bow-2000s and 1 host server)
Bow Pod₁₂₈ systems (comprising 32 Bow-2000s and 8 host servers)
Bow Pod₂₅₆ systems (comprising 64 Bow-2000s and 16 host servers)

Full pricing details for accessing these systems are available from Cirrascale or from Graphcore sales.

Note

Your vPod instance is initially set up with the latest released version of the IPU-M software and Poplar SDK. After that, you need to request any upgrade of this software.

2.2. Access

You access Graphcloud via an SSH connection. You can add multiple users to your vPod instance.

Full details are given in Section 4, Access.

Note

Multiple user jobs can be run at the same time provided that the total number of IPUs requested does not exceed the number of IPUs in the Pod system. This is because we use a reconfigurable Virtual-IPU partition that supports many smaller partition requests within it. Refer to the Monitoring Hardware Quick Start for more information on partitions.
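The capacity rule in the note above is simple arithmetic, and can be sketched as follows. This is an illustrative check only, not a Graphcloud or V-IPU API; the function name `jobs_fit` and the example job sizes are hypothetical.

```python
def jobs_fit(requested_ipus, pod_ipus):
    """Return True if all jobs can run at the same time.

    requested_ipus: number of IPUs requested by each job (hypothetical values).
    pod_ipus: total number of IPUs in the Pod system.
    """
    if any(n <= 0 for n in requested_ipus):
        raise ValueError("each job must request at least one IPU")
    return sum(requested_ipus) <= pod_ipus


# Example: a Pod system with 16 IPUs.
print(jobs_fit([4, 8, 4], 16))     # True: 16 IPUs requested, 16 available
print(jobs_fit([4, 8, 4, 2], 16))  # False: 18 IPUs requested, 16 available
```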

2.3. Storage

2.3.1. Local storage

You have 4 TB of high-speed storage at /localdata. You should run tests and store data there. This is much faster than the disk used for /mnt/public, which is read-only.

A directory has been created with your username under /localdata. This provides fast access to a large amount of storage. By default, this personal workspace is private to each user.
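Because the local storage is shared by all users of the Pod system, it can be useful to check how much space is left before writing large datasets or checkpoints. The following is a minimal sketch using only the Python standard library; the helper name `free_space_tb` is hypothetical, and /localdata is the path given above.

```python
import os
import shutil

LOCAL_STORAGE = "/localdata"  # local storage path given in this section


def free_space_tb(path):
    """Return the free space, in terabytes (10**12 bytes), on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / 10**12


# Fall back to the current directory when not running on Graphcloud.
target = LOCAL_STORAGE if os.path.isdir(LOCAL_STORAGE) else "."
print(f"Free space on {target}: {free_space_tb(target):.2f} TB")
```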

2.3.2. Temporary storage

Each user has a tmp subdirectory (for example, /localdata/alice/tmp) which is configured (in the ~/.profile file) to be used as temporary storage. This provides fast storage and avoids the more limited system tmp directory filling up.
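To confirm which directory your programs will use for temporary files, you can query it from Python. This sketch assumes that the ~/.profile configuration works by setting the TMPDIR environment variable, which is the conventional mechanism but is not stated here; the helper name `temp_storage_info` is hypothetical.

```python
import os
import tempfile


def temp_storage_info():
    """Report the TMPDIR setting and the directory Python uses for temporary files."""
    return {
        "TMPDIR": os.environ.get("TMPDIR"),  # None if the variable is not set
        "python_tempdir": tempfile.gettempdir(),
    }


info = temp_storage_info()
print(info)

# Temporary files are created in the reported directory and removed on close.
with tempfile.NamedTemporaryFile(suffix=".scratch") as tmp:
    tmp.write(b"scratch data")
    print("Created", tmp.name)
```

If the reported directory is not under your /localdata directory, check that your ~/.profile has been sourced in the current shell.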

2.3.3. Shared workspace

You can create a shared area that you and your team can access.

Full details are given in the section “Setting up a shared workspace” in Getting Started with Graphcloud.

2.4. Software and datasets

There are several software packages already installed on your vPod instance including:

C++ compilers
Python (versions 2.7 and 3.6)
Graphcore Poplar SDK (in /opt/gc), which includes:
- the Poplar Graph Programming Framework
- TensorFlow 1 and TensorFlow 2 for the IPU
- PyTorch for the IPU
- PopART
- examples
- documentation

Section 5, Setup describes how to configure the Poplar SDK before you run programs.

Note

Your vPod instance is initially set up with the latest released version of the Poplar SDK. After that, you need to request any upgrade of this software.

Datasets for use with the Graphcore examples are also installed, in /localdata/datasets. There are README files with more information on using each of these datasets. The same datasets are also on the shared drive at /mnt/public/data, which can be used to restore the data if necessary.

Refer to Section 5.3, Installing the Graphcore examples and tutorials for more information on how to install Graphcore examples.

Licensing of datasets

Some datasets are provided locally for ease of use. These are made available for non-commercial use only and should not be downloaded from Graphcloud. Please take time to read the full terms & conditions of use and license information which you will find alongside the datasets.

The pre-installed public datasets are available for all users under /localdata/datasets.

2.5. Using your own data

You can upload your own data, within the available disk space.

2.6. Using synthetic data

You can use synthetic data to test model performance without the overhead of data transfer. The documentation for each framework describes how to enable synthetic data, for example:

enableSyntheticData() in PyTorch
- The tutorial on efficient data loading explains how to use synthetic data.
--use_synthetic_data in TensorFlow 2
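As an illustration for TensorFlow, the sketch below adds the flag to the environment before TensorFlow is imported. It assumes that --use_synthetic_data is passed through the TF_POPLAR_FLAGS environment variable, as in the Graphcore TensorFlow documentation; check the documentation for your SDK version. The helper name `add_synthetic_data_flag` is hypothetical.

```python
import os

FLAG = "--use_synthetic_data"


def add_synthetic_data_flag(env):
    """Return a copy of env with the flag added to TF_POPLAR_FLAGS (assumed variable)."""
    updated = dict(env)
    flags = updated.get("TF_POPLAR_FLAGS", "").split()
    if FLAG not in flags:
        flags.append(FLAG)  # keep any flags that are already set
    updated["TF_POPLAR_FLAGS"] = " ".join(flags)
    return updated


# Apply to the current process; do this before importing TensorFlow.
os.environ.update(add_synthetic_data_flag(os.environ))
print(os.environ["TF_POPLAR_FLAGS"])
```

The helper leaves existing flags in place and does not add the flag twice, so it is safe to call more than once.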
