2. Product description
2.1. Bow Pod256 reference design
Graphcore’s Bow Pod64 reference design assembles 16 Bow-2000 IPU-Machines into a logical rack delivering over 22 petaFLOPS of AI compute. The Bow Pod64 can be used on its own (64 Bow IPU processors) or as a building block for larger systems such as the Bow Pod256 (256 Bow IPUs). Systems can scale up to 1024 Bow Pod64 racks (64K Bow IPU processors), delivering nearly 23 exaFLOPS of AI compute.
The Bow Pod256 is built from four Bow Pod64 racks, so it contains 64 Bow-2000 IPU-Machines and delivers over 89 petaFLOPS of AI compute.
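The scaling figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the per-IPU compute value is not quoted in this datasheet but derived from the Bow Pod256 total (89.234 petaFLOPS over 256 IPUs), and the function name `pod_totals` is hypothetical.

```python
# Figures quoted in this datasheet.
IPUS_PER_BOW_2000 = 4
BOW_2000_PER_POD64 = 16
POD256_PFLOPS_FP16_16 = 89.234  # AI compute of a Bow Pod256
POD256_IPUS = 256

# Derived value (an assumption, not a quoted figure): compute per IPU
# implied by the Bow Pod256 total.
PFLOPS_PER_IPU = POD256_PFLOPS_FP16_16 / POD256_IPUS


def pod_totals(num_pod64):
    """Return (IPU count, FP16.16 petaFLOPS) for a system of num_pod64 racks."""
    ipus = num_pod64 * BOW_2000_PER_POD64 * IPUS_PER_BOW_2000
    return ipus, ipus * PFLOPS_PER_IPU


for racks in (1, 4, 1024):
    ipus, pflops = pod_totals(racks)
    print(f"{racks:>4} x Bow Pod64: {ipus:>6} IPUs, {pflops:,.1f} petaFLOPS")
```

Under these assumptions, one rack gives 64 IPUs, four racks give the 256 IPUs of a Bow Pod256, and 1024 racks give 65,536 IPUs at just under 23,000 petaFLOPS (23 exaFLOPS).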
Each Bow Pod64 combines the sixteen Bow-2000 IPU-Machines with network switches and a host server in a rack configuration (switches and host server not provided by Graphcore). The Bow Pod64 system assumes the following default components:
1 to 4 approved host servers (see the approved server list for details). This datasheet uses the Dell R6525 server with dual-socket AMD Epyc2 CPUs as the default offering. The default number of servers is 1, but up to 4 host servers may be required depending on workload; please speak to Graphcore sales.
1 Arista 7060X ToR switch (32x 100G + 2x 10G)
1 Arista 7010T management switch (48-port 1G + 4x 1/10G)
1 Arista DCS-7060PX4-32-F (GW-Link switch)
The Bow Pod256 is characterized by the following features:
Disaggregated host architecture allows for different server requirements based on workload
89.23 petaFLOPS (FP16.16) of AI compute, 22.31 petaFLOPS @ FP32, and up to 16.6 TBytes of memory
2D-torus IPU-Link topology
A high-level view of the cabling for an individual Bow Pod64 is shown in Fig. 2.1.
A Bow Pod256 system consists of four Bow Pod64 logical racks, with optical GW-Link cables connecting the Bow Pod64 racks together via two redundant switches. Since the GW-Links are optical Ethernet cables, the Bow Pod64 logical racks can be installed further apart if the datacentre layout does not allow them to be installed adjacent to each other.
Fig. 2.2 shows the Bow Pod256 layout and GW-Link cables between the four Bow Pod64 logical racks. A logical rack is a space in one or more physical racks occupied by a single Pod. Since the standard racking of a single Pod may not be possible within one physical rack, we use the term “logical rack” to refer to the set of components making up the single Pod, regardless of where they may be physically installed.
Bow Pod256 systems are available as a full implementation through Graphcore’s network of reseller and OEM partners.
Alternatively, customers may directly implement the Bow Pod256 system with the help of the Bow Pod256 build and test guide, available from the Graphcore document website.
2.2. Communication for scale-out: 3D IPU-Fabric with GCL
The Bow Pod256 reference design builds on the innovative IPU-Fabric, designed to support massive scale-out. Fig. 2.3 shows, on the left, an abstracted view of a Bow-2000 with the IPU-Fabric interconnects comprising IPU-Links™, GW-Links (for jitter-free IPU-to-IPU connectivity), and the Host-Link 100 Gbps RDMA connection between the host server and each Bow-2000. The small inset on the right shows how these interconnects are used as part of the Bow Pod64 scale-out: IPU-Links join IPU processors together both within and between Bow-2000s, and GW-Links connect the IPU-Gateway chips in each Bow-2000. The IPU-Link connections in the Bow Pod64 form a 2D torus since the loops are closed top and bottom.
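To illustrate what closing the loops means in a torus, the sketch below computes the neighbours of a node on a generic 2D torus. It is not the Bow Pod64 cabling map: the grid shape and the function name `torus_neighbours` are hypothetical and chosen only for illustration.

```python
def torus_neighbours(row, col, rows, cols):
    """Neighbours of (row, col) on a rows x cols 2D torus.

    Modulo arithmetic closes each loop: stepping off one edge of the
    grid re-enters at the opposite edge.
    """
    return {
        ((row - 1) % rows, col),  # up (wraps from the top row to the bottom)
        ((row + 1) % rows, col),  # down (wraps from the bottom row to the top)
        (row, (col - 1) % cols),  # left (wraps from the first column to the last)
        (row, (col + 1) % cols),  # right (wraps from the last column to the first)
    }


# Hypothetical 4 x 4 grid, chosen only for illustration.
print(sorted(torus_neighbours(0, 0, rows=4, cols=4)))
```

A corner node on a plain grid has two neighbours; on the torus it has four, because the wrap-around links remove the edges of the grid.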
The Graphcore Communication Library (GCL) manages the communication and synchronization between IPUs across any IPU-Fabric, supporting ML at scale.
2.3. Host-Links and GW-Links
Host servers are disaggregated from the Bow-2000 IPU-Machines with Host-Links, Graphcore’s low-latency, high-throughput host-IPU RDMA transport using RoCEv2. The 100 Gbps Ethernet links from the RoCEv2 NICs in the Bow-2000 IPU-Machines connect through switches to the disaggregated host servers. This disaggregated host architecture for the Bow Pod256 enables user-defined host:IPU ratios and allows for scalable host server utilisation depending on the machine intelligence workload. The GW-Links are used to connect multiple Bow Pod64 logical racks either directly or through an optional Ethernet switching fabric as shown in Fig. 2.4.
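As a rough illustration of the host:IPU ratio, the sketch below relates the number of host servers in one Bow Pod64 (1 to 4) to the supported ratio range listed in the software section (1:16 up to 1:64). It assumes the 64 IPUs are shared evenly between the hosts, which is a simplification; the function name `host_to_ipu_ratio` is hypothetical.

```python
IPUS_PER_POD64 = 64  # 16 Bow-2000s x 4 IPUs each


def host_to_ipu_ratio(num_hosts):
    """Host:IPU ratio for one Bow Pod64, assuming IPUs are shared evenly."""
    if not 1 <= num_hosts <= 4:
        raise ValueError("a Bow Pod64 takes 1 to 4 host servers")
    return f"1:{IPUS_PER_POD64 / num_hosts:g}"


for hosts in (1, 2, 4):
    print(f"{hosts} host server(s) -> {host_to_ipu_ratio(hosts)}")
```

With the default single host the ratio is 1:64, and with four hosts it is 1:16, which matches the two ends of the supported range.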
2.4. Software
Bow Pod systems are fully supported by Graphcore’s Poplar® software development environment, providing a complete and mature platform for ML development and deployment. Standard ML frameworks including TensorFlow, Keras, ONNX, Halo, PaddlePaddle, HuggingFace, PyTorch and PyTorch Lightning are fully supported, along with access to PopLibs through the Poplar C++ API. PopLibs, PopART and TensorFlow are available as open source in the Graphcore GitHub repo https://github.com/graphcore. PopTorch provides a simple wrapper around PyTorch programs so that they can run on IPUs. The Poplar SDK also includes the PopVision™ visualisation and analysis tools, which provide performance monitoring for IPUs; the graphical analysis enables detailed inspection of all processing activities.
In addition to these Poplar development tools, the Bow Pod64 is enabled with software support for industry standard converged infrastructure management tools including OpenBMC, Redfish, Docker containers, and orchestration with Slurm and Kubernetes.
Complete end-to-end software stack for developing, deploying and monitoring AI model training jobs as well as inference applications on the Graphcore IPU:

| Component | Description |
|---|---|
| ML frameworks | TensorFlow, Keras, PyTorch, PyTorch Lightning, HuggingFace, PaddlePaddle, Halo, and ONNX |
| Deployment options | Bare metal (Linux), VM (Hyper-V), containers (Docker) |
| Host-Links | RDMA-based disaggregation between a host and IPU over 100 Gbps RoCEv2 NIC, using the IPU over Fabric (IPUoF) protocol; host-to-IPU ratios supported: 1:16 up to 1:64 |
| Graphcore Communication Library (GCL) | IPU-optimized communication and collective library integrated with the Poplar SDK stack; supports all-reduce (sum, max), all-gather, reduce, broadcast; scales at near-linear performance to 64K IPUs |
| PopVision | Visualization and analysis tools |

To see a full list of supported OS, VM and container options go to the Graphcore support portal https://www.graphcore.ai/support
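For readers unfamiliar with the collective operations listed for GCL above, the sketch below shows what each one computes, using plain Python lists to stand in for the data held by each participant. It does not use the GCL or Poplar API; all function names here are hypothetical and only model the semantics.

```python
from functools import reduce
import operator


def all_reduce(per_ipu, op):
    """Every participant ends up with the element-wise reduction of all inputs."""
    combined = [reduce(op, column) for column in zip(*per_ipu)]
    return [list(combined) for _ in per_ipu]


def all_gather(per_ipu):
    """Every participant ends up with the concatenation of all inputs."""
    gathered = [x for chunk in per_ipu for x in chunk]
    return [list(gathered) for _ in per_ipu]


def broadcast(per_ipu, root=0):
    """Every participant ends up with a copy of the root's data."""
    return [list(per_ipu[root]) for _ in per_ipu]


# Hypothetical data held by three participants.
data = [[1, 5], [2, 4], [3, 9]]
print(all_reduce(data, operator.add)[0])  # [6, 18]
print(all_reduce(data, max)[0])           # [3, 9]
print(all_gather(data)[0])                # [1, 5, 2, 4, 3, 9]
print(broadcast(data, root=1)[2])         # [2, 4]
```

A plain reduce differs from all-reduce only in that the combined result is delivered to a single participant rather than to all of them.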
IPU-Fabric topology discovery and validation

| Function | Description |
|---|---|
| Provisioning | REST API and SSH/CLI for IPU allocation/de-allocation into isolated domains (vPods); plug-ins for Slurm and Kubernetes (K8s) |
| Resource monitoring | REST API and SSH/CLI for accessing the Bow-2000 monitoring service; Prometheus node exporter and Grafana (visualization) support |
| Baseboard Management Controller (OpenBMC) | Dual-image firmware with local rollback support; console support, CLI/SSH based; Serial-over-LAN and Redfish REST API |

2.5. Technical specifications
| Item | Specification |
|---|---|
| IPU-Machines | 64x Bow-2000 blades |
| IPUs | 256x Bow IPU processors (4 in each Bow-2000) |
| IPU-Cores™ | 376,832 |
| Worker threads | 2.26 million |
| AI compute | 89.234 petaFLOPS AI (FP16.16) compute; 22.309 petaFLOPS FP32 compute |
| Memory | Up to 8422.4 GB (includes 230.4 GB In-Processor-Memory (64x 3.6 GB per Bow-2000) and 8192 GB Streaming Memory (64x 64 GB DIMM x2 per Bow-2000)) |

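The totals in the rows above follow from the per-unit figures. The sketch below reproduces that arithmetic for the DIMM configuration stated in the memory row. The per-IPU core count and the threads-per-core value are not quoted in this datasheet; they are derived here from the totals and should be read as assumptions.

```python
# Figures quoted in the specification rows.
BOW_2000S = 64
IPUS = 256
IPU_CORES = 376_832
WORKER_THREADS_APPROX = 2.26e6
IN_PROCESSOR_GB_PER_BOW_2000 = 3.6
DIMM_GB = 64
DIMMS_PER_BOW_2000 = 2

# Memory totals.
in_processor_gb = BOW_2000S * IN_PROCESSOR_GB_PER_BOW_2000
streaming_gb = BOW_2000S * DIMM_GB * DIMMS_PER_BOW_2000
total_gb = in_processor_gb + streaming_gb

# Derived per-unit values (assumptions inferred from the totals).
cores_per_ipu = IPU_CORES // IPUS
threads_per_core = round(WORKER_THREADS_APPROX / IPU_CORES)
worker_threads = IPU_CORES * threads_per_core

print(f"In-Processor-Memory: {in_processor_gb:.1f} GB")
print(f"Streaming Memory:    {streaming_gb} GB")
print(f"Total memory:        {total_gb:.1f} GB")
print(f"Cores per IPU:       {cores_per_ipu}")
print(f"Worker threads:      {worker_threads:,}")
```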
For each Bow Pod64 rack:

| Item | Specification |
|---|---|
| Bow Pod64 host server(s) | Default: 1x Dell PowerEdge R6525 server; options: 1 to 4 Graphcore-approved server/OS options (contact Graphcore sales or your channel partner for details) |
| Bow Pod64 default switches | 1x Arista DCS-7060CX-32S-F (100 GbE ToR switch); 1x Arista DCS-7010T-48-F (1 GbE management switch); 1x Arista DCS-7060PX4-32-F (GW-Link switch) |

| Item | Specification |
|---|---|
| IPU-Links | 128 Tbps (256 x16 Gen4) aggregated bi-directional bandwidth for direct, 2D torus within-rack Bow Pod256 IPU connectivity; 64 Bow-2000s directly connected; 512 standard OSFP ports with DAC cabling |
| GW-Links | 25.6 Tbps (128 x 100 Gbps) for between-rack Bow Pod256 connectivity; up to 1024 Bow Pod64 (direct) or 256 Bow Pod64 (switched) can be connected; standard 100 Gbps QSFP28 ports supporting industry-standard transceivers (100G-DR) and DAC cabling |

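The aggregate link bandwidths can be related to the per-link figures as follows. This is a sketch under one stated assumption: that the 25.6 figure for GW-Links counts both directions of each 100 Gbps link, since 128 x 100 Gbps in one direction is 12.8 Tbps. The per-link IPU-Link value is likewise derived from the quoted aggregate, not quoted itself.

```python
# Figures quoted in the interconnect rows.
GW_LINKS = 128
GW_LINK_GBPS = 100
IPU_LINK_TOTAL_TBPS = 128  # aggregated bi-directional
IPU_LINKS = 256

# GW-Links: one direction, then both directions (assumed bi-directional count).
gw_one_way_tbps = GW_LINKS * GW_LINK_GBPS / 1000
gw_both_ways_tbps = 2 * gw_one_way_tbps

# IPU-Links: bi-directional bandwidth per x16 link implied by the aggregate.
ipu_link_gbps = IPU_LINK_TOTAL_TBPS * 1000 / IPU_LINKS

print(f"GW-Links, one direction:   {gw_one_way_tbps:.1f} Tbps")
print(f"GW-Links, both directions: {gw_both_ways_tbps:.1f} Tbps")
print(f"Per IPU-Link (bi-dir):     {ipu_link_gbps:.0f} Gbps")
```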
For each Bow Pod64 rack:

| Item | Specification |
|---|---|
| Bow Pod64 server(s) to Bow-2000 connectivity | 1 (default) to 4 host servers have connectivity to the 16 Bow-2000s via the Bow Pod64 100 GbE ToR switch (Arista DCS-7060CX-32S-F); 1x 100 GbE port per Bow-2000 to connect to the ToR switch; dual (2x) 100 GbE ports per host server to connect to the ToR switch |
| Bow Pod64 internal management network connectivity | Aggregated in the Arista DCS-7010T-48-F 1 GbE management switch are: 2x 1 GbE RJ45 management ports from each of the 16 Bow-2000s; server management port(s); PDU monitoring port |

| Item | Specification |
|---|---|
| Air cooled | Built-in N+1 hot-plug fan cooling system in each of the individual components (Bow-2000s, servers and switches) |
| Rack airflow | All Bow Pod256 components (Bow-2000 IPU-Machines, server(s) and switches) are mounted for airflow from the front of the rack (single door, cold aisle side) to the back of the rack (split door, hot aisle side) |
| Airflow rate | 103 CFM (measured) per Bow-2000 (1648 CFM total for the 16 Bow-2000s in a Bow Pod64, 6592 CFM total for the 64 Bow-2000s in a Bow Pod256) |

For each Bow Pod64 rack:

| Item | Specification |
|---|---|
| Rack | 42U: 600 mm (W) x 1200 mm (D) x 1991 mm (H) |
| Weight | 450 kg (992 lbs) |
| PDU | PDU implementation can be customized for target workload and rack power density goals. Please contact Graphcore sales for any help required specifying the PDU implementation |
| Input power (Vac) | 200 - 240 V |
| Input power (Vdc) | 240 - 310 V for GC-ADA2-30W and GC-ADA2-3EW models |
| Power cap | 1700 W with programmable power cap |
| Redundancy | 1+1 redundancy (with power cap set to 1500 W) |
| Power (nominal) | 19 kW |

For information on Bow Pod256 integration with datacentre infrastructure, please contact Graphcore sales.
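For early rack planning, the per-Bow-2000 airflow and power cap figures above can be summed per rack and per system. The sketch below is a rough aid only: it covers the Bow-2000 IPU-Machines and leaves out host servers and switches, and the summed power cap is an upper bound on IPU-Machine draw, not the nominal rack power. The function name `rack_planning_totals` is hypothetical.

```python
def rack_planning_totals(num_bow_2000, cfm_each=103, cap_w=1700):
    """Sum airflow (CFM) and power cap (kW) over a number of Bow-2000s.

    Host servers and switches are not included.
    """
    return {
        "airflow_cfm": num_bow_2000 * cfm_each,
        "ipu_machine_cap_kw": num_bow_2000 * cap_w / 1000,
    }


pod64 = rack_planning_totals(16)                      # one Bow Pod64
pod256 = rack_planning_totals(64)                     # four Bow Pod64 racks
pod64_redundant = rack_planning_totals(16, cap_w=1500)  # 1+1 redundancy cap

print("Bow Pod64: ", pod64)
print("Bow Pod256:", pod256)
print("Bow Pod64 with 1500 W cap:", pod64_redundant)
```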
2.6. Environmental characteristics
| Characteristic | Value |
|---|---|
| Operating temperature and humidity (inlet air) | 10 to 32 °C (50 to 90 °F) at 20% to 80% RH (*) |
| Operating altitude | 0 to 3,048 m (0 to 10,000 ft) (**) |

(*) Altitude less than 900 m/3000 ft and non-condensing environment
(**) Max. ambient temperature is de-rated by 1° C per 300 m above 900 m
For power caps higher than 1700W per Bow-2000 please contact Graphcore sales for environmental guidance.
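The altitude de-rating in footnote (**) can be written as a small function. The sketch below assumes the de-rating is applied linearly (pro rata) above 900 m; if it is applied in whole 300 m steps instead, the results between steps will differ. The function name `max_inlet_temp_c` is hypothetical.

```python
MAX_INLET_C = 32.0        # upper operating temperature below 900 m
DERATE_START_M = 900.0    # altitude at which de-rating begins
DERATE_M_PER_DEG = 300.0  # 1 degree C per 300 m
MAX_ALTITUDE_M = 3048.0   # upper operating altitude


def max_inlet_temp_c(altitude_m):
    """Maximum inlet air temperature (degrees C) at a given altitude in metres."""
    if not 0 <= altitude_m <= MAX_ALTITUDE_M:
        raise ValueError("altitude outside the 0 to 3,048 m operating range")
    excess_m = max(0.0, altitude_m - DERATE_START_M)
    return MAX_INLET_C - excess_m / DERATE_M_PER_DEG


for altitude in (0, 900, 1500, 3048):
    print(f"{altitude:>5} m -> {max_inlet_temp_c(altitude):.2f} C")
```

Under the linear assumption, the limit stays at 32 °C up to 900 m, drops to 30 °C at 1,500 m, and is just under 25 °C at the maximum operating altitude.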
2.7. Standards compliance for Bow-2000 IPU-Machines
| Category | Standards |
|---|---|
| EMC standards | Emissions: FCC CFR 47, ICES-003, EN55032, EN61000-3-2, EN61000-3-3, VCCI 32-1; Immunity: EN55035, EN61000-4-2, EN61000-4-3, EN61000-4-4, EN61000-4-5, EN61000-4-6, EN61000-4-8, EN61000-4-11 |
| Safety standards | IEC62368-1 2nd Edition, IEC60950-1, UL62368-1 2nd Edition |
| Certifications | North America (FCC, UL), Europe (CE), UK (UKCA), Australia (RCM), Taiwan (BSMI), Japan (VCCI), South Korea (KC), China (CQC); CB-62368, CB-60950 |
| Environmental standards | EU 2011/65/EU RoHS Directive, XVII REACH 1907/2006, 2012/19/EU WEEE Directive |

The European Directive 2012/19/EU on Waste Electrical and Electronic Equipment (WEEE) states that these appliances should not be disposed of as part of the routine solid urban waste cycle, but collected separately in order to optimise the recovery and recycling flow of the materials they contain, while also preventing potential damage to human health and the environment arising from the presence of potentially hazardous substances.
The crossed-out bin symbol is printed on all products as a reminder that they must not be disposed of with your other household waste.
Owners of electrical and electronic equipment (EEE) should contact their local government agencies to identify local WEEE collection and treatment systems for the environmental recycling and/or disposal of their end-of-life computer products. For more information on proper disposal of these devices, refer to the public utility service.
2.8. Ordering information
Bow Pod systems are available to order from Graphcore channel partners – see https://www.graphcore.ai/partners for details of your nearest Graphcore partner.