1. Overview
The Bow Pod64 reference design is a rack solution containing 16 Bow-2000 IPU-Machines, one to four host servers (one in the default reference configuration), network switches and Bow Pod software. Each Bow-2000 contains four Bow IPU processors, giving 64 in total. For more information on Bow Pod systems available from Graphcore see https://www.graphcore.ai/products.
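As a quick sanity check of the counts above, the following minimal Python sketch models the reference configuration. The class name `PodConfig` and its fields are hypothetical, chosen for illustration only; they are not part of any Graphcore software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodConfig:
    """Hypothetical model of a Bow Pod64 rack (illustration only)."""

    ipu_machines: int = 16      # Bow-2000 IPU-Machines per rack
    ipus_per_machine: int = 4   # Bow IPU processors in each Bow-2000
    host_servers: int = 1       # default reference configuration

    def __post_init__(self) -> None:
        # The reference design allows between one and four host servers.
        if not 1 <= self.host_servers <= 4:
            raise ValueError("host_servers must be between 1 and 4")

    @property
    def total_ipus(self) -> int:
        # Total IPUs = number of IPU-Machines x IPUs per machine.
        return self.ipu_machines * self.ipus_per_machine


pod64 = PodConfig()
print(pod64.total_ipus)  # 16 * 4 = 64
```

A configuration with more host servers, for example `PodConfig(host_servers=4)`, leaves the IPU count unchanged, since the host servers do not contain IPUs.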
Warning
This guide is for properly trained service personnel and technicians who are required to install Bow Pod systems.
If you have any questions, please contact your Graphcore representative or use the resources on the Graphcore support portal: https://www.graphcore.ai/support.
1.1. Acronyms and abbreviations
The following terms are among the most commonly used in this document.
Term | Description |
AOC | Active optical cable |
BMC | Baseboard Management Controller. A service processor in the standby power domain that performs system hardware management. |
BOM | Bill of Materials |
GCD | Graph compile domain. A GCD is operated by a single Poplar instance within the system, either for a single IPU-Machine or for several IPU-Machines connected by IPU-Link cables. |
GW-Link | High-speed (100 GbE) communication links that connect IPU-Machines horizontally across Bow Pod64 racks. Special cables are required for GW-Links. |
IPU-Gateway | A device that disaggregates the server(s) and the four IPUs in the IPU-Machine across a RoCE network, provides external IPU memory, and enables IPU scale-out across 100 GbE connections (GW-Links) for rack-to-rack connectivity. |
IPU-Link | High-speed communication links that connect IPUs both within and between IPU-Machines in a Pod. Special cables are required for IPU-Links. |
IPU-Machine | The blades installed in your system: IPU-M2000 in IPU-POD systems and Bow-2000 in Bow Pod systems. |
Pod | Covers both IPU-POD systems (such as the IPU‑POD64 and IPU‑POD256) and Bow Pod systems (such as Bow Pod64 and Bow Pod256). |
PDU | Power Distribution Unit |
RDMA | Remote direct memory access |
RNIC | RDMA Network Interface Controller |
RoCE | RDMA over Converged Ethernet |
ToR | Top of Rack. Often also used to refer to the ToR RDMA switch that is placed on top of the IPU-Machines. |