1. Model parallelism

This technical note describes how to parallelise TensorFlow models on IPU hardware.

If a deep learning network has too many layers and parameters to fit on one IPU, we must divide it into pieces and distribute those pieces across multiple IPUs. This is the model parallelism approach, and it enables us to train large models that exceed the memory capacity of a single IPU accelerator. Currently, we support two types of model parallelism: sharding and pipelining.
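
As a minimal sketch of the idea, the example below splits a two-layer network across two IPUs using shard scopes. It assumes Graphcore's port of TensorFlow 1.x, where the `tensorflow.python.ipu` module provides `ipu_scope` and `ipu_shard`; exact configuration calls vary between SDK versions, so treat this as illustrative rather than definitive.

```python
import tensorflow as tf
from tensorflow.python import ipu
from tensorflow.python.ipu.scopes import ipu_scope, ipu_shard

# Request two IPUs from the system (configuration API names
# differ between SDK versions; this follows the TF1-era utils).
cfg = ipu.utils.create_ipu_config()
cfg = ipu.utils.auto_select_ipus(cfg, 2)
ipu.utils.configure_ipu_system(cfg)

def model(x):
    # Place the first part of the network on shard 0 (the first IPU) ...
    with ipu_shard(0):
        h = tf.layers.dense(x, 256, activation=tf.nn.relu)
    # ... and the remainder on shard 1 (the second IPU).
    with ipu_shard(1):
        return tf.layers.dense(h, 10)

with ipu_scope("/device:IPU:0"):
    x = tf.placeholder(tf.float32, shape=[None, 784])
    logits = model(x)
```

Here each shard scope pins its operations to one IPU, so activations flow from the first device to the second during a forward pass; pipelining builds on the same division but keeps all devices busy by overlapping work on successive batches.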