Model Parallelism on the IPU with TensorFlow: Sharding and Pipelining
Version: 3.1.0
1. Model parallelism
2. Sharding
2.1. Graph sharding
2.1.1. API
2.1.2. Code example
2.2. Limitations of sharding
3. Pipelining
3.1. Overview
3.2. Pipeline operation
3.3. Pipelining API
3.3.1. Inputs and outputs
3.3.2. Device mapping
3.3.3. Pipeline scheduling
Memory use
Execution time
Ramp-up and ramp-down time
Inter-IPU optimisations
3.3.4. Keras API in TensorFlow 2
3.4. Code examples
3.4.1. Inference code examples
3.4.2. Training code examples
3.5. Optimising the pipeline
3.5.1. Recomputation
3.5.2. Variable offloading
3.5.3. Device selection order
3.5.4. Data parallelism
3.5.5. Increase the gradient accumulation count
3.5.6. Profiling
4. PopVision™ Graph Analyser tool
5. Trademarks & copyright