Model Parallelism on the IPU with TensorFlow: Sharding and Pipelining
Version: latest

This technical note describes how to parallelise TensorFlow models on IPU hardware.

Contents

  • 1. Model parallelism
  • 2. Sharding
    • 2.1. Graph sharding
      • 2.1.1. API
      • 2.1.2. Code example
    • 2.2. Limitations of sharding
  • 3. Pipelining
    • 3.1. Overview
    • 3.2. Pipeline operation
    • 3.3. Pipelining API
      • 3.3.1. Inputs and outputs
      • 3.3.2. Device mapping
      • 3.3.3. Pipeline scheduling
        • Memory use
        • Execution time
        • Ramp-up and ramp-down time
        • Inter-IPU optimisations
      • 3.3.4. Keras API in TensorFlow 2
    • 3.4. Code examples
      • 3.4.1. Inference code examples
      • 3.4.2. Training code examples
    • 3.5. Optimising the pipeline
      • 3.5.1. Recomputation
      • 3.5.2. Variable offloading
      • 3.5.3. Device selection order
      • 3.5.4. Data parallelism
      • 3.5.5. Increase the gradient accumulation count
      • 3.5.6. Profiling
  • 4. PopVision™ Graph Analyser tool
  • 5. Trademarks & copyright

Revision cda29ff4.
