IPU Programmer's Guide
Version: latest
  • 1. Introduction
  • 2. IPU hardware overview
    • 2.1. Memory architecture
    • 2.2. Execution
    • 2.3. Tile architecture
      • 2.3.1. On-tile memory
    • 2.4. Host/device communication
  • 3. Programming model
    • 3.1. The Poplar graph library
    • 3.2. Programs
      • 3.2.1. Data variables
      • 3.2.2. Copying data and executing compute sets
      • 3.2.3. Control flow: sequences, conditionals and loops
      • 3.2.4. Compute sets
      • 3.2.5. The computational graph
      • 3.2.6. Data streams
      • 3.2.7. IPU-level task parallelism
        • 3.2.7.1. Overlapping I/O within the IPU
    • 3.3. Loading and running programs
    • 3.4. The implementation of ML frameworks using IPU programs
    • 3.5. The compilation and execution of IPU programs
      • 3.5.1. Variable liveness
  • 4. Programming tools
    • 4.1. Using a machine learning framework
    • 4.2. Writing IPU programs directly
    • 4.3. Adding custom operations to ML frameworks
    • 4.4. Compilation
    • 4.5. Executing programs
    • 4.6. Profiling and analysing programs
    • 4.7. Further reading
  • 5. Common algorithmic techniques for IPUs
    • 5.1. Replication
      • 5.1.1. Replication in ML training
      • 5.1.2. Replication in ML inference
      • 5.1.3. Using multiple processes on the host or multiple hosts
        • 5.1.3.1. Replicated tensor sharding
    • 5.2. Gradient accumulation
    • 5.3. Recomputation
    • 5.4. Model parallelism and pipelining (multi-IPU execution)
      • 5.4.1. A simple example
      • 5.4.2. Efficient pipelining
      • 5.4.3. Memory stashes and recomputation
      • 5.4.4. Interleaved schedule pipelining
      • 5.4.5. Further reading on pipelining
    • 5.5. Machine learning techniques on IPU hardware
      • 5.5.1. Batch size terminology
  • 6. Trademarks & copyright

6. Trademarks & copyright

Graphcloud®, Graphcore®, Poplar® and PopVision® are registered trademarks of Graphcore Ltd.

Bow™, Bow-2000™, Bow Pod™, Colossus™, In-Processor-Memory™, IPU-Core™, IPU-Exchange™, IPU-Fabric™, IPU-Link™, IPU-M2000™, IPU-Machine™, IPU-POD™, IPU-Tile™, PopART™, PopDist™, PopLibs™, PopRun™, PopTorch™, Streaming Memory™ and Virtual-IPU™ are trademarks of Graphcore Ltd.

All other trademarks are the property of their respective owners.

© Copyright 2022, Graphcore Ltd.

Revision 2ed121c4.