Targeting the IPU from TensorFlow 2
- 1. Introduction
- 2. Targeting the Poplar XLA device
- 3. Support for TensorFlow 2
- 4. Keras with IPUs
- 5. Compiling and pre-compiling executables
- 6. Training a model
  - 6.1. Training loops, data sets and feed queues
  - 6.2. Optional simplification of infeeds and outfeeds
  - 6.3. Accessing outfeed queue results during execution
  - 6.4. Replicated graphs
  - 6.5. Pipelined training
  - 6.6. Recomputation
  - 6.7. Gradient accumulation
  - 6.8. Optimizer state offloading
  - 6.9. Replicated tensor sharding
  - 6.10. Dataset benchmarking
- 7. Efficient IPU I/O
- 8. Example using IPUEstimator
- 9. Example using IPUPipelineEstimator
- 10. Distributed training
- 11. Half-precision floating point and stochastic rounding
- 12. IPU-optimised operations
- 13. IPU Outlined Functions
- 14. Writing custom operations
- 15. IPU host embeddings
- 16. IPU embedded application runtime
- 17. Exporting precompiled models for TensorFlow Serving
- 18. Retrieving information about compilation and execution
- 19. Keras with IPUs
  - 19.1. Single IPU models
  - 19.2. Using steps_per_execution
  - 19.3. Gradient accumulation
  - 19.4. Model parallelism
  - 19.5. Automatic data parallelism
  - 19.6. Asynchronous callbacks
  - 19.7. Configuring Infeeds and Outfeeds
  - 19.8. Saving and loading Keras models
  - 19.9. Exporting precompiled Keras models for TensorFlow Serving
  - 19.10. IPU-specific Keras layers and optimizers
  - 19.11. Implementation details
  - 19.12. Automatic loss scaling
- 20. IPU TensorFlow Addons
- 21. TensorFlow API changes
- 22. TensorFlow Python API
  - 22.1. Operations and utilities related to the Graphcore IPU
  - 22.2. Distribution strategy for a single system
  - 22.3. Compiler interface
  - 22.4. Scoping contexts
  - 22.5. Infeed queue
  - 22.6. Outfeed queue
  - 22.7. General utilities
  - 22.8. Configuration utilities
  - 22.9. Looping utilities
  - 22.10. Distribution using PopDist
  - 22.11. Serving utilities
  - 22.12. Datasets
  - 22.13. Estimators
  - 22.14. Operators
    - 22.14.1. Control flow operations
    - 22.14.2. Custom operations
    - 22.14.3. Functional operators
    - 22.14.4. Image operations
    - 22.14.5. Graphcore utility operations
    - 22.14.6. IPU-specific maths operations
    - 22.14.7. Pipelining operators
    - 22.14.8. Popnn primitive neural network operators
    - 22.14.9. Popnn normalization operators
    - 22.14.10. Popops all to all and all gather operators
    - 22.14.11. Popops cross replica operators
    - 22.14.12. Popops embedding operators
    - 22.14.13. F8 operations
    - 22.14.14. Popops reduce scatter operator
    - 22.14.15. Popops within replica operators
    - 22.14.16. Poprand operators
    - 22.14.17. Utility operations to be used in replicated mode
    - 22.14.18. Slicing operators
    - 22.14.19. Statistics operators
    - 22.14.20. Embedded application runtime
  - 22.15. Optimisers
  - 22.16. Sharding
- 23. TensorFlow operators supported by the IPU
- 24. Keras API changes
- 25. Keras Python API
- 26. IPU TensorFlow Addons API changes
- 27. IPU TensorFlow Addons Python API
- 28. Resources
- 29. Legal notices