Targeting the IPU from TensorFlow 2
- 1. Introduction
- 2. Targeting the Poplar XLA device
- 3. Compiling and pre-compiling executables
- 4. Support for TensorFlow 2
- 5. TensorFlow 2 examples
- 6. Training a model
- 7. Efficient IPU I/O
- 8. Example using IPUEstimator
- 9. Example using IPUPipelineEstimator
- 10. Distributed training
- 11. Half-precision floating point and stochastic rounding
- 12. IPU-optimised operations
- 13. IPU outlined functions
- 14. Writing custom operations
- 15. IPU host embeddings
- 16. Retrieving information about compilation and execution
- 17. API changes
- 18. Deprecated profiling functionality
  - 18.1. Adding an operation to get compilation and execution events
  - 18.2. Enabling tracing in the hardware configuration options
  - 18.3. Extracting the reports from the returned events
  - 18.4. Producing reports for use with the PopVision Graph Analyser
  - 18.5. Using the IPU Model device for debugging
  - 18.6. Reading the Poplar textual summary report
  - 18.7. Producing an ELF image of the compilation
- 19. Python API
  - 19.1. Operations and utilities related to the Graphcore IPU
  - 19.2. Distribution strategy for a single system
  - 19.3. Compiler interface
  - 19.4. Scoping contexts
  - 19.5. Infeed queue
  - 19.6. Outfeed queue
  - 19.7. General utilities
  - 19.8. Configuration utilities
  - 19.9. Looping utilities
  - 19.10. Distributed training
  - 19.11. Horovod
  - 19.12. Datasets
  - 19.13. Estimators
  - 19.14. Keras
  - 19.15. Keras layers
  - 19.16. Keras losses
  - 19.17. Keras optimizers
  - 19.18. Operators
    - 19.18.1. Custom operations
    - 19.18.2. Functional operators
    - 19.18.3. Image operations
    - 19.18.4. Graphcore utility operations
    - 19.18.5. IPU-specific maths operations
    - 19.18.6. Pipelining operators
    - 19.18.7. Popnn primitive neural network operators
    - 19.18.8. Popnn normalization operators
    - 19.18.9. Popnn recurrent neural network operators
    - 19.18.10. Popops all-to-all and all-gather operators
    - 19.18.11. Popops cross-replica operators
    - 19.18.12. Popops embedding operators
    - 19.18.13. Popops reduce scatter operator
    - 19.18.14. Poprand operators
    - 19.18.15. Utility operations to be used in replicated mode
    - 19.18.16. Slicing operators
    - 19.18.17. Statistics operators
    - 19.18.18. Summary operations for IPUs
  - 19.19. Optimisers
  - 19.20. Sharding
- 20. TensorFlow operators supported by the IPU
- 21. Resources
- 22. Index
- 23. Trademarks & copyright