Targeting the IPU from TensorFlow 1
Version: 3.4.0
1. Introduction
1.1. Document overview
2. Setup quick start
2.1. Enable Poplar SDK
2.2. Create and enable a Python virtual environment
2.3. Install the TensorFlow 1 wheels and validate
3. Tutorial
3.1. Preliminary graphs
3.2. A basic graph
3.2.1. Selecting hardware to run on
3.2.2. Running on the IPU Model simulator
3.3. Compiling the graph for the IPU
3.4. Sharding a graph
3.5. Adding variables
3.5.1. Troubleshooting
3.5.2. Note on the global_step counter
4. Targeting the Poplar XLA device
4.1. Supported types
4.2. Device selection
4.3. Configuring system options
4.3.1. TF_POPLAR_FLAGS environment variable
4.4. Supported operations
4.5. Unsupported operations
4.6. Error handling
4.6.1. Construction and compilation errors
4.6.2. Runtime errors
5. Compiling and pre-compiling executables
5.1. Caching of compiled executables
5.2. Pre-compiling executables
5.2.1. Unsupported operations
6. Training a model
6.1. Training loops, data sets and feed queues
6.2. Accessing outfeed queue results during execution
6.3. Replicated graphs
6.3.1. Selecting the number of replicas
6.3.2. Performing parameter updates
6.4. Pipelined training
6.4.1. Grouped scheduling
6.4.2. Interleaved scheduling
6.4.3. Sequential scheduling
6.4.4. Pipeline stage inputs and outputs
6.4.5. Applying an optimiser to the graph
6.4.6. Device mapping
6.4.7. Concurrent pipeline stages
6.5. Gradient accumulation
6.5.1. Optimizers
6.5.2. Pipelining
6.5.3. Accumulation data type
6.6. Optimizer state offloading
6.7. Dataset benchmarking
6.7.1. Accessing the JSON data
6.8. Half and mixed precision training
7. Efficient IPU I/O
7.1. Prefetch elements
7.2. I/O tiles
8. Example using IPUEstimator
9. Example using IPUPipelineEstimator
10. Distributed training
10.1. PopDistStrategy examples
11. Half-precision floating point and stochastic rounding
11.1. Controlling the half-precision floating-point unit
11.2. Resetting the global random number seed
11.3. Debugging numerical issues
12. IPU-optimised operations
12.1. Image operations
12.2. Matmul serialisation
12.3. Dropout
12.4. Embedding lookup
12.5. Group normalisation
12.6. Instance normalisation
12.7. Layer normalisation
12.8. GeLU activation
12.9. Sequence slice
12.10. Histogram
13. IPU outlined functions
13.1. Usage
13.2. Examples
13.2.1. Models with common structures
13.2.2. Serializing large operations
14. Writing custom operations
14.1. Custom operation on the IPU
14.1.1. Building the Poplar graph
14.1.2. Gradient builders
14.1.3. Metadata
14.1.4. Compiling the IPU code
API level
PopLibs library code
Compiling the library file
14.1.5. Using the custom op in TensorFlow
14.1.6. Tensor allocation
14.1.7. Examples
In-place operations
Operation attributes
Custom codelet
14.2. Custom host CPU operations
14.2.1. Gradient callback
15. IPU host embeddings
15.1. Usage
15.2. Example
15.3. Experimental functionality: IPU embeddings in remote buffers
15.3.1. Partitioning strategies
Token strategy
Encoding strategy
Choosing a strategy for your application
16. IPU embedded application runtime
16.1. Usage
16.2. Pipelining and I/O tiles
16.2.1. Parallel requests
16.2.2. Timeout
16.2.3. Engine restarts
16.3. Example
16.4. Error handling
16.4.1. Runtime errors
17. Exporting precompiled models for TensorFlow Serving
17.1. Exporting non-pipelined models defined inside a function
17.1.1. Example of exporting non-pipelined model defined inside a function
17.1.2. Example of exporting non-pipelined model defined inside a function with additional preprocessing and postprocessing steps
17.2. Exporting pipelined models defined as a list of functions
17.2.1. Pipeline example
17.2.2. Pipeline example with preprocessing and postprocessing steps
17.3. Running the model in TensorFlow Serving
18. Retrieving information about compilation and execution
18.1. TensorFlow options for reporting
18.2. XLA graph file naming
19. IPU TensorFlow Addons
19.1. Introduction
19.2. IPU SavedModel CLI
19.2.1. Run subcommand
19.2.2. Convert subcommand
19.2.3. Pipeline configuration
19.2.4. Pipeline development
19.2.5. Pipeline solution file
19.2.6. Example configuration file
20. TensorFlow API changes
20.1. Release 3.0
20.1.1. Non-breaking changes
Deprecated modules
20.2. Release 2.6
20.2.1. Breaking changes
Removal of deprecated APIs
20.3. Release 2.5
20.3.1. Breaking changes
Removal of deprecated APIs
Other
20.3.2. Non-breaking changes
Deprecated layers
RNN available_memory_proportion_fwd/available_memory_proportion_bwd deprecated
20.4. Release 2.4
20.4.1. Breaking changes
Summary ops
Removal of deprecated members
20.4.2. Non-breaking changes
20.5. Release 2.3
20.5.1. Breaking changes
Custom user op metadata interface updates
The verified transfers feature has been removed
20.5.2. Non-breaking changes
20.6. Release 2.2
20.6.1. Breaking changes
C++ Poplar TensorFlow libraries are private by default
Reports removed from ipu events
20.6.2. Non-breaking changes
IPULoggingTensorHook replication_factor deprecated
IPUInfeedQueue/IPUOutfeedQueue/IPULoggingTensorHook feed_name deprecated
Change of output location for profiling information
IPU Keras Layers deprecation in TensorFlow 1.15
Warning when epsilon value is too low
20.7. Release 2.1
20.7.1. Breaking changes
IPUPipelineEstimator change
Autosharding removed
Old IPU option configuration API changes
IPU Keras changes [TensorFlow 2]
20.7.2. Non-breaking changes
Recompute suggestions deprecated
IPUInfeedQueue/IPUOutfeedQueue replication_factor deprecated
IPUInfeedQueue data_to_prefetch deprecated
IPUOutfeedQueue data_to_prefetch deprecated
CTC loss ops deprecated
New configuration API
Support for grouped collectives
Environment variable changes
20.8. Release 2.0
20.8.1. Breaking changes
20.8.2. Non-breaking changes
IPUPipelineEstimator change
Autosharding deprecated
IPU config change
IPU Keras changes [TensorFlow 2]
21. TensorFlow Python API
21.1. Operations and utilities related to the Graphcore IPU
21.2. Compiler interface
21.3. Scoping contexts
21.4. Infeed queue
21.5. Outfeed queue
21.6. General utilities
21.7. Configuration utilities
21.8. Looping utilities
21.9. Distributed training
21.10. Horovod
21.11. Serving utilities
21.12. Datasets
21.12.1. Dataset benchmarking
21.12.2. Dataset wrappers
21.13. Estimators
21.13.1. IPUEstimator
21.13.2. IPUPipelineEstimator
21.13.3. Run configs
21.13.4. Session run hooks
21.14. Keras layers
21.14.1. Keras layer specializations for the Graphcore IPU
21.15. Operators
21.15.1. Control flow operations
21.15.2. Custom operations
21.15.3. Functional operators
21.15.4. Image operations
21.15.5. Graphcore utility operations
21.15.6. IPU specific maths operations
21.15.7. Pipelining operators
21.15.8. Popnn primitive neural network operators
21.15.9. Popnn normalization operators
21.15.10. Popops all to all and all gather operators
21.15.11. Popops cross replica operators
21.15.12. Popops embedding operators
21.15.13. Popops reduce scatter operator
21.15.14. Popops within replica operators
21.15.15. Poprand operators
21.15.16. Utility operations to be used in replicated mode
21.15.17. Slicing operators
21.15.18. Statistics operators
21.15.19. Embedded application runtime
21.16. Optimisers
21.16.1. Helper classes and methods for gradient accumulation
21.16.2. Optimizer classes for the Graphcore IPU
21.17. Sharding
21.17.1. Utility functions for sharding graphs
22. TensorFlow operators supported by the IPU
23. IPU TensorFlow Addons API changes
23.1. Release 3.0
23.1.1. Breaking changes
23.2. Release 2.5
23.2.1. Non-breaking changes
RNN available_memory_proportion_fwd/available_memory_proportion_bwd deprecated
23.3. Release 2.4
24. IPU TensorFlow Addons Python API
24.1. TensorFlow layers
24.1.1. TensorFlow layers made for IPU TensorFlow
24.2. TensorFlow optimizers
24.2.1. Optimizers made for IPU TensorFlow
25. Resources
25.1. Graphcore
25.2. TensorFlow
25.3. Other
26. Trademarks & copyright