Tutorials
Version: latest
  • 1. Introduction
    • 1.1. Prerequisites
    • 1.2. Running these tutorials
  • 2. PyTorch
    • 2.1. Introduction to PopTorch - running a simple model
      • What is PopTorch?
      • Getting started: training a model on the IPU
        • Import the packages
        • Load the data
          • PopTorch DataLoader
        • Build the model
        • Prepare training for IPUs
        • Train the model
          • Training loop
          • Use the same IPU for training and inference
          • Save the trained model
        • Evaluate the model
      • Using the model on our own images to get predictions
        • Running our model on the IPU
        • Running our model on the CPU
        • Limitations with our model
      • Doing more with poptorch.Options
        • deviceIterations
        • replicationFactor
        • randomSeed
        • useIpuModel
      • How to set the options
      • Summary
        • Next steps:
    • 2.2. Efficient data loading with PopTorch
      • PyTorch and PopTorch DataLoader
      • Understanding batching with IPU
        • Device iterations
          • A note on returned data
        • Gradient accumulation
        • Replication
        • Global batch size
          • How many samples will then be loaded in one step?
      • Tuning hyperparameters
        • Evaluating the asynchronous DataLoader
        • What if the DataLoader throughput is too low?
        • Device iterations vs global batch size
          • Case of a training session
          • Case of an inference session
          • Conclusion: Training and inference sessions
      • Experiments
        • Case 1: No bottleneck
          • Why is the throughput lower with real data?
        • Case 2: Larger global batch size with replication
      • Summary
    • 2.3. BERT Fine-tuning on the IPU
      • File structure
      • How to use this demo
      • License
    • 2.4. Half and mixed precision in PopTorch
      • General
        • Motivation for half precision
        • Numerical stability
          • Loss scaling
          • Stochastic rounding
      • Train a model in half precision
        • Import the packages
        • Build the model
        • Choose parameters
          • Casting a model’s parameters
          • Casting a single layer’s parameters
        • Prepare the data
        • Optimizers and loss scaling
        • Set PopTorch’s options
          • Stochastic rounding on IPU
          • Partials data type
        • Train the model
        • Evaluate the model
      • Visualise the memory footprint
      • Debug floating-point exceptions
      • Summary
    • 2.5. Observing Tensors in PopTorch
      • Table of Contents
      • General
      • File structure
      • Method 1: print tensor
      • Method 2: direct anchoring
      • Anchor modes
      • Gradient histogram example
        • Import packages
        • Build the model
        • Assigning assorted parameters
        • Set PopTorch options
        • Setting up the data loader
        • Initialising the PopTorch model
        • Printing out the tensor names
        • Anchoring the tensors
        • Training the model
        • Retrieving the tensors
        • Building the histogram
    • 2.6. PopTorch Parallel Execution Using Pipelining
      • File structure
      • Introduction to pipelined execution
      • Setting hyperparameters
      • Preparing the data
      • Model definition
        • Annotation for model partitioning
          • Defining the training model
      • Execution strategies
        • Pipelined execution (parallel)
          • Assigning blocks to stages and IPUs
          • Setting gradient accumulation and device iterations
        • Efficient model partitioning (advanced)
        • Sharded execution (sequential)
      • Saving memory by offloading to remote buffers (advanced)
      • Training the model
      • Inference
      • How to run the example script
      • Conclusion
    • 2.7. Training a Hugging Face model on the IPU using a local dataset
      • How to run this tutorial
        • Getting the dataset
        • Environment
      • Graphcore Hugging Face models
        • Utility imports
      • Preparing the NIH Chest X-ray Dataset
        • Preparing the labels
        • Create the dataset
        • Visualising the dataset
      • Preparing the model
      • Run the training
        • Plotting convergence
      • Run the evaluation
      • Conclusion
  • 3. PyTorch Geometric
    • 3.1. PyTorch Geometric on IPUs at a glance
      • Running on Paperspace
      • Porting to the IPU basics
      • High performance dataloader and fixed size inputs
      • Operation and layer considerations
        • Operations
          • Boolean indexing
        • PyTorch Geometric Layers
          • Global pooling layers
          • GCNConv layers
      • Conclusion
    • 3.2. An end-to-end example using PyTorch Geometric on IPUs
      • Running on Paperspace
      • Loading a PyTorch Geometric dataset
      • Using a dataloader
      • Creating a model
        • Message passing layers in PyTorch Geometric
        • Using the GCNConv layer in a model
      • Training our model
      • Running inference on the trained model
      • Conclusion
    • 3.3. Small graph batching on IPUs using padding
      • Running on Paperspace
      • Introduction to small graph structures and the QM9 dataset
      • QM9 dataset in PyTorch Geometric
      • Mini-batching in PyTorch Geometric
      • Using the fixed size data loader in PopTorch Geometric
        • Batching using FixedSizeDataLoader in PopTorch Geometric
        • Training on the IPU using the FixedSizeDataLoader in PopTorch Geometric
      • Conclusion
    • 3.4. Small graph batching on IPUs using packing
      • Running on Paperspace
      • Introduction to small graph structures and the MUTAG dataset
        • Loading the MUTAG dataset in PyTorch Geometric
      • Batching small graphs using packing
        • How to pack and pad a mini-batch of graphs
        • Packing using the fixed size data loader in PopTorch Geometric
        • Understanding packing efficiency
        • Training using a packed data loader
      • Conclusion
    • 3.5. Sampling Large Graphs for IPUs using PyTorch Geometric
      • Environment setup
      • Loading a large graph
      • Clustering to train the large graph for node classification
        • Clustering the graph
        • Training a GNN to classify papers in the PubMed dataset
      • Neighbourhood sampling the computation graph for node classification
        • Neighbour sampling the graph
        • Training using neighbourhood sampling on IPUs
      • Conclusion
    • 3.6. Heterogeneous Graph Learning on IPUs
      • Running on Paperspace
      • Introduction to heterogeneous graphs
        • Loading a heterogeneous graph dataset
      • Creating heterogeneous GNNs
        • Automatically converting a GNN model
        • Using the heterogeneous convolution wrapper
        • Using heterogeneous operators
      • Fixed size heterogeneous data loading
      • Conclusion
  • 4. PopXL
    • 4.1. PopXL and popxl.addons
      • Introduction
      • Requirements
      • Basic concepts
      • A simple example
        • Imports
        • Defining a Linear Module
        • Creating a graph from a Module
        • Summary and concepts in practice
        • Multiple bound graphs
      • Nested Modules and Outlining
        • DotTree example
      • MNIST
        • Load dataset
        • Defining the Training step
        • Validation
      • Conclusion
    • 4.2. PopXL Custom Optimiser
      • Introduction
      • Requirements
      • Imports
      • Defining the Adam optimiser
        • Managing in-place ops
        • Using the var_updates module
        • Using our custom optimiser
      • MNIST with Adam
      • Validation
      • Conclusion
    • 4.3. Data Parallelism in PopXL
    • 4.4. Pipelining in PopXL
    • 4.5. Remote Variables and RTS in PopXL
    • 4.6. Phased Execution in PopXL
  • 5. Poplar
    • 5.1. Poplar Tutorial 1: Programs and Variables
      • Setup
      • Graphs, variables and programs
        • Creating the graph
        • Adding variables and mapping them to IPU tiles
        • Adding the control program
        • Compiling the Poplar executable
      • Initialising variables
      • Getting data into and out of the device
      • Data streams
      • (Optional) Using the IPU
      • Summary
    • 5.2. Poplar Tutorial 2: Using PopLibs
      • Setup
      • Using PopLibs
      • Reshaping and transposing data
    • 5.3. Poplar Tutorial 3: Writing Vertex Code
      • Setup
      • Writing vertex code
      • Creating a codelet
      • Creating a compute set
      • Executing the compute set
    • 5.4. Poplar Tutorial 4: Profiling Output
      • Setup
      • Profiling on the IPU
      • Profiling methods
        • Command line profile summary
        • Generating profile report files
        • Using the PopVision analysis API in C++ or Python
      • Using PopVision Graph Analyser - loading and viewing a report
      • Using PopVision Graph Analyser - general functionality
        • Capturing IPU Reports - setting POPLAR_ENGINE_OPTIONS
        • Comparing two reports
        • Profiling an out of memory program
      • Using PopVision Graph Analyser - different tabs in the application
        • Memory report
        • Program tree
        • Operations summary
        • Liveness report
        • Execution trace
      • Follow-ups
      • Summary
    • 5.5. Poplar Tutorial 5: Matrix-vector Multiplication
      • Setup
      • Vertex code
      • Host code
      • (Optional) Using the IPU
      • Summary
    • 5.6. Poplar Tutorial 6: Matrix-vector Multiplication Optimisation
      • Setup
      • Optimising matrix-vector multiplication
  • 6. TensorFlow 2
    • 6.1. Using Infeed and Outfeed Queues in TensorFlow 2
      • Directory Structure
      • Table of Contents
      • Introduction
      • Example
        • Import the necessary APIs
        • Define hyperparameters
        • Prepare the dataset
        • Define the model
        • Define the custom training loop
        • Configure the hardware
        • Create data pipeline and execute the training loop
      • Additional notes
        • License
    • 6.2. Keras tutorial: How to run on IPU
      • Keras MNIST example
      • Running the example on the IPU
        • 1. Import the TensorFlow IPU module
        • 2. Prepare the dataset
        • 3. Add IPU configuration
        • 4. Specify IPU strategy
        • 5. Wrap the model within the IPU strategy scope
        • 6. Results
      • Going faster by setting steps_per_execution
      • Replication
      • Pipelining
    • 6.3. Using TensorBoard in TensorFlow 2 on the IPU
      • Preliminary Setup
      • Introduction to TensorBoard and Data Logging
        • How does TensorBoard work?
      • How do I launch TensorBoard?
        • TensorBoard on a Remote Machine
          • SSH Tunnelling
          • Exposing TensorBoard to the Network
        • Automatically Handling Log Directory Cleansing
      • Logging Data with tf.keras.callbacks.Callback
        • Running Evaluation at the end of an Epoch
        • Supported Data Types in tf.summary
        • Logging Custom Image Data at the end of an Epoch
        • Using tf.keras.callbacks.TensorBoard
      • Model Setup & Data Preparation
      • Model Definition
      • Model Training
      • Exploring TensorBoard
        • Scalars
        • Images
        • Graphs
        • Distributions and Histograms
          • Distributions
          • Histograms
      • Time Series
      • Using TensorBoard Without Keras
      • To Conclude
  • 7. PopVision
    • 7.1. PopVision Tutorial: Accessing profiling information
      • How to run this tutorial
      • Setup
      • Using the Python API
        • Loading a profile
        • Using visitors to explore the data
      • Going further with the PopVision Graph Analyser
    • 7.2. PopVision Tutorial: Lightweight Profiling
      • Introduction
      • Setup
      • Example 1: use of the Block program
        • Nested Block programs
      • Example 2: implicit Block programs
      • Example 3: I/O
      • Block flush
      • Conclusion
      • Further reading
    • 7.3. PopVision tutorial: Reading PVTI files with libpva
      • How to run this tutorial
      • Enabling PVTI file generation
      • Using the Python API
        • Loading a PVTI file
        • Accessing processes, threads, and events
        • Analysing epochs
      • Going further
    • 7.4. Tutorial: Instrumenting applications
      • How to run this tutorial
      • Introduction
      • Profiling execution of epochs
      • Logging the training and validation losses
      • Generating and profiling instant events
      • Generating heatmap data to visualise numerical stability of tensors
      • Going further
  • 8. Standard Tools
    • 8.1. Using IPUs from Jupyter Notebooks
      • Preparing your environment
      • Starting Jupyter with IPU support
        • Installing Jupyter
        • Starting a Jupyter server
        • Connect your local machine to the Jupyter server
        • Open the Jupyter notebook in the browser
      • Troubleshooting
        • Installing additional Python packages from a notebook
        • Encountering ImportErrors
        • Can’t connect to server
        • Login page
    • 8.2. Using VS Code with the Poplar SDK and IPUs
      • Goals
      • Terminology
      • Installing extensions
      • Python development
        • Easily creating an .env file
        • Choosing VS Code’s Python interpreter
        • Using the .env file to access IPUs
        • Debugging code which requires IPUs
      • Debugging C++ libraries and custom ops
        • Difficulty
        • Outline
        • 0. Choose the C++ code to debug
        • 1. Set up launch.json for C++ debugging
        • 2. Set up your Python program to make debugging easy
        • 3. Attach gdbserver to your running process using the PID
        • 4. Connect VS Code to gdbserver
      • Troubleshooting
        • ImportError and ModuleNotFoundError for PopTorch, PopART or TensorFlow
          • Symptoms
          • Solution
        • Config settings in launch.json are ignored
      • Features of the Python extension for VS Code
  • 9. Next steps
  • 10. Trademarks & copyright

Search help

Note: Searching from the top-level index page will search all documents. Searching from a specific document will search only that document.

  • Find an exact phrase: Wrap your search phrase in "" (double quotes) to return only results where the phrase is matched exactly. For example "PyTorch for the IPU" or "replicated tensor sharding"
  • Prefix query: Add an * (asterisk) at the end of any word to indicate a prefix query. This returns results containing all words that begin with that prefix. For example tensor*
  • Fuzzy search: Use ~N (tilde followed by a number) at the end of any word for a fuzzy search. This returns results that are similar to the search word. N specifies the "edit distance" (fuzziness) of the match. For example Polibs~1 (one edit away from PopLibs)
  • Words close to each other: ~N (tilde followed by a number) after a phrase (in quotes) returns results where the words are close to each other. N is the maximum number of positions allowed between matching words. For example "ipu version"~2
  • Logical operators: You can use the following logical operators in a search:
    • + signifies AND operation
    • | signifies OR operation
    • - negates a single word or phrase (returns results without that word or phrase)
    • () controls operator precedence
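
These operators can also be combined in a single query. The queries below are illustrative sketches only: the search terms are arbitrary examples, and they assume the operators compose as described above.

  • +ipu +"data loader" returns only results that contain the word ipu and the exact phrase "data loader"
  • pytorch | tensorflow returns results that contain either word
  • +poplar -profiling returns results that contain poplar but do not contain profiling
  • +(pytorch | popxl) +pipelining uses parentheses so the OR expression is evaluated first; results must match the group and also contain pipelining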
