PopRT User Guide
Version: latest
5. Features

  • 5.1. Passes
  • 5.2. FP8
  • 5.3. Overlap I/O
  • 5.4. Dynamic batch size
  • 5.5. Packing
  • 5.6. CPU packing
  • 5.7. Model fusion
  • 5.8. Custom operations
  • 5.9. Custom passes
  • 5.10. Custom patterns
  • 5.11. Custom transforms
  • 5.12. Manual sharding
  • 5.13. Error handling
  • 5.14. PopRT frontend
  • 5.15. Auto-sharding
  • 5.16. Model debugger

Revision 67adfdbc.