IPU Inference Toolkit User Guide
  • 1. Overview
  • 2. IPU Inference Toolkit architecture
    • 2.1. Model servers
    • 2.2. Graphcore Poplar software stack
      • 2.2.1. PopART
      • 2.2.2. PopEF and PopRT Runtime
    • 2.3. Using the IPU Inference Toolkit
      • 2.3.1. Model compilation overview
        • Model export
        • Batch size selection
        • Precision selection
        • Model conversion
        • Model compilation
      • 2.3.2. Model runtime overview
  • 3. Environment preparation
    • 3.1. Host server CPU architecture
    • 3.2. Host server operating system
    • 3.3. Docker
    • 3.4. Poplar SDK
      • 3.4.1. Installing the Poplar SDK on the host server
    • 3.5. Inspection of IPU hardware
    • 3.6. Install PopRT
      • 3.6.1. Installation with a Docker container
      • 3.6.2. Installation with pip
    • 3.7. Docker containers for the Poplar SDK
      • 3.7.1. gc-docker
      • 3.7.2. Run a Docker container
      • 3.7.3. Query the IPU status from a Docker container
  • 4. Model compilation
    • 4.1. ONNX model
      • 4.1.1. Model exporting
      • 4.1.2. Batch size selection
      • 4.1.3. Precision selection
      • 4.1.4. Model conversion and compilation
    • 4.2. TensorFlow model
      • 4.2.1. Model exporting
      • 4.2.2. Model conversion and compilation
    • 4.3. PyTorch model
      • 4.3.1. Model exporting
      • 4.3.2. Model conversion and compilation
  • 5. Model runtime
    • 5.1. Run with PopRT Runtime
      • 5.1.1. Environment preparation
      • 5.1.2. Run with PopRT Runtime Python API
      • 5.1.3. Run with PopRT Runtime C++ API
    • 5.2. Deploy to Triton Inference Server
      • 5.2.1. Environment preparation
      • 5.2.2. Configuration of generated model
        • Model name
        • Backend
        • Batching
        • Input and output
      • 5.2.3. Start model service
        • Verify the service with gRPC
        • Verify the service with HTTP
    • 5.3. Deploy to TensorFlow Serving
      • 5.3.1. Environment preparation
      • 5.3.2. Generate SavedModel model
      • 5.3.3. Start model service
        • Running with and without batching
      • 5.3.4. Verify the service with HTTP
  • 6. Container release notes
    • 6.1. Triton Inference Server
      • 6.1.1. New features
      • 6.1.2. Bug fixes
      • 6.1.3. Other improvements
      • 6.1.4. Known issues
      • 6.1.5. Compatibility changes
    • 6.2. TensorFlow Serving
      • 6.2.1. New features
      • 6.2.2. Bug fixes
      • 6.2.3. Other improvements
      • 6.2.4. Known issues
      • 6.2.5. Compatibility changes
  • 7. Trademarks & copyright

7. Trademarks & copyright

Graphcloud®, Graphcore®, Poplar® and PopVision® are registered trademarks of Graphcore Ltd.

Bow™, Bow-2000™, Bow Pod™, Colossus™, In-Processor-Memory™, IPU-Core™, IPU-Exchange™, IPU-Fabric™, IPU-Link™, IPU-M2000™, IPU-Machine™, IPU-POD™, IPU-Tile™, PopART™, PopDist™, PopLibs™, PopRun™, PopTorch™, Streaming Memory™ and Virtual-IPU™ are trademarks of Graphcore Ltd.

All other trademarks are the property of their respective owners.

This software is made available under the terms of the Graphcore End User License Agreement (EULA) and the Graphcore Container License Agreement. Please ensure you have read and accepted the terms of the corresponding license before using the software. The Graphcore EULA applies unless indicated otherwise.

Copyright © 2022 Graphcore Ltd. All rights reserved.
