IPU Inference Toolkit User Guide
  • 1. Overview
  • 2. IPU Inference Solution Architecture
    • 2.1. Model serving
    • 2.2. Graphcore Poplar software stack
      • 2.2.1. PopART
      • 2.2.2. PopEF and PopRT Runtime
    • 2.3. IPU inference solution architecture
      • 2.3.1. Model compilation
        • Model export
        • Choosing the batch size
        • Choosing the precision
        • Model conversion
        • Model compilation
      • 2.3.2. Running the model
  • 3. Environment Setup
    • 3.1. Host CPU architecture
    • 3.2. Host operating system
    • 3.3. Docker
    • 3.4. Poplar SDK
      • 3.4.1. Installing the Poplar SDK
    • 3.5. Checking the IPU hardware
    • 3.6. Installing PopRT
      • 3.6.1. Installing via container
      • 3.6.2. Installing via pip
    • 3.7. Starting the IPU runtime environment via a container
      • 3.7.1. gc-docker
      • 3.7.2. Starting a container with docker run
      • 3.7.3. Querying IPU status inside the container
  • 4. Model Compilation
    • 4.1. ONNX models
      • 4.1.1. Model export
      • 4.1.2. Choosing the batch size
      • 4.1.3. Choosing the precision
      • 4.1.4. Model conversion and compilation
    • 4.2. TensorFlow models
      • 4.2.1. Model export
      • 4.2.2. Model conversion and compilation
    • 4.3. PyTorch models
      • 4.3.1. Model export
      • 4.3.2. Model conversion and compilation
  • 5. Model Execution
    • 5.1. Running with PopRT Runtime
      • 5.1.1. Environment setup
      • 5.1.2. Running via the Python API
      • 5.1.3. Running via the C++ API
    • 5.2. Deploying to Triton Inference Server
      • 5.2.1. Environment setup
      • 5.2.2. Generating the model configuration
        • Model name
        • Backend
        • Batching
        • Inputs and outputs
      • 5.2.3. Starting the model service
        • Verifying the service via gRPC
        • Verifying the service via HTTP
    • 5.3. Deploying to TensorFlow Serving
      • 5.3.1. Environment setup
      • 5.3.2. Generating a SavedModel
      • 5.3.3. Starting the model service
        • Enabling or disabling batching
      • 5.3.4. Verifying the service via HTTP
  • 6. End-to-End NLP Online Search Solution
    • 6.1. Introduction
    • 6.2. Application scenarios for NLP online search
    • 6.3. Challenges and issues in NLP online search
    • 6.4. Model conversion
    • 6.5. Model deployment
    • 6.6. Model requests and stress testing
    • 6.7. Monitoring the model inference service
      • 6.7.1. Importing the template via the Grafana web front end
    • 6.8. Conclusion
    • 6.9. Appendix
      • 6.9.1. Grafana template file
  • 7. Hybrid GPU/IPU Deployment Solution for Kubernetes Clusters
    • 7.1. Introduction
    • 7.2. Building a Kubernetes cluster with mixed GPU and IPU resources
      • 7.2.1. Adding labels to nodes
        • Enabling GPU resource scheduling in the Kubernetes cluster
        • Enabling IPU resource scheduling in the Kubernetes cluster
    • 7.3. Unified device scheduling in a heterogeneous IPU and GPU cluster
    • 7.4. A unified image supporting IPU and GPU models
    • 7.5. Application example
      • 7.5.1. Kubernetes application deployment
      • 7.5.2. Testing the service
  • 8. Container release notes
    • 8.1. Triton Inference Server
      • 8.1.1. New features
      • 8.1.2. Bug fixes
      • 8.1.3. Other improvements
      • 8.1.4. Known issues
      • 8.1.5. Compatibility changes
    • 8.2. TensorFlow Serving
      • 8.2.1. New features
      • 8.2.2. Bug fixes
      • 8.2.3. Other improvements
      • 8.2.4. Known issues
      • 8.2.5. Compatibility changes
  • 9. Trademarks & copyright

Search help

Note: Searching from the top-level index page will search all documents. Searching from a specific document will search only that document.

  • Find an exact phrase: Wrap your search phrase in "" (double quotes) to only get results where the phrase is exactly matched. For example "PyTorch for the IPU" or "replicated tensor sharding"
  • Prefix query: Add an * (asterisk) at the end of any word to indicate a prefix query. This will return results containing all words with the specific prefix. For example tensor*
  • Fuzzy search: Use ~N (tilde followed by a number) at the end of any word for a fuzzy search. This will return results that are similar to the search word. N specifies the “edit distance” (fuzziness) of the match. For example Polibs~1
  • Words close to each other: ~N (tilde followed by a number) after a phrase (in quotes) returns results where the words are close to each other. N is the maximum number of positions allowed between matching words. For example "ipu version"~2
  • Logical operators: you can use the following logical operators in a search:
    • + signifies AND operation
    • | signifies OR operation
    • - negates a single word or phrase (returns results without that word or phrase)
    • () controls operator precedence
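The fuzzy search described above is based on edit distance: `Polibs~1` matches any word reachable from `Polibs` by at most one single-character insertion, deletion, or substitution. As an illustrative sketch only (not the documentation site's actual search implementation), a classic Levenshtein distance function makes this concrete:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                 # delete ca from a
                cur[j - 1] + 1,              # insert cb into a
                prev[j - 1] + (ca != cb),    # substitute ca -> cb (free if equal)
            ))
        prev = cur
    return prev[-1]

# The query Polibs~1 would match "PopLibs" (case-insensitive comparison
# assumed here), because one insertion of 'p' turns polibs into poplibs.
print(edit_distance("polibs", "poplibs"))  # 1
```

So `~1` tolerates a single typo, while `~2` would also match words two edits away; larger values of N return progressively looser matches.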

9. Trademarks & copyright

Graphcloud®, Graphcore®, Poplar® and PopVision® are registered trademarks of Graphcore Ltd.

Bow™, Bow-2000™, Bow Pod™, Colossus™, In-Processor-Memory™, IPU-Core™, IPU-Exchange™, IPU-Fabric™, IPU-Link™, IPU-M2000™, IPU-Machine™, IPU-POD™, IPU-Tile™, PopART™, PopDist™, PopLibs™, PopRun™, PopTorch™, Streaming Memory™ and Virtual-IPU™ are trademarks of Graphcore Ltd.

All other trademarks are the property of their respective owners.

This software is made available under the terms of the Graphcore End User License Agreement (EULA) and the Graphcore Container License Agreement. Please ensure you have read and accept the terms of the corresponding license before using the software. The Graphcore EULA applies unless indicated otherwise.

Copyright © 2022 Graphcore Ltd. All rights reserved.

Revision 090c6467.