1. Overview
The goal of this document is to help Graphcore AI engineers and customers develop high-performance machine learning models running on the IPU, covering the general topic of optimising performance.
There are many factors that affect model performance, and this document will cover the following:
Memory optimisation
Execution schemes
Optimisations specific to the IPU and Poplar
Scaling an application over multiple replicas
Although this document focuses on performance, it is worth bearing in mind that numerical stability and convergence properties may limit the design options available when optimising a model.
Implementation details will not be covered in this document, but you can refer to the framework-specific documentation: