7. Viewing an Insights Report
The Insights Report gives you a brief summary of how well your model fits into the available IPU memory. The report shows which tiles are responsible for the highest memory usage, and which vertices and exchanges require the most memory in your model and on the IPUs.
The report also recommends actions you can take to reduce the reported memory usage, such as recomputation, changing the batch size or using FP16. Where relevant, it shows details of the largest memory requirements, together with an estimate of the memory that could be saved.
7.1. Memory insights
The top section of the Insights Report shows how well your model fits into the available IPU memory. It includes the following information:
A panel at the top shows whether your model is within the available IPU memory capacity, or whether you had an out-of-memory (OOM) issue. The proportion of IPU memory used is displayed, giving you a quick guide to how much memory you have spare (or by how much you need to reduce usage if your model was OOM).
A chart showing the five tiles with the highest memory usage. For each tile, the amount of memory required is displayed, as well as the IPU on which that tile is located.
A chart showing a histogram of memory usage across all tiles. This gives you insight into the number of tiles that are using particular amounts of memory.
A table showing the vertex and exchange memory usage.
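The quantities in this section can be illustrated with a short sketch. It is only an illustration of the arithmetic behind the panel and charts, not the tool's implementation: the helper name `summarise_tile_memory`, the per-tile usage figures, the tile capacity and the bin width are all assumptions made up for the example.

```python
from collections import Counter


def summarise_tile_memory(tile_bytes, tile_capacity, bin_width, top_n=5):
    """Summarise per-tile memory usage.

    Hypothetical helper for illustration only; it is not part of any
    Graphcore tool or API. `tile_bytes[i]` is the memory used on tile i.
    """
    total_capacity = tile_capacity * len(tile_bytes)
    proportion_used = sum(tile_bytes) / total_capacity

    # Rank tiles by usage, highest first; ties go to the lower tile ID.
    ranked = sorted(range(len(tile_bytes)), key=lambda t: (-tile_bytes[t], t))
    top_tiles = [(t, tile_bytes[t]) for t in ranked[:top_n]]

    # Histogram: start of each bin (in bytes) -> number of tiles in that bin.
    counts = Counter((b // bin_width) * bin_width for b in tile_bytes)
    histogram = dict(sorted(counts.items()))

    return proportion_used, top_tiles, histogram


# Made-up usage for an 8-tile device with 1000 bytes per tile (assumed values).
usage = [900, 120, 450, 980, 130, 470, 1100, 125]
proportion, top_tiles, histogram = summarise_tile_memory(
    usage, tile_capacity=1000, bin_width=250
)
print(f"Proportion of memory used: {proportion:.1%}")
print("Top tiles (tile ID, bytes):", top_tiles)
print("Histogram (bin start -> tiles):", histogram)
```

In this made-up data the overall proportion looks comfortable, yet tile 6 uses more than the assumed per-tile capacity. This is the kind of imbalance that the chart of the highest-usage tiles and the histogram make visible.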
7.2. Vertex and exchange sizes
The Insights Report also shows the vertices and exchanges that require the most memory across the whole model, on each IPU, and on individual tiles.
Select the Model, IPU or Tile tab to see the memory usage for each.
Select the Vertex (state and code) radio button to display the name of the vertex with the largest state, the amount of memory it required, and a list of compute sets in which that vertex was used. Beneath that, the vertex with the largest code size is also shown, with the same information.
Select the User variable radio button to see a list of the always-live user variables at peak liveness. These variables require memory that is reserved for the entire application.
Select the Exchange radio button to display the name of the largest exchange, the amount of memory it required, and the names of the variables that were involved in the transfer.
Select the Program step radio button to view the step with the peak memory usage, as well as a list of the not-always-live variables at this step.
When viewing the IPU tab, you can select which IPU to view using the drop-down list on the left. Only vertices and exchanges involving that IPU are listed.
When viewing the Tile tab, you can select which of the five most memory-hungry tiles to view, or enter a specific tile ID in the search box. Only vertices and exchanges involving that tile are listed.
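The relationship between always-live variables, not-always-live variables and the program step with peak memory usage can be sketched as follows. This is a deliberately simplified liveness model written for illustration, not the analysis the tool performs: the function name `peak_liveness`, the variable names, the sizes and the step ranges are all hypothetical.

```python
def peak_liveness(variables, num_steps):
    """Find the program step with the highest live memory.

    Hypothetical helper for illustration only. Each variable is a dict with
    'name', 'bytes' and 'live': either None (always live) or an inclusive
    (first_step, last_step) range.
    """
    always_live = [v for v in variables if v["live"] is None]
    always_live_bytes = sum(v["bytes"] for v in always_live)

    peak_step, peak_bytes, peak_vars = None, -1, []
    for step in range(num_steps):
        # Not-always-live variables whose range covers this step.
        live_now = [
            v for v in variables
            if v["live"] is not None and v["live"][0] <= step <= v["live"][1]
        ]
        total = always_live_bytes + sum(v["bytes"] for v in live_now)
        if total > peak_bytes:
            peak_step, peak_bytes = step, total
            peak_vars = [v["name"] for v in live_now]

    return peak_step, peak_bytes, [v["name"] for v in always_live], peak_vars


# Made-up variables for a four-step program (assumed values).
example = [
    {"name": "weights", "bytes": 400, "live": None},
    {"name": "input", "bytes": 100, "live": (0, 1)},
    {"name": "activations", "bytes": 300, "live": (1, 2)},
    {"name": "gradients", "bytes": 250, "live": (2, 3)},
]
step, total, always, not_always = peak_liveness(example, num_steps=4)
print(f"Peak at step {step}: {total} bytes")
print("Always live:", always)
print("Not always live at peak:", not_always)
```

Here the always-live `weights` variable contributes to every step, while the peak falls on the step where `activations` and `gradients` overlap. Shortening such an overlap, or shrinking the variables involved, lowers the peak.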
7.3. Tips on reducing memory usage
The bottom section of the Insights Report displays a number of panels that recommend ways to reduce the memory usage of your model, which may help if you are out of memory. Where appropriate, each panel gives an estimate of how much memory the solution could save. Further information about these recommendations can be found in the Memory and Performance Optimisation Guide, on the Graphcore website.
These solutions may affect the performance, throughput, convergence and/or training characteristics of your model.
The following recommendations and solutions are included in the report:
Reducing the batch size, with links to a video for Evaluating batch sizes for IPUs and a Graphcore blog post on small batch sizes.
Using FP16 where appropriate. This panel shows how much memory could be saved by switching the five most memory-hungry tiles from FP32 to FP16. A list of potential candidate FP32 variables from the five worst tiles is displayed, showing how much memory could be saved by changing them to FP16.
Setting the available memory proportion for matmuls and convolutions.
Offloading variables to the host machine.
Reusing identical parts of your graph with outlining.
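As a rough guide to the FP16 estimate described above, an FP32 value occupies 4 bytes and an FP16 value occupies 2 bytes, so converting a variable saves about half of its size. The sketch below applies that arithmetic to a made-up list of variables. The function name `fp16_savings` and the variable data are hypothetical, and the calculation ignores alignment, padding and any change in code size, so the figures shown in the report may differ.

```python
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2}


def fp16_savings(variables):
    """Estimate bytes saved by converting each FP32 variable to FP16.

    Hypothetical helper for illustration only. `variables` is a list of
    (name, dtype, num_elements) tuples; variables that are not FP32 are
    skipped because they offer no saving under this simple model.
    """
    per_element_saving = BYTES_PER_ELEMENT["float32"] - BYTES_PER_ELEMENT["float16"]
    return {
        name: num_elements * per_element_saving
        for name, dtype, num_elements in variables
        if dtype == "float32"
    }


# Made-up variables on one tile (assumed values).
tile_variables = [
    ("weights", "float32", 1024),
    ("activations", "float16", 2048),
    ("bias", "float32", 16),
]
savings = fp16_savings(tile_variables)
for name, saved in savings.items():
    print(f"{name}: about {saved} bytes saved")
print(f"Total: about {sum(savings.values())} bytes saved")
```

As noted earlier in this section, a change of this kind may affect the convergence and training characteristics of your model, so treat the estimate as a starting point for experiments.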