V-IPU User Guide
Version: 1.15.1
1. Introduction
1.1. Scope of the document
1.2. Structure of the document
2. Concepts and architecture
2.1. Architecture
3. Getting started
3.1. Installing the V-IPU client
3.2. V-IPU configuration
3.3. Creating partitions
4. Partitions
4.1. Overview
4.2. Creating a reconfigurable partition
4.2.1. Limitations
4.3. Creating a preconfigured partition
4.4. IPU selection
4.5. Displaying partition information
4.5.1. Partition state
4.5.2. Provisioning state
4.6. IPUoF environment variables
4.7. IPUoF configuration files
4.7.1. Creating an IPUoF configuration file
4.7.2. Sharing access to devices
4.8. Removing a partition
4.9. Routing configuration
4.9.1. Intra-GCD sync configuration
4.10. Multi-GCD partitions
4.10.1. Multi-host IPUoF configuration
Multi-host IPUoF configuration files
4.10.2. Intra-GCD sync configuration
4.10.3. Inter-GCD sync configuration
4.10.4. Crossing IPU-Link domains
5. Integration with Slurm
5.1. Configuring Slurm to use the V-IPU select plugin
5.2. Configuration parameters
5.3. The V-IPU job submission plugin
5.4. An example Slurm Controller configuration
5.5. Job submission and parameters
5.5.1. Job script examples
6. Integration with Kubernetes
6.1. Components and design
6.2. Package contents
6.3. Deploying the software
6.3.1. Prerequisites
6.3.2. Installation
Installing the CRDs
Installing the Operator
Multiple V-IPU controller support
Verify the installation is successful
6.3.3. Uninstall
6.3.4. Upgrading the Helm Chart
6.4. Configurations
6.5. Creating an IPUJob
6.5.1. Interactive mode
6.5.2. Mounting data volumes for an IPUJob
6.5.3. Automatic restarts
6.5.4. Clean up Kubernetes resources and IPU partitions
6.6. Debugging problems
6.6.1. How does the IPU Operator work?
6.6.2. Debugging
6.7. IPU usage statistics
6.8. Operator Metrics
6.9. Known limitations
7. Command line reference
7.1. Resource types
7.2. Global options
7.2.1. Using a configuration file
7.2.2. Using environment variables
7.3. Create partition
7.3.1. Syntax
7.3.2. Examples
7.3.3. Options
7.4. Get partition
7.4.1. Syntax
7.4.2. Examples
7.4.3. Options
7.5. List allocation
7.5.1. Syntax
7.5.2. Examples
7.5.3. Options
7.6. List IPUs
7.6.1. Supported options
7.7. List partition
7.7.1. Syntax
7.7.2. Examples
7.7.3. Options
7.8. Remove a partition
7.8.1. Syntax
7.8.2. Examples
7.8.3. Options
7.9. Reset partition
7.9.1. Syntax
7.9.2. Examples
7.9.3. Options
7.10. Bash completion
7.10.1. Syntax
7.10.2. Examples
7.10.3. Options
7.11. Zsh completion
7.11.1. Syntax
7.11.2. Examples
7.11.3. Options
8. Revision history
9. Trademarks & copyright