PopDist and PopRun: User Guide
Version: latest
1. Introduction
1.1. Installation
1.2. Validating the installation
1.3. Replicas and instances
2. Launching applications with PopRun
2.1. Launch modes
2.2. Multi-host setup
2.2.1. SSH setup
Install and configure OpenSSH on all hosts
Create SSH key pair
Authorise key pair on all hosts
Verify SSH setup
2.2.2. PopRun SSH Distribution tool
2.2.3. Network interfaces
2.2.4. File system setup
2.3. Caching compiled executables
2.4. Application launches
2.4.1. Single instance
2.4.2. Multi instance / Single host
2.4.3. Multi instance / Multi host
2.5. Process placement and non-uniform memory access (NUMA)
2.6. Offline compilation
2.7. Using with PopVision Graph Analyser
2.7.1. POPLAR_ENGINE_OPTIONS and PopVision Graph Analyser
2.7.2. Launching without POPLAR_ENGINE_OPTIONS
2.7.3. Launching with POPLAR_ENGINE_OPTIONS
2.8. V-IPU settings
2.8.1. V-IPU server address
2.8.2. PopRun and IPU over Fabric settings
2.8.3. V-IPU partition, cluster, and allocation
2.9. Storing and loading command line arguments
2.10. Troubleshooting and solution to common problems
2.10.1. PopRun cannot find or create partition
2.10.2. Program cannot acquire devices
3. PopRun features
3.1. Usage
3.1.1. Generic options
3.1.2. Debug options
3.1.3. Configuration
3.2. Tips and tricks
3.2.1. Passing environment variables
3.2.2. Character escaping
3.2.3. File output forwarding
3.2.4. Topology table
3.2.5. Generating auto-completion file
4. Poplar distributed configuration library (PopDist)
4.1. Horovod Installation
4.1.1. TensorFlow 1 and TensorFlow 2
4.1.2. PyTorch
4.1.3. PopART
4.2. PopDist Collectives
4.2.1. Running code on a subset of instances
4.3. PopDist API
4.4. PopDist examples
4.4.1. PyTorch
4.4.2. TensorFlow 1
4.4.3. TensorFlow 2
4.5. Conclusion
5. PopDist C++ API reference
6. PopDist Python API reference
6.1. PopART
6.2. PopTorch
6.3. TensorFlow 1 and 2
7. PopRun changelog
7.1. v2.3 (Poplar SDK 2.3)
7.1.1. New features
7.2. v2.2 (Poplar SDK 2.2)
7.2.1. New features
7.3. v2.1 (Poplar SDK 2.1)
7.3.1. New features
7.4. v2.0 (Poplar SDK 2.0)
7.4.1. New features
7.5. v1.0 (Poplar SDK 1.4)
7.5.1. New features
8. PopDist changelog
8.1. v2.6 (Poplar SDK 2.6)
8.2. v2.3 (Poplar SDK 2.3)
8.3. v2.2 (Poplar SDK 2.2)
8.3.1. New features
8.4. v2.1 (Poplar SDK 2.1)
8.4.1. New features
8.5. v2.0 (Poplar SDK 2.0)
8.5.1. New features
8.6. v1.0 (Poplar SDK 1.4)
8.6.1. New features
9. Known issues
9.1. Race condition with multiple users of the same partition
10. Index
11. Trademarks & copyright
Index
C | D | E | F | G | I | M | O | P | R | S | W
C
checkNumIpusPerReplica (C++ function)
checkNumIpusPerReplica() (in module popdist)
collectives (C++ type)
collectives::allGather (C++ function)
collectives::allReduceSum (C++ function)
collectives::broadcast (C++ function)
configureSessionOptions() (in module popdist.popart)
createGraph (C++ function)
D
deallocateSharedBuffers (C++ function)
defaultCommunicatorId (C++ function)
E
execute_on_instances() (in module popdist)
F
finalizeBackend (C++ function)
G
getDevice (C++ function)
getDevice() (in module popdist.popart)
getDeviceId (C++ function)
getDeviceId() (in module popdist)
getInstanceIndex (C++ function)
getInstanceIndex() (in module popdist)
getLocalInstanceIndex (C++ function)
getLocalInstanceIndex() (in module popdist)
getNumInstances (C++ function)
getNumInstances() (in module popdist)
getNumIpusPerReplica (C++ function)
getNumIpusPerReplica() (in module popdist)
getNumLocalReplicas (C++ function)
getNumLocalReplicas() (in module popdist)
getNumTotalReplicas (C++ function)
getNumTotalReplicas() (in module popdist)
getReplicaIndexOffset (C++ function)
getReplicaIndexOffset() (in module popdist)
I
init() (in module popdist)
initializeBackend (C++ function)
initializeSharedBuffers (C++ function)
isBackendInitialized (C++ function)
isBackendInitialized() (in module popdist)
isPopdistEnvSet() (in module popdist)
isUniformReplicasPerInstance (C++ function)
isUniformReplicasPerInstance() (in module popdist)
M
module
popdist
popdist.popart
popdist.poptorch
popdist.tensorflow
O
Options (class in popdist.poptorch)
P
popdist
module
popdist (C++ type), [1], [2], [3]
popdist.popart
module
popdist.poptorch
module
popdist.tensorflow
module
prepareDevice (C++ function)
prepareParentDevice (C++ function)
R
readSharedBuffer (C++ function)
registerBackend (C++ function)
registerCommunicator (C++ function)
registerCommunicator() (in module popdist)
registerDefaultBackend (C++ function)
run (C++ function)
S
set_ipu_config() (in module popdist.tensorflow)
setEngineOptions (C++ function)
synchronize (C++ function)
synchronize() (in module popdist)
W
writeToSharedBuffer (C++ function)