11. Glossary
11.1. Sample
The smallest division of a data set.
11.2. Micro-batch size
The number of samples processed in a single execution of a graph on a single device.
Also referred to as the machine batch size.
The micro-batch shape, or the shape of input data as defined in the ONNX model, is therefore [micro_batch_size, *sample_shape].
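As a minimal sketch of how the micro-batch shape is formed, the snippet below prepends the micro-batch size to a sample shape. The values 4 and (3, 32, 32) are hypothetical and chosen only for illustration.

```python
# Hypothetical values, for illustration only.
micro_batch_size = 4
sample_shape = (3, 32, 32)  # for example, one 3-channel 32x32 image

# The micro-batch shape is the sample shape with the micro-batch size prepended.
micro_batch_shape = [micro_batch_size, *sample_shape]
print(micro_batch_shape)
```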
11.3. Replication factor
The number of graphs to be run in parallel over multiple devices. The weight gradients from each device will be accumulated before a weight update. Also referred to as “device replication factor” or “spatial replication factor”. This is sometimes called data-parallel execution.
11.4. Accumulation factor
The weight gradients will be accumulated over this number of micro-batches in series before a weight update. Also referred to as “temporal replication factor”.
Accumulation can be thought of as doing replication on a single device.
11.5. Batch size
This is defined as micro-batch size * replication factor * accumulation factor.
This is the number of samples per weight update.
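The definition can be checked with a small calculation. The helper name batch_size and the argument values are assumptions made for this sketch, not part of any API.

```python
def batch_size(micro_batch_size: int,
               replication_factor: int,
               accumulation_factor: int) -> int:
    """Return the number of samples that contribute to one weight update."""
    return micro_batch_size * replication_factor * accumulation_factor


# Hypothetical configuration: 4 samples per micro-batch, 2 replicas,
# gradients accumulated over 8 micro-batches.
samples_per_update = batch_size(4, 2, 8)
print(samples_per_update)  # 64
```

With replication and accumulation both set to 1, the batch size reduces to the micro-batch size.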
11.6. Batches per step
The number of batches to run in a single call to Session::run.
11.7. Step size
This is defined as batch size * batches per step.
This is the number of samples per step.
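Continuing in the same spirit, a short sketch of the step size calculation follows. All values are hypothetical.

```python
# Hypothetical configuration, for illustration only.
micro_batch_size = 4
replication_factor = 2
accumulation_factor = 8
batches_per_step = 10

# Samples per weight update.
batch_size = micro_batch_size * replication_factor * accumulation_factor

# Samples consumed by one step, that is, one call that runs batches_per_step batches.
step_size = batch_size * batches_per_step
print(step_size)  # 640
```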
11.8. Input data shape
Inputs to a session.run() call are read in with the assumption that data is arranged in the shape:
[batches_per_step, accl_factor, repl_factor, micro_batch_size, *sample_shape]
However, there is no constraint on the shape of the input array, except that it must have the correct number of elements.
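The following NumPy sketch illustrates this layout. It builds a flat array with the correct number of elements and views it in the assumed shape. The sizes are hypothetical, and treating the flat data as row-major (C order) is an assumption of this sketch rather than something stated above. No session is created here; only the array arrangement is shown.

```python
import numpy as np

# Hypothetical sizes, for illustration only.
batches_per_step = 2
accl_factor = 3
repl_factor = 2
micro_batch_size = 4
sample_shape = (5,)

expected_shape = (batches_per_step, accl_factor, repl_factor,
                  micro_batch_size, *sample_shape)
num_elements = int(np.prod(expected_shape))  # 2 * 3 * 2 * 4 * 5 = 240

# A flat array is acceptable as long as the element count matches.
flat = np.arange(num_elements, dtype=np.float32)

# View the same data in the assumed arrangement (row-major order assumed).
arranged = flat.reshape(expected_shape)

# Micro-batch for batch 0, accumulation index 2, replica 1.
micro_batch = arranged[0, 2, 1]
print(micro_batch.shape)  # (4, 5)
```

An array whose element count differs from num_elements cannot be reshaped this way; NumPy raises a ValueError in that case.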