6. Tensors

The concept of a tensor was introduced in Section 3.3, Tensors. You define a tensor with a shape, a data type and optional initialisation data. A tensor has zero or more consumer operations and at most one producer operation.

There are three types of tensors in PopXL:

  • Constant

  • Variable

  • Intermediate

An intermediate tensor is the output of an operation. Variable tensors and constant tensors are initialised with data. For instance, in Listing 6.1, a is a variable tensor, b is a constant tensor, and o is an intermediate tensor.

Listing 6.1 Example of tensor addition
with main:
    a = popxl.variable(3, dtype=popxl.int8, name="variable_a")
    b = popxl.constant(1, dtype=popxl.int8, name="constant_b")

    # addition
    o = a + b

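The producer and consumer relationships described above can be sketched with a small pure-Python model. This is only an illustration and does not use PopXL: the Op and Tensor classes, the mutable flag and the add() helper are hypothetical names chosen for this sketch, and the classification rule (a tensor with a producer is intermediate, otherwise it is a variable or a constant) is a simplification of the three tensor types.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Op:
    """Hypothetical operation node (not a PopXL class)."""
    name: str


@dataclass
class Tensor:
    """Toy tensor record (not the PopXL Tensor class)."""
    name: str
    has_init_data: bool = False    # variables and constants carry data
    mutable: bool = False          # in this sketch, only variables are mutable
    producer: Optional[Op] = None  # at most one producer operation
    consumers: List[Op] = field(default_factory=list)  # zero or more consumers

    def kind(self) -> str:
        # A tensor produced by an operation is intermediate.
        if self.producer is not None:
            return "intermediate"
        return "variable" if self.mutable else "constant"


def add(lhs: Tensor, rhs: Tensor, name: str) -> Tensor:
    """Record an add operation and return its output tensor."""
    op = Op("add")
    lhs.consumers.append(op)
    rhs.consumers.append(op)
    return Tensor(name, producer=op)


a = Tensor("variable_a", has_init_data=True, mutable=True)
b = Tensor("constant_b", has_init_data=True)
o = add(a, b, "o")
kinds = [t.kind() for t in (a, b, o)]
```

In this model, a and b each gain one consumer (the add operation) and have no producer, while o has a producer and, so far, no consumers.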

6.1. Constant tensors

A constant tensor is initialised with data during graph creation with constant(). This tensor cannot change during the runtime of a model. You can also use Python numeric literals in PopXL. These literals are implicitly converted to constant tensors. For example:

b = popxl.constant(1, dtype=popxl.int8, name="constant_b")
o = a + b

can also be written as:

o = a + 1
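One way a library can support this kind of implicit conversion is through Python operator overloading. The following is a minimal sketch of the idea, not PopXL's implementation: the Tensor class, its _as_tensor() helper and the inputs attribute are hypothetical and exist only for this illustration.

```python
from numbers import Number


class Tensor:
    """Toy tensor that tracks a name and whether it is a constant."""

    def __init__(self, name, is_constant=False, value=None):
        self.name = name
        self.is_constant = is_constant
        self.value = value
        self.inputs = ()  # filled in for tensors produced by an operation

    @staticmethod
    def _as_tensor(x):
        # Pass tensors through; wrap plain Python numbers in a constant tensor.
        if isinstance(x, Tensor):
            return x
        if isinstance(x, Number):
            return Tensor(f"const_{x}", is_constant=True, value=x)
        raise TypeError(f"cannot convert {type(x).__name__} to a tensor")

    def __add__(self, other):
        rhs = Tensor._as_tensor(other)
        # The result is the output of an operation and records its inputs.
        out = Tensor(f"add({self.name},{rhs.name})")
        out.inputs = (self, rhs)
        return out


a = Tensor("variable_a")
explicit = a + Tensor("constant_b", is_constant=True, value=1)
implicit = a + 1  # the literal 1 is wrapped in a constant tensor
```

In both cases the second input to the addition is a constant tensor holding the value 1; the only difference is who creates it.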

6.2. Variable tensors

Variable tensors are always live in IPU memory, and this memory is not freed during execution. A variable tensor is therefore used to represent trainable parameters in a model or non-trainable optimizer states.

You create and initialise variable tensors in the scope of the main graph. You can add a variable tensor to the main graph using variable().

To enable flexible interaction, you can read or write variable tensors on the IPU at runtime using the readWeights() and writeWeights() methods, respectively.

Note that, before running the graph, you must copy the initial value of each variable tensor from the host to the IPU using weightsFromHost().

If your graph has a replication factor, you can load different instances of the variable on different replicas by using replica grouping. For details, see Section 14.2, Replica grouping.

6.3. Intermediate tensors

An intermediate tensor is produced by an operation, which means it is not initialised with data. It stays live in IPU memory from the point at which it is produced until the last time it is consumed.
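The live range of an intermediate tensor can be illustrated with a short pure-Python sketch. It assumes a fixed, linear execution order and a hypothetical schedule format of (output name, input names) pairs; the actual liveness on the IPU depends on how the graph is scheduled and compiled, so treat this only as a model of the rule stated above.

```python
def live_ranges(schedule):
    """Map each produced tensor to (produced_at, last_consumed_at).

    `schedule` is a list of (output_name, input_names) pairs in execution
    order. A tensor that is never consumed is live only at the step that
    produces it.
    """
    ranges = {}
    for step, (output, inputs) in enumerate(schedule):
        for name in inputs:
            # Inputs without a producer (variables, constants) are skipped.
            if name in ranges:
                produced_at, _ = ranges[name]
                ranges[name] = (produced_at, step)
        ranges[output] = (step, step)
    return ranges


# Hypothetical four-step schedule; a and b stand for variable or constant tensors.
schedule = [
    ("t0", ["a", "b"]),    # step 0: t0 is produced from a and b
    ("t1", ["t0", "b"]),   # step 1: t1 consumes t0
    ("t2", ["t1", "t0"]),  # step 2: t2 consumes t1 and t0 (last use of both)
    ("t3", ["t2"]),        # step 3: t3 consumes t2
]
ranges = live_ranges(schedule)
```

Here t0 is live from step 0 to step 2, because step 2 is the last time it is consumed, whereas a and b do not appear in the result because no operation produces them.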