18. IPU TensorFlow Addons

18.1. Introduction

IPU TensorFlow Addons is a collection of addons created for IPU TensorFlow. These include layers and optimizers for Keras, as well as legacy TensorFlow layers and optimizers.

18.2. Keras layers

The ipu_tensorflow_addons.keras.layers namespace contains Keras layers optimised for running on IPUs. These layers can be used the same way as standard Keras layers.

Some IPU-specific versions of standard Keras layers are included. Swapping out standard Keras layers for their IPU-specific counterparts can improve your model's performance when using IPUs. The following example demonstrates this, using the IPU-specific Embedding and LSTM layers in a pipelined Keras model.

import argparse
import tensorflow as tf

from tensorflow.python import ipu
from tensorflow.python import keras
from tensorflow.python.keras import layers
from tensorflow.python.keras.datasets import imdb
from tensorflow.python.keras.preprocessing import sequence
from tensorflow.python.keras.optimizer_v2.adam import Adam

from ipu_tensorflow_addons.keras import layers as ipu_layers

max_features = 20000


# Define the dataset
def get_dataset():
  (x_train, y_train), (_, _) = imdb.load_data(num_words=max_features)

  x_train = sequence.pad_sequences(x_train, maxlen=80)

  ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
  ds = ds.repeat()
  ds = ds.map(lambda x, y: (x, tf.cast(y, tf.int32)))
  ds = ds.batch(32, drop_remainder=True)
  return ds


# Define the model
def get_model():
  input_layer = layers.Input(shape=(80,), dtype=tf.int32, batch_size=32)

  with ipu.keras.PipelineStage(0):
    x = ipu_layers.Embedding(max_features, 64)(input_layer)
    x = ipu_layers.LSTM(64, dropout=0.2)(x)

  with ipu.keras.PipelineStage(1):
    a = layers.Dense(8, activation='relu')(x)

  with ipu.keras.PipelineStage(2):
    b = layers.Dense(8, activation='relu')(x)

  with ipu.keras.PipelineStage(3):
    x = layers.Concatenate()([a, b])
    x = layers.Dense(1, activation='sigmoid')(x)

  return keras.Model(input_layer, x)


#
# Main code
#

# Parse command line args
parser = argparse.ArgumentParser("Config Parser", add_help=False)
parser.add_argument('--steps-per-epoch',
                    type=int,
                    default=768,
                    help="Number of steps in each epoch.")
parser.add_argument('--epochs',
                    type=int,
                    default=3,
                    help="Number of epochs to run.")
args = parser.parse_args()

# Configure IPUs
cfg = ipu.config.IPUConfig()
cfg.auto_select_ipus = 2
cfg.configure_ipu_system()

# Set up IPU strategy
strategy = ipu.ipu_strategy.IPUStrategyV1()
with strategy.scope():

  model = get_model()
  model.set_pipelining_options(gradient_accumulation_steps_per_replica=8,
                               device_mapping=[0, 1, 1, 0])
  model.compile(loss='binary_crossentropy',
                optimizer=Adam(0.005),
                steps_per_execution=16)

  model.fit(get_dataset(),
            steps_per_epoch=args.steps_per_epoch,
            epochs=args.epochs)

18.3. Optimizers

The optimizers contained in IPU TensorFlow Addons are drop-in replacements for TensorFlow optimizers. They are functionally the same, but have a number of additional features, which can be used via the optimizer's keyword arguments.

The precision of each optimizer state can be set independently of the other states and of the model parameters. This is particularly useful when training in mixed precision.
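
As a sketch of how this looks in practice (assuming the AdamIpuOptimizer class from ipu_tensorflow_addons.keras.optimizers and its optimizer_compute_precisions, m_dtype and v_dtype keyword arguments), the optimizer states can be kept in float32 even when the model parameters are stored in float16:

```python
import tensorflow as tf
from ipu_tensorflow_addons.keras import optimizers as ipu_optimizers

# Model parameters may be float16, while the Adam moment estimates
# (m and v) and the update computation itself stay in float32.
optimizer = ipu_optimizers.AdamIpuOptimizer(
    learning_rate=0.005,
    optimizer_compute_precisions=(tf.float32,),  # precision of the update computation
    m_dtype=tf.float32,  # first-moment state precision
    v_dtype=tf.float32)  # second-moment state precision
```

The resulting optimizer can then be passed to model.compile() in the usual way.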

The optimizer update step can be outlined, making the update code reusable across the variables it is applied to. This can reduce memory use at the expense of passing variables into and out of the outlined code.
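
A minimal sketch of enabling this, assuming the outline_apply_gradients keyword argument accepted by the IPU TensorFlow Addons optimizers:

```python
import tensorflow as tf
from ipu_tensorflow_addons.keras import optimizers as ipu_optimizers

# Outline the apply-gradients step so that its code is emitted once
# and reused, trading some variable-passing overhead for memory.
optimizer = ipu_optimizers.AdamIpuOptimizer(
    learning_rate=0.005,
    outline_apply_gradients=True)
```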