23. IPU TensorFlow Addons Python API
23.1. TensorFlow layers
23.1.1. TensorFlow layers made for IPU TensorFlow
- class ipu_tensorflow_addons.layers.PopnnAUGRU(num_units, dtype=tf.float32, partials_dtype=tf.float32, seed=None, weights_initializer=None, bias_initializer=None, activation='tanh', recurrent_activation='sigmoid', return_state=True, name=None, reset_after=False, options=None, options_bwd=None)
XLA compatible, time-major Popnn implementation of an AUGRU layer.
Below is a typical workflow:
    with tf.Graph().as_default():
        augru = PopnnAUGRU(num_units, ...)
        outputs, output_state = augru(inputs, seq_len, attention_score, initial_state, training=True)
- __init__(num_units, dtype=tf.float32, partials_dtype=tf.float32, seed=None, weights_initializer=None, bias_initializer=None, activation='tanh', recurrent_activation='sigmoid', return_state=True, name=None, reset_after=False, options=None, options_bwd=None)
Creates a PopnnAUGRU model from model spec.
- Parameters
num_units – the number of units within the RNN model.
dtype – tf.float16 or tf.float32
partials_dtype – the type used by Popnn to perform partial calculations. Either tf.float16 or tf.float32.
seed – A Python integer. Used to create the default Glorot uniform initializer weights_initializer.
weights_initializer – starting value to initialize the weights (default is Glorot uniform initializer).
activation – Activation function. Defaults to “tanh”. Accepted values: “tanh”, “relu”, “softmax”, “sigmoid”, “hard_sigmoid”.
recurrent_activation – Recurrent activation function. Defaults to “sigmoid”. Must generate output in the [0,1] range. Accepted values: “tanh”, “softmax”, “sigmoid”, “hard_sigmoid”.
return_state – Boolean. Whether to return the last state in addition to the output. Default: True.
bias_initializer – starting value to initialize the bias (default is all zeros).
name – VariableScope for the created subgraph; defaults to class name. This only serves as the default scope if no scope is specified later when invoking __call__().
options – A Python dictionary. Implementation or debug options for the forward GRU cell in PopLibs. See the GRU documentation in the PopLibs API reference for the full list of options.
options_bwd – A Python dictionary. Implementation or debug options for the backward GRU cell in PopLibs. See the GRU documentation in the PopLibs API reference for the full list of options.
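To make the seed and weights_initializer parameters concrete, the following is a minimal sketch of Glorot uniform initialization in plain Python. It is not the layer's own code: the helper name glorot_uniform and the fan_in/fan_out arguments are assumptions made for illustration. It only shows that values are drawn uniformly from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), and that a fixed seed makes the draw reproducible.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, seed=None):
    """Return a fan_in x fan_out matrix drawn from U(-limit, limit).

    Hypothetical helper for illustration; not part of ipu_tensorflow_addons.
    """
    # Glorot (Xavier) uniform limit.
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    # A private generator so that the same seed always gives the same draw.
    rng = random.Random(seed)
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


weights_a = glorot_uniform(fan_in=4, fan_out=8, seed=42)
weights_b = glorot_uniform(fan_in=4, fan_out=8, seed=42)
```

With the same seed, weights_a and weights_b are identical, which is the reproducibility the seed parameter is intended to provide for the default initializer.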
- call(inputs, seq_len, attention_score, initial_state=None, training=True, time_major=True)
Runs the forward step for the AUGRU model.
- Parameters
inputs – 3-D tensor with shape [time_len, batch_size, input_size].
seq_len – 1-D tensor with the sequence length of samples in each batch.
attention_score – The output of the attention layer: the score of samples in each batch, shaped [batch_size, max_seq_len].
initial_state – Initial state tensor, shaped [batch_size, num_units]. If not provided, the state is initialized to zeros.
training – whether this operation will be used in training or inference.
time_major – whether the time dimension is the first dimension.
- Returns
A tuple of output and output state.
output: a tensor of shape [time_len, batch_size, num_units].
output_state: The output state of the last cell.
- Raises
ValueError – if initial_state is not valid.
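As a rough illustration of how an attention score can modulate a GRU state update, here is a sketch of a single AUGRU step for one sample. It follows the commonly published AUGRU formulation (the attention score scales the update gate), which is an assumption here: the gate convention used internally by Popnn may differ. The function name augru_state_update and all values are hypothetical.

```python
def augru_state_update(h_prev, h_candidate, update_gate, attention):
    """One AUGRU state update for a single sample and time step.

    Assumed formulation: u' = attention * u, h = (1 - u') * h_prev + u' * h_cand.
    """
    # Scale the update gate by the scalar attention score for this step.
    scaled_gate = [attention * u for u in update_gate]
    # Blend the previous state and the candidate state with the scaled gate.
    return [(1.0 - g) * hp + g * hc
            for g, hp, hc in zip(scaled_gate, h_prev, h_candidate)]


h_prev = [0.2, -0.4]
h_candidate = [1.0, 0.5]
update_gate = [0.5, 0.5]

# An attention score of 0 leaves the state unchanged for this step.
ignored = augru_state_update(h_prev, h_candidate, update_gate, attention=0.0)
# An attention score of 1 reduces to the ordinary gated update.
attended = augru_state_update(h_prev, h_candidate, update_gate, attention=1.0)
```

Under this formulation, time steps with a low attention score contribute little to the final state, which is why attention_score carries one value per sample and time step.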
- class ipu_tensorflow_addons.layers.PopnnDynamicGRU(num_units, dtype=tf.float32, partials_dtype=tf.float32, seed=None, weights_initializer=None, bias_initializer=None, activation='tanh', recurrent_activation='sigmoid', return_state=True, name=None, reset_after=False, options=None, options_bwd=None)
XLA compatible, time-major Popnn implementation of a GRU layer, with a sequence length input.
Below is a typical workflow:
    with tf.Graph().as_default():
        gru = PopnnDynamicGRU(num_units, ...)
        outputs, output_state = gru(inputs, seq_len, initial_state, training=True)
- __init__(num_units, dtype=tf.float32, partials_dtype=tf.float32, seed=None, weights_initializer=None, bias_initializer=None, activation='tanh', recurrent_activation='sigmoid', return_state=True, name=None, reset_after=False, options=None, options_bwd=None)
Creates a PopnnDynamicGRU model from model spec.
- Parameters
num_units – the number of units within the RNN model.
dtype – tf.float16 or tf.float32
partials_dtype – the type used by Popnn to perform partial calculations. Either tf.float16 or tf.float32.
seed – A Python integer. Used to create the default Glorot uniform initializer weights_initializer.
weights_initializer – starting value to initialize the weights (default is Glorot uniform initializer).
bias_initializer – starting value to initialize the bias (default is all zeros).
activation – Activation function. Defaults to “tanh”. Accepted values: “tanh”, “relu”, “softmax”, “sigmoid”, “hard_sigmoid”.
recurrent_activation – Recurrent activation function. Defaults to “sigmoid”. Must generate output in the [0,1] range. Accepted values: “tanh”, “softmax”, “sigmoid”, “hard_sigmoid”.
return_state – Boolean. Whether to return the last state in addition to the output. Default: True.
name – VariableScope for the created subgraph; defaults to class name. This only serves as the default scope if no scope is specified later when invoking __call__().
reset_after – GRU convention (whether to apply reset gate after or before matrix multiplication). False = “before” (default), True = “after”. Leave as default (False) to match the behaviour of the standard TensorFlow GRU.
options – A Python dictionary. Implementation or debug options for the forward GRU cell in PopLibs. See the GRU documentation in the PopLibs API reference for the full list of options.
options_bwd – A Python dictionary. Implementation or debug options for the backward GRU cell in PopLibs. See the GRU documentation in the PopLibs API reference for the full list of options.
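The two reset_after conventions differ in where the reset gate is applied relative to the recurrent matrix multiplication. The sketch below computes only the recurrent term of the candidate state in both ways, using plain Python lists. The names U (recurrent weights), h (previous state) and r (reset gate) are assumptions for illustration, and bias handling is omitted.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def recurrent_term(U, h, r, reset_after):
    """Recurrent contribution to the GRU candidate state (illustrative only)."""
    if reset_after:
        # "after": multiply first, then apply the reset gate element-wise.
        return [ri * x for ri, x in zip(r, matvec(U, h))]
    # "before": apply the reset gate to the state, then multiply.
    return matvec(U, [ri * hi for ri, hi in zip(r, h)])


U = [[1.0, 2.0], [3.0, 4.0]]
h = [1.0, 1.0]
r = [0.5, 1.0]

before = recurrent_term(U, h, r, reset_after=False)  # [2.5, 5.5]
after = recurrent_term(U, h, r, reset_after=True)    # [1.5, 7.0]
```

Because the two orderings generally give different values, weights trained under one convention are not interchangeable with the other.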
- call(inputs, seq_len, initial_state=None, training=True, time_major=True)
Runs the forward step for the DynamicGRU model.
- Parameters
inputs – 3-D tensor with shape [time_len, batch_size, input_size] when time_major is True (the default), or [batch_size, time_len, input_size] otherwise.
seq_len – 1-D tensor with the sequence length of samples in each batch.
initial_state – Initial state tensor, shaped [batch_size, num_units]. If not provided, the state is initialized to zeros.
training – whether this operation will be used in training or inference.
time_major – whether the time dimension is the first dimension.
- Returns
A tuple of output and output state.
output: a tensor of shape [time_len, batch_size, num_units].
output_state: The output state of the last cell.
- Raises
ValueError – if initial_state is not valid.
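The seq_len argument gives the number of valid time steps for each sample in the batch. As a sketch of what that means for padded inputs, the hypothetical helper below builds a [batch_size, max_seq_len] mask marking real steps with 1.0 and padding with 0.0. How the layer itself treats steps beyond seq_len is not stated here, so this only illustrates which positions count as valid.

```python
def sequence_mask(seq_len, max_seq_len):
    """Return a [batch_size, max_seq_len] mask: 1.0 for real steps, 0.0 for padding.

    Hypothetical helper for illustration; not part of ipu_tensorflow_addons.
    """
    return [[1.0 if t < n else 0.0 for t in range(max_seq_len)]
            for n in seq_len]


seq_len = [3, 1, 2]  # one length per sample in the batch
mask = sequence_mask(seq_len, max_seq_len=3)
```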
- class ipu_tensorflow_addons.layers.PopnnDynamicLSTM(num_units, dtype=tf.float32, partials_dtype=tf.float32, seed=None, weights_initializer=None, bias_initializer=None, activation='tanh', recurrent_activation='sigmoid', return_state=True, name=None, options=None, options_bwd=None)
- call(inputs, seq_len, initial_state=None, training=True)
Runs the forward step for the LSTM model.
- Parameters
inputs – 3D tensor with shape [time_len, batch_size, input_size].
seq_len – 1-D tensor with the sequence length of samples in each batch.
initial_state – An LSTMStateTuple of state tensors, each shaped [batch_size, num_units]. If not provided, the state is initialized to zeros.
training – Set to False to use the LSTM model in inference mode.
- Returns
A tuple of output and output state.
output: a tensor of shape [time_len, batch_size, num_units].
output_state: An LSTMStateTuple of the same shape and structure as initial_state.
- Raises
ValueError – if initial_state is not valid.
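To illustrate the structure expected for initial_state, the sketch below builds an all-zeros LSTM state using a stand-in named tuple with the same field names (c, h) as TensorFlow's LSTMStateTuple. The stand-in type and the helper zero_lstm_state are assumptions for illustration; real code would pass tensors, not nested lists.

```python
from collections import namedtuple

# Stand-in with the same field names (c, h) as TensorFlow's LSTMStateTuple.
LSTMStateTuple = namedtuple("LSTMStateTuple", ["c", "h"])


def zero_lstm_state(batch_size, num_units):
    """Build a zero state: two [batch_size, num_units] arrays (illustrative)."""
    def zeros():
        return [[0.0] * num_units for _ in range(batch_size)]
    # The cell state (c) and hidden state (h) are separate arrays.
    return LSTMStateTuple(c=zeros(), h=zeros())


state = zero_lstm_state(batch_size=2, num_units=3)
```

Each element has shape [batch_size, num_units], matching the shape documented for the state tensors.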
- class ipu_tensorflow_addons.layers.PopnnGRU(num_units, dtype=tf.float32, partials_dtype=tf.float32, seed=None, weights_initializer=None, bias_initializer=None, activation='tanh', recurrent_activation='sigmoid', return_state=True, name=None, reset_after=False, options=None, options_bwd=None)
XLA compatible, time-major Popnn implementation of a GRU layer.
Below is a typical workflow:
    with tf.Graph().as_default():
        gru = PopnnGRU(num_units, ...)
        outputs, output_state = gru(inputs, initial_state, training=True)
- __init__(num_units, dtype=tf.float32, partials_dtype=tf.float32, seed=None, weights_initializer=None, bias_initializer=None, activation='tanh', recurrent_activation='sigmoid', return_state=True, name=None, reset_after=False, options=None, options_bwd=None)
Creates a PopnnGRU model from model spec.
- Parameters
num_units – the number of units within the GRU model.
dtype – tf.float16 or tf.float32
partials_dtype – the type used by Popnn to perform partial calculations. Either tf.float16 or tf.float32.
seed – A Python integer. Used to create the default Glorot uniform initializer weights_initializer.
weights_initializer – starting value to initialize the weights (default is Glorot uniform initializer).
bias_initializer – starting value to initialize the bias (default is all zeros).
activation – Activation function. Defaults to “tanh”. Accepted values: “tanh”, “relu”, “softmax”, “sigmoid”, “hard_sigmoid”.
recurrent_activation – Recurrent activation function. Defaults to “sigmoid”. Must generate output in the [0,1] range. Accepted values: “tanh”, “softmax”, “sigmoid”, “hard_sigmoid”.
return_state – Boolean. Whether to return the last state in addition to the output. Default: True.
name – VariableScope for the created subgraph; defaults to class name. This only serves as the default scope if no scope is specified later when invoking __call__().
reset_after – GRU convention (whether to apply reset gate after or before matrix multiplication). False = “before” (default), True = “after”. Leave as default (False) to match the behaviour of the standard TensorFlow GRU.
options – A Python dictionary. Implementation or debug options for the forward GRU cell in PopLibs. See the GRU documentation in the PopLibs API reference for the full list of options.
options_bwd – A Python dictionary. Implementation or debug options for the backward GRU cell in PopLibs. See the GRU documentation in the PopLibs API reference for the full list of options.
- build(input_shape)
Create variables of the PopnnGRU.
It can be called manually before __call__() or automatically through __call__(). In the former case, any subsequent __call__() will skip creating variables.
- Parameters
input_shape – a TensorShape object with 3 dimensions.
- Raises
ValueError – if input_shape has wrong dimension or unknown 3rd dimension.
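The build-once behaviour described above can be sketched with a small stand-in class. LazyBuildLayer is hypothetical and does not create real variables; it only mirrors the documented contract: build() validates a 3-dimensional shape with a known last dimension, and a later call skips building if build() has already run.

```python
class LazyBuildLayer:
    """Hypothetical stand-in illustrating the build-once pattern."""

    def __init__(self, num_units):
        self.num_units = num_units
        self.built = False
        self.build_count = 0

    def build(self, input_shape):
        # Expect [time_len, batch_size, input_size].
        if len(input_shape) != 3:
            raise ValueError("input_shape must have 3 dimensions")
        if input_shape[2] is None:
            raise ValueError("the 3rd dimension (input_size) must be known")
        self.input_size = input_shape[2]
        self.build_count += 1
        self.built = True

    def __call__(self, input_shape):
        # Build automatically on first use; skip if already built.
        if not self.built:
            self.build(input_shape)
        return (input_shape[0], input_shape[1], self.num_units)


layer = LazyBuildLayer(num_units=16)
layer.build((10, 4, 8))           # manual build
output_shape = layer((10, 4, 8))  # does not build a second time
```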
- call(inputs, initial_state=None, training=True)
Runs the forward step for the GRU model.
- Parameters
inputs – 3D tensor with shape [time_len, batch_size, input_size].
initial_state – Initial state tensor, shaped [batch_size, num_units]. If not provided, the state is initialized to zeros.
training – Set to False to use the GRU model in inference mode.
- Returns
A tuple of output and output_state.
output: a tensor of shape [time_len, batch_size, num_units].
output_state: The output state of the last cell.
- Raises
ValueError – if initial_state is not valid.
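Because the layer is time-major, data stored as [batch_size, time_len, input_size] has to be rearranged to [time_len, batch_size, input_size] before it is passed in. The sketch below shows that rearrangement on nested Python lists; the helper name to_time_major is an assumption, and with real tensors a transpose of the first two axes would play the same role.

```python
def to_time_major(batch_major):
    """[batch_size, time_len, input_size] -> [time_len, batch_size, input_size]."""
    # zip(*...) groups together the t-th step of every sample.
    return [list(step) for step in zip(*batch_major)]


batch_major = [
    [[1, 2], [3, 4], [5, 6]],     # sample 0: 3 time steps, 2 features
    [[7, 8], [9, 10], [11, 12]],  # sample 1
]
time_major = to_time_major(batch_major)
```

After the rearrangement, time_major[t][b] holds the features of sample b at time step t.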
- state_shape(batch_size)
Shape of Popnn GRU state.
State shape is [batch_size, num_units].
- Parameters
batch_size – an int
- Returns
A Python array.
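As a small illustration of the documented return value, the sketch below reproduces the [batch_size, num_units] state shape and uses it to allocate a zero state. Both helpers are hypothetical stand-ins and do not call the library.

```python
def gru_state_shape(batch_size, num_units):
    """Mirror of the documented GRU state shape: [batch_size, num_units]."""
    return [batch_size, num_units]


def zeros(shape):
    """Allocate a 2-D list of zeros with the given [rows, cols] shape."""
    rows, cols = shape
    return [[0.0] * cols for _ in range(rows)]


shape = gru_state_shape(batch_size=4, num_units=8)
initial_state = zeros(shape)
```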
- class ipu_tensorflow_addons.layers.PopnnLSTM(num_units, dtype=tf.float32, partials_dtype=tf.float32, seed=None, weights_initializer=None, bias_initializer=None, activation='tanh', recurrent_activation='sigmoid', return_state=True, name=None, options=None, options_bwd=None)
XLA compatible, time-major Popnn implementation of an LSTM layer.
Below is a typical workflow:
    with tf.Graph().as_default():
        lstm = PopnnLSTM(num_units, ...)
        outputs, output_states = lstm(inputs, initial_states, training=True)
- __init__(num_units, dtype=tf.float32, partials_dtype=tf.float32, seed=None, weights_initializer=None, bias_initializer=None, activation='tanh', recurrent_activation='sigmoid', return_state=True, name=None, options=None, options_bwd=None)
Creates a PopnnLSTM model from model spec.
- Parameters
num_units – the number of units within the LSTM model.
dtype – tf.float16 or tf.float32
partials_dtype – the type used by Popnn to perform partial calculations. Either tf.float16 or tf.float32.
seed – A Python integer. Used to create the default Glorot uniform initializer weights_initializer.
weights_initializer – starting value to initialize the weights (default is Glorot uniform initializer).
bias_initializer – starting value to initialize the bias (default is all zeros).
activation – Activation function. Defaults to “tanh”. Accepted values: “tanh”, “relu”, “softmax”, “sigmoid”, “hard_sigmoid”.
recurrent_activation – Recurrent activation function. Defaults to “sigmoid”. Must generate output in the [0,1] range. Accepted values: “tanh”, “softmax”, “sigmoid”, “hard_sigmoid”.
return_state – Boolean. Whether to return the last state in addition to the output. Default: True.
name – VariableScope for the created subgraph; defaults to class name. This only serves as the default scope if no scope is specified later when invoking __call__().
options – A Python dictionary. Implementation or debug options for the forward LSTM cell in PopLibs. See the LSTM documentation in the PopLibs API reference for the full list of options.
options_bwd – A Python dictionary. Implementation or debug options for the backward LSTM cell in PopLibs. See the LSTM documentation in the PopLibs API reference for the full list of options.
- build(input_shape)
Create variables of the PopnnLSTM.
It can be called manually before __call__() or automatically through __call__(). In the former case, any subsequent __call__() will skip creating variables.
- Parameters
input_shape – a TensorShape object with 3 dimensions.
- Raises
ValueError – if input_shape has wrong dimension or unknown 3rd dimension.
- call(inputs, initial_state=None, training=True)
Runs the forward step for the LSTM model.
- Parameters
inputs – 3D tensor with shape [time_len, batch_size, input_size].
initial_state – An LSTMStateTuple of state tensors, each shaped [batch_size, num_units]. If not provided, the state is initialized to zeros.
training – Set to False to use the LSTM model in inference mode.
- Returns
A tuple of output and output state.
output: a tensor of shape [time_len, batch_size, num_units].
output_state: An LSTMStateTuple of the same shape and structure as initial_state.
- Raises
ValueError – if initial_state is not valid.
- state_shape(batch_size)
Shape of Popnn LSTM states.
Shape is a 2-element tuple. Each element is [batch_size, num_units].
- Parameters
batch_size – an int
- Returns
a tuple of Python arrays.
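The documented return value can be pictured with a short stand-in: a 2-element tuple in which each element is [batch_size, num_units]. The helper lstm_state_shape below is hypothetical and does not call the library; it only mirrors the described structure.

```python
def lstm_state_shape(batch_size, num_units):
    """Mirror of the documented LSTM state shape: two [batch_size, num_units] arrays."""
    # Two separate lists so that modifying one does not affect the other.
    return ([batch_size, num_units], [batch_size, num_units])


shapes = lstm_state_shape(batch_size=4, num_units=8)
first_shape, second_shape = shapes
```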