19.15. Ops available in PopXL
- class popxl.ops.CallSiteInfo(subgraph_op)
- Information relating to a parent graph calling a subgraph, for example using a call op or repeat op. - This is a convenience class for extracting information about the callsite and its subgraph. - Parameters
- subgraph_op (Union[CallOp, LoopOp]) – 
 - property called_graph: popxl.graph.Graph
- Get the called graph. 
 - graph_to_parent(graph_tensor)
- Get the tensor in the parent graph using the tensor in the called graph. - Both input and output tensors can be used. - Parameters
- graph_tensor (Tensor) – The tensor in the called graph. 
- Raises
- ValueError – If graph_tensor is not an input or output of the called graph.
- Returns
- The associated input or output tensor on the CallOp.
- Return type
- Tensor
 
 - graph_to_parent_input_index(idx)
- Get the parent graph input tensor index given the graph input tensor index. 
 - graph_to_parent_output_index(idx)
- Get the parent graph output tensor index given the graph output tensor index. 
 - property inputs: Tuple[popxl.tensor.Tensor, ...]
- Get the parent graph inputs. - Returns
- Tuple[Tensor, …] 
 
 - property outputs: Tuple[popxl.tensor.Tensor, ...]
- Get the parent graph outputs. - Returns
- Tuple[Tensor, …] 
 
 - parent_input(idx)
- Get the parent graph input tensor at a given index. 
 - parent_output(idx)
- Get the parent graph output tensor at a given index. 
 - parent_to_graph(parent_tensor)
- Get the input tensor in the called graph using the input tensor in the parent graph. - If parent_tensor has been used multiple times as an input, only the first instance is returned. - Parameters
- parent_tensor (Tensor) – The tensor from the parent graph. 
- Raises
- ValueError – If parent_tensor is not an input to the CallOp.
- Returns
- The tensor in the called_graph.
- Return type
- Tensor
 
 - parent_to_graph_input_index(idx)
- Get the graph input tensor index given the parent graph input tensor index. 
 - parent_to_graph_output_index(idx)
- Get the graph output tensor index given the parent graph output tensor index. 
 - set_parent_input_modified(parent_tensor, infer_modified_regions=False)
- Specify that the parent graph’s input tensor parent_tensor is modified by the call op. - This guarantees that any modification to the graph input during the execution of the called graph also changes parent_tensor.
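The lookups offered by CallSiteInfo can be pictured with a plain-Python model. This is only a sketch and does not use PopXL: tensors are represented by strings, the names parent_inputs and graph_inputs and both helper functions are hypothetical, and it assumes a call op whose parent inputs and graph inputs line up by position (outputs are left out for brevity).

```python
# Plain-Python model of the CallSiteInfo input lookups (not PopXL code).
# Assumption: graph input i receives parent input i.
parent_inputs = ["x", "w", "x"]   # the parent tensor "x" is passed twice
graph_inputs = ["a", "b", "c"]


def parent_to_graph(parent_tensor):
    """Return the graph input fed by parent_tensor (first use only)."""
    if parent_tensor not in parent_inputs:
        raise ValueError(f"{parent_tensor!r} is not an input to the call")
    # list.index returns the first match, mirroring "only the first instance".
    return graph_inputs[parent_inputs.index(parent_tensor)]


def graph_to_parent(graph_tensor):
    """Return the parent tensor that feeds the given graph input."""
    if graph_tensor not in graph_inputs:
        raise ValueError(f"{graph_tensor!r} is not an input of the called graph")
    return parent_inputs[graph_inputs.index(graph_tensor)]


print(parent_to_graph("x"))   # first use of "x" feeds graph input "a"
print(graph_to_parent("c"))   # graph input "c" is fed by the second use of "x"
```

The model shows why parent_to_graph is not a full inverse of graph_to_parent when a parent tensor is passed more than once.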
 
- popxl.ops.abs(t)
- Compute the absolute value of each element of the input tensor. - See also PyTorch Tensor.abs. 
- popxl.ops.add(lhs, rhs)
- Add two tensors elementwise. - Follows NumPy broadcasting rules. Arguments must have the same dtype. - See also PyTorch Tensor.add, NumPy add, ONNX Add. 
- popxl.ops.add_(lhs, rhs)
- Add two tensors elementwise in place, in the lhs tensor. Follows NumPy broadcasting rules. Arguments must have the same dtype. - Note: There is no operation that adds to the rhs tensor in place. Use add_(rhs, lhs) or rhs += lhs for the same functionality. - See also PyTorch Tensor.add_. 
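The NumPy broadcasting rules referred to by add and add_ can be sketched as a shape computation in plain Python. The helper broadcast_shape is hypothetical and is not part of PopXL; it only shows which pairs of shapes are compatible and what the result shape would be.

```python
from itertools import zip_longest


def broadcast_shape(lhs_shape, rhs_shape):
    """Return the NumPy-style broadcast shape of two shapes, or raise ValueError."""
    result = []
    # Align the shapes from the trailing dimension; missing dimensions count as 1.
    pairs = zip_longest(reversed(lhs_shape), reversed(rhs_shape), fillvalue=1)
    for lhs_dim, rhs_dim in pairs:
        if lhs_dim != rhs_dim and lhs_dim != 1 and rhs_dim != 1:
            raise ValueError(f"shapes {lhs_shape} and {rhs_shape} do not broadcast")
        # A dimension of size 1 stretches to match the other operand.
        result.append(rhs_dim if lhs_dim == 1 else lhs_dim)
    return tuple(reversed(result))


print(broadcast_shape((4, 3), (3,)))       # (4, 3)
print(broadcast_shape((2, 1, 5), (4, 1)))  # (2, 4, 5)
```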
- popxl.ops.argmax(t, dim=0, keepdim=False)
- Compute the argmax of a tensor. - Compute the indices of the maximum elements of the input tensor along the dimension dim. The resulting tensor has the same rank as the input if keepdim is True. If keepdim is False, the reduced dimension is pruned from the resulting tensor. - See also PyTorch Tensor.argmax, NumPy argmax, ONNX ArgMax. 
- popxl.ops.argmin(t, dim=0, keepdim=False)
- Compute the argmin of a tensor. - Compute the indices of the minimum elements of the input tensor along the dimension dim. The resulting tensor has the same rank as the input if keepdim is True. If keepdim is False, the reduced dimension is pruned from the resulting tensor. - See also PyTorch Tensor.argmin, NumPy argmin, ONNX ArgMin. 
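The effect of dim and keepdim on the result can be illustrated with a small plain-Python model for 2D data. The helper argmax_2d is hypothetical and only mimics the documented behaviour on nested lists; it is not how PopXL computes the result.

```python
def argmax_2d(rows, dim=0, keepdim=False):
    """Argmax of a 2D list of lists along dim (0 or 1)."""
    if dim == 0:
        # One index per column: the row that holds the maximum.
        columns = list(zip(*rows))
        result = [max(range(len(col)), key=col.__getitem__) for col in columns]
        # keepdim=True keeps the reduced dimension with size 1: shape (1, columns).
        return [result] if keepdim else result
    # dim == 1: one index per row: the column that holds the maximum.
    result = [max(range(len(row)), key=row.__getitem__) for row in rows]
    # keepdim=True keeps the reduced dimension with size 1: shape (rows, 1).
    return [[index] for index in result] if keepdim else result


data = [[1, 9, 3],
        [7, 2, 8]]
print(argmax_2d(data, dim=0))                # [1, 0, 1]
print(argmax_2d(data, dim=1, keepdim=True))  # [[1], [2]]
```

argmin follows the same pattern with min in place of max.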
- popxl.ops.average_pool(t, kernel_size, stride=None, padding=None, out_pads=None, dilation=None, in_dilations=None, auto_pad='not_set', ceil_mode=False)
- Average pool a tensor. - average_pool consumes an input tensor t and applies average pooling across the tensor according to the kernel sizes, stride sizes and pad lengths. Average pooling consists of computing the average of all values in each subset of the input tensor selected by the kernel, which downsamples the data into the output tensor. - Parameters
- t (Tensor) – Input data tensor from the previous layer.
 - If the input is a 3D tensor, the size is (N, C, L), where:
  - N is the batch size,
  - C is the number of channels,
  - L is the length.
 - If the input is a 2D image, the size is (N, C, H, W), where:
  - N is the batch size,
  - C is the number of channels,
  - H and W are the height and width.
 - If the input is a 3D image, the size is (N, C, D, H, W), where:
  - N is the batch size,
  - C is the number of channels,
  - D is the depth,
  - H and W are the height and width.
 
- kernel_size (Tuple[int]) – The size of the kernel along each axis. 
- stride (Tuple[int]) – Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis. 
- padding (Tuple[int]) – Padding for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and gives the number of pixels added at the beginning or end of the corresponding axis. The padding format is [x1_begin, x2_begin, …, x1_end, x2_end, …], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i.
- out_pads (Tuple[int]) – The output padding for pooling. 
- dilation (Tuple[int]) – Dilation value along each spatial axis of the filter. 
- in_dilations (Tuple[int]) – The input dilations attributes along each spatial axis of the filter. 
- auto_pad (Literal) – Must be one of “not_set”, “same_upper”, “same_lower” or “valid”. The default value is “not_set”, which means explicit padding is used. “same_upper” and “same_lower” pad the input so that output_shape[i] = ceil(input_shape[i] / strides[i]) for each axis i. The padding is split equally between the two sides when the total is even. When the total is odd, the extra padding is added at the end for “same_upper” and at the beginning for “same_lower”.
- ceil_mode (bool) – When True, will use ceil instead of floor to compute the output shape. 
 
- Returns
- Output data tensor from average pooling across the input tensor. The dimensions depend on the kernel, stride and pad sizes. The floor of each computed dimension is used unless ceil_mode is True. 
- Return type
- Tensor
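The way kernel_size, stride, padding and ceil_mode determine the output size can be sketched per spatial axis. The formula below follows the usual ONNX pooling convention without dilation; that PopXL uses exactly this arithmetic is an assumption, and both helper functions are hypothetical.

```python
import math


def split_padding(padding):
    """Split [x1_begin, x2_begin, ..., x1_end, x2_end, ...] into (begin, end) pairs."""
    half = len(padding) // 2
    return list(zip(padding[:half], padding[half:]))


def pool_output_size(in_size, kernel, stride=1, pad_begin=0, pad_end=0, ceil_mode=False):
    """Output length along one spatial axis of a pooling window (no dilation)."""
    span = in_size + pad_begin + pad_end - kernel
    rounding = math.ceil if ceil_mode else math.floor
    return rounding(span / stride) + 1


# A 2D padding list: axis 1 gets (1, 3), axis 2 gets (2, 4).
print(split_padding([1, 2, 3, 4]))                           # [(1, 3), (2, 4)]
# Length 7, kernel 2, stride 2: floor gives 3 outputs, ceil gives 4.
print(pool_output_size(7, kernel=2, stride=2))                 # 3
print(pool_output_size(7, kernel=2, stride=2, ceil_mode=True))  # 4
```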
 
- popxl.ops.batch_norm_inference(t, scale, bias, mean, var, epsilon=1e-05, momentum=0.9)
- Apply batch normalisation to a tensor in an inference setting. - For more details, refer to the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. - Parameters
- t (Tensor) – Tensor to be normalized. 
- scale (Tensor) – Tensor used to scale the result of normalisation. 
- bias (Tensor) – Tensor used to shift the result of normalisation. 
- mean (Tensor) – Mean estimate. 
- var (Tensor) – Variance estimate. 
- epsilon (float) – Small value added to the variance to avoid division by zero when the variance is zero. 
- momentum (float) – Coefficient for the exponential moving average (not used in inference). 
 
- Returns
- The batch normalised tensor. 
- Return type
- Tensor
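For a single element, the standard batch normalisation inference formula is scale * (t - mean) / sqrt(var + epsilon) + bias. The scalar helper below is a hypothetical sketch of that arithmetic, assuming PopXL follows the standard definition. It leaves out the per-channel broadcasting that the real op performs.

```python
import math


def batch_norm_inference_scalar(t, scale, bias, mean, var, epsilon=1e-05):
    """Normalise one value with fixed mean and variance estimates."""
    # epsilon keeps the denominator non-zero when var is zero.
    return scale * (t - mean) / math.sqrt(var + epsilon) + bias


# (3 - 1) / sqrt(4) = 1, then scaled by 2 and shifted by 0.5.
print(batch_norm_inference_scalar(3.0, scale=2.0, bias=0.5, mean=1.0, var=4.0, epsilon=0.0))  # 2.5
```

momentum does not appear in the formula, which is consistent with the note that it is not used in inference.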
 
- popxl.ops.call(graph, *inputs, inputs_dict=None)
- Call a graph. - The inputs and inputs_dict tensors are passed as graph inputs. You can specify an input either positionally using inputs or via a tensor map using inputs_dict. - Graph inputs are determined when the graph is created using ir.create_graph(callable, ...). The order of the inputs is the same as the order of the tensor inputs in the function signature and the order in which the popxl.graph_inputs are called. - See create_graph() for more information. - Parameters
- graph (Graph) – The graph to call. 
- *inputs (Union[TensorLike, Iterable[TensorLike]]) – Provide inputs via position. 
- inputs_dict (Optional[Mapping[Tensor, TensorLike]]) – Provide inputs via graph tensor. Mapping of graph tensor -> parent tensor.
 
- Returns
- Tuple of the output tensors of the call in the parent graph. 
- Return type
- Tuple[Tensor, …] 
 
- popxl.ops.call_with_info(graph, *inputs, inputs_dict=None, check_inputs=True)
- Call a graph and return information about the call site. - The inputs and inputs_dict tensors are passed as graph inputs. You can specify an input either positionally using inputs or via a tensor map using inputs_dict. This op returns a CallSiteInfo that can be used to inspect the call site inputs and outputs. - Graph inputs are determined when the graph is created using ir.create_graph(callable, ...). - The order of the inputs is the same as the order of the tensor inputs in the function signature and the order in which the popxl.graph_inputs are called. - See create_graph() for more information. - Parameters
- graph (Graph) – The graph to call. 
- *inputs (Union[TensorLike, Iterable[TensorLike]]) – Provide inputs via position. 
- inputs_dict (Optional[Mapping[Tensor, TensorLike]]) – Provide inputs via graph tensor. Mapping of graph tensor -> parent tensor.
- check_inputs (bool) – Check when called if all inputs have been provided. Defaults to True. 
 
- Raises
- ValueError – If: - An incorrect number of inputs have been provided. - A parent input tensor is not in the parent graph. - A graph input tensor is specified twice. 
- TypeError – If: - A graph input tensor is specified twice. - A graph input cannot be coerced into a tensor. 
 
- Returns
- Information on the created call site. 
- Return type
- CallSiteInfo 
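The rules for combining positional inputs with inputs_dict can be modelled in plain Python. This is a sketch, not PopXL's implementation: graph inputs are represented by names, resolve_inputs is a hypothetical helper, and it raises ValueError for a doubly specified input even though the documentation above lists that case under both ValueError and TypeError.

```python
def resolve_inputs(graph_inputs, *inputs, inputs_dict=None, check_inputs=True):
    """Map each graph input name to the parent value bound to it."""
    inputs_dict = dict(inputs_dict or {})
    if len(inputs) > len(graph_inputs):
        raise ValueError("too many positional inputs")
    # Positional inputs bind to graph inputs in declaration order.
    resolved = dict(zip(graph_inputs, inputs))
    for graph_input, parent_value in inputs_dict.items():
        if graph_input not in graph_inputs:
            raise ValueError(f"{graph_input!r} is not a graph input")
        if graph_input in resolved:
            raise ValueError(f"{graph_input!r} is specified twice")
        resolved[graph_input] = parent_value
    # With check_inputs=True every graph input must have received a value.
    if check_inputs and len(resolved) != len(graph_inputs):
        missing = [name for name in graph_inputs if name not in resolved]
        raise ValueError(f"missing inputs: {missing}")
    return resolved


# "a" is given by position, "b" through the mapping.
print(resolve_inputs(["a", "b"], 1, inputs_dict={"b": 2}))  # {'a': 1, 'b': 2}
```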
 
- popxl.ops.cast(t, data_type)
- Cast a tensor to a specific data type. - This operation casts tensor t to data type data_type. - See also ONNX Cast. - Parameters
- t (Tensor) – The tensor to be cast. 
- data_type (popxl.dtypes.dtype) – The dtype to cast to. 
 
- Raises
- TypeError – If data_type is of type float8_143 or float8_152.
- Returns
- The tensor cast to the specified type. 
- Return type
- Tensor
 
- popxl.ops.ceil(t)
- Compute the ceiling of each element of the input tensor. NaN values are propagated. 
- popxl.ops.clamp(t, min=-inf, max=inf)
- Clip all elements so they are within the range [min, max]. NaN values are propagated. 
- popxl.ops.clip(t, min=-inf, max=inf)
- Clip all elements so they are within the range [min, max]. NaN values are propagated. 
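The per-element behaviour of clamp and clip, including NaN propagation, can be sketched for a single value. The helper clamp_scalar is hypothetical; its lo and hi arguments stand for min and max and are renamed only to avoid shadowing the Python built-ins.

```python
import math


def clamp_scalar(x, lo=-math.inf, hi=math.inf):
    """Clamp x to the range [lo, hi]; a NaN input is returned unchanged."""
    # Comparisons with NaN are always False, so handle it explicitly.
    if math.isnan(x):
        return x
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


print(clamp_scalar(5.0, 0.0, 1.0))    # 1.0
print(clamp_scalar(-2.0, lo=0.0))     # 0.0
print(clamp_scalar(math.nan, 0.0, 1.0))  # nan
```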
- popxl.ops.concat(ts, axis=0)
- Concatenate tensors along an axis. The result will be copied to a new tensor. - See also ONNX Concat. 
- popxl.ops.concat_(ts, axis=0)
- Concatenate tensors along an axis. - The result will alias both of the input tensors. 
- popxl.ops.conditional(cond, then_branch, else_branch, then_inputs=None, else_inputs=None, then_inputs_dict=None, else_inputs_dict=None)
- Execute then_branch or else_branch according to the value of tensor cond at runtime. - The then/else_inputs and then/else_inputs_dict tensors are passed as then/else_branch inputs. You can specify a then/else input either positionally using then/else_inputs or via a tensor map using then/else_inputs_dict. - Graph inputs are determined when the graph is created using ir.create_graph(callable, ...). - The order of the inputs is the same as the order of the tensor inputs in the function signature and the order in which the popxl.graph_inputs are called. - See create_graph() for more information. - Parameters
- cond (Tensor) – A boolean single-value tensor. If true the then_branch is executed otherwise the else_branch is executed. 
- then_branch (Graph) – Graph to run if condition is true. 
- else_branch (Graph) – Graph to run if condition is false. 
- then_inputs (Optional[Iterable[Union[Tensor, Iterable[Tensor]]]]) – Provide inputs to then_branch via position. then_inputs follows the same rules as inputs in the call and repeat ops.
- else_inputs (Optional[Iterable[Union[Tensor, Iterable[Tensor]]]]) – Provide inputs to else_branch via position. else_inputs follows the same rules as inputs in the call and repeat ops.
- then_inputs_dict (Optional[Mapping[Tensor, Tensor]]) – Provide inputs to then_branch via a tensor map. Mapping of graph tensor -> parent tensor. then_inputs_dict follows the same rules as inputs_dict in the call and repeat ops.
- else_inputs_dict (Optional[Mapping[Tensor, Tensor]]) – Provide inputs to else_branch via a tensor map. Mapping of graph tensor -> parent tensor. else_inputs_dict follows the same rules as inputs_dict in the call and repeat ops.
 
- Raises
- ValueError – If: - An incorrect number of inputs have been provided. - A parent input tensor is not in the parent graph. - A graph input tensor is specified twice. 
- TypeError – If: - A graph input tensor is specified twice. - A graph input cannot be coerced into a tensor. 
 
- Returns
- The values that are live after the execution of the conditional. The return values of then_branch and else_branch must have the same data type, and the two branches must have the same number of outputs. The shapes of the inputs and outputs of then_branch and else_branch must also be the same.
- Return type
- Tuple[Tensor, …] 
 
- popxl.ops.conditional_with_info(cond, then_branch, else_branch, then_inputs=None, else_inputs=None, then_inputs_dict=None, else_inputs_dict=None, check_inputs=True)
- Execute then_branch or else_branch according to the value of tensor cond at runtime and return the call site info. - The then/else_inputs and then/else_inputs_dict tensors are passed as then/else_branch inputs. You can specify a then/else input either positionally using then/else_inputs or via a tensor map using then/else_inputs_dict. - Graph inputs are determined when the graph is created using ir.create_graph(callable, ...). - The order of the inputs is the same as the order of the tensor inputs in the function signature and the order in which the popxl.graph_inputs are called. - See create_graph() for more information. - Parameters
- cond (Tensor) – A boolean single-value tensor. If true the then_branch is executed otherwise the else_branch is executed. 
- then_branch (Graph) – Graph to run if condition is true. 
- else_branch (Graph) – Graph to run if condition is false. 
- then_inputs (Optional[Iterable[Union[Tensor, Iterable[Tensor]]]]) – Provide inputs to then_branch via position. then_inputs follows the same rules as inputs in the call and repeat ops.
- else_inputs (Optional[Iterable[Union[Tensor, Iterable[Tensor]]]]) – Provide inputs to else_branch via position. else_inputs follows the same rules as inputs in the call and repeat ops.
- then_inputs_dict (Optional[Mapping[Tensor, Tensor]]) – Provide inputs to then_branch via a tensor map. Mapping of graph tensor -> parent tensor. then_inputs_dict follows the same rules as inputs_dict in the call and repeat ops.
- else_inputs_dict (Optional[Mapping[Tensor, Tensor]]) – Provide inputs to else_branch via a tensor map. Mapping of graph tensor -> parent tensor. else_inputs_dict follows the same rules as inputs_dict in the call and repeat ops.
- check_inputs (bool) – Check when called if all inputs have been provided to both graphs. Defaults to True. 
 
- Raises
- ValueError – If: - An incorrect number of inputs have been provided. - A parent input tensor is not in the parent graph. - A graph input tensor is specified twice. 
- TypeError – If: - A graph input tensor is specified twice. - A graph input cannot be coerced into a tensor. 
 
- Returns
- Information on the created conditional site. 
- Return type
- ConditionalSiteInfo 
 
- popxl.ops.conv(t, weight, stride=(1, 1), padding=(0, 0, 0, 0), dilation=(1, 1), groups=1, pad_type='not_set', available_memory_proportions=None, partials_types=None, enable_conv_dithering=None)
- Use the convolution operator on a tensor. - The convolution operator consumes an input tensor and a filter, and computes the output. - Parameters
- t (Tensor) – Input data tensor from the previous layer. If the input is a 3D tensor, the size is (N, C, L), where N is the batch size, C is the number of channels and L is the length. If the input is a 2D image, the size is (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and width. If the input is a 3D image, the size is (N, C, D, H, W), where N is the batch size, C is the number of channels, D is the depth, and H and W are the height and width. 
- weight (Tensor) – The weight tensor used in the convolution. If the input is a 3D tensor, the weight size is (M, C/group, k), where C is the number of channels, k is the length of the kernel and M is the number of feature maps. If the input is a 2D image, the weight size is (M, C/group, kH, kW), where C is the number of channels, kH and kW are the height and width of the kernel, and M is the number of feature maps. If the input is a 3D image, the weight size is (M, C/group, kD, kH, kW), where C is the number of channels, kD, kH and kW are the depth, height and width of the kernel, and M is the number of feature maps. 
- stride (Tuple[int]) – Stride along each spatial axis. 
- padding (Tuple[int]) – Padding for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and gives the number of pixels added at the beginning or end of the corresponding axis. The padding format is [x1_begin, x2_begin, …, x1_end, x2_end, …], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i.
- dilation (Tuple[int]) – Dilation value along each spatial axis of the filter. 
- groups (int) – Number of groups that the input channels and output channels are divided into. Defaults to 1. 
- pad_type (PadType) – Must be one of “not_set”, “same_upper”, “same_lower” or “valid”. The default value is “not_set”, which means explicit padding is used. “same_upper” and “same_lower” pad the input so that output_shape[i] = ceil(input_shape[i] / strides[i]) for each axis i. The padding is split equally between the two sides when the total is even. When the total is odd, the extra padding is added at the end for “same_upper” and at the beginning for “same_lower”.
- available_memory_proportions (List[float]) – The available memory proportion per convolution, each in the range [0, 1). 
- partials_types (List[str]) – The partials type per convolution. Choose between half and float. 
- enable_conv_dithering (List[int]) – Enable convolution dithering per convolution. If true, then convolutions with different parameters will be laid out from different tiles in an effort to improve tile balance in models. 
 
- Returns
- A tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths. 
- Return type
- Tensor
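The “same_upper” and “same_lower” rule and the dependence of the output size on kernel, stride and padding can be sketched for one spatial axis. The arithmetic follows the usual ONNX convolution convention; that PopXL matches it exactly is an assumption, and both helpers are hypothetical.

```python
import math


def same_padding(in_size, kernel, stride=1, dilation=1, pad_type="same_upper"):
    """Return (pad_begin, pad_end) so that the output size is ceil(in_size / stride)."""
    if pad_type not in ("same_upper", "same_lower"):
        raise ValueError(f"unsupported pad_type: {pad_type!r}")
    out_size = math.ceil(in_size / stride)
    effective_kernel = (kernel - 1) * dilation + 1
    total = max((out_size - 1) * stride + effective_kernel - in_size, 0)
    small, large = total // 2, total - total // 2
    # An odd total puts the extra element at the end for same_upper
    # and at the beginning for same_lower.
    return (small, large) if pad_type == "same_upper" else (large, small)


def conv_output_size(in_size, kernel, stride=1, dilation=1, pad_begin=0, pad_end=0):
    """Output length along one spatial axis with explicit padding."""
    effective_kernel = (kernel - 1) * dilation + 1
    return (in_size + pad_begin + pad_end - effective_kernel) // stride + 1


print(same_padding(5, kernel=2, pad_type="same_upper"))  # (0, 1)
print(same_padding(5, kernel=2, pad_type="same_lower"))  # (1, 0)
print(conv_output_size(5, kernel=2, pad_begin=0, pad_end=1))  # 5
```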
 
- popxl.ops.conv_pow2scaled(t, weight, log2_scale, stride=(1, 1), padding=(0, 0, 0, 0), dilation=(1, 1), groups=1, pad_type='not_set', available_memory_proportions=None, enable_conv_dithering=None)
- Perform a scaled convolution on a float8 tensor. - The convolution operator consumes an input tensor and a filter, and computes the output. The dtype of the input tensor and filter must be one of popxl.float8_143 or popxl.float8_152. - The result of the convolution is scaled by pow2(log2_scale) before it is converted to float16. - log2_scale must be a scalar tensor of type popxl.int32 and contain a runtime value in the range [-32, 32). - Parameters
- t (Tensor) – Input data tensor from the previous layer, of type either popxl.float8_143 or popxl.float8_152. If the input is a 3D tensor, the size is (N, C, L), where N is the batch size, C is the number of channels and L is the length. If the input is a 2D image, the size is (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and width. If the input is a 3D image, the size is (N, C, D, H, W), where N is the batch size, C is the number of channels, D is the depth, and H and W are the height and width.
- weight (Tensor) – The weight tensor used in the convolution, of type either popxl.float8_143 or popxl.float8_152. If the input is a 3D tensor, the weight size is (M, C/group, k), where C is the number of channels, k is the length of the kernel and M is the number of feature maps. If the input is a 2D image, the weight size is (M, C/group, kH, kW), where C is the number of channels, kH and kW are the height and width of the kernel, and M is the number of feature maps. If the input is a 3D image, the weight size is (M, C/group, kD, kH, kW), where C is the number of channels, kD, kH and kW are the depth, height and width of the kernel, and M is the number of feature maps.
- log2_scale (Tensor) – 32-bit integer power-of-two exponent. The convolution output is multiplied by pow2(log2_scale) before conversion to float16.
- stride (Tuple[int]) – Stride along each spatial axis. 
- padding (Tuple[int]) – Padding for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and gives the number of pixels added at the beginning or end of the corresponding axis. The padding format is [x1_begin, x2_begin, …, x1_end, x2_end, …], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i.
- dilation (Tuple[int]) – Dilation value along each spatial axis of the filter. 
- groups (int) – Number of groups that the input channels and output channels are divided into. Defaults to 1. 
- pad_type (PadType) – Must be one of “not_set”, “same_upper”, “same_lower” or “valid”. The default value is “not_set”, which means explicit padding is used. “same_upper” and “same_lower” pad the input so that output_shape[i] = ceil(input_shape[i] / strides[i]) for each axis i. The padding is split equally between the two sides when the total is even. When the total is odd, the extra padding is added at the end for “same_upper” and at the beginning for “same_lower”.
- available_memory_proportions (List[float]) – The available memory proportion per convolution, each in the range [0, 1). 
- enable_conv_dithering (List[int]) – Enable convolution dithering per convolution. If true, then convolutions with different parameters will be laid out from different tiles in an effort to improve tile balance in models. 
 
- Returns
- A tensor of type popxl.float16 that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.
- Return type
- Tensor
- Raises
- TypeError – If the input or weight tensors do not have a dtype in {popxl.float8_143, popxl.float8_152}, or if the log2_scale tensor does not have dtype popxl.int32.
- ValueError – If log2_scale is not a scalar tensor.
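The power-of-two scaling applied to the convolution result can be sketched for a single value. The helper pow2_scale is hypothetical and works on ordinary Python floats, so it does not model float8 inputs or the rounding of the final float16 conversion. The range check mirrors the documented range of log2_scale, but raising an error for it here is a choice of this sketch.

```python
import math


def pow2_scale(value, log2_scale):
    """Multiply value by 2**log2_scale, with log2_scale in [-32, 32)."""
    if not -32 <= log2_scale < 32:
        raise ValueError("log2_scale must be in the range [-32, 32)")
    # ldexp(value, n) computes value * 2**n exactly for finite floats in range.
    return math.ldexp(value, log2_scale)


print(pow2_scale(3.0, -2))  # 0.75
print(pow2_scale(1.5, 4))   # 24.0
```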
 
 
- popxl.ops.conv_transpose(t, weight, stride=(1, 1), padding=(0, 0, 0, 0), dilation=(1, 1), groups=1, pad_type='not_set', output_padding=(), output_shape=(), available_memory_proportions=None, partials_types=None, enable_conv_dithering=None)
- Perform a convolution transpose operation on a tensor. - The convolution transpose operator consumes an input tensor and a filter, and computes the output. - If the padding parameter is provided, the shape of the output is generated automatically. output_shape can also be specified explicitly, in which case the padding values are generated automatically. See the parameter descriptions for more details. - See also PyTorch ConvTranspose2d, ONNX ConvTranspose. - Parameters
- t (Tensor) – Input data tensor from a previous layer. If the input is a 3D tensor, the size is (N, C, L), where N is the batch size, C is the number of channels and L is the length. If the input is a 2D image, the size is (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and width. If the input is a 3D image, the size is (N, C, D, H, W), where N is the batch size, C is the number of channels, D is the depth, and H and W are the height and width. 
- weight (Tensor) – The weight tensor used in the convolution. If the input is a 3D tensor, the weight size is (M, C/group, k), where C is the number of channels, k is the length of the kernel and M is the number of feature maps. If the input is a 2D image, the weight size is (M, C/group, kH, kW), where C is the number of channels, kH and kW are the height and width of the kernel, and M is the number of feature maps. If the input is a 3D image, the weight size is (M, C/group, kD, kH, kW), where C is the number of channels, kD, kH and kW are the depth, height and width of the kernel, and M is the number of feature maps. 
- padding (Tuple[int]) – Padding for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and gives the number of pixels added at the beginning or end of the corresponding axis. The padding format is [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. If the padding parameter is provided, the shape of the output is generated automatically. See ONNX ConvTranspose for details.
- groups (int) – Number of groups that the input channels and output channels are divided into. Defaults to 1. 
- pad_type (PadType) – Must be one of “not_set”, “same_upper”, “same_lower” or “valid”. The default value is “not_set”, which means explicit padding is used. “same_upper” and “same_lower” pad the input such that output_shape[i] = ceil(input_shape[i] / strides[i]) for each axis i. The padding is split equally between the two sides when the total is even. When the total is odd, the extra padding is added at the end for “same_upper” and at the beginning for “same_lower”.
- output_padding (Tuple[int]) – Additional elements added to the side with higher coordinate indices in the output. Each padding value in output_padding must be strictly less than the corresponding stride/dilation dimension. Note that this parameter does not directly affect the computed output values. It only controls the selection of the computed values, so changing it only adds or removes output elements. If output_shape is explicitly provided, output_padding does not contribute additional size to output_shape but participates in the computation of the needed padding amount.
- output_shape (Tuple[int]) – The shape of the output can be set explicitly, which causes the padding values to be generated automatically. If output_shape is specified, the padding values are ignored. See ONNX ConvTranspose for details on how the padding is generated. 
- available_memory_proportions (List[float]) – The available memory proportion per convolution, each in the range [0, 1). 
- partials_types (List[str]) – The partials type per convolution. Choose between half and float. 
- enable_conv_dithering (List[int]) – Enable convolution dithering per convolution. If true, then convolutions with different parameters will be laid out from different tiles in an effort to improve tile balance in models. 
 
- Returns
- Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, pad lengths and group count. 
- Return type
- Tensor
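When padding is given explicitly, the ONNX ConvTranspose specification derives the output size per spatial axis as stride * (input - 1) + output_padding + ((kernel - 1) * dilation + 1) - pad_begin - pad_end. The hypothetical helper below evaluates that formula; that PopXL follows the ONNX rule exactly is an assumption based on the references to ONNX above.

```python
def conv_transpose_output_size(in_size, kernel, stride=1, dilation=1,
                               pad_begin=0, pad_end=0, output_padding=0):
    """Output length along one spatial axis of a transposed convolution (ONNX rule)."""
    effective_kernel = (kernel - 1) * dilation + 1
    return (stride * (in_size - 1) + output_padding
            + effective_kernel - pad_begin - pad_end)


# Length 4, kernel 3, stride 2, no padding.
print(conv_transpose_output_size(4, kernel=3, stride=2))  # 9
# Padding of 1 on both sides plus output_padding=1 exactly doubles the input length.
print(conv_transpose_output_size(4, kernel=3, stride=2,
                                 pad_begin=1, pad_end=1, output_padding=1))  # 8
```

In the second call output_padding adds one element at the high-index side, which is how a stride-2 transposed convolution can exactly undo the downsampling of a stride-2 convolution on an even-length input.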
 
- popxl.ops.conv_transpose_pow2scaled(t, weight, log2_scale, stride=(1, 1), padding=(0, 0, 0, 0), dilation=(1, 1), groups=1, pad_type='not_set', output_padding=(), output_shape=(), available_memory_proportions=None, enable_conv_dithering=None)
- Perform a single transposed and scaled convolution operation on a tensor. - This operator consumes an input, weight, and log2 scale tensor to compute a transposed convolution, then scales the convolution output by - pow2(log2_scale)before converting to float16.- The dtype of the input - tand weight tensor must be one of- popxl.float8_143or- popxl.float8_152. The- log2_scalemust be a scalar tensor of type- popxl.int32and contain a runtime value in the range- [-32, 32)- If the - paddingparameter is provided the shape of the output is auto generated.- output_shapecan also be explicitly specified in which case- paddingvalues are auto generated. See attribute descriptions for more details.- See also PyTorch Tensor.ConvTranspose2d, ONNX ConvTranspose. - popxl.ops.t
- Input data tensor from previous layer of type either - popxl.float8_143or- popxl.float8_152; If the input is a 3D tensor, the size is (N, C, L), where N is the batch size, C is the number of channels, L is the length; If the input is a 2D image, the size is (N, C, H, W), where N is the batch size, C is the number of channels, H and W are the height and width; If the input is a 3D image, the size is (N, C, D, H, W), where N is the batch size, C is the number of channels, D is the depth, H and W are the height and width.- Type
 
 - popxl.ops.weight
- The weight tensor that will be used as a kernel in the convolution, of dtype either - popxl.float8_143or- popxl.float8_152; If the input is a 3D tensor, the weight size is (M, C/group, k), where C is the number of channels, k is the length of the kernel, M is the number of feature maps. If the input is a 2D image, the weight size is (M, C/group, kH, kW), where C is the number of channels, kH and kW are the height and width of the kernel, M is the number of feature maps. If the input is a 3D image, the weight size is (M, C/group, kD, kH, kW), where C is the number of channels, kD, kH and kW are the depth, height and width of the kernel, M is the number of feature maps.- Type
 
 - popxl.ops.log2_scale
- 32-bit integer power-of-two exponent, where the convolution output is multiplied by - pow2(log2_scale)before conversion to float16. Must be of dtype- popxl.int32.- Type
 
 - popxl.ops.padding
- Padding for the beginning and ending along each spatial axis, it can take any value greater than or equal to 0. The value represent the number of pixels added to the beginning and end part of the corresponding axis. - padsformat should be- [x1_begin, x2_begin...x1_end, x2_end,...], where- xi_beginis the number of pixels added at the beginning of axis- iand- xi_endis the number of pixels added at the end of axis- i. If the pads parameter is provided the shape of the output is auto generated. See ONNX Conv Transpose for details.- Type
- Tuple[int] 
 
 - popxl.ops.groups
- Number of groups input channels and output channels are divided into. - Type
- int(default is 1) 
 
 - popxl.ops.pad_type
- The - pad_typemust be either “not_set”, “same_upper”, “same_lower” or “valid”. The default value is “not_set”, which means explicit padding is used. “same_upper” or “same_lower” mean pad the input such that- output_shape[i] = ceil(input_shape[i] / strides[i])for each axis- i. The padding is split between the two sides equally or almost equally (depending on whether it is even or odd). In the case that the padding is an odd number, the extra padding is added at the end for “same_upper” and at the beginning for “same_lower”.- Type
- PadType(default is not_set) 
 
 - popxl.ops.output_padding
- Additional elements added to the side with higher coordinate indices in the output. Each padding value in - output_paddingmust be strictly less than the corresponding stride/dilation dimension. Note that this attribute doesn’t directly affect the computed output values. It only controls the selection of the computed values, so changing this attribute only adds or removes output elements. If- output_shapeis explicitly provided,- output_paddingdoes not contribute additional size to- output_shapebut participates in the computation of the needed padding amount.- Type
- Tuple[int] 
 
 - popxl.ops.output_shape
- The shape of the output can be set explicitly, which causes the padding values to be generated automatically. If output_shape is specified, the pads values are ignored. See ONNX Conv Transpose for details on how padding is generated. - Type
- Tuple[int] 
 
 - popxl.ops.available_memory_proportions
- The available memory proportions per conv, each [0, 1). - Type
- List[float] 
 
 - popxl.ops.enable_conv_dithering
- Enable convolution dithering per convolution. If true, then convolutions with different parameters will be laid out from different tiles in an effort to improve tile balance in models. - Type
- List[int] 
 
 - Returns
- Output data tensor that contains the result of the convolution. The output dimensions are functions
- of the kernel size, stride size, pad lengths and group count. 
 
- Return type
- Raises
- TypeError – If the tensor or weight tensors do not have a dtype in {popxl.float8_143, popxl.float8_152}, or if the log2_scale tensor does not have dtype popxl.int32.
- ValueError – If log2_scale is not a scalar tensor.
 
- Parameters
 
- popxl.ops.cos(t)
- Compute the cosine of each element of the input tensor. - See also PyTorch Tensor.cos. 
- popxl.ops.cumsum(t, dim=0)
- Performs the cumulative sum of the input elements along the given dimension dim. - See also PyTorch Tensor.cumsum, NumPy cumsum. 
- popxl.ops.detach(t)
- Prevent gradient computation of this tensor. - This operation is numerically equivalent to the identity op. - See also PyTorch Tensor.detach. 
- popxl.ops.detach_(t)
- Prevent in-place gradient computation of this tensor. - The in-place version of detach(). The behaviour is the same: it blocks gradient propagation on the input tensor but does not make a copy of the input tensor. - See also PyTorch Tensor.detach_. 
- popxl.ops.div(lhs, rhs)
- Divide two tensors elementwise. - Follows NumPy broadcasting rules. The arguments must have the same dtype. The output will be the same dtype as the inputs. Floor division is used with integer values. - See also PyTorch Tensor.div, ONNX Div. 
- popxl.ops.dropout(t, seed_tensor, p)
- Randomly set elements of the input tensor to zero. - This operation will zero elements of tensor t with a probability of p. The dropout mask is created using samples from a Bernoulli distribution seeded with the seed_tensor. - You need to manage updating the seed_tensor for each forward pass and replica. - See also ONNX Dropout. - Parameters
- Returns
- A new tensor with the dropout applied. 
- Return type
 
- popxl.ops.dynamic_slice(t, index, axes, sizes, no_overlap)
- Return a cloned slice of the input tensor. - The name “dynamic” refers to the fact that the index can be specified at runtime. - A slice along an axis can be defined by the tuple (start, stop, step) where: - start is the index for the respective axis
- stop is index + size for the respective axis
- step equals 1
 - Limitations: - Assuming we would like to slice t with dimension [4, 3]: - A step other than 1 is not supported (that is, t[::2,:] is not supported)
- Negative slicing is not supported (that is, t[:-1,:] is not supported)
- A stop value greater than the size of the axis is not supported (that is, t[:5,:] is not supported)
 - Parameters
- t (Tensor) – The input tensor. 
- index (Tensor) – The indices to start the slice from. 
- axes (List[int]) – The axes to slice from. 
- sizes (List[int]) – The sizes of the slices for the specified axes. For example, if index = [1, 2], axes = [0, 3] and sizes = [2, 4], then, since stop is index + size, the tensor will be sliced as t[1:3, :, :, 2:6].
- no_overlap (bool) – If set to true, then correct gradient backpropagation is only guaranteed if each region in the output tensor has exactly one populator (operation that writes data to this region). There are no run-time or compile-time checks possible to ensure this. 
 
- Returns
- A clone (not a view) of the sliced input tensor. 
- Return type
 
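The slice definition above (start is the index, stop is index + size, step is 1) can be sketched in plain Python. This is only an illustration under stated assumptions: `dynamic_slice_bounds` and `slice_2d` are hypothetical helpers, not PopXL functions, and plain integers stand in for the runtime `index` tensor.

```python
def dynamic_slice_bounds(index, axes, sizes, shape):
    """Build one Python slice per axis of `shape` (hypothetical helper)."""
    slices = [slice(None)] * len(shape)
    for start, axis, size in zip(index, axes, sizes):
        stop = start + size  # stop is index + size, step is always 1
        if start < 0:
            raise ValueError("negative slicing is not supported")
        if stop > shape[axis]:
            raise ValueError("stop exceeds the size of the axis")
        slices[axis] = slice(start, stop, 1)
    return tuple(slices)


def slice_2d(rows, slices):
    """Apply a (row_slice, col_slice) pair to a nested list, returning a copy."""
    row_slice, col_slice = slices
    return [row[col_slice] for row in rows[row_slice]]


t = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]  # dimension [4, 3]
bounds = dynamic_slice_bounds(index=[1], axes=[0], sizes=[2], shape=(4, 3))
print(bounds)               # (slice(1, 3, 1), slice(None, None, None))
print(slice_2d(t, bounds))  # [[3, 4, 5], [6, 7, 8]]
```

The two `ValueError` checks mirror the listed limitations; whether the real op reports them in the same way is not stated in this section.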
- popxl.ops.dynamic_update(t, index, t_update, axes, sizes, no_overlap)
- Update a slice of a tensor. - The name “dynamic” refers to the fact that the index can be specified at runtime. - index, axes and sizes determine the slice of t which will be updated. The dimensions of this slice and t_update must match. A slice along an axis can be defined by the tuple (start, stop, step) where: - start is the index for the respective axis
- stop is index + size for the respective axis
- step equals 1
 - Limitations: - Assuming we would like to update t with dimension [4, 3], the slicing of t will have the following limitations: - A step other than 1 is not supported (that is, t[::2,:] is not supported)
- Negative slicing is not supported (that is, t[:-1,:] is not supported)
- A value of stop larger than the size of the axis is not supported (for example, t[:5,:] is not supported)
 - Parameters
- t (Tensor) – The tensor to update. 
- index (Tensor) – The indices to start the slice from. 
- t_update (Tensor) – The tensor to update t with.
- axes (Iterable[int]) – The axes of t to make the update on.
- sizes (Iterable[int]) – The sizes of the updates along the specified axes. For example, if index = [1, 2], axes = [0, 3] and sizes = [2, 4], then, since stop is index + size, the tensor will be updated at t[1:3, :, :, 2:6].
- no_overlap (bool) – If set to true, then correct gradient backpropagation is only guaranteed if each region in the output tensor has exactly one populator (operation that writes data to this region). There are no run-time or compile-time checks possible to ensure this. 
 
- Returns
- The updated tensor. 
- Return type
 
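A minimal sketch of the out-of-place update described above, restricted to axis 0 of a 2-D nested list. `dynamic_update_rows` is a hypothetical helper rather than a PopXL function, and the size of the update is taken from `t_update` itself, which is an assumption of this sketch.

```python
def dynamic_update_rows(t, index, t_update):
    """Return a copy of `t` with rows [index, index + size) replaced."""
    size = len(t_update)
    stop = index + size  # stop is index + size, step is always 1
    if index < 0 or stop > len(t):
        raise ValueError("slice must lie inside the tensor")
    if any(len(row) != len(t[0]) for row in t_update):
        raise ValueError("slice and t_update dimensions must match")
    out = [list(row) for row in t]  # copy, so the input is left untouched
    out[index:stop] = [list(row) for row in t_update]
    return out


t = [[0, 0, 0] for _ in range(4)]  # dimension [4, 3]
updated = dynamic_update_rows(t, index=1, t_update=[[1, 1, 1], [2, 2, 2]])
print(updated)  # [[0, 0, 0], [1, 1, 1], [2, 2, 2], [0, 0, 0]]
```

An in-place variant would write into `t` directly instead of copying it first.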
- popxl.ops.dynamic_update_(t, index, t_update, axes, sizes, no_overlap)
- Update a slice of a tensor in place. - Dynamically updates tensor t in place. The name “dynamic” refers to the fact that the index can be specified at runtime. - index, axes and sizes determine the slice of t which will be updated. The dimensions of this slice and t_update must match. A slice along an axis can be defined by the tuple (start, stop, step) where: - start is the index for the respective axis
- stop is index + size for the respective axis
- step equals 1
 - Limitations: - Assuming we would like to update t with dimension [4, 3], the slicing of t will have the following limitations: - A step value other than 1 is not supported (that is, t[::2,:] is not supported)
- Negative slicing is not supported (that is, t[:-1,:] is not supported) 
- A stop value larger than the size of the axis is not supported (for example, t[:5,:] is not supported)
 - Parameters
- t (Tensor) – Tensor to update. 
- index (Tensor) – The indices to start the slice from. 
- t_update (Tensor) – The tensor to update t with.
- axes (List[int]) – The axes of t to make the update on.
- sizes (List[int]) – The sizes of the updates along the specified axes. For example, if index = [1, 2], axes = [0, 3] and sizes = [2, 4], then, since stop is index + size, the tensor will be updated at t[1:3, :, :, 2:6].
- no_overlap (bool) – If set to true, then correct gradient backpropagation is only guaranteed if each region in the output tensor has exactly one populator (operation that writes data to this region). There are no run-time or compile-time checks possible to ensure this. 
 
- Returns
- The updated tensor. 
- Return type
 
- popxl.ops.equal(lhs, rhs)
- Apply an elementwise equality operation. - Follows NumPy broadcasting rules. - See also PyTorch Tensor.equal, NumPy equal, ONNX Equal. 
- popxl.ops.exp(t)
- Compute the exponential of the elements of input tensor. - See also PyTorch Tensor.exp, NumPy exp, ONNX Exp. 
- popxl.ops.exp_(t)
- Compute the exponential of the elements of input tensor (in-place). - See also PyTorch Tensor.exp_. 
- popxl.ops.flatten(t)
- Flatten a tensor. - Internally this uses - reshape().- See also PyTorch Tensor.flatten, ONNX Flatten. 
- popxl.ops.flatten_(t)
- Flatten a tensor in place. - Internally this uses - reshape_().- This is the in-place version of - flatten().
- popxl.ops.fmod(lhs, rhs)
- Compute the elementwise remainder after division (modulo operation). - Follows NumPy broadcasting rules. Arguments must have the same dtype. - See also PyTorch Tensor.fmod, NumPy fmod. 
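As a rough illustration, and assuming the op follows the convention of NumPy fmod (the result takes the sign of the dividend), the behaviour can be compared with Python's `%` operator, which takes the sign of the divisor. `fmod_elementwise` is a hypothetical helper, not a PopXL function.

```python
import math


def fmod_elementwise(lhs, rhs):
    """Elementwise C-style remainder: the result has the sign of lhs."""
    return [math.fmod(a, b) for a, b in zip(lhs, rhs)]


lhs = [7.0, -7.0, 7.5]
rhs = [3.0, 3.0, -2.0]
print(fmod_elementwise(lhs, rhs))          # [1.0, -1.0, 1.5]
print([a % b for a, b in zip(lhs, rhs)])   # [1.0, 2.0, -0.5]
```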
- popxl.ops.gather(t, indices, axis=0, available_memory_proportion=None, zero_OOR=False)
- Select multiple elements from a tensor along specified axes. - Elements are specified via indices, along a specified axis. Equivalent to numpy.take(). Note that this is different from torch.gather(). - Examples:
    x = popxl.variable(np.arange(16).reshape(4, 4))
    # [[ 0,  1,  2,  3],
    #  [ 4,  5,  6,  7],
    #  [ 8,  9, 10, 11],
    #  [12, 13, 14, 15]]
    gather(x, [3, 1, 2]) == Tensor([x[3], x[1], x[2]])
    # [[12, 13, 14, 15],
    #  [ 4,  5,  6,  7],
    #  [ 8,  9, 10, 11]]
    gather(x, [[0, 1], [1, 2]]) == gather(x, [0, 1, 1, 2]).reshape(2, 2, 4)
    # [[[ 0,  1,  2,  3],
    #   [ 4,  5,  6,  7]],
    #  [[ 4,  5,  6,  7],
    #   [ 8,  9, 10, 11]]]
 - See also PyTorch Tensor.gather, ONNX Gather. - Parameters
- t (Tensor) – The input tensor. 
- indices (Tensor) – The indices of the elements to extract. 
- axis (int) – The axis to gather on. The default is 0. 
- available_memory_proportion (Optional[float]) – The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation. Defaults to 1.0 if not set globally. 
- zero_OOR (bool) – If False, out of range (OOR) indices will produce undefined data. If True, out of range indices will produce zeros.
 
- Returns
- The gathered elements concatenated. 
- Return type
 
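The axis-0 behaviour in the examples above can be reproduced with nested Python lists. This is a sketch only: `gather_axis0` is a hypothetical helper, and because "undefined data" cannot be modelled in Python, it raises `IndexError` for out-of-range indices when `zero_oor` is false, which is an assumption of the sketch rather than documented behaviour.

```python
def gather_axis0(rows, indices, zero_oor=False):
    """Pick rows of a 2-D nested list; nested indices give nested output."""
    if isinstance(indices, list):
        return [gather_axis0(rows, i, zero_oor) for i in indices]
    if 0 <= indices < len(rows):
        return list(rows[indices])
    if zero_oor:
        return [0] * len(rows[0])  # out-of-range index produces zeros
    raise IndexError(f"index {indices} is out of range")


x = [list(range(r * 4, r * 4 + 4)) for r in range(4)]  # values 0..15 as 4x4
print(gather_axis0(x, [3, 1, 2]))
# [[12, 13, 14, 15], [4, 5, 6, 7], [8, 9, 10, 11]]
print(gather_axis0(x, [5], zero_oor=True))
# [[0, 0, 0, 0]]
```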
- popxl.ops.gelu(t)
- Compute the GELU activation on a tensor. - For more details, refer to the paper Gaussian Error Linear Units 
- popxl.ops.gelu_(t)
- Compute the GELU activation on a tensor (in-place). - For more details, refer to the paper Gaussian Error Linear Units 
- popxl.ops.geluerf(t)
- Compute the accurate GELU activation on a tensor. - For more details, refer to the paper Gaussian Error Linear Units 
- popxl.ops.geluerf_(t)
- Compute the accurate GELU activation on a tensor (in-place). - For more details, refer to the paper Gaussian Error Linear Units 
- popxl.ops.greater(input, other)
- Computes where the first tensor is greater than the second tensor. - This is an element-wise operation (with NumPy-style broadcasting support). - See also PyTorch greater, NumPy greater. 
- popxl.ops.group_norm(t, weight, bias, num_groups, eps=1e-05)
- Apply group normalisation to a tensor. - For more details, refer to the paper Group Normalization. - Parameters
- t (Tensor) – Tensor to be normalized. 
- weight (Tensor) – Tensor used to scale the result of normalisation. 
- bias (Tensor) – Tensor used to shift the result of normalisation. 
- num_groups (int) – Number of groups to separate the channels into. 
- eps (float) – The small value to use to avoid division by zero. 
 
- Returns
- The group normalised tensor. 
- Return type
 
- popxl.ops.groupedgather(t, indices, axis=0, group_size=1, available_memory_proportion=None, zero_OOR=False)
- Select multiple elements from a tensor along specified axes. - Elements are specified via indices, along a specified axis for each group. - Examples:
    x = popxl.variable(np.arange(16).reshape(2, 2, 4))
    # [[[ 0,  1,  2,  3],
    #   [ 4,  5,  6,  7]],
    #  [[ 8,  9, 10, 11],
    #   [12, 13, 14, 15]]]
    groupedgather(x, [[0, 1, 0], [1, 0, 1]]) == Tensor(
        [[x[0][0], x[0][1], x[0][0]],
         [x[1][1], x[1][0], x[1][1]]]
    )
    # [[[ 0,  1,  2,  3],
    #   [ 4,  5,  6,  7],
    #   [ 0,  1,  2,  3]],
    #  [[12, 13, 14, 15],
    #   [ 8,  9, 10, 11],
    #   [12, 13, 14, 15]]]
 - Parameters
- t (Tensor) – The input tensor. 
- indices (Tensor) – The indices of the elements to extract. 
- axis (int) – The axis to gather on. The default is 0. 
- group_size (int) – The group size of the data. The default is 1. 
- available_memory_proportion (Optional[float]) – The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation. Defaults to 1.0 if not set globally. 
- zero_OOR (bool) – If False, out of range (OOR) indices will produce undefined data. If True, out of range indices will produce zeros.
 
- Returns
- The gathered elements concatenated. 
- Return type
 
- popxl.ops.histogram(t, levels, absolute_of_input)
- Compute the histogram of the input tensor. - All but the last bin are half-open. In other words, if levels is [1, 2, 3, 4], then the first bin is [1, 2) (including 1, but excluding 2) and the second is [2, 3). The last bin, however, is [3, 4], which includes 4. - See also PyTorch torch.histc, NumPy histogram.
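The bin rule described above (half-open bins, closed last bin) can be sketched in plain Python. `histogram` here is a hypothetical stand-in, not the PopXL op: it drops values outside the levels and treats `absolute_of_input` as "take the absolute value first", both of which are assumptions.

```python
def histogram(values, levels, absolute_of_input=False):
    """Count values per bin; all bins are [lo, hi) except the last, [lo, hi]."""
    counts = [0] * (len(levels) - 1)
    last = len(counts) - 1
    for v in values:
        if absolute_of_input:
            v = abs(v)
        for i in range(len(counts)):
            lo, hi = levels[i], levels[i + 1]
            if lo <= v < hi or (i == last and v == hi):
                counts[i] += 1
                break
    return counts


print(histogram([1, 1.5, 2, 3, 4, 5], [1, 2, 3, 4]))  # [2, 1, 2]
```

Here 4 lands in the last bin because that bin is closed, while 5 falls outside every bin and is not counted in this sketch.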
- popxl.ops.host_load(h2d_stream, name=None)
- Transfer a tensor from the host to the IPU. - This operation represents the transfer of data from the host to the IPU. It uses the existing host to IPU transfers created when building the IR, but defers the actual poplar::Copy until the op itself runs. This allows the copy to be scheduled as part of the normal op scheduling. - Data is sent from the host via the IStepIO object passed to session.run(). - Parameters
- h2d_stream (HostToDeviceStream) – Stream to load from. 
- name (str) – Name to use for the returned tensor. 
 
- Returns
- The output tensor streamed from the host. 
- Return type
 
- popxl.ops.host_store(d2h_stream, t)
- Transfer a tensor from the IPU to the host. - This operation represents the transfer of data from the IPU to the host. It uses the existing device to host transfers created when building the IR, but defers the actual poplar::Copy until the op itself runs. This allows the copy to be scheduled as part of the normal op scheduling. - Data is received on the host via the IStepIO object passed to session.run(). - Raises
- ValueError – If the stream shape or dtype doesn’t match the tensor shape. 
- Parameters
- d2h_stream (DeviceToHostStream) – The stream to use for the host store. 
- t (Tensor) – The input tensor to copy to host. 
 
- Return type
- None 
 
- popxl.ops.identity(t, output_name=None)
- Input is equal to the output. This can also be used to rename a Tensor. 
- popxl.ops.increment_mod(t, increment, modulus)
- Increment the elements of a tensor using modulo arithmetic. 
- popxl.ops.increment_mod_(t, increment, modulus)
- Increment the elements of a tensor using modulo arithmetic in place. 
- popxl.ops.init(shape, dtype, name=None, init_type='zero', meta_shape=None)
- Create a tensor that is initialised with zero or undefined values. - The returned tensor is not considered a variable. A variable must be created in the main graph; it can be initialised to arbitrary values and can be read/written with session methods. - In contrast, init can be executed anywhere so it can return an initialised tensor in non-main graphs. - The tensor can only be initialised to zero or undefined values. - Parameters
- dtype (dtypes.dtype) – Data type of the output tensor. 
- shape (Tuple[int]) – Shape of the output tensor. 
- name (str) – Name of the output tensor. 
- init_type (Union[Literal["zero"], Literal["undef"]]) – Initialisation of the output tensor. 
- meta_shape (Tuple[int]) – meta shape of tensor 
 
- Raises
- ValueError – If the init_type is unknown.
- Returns
- An initialised tensor. 
- Return type
 
- popxl.ops.interpolate(t, scale_factor=(1.0, 1.0, 1.0, 1.0), mode='nearest', nearest_mode='round_prefer_floor', coordinate_transformation_mode='half_pixel')
- Interpolate the input tensor. Each dimension value of the output tensor is: output_dimension = floor(input_dimension * scale_factor). - Parameters
- t (Tensor) – Input data tensor from previous layer. 
- scale_factor (Tuple[float]) – The scale array along each dimension. Each value must be greater than or equal to 1. The number of elements in scale_factor should be the same as the rank of the input t. 
- mode (InterpolateType) – The interpolation algorithm. Three modes are available: nearest (default), linear and cubic. 
- nearest_mode (InterpolateNearestType) – Four modes: round_prefer_floor (default, also known as round half down), round_prefer_ceil (also known as round half up), floor, ceil. Only used by nearest interpolation. It indicates how to get the “nearest” pixel in the input tensor from x_original, so this attribute is valid only if “mode” is “nearest”. 
- coordinate_transformation_mode (InterpolateCoordinateTransformationType) – - This attribute describes how to transform the coordinate in the interpolated tensor to the coordinate in the original tensor. The coordinate of each dimension is transformed individually. Let’s describe a case using axis x as an example. - Some variables are defined as follows: - x_interpolated: the coordinate of axis x in the interpolated tensor. 
- x_original: the coordinate of axis x in the original tensor. 
- length_original: the length of the original tensor in axis x. 
- length_interpolated: the length of the interpolated tensor in axis x. 
- roi_x: roi_x = (start_x, end_x) of the axis x in input “roi”. 
- scale: scale = length_interpolated / length_original. 
 - Then: - if coordinate_transformation_mode is “half_pixel”, x_original = (x_interpolated + 0.5) / scale - 0.5, 
- if coordinate_transformation_mode is “pytorch_half_pixel”, x_original = length_interpolated > 1 ? (x_interpolated + 0.5) / scale - 0.5 : 0, 
- if coordinate_transformation_mode is “align_corners”, x_original = x_interpolated * (length_original - 1) / (length_interpolated - 1), 
- if coordinate_transformation_mode is “asymmetric”, x_original = x_interpolated / scale, 
- if coordinate_transformation_mode is “tf_crop_and_resize”, x_original = length_interpolated > 1 ? start_x * (length_original - 1) + x_interpolated * (end_x - start_x) * (length_original - 1) / (length_interpolated - 1) : 0.5 * (start_x + end_x) * (length_original - 1). 
 
 
- Returns
- Output data tensor after interpolate. 
- Return type
 
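The coordinate transformation formulas listed above translate directly into a small function. This is a sketch for one axis: `x_original` is a hypothetical helper name, and the default `roi_x` of (0.0, 1.0) is an assumption made only so the example runs.

```python
def x_original(mode, x_interpolated, length_original, length_interpolated,
               roi_x=(0.0, 1.0)):
    """Map a coordinate in the interpolated tensor back to the original one."""
    scale = length_interpolated / length_original
    if mode == "half_pixel":
        return (x_interpolated + 0.5) / scale - 0.5
    if mode == "pytorch_half_pixel":
        if length_interpolated > 1:
            return (x_interpolated + 0.5) / scale - 0.5
        return 0.0
    if mode == "align_corners":
        # Undefined for length_interpolated == 1 (division by zero).
        return x_interpolated * (length_original - 1) / (length_interpolated - 1)
    if mode == "asymmetric":
        return x_interpolated / scale
    if mode == "tf_crop_and_resize":
        start_x, end_x = roi_x
        if length_interpolated > 1:
            return (start_x * (length_original - 1)
                    + x_interpolated * (end_x - start_x) * (length_original - 1)
                    / (length_interpolated - 1))
        return 0.5 * (start_x + end_x) * (length_original - 1)
    raise ValueError(f"unknown coordinate_transformation_mode: {mode}")


# Upscaling an axis of length 4 to length 8 (scale = 2).
print(x_original("half_pixel", 0, 4, 8))     # -0.25
print(x_original("asymmetric", 3, 4, 8))     # 1.5
print(x_original("align_corners", 7, 4, 8))  # 3.0
```

With "align_corners", the last interpolated coordinate (7) maps exactly onto the last original coordinate (3), which is what distinguishes it from "half_pixel".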
- popxl.ops.io_tile_copy(t)
- Copy a tensor to or from I/O tiles on the current IPU. 
- popxl.ops.ipu_copy(t, destination, source=None)
- Copy a tensor to an IPU. - Parameters
- Raises
- ValueError – If the source IPU could not be inferred and the source is not specified. 
- Returns
- The copied tensor. 
- Return type
 
- popxl.ops.isfinite(t)
- Return a boolean tensor of the same shape indicating which elements are finite (not NaN or infinity). 
- popxl.ops.isinf(t)
- Return a boolean tensor of the same shape indicating which elements are positive or negative infinity. 
- popxl.ops.isnan(t)
- Return a boolean tensor of the same shape indicating which elements are NaN. 
- popxl.ops.l1(t, axis=None, keepdims=False)
- Compute the sum of the magnitudes of the elements in a tensor (L1 norm) along specified axes. - Parameters
- t (Tensor) – Tensor to compute the L1 norm of. 
- axis (int or list) – Axis or axes to compute L1 norm along. If none is specified then all elements will be normalised. If an axis is negative then it indexes from the last to the first axis. 
- keepdims (bool) – Keep the axis that is being reduced (True) or not (False).
 
- Returns
- The reduced tensor containing the L1 norm of elements along the specified axes. 
- Return type
 
- popxl.ops.l2(t, axis=None, keepdims=False)
- Compute the square root of the sum of the squares of the elements in a tensor (L2 norm) along specified axes. - Parameters
- Returns
- The reduced tensor containing the L2 norm of elements along the specified axes. 
- Return type
 
- popxl.ops.lamb_square(t)
- Square each element before applying an add reduction. - Used in the LAMB optimizer: https://arxiv.org/abs/1904.00962 
- popxl.ops.layer_norm(t, weight, bias, eps=1e-05)
- Apply layer normalisation to a tensor. - Uses group_norm under the hood. - Parameters
- Returns
- The layer normalised tensor. 
- Return type
 
- popxl.ops.log(t)
- Compute the log of the elements of input tensor. - See also PyTorch torch.log, NumPy log, ONNX Log. 
- popxl.ops.logical_and(lhs, rhs)
- Compute the elementwise logical and of two tensors. - Follows NumPy broadcasting rules. Inputs will be cast to bool if necessary. - See also PyTorch Tensor.logical_and, NumPy logical_and. 
- popxl.ops.logical_not(t)
- Compute the elementwise not of a tensor. - Inputs will be cast to bool if necessary. - See also PyTorch Tensor.logical_not, NumPy logical_not. 
- popxl.ops.logical_or(lhs, rhs)
- Compute the elementwise logical or of the input tensors. - Follows NumPy broadcasting rules. Inputs will be cast to bool if necessary. - See also PyTorch Tensor.logical_or, NumPy logical_or. 
- popxl.ops.logsum(t, axis=None, keepdims=False)
- Compute the log of summed elements of a tensor along specified axes. - Supported dtypes: float. - Parameters
- t (Tensor) – Tensor to compute the log of the sum of elements. 
- axis (int or list) – Axis or axes to compute the log of the sum along. If none is specified all axes will be summed. If an axis is negative it indexes from the last to the first axis. 
- keepdims (bool) – Keep the axis that is being computed (True) or not (False).
 
- Returns
- A new tensor containing the log of the summed elements along the specified axes. 
- Return type
 
- popxl.ops.logsumexp(t, axis=None, keepdims=False)
- Compute the log of the summed exponentials of elements in a tensor, along specified axes. - Supported dtypes: floats. - See also PyTorch Tensor.logsumexp. - Parameters
- t (Tensor) – Tensor to compute the log of the summed exponentials of the elements. 
- axis (int or list) – Axis or axes to compute the log of the summed exponentials along. If none is specified all axes will be reduced. If axis is negative it indexes from the last to the first axis. 
- keepdims (bool) – Keep the axis that is being computed (True) or not (False).
 
- Returns
- A new tensor containing the log of the summed exponentials of the elements along the specified axes. 
- Return type
 
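For reference, the quantity being computed is log(sum(exp(x))). A common way to evaluate it without overflow is to subtract the maximum first; the sketch below shows that trick for a flat list. It is not a claim about how the PopXL op is implemented, and `logsumexp` here is a plain Python stand-in.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a flat list of floats."""
    m = max(values)
    # exp(v - m) is at most 1, so the sum cannot overflow.
    return m + math.log(sum(math.exp(v - m) for v in values))


print(math.isclose(logsumexp([0.0, 0.0]), math.log(2.0)))  # True
# A naive math.exp(1000.0) would raise OverflowError; the shifted form is fine.
print(math.isclose(logsumexp([1000.0, 1000.0]), 1000.0 + math.log(2.0)))  # True
```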
- popxl.ops.matmul(lhs, rhs, available_memory_proportion=None, output_type=None, partials_type=None)
- Perform matrix multiplication of two tensors. - Follows NumPy matrix multiplication rules for N-D tensors, see numpy.matmul(). - Arguments must have the same dtype. Shapes must be compatible as defined by the NumPy matrix multiplication rules. - See also PyTorch Tensor.matmul, NumPy matmul, ONNX MatMul. - Parameters
- lhs (Tensor) – Left hand side of matrix multiplication. 
- rhs (Tensor) – Right hand side of matrix multiplication. 
- available_memory_proportion (Optional[float]) – The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation. Defaults to 1.0. 
- output_type (Optional[dtypes.dtype], optional) – Output datatype to enforce. Defaults to the dtype of lhs/rhs. 
- partials_type (dtypes.dtype, optional) – The type to use for partial results (float16, float32). Defaults to dtypes.float32. 
 
- Returns
- The matrix product of lhs and rhs.
- Return type
 
- popxl.ops.matmul_pow2scaled(lhs, rhs, log2_scale, available_memory_proportion=None)
- Perform a scaled matrix multiplication between two tensors. - Compute a matrix multiplication between lhs and rhs, then multiply the result by pow2(log2_scale). - The matrix multiply arguments must have either popxl.float8_143 or popxl.float8_152 dtype. The log2_scale argument must be of type popxl.int32 (the dtype checked in the Raises section) and be in the range [-32, 32). - Follows NumPy matrix multiplication rules for N-D tensors, see numpy.matmul(). - Parameters
- lhs (Tensor) – Left hand side of matrix multiplication. 
- rhs (Tensor) – Right hand side of matrix multiplication. 
- log2_scale (Tensor) – Integer power-of-two exponent, where the matrix multiplication output is multiplied by pow2(log2_scale). 
- available_memory_proportion (Optional[float]) – The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation. Defaults to 1.0. 
 
- Raises
- TypeError – If the matrix multiply operand tensors do not have a dtype in {popxl.float8_143, popxl.float8_152}, or if the log2_scale tensor does not have dtype popxl.int32.
- ValueError – If log2_scale is not a scalar tensor.
 
- Return type
 
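The arithmetic described above, a matrix product followed by multiplication with pow2(log2_scale), can be sketched for 2-D nested lists. Plain Python floats stand in for the float8 operands, `matmul_pow2scaled` below is a local illustrative function rather than the PopXL op, and raising `ValueError` for an out-of-range scale is an assumption of this sketch.

```python
def matmul_pow2scaled(lhs, rhs, log2_scale):
    """Compute (lhs @ rhs) * 2**log2_scale for 2-D nested lists."""
    if not -32 <= log2_scale < 32:
        raise ValueError("log2_scale must be in the range [-32, 32)")
    scale = 2.0 ** log2_scale
    columns = list(zip(*rhs))  # transpose rhs so columns can be iterated
    return [
        [scale * sum(a * b for a, b in zip(row, col)) for col in columns]
        for row in lhs
    ]


lhs = [[1, 2], [3, 4]]
rhs = [[5, 6], [7, 8]]
print(matmul_pow2scaled(lhs, rhs, log2_scale=-1))
# [[9.5, 11.0], [21.5, 25.0]]
```

A negative `log2_scale` shrinks the result (here by a factor of two), which is the usual way to keep float8 products within the range of the output type.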
- popxl.ops.max(t, axis=None, keepdims=False)
- Compute the maximum value of the elements in a tensor along specified axes. - See also PyTorch Tensor.max, ONNX Max. - Parameters
- Returns
- The reduced tensor containing the maximum of elements computed along the specified axes. 
- Return type
 
- popxl.ops.max_pool(t, kernel_size, stride=None, padding=None, out_pads=None, dilation=None, in_dilations=None, auto_pad='not_set', ceil_mode=False, storage_order='row')
- Max pool a tensor. - This consumes an input tensor t and applies max pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Max pooling consists of computing the max on all values of a subset of the input tensor according to the kernel size and down-sampling the data into the output tensor Y for further processing. - Parameters
- t (Tensor) – - Input data tensor from the previous layer. - If the input is a 3D tensor, the size is (N, C, L), where N is the batch size, C is the number of channels, L is the length; 
- If the input is a 2D image, the size is (N, C, H, W), where N is the batch size, C is the number of channels, H and W are the height and width; 
- If the input is a 3D image, the size is (N, C, D, H, W), where N is the batch size, C is the number of channels, D is the depth, H and W are the height and width. 
 
- kernel_size (Tuple[int]) – The size of the kernel along each axis. 
- stride (Tuple[int]) – Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis. 
- padding (Tuple[int]) – Padding for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and represents the number of pixels added at the beginning or end of the corresponding axis. The padding format should be [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i.
- out_pads (Tuple[int]) – The output padding for pooling. 
- dilation (Tuple[int]) – Dilation value along each spatial axis of the filter. 
- in_dilations (Tuple[int]) – The input dilations attributes along each spatial axis of the filter. 
- auto_pad (Literal) – auto_pad must be either “not_set”, “same_upper”, “same_lower” or “valid”. The default value is “not_set”, which means explicit padding is used. “same_upper” and “same_lower” pad the input so that output_shape[i] = ceil(input_shape[i] / strides[i]) for each axis i. The padding is split between the two sides equally or almost equally (depending on whether it is even or odd). If the padding is an odd number, the extra padding is added at the end for “same_upper” and at the beginning for “same_lower”.
- ceil_mode (bool) – When True, will use ceil instead of floor to compute the output shape. 
- storage_order (Literal['row', 'column']) – The storage order of the tensor. Default is row. 
 
- Returns
- Output data tensor from max pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes. Floor value of the dimension is used. 
- Return type
 
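The “same_upper” and “same_lower” rule for auto_pad can be worked through for a single spatial axis. The sketch below assumes a dilation of 1 and uses the ONNX convention for the total padding, (output - 1) * stride + kernel - input; `same_padding` is a hypothetical helper, not part of PopXL.

```python
import math


def same_padding(input_size, kernel_size, stride, auto_pad):
    """Return (pad_begin, pad_end) for one axis, assuming dilation 1."""
    output_size = math.ceil(input_size / stride)
    total = max((output_size - 1) * stride + kernel_size - input_size, 0)
    small, large = total // 2, total - total // 2
    if auto_pad == "same_upper":
        return small, large  # extra pixel (if any) goes at the end
    if auto_pad == "same_lower":
        return large, small  # extra pixel (if any) goes at the beginning
    raise ValueError(f"unsupported auto_pad: {auto_pad}")


print(same_padding(5, 2, 1, "same_upper"))  # (0, 1)
print(same_padding(5, 2, 1, "same_lower"))  # (1, 0)
print(same_padding(7, 3, 2, "same_upper"))  # (1, 1)
```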
- popxl.ops.maximum(*ts)
- Compute the elementwise maximum of N tensors. - Follows NumPy broadcasting rules. Arguments must have the same dtype. 
- popxl.ops.mean(t, axis=None, keepdims=False)
- Compute the arithmetic mean of elements in a tensor along the specified axes. - See also PyTorch Tensor.mean, NumPy mean, ONNX Mean. - Parameters
- t (Tensor) – Tensor to compute the mean of elements. 
- axis (int or list) – Axis or axes to compute the mean along. If none is provided all axes will be reduced. If axis is negative it indexes from the last to the first axis. 
- keepdims (bool) – Keep the axis that is being reduced (True) or not (False).
 
- Returns
- The reduced tensor containing the arithmetic means computed along the specified axes. 
- Return type
 
- popxl.ops.median(t, axis=None, keepdims=False)
- Compute the median of elements in a tensor along axes. - See also PyTorch Tensor.median, NumPy median. - Parameters
- Returns
- The reduced tensor. 
- Return type
 
- popxl.ops.min(t, axis=None, keepdims=False)
- Compute the minimum of the elements of a tensor along axes. - See also PyTorch Tensor.min, ONNX Min. - Parameters
- Returns
- The reduced tensor containing the minimum of the elements along the axes. 
- Return type
 
- popxl.ops.mul(lhs, rhs)
- Multiply two tensors elementwise. - Follows NumPy broadcasting rules. Arguments must have the same dtype. - See also PyTorch Tensor.mul, ONNX Mul. 
- popxl.ops.negate(t)
- Perform elementwise negation (two’s complement) of a tensor. 
- popxl.ops.nll_loss(probs, labels, ignore_index=None, reduction='mean', log_prob=False)
- Compute the negative log likelihood loss. - Compute the negative log likelihood loss l where probs = softmax(x). The returned loss will be reduced by reduction (default mean) across items in labels. Any item in labels equal to ignore_index will not contribute to l or dl/dx. - See also PyTorch nll_loss, ONNX NegativeLogLikelihoodLoss. - Parameters
- probs (Tensor) – The probabilities. Expected to be the output of softmax().
- labels (Tensor) – The labels. Target values for the probabilities. 
- ignore_index (Optional[int], optional) – Specify label values that should not contribute to the loss. 
- reduction (str) – Specify how to reduce the loss. Defaults to mean. Options: mean, sum and none.
- log_prob (bool) – If true, the input probabilities are expected to already be log probabilities. 
 
- Returns
- The calculated negative log likelihood loss. 
- Return type
 
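The reduction and ignore_index behaviour can be illustrated with a plain Python sketch. Several points are assumptions rather than documented facts: `nll_loss` below is a local stand-in, ignored items contribute a loss of 0, the mean is taken over non-ignored items only (the PyTorch convention), and `log_prob=True` is read as "the inputs are already log probabilities".

```python
import math


def nll_loss(probs, labels, ignore_index=None, reduction="mean", log_prob=False):
    """Negative log likelihood of the labelled class for each row of probs."""
    losses, kept = [], 0
    for row, label in zip(probs, labels):
        if label == ignore_index:
            losses.append(0.0)  # ignored items do not contribute
            continue
        kept += 1
        p = row[label]
        losses.append(-p if log_prob else -math.log(p))
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / kept if kept else 0.0
    raise ValueError(f"unknown reduction: {reduction}")


probs = [[0.5, 0.5], [0.25, 0.75]]
labels = [0, 1]
loss = nll_loss(probs, labels)
print(math.isclose(loss, (math.log(2.0) - math.log(0.75)) / 2))  # True
```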
- popxl.ops.nll_loss_with_softmax_grad(probs, labels, loss_grad=1, ignore_index=None, reduction='mean')
- Compute the negative log likelihood loss. - Compute the negative log likelihood loss l and return the gradient dE/dx where probs = softmax(x). loss_grad should be the gradient dE/dl, where E is the error from which back propagation is initialised. Typically, E = l, therefore in order to return dl/dx the loss_grad should be dl/dl, which is 1. - Parameters
- probs (Tensor) – The probabilities. Expected to be the output of softmax().
- labels (Tensor) – The labels. Target values for the probabilities. 
- loss_grad (Tensor) – The gradient, dE/dl. Supports float32 dtypes with float16 probs.
- reduction (ReductionType) – Specify how to reduce the loss. Defaults to mean. Options: mean, sum and none.
- ignore_index (Optional[int]) – Specify label values that should not contribute to l or dE/dx. Defaults to None.
 
- Returns
- A tuple of the loss and the gradient: (l, dE/dx).
- Return type
 
- popxl.ops.onehot(t, num_classes, values, axis)
- Produce a one-hot tensor based on inputs. - See also ONNX OneHot. - Parameters
- Returns
- Output tensor. 
- Return type
 
- popxl.ops.pow(t, e)
- Raise the elements of t to the power of e. - If e is TensorLike, then t[i] will be raised to the power of e[i]. If e is a float or int, all elements will be raised to the power of e. Follows NumPy broadcasting rules.
- popxl.ops.pow2scale_cast_from_fp8(t, log2_scale, data_type)
- Add a fused operation cast(X, dtype) * pow2(log2_scale) to cast from floating point 8 type. - See the PopXL documentation on floating point 8 types for more details. - Parameters
- Raises
- TypeError – If data_type is not of type float16 or float32.
- Returns
- The converted float16 or float32 tensor. 
- Return type
 
- popxl.ops.pow2scale_cast_to_fp8(t, log2_scale, data_type)
- Add a fused operation cast(src * pow2(log2_scale), dtype) to cast to floating point 8 data type. - See the PopXL documentation on floating point 8 types for more details. - Parameters
- Raises
- TypeError – If data_type is not of type float8_143 or float8_152.
- Returns
- The converted float8 tensor. 
- Return type
 
- popxl.ops.print_tensor(t, title=None, print_self=True, print_gradient=False, summarise_threshold=1000, edge_items=3, max_line_width=75, digits=8, float_format='auto', separator=' ', open_bracket='[', close_bracket=']')
- Print a tensor. - The output tensor of this op must be consumed if you want to print the gradient tensor. If the output is not consumed this op does not get pruned when running removeIsolatedTensors. - The default output format will split large lines, print all elements in the same format, pad elements so that they align and summarise large tensors. - Parameters
- t (Tensor) – The tensor to print. 
- title (str, optional) – Title to print. Defaults to None. 
- print_self (bool, optional) – Print the tensor itself. Defaults to - True.
- print_gradient (bool, optional) – Indicates if the associated gradient tensor of t is also printed ( - True) or not (- False). Defaults to False.
- summarise_threshold (int) – default 1000. If the number of elements of the tensor exceeds this threshold the output will be summarised. Only the edge elements will be displayed with an ellipsis indicating skipped elements. A value of 0 will disable summarisation. 
- edge_items (int) – default 3. Number of edge elements to include at the beginning and end when summarisation is enabled. 
- max_line_width (int) – default 75. lines longer than this limit will be split across multiple lines. A value of 0 will disable line splitting. 
- digits (int) – default 8. Number of digits to display. For integers this limit can be exceeded if any number is large enough. For floating points this does not include the exponent. The number of digits is used in conjunction analysis of the tensor to determine the width of each element to align all elements when printed. A value of 0 disables this analysis and each elements will be printed in an unaligned format. 
- float_format (str) – default ‘auto’. Determines the floating point format to use. Options: ‘auto’, ‘fixed’, ‘scientific’ and ‘none’. ‘auto’ mode determines the appropriate format based on the data. ‘fixed’ uses fixed point format e.g. - -100.00. ‘scientific’ uses scientific notation e.g.- -1.123e+10. ‘none’ does not take care to display numbers in the same format. If- digits==0this option is disregarded and the- float_formatis set to ‘none’
- separator (str) – default ‘,’. Character used to delininate values. 
- open_bracket (str) – default ‘[’. character used to open a tensor. 
- close_bracket (str) – default ‘]’. Character used to close a tensor. 
 
- Raises
- ValueError – if separator, open_bracket or close_bracket are not a single character. 
- KeyError – if float_format is not one of the amiable options (see parameter docstring above) 
 
- Returns
- The input tensor, unchanged. 
- Return type
 
- popxl.ops.prod(t, axis=None, keepdims=False)
- Compute the product of elements along an axis. - See also PyTorch Tensor.prod, NumPy prod. - Parameters
- t (Tensor) – Tensor to compute product of. 
- axis (int or list) – Axis or axes to compute product along. If none is provided, all axes will be reduced. If the axis is negative, the product is computed from the last to the first axis. 
- keepdims (bool) – Keep the axis that is being reduced (`True`) or not (`False`). 
 
- Returns
- The reduced tensor. 
- Return type
 
- popxl.ops.random_normal(seed_tensor, shape, mean=0.0, std=1.0, dtype=popxl.dtypes.float32)
- Randomly sample from a normal distribution. The mean and standard deviation of the distribution are specified by `mean` and `std` respectively. Note: not compatible with IPU Model.
- Parameters
- seed_tensor (Tensor) – A tensor used to seed the probability distribution. Must have data type uint32 and shape (2,). 
- shape (Tuple[int, ...]) – The shape of the output tensor. 
- mean (float, optional) – Mean of the distribution. Defaults to 0.0. 
- std (float, optional) – Standard deviation of the distribution. Defaults to 1.0. 
- dtype (dtypes.dtype, optional) – Data type of output tensor. Defaults to dtypes.float32. 
 
- Returns
- A new tensor with elements sampled from a normal distribution. 
- Return type
 
- popxl.ops.random_uniform(seed_tensor, shape, low=0.0, high=1.0, dtype=popxl.dtypes.float32)
- Randomly sample from a uniform distribution. This operation will sample uniformly from a range with minimum value `low` and maximum value `high`. Note: not compatible with IPU Model.
- Parameters
- seed_tensor (Tensor) – A tensor used to seed the probability distribution. Must have data type uint32 and shape (2,). 
- shape (Tuple[int, ...]) – The shape of the output tensor. 
- low (float, optional) – Minimum value. Defaults to 0.0. 
- high (float, optional) – Maximum value. Defaults to 1.0. 
- dtype (dtypes.dtype, optional) – Data type of output tensor. Defaults to dtypes.float32. 
 
- Returns
- A new tensor with element values sampled from a uniform distribution. 
- Return type
 
- popxl.ops.relu(t)
- Compute the ReLU activation of a tensor. - For more details, refer to Rectifier (neural networks). - See also ONNX Relu. 
- popxl.ops.relu_(t)
- Compute the ReLU activation of a tensor in place. - For more details, refer to Rectifier (neural networks). 
- popxl.ops.remote_load(remote_buffer, offset, name=None)
- Load a tensor from Streaming Memory. This operation loads a tensor from the remote buffer residing in Streaming Memory. The tensor will be loaded from the memory location corresponding to `remote_buffer_id` (specified in `remote_buffer`). The value of `offset` must be >= 0. The relationship between `offset` and `remote_buffer_id` is described in `remote_store()`. Note: There is no data dependency in the graph between remote store and remote load. Thus, the remote load operator may end up before the remote store operator in the serialized graph. One way to avoid this is by using `with popxl.in_sequence(True)`.
- Parameters
- remote_buffer (RemoteBuffer) – The handle to the remote buffer. 
- offset (Union[int, Tensor]) – Integer or rank-0 tensor indicating which entry in the remote buffer to load from. 
- name (str) – Name to use for the returned tensor. 
 
- Returns
- A new tensor loaded from the remote buffer. 
- Return type
 
- popxl.ops.remote_load_(remote_buffer, offset, t)
- Load from Streaming Memory into a specified tensor. This operation loads from the remote buffer in Streaming Memory into an existing tensor. This op is identical to `remote_load`, except that the data loaded from the remote buffer will be written to the tensor `t`. Note: There is no data dependency (in the graph) between remote store and remote load. Thus, the remote load operator may end up before the remote store operator in the serialized graph. One way to avoid this is by using `with popxl.in_sequence(True)`.
- Parameters
- remote_buffer (RemoteBuffer) – The handle to the remote buffer. 
- offset (Union[int, Tensor]) – Integer or rank-0 tensor indicating which entry in the remote buffer to load from. 
- t (Tensor) – The tensor that the loaded data will be written to. 
 
- Returns
- The tensor loaded from the remote buffer 
- Return type
 
- popxl.ops.remote_store(remote_buffer, offset, t)
- Store a tensor in Streaming Memory. This operation stores the input tensor in the remote buffer residing in Streaming Memory. This op is typically used to store different, identically-shaped tensors to the same remote buffer by specifying the offset. Instances of the op with matching `remote_buffer_id` (specified in `remote_buffer`) will outline together, meaning that if different tensors are to be stored under the same remote buffer ID, a different `offset` value has to be supplied for each tensor. `remote_buffer` handles the relationship between `remote_buffer_id`, `shape` and `dtype` because `shape` and `dtype` need to be fixed for each `remote_buffer_id`. The value of `offset` must be >= 0. If `t` is of rank `x`, the remote buffer with `remote_buffer_id` will be of rank `x+1`, where the new dimension (the row) will be of size `entries`. Note: There is no data dependency (in the graph) between remote store and remote load. Thus, the remote load operator may end up before the remote store operator in the serialized graph. One way to avoid this is by using `with popxl.in_sequence(True)`.
- Parameters
- remote_buffer (RemoteBuffer) – The handle to the remote buffer. 
- offset (Union[int, Tensor]) – Integer or rank-0 tensor indicating which entry in the remote buffer to store to. 
- t (Tensor) – Tensor to copy and store in the remote buffer. 
 
- Return type
- None 
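- The relationship between `offset`, the tensor shape and `entries` can be pictured with a toy model in plain Python. This is a sketch only: `RemoteBufferSketch` is a hypothetical class, not `popxl.RemoteBuffer`, and it models 1-D tensors held in host lists rather than Streaming Memory.

```python
class RemoteBufferSketch:
    """Toy model of a remote buffer for 1-D tensors (hypothetical, not a PopXL class)."""

    def __init__(self, tensor_len, entries):
        self.tensor_len = tensor_len
        # A rank-1 tensor shape gives a rank-2 buffer: one row per entry.
        self.rows = [[0] * tensor_len for _ in range(entries)]

    def _check_offset(self, offset):
        if not 0 <= offset < len(self.rows):
            raise ValueError("offset must satisfy 0 <= offset < entries")

    def store(self, offset, values):
        self._check_offset(offset)
        if len(values) != self.tensor_len:
            raise ValueError("every stored tensor must have the buffer's fixed shape")
        self.rows[offset] = list(values)

    def load(self, offset):
        self._check_offset(offset)
        return list(self.rows[offset])


buffer = RemoteBufferSketch(tensor_len=3, entries=2)
buffer.store(0, [1, 2, 3])  # different tensors use different offsets
buffer.store(1, [4, 5, 6])
loaded = buffer.load(1)
```

In this model, storing a second tensor at an offset that is already in use simply overwrites the first, which is why a distinct offset per tensor is needed.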
 
- popxl.ops.rename(t, output_name=None)
- Input is equal to the output. This can also be used to rename a Tensor. 
- popxl.ops.repeat(graph, repeat_count, *inputs, inputs_dict=None)
- Repeatedly call a graph. This operation repeatedly executes a graph `repeat_count` times. The input tensors are provided as graph inputs for the first iteration. The `inputs` and `inputs_dict` tensors are passed as graph inputs. You can specify an input either positionally using `inputs`, or via a tensor map using `inputs_dict`. Graph inputs are determined when the graph is created using `create_graph(callable, ...)`. The order of inputs will be the same as the order of the tensor inputs in the function signature and the order of called `popxl.graph_inputs`. See `create_graph()` for more information. Between each execution of the subgraph, the N outputs of the subgraph will be copied to the first N inputs. These are called loop carried inputs. The number of outputs must be less than or equal to the number of inputs. The remaining inputs will be unchanged throughout the loop iterations (unless modified in place).
- Example:

```python
# popxl.Module to repeat
class AddWeight(popxl.Module):
    def __init__(self):
        self.w: popxl.Tensor = None

    def build(self, x):
        self.w = popxl.graph_input(x.shape, x.dtype, "w")
        return self.w + x, self.w


with g:  # a graph
    add_weight0 = AddWeight()
    add_weight_graph0 = ir.create_graph(add_weight0, x0)

    # repeat 8 times
    y0, w0 = ops.repeat(add_weight_graph0, 8, x0, inputs_dict={add_weight0.w: w0})
```

- See also PyTorch Tensor.repeat, NumPy repeat.
- Parameters
- graph (Graph) – User defined graph to repeat `repeat_count` times.
- repeat_count (int) – Number of times to repeat calling the graph. 
- *inputs (Tensor, List[Tensor], int, float) – Provide inputs via position. 
- inputs_dict (Optional[Mapping[Tensor, Tensor]]) – Provide inputs via a tensor map. Mapping of `graph tensor -> parent tensor`.
- check_inputs (bool = True) – If true, then check when called that all inputs have been provided. 
 
- Return type
- Raises
- ValueError – If `repeat_count < 0`.
- ValueError – If the number of subgraph inputs < the number of subgraph outputs. 
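- The loop-carried behaviour described above can be mimicked in plain Python. This is a sketch under stated assumptions: `repeat_sketch` and `add_weight` are hypothetical helpers, tensors are replaced by floats, and the return value for zero iterations (an empty tuple) is a choice made for this sketch rather than documented behaviour.

```python
def repeat_sketch(body, repeat_count, *inputs):
    """Plain-Python model of loop-carried inputs (hypothetical, not the PopXL op)."""
    if repeat_count < 0:
        raise ValueError("repeat_count must be >= 0")
    state = list(inputs)
    outputs = ()
    for _ in range(repeat_count):
        outputs = tuple(body(*state))
        if len(outputs) > len(state):
            raise ValueError("the body cannot have more outputs than inputs")
        # Copy the N outputs over the first N inputs; the rest stay unchanged.
        state[: len(outputs)] = outputs
    return outputs


def add_weight(x, w):
    # Mirrors the AddWeight module: returns (w + x, w).
    return w + x, w


y, w = repeat_sketch(add_weight, 8, 1.0, 0.5)
```

Starting from `x = 1.0` and `w = 0.5`, each iteration feeds the previous sum back in as `x`, so after 8 iterations `y` is `1.0 + 8 * 0.5`.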
 
- popxl.ops.repeat_with_info(graph, repeat_count, *inputs, inputs_dict=None, check_inputs=True)
- Repeatedly call a graph and return information about the call site. This operation repeatedly executes a graph `repeat_count` times. The input tensors are provided as graph inputs for the first iteration. Returns `CallSiteInfo` that can be used to inspect callsite inputs/outputs. The `inputs` and `inputs_dict` tensors are passed as graph inputs. You can specify an input either positionally using `inputs` or via a tensor map using `inputs_dict`. Graph inputs are determined when the graph is created using `ir.create_graph(callable, ...)`. The order of inputs will be the same as the order of the tensor inputs in the function signature and the order of called `popxl.graph_inputs`. See `create_graph()` for more information. Between each execution of the subgraph, the N outputs of the subgraph will be copied to the first N inputs. These are called loop carried inputs. The number of outputs must be less than or equal to the number of inputs.
- Implementation detail: In order to maintain the input / output indices of the subgraph, we must call the user provided subgraph, and create a “middle” subgraph to repeat the user provided subgraph inside.
- [Diagram: in the parent graph, a LoopOp runs a wrapper subgraph whose inputs are the keep-going flag, the iterator, the loop carried inputs and the implicit inputs. Inside the wrapper subgraph, a CallOp calls the user provided loop subgraph. The wrapper subgraph outputs the keep-going flag and the loop carried outputs.]
- Example:

```python
# popxl.Module to repeat
class AddWeight(popxl.Module):
    def __init__(self):
        self.w: popxl.Tensor = None

    def build(self, x):
        self.w = popxl.graph_input(x.shape, x.dtype, "w")
        return self.w + x, self.w


with g:  # a graph
    add_weight0 = AddWeight()
    add_weight_graph0 = ir.create_graph(add_weight0, x0)

    # repeat 8 times
    call_info = ops.repeat_with_info(
        add_weight_graph0, 8, x0, inputs_dict={add_weight0.w: w0}
    )
    y0, w0 = call_info.outputs
```

- Parameters
- graph (Graph) – User defined graph to repeat `repeat_count` times.
- repeat_count (int) – Number of times to repeat calling the graph. 
- *inputs (Tensor, List[Tensor], int, float) – Provide inputs via position. 
- inputs_dict (Optional[Mapping[Tensor, Tensor]]) – Provide inputs via a tensor map. Mapping of `graph tensor -> parent tensor`.
- check_inputs (bool) – Check when called if all inputs have been provided. Defaults to True. 
 
- Raises
- ValueError – If `repeat_count < 0`.
- ValueError – If the number of explicitly passed inputs + the number of loop created inputs != the number of outputs. 
 
- Returns
- Information on the created callsite for the repeat op. 
- Return type
 
- popxl.ops.reshape(t, shape)
- Reshape a tensor. - See also PyTorch Tensor.reshape, NumPy reshape, ONNX Reshape. - Parameters
- Raises
- ValueError – A ValueError will be raised if an invalid value is encountered in the shape, or if more than one -1 is given in the shape. 
- Returns
- The reshaped tensor. 
- Return type
 
- popxl.ops.reshape_(t, shape)
- Reshape a tensor (in-place). This is the in-place version of `reshape()`.
- Parameters
- Raises
- ValueError – A ValueError will be raised if an invalid value is encountered in the shape, or if more than one -1 is given in the shape. 
- Returns
- An alias of the input tensor, reshaped. 
- Return type
 
- popxl.ops.roi_align(t, rois, batch_index, output_size, spatial_scale, sampling_ratio)
- Apply pooling across each region of interest. This consumes an input tensor `t` and regions of interest (ROIs) to apply pooling across each ROI. Only supports average pooling. Max pooling is not supported.
- Parameters
- t (Tensor) – Input data tensor from the previous operator; 4-D feature map of shape (`N`, `C`, `H`, `W`), where `N` is the batch size, `C` is the number of channels, and `H` and `W` are the height and the width of the data.
- rois (Tensor) – ROIs to pool over. `rois` is a 2-D input of shape (`numRois`, 4) given as [[x1, y1, x2, y2], …], where `numRois` is the number of ROIs. The ROI coordinates are in the coordinate system of the input image. Each coordinate set has a 1:1 correspondence with the `batch_index` input.
- batch_index (Tensor) – 1-D tensor of shape [`numRois`,] with each element denoting the index of the corresponding image in the batch.
- output_size (Tuple[int]) – Pooled output height and width. 
- spatial_scale (float) – Multiplicative spatial scale factor to translate ROI coordinates from their input spatial scale to the scale used when pooling; that is, the spatial scale of the input feature map `t` relative to the input image.
- sampling_ratio (int) – Number of sampling points in the interpolation grid used to compute the output value of each pooled output bin. 
 
- Returns
- ROI pooled output Y, a 4-D tensor of shape (`numRois`, `channels`, `aligned_height`, `aligned_width`) where `aligned_height` is the output height and `aligned_width` is the output width. The r-th batch element `Y[r-1]` is a pooled feature map corresponding to the r-th ROI `t[r-1]`.
- Return type
 
- popxl.ops.scaled_add(X, Y, a=1.0, b=1.0)
- Perform a scaled addition of two tensors. Compute the sum of `X` scaled by `a` and `Y` scaled by `b`, which means `aX + bY`. Does not apply NumPy broadcasting. Uses mixed precision poplibs operations. `X` and `Y` must be the same shape, but can be different types. `a` and `b` must be scalars.
- Parameters
- Returns
- A tensor containing `aX + bY`.
- Return type
 
- popxl.ops.scaled_add_(X, Y, a=1.0, b=1.0)
- Perform a scaled addition of two tensors (in-place). Compute the sum of `X` scaled by `a` and `Y` scaled by `b`. This is performed in place on `X`, which means that `X = aX + bY`. Does not apply NumPy broadcasting. Uses mixed precision poplibs operations. `X` and `Y` must be the same shape, but can be different types.
- Parameters
- Returns
- The `X` tensor containing `aX + bY`.
- Return type
 
- popxl.ops.scatter(t, indices, values, axis=0, available_memory_proportion=None)
- Update the values of multiple elements in a tensor. The elements specified by `indices` are updated with the values in `values`. `scatter` requires the three input tensors to be of the same rank `r >= 1`. The optional attribute `axis` identifies the axis of the tensor along which the update will be performed. By default, the outer-most axis, axis 0, is used. The output of the operation is produced by creating a copy of the input tensor, `t`, and then updating its elements to the values specified by `values` at the index positions specified by `indices`. The output shape is the same as the shape of the input tensor.
- For each entry in `values`, the target index in `t` is obtained by combining the corresponding entry in `indices` with the index of the entry itself: the index-value for dimension = axis is obtained from the value of the corresponding entry in `indices` and the index-value for dimension != axis is obtained from the index of the entry itself.
- Pseudo-code example:

```python
x1 = scatter(x, [1, 2, 3], [-1, -2, -3])

x2 = x.copy()
x2[1] = -1
x2[2] = -2
x2[3] = -3

x1 == x2
```

- See also PyTorch Tensor.scatter.
- Parameters
- t (Tensor) – The input tensor. 
- indices (Tensor) – The indices of the elements to update. 
- values (Tensor) – The values to update the tensor with. 
- axis (int) – Which axis to set on. Default is 0. 
- available_memory_proportion (Optional[float]) – The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation. Defaults to 1.0 if not set globally. 
 
- Returns
- The tensor with updated values. 
- Return type
 
- popxl.ops.scatter_reduce(data, indices, reduction, initial_values=None, axis=0, axis_size=None, enable_index_broadcast=True, available_memory_proportion=None)
- popxl.ops.shaped_dropout(t, seed_tensor, shape, ratio)
- Add a shaped dropout operation to the input tensor. Applies a shaped dropout to the input tensor `t`. This operator requires a `shape` parameter that is used to define the shape of the dropout mask so that strongly correlated features in the input tensor `t` can be preserved. The `shape` parameter must be broadcastable to the input tensor `t`. The dropout mask is created using samples from a Bernoulli distribution seeded with a seed tensor `seed_tensor`.
- Parameters
- t (Tensor) – The Tensor to apply the shaped dropout operation to. 
- seed_tensor (Tensor) – The Tensor used to seed the probability distribution which generates the dropout mask. Must have data type uint32 and shape [2,]. 
- shape (Iterable[int]) – The shape of the dropout mask. This must be broadcastable to the input tensor. 
- ratio (float) – The probability of dropping an input feature. Default = 0.5. 
 
- Returns
- A new tensor with the shaped dropout applied. 
- Return type
 
- popxl.ops.sign(t)
- Return the sign of each element in the Tensor (-1, 0 or 1). NaN values have a sign of 0. 
- popxl.ops.sin(t)
- Compute the sine of each element of the input tensor. - See also PyTorch Tensor.sin. 
- popxl.ops.slice(t, start=None, stop=None, step=None, axis=None)
- Select elements from a tensor using a slice or multiple slices. A slice specifies the start (inclusive) and stop (exclusive) index of elements to select. Multiple slices can be specified using a list of items for each parameter (`start`, `stop`, `step`). If `step` is `-1`, the slice is performed backwards. If `axis` is not specified, each slice will correspond to dimensions 0 to `N` where `N` is the number of slices.
- Examples:

```python
t == slice(t) == slice(t, axis=1)
slice(t, start=1)  # Slice axis 0 from start index 1
slice(t, start=[1, 2]) == slice(t, start=[1, 2], axis=[0, 1])
slice(t, stop=-2)  # Slice axis 0 up to second last element (exclusive)
slice(t, stop=3, step=-1)  # Slice backwards from last element (inclusive) to third last element (exclusive)
```

- See also ONNX Slice.
- Parameters
- t (Tensor) – Tensor to slice 
- start (Optional[Union[int, List[Optional[int]]]]) – Index of first element (inclusive) or `None`, which defaults to 0.
- stop (Optional[Union[int, List[Optional[int]]]]) – Index of last element (exclusive) or `None`, which defaults to last element (inclusive) if step is forward or first element (inclusive) if step is backwards.
- step (Optional[Union[int, List[Optional[int]]]]) – `1` for forward or `-1` for backwards.
- axis (Optional[Union[int, List[int]]]) – Axis of tensor to slice on or `None`, which will default to each axis sequentially.
 
- Returns
- A tensor containing the selected slices. 
- Return type
 
- popxl.ops.slice_(t, start=None, stop=None, step=None, axis=None)
- Select elements from a tensor, in place, using a slice or multiple slices. This is the in-place version of `slice()`. The functionality is the same, but the tensor is sliced in place. A slice specifies the start (inclusive) and stop (exclusive) index of elements to select. Multiple slices can be specified using a list of items for each parameter (`start`, `stop`, `step`). If `step` is `-1`, the slice is performed backwards. If `axis` is not specified, each slice will correspond to dimensions 0 to `N` where `N` is the number of slices.
- Parameters
- t (Tensor) – Tensor to slice 
- start (Optional[Union[int, List[Optional[int]]]]) – Index of first element (inclusive) or `None`, which defaults to 0.
- stop (Optional[Union[int, List[Optional[int]]]]) – Index of last element (exclusive) or `None`, which defaults to last element (inclusive) if step is forward or first element (inclusive) if step is backwards.
- step (Optional[Union[int, List[Optional[int]]]]) – `1` for forward or `-1` for backwards.
- axis (Optional[Union[int, List[int]]]) – Axis of tensor to slice on or `None`, which will default to each axis sequentially.
 
- Returns
- An alias of the input tensor containing the selected slices. 
- Return type
 
- popxl.ops.softmax(t, axis)
- Normalize the elements of a tensor along specified axes. This rescales the slices of `axis` such that all elements are within the range [0, 1] and sum to 1. The output shape and dtype match the input. See also ONNX Softmax.
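- The rescaling described above can be checked on a single 1-D slice in plain Python. This is a minimal sketch: `softmax_sketch` is a hypothetical helper, and subtracting the maximum before exponentiating is a common numerical-stability step assumed here, not a statement about how the PopXL op is implemented.

```python
import math


def softmax_sketch(row):
    """Softmax over one 1-D slice (hypothetical helper, not the PopXL op)."""
    # Shifting by the maximum leaves the result unchanged but avoids overflow.
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax_sketch([1.0, 2.0, 3.0])
```

Every element of `probs` lies in [0, 1], the elements sum to 1 (up to floating point rounding), and larger inputs map to larger outputs.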
- popxl.ops.split(t, splits, axis=0)
- Split a tensor along an axis into a list of tensors. - See also PyTorch Tensor.split, NumPy split, ONNX Split. - Parameters
- Raises
- ValueError – If the split doesn’t equally divide the tensor. 
- Returns
- A list of tensors. 
- Return type
- List[Tensor] 
 
- popxl.ops.split_random_seed(seed, n=2)
- Produce `n` random seeds from an initial seed. Chaining calls to `split_random_seed` can be used to ensure unique random behaviour across a program. For example:

```python
seed, s1 = ops.split_random_seed(seed)
y = ops.dropout(x, s1)
seed, s2 = ops.split_random_seed(seed)
z = ops.dropout(y, s2)
```
- popxl.ops.sqrt(t)
- Compute the square root of the elements of a tensor. If `t` is negative, then this will return NaN.
- popxl.ops.squeeze(t, axes=None)
- Remove axes of length one from the tensor. Takes an input `axes` with a list of axes to squeeze. If `axes` is not provided, all the single dimensions will be removed from the shape. If an axis is selected with shape entry not equal to one, an error is raised. Implemented using `reshape` under the hood. See also PyTorch Tensor.squeeze, NumPy squeeze, ONNX Squeeze.
- Parameters
- Raises
- ValueError – A ValueError is raised if the axes contain duplicates, or if an axis cannot be squeezed. 
- Returns
- The squeezed tensor. 
- Return type
 
- popxl.ops.sub(lhs, rhs)
- Subtract two tensors elementwise. - Follows NumPy broadcasting rules. Arguments must have the same dtype. - See also PyTorch Tensor.sub, ONNX Sub. 
- popxl.ops.subsample(t, strides)
- Subsamples a tensor by selecting every n’th element from each dimension. The subsample count N is provided for each dimension. - Parameters
- Returns
- A subsampled output tensor. 
- Return type
- Raises
- ValueError – Thrown if the length of the strides list is larger than the rank of the input tensor. 
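- For a 2-D input, selecting every n'th element along each dimension corresponds to strided slicing. The sketch below uses nested Python lists; `subsample_2d` is a hypothetical helper, and it assumes that sampling starts at index 0 of each dimension and that exactly one stride is given per dimension.

```python
def subsample_2d(rows, strides):
    """Keep every strides[0]-th row and every strides[1]-th column (hypothetical helper)."""
    if len(strides) != 2:
        raise ValueError("this sketch expects one stride per dimension of a 2-D input")
    row_step, col_step = strides
    return [row[::col_step] for row in rows[::row_step]]


grid = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
subsampled = subsample_2d(grid, [2, 2])
```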
 
- popxl.ops.sum(t, axis=None, keepdims=False)
- Sum elements over an axis. - See also PyTorch Tensor.sum, NumPy sum, ONNX Sum. - Parameters
- Returns
- The reduced tensor. 
- Return type
 
- popxl.ops.sumsquare(t, axis=None, keepdims=False)
- Compute the sum of the squares of tensor elements over an axis. - Parameters
- t (Tensor) – Tensor to compute the sum of squares from. 
- axis (int or list) – Axis or axes over which to compute the sum of squares. If none is provided all axes will be reduced. If axis is negative it counts from the last to the first axis. 
- keepdims (bool) – Keep the axis that is being reduced or not. 
 
- Returns
- The reduced tensor. 
- Return type
 
- popxl.ops.swish(t)
- Compute the Swish activation of a tensor. - For more details, refer to Rectifier (neural networks). 
- popxl.ops.swish_(t)
- Compute the Swish activation of a tensor in place. - For more details, refer to Rectifier (neural networks). 
- popxl.ops.tanh(t)
- Compute the hyperbolic tangent function elementwise on a tensor. - See also PyTorch Tensor.tanh, NumPy tanh, ONNX Tanh. 
- popxl.ops.tied_gather(t, indices, axis=0, available_memory_proportion=None, zero_OOR=False)
- Select multiple elements from an array. Elements are specified by `indices`, along a specified axis. Equivalent to `numpy.take()`. Note that this is different from `torch.gather()`. Numerically the same as the `gather` op but does not specify the tile layout of the `indices` tensor. When preceding a `matmul` op the tile layout of the indices is determined by the `matmul`, not the `tied_gather`. This has a lower memory footprint but costs extra cycles due to the exchange.
- Examples:

```python
x = popxl.variable(np.arange(16).reshape(4, 4))
# [[ 0,  1,  2,  3],
#  [ 4,  5,  6,  7],
#  [ 8,  9, 10, 11],
#  [12, 13, 14, 15]]

gather(x, [3, 1, 2]) == Tensor([x[3], x[1], x[2]])
# [[12, 13, 14, 15],
#  [ 4,  5,  6,  7],
#  [ 8,  9, 10, 11]]

gather(x, [[0, 1], [1, 2]]) == gather(x, [0, 1, 1, 2]).reshape(2, 2, 4)
# [[[ 0,  1,  2,  3],
#   [ 4,  5,  6,  7]],
#  [[ 4,  5,  6,  7],
#   [ 8,  9, 10, 11]]]
```

- Parameters
- t (Tensor) – The input tensor. 
- indices (Tensor) – The indices of the elements to extract. 
- axis (int) – The axis to gather on. The default is 0. 
- available_memory_proportion (Optional[float]) – The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation. Defaults to 1.0 if not set globally. 
- zero_OOR (bool) – If False, out of range (OOR) indices will produce garbage data. If True, OOR indices will produce zeros. 
 
- Returns
- The gathered elements concatenated. 
- Return type
 
- popxl.ops.topk(t, k, axis, largest, sorted, available_memory_proportion=None)
- Retrieve the top-K largest or smallest elements along a specified axis. - See also PyTorch torch.topk, ONNX TopK. - Parameters
- t (Tensor) – Input tensor. 
- k (int) – The number of top elements to retrieve 
- axis (int) – Dimension on which to do the sort. 
- largest (bool) – Whether to return the top-K largest or smallest elements. 
- sorted (bool) – Whether to return the elements in sorted order. 
- available_memory_proportion (Optional[float]) – The maximum proportion of available memory on each tile that this layer should consume temporarily during the course of the operation. This value is used by the grad operator only, so it has no effect on inference. Defaults to 1.0 if not set globally. 
 
- Returns
- A tuple of output values and indices. 
- Return type
 
- popxl.ops.transpose(t, permutation=None)
- Permute the axes of a tensor. By default this operation reverses the axes of `t`. See also PyTorch Tensor.transpose, NumPy transpose, ONNX Transpose.
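- The effect of `permutation` on the output shape can be sketched in plain Python. `transposed_shape` is a hypothetical helper, and it assumes the NumPy convention that output axis `i` takes the size of input axis `permutation[i]`.

```python
def transposed_shape(shape, permutation=None):
    """Shape after permuting axes (hypothetical helper, NumPy-style convention assumed)."""
    rank = len(shape)
    if permutation is None:
        # Default: reverse the axes.
        permutation = tuple(reversed(range(rank)))
    if sorted(permutation) != list(range(rank)):
        raise ValueError("permutation must contain each axis exactly once")
    return tuple(shape[axis] for axis in permutation)


reversed_shape = transposed_shape((2, 3, 4))
swapped_shape = transposed_shape((2, 3, 4), (0, 2, 1))
```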
- popxl.ops.transpose_(t, permutation=None)
- Permute the axes of a tensor in place. By default this operation reverses the axes of `t`. This is the in-place version of `transpose()`. The behaviour is the same, but it modifies the tensor in place. See also PyTorch Tensor.transpose_.
- popxl.ops.where(condition, lhs, rhs)
- Elementwise selection based on satisfying a condition. Choose elements from `lhs` or `rhs` depending on whether the corresponding element in `condition` is satisfied or not. The operator supports multi-directional broadcasting (NumPy style). See also PyTorch Tensor.where, NumPy where, ONNX Where.
- Parameters
- Returns
- The tensor containing elementwise `lhs if condition else rhs`.
- Return type
 
- class popxl.ops.collectives.CommGroup
- Class to specify sub-groups of replicas. - Examples of derived sub-groups: - IPU-link domain sub-rack: - where N is power of two and replicaGroupSize > 1. - Complete IPU-link domain / full rack: 
- Using GW-links only: 
 - __init__(*args, **kwargs)
- Overloaded function. - __init__(self: popart_internal_ir.CommGroup) -> None 
- __init__(self: popart_internal_ir.CommGroup, type: popart_internal_ir.CommGroupType, replicaGroupSize: int) -> None 
- __init__(self: popart_internal_ir.CommGroup, grouping: popart_internal_ir.ReplicaGrouping) -> None 
 
 - property replicaGroupSize
- Replica group size. 
 - toReplicaGrouping(self: popart_internal_ir.CommGroup, numReplicas: int) popart_internal_ir.ReplicaGrouping
 - property type
- Replica group type. 
 
- class popxl.ops.collectives.CommGroupType
- PopART equivalent of GCL CommGroupType. Each of these enumeration constants has a corresponding GCL CommGroupType value. Members:
- All : All replicas viewed as one group, replica group size is ignored.
- Consecutive : Groups are consecutive in replica. If there are N replicas denoted {0, … N-1} and group size is k, then there are N/k groups of size k: {0, 1, … k-1}, {k, … 2k-1} … {N-k, … N-1}
- Orthogonal : Groups are sliced orthogonal to the replica ordering. If there are N replicas denoted {0, … N-1} and group size is k, then there are m = N/k groups of size k: {0, m, 2m, …}, {1, m+1, 2m+1, …} … {m-1, 2m-1, … N-1}
- Ungrouped : Each replica is in its own group, replica group size is ignored.
 - All = <CommGroupType.All: 0>
 - Consecutive = <CommGroupType.Consecutive: 1>
 - Orthogonal = <CommGroupType.Orthogonal: 2>
 - Ungrouped = <CommGroupType.Ungrouped: 3>
 - __init__(self: popart_internal_ir.CommGroupType, value: int) None
 - property name
 - property value
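- The Consecutive and Orthogonal groupings defined above can be enumerated in a few lines of plain Python. This is a sketch: `consecutive_groups` and `orthogonal_groups` are hypothetical helpers, not part of PopXL, and they assume the number of replicas is a multiple of the group size.

```python
def consecutive_groups(num_replicas, group_size):
    """N/k groups of k neighbouring replicas (hypothetical helper)."""
    if num_replicas % group_size != 0:
        raise ValueError("num_replicas must be a multiple of group_size")
    return [
        list(range(start, start + group_size))
        for start in range(0, num_replicas, group_size)
    ]


def orthogonal_groups(num_replicas, group_size):
    """m = N/k groups, each taking every m-th replica (hypothetical helper)."""
    if num_replicas % group_size != 0:
        raise ValueError("num_replicas must be a multiple of group_size")
    num_groups = num_replicas // group_size
    return [
        list(range(first, num_replicas, num_groups))
        for first in range(num_groups)
    ]


consecutive = consecutive_groups(8, 2)
orthogonal = orthogonal_groups(8, 2)
```

With 8 replicas and a group size of 2, the consecutive grouping pairs neighbours (0 with 1, 2 with 3, and so on), while the orthogonal grouping pairs replicas that are m = 4 apart (0 with 4, 1 with 5, and so on).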
 
- popxl.ops.collectives.all_reduce(ts, ipus=None, op='add')
- Allreduce tensors across IPUs within a replica. Currently only the `add` reduce op is supported by autodiff.
- Parameters
- Returns
- Output Tensors. The data of each tensor is identical on the IPUs corresponding to `ipus`.
- Return type
- List[Tensor] 
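- The result of an `add` all-reduce can be pictured with plain Python lists, one per IPU. This is a sketch of the arithmetic only: `all_reduce_add_sketch` is a hypothetical helper and says nothing about how data is exchanged between IPUs.

```python
def all_reduce_add_sketch(per_ipu_tensors):
    """Sum equally-sized 1-D tensors elementwise and give every IPU a copy (hypothetical)."""
    reduced = [sum(column) for column in zip(*per_ipu_tensors)]
    # Every participant receives an identical copy of the reduced tensor.
    return [list(reduced) for _ in per_ipu_tensors]


outputs = all_reduce_add_sketch([[1, 2], [10, 20], [100, 200]])
```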
 
- popxl.ops.collectives.all_reduce_identical_grad_inputs(ts, ipus=None, op='add')
- Allreduce tensors across IPUs within a replica where the grad tensors of the corresponding grad op are identical. This means that this op is an all-reduce and the corresponding grad op is an identity. Currently only the `add` reduce op is supported by autodiff. The `AllReduceToIdentityPattern` pattern must be run for this op to function correctly.
- Parameters
- Returns
- Output Tensors. The data of each tensor is identical on the IPUs corresponding to `ipus`.
- Return type
- List[Tensor] 
 
- popxl.ops.collectives.all_reduce_identical_inputs(ts, ipus=None, op='add')
- Allreduce tensors across IPUs within a replica where the input tensors are identical. This means the op is an identity but the corresponding grad op is an allreduce. Currently only the `add` reduce op is supported by autodiff. The `AllReduceToIdentityPattern` pattern must be run for this op to function correctly.
- Parameters
- Returns
- Output Tensors. The data of each tensor is identical on the IPUs corresponding to `ipus`.
- Return type
- List[Tensor] 
 
- popxl.ops.collectives.replica_sharded_slice(t, group=None)
- Take the replicated tensor sharded slice of a Tensor. 
- popxl.ops.collectives.replicated_all_gather(t, axis=0, group=None, output_shape='auto')
- Gather a tensor across replicas such that the output tensor contains the values of the tensor from each replica. - The shape of the output tensor is determined by the value of output_shape:
- new_axis: the output shape is (group.size, *t.shape)
- concat: the output shape has the same behavior as concat on axis
- meta_shape: the output shape is t.meta_shape
- auto: if the input has a meta-shape, meta_shape is chosen, otherwise concat
 - This op is auto-differentiable and its corresponding grad op is a replicated_slice (except when output_shape==meta_shape). - Parameters
- t (Tensor) – Tensor to be gathered. 
- axis (int) – Axis to gather and concatenate values when using ‘concat’ mode 
- group (Optional[ReplicaGrouping]) – Replicas to gather from. Defaults to All replicas. 
- output_shape (str) – see above for details. Choose ‘new_axis’, ‘concat’, ‘meta_shape’ or ‘auto’. 
 
- Returns
- Gathered tensor. 
- Return type
- Raises
- ValueError – if output_shape is not one of ‘new_axis’, ‘concat’, ‘meta_shape’ or ‘auto’.
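As a rough illustration of how the output_shape modes affect the result, the sketch below computes the gathered shape for the ‘new_axis’ and ‘concat’ modes only. The function `gathered_shape` is a hypothetical helper, not a PopXL call, and it assumes every replica in the group contributes a tensor of the same shape.

```python
def gathered_shape(shape, group_size, output_shape="concat", axis=0):
    """Shape of the gathered tensor for the 'new_axis' and 'concat' modes."""
    shape = tuple(shape)
    if output_shape == "new_axis":
        # One leading entry per replica in the group.
        return (group_size, *shape)
    if output_shape == "concat":
        # Concatenating group_size equal-shaped tensors along `axis`.
        out = list(shape)
        out[axis] *= group_size
        return tuple(out)
    # 'meta_shape' and 'auto' depend on t.meta_shape and are not modelled here.
    raise ValueError("this sketch only handles 'new_axis' and 'concat'")


print(gathered_shape((2, 4), 4, "new_axis"))        # (4, 2, 4)
print(gathered_shape((2, 4), 4, "concat", axis=0))  # (8, 4)
print(gathered_shape((2, 4), 4, "concat", axis=1))  # (2, 16)
```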
 
- popxl.ops.collectives.replicated_all_reduce(t, op='add', group=None)
- Reduce a tensor across replicas. - Parameters
- t (Tensor) – Tensor to be reduced 
- op (str, optional) – Operation to reduce with. Defaults to ‘add’. Options: ‘add’, ‘mean’, ‘mul’, ‘min’, ‘max’, ‘and’, ‘or’, ‘square_add’. 
- group (Optional[ReplicaGrouping]) – Replicas to reduce across. Defaults to All replicas. 
 
- Returns
- Reduced tensor 
- Return type
 
- popxl.ops.collectives.replicated_all_reduce_(t, op='add', group=None)
- Reduce tensor t across replicas in place on t. - Parameters
- t (Tensor) – Tensor to be reduced 
- op (str, optional) – Operation to reduce with. Defaults to ‘add’. Options: ‘add’, ‘mean’, ‘mul’, ‘min’, ‘max’, ‘and’, ‘or’, ‘square_add’. 
- group (Optional[ReplicaGrouping]) – Replicas to reduce across. Defaults to All replicas. 
 
- Returns
- Reduced tensor 
- Return type
 
- popxl.ops.collectives.replicated_reduce_scatter(t, op='add', group=None, configure_output_for_replicated_tensor_sharding=False)
- Reduce a tensor across replicas with each replica receiving a unique slice of the tensor. - Parameters
- t (Tensor) – Tensor to be reduced. Inputs will be flattened. 
- op (str, optional) – Operation to reduce with. Defaults to ‘add’. Options: ‘add’, ‘mean’, ‘mul’, ‘min’, ‘max’, ‘and’, ‘or’, ‘square_add’. 
- group (Optional[CommGroup]) – Replicas to reduce across. Defaults to All replicas. 
- configure_output_for_replicated_tensor_sharding (Optional[bool]) – Configures the output to be a replica sharded tensor. Defaults to False. Replica sharded tensors do not follow the data element order of the original tensor, and can only be used in operations that belong to the same replicated tensor sharding group, where all tensor inputs follow the same data order. 
 
- Returns
- A slice of the reduced tensor. Always a 1D tensor. 
- Return type
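The reduce-scatter behaviour described above (flatten, reduce across replicas, hand each replica a 1D slice) can be emulated in plain Python. This is only a sketch for the ‘add’ op: the names `flatten` and `reduce_scatter_add` are hypothetical, and it assumes that the flattened length divides evenly by the number of replicas and that replica r receives the r-th contiguous slice (that is, the output is not configured for replicated tensor sharding).

```python
def flatten(values):
    """Flatten arbitrarily nested lists into a flat list of numbers."""
    if not isinstance(values, list):
        return [values]
    flat = []
    for item in values:
        flat.extend(flatten(item))
    return flat


def reduce_scatter_add(per_replica_tensors):
    """Return one 1D slice of the element-wise sum per replica."""
    num_replicas = len(per_replica_tensors)
    flat = [flatten(t) for t in per_replica_tensors]
    length = len(flat[0])
    if length % num_replicas != 0:
        # Assumption of this sketch; padding behaviour is not modelled.
        raise ValueError("flattened size must divide by the number of replicas")
    reduced = [sum(column) for column in zip(*flat)]
    chunk = length // num_replicas
    return [reduced[r * chunk:(r + 1) * chunk] for r in range(num_replicas)]


replica0 = [[1, 2], [3, 4]]
replica1 = [[10, 20], [30, 40]]
print(reduce_scatter_add([replica0, replica1]))  # [[11, 22], [33, 44]]
```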
 
- popxl.ops.collectives.replicated_slice(t, axis=0, group=None)
- Each replica takes an equal slice of t, split along axis axis. For example, if t has shape (2,4), there are two replicas and axis==0: the first replica will output [0:1, ...] and the second replica [1:2, ...]. - This op is similar to replica_sharded_slice but differs in that it maintains the output shape and does not configure the output for replicated tensor sharding. - This op is auto-differentiable and its corresponding grad op is a replicated_all_gather. - Parameters
- t (Tensor) – Tensor to split 
- axis (int) – Axis to slice along 
- group (Optional[ReplicaGrouping]) – Replica grouping that determines group of replicas 
 
- Returns
- A slice of the tensor. 
- Return type
- Raises
- ValueError – if the group size does not equally divide the axis size 
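The slicing rule can be emulated with nested lists for the axis==0 case. This sketch uses a hypothetical helper, `replicated_slice_rows`, which returns the slice each replica would hold; it mirrors the documented ValueError when the group size does not divide the axis size.

```python
def replicated_slice_rows(t, num_replicas):
    """Split a list of rows (axis 0) into one equal slice per replica."""
    rows = len(t)
    if rows % num_replicas != 0:
        raise ValueError("the group size does not equally divide the axis size")
    step = rows // num_replicas
    return [t[r * step:(r + 1) * step] for r in range(num_replicas)]


# t has shape (2, 4) and there are two replicas, as in the example above.
t = [[0, 1, 2, 3], [4, 5, 6, 7]]
print(replicated_slice_rows(t, 2))  # [[[0, 1, 2, 3]], [[4, 5, 6, 7]]]
```

Each per-replica slice keeps the rank of the input (shape (1, 4) here), in contrast to the flattened output of a reduce-scatter.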
 
- popxl.ops.var_updates.accumulate_(t, X, f=None)
- Update (in-place) tensor t given updater values X and a factor f according to t = t + (f * X). - Does not apply NumPy broadcasting. Uses mixed precision PopLibs operations. t and X must have the same shape, but can be different types. f must be scalar.
- popxl.ops.var_updates.accumulate_mean_(t, X, step)
- Update (in-place) tensor t given updater values X and a step count step according to t = (step/(step+1)) * t + (1/(step+1)) * X. - Intended to be used to keep track of the mean of a series of values. - For example:

    import popxl
    from popxl.ops.var_updates import accumulate_mean_

    ir = popxl.Ir()
    g = ir.main_graph  # graph in which the ops are created
    with g:
        accum = popxl.variable(0, dtype=popxl.float32)
        a = popxl.variable(1, dtype=popxl.float32)
        b = popxl.variable(2, dtype=popxl.float32)
        accumulate_mean_(accum, a, 0.0)  # step 0: accum = a
        accumulate_mean_(accum, b, 1.0)  # step 1: accum = (a + b) / 2

- will result in accum having the value (a+b)/2 = 1.5. - Does not apply NumPy broadcasting. Uses mixed precision PopLibs operations. t and X must have the same shape, but can be different types. step must be scalar.
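The arithmetic of the running-mean formula can be checked without an IPU using plain Python floats. The function `accumulate_mean` below is a hypothetical scalar stand-in for the op, not the PopXL call itself.

```python
def accumulate_mean(t, x, step):
    """Scalar form of t = (step/(step+1)) * t + (1/(step+1)) * x."""
    return (step / (step + 1)) * t + (1 / (step + 1)) * x


accum = 0.0
for step, value in enumerate([1.0, 2.0]):
    # step counts how many values have already been folded into accum.
    accum = accumulate_mean(accum, value, float(step))

print(accum)  # 1.5
```

At step 0 the previous value of the accumulator is multiplied by zero, so its initial contents do not affect the result.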
- popxl.ops.var_updates.accumulate_moving_average_(t, X, f)
- Update (in-place) tensor t given updater values X and a factor f according to t = (f * t) + ((1-f) * X). - Does not apply NumPy broadcasting. Uses mixed precision PopLibs operations. t and X must have the same shape, but can be different types. f must be scalar.
- popxl.ops.var_updates.accumulate_moving_average_square_(t, X, f)
- Update (in-place) tensor t given updater values X and a factor f according to t = (f * t) + ((1-f) * X^2). - Does not apply NumPy broadcasting. Uses mixed precision PopLibs operations. t and X must have the same shape, but can be different types. f must be scalar.
- popxl.ops.var_updates.accumulate_square_(t, X, f=1.0)
- Update (in-place) tensor t given updater values X and a factor f according to t = t + (f * X^2). - Does not apply NumPy broadcasting. Uses mixed precision PopLibs operations. t and X must have the same shape, but can be different types. f must be scalar.
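The four accumulate formulas above differ only in how the old value and the update are weighted. The scalar sketch below restates them side by side; the function names are hypothetical stand-ins that return the new value instead of updating in place, and f is always passed explicitly for accumulate because the behaviour of f=None is not described here.

```python
def accumulate(t, x, f):
    """t + f * x"""
    return t + f * x


def accumulate_square(t, x, f=1.0):
    """t + f * x^2"""
    return t + f * x ** 2


def accumulate_moving_average(t, x, f):
    """f * t + (1 - f) * x"""
    return f * t + (1 - f) * x


def accumulate_moving_average_square(t, x, f):
    """f * t + (1 - f) * x^2"""
    return f * t + (1 - f) * x ** 2


print(accumulate(1.0, 2.0, 0.5))                        # 2.0
print(accumulate_square(1.0, 3.0))                      # 10.0
print(accumulate_moving_average(4.0, 2.0, 0.5))         # 3.0
print(accumulate_moving_average_square(4.0, 2.0, 0.5))  # 4.0
```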
- popxl.ops.var_updates.accumulator_scale_(t, f)
- Scale a tensor in-place. - This op will directly zero the input tensor if the factor is const and 0. - Does not apply NumPy broadcasting. Uses mixed precision PopLibs operations. - Parameters
- Returns
- An alias to the updated tensor. 
- Return type
 
- popxl.ops.var_updates.accumulator_zero_(t)
- Zero the input tensor. - This is an AccumulatorScaleOp with a factor of 0, and this zeroes the input tensor. 
- popxl.ops.var_updates.adam_updater(acc_first_order, acc_second_order, weight=None, time_step=None, weight_decay=None, beta1=None, beta2=None, epsilon=1e-07)
- Calculate an updater term to update the weights for Adam.
- Accumulated bias corrected first order momentum (FP16/FP32) mc: mc = m / (1 - b1 ** t) (without correction: mc = m)
- Accumulated bias corrected second order momentum (FP16/FP32) vc: vc = v / (1 - b2 ** t) (without correction: vc = v)
- Updater term (FP16/FP32, with weight decay mode: decay > 0.0 and wd > 0.0) x: x = mc / (sqrt(vc) + eps) + wd * w
- Updater term (FP16/FP32, without weight decay mode: decay) x: x = mc / (sqrt(vc) + eps)
- Note: time_step will be incremented by 1. - Parameters
- acc_first_order (Tensor) – First order momentum (FP16/FP32) (m).
- acc_second_order (Tensor) – Second order momentum (FP16/FP32) (v).
- weight (Optional[Tensor]) – Weight (w). Only required for weight_decay. Defaults to None.
- time_step (Optional[Tensor]) – Time step (t). Providing this tensor enables bias correction. Defaults to None.
- weight_decay (Optional[Union[float, Tensor]]) – Optional scalar to apply weight decay. Defaults to None.
- beta1 (Optional[Union[float, Tensor]]) – Only required in bias correction for m. Defaults to None.
- beta2 (Optional[Union[float, Tensor]]) – Only required in bias correction for v. Defaults to None.
- epsilon (Union[float, Tensor]) – Scalar to calculate updater. Defaults to 1e-07.
 
- Raises
- ValueError – If weight_decay is set and weight is None.
- ValueError – If time_step is set to None and beta1 and beta2 are not set (no bias correction can take place).
 
- Returns
- An updater to update the weight for Adam. 
- Return type
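The updater formulas for Adam can be written out for scalars as follows. This is a sketch, not the op itself: `adam_updater_term` is a hypothetical name, `t` is the time-step value as it appears in the formula (so it must be at least 1 when bias correction is used), and requiring both betas whenever `t` is given is an assumption of this sketch.

```python
import math


def adam_updater_term(m, v, w=None, t=None, wd=None, b1=None, b2=None, eps=1e-07):
    """Scalar x = mc / (sqrt(vc) + eps) [+ wd * w]."""
    if wd is not None and w is None:
        raise ValueError("weight is required when weight decay is set")
    if t is not None:
        if b1 is None or b2 is None:
            # Sketch assumption: both betas are needed for bias correction.
            raise ValueError("beta1 and beta2 are required for bias correction")
        mc = m / (1 - b1 ** t)
        vc = v / (1 - b2 ** t)
    else:
        mc, vc = m, v  # no bias correction
    x = mc / (math.sqrt(vc) + eps)
    if wd is not None:
        x += wd * w
    return x


# Without bias correction or weight decay: roughly 0.1 / 0.2.
print(adam_updater_term(m=0.1, v=0.04))
# With bias correction at t=1 and weight decay.
print(adam_updater_term(m=0.1, v=0.001, w=2.0, t=1, wd=0.01, b1=0.9, b2=0.999))
```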
 
- popxl.ops.var_updates.adam_var_update(t, x, r1, r2, learning_rate=None, max_weight_norm=None)
- Calculate the updated weight tensor for Adam or LAMB. - x = updater term (see adam_updater())
- lr = learning rate
- max_weight_norm = max weight norm (c.f. \(\phi\) or scaling function in Lamb paper)
- r1 = (Lamb) L2 norm of the weight (w)
- r2 = (Lamb) L2 norm of the updater term (x)
 - Lamb r1 (FP32): \(r1 = \|w\|_2\) (without Lamb or \(\phi(r1) = 0\): \(r1/r2 = 1\)) - Special case: replicated weight sharding; every replica only stores a shard of w, therefore the sum-of-squares is computed on each replica, and thereafter all-reduced before every replica takes the square root of r1sq.
 - Lamb r2 (FP32): \(r2 = \|x\|_2\) (without Lamb or \(r2 = 0\): \(r1/r2 = 1\)) - Special case: replicated weight sharding; every replica only stores a shard of x, therefore the sum-of-squares is computed on each replica, and thereafter all-reduced before every replica takes the square root of r2sq.
 - Scale factor: \(\phi(r1) = \min(r1, \text{max\_weight\_norm})\)
 - Variable update: \(w \leftarrow w - (\phi(r1) / r2) \cdot lr \cdot x\) where \(\phi(r1) / r2\) is the Lamb trust ratio. - Parameters
- t (Tensor) – The weight to update. 
- x (Tensor) – The updater term. 
- r1 (Tensor) – The r1 squared input tensor.
- r2 (Tensor) – The r2 squared input tensor.
- learning_rate (Optional[Union[float, Tensor]]) – Optional learning rate tensor to use. Will be constant if this argument is a float or None. Defaults to None. 
- max_weight_norm (Optional[Union[float, Tensor]]) – Optional max weight tensor to use. Will be constant if this argument is a float or None. Defaults to None. 
 
- Returns
- The updated weight tensor. 
- Return type
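The Lamb trust-ratio update can be traced on small Python lists. This sketch computes the norms directly from w and x instead of taking pre-computed r1 and r2 inputs, and the name `lamb_step` is hypothetical. The fallback to a ratio of 1 when \(\phi(r1)\) or r2 is zero follows the special cases listed above.

```python
import math


def lamb_step(w, x, lr, max_weight_norm):
    """Return w - (phi(r1) / r2) * lr * x for list-valued w and x."""
    r1 = math.sqrt(sum(wi * wi for wi in w))  # L2 norm of the weight
    r2 = math.sqrt(sum(xi * xi for xi in x))  # L2 norm of the updater term
    phi = min(r1, max_weight_norm)            # scale factor
    # Trust ratio defaults to 1 when either norm term is zero.
    trust = 1.0 if phi == 0 or r2 == 0 else phi / r2
    return [wi - trust * lr * xi for wi, xi in zip(w, x)]


# r1 = 5 and r2 is about 1, so the trust ratio is about 5.
print(lamb_step([3.0, 4.0], [0.6, 0.8], lr=0.1, max_weight_norm=10.0))
# Clipping phi(r1) at 1 reduces the trust ratio to about 1.
print(lamb_step([3.0, 4.0], [0.6, 0.8], lr=0.1, max_weight_norm=1.0))
```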
 
- popxl.ops.var_updates.adamax_updater(acc_first_order, acc_second_order, weight=None, time_step=None, weight_decay=None, beta1=0.9, epsilon=1e-07)
- Calculate an updater term to update the weights for Adamax.
- Accumulated bias corrected first order momentum (FP16/FP32) mc: mc = m / (1 - b1 ** t)
- Updater term (FP16/FP32, with weight decay mode: decay > 0.0 and wd > 0.0) x: x = mc / (vc + eps) + wd * w
- Updater term (FP16/FP32, without weight decay mode: decay) x: x = mc / (vc + eps)
- Note: time_step will be incremented by 1. - Parameters
- acc_first_order (Tensor) – First order momentum (FP16/FP32) ( - m).
- acc_second_order (Tensor) – Second order momentum (FP16/FP32) ( - v).
- weight (Optional[Tensor]) – Weight ( - w). Only required for- weight_decay.
- time_step (Tensor) – Time step ( - t).
- weight_decay (Optional[Union[float, Tensor]]) – Optional scalar to apply weight decay. Defaults to None. 
- beta1 (Union[float, Tensor]) – Scalar to do bias correction for m. Defaults to 0.9.
- epsilon (Union[float, Tensor]) – Scalar to calculate updater. Defaults to 1e-07. 
 
- Raises
- ValueError – If weight_decay is set and weight is None.
- ValueError – If time_step is None.
 
- Returns
- An updater to update the weight for Adamax. 
- Return type
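For comparison with Adam, the Adamax updater has no square root in the denominator. The scalar sketch below uses a hypothetical name, `adamax_updater_term`, and assumes that vc in the formula is the second order accumulator v used as is, since no bias correction for it is listed above.

```python
def adamax_updater_term(m, v, t, w=None, wd=None, b1=0.9, eps=1e-07):
    """Scalar x = mc / (v + eps) [+ wd * w], with mc = m / (1 - b1 ** t)."""
    if t is None:
        raise ValueError("time_step is required")
    if wd is not None and w is None:
        raise ValueError("weight is required when weight decay is set")
    mc = m / (1 - b1 ** t)  # bias corrected first order momentum
    x = mc / (v + eps)      # assumption: vc is taken to be v
    if wd is not None:
        x += wd * w
    return x


# mc is about 1.0 at t=1, so x is about 1.0 / 0.5.
print(adamax_updater_term(m=0.1, v=0.5, t=1))
```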
 
- popxl.ops.var_updates.copy_var_update_(t, X)
- Update a tensor in-place by copying the tensor containing the updater values. 
- popxl.ops.var_updates.lamb_updater(acc_first_order, acc_second_order, weight=None, time_step=None, weight_decay=None, beta1=None, beta2=None, epsilon=1e-07)
- Calculate an updater term to update the weights for LAMB.
- Accumulated bias corrected first order momentum (FP16/FP32) mc: mc = m / (1 - b1 ** t) (without correction: mc = m)
- Accumulated bias corrected second order momentum (FP16/FP32) vc: vc = v / (1 - b2 ** t) (without correction: vc = v)
- Updater term (FP16/FP32, with weight decay mode: decay > 0.0 and wd > 0.0) x: x = mc / (sqrt(vc) + eps) + wd * w
- Updater term (FP16/FP32, without weight decay mode: decay) x: x = mc / (sqrt(vc) + eps)
- Note: time_step will be incremented by 1. - Parameters
- acc_first_order (Tensor) – First order momentum (FP16/FP32) ( - m).
- acc_second_order (Tensor) – Second order momentum (FP16/FP32) ( - v).
- weight (Optional[Tensor], optional) – Weight ( - w). Only required for- weight_decay. Defaults to None.
- time_step (Optional[Tensor], optional) – Time step ( - t). Providing this tensor enables bias correction. Defaults to None.
- weight_decay (Optional[Union[float, Tensor]], optional) – Optional scalar to apply weight decay. Defaults to None. 
- beta1 (Optional[Union[float, Tensor]], optional) – Only required in bias correction for - m. Defaults to None.
- beta2 (Optional[Union[float, Tensor]], optional) – Only required in bias correction for - v. Defaults to None.
- epsilon (Union[float, Tensor], optional) – Scalar to calculate updater. Defaults to 1e-07. 
 
- Raises
- ValueError – If weight_decay is set and weight is None.
- ValueError – If time_step is set to None and beta1 and beta2 are not set (no bias correction can take place).
 
- Returns
- An updater to update the weight for LAMB. 
- Return type
 
- popxl.ops.var_updates.sparse_accumulate_(t, X, indices, axis=0, f=None, W=None)
- Apply a sparse accumulate operation to a tensor. - Does not apply NumPy broadcasting. Uses mixed precision PopLibs operations. t and X must have the same shape, but can be different types. - Detail: - Assume you have:

    w -> Gather -> x

- and when the optimiser step is grown:

    dW <- GatherGrad <- x
     \
      Accumulate -> accum'
     /
    accum

- GatherGrad is essentially a scatter operation. Then we Accumulate the resultant dW on accum. This involves creating an extra dW tensor, so we can do the following instead:

                x
                |
                V
    accum -> SparseAccumulate -> accum'

- SparseAccumulate can accumulate the slices of x into accum as required, in one operation, without requiring extra memory.
- When calling this op, the input tensor W is an optional input. This can be used when two different views of the weight are consumed in the forward pass, and one of those ops is a Gather, thus requiring a SparseAccumulate in the weight update step.
- We connect the op to the other view of the weight instead of the view this SparseAccumulate is for. Then, the lowering will clone that tensor (and its layout) when creating accum. - Parameters
- t (Tensor) – Tensor to be updated 
- X (Tensor) – Value to update the tensor with. 
- indices (Tensor) – The indices of the scatter operation. 
- axis (int, optional) – Which axis to set on. Default is 0. 
- f (Optional[Union[float, Tensor]], optional) – Optional scalar to apply to update before the addition. Defaults to None. 
- W (Optional[Tensor], optional) – Tile mapping reference tensor for t to be cloned from.
 
- Returns
- An alias to the updated tensor. 
- Return type
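The equivalence between the two-step path (scatter into dW, then accumulate) and the single sparse accumulate can be checked with nested lists. Both functions below are hypothetical emulations for axis=0, not PopXL calls; for illustration the sketch assumes x holds one row per entry in indices and that repeated indices add up.

```python
def dense_accumulate(accum, x, indices, f=1.0):
    """Scatter x into a full-size dW, then compute accum + f * dW."""
    dW = [[0.0] * len(accum[0]) for _ in accum]  # the extra tensor
    for row, idx in zip(x, indices):
        for j, value in enumerate(row):
            dW[idx][j] += value
    return [[a + f * d for a, d in zip(arow, drow)]
            for arow, drow in zip(accum, dW)]


def sparse_accumulate(accum, x, indices, f=1.0):
    """Add f * x[i] straight into row indices[i]; no dW is materialised."""
    out = [list(row) for row in accum]
    for row, idx in zip(x, indices):
        for j, value in enumerate(row):
            out[idx][j] += f * value
    return out


accum = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
indices = [0, 2, 0]

print(sparse_accumulate(accum, x, indices, f=0.5))
# [[4.0, 5.0], [1.0, 1.0], [2.5, 3.0]]
print(sparse_accumulate(accum, x, indices, f=0.5)
      == dense_accumulate(accum, x, indices, f=0.5))  # True
```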