6. Python API

6.1. `poprt` module

class poprt.Converter(*, input_shape=None, convert_version=11, precision='fp32', checkpoints=None, eightbitsio=False, fp16_skip_op_types=None, skip_passes=None, used_passes=[], check=False, disable_fast_norm=False, pack_args=None, fp8_skip_op_names=None, fp8_params='F143,F143,0,0', quantize=False, enable_insert_remap=False, enable_erf_gelu=False, serialize_matmul=None, serialize_matmul_add=None, merge_matmul=None, merge_matmul_add=None, merge_moe=None, remap_mode=['after_matmul'], max_tensor_size=-1, infer_shape_ahead=False, enable_avoid_overflow_patterns=False, disable_progress_bar=False, batch_size=None, batch_axis=None, remove_outputs=[], fold_periodic_initializer=False, enable_compress_pattern=False, merge_if_with_same_cond=False, logger=<Logger poprt (WARNING)>)

Convert a general ONNX model to an IPU-friendly ONNX model.

Parameters

input_shape (Dict[str, List[int]]) –
convert_version (int) –
precision (str) –
checkpoints (str) –
eightbitsio (bool) –
fp16_skip_op_types (str) –
skip_passes (str) –
used_passes (List[str]) –
check (bool) –
disable_fast_norm (bool) –
pack_args (Dict) –
fp8_skip_op_names (str) –
fp8_params (str) –
quantize (bool) –
enable_insert_remap (bool) –
enable_erf_gelu (bool) –
serialize_matmul (Dict[str, str]) –
serialize_matmul_add (Dict[str, str]) –
merge_matmul (str) –
merge_matmul_add (str) –
merge_moe (str) –
remap_mode (List[str]) –
max_tensor_size (int) –
infer_shape_ahead (bool) –
enable_avoid_overflow_patterns (bool) –
disable_progress_bar (bool) –
batch_size (int) –
batch_axis (int) –
remove_outputs (List[str]) –
fold_periodic_initializer (bool) –
enable_compress_pattern (bool) –
merge_if_with_same_cond (bool) –
logger (Logger) –

convert(model)

Convert a general ONNX model to an IPU-friendly ONNX model.

Parameters: model (ModelProto) – An ONNX ModelProto class object to be converted.
Returns: An ONNX ModelProto class object representing the ONNX model.
Return type: ModelProto

6.2. `poprt.compiler` module

class poprt.compiler.Compiler

Compile ONNX model to PopEF.

Return type: None

static compile(model: str, outputs: List[str], options: poprt._compiler.CompilerOptions = <poprt._compiler.CompilerOptions object at 0x7f0b941a7cf0>) → poprt::Executable

Parameters

model (Union[AnyStr, onnx.ModelProto]) –
outputs (List[str] | None) –
options (CompilerOptions) –

Return type

Executable

static compile_and_export(model: str, outputs: List[str], filename: str, options: poprt._compiler.CompilerOptions = <poprt._compiler.CompilerOptions object at 0x7f0bad4f37b0>) → None

Parameters

model (Union[AnyStr, onnx.ModelProto]) –
outputs (List[str] | None) –
filename (str | None) –
options (CompilerOptions) –

Return type

None

static compile_and_get_summary_report(model: str, outputs: List[str], options: poprt._compiler.CompilerOptions = <poprt._compiler.CompilerOptions object at 0x7f0b941a7cb0>, reset_profile: bool = True) → str

Parameters

model (Union[AnyStr, ModelProto]) –
outputs (List[str]) –
options (CompilerOptions) –
reset_profile (bool) –

Return type

str

class poprt.compiler.CompilerOptions

Return type: None

6.3. `poprt.runtime` module

class poprt.runtime.Runner(popef, config=None)

Load PopEF model, and execute.

Parameters

popef (Union[str, Executable]) – The input PopEF model.
config (Union[RuntimeConfig, PackRunnerConfig]) – The runtime config.

Return type

None

execute(input, output)

Execute the runner.

Parameters

input (Dict[str, ndarray]) –
output (Dict[str, ndarray]) –

Return type

None

class poprt.runtime.DeviceManager

Device manager.

Return type: None

get_device(num_ipus)

Get devices containing the required number of IPUs.

Parameters: num_ipus (int) – The number of IPUs.
Returns: The device.
Return type: Device

get_num_devices()

Get the number of devices.

Return type: int

get_specific_device(device_id)

Get specific devices.

Parameters: device_id (int) – The ID of the target device.
Returns: The device.
Return type: Device

ipu_hardware_version()

Get IPU version.

The possible outputs are:

ipu21: C600 cards

ipu2: M2000 or Bow IPUs

Return type: str

6.4. `poprt.frontend` module

class poprt.frontend.OnnxFrontend(path, **kwargs)

The ONNX frontend.

Parameters: path (str) – The path to the input ONNX model.
Return type: None

get_onnx_name(dir_or_name)

Filter out non-ONNX file.

Parameters

files – A list of filenames.
dir_or_name (str) –

Returns

The ONNX model if there are only one ONNX model, otherwise throw an error.

Return type

Optional[str]

load_model()

Load ONNX Model.

Parameters: dir_or_name – The directory containing the ONNX model or the name of the model. If a directory is specified, then there should only be one ONNX model in the directory.
Returns: The ONNX model.
Return type: ModelProto

class poprt.frontend.TensorflowFrontend(path, *, saved_model=True, signature_def='', tag='', opset=11, inputs_as_nchw=None, outputs_as_nchw=None, inputs=None, input_shape={}, outputs=None, **kwargs)

The TensorFlow frontend.

Parameters

path (str) – The path to the input model.
saved_model (bool) – If True, then this is a TensorFlow SavedModel, otherwise not.
signature_def (str) – The SignatureDef from the SavedModel to use.
tag (str) – The tag to use for the SavedModel.
opset (int) – The opset version to use for the ai.onnx domain in the TensorFlow frontend.
inputs_as_nchw (str) – Transpose inputs from nhwc to nchw.
outputs_as_nchw (str) – Transpose outputs from nhwc to nchw
output_names – The names of the output models (This parameter is optional if the model is a SavedModel.)
inputs (str) –
input_shape (dict) –
outputs (str) –

Return type

None

get_inputs_meta(graph)

Get the metadata of the input tensors of the TensorFlow graph.

Returns: None
Return type: None

get_outputs_meta(graph)

Get the metadata of the output tensors of the TensorFlow graph.

Returns: None
Return type: None

load_model()

Load TensorFlow model and convert to an ONNX ModelProto.

Return type: ModelProto

split_nodename_and_shape(name): input name with shape into name and shape.

6.5. `poprt.backends` module

class poprt.backends.Backend(path_or_bytes, *, export_popef=None, compiler_options=<poprt.compiler.CompilerOptions object>, runtime_options={"autoReset":false, "batchSizeTimeoutNS":9223372036854775807, "batchingDim":4294967295, "checkPackageHash":true, "dataParallelTimeoutNS":9223372036854775807, "deviceWaitConfig.sleepTimeSec":6, "deviceWaitConfig.timeoutSec":1, "flushOnWaitingOutputs":false, "isBatchSizeTimeoutEnabled":false, "requestTracepointsBufferSize":1000, "ringBufferSizeMultiplier":2, "runnerType":0, "threadSafe":true, "timeoutNS":10000000, "validateIOParams":true}, align_output_dtype=False, logger=None)

The PopRT backend.

Parameters

path_or_bytes (Union[AnyStr, IO[bytes], onnx.ModelProto]) – The input ONNX model.
export_popef (str) – The path for the exported PopEF model.
compiler_options (compiler.CompilerOptions) – Compiler options. See poprt.compiler.CompilerOptions.
runtime_options (runtime.AnyConfig) – Runtime options. See poprt.runtime.RuntimeConfig
align_output_dtype (bool) – If True, align output data type based on the ONNX model. Backend.run also has the parameter align_output_dtype. The output data type will be aligned if either of these two parameters is set to True.
logger (logging.Logger) – A custom logger.

Return type

None

get_io_info()

Get meta info of input and outputs, including data type, name and shape.

Return type: tuple[Dict[str, Any], Dict[str, Any]]

run(output_names, inputs, align_output_dtype=False)

Run the model.

Parameters

output_names (List[str]) – The names of the output tensors.
inputs (Dict[str, ndarray]) – The input tensor data.
align_output_dtype (bool) – If True, align output data type based on the ONNX model.

Return type

List[ndarray]

class poprt.backends.ORTBackend(path_or_bytes, sess_options=None, providers=None, provider_options=None, lazy_load=False, **kwargs)

Bases: Backend

Backend compatible with onnxruntime.InferenceSession.

Parameters

path_or_bytes – The input ONNX model.
sess_options – onnxruntime.InferenceSession compatible API, not used
providers – onnxruntime.InferenceSession compatible API. Not used.
provider_options – onnxruntime.InferenceSession compatible API. Not used.
lazy_load – If False, ORTBackend will load the ONNX model by default. Set to True to prevent this behaviour.
**kwargs – See poprt.Backend for more args.

Return type

None

run(output_names, input_feed, run_options=None)

Run the model.

Parameters

output_names – The names of the output tensors.
inputs – The input tensor data.
align_output_dtype – If True, align output data type based on the ONNX model.

Return type

List[ndarray]

6.6. `poprt.quantizer` module

poprt.quantizer.quantize(onnx_model, input_model, output_dir, data_preprocess=None, precision='fp8', quantize_loss_type='kld', num_of_layers_keep_fp16=0, options=None)

Quantize the model according to the specified strategy.

Only SimpleQuantizer is supported.

Parameters

onnx_model (ModelProto) – The ONNX ModelProto.
input_model (str) – The original model.
output_dir (str) – the output dir
data_preprocess (Optional[str]) – The path to the pickle format file for data preprocessing. The storage format is {input_name_1: ndarray_1, input_name_2: ndarray_2, …}.
precision (typing_extensions.Literal[fp8, fp8_weight]) – The precision to be used in the conversion.
quantize_loss_type (str) – choose the calibration method, default is kld.
num_of_layers_keep_fp16 (int) – set the layer whose loss is topk to fp16 in fp8 quantization.
options (Optional[Dict[str, Any]]) – Options for the conversion.

Returns

A quantized ONNX ModelProto.

Return type

ModelProto

class poprt.quantizer.FP8Quantizer(output_dir, loss_type, data_preprocess=None, precision='fp8', num_of_layers_keep_fp16=0, options=None)

Return the input model.

Parameters

output_dir (str) –
loss_type (str) –
data_preprocess (str) –
precision (typing_extensions.Literal[fp8, fp8_weight]) –
num_of_layers_keep_fp16 (int) –
options (Dict[str, Any]) –

6.7. `poprt.passes` module

class poprt.Pass(*args, **kwargs)

Abstract base class for passes.

A new pass could be as follows:

import onnx
from poprt.passes import register, Pass


@register('dummy_pass')
class Dummy(Pass):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)

    def run(self, onnx_model: onnx.ModelProto) -> onnx.ModelProto:
        print(f"producer_name: {onnx_model.producer_name}")
        return onnx_model

Return type: None

static against_passes(pass_names)

Register against property for a pass.

Passes can’t work with against passes.

Parameters: pass_names (List[str]) –
Return type: Callable[[Any], PassReg]

static constraint_passes(constraint_name, pass_names)

Register constraints for a pass.

Valid constraints are against, depend and before.

Parameters

constraint_name (str) –
pass_names (List[str]) –

Return type

Callable[[Any], PassReg]

static get_pass(name, *args, **kwargs)

Get a pass by its registered name.

Parameters: name (str) – The registered name of the pass.
Returns: An instance of Pass.
Return type: Pass

Example:

import poprt

# get a Pass with parameters
onnx_model = poprt.get_pass('float_to_half', skip_op_types=['Gelu'])(onnx_model)

poprt.get_pass('model_overview')(onnx_model)

static get_registered_passes()

Get all registered passes.

Return type: Dict[str, PassReg]

static get_typed_registered_passes(pass_type)

Get typed registered passes.

Parameters: pass_type (Any) –
Return type: Dict[str, Pass]

static property_register(k, v)

Register a property for a pass.

Parameters

k (str) –
v (Any) –

Return type

Callable[[Any], PassReg]

static register_pass(pass_name)

Register a pass.

Parameters: pass_name (str) –
Return type: Callable[[Any], PassReg]

run(onnx_model)

Run a pass.

Inherited subclasses should override this method.

Parameters: onnx_model (ModelProto) – The input ONNX model.
Returns: The optimized ONNC model.
Return type: ModelProto

traverse_graph(graph, transform, is_main_graph=True)

Traverse a and transform a GraphProto.

Parameters

graph (GraphProto) – The input graph.
transform (Callable[[GraphProto, bool], GraphProto]) – The transform function.
is_main_graph (bool) –

Return type

GraphProto

class poprt.PassManager(used_passes=[], gather_ir_passes=False)

Manage passes.

Parameters

used_passes (List[Union[str, Pass]]) – List of passes that will be used.
gather_ir_passes (bool) – If True, gather the ONNX IR passes and execute it in one turn.

Return type

None

Example:

import poprt

pm = poprt.PassManager(
    [
        'model_overview',
        'float_to_half',
        poprt.get_pass('model_overview'),
    ]
)

pm.run(onnx_model)

add_passes(used_passes=[], gather_ir_passes=False)

Add passes for PassManager.

Parameters

used_passes (List[Union[str, Pass]]) – The list of passes that will be used.
gather_ir_passes (bool) – If True, gather the ONNX IR passes and execute it in one turn.

Return type

None

get_all_pass_names()

Get names of all passes.

Return type: List[str]

get_passes()

Get all passes.

Return type: List[Pass]

run(onnx_model)

Apply passes to the ONNX model.

Parameters: onnx_model (ModelProto) – The ONNX model that will be optimized.
Returns: The optimized ONNX model.
Return type: ModelProto

sort_passes(): Solve pass dependency.

6.7.1. Built-in passes

Refer to the Section 5.1, Passes section for more details about passes.

class poprt.passes.add_checkpoints.AddCheckpoints(checkpoints)

Add an intermediate tensor to the output.

Parameters: checkpoints (List[str]) –
Return type: None

This pass is registered as add_checkpoints.

class poprt.passes.apply_host_concat_split.ApplyHostConcatSplit(merged_inputs, remap=False)

Merge model inputs with the same shape.

NOTE: this is an experimental feature. Only tested on merging 2D inputs.

For example, a onnx graph has 3 inputs with same shape [512, 1] and dtype fp16. Before this Pass(only show inputs info):

+--+-----+---+
+-512*1*fp16-+

+--+-----+---+
+-512*1*fp16-+

+--+-----+---+
+-512*1*fp16-+

After applying this Pass:

                   +------+  -> +--+-----+---+
+------------+     |      |     +-512*1*fp16-+
|            |     |      |
| 3*512*fp16 | --> | HCSR |  -> +--+-----+---+
|            |     | Node |     +-512*1*fp16-+
+------------+     |      |
                   |      |  -> +--+-----+---+
                   +------+     +-512*1*fp16-+

The raw inputs will be replaced with one new input with shape [3, 512] and same dtype. The HCSR Node is a custom operation equal to Split + Reshape. it has 3 new outputs, these outputs have same shape and data from raw model inputs.

Parameters: merged_inputs (List[Any]) –

This pass is registered as apply_host_concat_split.

class poprt.passes.apply_ir_pass.ApplyIrPass(passes=[])

Apply passes based on the ONNX IR.

Parameters: passes (List[str]) –
Return type: None

This pass is registered as apply_ir_pass.

class poprt.passes.attention_padding.AttentionPadding(*args, **kwargs)

Recognise the attention pattern and pad inputs of MatMul Op to speed up.

Return type: None

This pass is registered as attention_padding.

class poprt.passes.auto_insert_remap.AutoInsertRemap(remap_mode=['after_matmul'])

Insert a Remap op after or before the assigned op.

This is an experimental feature. Pass ‘before/after’ + ‘_’ + ‘op_type’ to assign remap_mode. For example, there are two different insert modes:

after_matmul: Add the Remap op after the MatMul. This mode is more
general but is more likely to go OOM.
before_add. Add the Remap op before the Add op. This mode is targetted at reducing cycles of attention + mask
in transformer-based models.

Parameters: remap_mode (List[str]) –
Return type: None

This pass is registered as auto_insert_remap.

class poprt.passes.workarounds.BatchNormWorkaround(*args, **kwargs)

Workaround for the BatchNorm op.

Return type: None

This pass is registered as batchnorm_workaround.

class poprt.passes.check_unspported_attribute.ReplaceUnsupportedAttr(*args, **kwargs)

Check the attributes that popart does not supported.

Return type: None

This pass is registered as check_unsupported_attribute.

class poprt.passes.check_with_fake_data.CheckWithFakeData(origin_model)

Check model with fake data using the ONNX runtime.

Parameters: origin_model (ModelProto) –
Return type: None

This pass is registered as check_with_fake_data.

class poprt.passes.compress_pattern.CompressPattern(*args, **kwargs)

Return type: None

This pass is registered as compress_pattern.

class poprt.passes.const_batch_size.ConstBatchSize(const_batch_size=1)

Convert an unknown batch size to a const value.

Parameters: const_batch_size (int) –
Return type: None

This pass is registered as const_batch_size.

class poprt.passes.const_input_shape.ConstInputShape(const_input_shape={}, batch_size=None, batch_axis=None)

Convert the input shape to const values.

Parameters

const_input_shape (Dict[str, Any]) –
batch_size (int) –
batch_axis (int) –

This pass is registered as const_input_shape.

class poprt.passes.constant_folding.ConstantFolding(max_tensor_size=- 1)

Support constant folding.

Parameters: max_tensor_size (int) –
Return type: None

This pass is registered as constant_folding.

class poprt.passes.workarounds.CumSumWorkaround(*args, **kwargs)

Workaround for the CumSum op.

Return type: None

This pass is registered as cumsum_workaround.

class poprt.passes.double_to_float.DoubleToFloat(*args, **kwargs)

Convert double to float (only for initializer).

Return type: None

This pass is registered as double_to_float.

class poprt.passes.eight_bits_io.EightBitsIO: Insert norm operator after input image.

This pass is registered as eight_bits_io.

class poprt.passes.apply_ir_pass.eliminate_deadend(*args, **kwargs)

Return type: None

This pass is registered as eliminate_deadend.

class poprt.passes.apply_ir_pass.eliminate_duplicate_initializer(*args, **kwargs)

Return type: None

This pass is registered as eliminate_duplicate_initializer.

class poprt.passes.apply_ir_pass.eliminate_identity(*args, **kwargs)

Return type: None

This pass is registered as eliminate_identity.

class poprt.passes.apply_ir_pass.eliminate_nop_arithmetic(*args, **kwargs)

Return type: None

This pass is registered as eliminate_nop_arithmetic.

class poprt.passes.apply_ir_pass.eliminate_nop_cast(*args, **kwargs)

Return type: None

This pass is registered as eliminate_nop_cast.

class poprt.passes.apply_ir_pass.eliminate_nop_expand(*args, **kwargs)

Return type: None

This pass is registered as eliminate_nop_expand.

class poprt.passes.apply_ir_pass.eliminate_nop_flatten(*args, **kwargs)

Return type: None

This pass is registered as eliminate_nop_flatten.

class poprt.passes.apply_ir_pass.eliminate_nop_if(*args, **kwargs)

Return type: None

This pass is registered as eliminate_nop_if.

class poprt.passes.apply_ir_pass.eliminate_nop_pad(*args, **kwargs)

Return type: None

This pass is registered as eliminate_nop_pad.

class poprt.passes.apply_ir_pass.eliminate_nop_reshape(*args, **kwargs)

Return type: None

This pass is registered as eliminate_nop_reshape.

class poprt.passes.apply_ir_pass.eliminate_nop_transpose(*args, **kwargs)

Return type: None

This pass is registered as eliminate_nop_transpose.

class poprt.passes.apply_ir_pass.eliminate_unused_initializer(*args, **kwargs)

Return type: None

This pass is registered as eliminate_unused_initializer.

class poprt.passes.erf_gelu_pattern.ErfGeluPattern(*args, **kwargs)

Recognise the pattern of the Erf Gelu op and replace the pattern with Erf Gelu.

Return type: None

This pass is registered as erf_gelu_pattern.

class poprt.passes.apply_ir_pass.extract_constant_to_initializer(*args, **kwargs)

Return type: None

This pass is registered as extract_constant_to_initializer.

class poprt.passes.fill_squeeze_axes.FillSqueezeAxes(*args, **kwargs)

Fill the empty axes of the Squeeze op to ensure that shape-inference works.

Return type: None

This pass is registered as fill_squeeze_axes.

class poprt.passes.final_check.FinalCheck(*args, **kwargs)

Final check for data type and shape of the converted model.

Return type: None

This pass is registered as final_check.

class poprt.passes.workarounds.FloatOpsWorkaround(*args, **kwargs)

Workaround for ops which are required with float32 or float16 inputs.

Return type: None

This pass is registered as float_ops_workaround.

class poprt.passes.float_to_fp8.Float2FP8(fp8_params=['F143', 'F143', 0, 0], skip_op_names=[], convert_model='fp8', fp8_input_dict=None, fp8_weight_dict=None, skip_op=True)

Convert a model from fp32 or fp16 to fp8.

Parameters

fp8_params (List[Union[typing_extensions.Literal[F143, F152], str]]) – Set of parameters for the fp8 model. The format is [input_format, weight_format, input_scale, weight_scale].
skip_op_names (List[str]) – The names of ops which will not be converted in fp8 mode. They will remain as fp16 or fp32. Example names [‘Conv_1’, ‘Conv_2’].
convert_model (typing_extensions.Literal[fp8, fp8_weight]) – Specifies which fp8 type the model is converted to, can be set to ‘fp8’ or ‘fp8_weight’
fp8_input_dict (Dict[str, int]) – Set parameters for each fp8 input node of the fp8 model. If it’s not None, fp8_params will be discarded.
fp8_weight_dict (Dict[str, int]) – Set parameters for each fp8 weight node of the fp8 model. If it’s not None, fp8_params will be discarded.
skip_op (bool) –

Return type

None

This pass is registered as float_to_fp8.

class poprt.passes.float_to_half.Float2Half(skip_op_types=[], enable_avoid_overflow_patterns=False)

Convert a model from fp32 to fp16.

Parameters

skip_op_types (List[str]) –
enable_avoid_overflow_patterns (bool) –

Return type

None

This pass is registered as float_to_half.

class poprt.passes.float_to_int8.Float2Int8(int8_offset_scale=None)

Convert a model from fp32 or fp16 to int8.

Parameters: int8_offset_scale (Dict) – a dict to store int8 value, offset and scale. For example:
Return type: None

{
    "weight_name": {
        "value": np.int8
        "offset": np.float32,
        "scale": np.float32,
        }
}

This pass is registered as float_to_int8.

class poprt.passes.float_to_mixed.Float2Mixed

Convert a model from fp32 to mixed precision.

Return type: None

This pass is registered as float_to_mixed.

class poprt.passes.fold_periodic_initializer.FoldPeriodicInitializer

Fold periodic initializer to save Always-Live memory.

Return type: None

This pass is registered as fold_periodic_initializer.

class poprt.passes.apply_ir_pass.fuse_bn_into_conv(*args, **kwargs)

Return type: None

This pass is registered as fuse_bn_into_conv.

class poprt.passes.fuse_bn_into_gemm.FuseBnIntoGemm

Fuse BatchNormalization to MatMul/Gemm.

Conditions:: Condition 1: MatMul/Gemm uses an initializer. Condition 2: No multi outputs in Gemm/MatMul. Condition 3: Initializers used across operators is not supported.

Return type: None

This pass is registered as fuse_bn_into_gemm.

class poprt.passes.fuse_cast_into_onehot.FuseCastIntoOnehot

Fuse Cast into OneHot.

Return type: None

This pass is registered as fuse_cast_into_onehot.

class poprt.passes.apply_ir_pass.fuse_consecutive_cast(*args, **kwargs)

Return type: None

This pass is registered as fuse_consecutive_cast.

class poprt.passes.apply_ir_pass.fuse_consecutive_reshape(*args, **kwargs)

Return type: None

This pass is registered as fuse_consecutive_reshape.

class poprt.passes.apply_ir_pass.fuse_consecutive_squeeze(*args, **kwargs)

Return type: None

This pass is registered as fuse_consecutive_squeeze.

class poprt.passes.apply_ir_pass.fuse_consecutive_transpose(*args, **kwargs)

Return type: None

This pass is registered as fuse_consecutive_transpose.

class poprt.passes.apply_ir_pass.fuse_consecutive_unsqueeze(*args, **kwargs)

Return type: None

This pass is registered as fuse_consecutive_unsqueeze.

class poprt.passes.fuse_mul_into_matmul.FuseMulIntoMatmul

Fuse Mul into MatMul.

Return type: None

This pass is registered as fuse_mul_into_matmul.

class poprt.passes.fused_attention.FusedAttention(*args, **kwargs)

Recognise the pattern of MultiHeadAttention and replace it with Fused MultiHeadAttention. A Attention pattern is as follows:

Add
|
Reshape  --    --    --
|           \           \
MatMul       MatMul      MatMul
|            |           |
Reshape      Reshape     Reshape
|            |           |
Add          Add         Add
|            |           |
Reshape      Reshape     Reshape

A Fused Attention pattern is as follows:

Add
|
Concat
|
MatMul
|
Add
|
Reshape
|
Transpose
|
Split

Return type: None

This pass is registered as fused_attention.

class poprt.passes.gelu_pattern.GeluPattern(*args, **kwargs)

Recognise the pattern of the Gelu op and replace the pattern with Gelu.

Return type: None

This pass is registered as gelu_pattern.

class poprt.passes.workarounds.IndicesWorkaround(*args, **kwargs)

Workaround for the Gather and GatherElements ops.

Return type: None

This pass is registered as indices_workaround.

class poprt.passes.insert_attention_mask.InsertAttentionMask(*args, **kwargs)

Replace Reshape-Cast-Sub-Mul with Cast-AttentionMask.

Return type: None

This pass is registered as insert_attention_mask.

class poprt.passes.int64_to_int32.Int64ToInt32(*args, **kwargs)

Convert int64 to int32.

Return type: None

This pass is registered as int64_to_int32.

class poprt.passes.layer_norm_pattern.LayerNormPattern(*args, **kwargs)

Recognise the pattern of the LayerNorm op and replace the pattern with GroupNorm.

Return type: None

This pass is registered as layer_norm_pattern.

class poprt.passes.layer_precision_compare.LayerPrecisionCompare(origin_model, data_preprocess=None, options=None, output_dir='./')

Compare the output of the Conv/MatMul/Gemm op of the original model and the fp8 model.

It will randomly take a batch of data from the calibration for inference, and then records the output of the original model and the converted model. We use cosine distance to evaluate the error because it is a normalized number that measures the angle between vectors. The closer the value is to 0, the smaller the error. The log is written to a log file.

Parameters

origin_model (ModelProto) –
data_preprocess (str) –
options (Dict[str, Any]) –
output_dir (str) –

Return type

None

This pass is registered as layer_precision_compare.

class poprt.passes.manual_sharding.ManualSharding(sharding_info=None, pipelining_info=None)

Shard the graph into several subgraphs manually in terms of specific nodes.

Parameters

sharding_info (Dict[str, int]) –
pipelining_info (Dict[str, int]) –

Return type

None

This pass is registered as manual_sharding.

class poprt.passes.matmul_rotary_embedding.MatmulRotaryEmbedding

Recognise the pattern of element-wise rotary embedding and replace the pattern with the equivalent matmul.

Return type: None

This pass is registered as matmul_rotary_embedding.

class poprt.passes.merge_if_with_same_cond.MergeIfWithSameCond(*args, **kwargs)

Merge if nodes that use same condition input.

Return type: None

This pass is registered as merge_if_with_same_cond.

class poprt.passes.merge_matmul.MergeMatmul(merge_str=None)

Parameters: merge_str (str) –
Return type: None

This pass is registered as merge_matmul.

class poprt.passes.merge_matmul_add.MergeMatmulAdd(merge_str=None)

Parameters: merge_str (str) –
Return type: None

This pass is registered as merge_matmul_add.

class poprt.passes.merge_moe.MergeMoE(merge_str=None)

Parameters: merge_str (str) –
Return type: None

This pass is registered as merge_moe.

class poprt.passes.merge_multi_slice.MergeMultiSlice(*args, **kwargs)

Return type: None

This pass is registered as merge_multi_slice.

class poprt.passes.merge_variant_if.MergeVariantIf(*args, **kwargs)

Move nodes to subgraph of if operator.

Return type: None

This pass is registered as merge_variant_if.

class poprt.passes.model_overview.ModelOverview(use_print=True, *args, **kwargs)

Print overview information of the model to stdout.

Return type: None

This pass is registered as model_overview.

class poprt.passes.move_subgraph_initializer.MoveSubgraphInitializer

Move the subgraph initializers into the main graph.

PopART only searches initializers in the main graph.

Return type: None

This pass is registered as move_subgraph_initializer.

class poprt.passes.workarounds.OneHotWorkaround(*args, **kwargs)

Workaround for the OneHot op which needs to have an int32 depth and a positive axis.

Return type: None

This pass is registered as onehot_workaround.

class poprt.passes.overlap_io.OverlapIO

Enable overlap IO.

Return type: None

This pass is registered as overlap_io.

class poprt.passes.packed_transformer.PackedTransformer(args): Recognise the pattern of SelfAttention and replace it with Packed SelfAttention.

This pass is registered as packed_transformer.

class poprt.passes.post_expand.PostExpand(*args, **kwargs)

Return type: None

This pass is registered as post_expand.

class poprt.passes.pre_scale.PreScale(*args, **kwargs)

Pre scale: Attention matrix Q to Q/sqrt(d), and remove 1/sqrt(d) node.

Return type: None

This pass is registered as pre_scale.

class poprt.passes.remove_duplicated_initializer.RemoveDuplicatedInitializer

Remove duplicated initializer to save memory.

Return type: None

This pass is registered as remove_duplicated_initializer.

class poprt.passes.workarounds.RemoveEmptyConcatInputs(*args, **kwargs)

Workaround for the Concat op which does not support empty inputs in PopART.

Return type: None

This pass is registered as remove_empty_concat_inputs.

class poprt.passes.remove_initializer_from_input.RemoveInitializerFromInput(*args, **kwargs)

Remove initializer from model inputs.

Model: https://github.com/onnx/models/blob/main/vision/classification/resnet/model/resnet50-v1-7.onnx

Return type: None

This pass is registered as remove_initializer_from_input.

class poprt.passes.remove_input_cast.RemoveInputCast(*args, **kwargs)

Remove input cast: input(fp16)->cast(fp16->int32)->gather to input(int32)->gather.

Return type: None

This pass is registered as remove_input_cast.

class poprt.passes.remove_outputs.RemoveOutputs(outputs=[])

Remove specific outputs and useless structures of the graph.

Parameters: outputs (List[str]) –
Return type: None

This pass is registered as remove_outputs.

class poprt.passes.replace_bn_with_mul_add.ReplaceBNWithMulAdd(*args, **kwargs)

Replace BatchNormalization op with Mul + Add ops.

Return type: None

This pass is registered as replace_bn_with_mul_add.

class poprt.passes.replace_castlike.ReplaceCastLike(*args, **kwargs)

Replace ONNX CastLike op with Cast.

Return type: None

This pass is registered as replace_castlike.

class poprt.passes.replace_clip_empty_inputs.ReplaceClipInputs(*args, **kwargs)

Replace empty inputs in Clip op.

Return type: None

This pass is registered as replace_clip_empty_inputs.

class poprt.passes.replace_consecutive_cast_with_notzero.ReplaceConsecuiveCastWithNotZero(*args, **kwargs)

Recognise the pattern of consecutive Cast ops and replace the pattern with a NotZero op.

Return type: None

This pass is registered as replace_consecutive_cast_with_notzero.

class poprt.passes.replace_div_with_mul.ReplaceDivWithMul(*args, **kwargs)

Replace Div with Mul if the divisor is constant.

Model: https://github.com/onnx/models/blob/main/text/machine_comprehension/gpt-2/model/gpt2-10.onnx

Return type: None

This pass is registered as replace_div_with_mul.

class poprt.passes.apply_ir_pass.replace_einsum_with_matmul(*args, **kwargs)

Return type: None

This pass is registered as replace_einsum_with_matmul.

class poprt.passes.replace_erf_with_erfv2.ReplaceErfWithErfV2(*args, **kwargs)

Replace the Erf op with ErfV2.

ErfV2 is more efficient with bigger errors.

Return type: None

This pass is registered as replace_erf_with_erfv2.

class poprt.passes.replace_gemm_with_matmul.ReplaceGemmWithMatMul(*args, **kwargs)

Replace Gemm with MatMul in the ONNX model.

Return type: None

This pass is registered as replace_gemm_with_matmul.

class poprt.passes.replace_greater_or_equal.ReplaceGreaterOrEqual(*args, **kwargs)

Replace the GreaterOrEqual op with the Less and Not ops.

Return type: None

This pass is registered as replace_greater_or_equal.

class poprt.passes.replace_groupnorm_with_fast_norm.ReplaceGroupNormWithFastNorm(*args, **kwargs)

Replace GroupNormalization with FastNorm if datatype is fp16 and num_groups =1.

Return type: None

This pass is registered as replace_groupnorm_with_fast_norm.

class poprt.passes.replace_half_reducemean.ReplaceHalfReduceMean(*args, **kwargs)

Replace the ReduceMean op in fp16 mode with ReduceSum + Mul in case of overflow.

Return type: None

This pass is registered as replace_half_reducemean.

class poprt.passes.replace_hardswish.ReplaceHardSwish(*args, **kwargs)

Replace the HardSwish op with the HardSigmoid and Mul ops.

Replacement is required for opsets earlier than 14 since HardSwish is only supported from opset version 14.

Return type: None

This pass is registered as replace_hardswish.

class poprt.passes.replace_isinf.ReplaceIsInf(*args, **kwargs)

Replace the IsInf op with the IsInfV2 op (support detect_negative and detect_positive).

Return type: None

This pass is registered as replace_isinf.

class poprt.passes.replace_less_or_equal.ReplaceLessOrEqual(*args, **kwargs)

Replace the LessOrEqual op with the Less and Not ops.

Return type: None

This pass is registered as replace_less_or_equal.

class poprt.passes.replace_nonzero.ReplaceNonZero(*args, **kwargs)

Replace NonZero with ArgMax when the number of non-zero elements is known.

Only a single element is supported.

Return type: None

This pass is registered as replace_nonzero.

class poprt.passes.replace_pow.ReplacePow(*args, **kwargs)

Replace Pow op with the Square and Mul ops.

Return type: None

This pass is registered as replace_pow.

class poprt.passes.replace_round.ReplaceRound(*args, **kwargs)

Replace the Round op with the RoundV2 op (half to even mode).

Return type: None

This pass is registered as replace_round.

class poprt.passes.replace_softmax.ReplaceSoftmax(*args, **kwargs)

Replace the Softmax op with the SoftmaxV2 op when the axis is the lowest dim and the lowest dim is odd.

Return type: None

This pass is registered as replace_softmax.

class poprt.passes.replace_where_mask.ReplaceWhereMask(*args, **kwargs)

Change the attention mask method from “where” to “add”.

Return type: None

This pass is registered as replace_where_mask.

class poprt.passes.replace_where_with_mul_add.ReplaceWhereWithMulAdd

Replace the Where op with the Add and Mul ops.

Where(condition, X, Y) = Add(Mul(condition, X), Mul(neg_condition, Y)).

This pass is registered as replace_where_with_mul_add.

class poprt.passes.replace_where_with_wherev2.ReplaceWhereWithWhereV2(*args, **kwargs)

Replace the Where op with WhereV2.

Return type: None

This pass is registered as replace_where_with_wherev2.

class poprt.passes.serialize_matmul.SerializeMatmul(serialize_dict=None)

Enable serializing of the Matmul op to save memory on chip.

Parameters: serialize_dict (Dict) –
Return type: None

This pass is registered as serialize_matmul.

class poprt.passes.serialize_matmul_add.SerializeMatmulAdd(serialize_dict=None)

Parameters: serialize_dict (Dict) –
Return type: None

This pass is registered as serialize_matmul_add.

class poprt.passes.shape_inference.ReplacePow(*args, **kwargs)

Run shape inference.

Return type: None

This pass is registered as shape_inference.

class poprt.passes.sort_graph.SortGraph(*args, **kwargs)

sort subgraph to keep topological order in popart.

Return type: None

This pass is registered as sort_graph.

class poprt.passes.workarounds.TopKWorkaround(*args, **kwargs)

Workaround for the TopK op which needs to have a positive axis.

Return type: None

This pass is registered as topk_workaround.

class poprt.passes.apply_ir_pass.trace_folding(*args, **kwargs)

Return type: None

This pass is registered as trace_folding.

class poprt.passes.apply_ir_pass.unique_name_for_nodes(*args, **kwargs)

Return type: None

This pass is registered as unique_name_for_nodes.

Search help

6. Python API

6.1. poprt module

6.2. poprt.compiler module

6.3. poprt.runtime module

6.4. poprt.frontend module

6.5. poprt.backends module

6.6. poprt.quantizer module

6.7. poprt.passes module

6.7.1. Built-in passes

6.1. `poprt` module

6.2. `poprt.compiler` module

6.3. `poprt.runtime` module

6.4. `poprt.frontend` module

6.5. `poprt.backends` module

6.6. `poprt.quantizer` module

6.7. `poprt.passes` module