module onnxrt.onnx_inference#

Inheritance diagram of mlprodict.onnxrt.onnx_inference

Short summary#

module mlprodict.onnxrt.onnx_inference

Implements a class able to compute the predictions from an ONNX model.

source on GitHub

Classes#

class – truncated documentation

  • OnnxInference – Loads an ONNX file or object or stream. Computes the output of the ONNX graph. Several runtimes …

Properties#

property – truncated documentation

  • input_names – Returns the names of all inputs. It does not include the optional inputs.

  • input_names_shapes – Returns the names and shapes of all inputs. This method assumes all inputs are tensors. It does not include …

  • input_names_shapes_types – Returns the names, shapes, types of all inputs. This method assumes all inputs are tensors. It does not …

  • output_names – Returns the names of all outputs.

  • output_names_shapes – Returns the names and shapes of all outputs. This method assumes all outputs are tensors.

  • output_names_shapes_types – Returns the names, shapes, types of all outputs. This method assumes all outputs are tensors. It does not …

Static Methods#

staticmethod – truncated documentation

  • _get_type_property

Methods#

method – truncated documentation

  • __getitem__ – Returns the ONNX version of a node.

  • __getstate__ – To pickle the object.

  • __init__

  • __repr__ – usual

  • __setstate__ – To unpickle the object.

  • __str__ – usual

  • _build_compile_run – Rewrites the run function in python, compiles it, and adds it as a method.

  • _guess_inplace – Looks into every node of the graph to see if there is a way to do the computation inplace. By default (input_inplace=False), …

  • _guess_input_dtype

  • _init – Prepares the instance to deliver predictions.

  • _run_sequence_runtime

  • _run_sequence_runtime_compiled – Executes a compiled version of _run_sequence_runtime(), compiled with method _build_compile_run(). …

  • _run_whole_runtime

  • _set_shape_inference_runtime – Set shapes based on shape inference relying on the runtime. The values are stored in every node.

  • _set_size_inference_runtime – Set sizes allocated during inference relying on the runtime. The values are stored in every node.

  • _set_type_inference_runtime – Set types based on type inference relying on the runtime. The values are stored in every node.

  • _validate_outputs – Checks that the outputs have the expected types. The function returns the list of mismatches.

  • build_intermediate – Builds every possible ONNX file which computes one specific intermediate output from the inputs. …

  • check_onnx – Checks the model follows ONNX conventions.

  • display_sequence – Shows the sequence of nodes to run if runtime=='python'.

  • get_execution_order – This function returns a dictionary {(kind, name): (order, op)}, name can be a node name or a result name. In …

  • get_profiling – Returns the profiling after a couple of executions.

  • global_index – Maps every name to one integer to avoid using dictionaries when running the predictions.

  • infer_shapes – Computes expected shapes.

  • infer_sizes – Computes expected sizes.

  • infer_types – Computes expected types.

  • reduce_size – Reduces the memory footprint as much as possible.

  • run – Computes the predictions for this onnx graph.

  • run2onnx – Executes the graphs with the given inputs, then adds the intermediate results into ONNX nodes in the original graph. …

  • shape_inference – Infers the shape of the outputs with onnx package.

  • switch_initializers_dtype – Switches all initializers to numpy.float64. If model is None, a simple cast is done. Otherwise, the function …

  • to_sequence – Produces a graph to facilitate the execution. One example:

Documentation#

Implements a class able to compute the predictions from an ONNX model.

class mlprodict.onnxrt.onnx_inference.OnnxInference(onnx_or_bytes_or_stream, runtime=None, skip_run=False, inplace=True, input_inplace=False, ir_version=None, target_opset=None, runtime_options=None, session_options=None, inside_loop=False, static_inputs=None, new_outputs=None, new_opset=None, existing_functions=None)#

Bases: object

Loads an ONNX file or object or stream. Computes the output of the ONNX graph. Several runtimes are available.

  • 'python': the runtime implements every onnx operator needed to run a scikit-learn model by using numpy or C++ code.

  • 'python_compiled': the same runtime as the previous one, except that every operator is called from a compiled function (_build_compile_run) instead of a method going through the list of operators.

  • 'onnxruntime1': uses onnxruntime (or onnxruntime1-cuda, …)

  • 'onnxruntime2': this mode is mostly used to debug: python handles calling every operator but onnxruntime is called for each of them. This process may fail due to wrong type inference, especially if the graph includes custom nodes; in that case, it is better to compute the output of intermediate nodes. It is much slower, as every node is computed for every output, but more robust.

Parameters:
  • onnx_or_bytes_or_stream – onnx object, bytes, or filename or stream

  • runtime – runtime options

  • skip_run – do not build the runtime

  • inplace – use inplace computation as much as possible

  • input_inplace – the computation is allowed to overwrite the input, see _guess_inplace

  • ir_version – if not None, overwrite the default version

  • target_opset – used to overwrite target_opset

  • runtime_options – specific options for the runtime

  • inside_loop – tells the runtime the graph is meant to be repeated multiple times (in that case, inputs and outputs may share the same name)

  • static_inputs – Loop can use static variables, variables from the graph which runs the loop (enumerate of strings)

  • new_outputs – if the loading fails, it might be worth cutting the graph; if not None, the graph is cut to have these new_outputs as the final outputs

  • new_opset – overwrites the main opset and replaces it with this new one

  • existing_functions – a model may contain several local functions, this parameter is used when a local function is calling another local function previously defined.

Among the possible runtime_options, there are:

  • enable_profiling: enables profiling for onnxruntime

  • session_options: an instance of SessionOptions from onnxruntime

  • ir_version: change ir_version

Changed in version 0.9: Parameter existing_functions was added. Parameter device was removed, see runtime. Runtime onnxruntime1-cuda was added.

__getitem__(item)#

Returns the ONNX version of a node.

__getstate__()#

To pickle the object.

__init__(onnx_or_bytes_or_stream, runtime=None, skip_run=False, inplace=True, input_inplace=False, ir_version=None, target_opset=None, runtime_options=None, session_options=None, inside_loop=False, static_inputs=None, new_outputs=None, new_opset=None, existing_functions=None)#
__repr__()#

usual

__setstate__(state)#

To unpickle the object.

__str__()#

usual

_build_compile_run(debug=False)#

Rewrites the run function in python, compiles it, and adds it as a method.

Parameters:

debug – insert debugging code

Returns:

method name, callable object

Run a model with runtime ‘python_compiled’

The following code trains a model and computes its predictions with runtime 'python_compiled'. The runtime converts the onnx graph into a python function which calls every operator. The code of that function is printed below.

<<<

import numpy
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from mlprodict.onnx_conv import to_onnx
from mlprodict.onnxrt import OnnxInference

iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, __ = train_test_split(X, y, random_state=11)
y_train = y_train.astype(numpy.float32)
clr = AdaBoostClassifier(
    base_estimator=DecisionTreeClassifier(max_depth=3),
    n_estimators=3)
clr.fit(X_train, y_train)

model_def = to_onnx(clr, X_train.astype(numpy.float32),
                    target_opset=12)

oinf2 = OnnxInference(model_def, runtime='python_compiled')
print(oinf2.run({'X': X_test[:5]}))

# prints out the python function equivalent
# to the onnx graph
print(oinf2)

>>>

    {'output_label': array([2, 2, 1, 1, 2]), 'output_probability': []}
    OnnxInference(...)
        def compiled_run(dict_inputs, yield_ops=None, context=None, attributes=None):
            if yield_ops is not None:
                raise NotImplementedError('yields_ops should be None.')
            # init: classes (classes)
            # init: clip_min (clip_min)
            # init: inverted_n_classes (inverted_n_classes)
            # init: mul_operand (mul_operand)
            # init: n_classes_minus_one (n_classes_minus_one)
            # init: shape_tensor (shape_tensor)
            # init: shape_tensor3 (shape_tensor3)
            # init: zero_scalar (zero_scalar)
            # inputs
            X = dict_inputs['X']
            (elab_name_1, eprob_name_1, ) = n0_treeensembleclassifier_1(X)
            (elab_name_0, eprob_name_0, ) = n1_treeensembleclassifier_1(X)
            (clipped_proba1, ) = n2_clip_11(eprob_name_1, clip_min)
            (clipped_proba, ) = n3_clip_11(eprob_name_0, clip_min)
            (log_proba, ) = n4_log(clipped_proba)
            (log_proba1, ) = n5_log(clipped_proba1)
            (reduced_proba1, ) = n6_reducesum_11(log_proba1)
            (reduced_proba, ) = n7_reducesum_11(log_proba)
            (reshaped_result1, ) = n8_reshape_5(reduced_proba1, shape_tensor)
            (reshaped_result, ) = n9_reshape_5(reduced_proba, shape_tensor)
            (prod_result1, ) = n10_mul(reshaped_result1, inverted_n_classes)
            (prod_result, ) = n11_mul(reshaped_result, inverted_n_classes)
            (sub_result1, ) = n12_sub(log_proba1, prod_result1)
            (sub_result, ) = n13_sub(log_proba, prod_result)
            (samme_proba1, ) = n14_mul(sub_result1, n_classes_minus_one)
            (samme_proba, ) = n15_mul(sub_result, n_classes_minus_one)
            (summation_prob, ) = n16_sum(samme_proba, samme_proba1)
            (div_result, ) = n17_div(summation_prob, n_classes_minus_one)
            (exp_operand, ) = n18_mul(div_result, mul_operand)
            (exp_result, ) = n19_exp(exp_operand)
            (reduced_exp_result, ) = n20_reducesum_11(exp_result)
            (normaliser, ) = n21_reshape_5(reduced_exp_result, shape_tensor)
            (cast_normaliser, ) = n22_cast(normaliser)
            (comparison_result, ) = n23_equal(cast_normaliser, zero_scalar)
            (cast_output, ) = n24_cast(comparison_result)
            (zero_filtered_normaliser, ) = n25_add(normaliser, cast_output)
            (probabilities, ) = n26_div(exp_result, zero_filtered_normaliser)
            (argmax_output, ) = n27_argmax_12(probabilities)
            (output_probability, ) = n28_zipmap(probabilities)
            (array_feature_extractor_result, ) = n29_arrayfeatureextractor(classes, argmax_output)
            (reshaped_result2, ) = n30_reshape_5(array_feature_extractor_result, shape_tensor3)
            (label, ) = n31_cast(reshaped_result2)
            (output_label, ) = n32_cast(label)
            return {
                'output_label': output_label,
                'output_probability': output_probability,
            }
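The idea behind a compiled run can be illustrated without the library. The sketch below is not the code of _build_compile_run: it assumes a simplified node layout (function name, callable, input names, output names), and the helper build_compiled_run is hypothetical. It generates the source of a function which calls every operator in order, compiles it, and returns the source together with the callable.

```python
def build_compiled_run(input_names, nodes, output_names):
    """Generates and compiles a function equivalent to a loop over nodes.

    nodes is a list of (function_name, callable, input_names, output_names),
    a layout assumed for this example only.
    """
    lines = ["def compiled_run(dict_inputs):"]
    for name in input_names:
        lines.append(f"    {name} = dict_inputs[{name!r}]")
    # the generated function looks up every operator in this namespace
    namespace = {}
    for fct_name, fct, ins, outs in nodes:
        namespace[fct_name] = fct
        lines.append(
            f"    ({', '.join(outs)}, ) = {fct_name}({', '.join(ins)})")
    lines.append("    return {")
    for name in output_names:
        lines.append(f"        {name!r}: {name},")
    lines.append("    }")
    code = "\n".join(lines)
    exec(compile(code, "<compiled_run>", "exec"), namespace)
    return code, namespace["compiled_run"]


# two toy operators, each returns a tuple of outputs
nodes = [
    ("n0_add", lambda a, b: (a + b, ), ["X", "Y"], ["s"]),
    ("n1_neg", lambda a: (-a, ), ["s"], ["Z"]),
]
code, compiled_run = build_compiled_run(["X", "Y"], nodes, ["Z"])
print(code)
print(compiled_run({"X": 1, "Y": 2}))
```

The generated function no longer loops over a list of operators or looks results up in a dictionary, which is the kind of overhead a compiled run is meant to avoid.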

static _get_type_property(info, prop)#
_guess_inplace(input_inplace=False)#

Looks into every node of the graph to see if there is a way to do the computation inplace. By default (input_inplace=False), the function assumes inputs cannot be modified so the first node cannot do inplace computation. This function only works with the python runtime.

Parameters:

input_inplace – the computation is allowed to overwrite the input

This function checks whether the result of a node is used only once, in which case it can be modified in place by the next node. The results of nodes A and C can be overwritten by the computation. The result of node B cannot, as it is used by two nodes.

It does not handle specific cases such as node B being overwritten by node C without changing its shape while node D only needs the shape of B. In that case, B could be overwritten as well.
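The rule "a result consumed exactly once may be overwritten" can be sketched as follows. This is an illustration, not the library's implementation: the function guess_inplace_sketch, the node layout (node name, input names, output names) and the toy graph are all assumptions made for the example.

```python
from collections import Counter


def guess_inplace_sketch(nodes, graph_inputs, graph_outputs,
                         input_inplace=False):
    """Returns the results which their unique consumer may overwrite.

    nodes is a list of (node_name, input_names, output_names).
    """
    # number of times every result is consumed by a node
    consumers = Counter(name for _, ins, _ in nodes for name in ins)
    inplace = set()
    for name, count in consumers.items():
        if count != 1 or name in graph_outputs:
            # used several times, or still needed by the caller
            continue
        if name in graph_inputs and not input_inplace:
            # inputs are preserved unless explicitly allowed
            continue
        inplace.add(name)
    return inplace


# toy graph: result b (produced by node B) is consumed by C and D
nodes = [
    ("A", ["X"], ["a"]),
    ("B", ["a"], ["b"]),
    ("C", ["b"], ["c"]),
    ("D", ["b"], ["d"]),
    ("E", ["c", "d"], ["e"]),
]
print(sorted(guess_inplace_sketch(nodes, ["X"], ["e"])))
print(sorted(guess_inplace_sketch(nodes, ["X"], ["e"], input_inplace=True)))
```

In this toy graph, b is never selected because two nodes read it, and the input X is only selected when input_inplace is True.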

_guess_input_dtype()#
_init(existing_functions=None)#

Prepares the instance to deliver predictions.

_run_sequence_runtime(inputs, clean_right_away=False, intermediate=False, verbose=0, node_time=False, overwrite_types=None, yield_ops=None, fLOG=None, context=None, attributes=None)#
_run_sequence_runtime_compiled(inputs, clean_right_away=False, intermediate=False, verbose=0, node_time=False, yield_ops=None, fLOG=None, context=None, attributes=None)#

Executes a compiled version of _run_sequence_runtime, compiled with method _build_compile_run. Every parameter with a default value is ignored. Switch to runtime='python' to enable those.

_run_whole_runtime(inputs, clean_right_away=False, intermediate=False, verbose=0, node_time=False, overwrite_types=None, yield_ops=None, fLOG=None, context=None, attributes=None)#
_set_shape_inference_runtime()#

Set shapes based on shape inference relying on the runtime. The values are stored in every node.

_set_size_inference_runtime(inputs, context=None)#

Set sizes allocated during inference relying on the runtime. The values are stored in every node.

_set_type_inference_runtime(inputs=None)#

Set types based on type inference relying on the runtime. The values are stored in every node.

_validate_outputs(res, verbose=0, fLOG=None)#

Checks that the outputs have the expected types. The function returns the list of mismatches.

Parameters:
  • res – results in a dictionary

  • verbose – verbosity

  • fLOG – logging function

Returns:

dictionary

build_intermediate(outputs=None, verbose=0, overwrite_types=None, fLOG=None)#

Builds every possible ONNX file which computes one specific intermediate output from the inputs.

Parameters:
  • outputs – subsets of outputs to get, None to get all outputs

  • overwrite_types – shape inference does not always work, this parameter allows forcing types when building intermediate results, see select_model_inputs_outputs

  • verbose – displays intermediate information

  • fLOG – logging function

Returns:

collections.OrderedDict
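Cutting a graph so that it computes one intermediate result amounts to keeping the nodes this result depends on. The sketch below illustrates that selection on a toy description of a graph. It does not build ONNX files, and the names select_nodes_for_output and build_intermediate_sketch as well as the node layout are assumptions made for the example.

```python
from collections import OrderedDict


def select_nodes_for_output(nodes, target):
    """Keeps only the nodes needed to compute result target.

    nodes is a topologically sorted list of
    (node_name, input_names, output_names).
    """
    needed = {target}
    kept = []
    # walk backwards: a node is kept if it produces a needed result
    for name, ins, outs in reversed(nodes):
        if needed.intersection(outs):
            kept.append((name, ins, outs))
            needed.update(ins)
    return list(reversed(kept))


def build_intermediate_sketch(nodes):
    "Returns one reduced list of nodes per intermediate result."
    res = OrderedDict()
    for _, _, outs in nodes:
        for out in outs:
            res[out] = select_nodes_for_output(nodes, out)
    return res


nodes = [
    ("scale", ["X"], ["xs"]),
    ("norm", ["X"], ["n"]),
    ("add", ["xs", "B"], ["Y"]),
]
subgraphs = build_intermediate_sketch(nodes)
for result, sub in subgraphs.items():
    print(result, [node[0] for node in sub])
```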

check_onnx()#

Checks the model follows ONNX conventions.

display_sequence(verbose=1)#

Shows the sequence of nodes to run if runtime=='python'.

get_execution_order()#

This function returns a dictionary {(kind, name): (order, op)}, where name can be a node name or a result name. A result gets the same execution order as the node which created it. The function returns None if the order is not available (the selected runtime does not return it). kind tells whether name is a node name ('node') or a result name. If two nodes have the same name, the returned order is the last one. Initializers get an execution order equal to -1, inputs get 0, all other results are >= 1.

New in version 0.7.
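The numbering convention can be illustrated with a small sketch. It is not the library's code: the function execution_order_sketch and the simplified sequence of (node name, output names) are hypothetical, the node name stands in for the operator object, and the label 'res' used for results is an assumption.

```python
def execution_order_sketch(initializers, inputs, sequence):
    """Builds a dictionary {(kind, name): (order, op)}.

    sequence is the ordered list of (node_name, output_names).
    """
    order = {}
    for name in initializers:
        order['res', name] = (-1, None)   # 'res' is an assumed label
    for name in inputs:
        order['res', name] = (0, None)
    for i, (node_name, outputs) in enumerate(sequence):
        # a later node with the same name overwrites the previous entry
        order['node', node_name] = (i + 1, node_name)
        for out in outputs:
            # a result shares the order of the node which created it
            order['res', out] = (i + 1, node_name)
    return order


order = execution_order_sketch(
    ['coef'], ['X'], [('n0_mul', ['xc']), ('n1_add', ['Y'])])
for key, value in sorted(order.items(), key=lambda kv: kv[1][0]):
    print(key, value)
```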

get_profiling(as_df=False)#

Returns the profiling after a couple of executions.

Parameters:

as_df – return the results as a dataframe (True)

Returns:

dataframe or list of dictionaries

New in version 0.6.

global_index(name)#

Maps every name to one integer to avoid using dictionaries when running the predictions.

Parameters:

name – outputs name

Returns:

integer
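A minimal sketch of such a mapping, assuming each new name simply receives the next free integer. The class NameIndex is hypothetical and only shows why an integer index allows results to be stored in a list instead of a dictionary.

```python
class NameIndex:
    "Maps every name to one integer, in order of first appearance."

    def __init__(self):
        self._index = {}

    def global_index(self, name):
        # a known name keeps its integer, a new one gets the next integer
        if name not in self._index:
            self._index[name] = len(self._index)
        return self._index[name]


index = NameIndex()
positions = [index.global_index(n) for n in ['X', 'coef', 'Y']]

# results can now live in a list addressed by integer
values = [None] * len(positions)
values[index.global_index('X')] = 3.0
print(positions, values)
```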

infer_shapes()#

Computes expected shapes.

Returns:

dictionary of shapes

infer_sizes(inputs, context=None)#

Computes expected sizes.

Parameters:

inputs – inputs as a dictionary

Returns:

dictionary of dictionary of sizes

infer_types(inputs=None)#

Computes expected types.

Parameters:

inputs – needed when this class hosts a function and not a graph

Returns:

dictionary of types

property input_names#

Returns the names of all inputs. It does not include the optional inputs.

Changed in version 0.6: The list does not include optional inputs anymore.

property input_names_shapes#

Returns the names and shapes of all inputs. This method assumes all inputs are tensors. It does not include the optional inputs.

Changed in version 0.6: The list does not include optional inputs anymore.

property input_names_shapes_types#

Returns the names, shapes, types of all inputs. This method assumes all inputs are tensors. It does not include the optional inputs.

Changed in version 0.6: The list does not include optional inputs anymore.

property output_names#

Returns the names of all outputs.

property output_names_shapes#

Returns the names and shapes of all outputs. This method assumes all outputs are tensors.

property output_names_shapes_types#

Returns the names, shapes, types of all outputs. This method assumes all outputs are tensors. It does not include the optional outputs.

reduce_size(pickable=False)#

Reduces the memory footprint as much as possible.

Parameters:

pickable – if True, keeps what is needed to pickle the object

run(inputs, clean_right_away=False, intermediate=False, verbose=0, node_time=False, overwrite_types=None, yield_ops=None, fLOG=None, context=None, attributes=None)#

Computes the predictions for this onnx graph.

Parameters:
  • inputs – inputs as dictionary or a dataframe

  • clean_right_away – clean the intermediate outputs as soon as they are not needed

  • intermediate – returns a dictionary of intermediate variables instead of the results only

  • verbose – display information while predicting

  • node_time – measure time of each node

  • overwrite_types – shape inference does not always work, this parameter allows forcing types when building intermediate results, see select_model_inputs_outputs

  • yield_ops – dictionary to overwrite the output of operator YieldOp

  • fLOG – logging function if verbose > 0

  • context – local variables, needed when this object is a subgraph

  • attributes – used when this class runs a FunctionProto, it stores the values of the attributes of the function

Returns:

outputs as dictionary and a second dictionary of the time spent in each node if node_time is True

Computes predictions with any runtime

The following example compares predictions between scikit-learn and this runtime for the python runtime.

<<<

import numpy
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from mlprodict.onnxrt import OnnxInference
from mlprodict.onnx_conv import to_onnx

iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, _ = train_test_split(X, y)
clr = LinearRegression()
clr.fit(X_train, y_train)

exp = clr.predict(X_test[:5])
print(exp)

model_def = to_onnx(clr, X_train.astype(numpy.float32),
                    target_opset=12)
oinf = OnnxInference(model_def)
y = oinf.run({'X': X_test[:5]})
print(y)

>>>

    [0.01  1.365 0.893 0.052 2.04 ]
    {'variable': array([[0.01 ],
           [1.365],
           [0.893],
           [0.052],
           [2.04 ]])}

The function returns all intermediate outputs if intermediate is True. With runtime onnxruntime1, if intermediate is True, the class first builds every ONNX graph cut to keep one single output and converts each of them into an OnnxInference.

Changed in version 0.9: Parameter attributes was added.
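The effect of parameters intermediate, node_time and clean_right_away can be illustrated with a toy sequential runtime. This sketch is not the code behind run: the function run_sequence_sketch, the node layout (node name, callable, input names, output names) and the fact that initializers are passed along with the inputs are simplifications made for the example.

```python
import time


def run_sequence_sketch(nodes, inputs, outputs, intermediate=False,
                        node_time=False, clean_right_away=False):
    "Runs every node in order, nodes must be topologically sorted."
    values = dict(inputs)
    # index of the last node consuming every result
    last_use = {}
    for i, (_, _, ins, _) in enumerate(nodes):
        for name in ins:
            last_use[name] = i
    timings = []
    for i, (name, fct, ins, outs) in enumerate(nodes):
        begin = time.perf_counter()
        results = fct(*[values[n] for n in ins])
        timings.append(
            {'i': i, 'name': name, 'time': time.perf_counter() - begin})
        values.update(zip(outs, results))
        if clean_right_away and not intermediate:
            # drops the results no remaining node needs
            for n in ins:
                if last_use[n] == i and n not in outputs and n in values:
                    del values[n]
    res = values if intermediate else {n: values[n] for n in outputs}
    return (res, timings) if node_time else res


nodes = [
    ("n0_mul", lambda x, c: (x * c, ), ["X", "coef"], ["xc"]),
    ("n1_add", lambda a, b: (a + b, ), ["xc", "bias"], ["Y"]),
]
inputs = {"X": 2.0, "coef": 3.0, "bias": 1.0}
print(run_sequence_sketch(nodes, inputs, ["Y"]))
print(run_sequence_sketch(nodes, inputs, ["Y"], intermediate=True))
```

With node_time=True, the sketch returns a pair (outputs, timings), which mirrors the second dictionary of time spent in each node described above.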

run2onnx(inputs, verbose=0, fLOG=None, as_parameter=True, suffix='_DBG', param_name=None, node_type='DEBUG', domain='DEBUG', domain_opset=1, attributes=None)#

Executes the graph with the given inputs, then adds the intermediate results into ONNX nodes in the original graph. Once saved, it can be inspected with a tool such as netron.

Parameters:
  • inputs – inputs as dictionary or a dataframe

  • verbose – display information while predicting

  • fLOG – logging function if verbose > 0

  • as_parameter – add new nodes with results as one parameter (True) or as initializer (False)

  • suffix – suffix to add to new results

  • param_name – name of the parameter to add (by default the result name), it can be a function param_name(result_name) -> parameter_name

  • node_type – type of the new node

  • domain – domain of the new node

  • domain_opset – opset for domain

  • attributes – values for attributes if this class runs a FunctionProto

Returns:

outputs as dictionary and the onnx graph with new nodes

The following example shows how to use it.

New in version 0.7.

shape_inference()#

Infers the shape of the outputs with onnx package.

Returns:

A new ONNX graph with defined outputs.

switch_initializers_dtype(model=None, dtype_in=<class 'numpy.float32'>, dtype_out=<class 'numpy.float64'>)#

Switches all initializers to numpy.float64. If model is None, a simple cast is done. Otherwise, the function assumes the model is a scikit-learn pipeline. This only works if the runtime is 'python'.

Parameters:
  • model – scikit-learn model or None

  • dtype_in – previous type

  • dtype_out – next type

Returns:

done operations

to_sequence(existing_functions=None)#

Produces a graph to facilitate the execution.

One example:

Convert ONNX into graph

An example of how to convert an ONNX graph into a graph.

<<<

import pprint
import numpy
from mlprodict.npy.xop import loadop
from mlprodict.onnxrt import OnnxInference

OnnxAiOnnxMlLinearRegressor = loadop(
    ('ai.onnx.ml', 'LinearRegressor'))

pars = dict(coefficients=numpy.array([1., 2.]),
            intercepts=numpy.array([1.]),
            post_transform='NONE')
onx = OnnxAiOnnxMlLinearRegressor(
    'X', output_names=['Y'], **pars)
model_def = onx.to_onnx(
    {'X': pars['coefficients'].astype(numpy.float32)},
    outputs={'Y': numpy.float32},
    target_opset=12)
oinf = OnnxInference(model_def)
pprint.pprint(oinf.to_sequence())

>>>

    {'attributes': {},
     'functions': {},
     'inits': {},
     'inputs': {'X': {'name': 'X',
                      'type': {'elem': 'float', 'kind': 'tensor', 'shape': (2,)}}},
     'intermediate': {'Y': None},
     'ir_version': 8,
     'nodes': {'_linearregressor': Onnx-LinearRegressor(X) -> Y    (name='_linearregressor')},
     'outputs': {'Y': {'name': 'Y',
                       'type': {'elem': 'float',
                                'kind': 'tensor',
                                'shape': ('?',)}}},
     'sequence': [Onnx-LinearRegressor(X) -> Y    (name='_linearregressor')],
     'statics': {},
     'targets': {'ai.onnx.ml': 1}}

See an example of representation in notebook ONNX visualization.

mlprodict.onnxrt.onnx_inference.iskeyword()#

x.__contains__(y) <==> y in x.