# Python Runtime for ONNX

This runtime does not depend on scikit-learn. It only depends on numpy and scipy, and has custom implementations in C++ (cython, pybind11).

## Inference

The main class reads an ONNX file and computes predictions with a runtime implemented in Python. The ONNX model relies on the operators listed in *Python Runtime for ONNX operators*.

`mlprodict.onnxrt.OnnxInference`

(*self*, *onnx_or_bytes_or_stream*, *runtime* = None, *skip_run* = False, *inplace* = True, *input_inplace* = False, *ir_version* = None, *target_opset* = None, *runtime_options* = None, *session_options* = None, *inside_loop* = False, *static_inputs* = None, *new_outputs* = None, *new_opset* = None, *device* = None)

Loads an ONNX file or object or stream. Computes the output of the ONNX graph. Several runtimes are available.

`'python'`

: the runtime implements every onnx operator needed to run a scikit-learn model by using numpy or C++ code.

`'python_compiled'`

: the same runtime as the previous one, except every operator is called from a compiled function (`_build_compile_run`) instead of a method going through the list of operators

`'onnxruntime1'`

: uses onnxruntime

`'onnxruntime2'`

: this mode is mostly used for debugging: Python handles the call to every operator, but onnxruntime computes each of them. This process may fail due to wrong type inference, especially if the graph includes custom nodes. In that case, it is better to compute the output of intermediate nodes. It is much slower, since every node is computed for every output, but more robust.
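The difference between `'python'` and `'python_compiled'` can be pictured on a toy graph. The sketch below is not mlprodict's code: the graph format and the names `run_sequence` and `compile_sequence` are hypothetical and only serve the illustration. The first function goes through the list of operators at every call, the second one generates the source of a single function once and compiles it.

```python
import operator

# Hypothetical toy graph: each node is (output name, operator, input names).
NODES = [
    ("t1", "Add", ("x", "y")),
    ("t2", "Mul", ("t1", "y")),
    ("out", "Neg", ("t2",)),
]
OPS = {"Add": operator.add, "Mul": operator.mul, "Neg": operator.neg}


def run_sequence(nodes, inputs):
    """Interpreted mode: loops over the list of operators at every call."""
    values = dict(inputs)
    for out, op, args in nodes:
        values[out] = OPS[op](*(values[a] for a in args))
    return values


def compile_sequence(nodes, input_names, output_name):
    """Compiled mode: builds the source of one function and compiles it once."""
    lines = ["def compiled_run(%s):" % ", ".join(input_names)]
    for out, op, args in nodes:
        lines.append("    %s = OPS[%r](%s)" % (out, op, ", ".join(args)))
    lines.append("    return %s" % output_name)
    namespace = {"OPS": OPS}
    exec("\n".join(lines), namespace)
    return namespace["compiled_run"]


interpreted = run_sequence(NODES, {"x": 2.0, "y": 3.0})["out"]
compiled_run = compile_sequence(NODES, ["x", "y"], "out")
compiled = compiled_run(2.0, 3.0)
```

Both calls compute the same value. The compiled version avoids the loop and the dictionary lookups on result names at prediction time.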

`build_intermediate`

(*self*, *outputs* = None, *verbose* = 0, *overwrite_types* = None, *fLOG* = None)

Builds every possible ONNX file which computes one specific intermediate output from the inputs.

`check_model`

(*self*)

Checks that the model follows ONNX conventions.

`display_sequence`

(*self*, *verbose* = 1)

Shows the sequence of nodes to run if `runtime=='python'`.

`get_execution_order`

(*self*)

This function returns a dictionary `{(kind, name): (order, op)}`. *name* can be a node name or a result name. A result gets the execution order of the node which created it. The function returns None if the order is not available (the selected runtime does not return it). *kind* tells whether the key refers to a node (`'node'`) or to a result. If two nodes have the same name, the returned order is the last one. Initializers get an execution order equal to -1, inputs get 0, all other results are >= 1.
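The numbering rules of `get_execution_order` can be reproduced on a toy graph. This is a minimal sketch, not the method itself: the function `execution_order`, the tuple format of the nodes, and the kind `'res'` used for results are assumptions of the example, and *op* is reduced to the operator type.

```python
def execution_order(initializers, inputs, nodes):
    """Returns {(kind, name): (order, op)} for a toy graph.

    *nodes* is a list of (node name, op type, output names) in run order.
    The kind 'res' for results is an assumption of this sketch.
    """
    order = {}
    for name in initializers:
        order["res", name] = (-1, None)
    for name in inputs:
        order["res", name] = (0, None)
    for i, (node_name, op_type, outputs) in enumerate(nodes, start=1):
        # A later node with the same name overwrites the earlier one.
        order["node", node_name] = (i, op_type)
        for out in outputs:
            # A result gets the order of the node which created it.
            order["res", out] = (i, op_type)
    return order


toy_order = execution_order(
    initializers=["coef"],
    inputs=["X"],
    nodes=[("n_mul", "MatMul", ["XW"]), ("n_add", "Add", ["Y"])],
)
```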

`get_profiling`

(*self*, *as_df* = False)

Returns the profiling after a couple of executions.

`global_index`

(*self*, *name*)

Maps every name to one integer to avoid using dictionaries when running the predictions.
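The idea behind `global_index` can be sketched as follows. The code is an illustration under assumptions, not the library's implementation: `build_global_index`, `run_indexed`, and the toy graph are hypothetical. Names are resolved to integers once, and the predictions then only read and write a flat list.

```python
def build_global_index(names):
    """Maps every distinct name to an integer, in order of appearance."""
    index = {}
    for name in names:
        index.setdefault(name, len(index))
    return index


# Hypothetical toy graph: (output name, function, input names).
nodes = [
    ("s", lambda a, b: a + b, ("x", "y")),
    ("out", lambda a: a * 2, ("s",)),
]
index = build_global_index(["x", "y"] + [out for out, _, _ in nodes])

# Names are resolved to integers once, before any prediction.
indexed_nodes = [(index[out], fct, [index[a] for a in args])
                 for out, fct, args in nodes]
ix, iy, iout = index["x"], index["y"], index["out"]


def run_indexed(x, y):
    """Runs the toy graph with a flat list instead of a dictionary."""
    values = [None] * len(index)
    values[ix], values[iy] = x, y
    for out, fct, args in indexed_nodes:
        values[out] = fct(*[values[i] for i in args])
    return values[iout]


result = run_indexed(1, 2)
```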

`infer_shapes`

(*self*)

Computes expected shapes.

`infer_sizes`

(*self*, *inputs*, *context* = None)

Computes expected sizes.

`infer_types`

(*self*)

Computes expected types.

`reduce_size`

(*self*, *pickable* = False)

Reduces the memory footprint as much as possible.

`run`

(*self*, *inputs*, *clean_right_away* = False, *intermediate* = False, *verbose* = 0, *node_time* = False, *overwrite_types* = None, *yield_ops* = None, *fLOG* = None)

Computes the predictions for this ONNX graph.

`run2onnx`

(*self*, *inputs*, *verbose* = 0, *fLOG* = None, *as_parameter* = True, *suffix* = '_DBG', *param_name* = None, *node_type* = 'DEBUG', *domain* = 'DEBUG', *domain_opset* = 1)

Executes the graph with the given inputs, then adds the intermediate results into ONNX nodes in the original graph. Once saved, it can be inspected with a tool such as netron.

`shape_inference`

(*self*)

Infers the shape of the outputs with the onnx package.

`switch_initializers_dtype`

(*self*, *model* = None, *dtype_in* = <class 'numpy.float32'>, *dtype_out* = <class 'numpy.float64'>)

Switches all initializers to `numpy.float64`. If *model* is None, a simple cast is done. Otherwise, the function assumes the model is a scikit-learn pipeline. This only works if the runtime is `'python'`.

`to_sequence`

(*self*)

Produces a graph to facilitate the execution.

One example…

## Python to ONNX

`mlprodict.onnx_tools.onnx_grammar.translate_fct2onnx`

(*fct*, *context* = None, *cpl* = False, *context_cpl* = None, *output_names* = None, *dtype* = <class 'numpy.float32'>, *verbose* = 0, *fLOG* = None)

Translates a function into ONNX. The code it produces uses classes *OnnxAbs*, *OnnxAdd*, …
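The principle of such a translation can be illustrated with the standard module `ast`. The sketch below is not how `translate_fct2onnx` is implemented: the helper `to_onnx_expression` is hypothetical, it handles a single expression rather than a whole function, it only supports `+` and `abs`, and it returns text instead of building ONNX nodes.

```python
import ast


def to_onnx_expression(expression):
    """Rewrites a Python expression as nested Onnx* class calls (as text)."""
    def visit(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            return "OnnxAdd(%s, %s)" % (visit(node.left), visit(node.right))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id == "abs" and len(node.args) == 1):
            return "OnnxAbs(%s)" % visit(node.args[0])
        raise NotImplementedError(
            "unsupported syntax: %s" % type(node).__name__)

    tree = ast.parse(expression, mode="eval")
    return visit(tree.body)


translated = to_onnx_expression("abs(x) + y")
print(translated)
```

Any construction outside this tiny subset raises `NotImplementedError`, which keeps the sketch honest about what it does not cover.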

## ONNX Export

`mlprodict.onnxrt.onnx_inference_exports.OnnxInferenceExport`

(*self*, *oinf*)

Implements methods to export an instance of `OnnxInference` into json, dot, text, python.

## ONNX Structure

`mlprodict.onnx_tools.onnx_manipulations.enumerate_model_node_outputs`

(*model*, *add_node* = False, *order* = False)

Enumerates all the nodes of a model.

`mlprodict.onnx_tools.onnx_manipulations.select_model_inputs_outputs`

(*model*, *outputs* = None, *inputs* = None, *infer_shapes* = False, *overwrite* = None, *remove_unused* = True, *verbose* = 0, *fLOG* = None)

Takes a model and changes its outputs.
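Changing the outputs of a model usually means dropping the nodes which are no longer needed. The following sketch shows that pruning step on a toy graph. It is an illustration under assumptions: `select_outputs` and the tuple format are hypothetical and the function does not manipulate real ONNX objects.

```python
def select_outputs(nodes, outputs):
    """Keeps only the nodes needed to compute *outputs*.

    *nodes* is a list of (node name, input names, output names)
    in topological order.
    """
    needed = set(outputs)
    kept = []
    # Walks backwards so that a node is seen after every node consuming it.
    for name, inputs, outs in reversed(nodes):
        if needed & set(outs):
            kept.append((name, inputs, outs))
            needed.update(inputs)
    kept.reverse()
    return kept


toy_nodes = [
    ("scale", ["X"], ["Xs"]),
    ("proba", ["Xs"], ["probabilities"]),
    ("label", ["probabilities"], ["label"]),
]
reduced = select_outputs(toy_nodes, ["probabilities"])
```

Here the node `label` is removed because nothing it produces is required to compute `probabilities`.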

## onnxruntime

`mlprodict.onnxrt.onnx_inference_ort.device_to_providers`

`mlprodict.onnxrt.onnx_inference_ort.get_ort_device`

## Validation

`mlprodict.onnxrt.validate.enumerate_validated_operator_opsets`

(*verbose* = 0, *opset_min* = -1, *opset_max* = -1, *check_runtime* = True, *debug* = False, *runtime* = 'python', *models* = None, *dump_folder* = None, *store_models* = False, *benchmark* = False, *skip_models* = None, *assume_finite* = True, *node_time* = False, *fLOG* = <built-in function print>, *filter_exp* = None, *versions* = False, *extended_list* = False, *time_kwargs* = None, *dump_all* = False, *n_features* = None, *skip_long_test* = True, *fail_bad_results* = False, *filter_scenario* = None, *time_kwargs_fact* = None, *time_limit* = 4, *n_jobs* = None)

Tests all possible configurations for all possible operators and returns the results.

`mlprodict.onnxrt.validate.side_by_side.side_by_side_by_values`

(*sessions*, \*args, *inputs* = None, *return_results* = False, \*\*kwargs)

Compares the execution of two sessions. It calls method `OnnxInference.run` with value `intermediate=True` and compares the results.
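The comparison itself can be pictured as below. This is a simplified sketch, not the function's code: `side_by_side`, the tolerance, and the status labels are assumptions, and the intermediate results are plain floats where the real runtimes return arrays.

```python
def side_by_side(results_a, results_b, tolerance=1e-5):
    """Compares two dictionaries {result name: value} name by name."""
    rows = []
    for name in results_a:
        if name not in results_b:
            rows.append({"name": name, "status": "missing"})
            continue
        diff = abs(results_a[name] - results_b[name])
        status = "OK" if diff <= tolerance else "ERROR"
        rows.append({"name": name, "status": status, "abs_diff": diff})
    return rows


# Hypothetical intermediate results of two runtimes on the same input.
run_a = {"X": 1.0, "scaled": 0.5, "output": 0.25}
run_b = {"X": 1.0, "scaled": 0.5, "output": 0.75}
rows = side_by_side(run_a, run_b)
first_mismatch = next(r["name"] for r in rows if r["status"] != "OK")
```

Looking at the first mismatch points to the node where the two runtimes start to disagree.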

`mlprodict.onnxrt.validate.summary_report`

(*df*, *add_cols* = None, *add_index* = None)

Finalizes the results computed by function `enumerate_validated_operator_opsets`.

`mlprodict.onnxrt.validate.validate_graph.plot_validate_benchmark`

## C++ classes

**Gather**

`mlprodict.onnxrt.ops_cpu.op_gather_.GatherDouble`

(*self*, *arg0*)

Implements runtime for operator Gather. The code is inspired by tfidfvectorizer.cc in onnxruntime.

`mlprodict.onnxrt.ops_cpu.op_gather_.GatherFloat`

(*self*, *arg0*)

Implements runtime for operator Gather. The code is inspired by tfidfvectorizer.cc in onnxruntime.

`mlprodict.onnxrt.ops_cpu.op_gather_.GatherInt64`

(*self*, *arg0*)

Implements runtime for operator Gather. The code is inspired by tfidfvectorizer.cc in onnxruntime.

**ArrayFeatureExtractor**

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.array_feature_extractor_double`

(*arg0*, *arg1*)

array_feature_extractor_double(arg0: numpy.ndarray[numpy.float64], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.float64]

C++ implementation of operator ArrayFeatureExtractor for float64. The function only works with contiguous arrays.

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.array_feature_extractor_float`

(*arg0*, *arg1*)

array_feature_extractor_float(arg0: numpy.ndarray[numpy.float32], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.float32]

C++ implementation of operator ArrayFeatureExtractor for float32. The function only works with contiguous arrays.

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.array_feature_extractor_int64`

(*arg0*, *arg1*)

array_feature_extractor_int64(arg0: numpy.ndarray[numpy.int64], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.int64]

C++ implementation of operator ArrayFeatureExtractor for int64. The function only works with contiguous arrays.
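What these functions compute can be written in a few lines of pure Python for a 2D input. The sketch is only meant to show the semantics (selection of indices on the last axis). The function `array_feature_extractor` below is a hypothetical stand-in working on nested lists, not the C++ function.

```python
def array_feature_extractor(rows, indices):
    """Selects the columns *indices* on the last axis of a 2D list."""
    return [[row[i] for i in indices] for row in rows]


data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
extracted = array_feature_extractor(data, [2, 0])
```

With numpy arrays, a non-contiguous input (a transposed view for example) can be copied with `numpy.ascontiguousarray` before it is given to a function which requires contiguous arrays.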

**SVM**

`mlprodict.onnxrt.ops_cpu.op_svm_classifier_.RuntimeSVMClassifier`

`mlprodict.onnxrt.ops_cpu.op_svm_regressor_.RuntimeSVMRegressor`

**Tree Ensemble**

`mlprodict.onnxrt.ops_cpu.op_tree_ensemble_classifier_.RuntimeTreeEnsembleClassifierDouble`

(*self*)

Implements runtime for operator TreeEnsembleClassifier. The code is inspired by tree_ensemble_classifier.cc in onnxruntime. Supports double only.

`mlprodict.onnxrt.ops_cpu.op_tree_ensemble_classifier_.RuntimeTreeEnsembleClassifierFloat`

(*self*)

Implements runtime for operator TreeEnsembleClassifier. The code is inspired by tree_ensemble_classifier.cc in onnxruntime. Supports float only.

`mlprodict.onnxrt.ops_cpu.op_tree_ensemble_regressor_.RuntimeTreeEnsembleRegressorDouble`

(*self*)

Implements double runtime for operator TreeEnsembleRegressor. The code is inspired by tree_ensemble_regressor.cc in onnxruntime. Supports double only.

`mlprodict.onnxrt.ops_cpu.op_tree_ensemble_regressor_.RuntimeTreeEnsembleRegressorFloat`

(*self*)

Implements float runtime for operator TreeEnsembleRegressor. The code is inspired by tree_ensemble_regressor.cc in onnxruntime. Supports float only.

**Still tree ensembles but refactored.**

`mlprodict.onnxrt.ops_cpu.op_tree_ensemble_classifier_p_.RuntimeTreeEnsembleClassifierPDouble`

(*self*, *arg0*, *arg1*, *arg2*, *arg3*)

Implements double runtime for operator TreeEnsembleClassifier. The code is inspired by tree_ensemble_classifier.cc in onnxruntime. Supports double only.

`mlprodict.onnxrt.ops_cpu.op_tree_ensemble_classifier_p_.RuntimeTreeEnsembleClassifierPFloat`

(*self*, *arg0*, *arg1*, *arg2*, *arg3*)

Implements float runtime for operator TreeEnsembleClassifier. The code is inspired by tree_ensemble_classifier.cc in onnxruntime. Supports float only.

`mlprodict.onnxrt.ops_cpu.op_tree_ensemble_regressor_p_.RuntimeTreeEnsembleRegressorPDouble`

(*self*, *arg0*, *arg1*, *arg2*, *arg3*)

Implements double runtime for operator TreeEnsembleRegressor. The code is inspired by tree_ensemble_regressor.cc in onnxruntime. Supports double only.

`mlprodict.onnxrt.ops_cpu.op_tree_ensemble_regressor_p_.RuntimeTreeEnsembleRegressorPFloat`

(*self*, *arg0*, *arg1*, *arg2*, *arg3*)

Implements float runtime for operator TreeEnsembleRegressor. The code is inspired by tree_ensemble_regressor.cc in onnxruntime. Supports float only.

**Topk**

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_max_double`

(*arg0*, *arg1*, *arg2*, *arg3*)

topk_element_max_double(arg0: numpy.ndarray[numpy.float64], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]

C++ implementation of operator TopK for float64. The function only works with contiguous arrays. The function is parallelized for more than *th_para* rows. It only operates on the last axis.

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_max_float`

(*arg0*, *arg1*, *arg2*, *arg3*)

topk_element_max_float(arg0: numpy.ndarray[numpy.float32], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]

C++ implementation of operator TopK for float32. The function only works with contiguous arrays. The function is parallelized for more than *th_para* rows. It only operates on the last axis.

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_max_int64`

(*arg0*, *arg1*, *arg2*, *arg3*)

topk_element_max_int64(arg0: numpy.ndarray[numpy.int64], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]

C++ implementation of operator TopK for int64. The function only works with contiguous arrays. The function is parallelized for more than *th_para* rows. It only operates on the last axis.

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_min_double`

(*arg0*, *arg1*, *arg2*, *arg3*)

topk_element_min_double(arg0: numpy.ndarray[numpy.float64], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]

The function is parallelized for more than *th_para* rows. It only operates on the last axis.

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_min_float`

(*arg0*, *arg1*, *arg2*, *arg3*)

topk_element_min_float(arg0: numpy.ndarray[numpy.float32], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]

The function is parallelized for more than *th_para* rows. It only operates on the last axis.

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_min_int64`

(*arg0*, *arg1*, *arg2*, *arg3*)

topk_element_min_int64(arg0: numpy.ndarray[numpy.int64], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]

The function is parallelized for more than *th_para* rows. It only operates on the last axis.

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_fetch_double`

(*arg0*, *arg1*)

topk_element_fetch_double(arg0: numpy.ndarray[numpy.float64], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.float64]

Fetches the top k elements knowing their indices on each row (the last dimension for a multidimensional array).

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_fetch_float`

(*arg0*, *arg1*)

topk_element_fetch_float(arg0: numpy.ndarray[numpy.float32], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.float32]

Fetches the top k elements knowing their indices on each row (the last dimension for a multidimensional array).

`mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_fetch_int64`

(*arg0*, *arg1*)

topk_element_fetch_int64(arg0: numpy.ndarray[numpy.int64], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.int64]

Fetches the top k elements knowing their indices on each row (the last dimension for a multidimensional array).
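The two families of functions above work in two steps: the first one returns indices, the second one fetches the values at these indices. A pure Python sketch of that split is given below. The names `topk_indices_max` and `topk_fetch` are hypothetical, the inputs are nested lists, and the parallelization threshold is left out.

```python
import heapq


def topk_indices_max(rows, k):
    """Indices of the k largest values of each row, largest first."""
    return [heapq.nlargest(k, range(len(row)), key=row.__getitem__)
            for row in rows]


def topk_fetch(rows, indices):
    """Fetches the values of each row at the given indices."""
    return [[row[i] for i in idx] for row, idx in zip(rows, indices)]


scores = [[0.1, 0.7, 0.2], [0.9, 0.3, 0.5]]
top_idx = topk_indices_max(scores, 2)
top_val = topk_fetch(scores, top_idx)
```

Keeping the indices separate from the values lets the same index computation serve both outputs of operator TopK.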

## Shapes

The computation of the predictions through ONNX may
be optimized if the shape of every node is known. For example,
one possible optimisation is to do inplace computation whenever
possible, but this is only possible if the input and the output
have the same size. We could compute the predictions
for a sample and check the sizes are the same,
but that could be luck. We could also guess from a couple of samples
with different sizes and assume sizes are polynomial functions
of the input size. But on rare occasions, that could be luck too.
So one way of doing it is to implement a method
`_set_shape_inference_runtime` which works the same way as method
`_run_sequence_runtime` but handles shapes instead. The following class
tries to implement a way to keep track of shapes along the graph.

`mlprodict.onnxrt.shape_object.ShapeObject`

(*self*, *shape*, *dtype* = None, *use_n1* = False, *name* = None, *subtype* = None)

Handles mathematical operations around shapes. It stores a type (numpy type), and a name to somehow have an idea of where the shape comes from in the ONNX graph. The shape itself is defined by a list of `DimensionObject` or `ShapeOperator` or *None* if the shape is unknown. A dimension is an integer or a variable encoded as a string. This variable is a way to tell the dimension may vary.
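The reasoning on shapes can be sketched with plain tuples, where a dimension is an integer or a string naming a variable. This is a minimal illustration under assumptions, far simpler than `ShapeObject`: the helpers `broadcast_dim`, `elementwise_shape`, and `can_run_inplace` are hypothetical and only cover an elementwise operator on inputs of equal rank.

```python
def broadcast_dim(d1, d2):
    """Dimension of an elementwise result; a string is a variable dimension."""
    if d1 == d2:
        return d1
    if d1 == 1:
        return d2
    if d2 == 1:
        return d1
    if isinstance(d1, str) or isinstance(d2, str):
        return None  # unknown: depends on the values of the variables
    raise ValueError("incompatible dimensions %r and %r" % (d1, d2))


def elementwise_shape(shape1, shape2):
    """Shape of an elementwise operator (equal ranks only in this sketch)."""
    if len(shape1) != len(shape2):
        raise ValueError("this sketch only handles equal ranks")
    return tuple(broadcast_dim(a, b) for a, b in zip(shape1, shape2))


def can_run_inplace(input_shape, output_shape):
    """In-place is safe only if both shapes are known and identical."""
    return None not in output_shape and input_shape == output_shape


x_shape = ("n", 4)      # 'n' rows, 4 columns
bias_shape = (1, 4)
out_shape = elementwise_shape(x_shape, bias_shape)
```

The output can reuse the buffer of `x` whatever the value of `n`, which a check on a few samples could only suggest, not prove.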