module `sklapi.onnx_pipeline`#

Short summary#

module mlprodict.sklapi.onnx_pipeline

A pipeline which serializes into ONNX step by step.

Classes#

class	truncated documentation
`OnnxPipeline`	The pipeline overrides method fit: it trains and converts every step into ONNX before training the next step …

Properties#

property	truncated documentation
`_estimator_type`
`_final_estimator`
`_repr_html_`	HTML representation of estimator. This is redundant with the logic of _repr_mimebundle_. The latter should …
`classes_`	The class labels. Only exists if the last step is a classifier.
`feature_names_in_`	Names of features seen during first step fit method.
`n_features_in_`	Number of features seen during first step fit method.
`named_steps`	Access the steps by name. Read-only attribute to access any step by given name. Keys are steps names and …

Methods#

method	truncated documentation
`__init__`
`_fit`
`_preprocess_options`	Preprocesses the options.
`_to_onnx`	Converts a transformer into ONNX.
`fit`	Fits the model, fits all the transforms one after the other and transform the data, then fit the transformed …

Documentation#

A pipeline which serializes into ONNX step by step.

source on GitHub

class mlprodict.sklapi.onnx_pipeline.OnnxPipeline(steps, *, memory=None, verbose=False, output_name=None, enforce_float32=True, runtime='python', options=None, white_op=None, black_op=None, final_types=None, op_version=None)#

Bases: Pipeline

The pipeline overrides method fit: it trains every step and converts it into ONNX before training the next step, in order to minimize discrepancies. By default, ONNX uses float and not double, which is the default for scikit-learn. This may introduce discrepancies when a non-continuous model (in the mathematical sense), such as a tree ensemble, is part of the pipeline.
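The effect of float versus double on a non-continuous model can be illustrated with a minimal sketch that uses only the standard library. The helper `to_float32`, the function `stump`, and the numeric values are hypothetical and chosen for illustration; they are not part of mlprodict.

```python
import struct


def to_float32(x: float) -> float:
    """Round a double-precision Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def stump(value: float, threshold: float) -> int:
    """A one-node 'tree': a step function, hence non-continuous."""
    return 1 if value > threshold else 0


# Hypothetical split threshold learned in double precision.
threshold = 0.1
# A feature value lying just above the threshold in double precision.
x = 0.1 + 1e-12

# In double precision, x is strictly greater than the threshold.
pred_double = stump(x, threshold)

# After rounding both to float32, they collapse to the same value,
# so the comparison goes the other way.
pred_float = stump(to_float32(x), to_float32(threshold))

print(pred_double, pred_float)
```

A continuous model would only change its output slightly under such a rounding, whereas the step function jumps from one branch to the other.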

Parameters:

steps – List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an estimator.
memory – str or object with the joblib.Memory interface, default=None. Used to cache the fitted transformers of the pipeline. By default, no caching is performed. If a string is given, it is the path to the caching directory. Enabling caching triggers a clone of the transformers before fitting. Therefore, the transformer instance given to the pipeline cannot be inspected directly. Use the attribute named_steps or steps to inspect estimators within the pipeline. Caching the transformers is advantageous when fitting is time consuming.
verbose – bool, default=False. If True, the time elapsed while fitting each step will be printed as it is completed.
output_name – string, requested output name, or None to request all outputs and have method transform store all of them in a dataframe
enforce_float32 – boolean. onnxruntime only supports float32 whereas scikit-learn usually uses double floats; this parameter ensures that every array of double floats is converted into single floats
runtime – string, defines the runtime to use, as described in OnnxInference
options – see to_onnx
white_op – see to_onnx
black_op – see to_onnx
final_types – see to_onnx
op_version – ONNX targeted opset

The class stores the transformers, as they were before conversion into ONNX, in attribute raw_steps_.

See notebook Discrepencies with ONNX to see how it can be used to reduce discrepancies after conversion into ONNX.
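The train, convert, then train-the-next-step idea described above can be sketched with toy classes and the standard library only. This is not the mlprodict implementation: `Centering`, `Float32Centering`, `Stump` and `fit_step_by_step` are hypothetical names, and a float32 re-implementation stands in for the ONNX conversion.

```python
import struct


def f32(x: float) -> float:
    """Round a double-precision Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


class Centering:
    """Toy transformer (hypothetical): subtracts the training mean."""

    def fit(self, xs, ys=None):
        self.mean_ = sum(xs) / len(xs)
        return self

    def transform(self, xs):
        return [x - self.mean_ for x in xs]


class Float32Centering:
    """Stand-in for a converted step: same computation, done in float32."""

    def __init__(self, fitted):
        self.mean_ = f32(fitted.mean_)

    def transform(self, xs):
        return [f32(f32(x) - self.mean_) for x in xs]


class Stump:
    """Toy final estimator: a single threshold between the two classes."""

    def fit(self, xs, ys):
        low = max(x for x, y in zip(xs, ys) if y == 0)
        high = min(x for x, y in zip(xs, ys) if y == 1)
        self.threshold_ = (low + high) / 2
        return self

    def predict(self, xs):
        return [1 if x > self.threshold_ else 0 for x in xs]


def fit_step_by_step(steps, xs, ys):
    """Fit each transformer, convert it, then train the next step on the
    output of the converted version rather than of the original one."""
    raw_steps, fitted_steps, data = [], [], xs
    for name, transformer in steps[:-1]:
        fitted = transformer.fit(data, ys)
        raw_steps.append((name, fitted))        # kept as it was before conversion
        converted = Float32Centering(fitted)    # conversion stand-in
        fitted_steps.append((name, converted))
        data = converted.transform(data)        # what the next step is trained on
    final_name, final = steps[-1]
    fitted_steps.append((final_name, final.fit(data, ys)))
    return raw_steps, fitted_steps


raw_steps, fitted_steps = fit_step_by_step(
    [("center", Centering()), ("stump", Stump())],
    xs=[1.0, 2.0, 3.0, 4.0],
    ys=[0, 0, 1, 1],
)
```

Because the final estimator is trained on the output of the converted step, any threshold it learns already accounts for the float32 rounding, which is the motivation given for this class.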

__abstractmethods__ = frozenset({})#

__init__(steps, *, memory=None, verbose=False, output_name=None, enforce_float32=True, runtime='python', options=None, white_op=None, black_op=None, final_types=None, op_version=None)#

_abc_impl = <_abc._abc_data object>#

_fit(X, y=None, **fit_params_steps)#

_preprocess_options(name, options)#

Preprocesses the options.

Parameters:

name – option name
options – conversion options

Returns:

new options

_to_onnx(name, fitted_transformer, x_train, rewrite_ops=True, verbose=0)#

Converts a transformer into ONNX.

Parameters:

name – model name
fitted_transformer – fitted transformer
x_train – training dataset
rewrite_ops – use rewritten converters
verbose – display some information

Returns:

corresponding OnnxTransformer

fit(X, y=None, **fit_params)#

Fits the model: fits all the transforms one after the other and transforms the data, then fits the final estimator on the transformed data.

Parameters:

X – iterable. Training data. Must fulfill input requirements of the first step of the pipeline.
y – iterable, default=None. Training targets. Must fulfill label requirements for all steps of the pipeline.
fit_params – dict of string -> object. Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.

Returns:

self – Pipeline, this estimator
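The s__p naming convention for fit_params can be illustrated with a small sketch. The helper `split_fit_params` and the step names are hypothetical; this shows the convention only and is not the code scikit-learn or mlprodict uses.

```python
def split_fit_params(step_names, fit_params):
    """Group fit parameters by step, following the ``step__param`` key format."""
    routed = {name: {} for name in step_names}
    for key, value in fit_params.items():
        if "__" not in key:
            raise ValueError(f"Key {key!r} is not of the form 'step__param'.")
        # Split on the first double underscore only: the left part names
        # the step, the right part is the parameter given to its fit method.
        step, param = key.split("__", 1)
        if step not in routed:
            raise ValueError(f"Unknown step {step!r} in key {key!r}.")
        routed[step][param] = value
    return routed


routed = split_fit_params(
    ["scaler", "tree"],
    {"tree__sample_weight": [1.0, 2.0]},
)
print(routed)
```

Here the parameter sample_weight would reach only the fit method of the step named tree, while the step named scaler receives no extra parameter.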
