module sklapi.onnx_pipeline#

Inheritance diagram of mlprodict.sklapi.onnx_pipeline

Short summary#

module mlprodict.sklapi.onnx_pipeline

A pipeline which serializes into ONNX step by step.

source on GitHub

Classes#

class – truncated documentation

  • OnnxPipeline – The pipeline overrides method fit, it trains and converts every step into ONNX before training the next step …

Properties#

property – truncated documentation

  • _estimator_type

  • _final_estimator

  • _repr_html_ – HTML representation of estimator. This is redundant with the logic of _repr_mimebundle_. The latter should …

  • classes_ – The class labels. Only exists if the last step is a classifier.

  • feature_names_in_ – Names of features seen during first step fit method.

  • n_features_in_ – Number of features seen during first step fit method.

  • named_steps – Access the steps by name. Read-only attribute to access any step by given name. Keys are step names and …

Methods#

method – truncated documentation

  • __init__

  • _fit

  • _preprocess_options – Preprocesses the options.

  • _to_onnx – Converts a transformer into ONNX.

  • fit – Fits the model, fits all the transforms one after the other and transforms the data, then fits the transformed …

Documentation#

A pipeline which serializes into ONNX step by step.

class mlprodict.sklapi.onnx_pipeline.OnnxPipeline(steps, *, memory=None, verbose=False, output_name=None, enforce_float32=True, runtime='python', options=None, white_op=None, black_op=None, final_types=None, op_version=None)#

Bases: Pipeline

The pipeline overrides method fit: it trains each step and converts it into ONNX before training the next step, in order to minimize discrepancies. By default, ONNX uses single-precision floats (float32), whereas scikit-learn defaults to double precision. This may introduce discrepancies when a non-continuous model (in the mathematical sense), such as a tree ensemble, is part of the pipeline.
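The effect of single versus double precision on a non-continuous model can be illustrated with a minimal sketch. It does not use mlprodict or ONNX: the helper to_float32 and the one-node "tree" stump are hypothetical names introduced only for this illustration, and the values are chosen by hand to sit close to the threshold.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def stump(x: float, threshold: float) -> int:
    """Hypothetical one-node decision tree: a non-continuous function of x."""
    return 1 if x <= threshold else 0


# Threshold as it would be learned in double precision.
threshold = 0.1
# An input slightly above the threshold in double precision.
x64 = 0.1 + 1e-9

# The same two numbers once rounded to float32.
t32 = to_float32(threshold)
x32 = to_float32(x64)

print("double :", stump(x64, threshold))  # x64 is strictly above the threshold
print("float32:", stump(x32, t32))        # both values round to the same float32
```

In double precision the input falls on one side of the threshold; after rounding to float32 the input and the threshold become equal, so the input falls on the other side. A continuous model would only change its output slightly in this situation, whereas a tree can switch to a different leaf.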

Parameters:
  • steps – List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an estimator.

  • memory – str or object with the joblib.Memory interface, default=None. Used to cache the fitted transformers of the pipeline. By default, no caching is performed. If a string is given, it is the path to the caching directory. Enabling caching triggers a clone of the transformers before fitting. Therefore, the transformer instance given to the pipeline cannot be inspected directly. Use the attribute named_steps or steps to inspect estimators within the pipeline. Caching the transformers is advantageous when fitting is time consuming.

  • verbose – bool, default=False. If True, the time elapsed while fitting each step will be printed as it is completed.

  • output_name – string, requested output name, or None to request all outputs and have method transform store all of them in a dataframe

  • enforce_float32 – boolean; onnxruntime only supports float32 while scikit-learn usually uses doubles, so this parameter ensures that every array of doubles is converted into single-precision floats

  • runtime – string, defines the runtime to use, as described in OnnxInference

  • options – see to_onnx

  • white_op – see to_onnx

  • black_op – see to_onnx

  • final_types – see to_onnx

  • op_version – ONNX targeted opset

The class stores the transformers, as they were before being converted into ONNX, in attribute raw_steps_.

See notebook Discrepencies with ONNX to see how the pipeline can be used to reduce discrepancies after a model is converted into ONNX.

__abstractmethods__ = frozenset({})#
__init__(steps, *, memory=None, verbose=False, output_name=None, enforce_float32=True, runtime='python', options=None, white_op=None, black_op=None, final_types=None, op_version=None)#
_abc_impl = <_abc._abc_data object>#
_fit(X, y=None, **fit_params_steps)#
_preprocess_options(name, options)#

Preprocesses the options.

Parameters:
  • name – option name

  • options – conversion options

Returns:

new options

_to_onnx(name, fitted_transformer, x_train, rewrite_ops=True, verbose=0)#

Converts a transformer into ONNX.

Parameters:
  • name – model name

  • fitted_transformer – fitted transformer

  • x_train – training dataset

  • rewrite_ops – use rewritten converters

  • verbose – display some information

Returns:

corresponding OnnxTransformer

fit(X, y=None, **fit_params)#

Fits the model: fits all the transforms one after the other and transforms the data, then fits the final estimator on the transformed data.
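The fit-then-convert loop described for this class can be sketched in plain Python. This is only an illustration under stated assumptions, not the actual implementation: Scaler, Float32Step, Recorder and fit_step_by_step are hypothetical names, and Float32Step merely rounds to float32 as a stand-in for a step converted into ONNX.

```python
import struct


def to_float32(values):
    """Round each double to the nearest float32 value."""
    return [struct.unpack("f", struct.pack("f", v))[0] for v in values]


class Scaler:
    """Hypothetical transformer: divides by the largest absolute value."""

    def fit(self, X, y=None):
        self.scale_ = max(abs(v) for v in X)
        return self

    def transform(self, X):
        return [v / self.scale_ for v in X]


class Float32Step:
    """Hypothetical stand-in for a converted step: same transform in float32."""

    def __init__(self, fitted):
        self.fitted = fitted

    def transform(self, X):
        return to_float32(self.fitted.transform(to_float32(X)))


class Recorder:
    """Hypothetical final estimator that only records its training data."""

    def fit(self, X, y=None):
        self.seen_ = list(X)
        return self


def fit_step_by_step(steps, X, y=None):
    """Fit each transformer, convert it, then feed the converted output forward."""
    raw_steps, fitted_steps = [], []
    data = X
    for name, transformer in steps[:-1]:
        transformer.fit(data, y)
        raw_steps.append((name, transformer))   # original transformer is kept
        converted = Float32Step(transformer)    # assumption: mimics the conversion
        fitted_steps.append((name, converted))
        data = converted.transform(data)        # next step sees the converted output
    final_name, final_estimator = steps[-1]
    final_estimator.fit(data, y)
    fitted_steps.append((final_name, final_estimator))
    return raw_steps, fitted_steps


steps = [("scale", Scaler()), ("final", Recorder())]
raw_steps, fitted_steps = fit_step_by_step(steps, [0.1, 0.2, 0.3])
seen = fitted_steps[-1][1].seen_
print(seen)
```

The point of the design is that the final estimator is trained on the values the converted steps will produce at prediction time, rather than on the double-precision output of the original transformers.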

Parameters:
  • X – iterable. Training data. Must fulfill input requirements of first step of the pipeline.

  • y – iterable, default=None. Training targets. Must fulfill label requirements for all steps of the pipeline.

  • fit_params – dict of string -> object. Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.

Returns:

self, Pipeline, this estimator
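The s__p naming convention for fit_params can be illustrated with a short sketch. The helper split_fit_params is a hypothetical name written for this illustration and is not part of mlprodict or scikit-learn; it only shows how prefixed keys map to per-step parameters.

```python
def split_fit_params(step_names, fit_params):
    """Group keys of the form 'step__param' by step name."""
    per_step = {name: {} for name in step_names}
    for key, value in fit_params.items():
        # Split on the first double underscore only.
        step, sep, param = key.partition("__")
        if not sep or step not in per_step:
            raise ValueError(f"Unexpected fit parameter key: {key!r}")
        per_step[step][param] = value
    return per_step


routed = split_fit_params(
    ["scaler", "clf"],
    {"clf__sample_weight": [1.0, 2.0]},
)
print(routed)
```

Here the key clf__sample_weight is routed to the step named clf as the parameter sample_weight, while the step scaler receives no extra parameters.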
