module sklapi.onnx_pipeline#
Short summary#
module mlprodict.sklapi.onnx_pipeline
A pipeline which serializes into ONNX step by step.
Classes#
class | truncated documentation |
---|---|
OnnxPipeline | The pipeline overwrites method fit, it trains and converts every step into ONNX before training the next step … |
Properties#
property | truncated documentation |
---|---|
_repr_html_ | HTML representation of estimator. This is redundant with the logic of _repr_mimebundle_. The latter should … |
classes_ | The classes labels. Only exist if the last step is a classifier. |
feature_names_in_ | Names of features seen during first step fit method. |
n_features_in_ | Number of features seen during first step fit method. |
named_steps | Access the steps by name. Read-only attribute to access any step by given name. Keys are steps names and … |
Methods#
method | truncated documentation |
---|---|
_preprocess_options | Preprocesses the options. |
_to_onnx | Converts a transformer into ONNX. |
fit | Fits the model: fits all the transforms one after the other and transforms the data, then fits the transformed … |
Documentation#
A pipeline which serializes into ONNX step by step.
- class mlprodict.sklapi.onnx_pipeline.OnnxPipeline(steps, *, memory=None, verbose=False, output_name=None, enforce_float32=True, runtime='python', options=None, white_op=None, black_op=None, final_types=None, op_version=None)#
Bases: Pipeline
The pipeline overwrites method fit: it trains and converts every step into ONNX before training the next step in order to minimize discrepancies. By default, ONNX uses float (single precision) and not double, which is the default for scikit-learn. This may introduce discrepancies when a non-continuous model (in the mathematical sense), such as a tree ensemble, is part of the pipeline.
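The effect of single precision on a non-continuous model can be illustrated with a minimal sketch that uses only the standard library. The helpers to_float32 and stump are hypothetical names introduced for this illustration and are not part of mlprodict; the stump stands in for a single node of a tree ensemble.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def stump(x: float, threshold: float) -> int:
    """One-node decision tree: a step function, hence non-continuous."""
    return 1 if x > threshold else 0


threshold = 0.1
x = 0.1 + 1e-12  # slightly above the threshold in double precision

# In double precision the input falls on the right side of the split.
pred_double = stump(x, threshold)

# After casting to float32, the input and the threshold round to the
# same value, so the comparison flips and the prediction changes.
pred_float = stump(to_float32(x), to_float32(threshold))

print(pred_double, pred_float)  # 1 0
```

A continuous model would only move its output by a tiny amount under the same cast, whereas a step function can jump to a different branch, which is why tree-based steps are the sensitive case.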
- Parameters:
steps – List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an estimator.
memory – str or object with the joblib.Memory interface, default=None. Used to cache the fitted transformers of the pipeline. By default, no caching is performed. If a string is given, it is the path to the caching directory. Enabling caching triggers a clone of the transformers before fitting. Therefore, the transformer instance given to the pipeline cannot be inspected directly. Use the attribute named_steps or steps to inspect estimators within the pipeline. Caching the transformers is advantageous when fitting is time consuming.
verbose – bool, default=False. If True, the time elapsed while fitting each step will be printed as it is completed.
output_name – string, the requested output name, or None to request all outputs and have method transform store all of them in a dataframe
enforce_float32 – boolean; onnxruntime only supports float32 whereas scikit-learn usually uses double floats, so this parameter ensures that every array of double floats is converted into single floats
runtime – string, defines the runtime to use as described in OnnxInference
options – see to_onnx
white_op – see to_onnx
black_op – see to_onnx
final_types – see to_onnx
op_version – ONNX targeted opset
The class stores the transformers, as they were before being converted into ONNX, in attribute raw_steps_.
See notebook Discrepencies with ONNX to see how the pipeline can be used to reduce discrepancies after conversion into ONNX.
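The step-by-step idea described for this class can be sketched without any ONNX dependency. The code below is not the implementation of OnnxPipeline: Center, Float32Copy and fit_step_by_step are hypothetical names, and the "conversion" is simulated by rounding to float32. It only shows why each step is converted before the next one is trained: the next step is fitted on the output of the converted step, not on the output of the original one.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


class Center:
    """Toy transformer: subtracts the mean learned during fit."""

    def fit(self, xs):
        self.mean_ = sum(xs) / len(xs)
        return self

    def transform(self, xs):
        return [x - self.mean_ for x in xs]


class Float32Copy:
    """Hypothetical stand-in for a converted step: same computation,
    but parameters, inputs and outputs are rounded to float32."""

    def __init__(self, fitted):
        self.mean_ = to_float32(fitted.mean_)

    def transform(self, xs):
        return [to_float32(to_float32(x) - self.mean_) for x in xs]


def fit_step_by_step(steps, xs):
    """Fit each step, convert it, then feed the converted output forward."""
    raw_steps, converted_steps = [], []
    data = xs
    for name, step in steps:
        step.fit(data)                      # train in double precision
        raw_steps.append((name, step))      # keep the original estimator
        converted = Float32Copy(step)       # simulated conversion
        converted_steps.append((name, converted))
        data = converted.transform(data)    # next step sees converted output
    return raw_steps, converted_steps, data


steps = [("c1", Center()), ("c2", Center())]
raw, converted, out = fit_step_by_step(steps, [0.1, 0.2, 0.3, 1.5])
```

In this sketch, raw plays the role that the documentation assigns to raw_steps_, and out is the data a final estimator would be trained on.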
- __abstractmethods__ = frozenset({})#
- __init__(steps, *, memory=None, verbose=False, output_name=None, enforce_float32=True, runtime='python', options=None, white_op=None, black_op=None, final_types=None, op_version=None)#
- _abc_impl = <_abc._abc_data object>#
- _fit(X, y=None, **fit_params_steps)#
- _preprocess_options(name, options)#
Preprocesses the options.
- Parameters:
name – option name
options – conversion options
- Returns:
new options
- _to_onnx(name, fitted_transformer, x_train, rewrite_ops=True, verbose=0)#
Converts a transformer into ONNX.
- Parameters:
name – model name
fitted_transformer – fitted transformer
x_train – training dataset
rewrite_ops – use rewritten converters
verbose – display some information
- Returns:
corresponding OnnxTransformer
- fit(X, y=None, **fit_params)#
Fits the model: fits all the transforms one after the other and transforms the data, then fits the transformed data using the final estimator.
- Parameters:
X – iterable Training data. Must fulfill input requirements of first step of the pipeline.
y – iterable, default=None Training targets. Must fulfill label requirements for all steps of the pipeline.
fit_params – dict of string -> object. Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.
- Returns:
self, Pipeline, this estimator
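The key convention used by fit_params (a step name, a double underscore, then the parameter name) can be illustrated with a small sketch. The helper route_fit_params is hypothetical and belongs neither to mlprodict nor to scikit-learn; it only shows how such prefixed keys could be split and grouped per step, assuming the step names scaler and clf.

```python
def route_fit_params(step_names, fit_params):
    """Group prefixed fit parameters by step name (illustrative only)."""
    routed = {name: {} for name in step_names}
    for key, value in fit_params.items():
        if "__" not in key:
            raise ValueError(f"Expected a key of the form step__param, got {key!r}")
        # Split on the first double underscore only, so that the
        # parameter name itself may contain further underscores.
        step, param = key.split("__", 1)
        if step not in routed:
            raise ValueError(f"Unknown step {step!r} in key {key!r}")
        routed[step][param] = value
    return routed


routed = route_fit_params(
    ["scaler", "clf"],
    {"clf__sample_weight": [1.0, 2.0]},
)
print(routed)  # {'scaler': {}, 'clf': {'sample_weight': [1.0, 2.0]}}
```

With this convention, a call such as fit(X, y, clf__sample_weight=w) passes sample_weight=w to the fit method of the step named clf and nothing to the other steps.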