scikit-learn API

This is the main class that makes it easy to use the predictions of an ONNX file in a scikit-learn pipeline.


mlprodict.sklapi.OnnxTransformer (self, onnx_bytes, output_name = None, enforce_float32 = True, runtime = 'python', change_batch_size = None, reshape = False)

Calls onnxruntime or the runtime implemented in this package to transform the input based on an ONNX graph. It follows the scikit-learn API so that it can be included in a scikit-learn pipeline. See the notebook Transfer Learning with ONNX for an example.

enumerate_create (onnx_bytes, output_names = None, enforce_float32 = True)

Creates multiple OnnxTransformer instances, one for each requested intermediate node.

onnx_bytes : bytes

output_names : string

requested output names, or None to request all of them and have the transform method store them all in a dataframe

enforce_float32 : boolean

onnxruntime only supports float32 whereas scikit-learn usually uses double floats; this parameter ensures that every array of double floats is converted into single floats

fit (self, X = None, y = None, **fit_params)

Loads the ONNX model.

fit_transform (self, X, y = None, **inputs)

Loads the ONNX model and runs the predictions.

onnx_converter (self)

Returns a converter for this model. If not overloaded, it fetches the converter mapped to the first scikit-learn parent it can find.

onnx_parser (self, scope = None, inputs = None)

Returns a parser for this model.

onnx_shape_calculator (self)

transform (self, X, y = None, **inputs)

Runs the predictions. If X is a dataframe, the function assumes every column is a separate input; otherwise, X is considered the first input and inputs can be used to specify extra inputs.


mlprodict.sklapi.OnnxPipeline (self, steps, memory = None, verbose = False, output_name = None, enforce_float32 = True, runtime = 'python', options = None, white_op = None, black_op = None, final_types = None, op_version = None)

The pipeline overrides the fit method: it trains and converts every step into ONNX before training the next step in order to minimize discrepancies. By default, ONNX uses floats (single precision) and not doubles, which are the default for scikit-learn. This may introduce discrepancies when a non-continuous model (in the mathematical sense), such as a tree ensemble, is part of the pipeline.
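The float versus double discrepancy can be reproduced without any ONNX runtime. The sketch below is an illustration only, not code from this package: the threshold and the feature value are hypothetical, and to_float32 is a helper introduced here that rounds a double to single precision with the standard struct module. It shows that a comparison against a threshold, the kind of test a tree node performs, can give a different answer once both values are rounded to float32.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


# Hypothetical split threshold of a tree node, held as a double.
threshold = 0.1
# Hypothetical feature value, slightly above the threshold in double precision.
x = 0.1 + 1e-9

# Decision taken in double precision.
goes_left_double = x <= threshold

# Decision taken after both values are rounded to single precision.
# The gap of 1e-9 is smaller than the float32 spacing near 0.1,
# so both values collapse to the same float32 number.
goes_left_float = to_float32(x) <= to_float32(threshold)

print(goes_left_double)  # False
print(goes_left_float)   # True
```

A continuous model would only shift its output slightly under such a rounding, whereas a tree sends the observation to a different branch, which is why the discrepancy can be large.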

fit (self, X, y = None, **fit_params)

Fits the model: fits all the transforms one after the other and transforms the data, then fits the final estimator on the transformed data.
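The order of operations described for fit can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the implementation of OnnxPipeline: Center, MeanPredictor and fit_steps are hypothetical names, and the convert argument merely stands in for the ONNX conversion of a fitted step (by default it returns the step unchanged).

```python
class Center:
    """Hypothetical transform: subtracts the mean learned during fit."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        return [x - self.mean_ for x in X]


class MeanPredictor:
    """Hypothetical final estimator: predicts the mean of the targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def fit_steps(steps, X, y=None, convert=lambda step: step):
    """Fit every transform in order, then fit the final estimator."""
    *transforms, final = steps
    data = X
    for step in transforms:
        step.fit(data, y)
        # The converted step, not the original one, produces the data
        # seen by the next step, so later steps train on what the
        # converted model would actually output.
        data = convert(step).transform(data)
    final.fit(data, y)
    return data


X = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
center, estimator = Center(), MeanPredictor()
transformed = fit_steps([center, estimator], X, y)
print(transformed)      # [-1.0, 0.0, 1.0]
print(estimator.mean_)  # 4.0
```

Feeding each later step with the output of the converted step is what lets a pipeline of this kind absorb small numerical differences during training rather than at prediction time.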