Probabilities or raw scores

A classifier usually returns a matrix of probabilities. By default, sklearn-onnx creates an ONNX graph that returns probabilities as well. If the model implements the method decision_function, the converter can skip that last step and return raw scores instead. The option 'raw_scores' switches from the default behaviour to raw scores. Let’s see that on a simple example.

Train a model and convert it

import numpy
import sklearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import onnxruntime as rt
import onnx
import skl2onnx
from skl2onnx.common.data_types import FloatTensorType
from skl2onnx import convert_sklearn
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
clr = LogisticRegression(max_iter=500)
clr.fit(X_train, y_train)
print(clr)

initial_type = [('float_input', FloatTensorType([None, 4]))]
onx = convert_sklearn(clr, initial_types=initial_type,
                      target_opset=12)

LogisticRegression(max_iter=500)

Output type

Let’s use onnxruntime to confirm that the probabilities are returned as a list of dictionaries, one per sample, each mapping a class label to its probability.

sess = rt.InferenceSession(onx.SerializeToString(),
                           providers=["CPUExecutionProvider"])
res = sess.run(None, {'float_input': X_test.astype(numpy.float32)})
print("skl", clr.predict_proba(X_test[:1]))
print("onnx", res[1][:2])

skl [[9.67640110e-01 3.23597766e-02 1.13857252e-07]]
onnx [{0: 0.9676401615142822, 1: 0.0323597677052021, 2: 1.1385726850221545e-07}, {0: 0.9700090885162354, 1: 0.029990874230861664, 2: 7.182367056657313e-08}]
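The list of dictionaries is easy to read but awkward to compare with the array returned by predict_proba. Below is a minimal sketch, in plain Python, of turning it back into one row per sample. The helper name `dicts_to_matrix` is hypothetical and not part of onnxruntime or sklearn-onnx, and the input is copied from the output printed above.

```python
def dicts_to_matrix(rows):
    """Convert a list of {class_label: value} dicts into a list of rows.

    Hypothetical helper, not part of onnxruntime or sklearn-onnx.
    Columns follow the sorted class labels of the first dictionary.
    """
    labels = sorted(rows[0])
    return labels, [[row[label] for label in labels] for row in rows]


# The two dictionaries printed by the example above.
onnx_probas = [
    {0: 0.9676401615142822, 1: 0.0323597677052021, 2: 1.1385726850221545e-07},
    {0: 0.9700090885162354, 1: 0.029990874230861664, 2: 7.182367056657313e-08},
]

labels, matrix = dicts_to_matrix(onnx_probas)
print(labels)
for row in matrix:
    print(row)
```

The resulting rows can be wrapped with numpy.array if an array is needed for comparison with scikit-learn.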

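Raw scores and decision_function

With the option 'raw_scores' set to True for this model, the converted graph returns the output of decision_function instead of probabilities.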

initial_type = [('float_input', FloatTensorType([None, 4]))]
options = {id(clr): {'raw_scores': True}}
onx2 = convert_sklearn(clr, initial_types=initial_type, options=options,
                       target_opset=12)

sess2 = rt.InferenceSession(onx2.SerializeToString(),
                            providers=["CPUExecutionProvider"])
res2 = sess2.run(None, {'float_input': X_test.astype(numpy.float32)})
print("skl", clr.decision_function(X_test[:1]))
print("onnx", res2[1][:2])

skl [[ 6.45112312  3.05317907 -9.50430219]]
onnx [{0: 6.451123237609863, 1: 3.0531787872314453, 2: -9.504302024841309}, {0: 6.631670951843262, 1: 3.1552586555480957, 2: -9.786930084228516}]
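For the multiclass logistic regression fitted here, the probabilities printed earlier appear to be the softmax of these raw scores, which would be the step skipped when 'raw_scores' is enabled. The sketch below checks this with the standard library only, using the first row of each output printed above. The `softmax` helper is written for illustration and is not taken from sklearn-onnx; other classifiers may map scores to probabilities differently.

```python
import math


def softmax(scores):
    """Numerically stable softmax of a list of raw scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# First row of decision_function printed above.
raw_scores = [6.45112312, 3.05317907, -9.50430219]
# First row of predict_proba printed in the previous section.
expected = [9.67640110e-01, 3.23597766e-02, 1.13857252e-07]

probas = softmax(raw_scores)
for p, e in zip(probas, expected):
    # Compare each recomputed probability with the printed one.
    print(p, e, math.isclose(p, e, rel_tol=1e-4))
```

Because softmax preserves the ordering of its inputs, the highest raw score and the highest probability point to the same class.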

Versions used for this example

print("numpy:", numpy.__version__)
print("scikit-learn:", sklearn.__version__)
print("onnx: ", onnx.__version__)
print("onnxruntime: ", rt.__version__)
print("skl2onnx: ", skl2onnx.__version__)

numpy: 1.23.5
scikit-learn: 1.2.2
onnx:  1.13.1
onnxruntime:  1.14.1
skl2onnx:  1.14.0
