
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_convert_zipmap.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_plot_convert_zipmap.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_convert_zipmap.py:


.. _l-rf-example-zipmap:

Probabilities as a vector or as a ZipMap
========================================

A classifier usually returns a matrix of probabilities.
By default, *sklearn-onnx* converts that matrix
into a list of dictionaries in which each probability is mapped
to its class id or name. That mechanism retains
the class names, but the conversion increases the prediction
time and is not always needed. Let's see how to deactivate
this behaviour on the Iris example.

.. contents::
    :local:

Train a model and convert it
++++++++++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 25-48

.. code-block:: default


    from timeit import repeat
    import numpy
    import sklearn
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    import onnxruntime as rt
    import onnx
    import skl2onnx
    from skl2onnx.common.data_types import FloatTensorType
    from skl2onnx import convert_sklearn
    from sklearn.linear_model import LogisticRegression

    iris = load_iris()
    X, y = iris.data, iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    clr = LogisticRegression(max_iter=500)
    clr.fit(X_train, y_train)
    print(clr)

    # The model takes a float tensor with 4 features and any number of rows.
    initial_type = [('float_input', FloatTensorType([None, 4]))]
    onx = convert_sklearn(clr, initial_types=initial_type,
                          target_opset=12)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    LogisticRegression(max_iter=500)


.. GENERATED FROM PYTHON SOURCE LINES 49-54

Output type
+++++++++++

Let's confirm with onnxruntime that the probabilities
are returned as a list of dictionaries.

.. GENERATED FROM PYTHON SOURCE LINES 54-61

.. code-block:: default


    sess = rt.InferenceSession(onx.SerializeToString())
    res = sess.run(None, {'float_input': X_test.astype(numpy.float32)})
    # res[0] holds the labels, res[1] the probabilities.
    print(res[1][:2])
    print("probabilities type:", type(res[1]))
    print("type for the first observations:", type(res[1][0]))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [{0: 0.013476976193487644, 1: 0.6533272862434387, 2: 0.3331957757472992}, {0: 0.00026752299163490534, 1: 0.14112624526023865, 2: 0.8586062788963318}]
    probabilities type: <class 'list'>
    type for the first observations: <class 'dict'>


.. GENERATED FROM PYTHON SOURCE LINES 62-66

Without ZipMap
++++++++++++++

Let's remove the ZipMap operator.

.. GENERATED FROM PYTHON SOURCE LINES 66-78

.. code-block:: default


    initial_type = [('float_input', FloatTensorType([None, 4]))]
    options = {id(clr): {'zipmap': False}}
    onx2 = convert_sklearn(clr, initial_types=initial_type, options=options,
                           target_opset=12)

    sess2 = rt.InferenceSession(onx2.SerializeToString())
    res2 = sess2.run(None, {'float_input': X_test.astype(numpy.float32)})
    print(res2[1][:2])
    print("probabilities type:", type(res2[1]))
    print("type for the first observations:", type(res2[1][0]))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[1.3476976e-02 6.5332729e-01 3.3319578e-01]
     [2.6752299e-04 1.4112625e-01 8.5860628e-01]]
    probabilities type: <class 'numpy.ndarray'>
    type for the first observations: <class 'numpy.ndarray'>


.. GENERATED FROM PYTHON SOURCE LINES 79-85

One output per class
++++++++++++++++++++

This option removes the final ZipMap operator and splits
the probabilities into columns. The final model produces
one output for the label and one output per class.

.. GENERATED FROM PYTHON SOURCE LINES 85-97

.. code-block:: default


    options = {id(clr): {'zipmap': 'columns'}}
    onx3 = convert_sklearn(clr, initial_types=initial_type, options=options,
                           target_opset=12)

    sess3 = rt.InferenceSession(onx3.SerializeToString())
    res3 = sess3.run(None, {'float_input': X_test.astype(numpy.float32)})
    for i, out in enumerate(sess3.get_outputs()):
        print("output: '{}' shape={} values={}...".format(
            out.name, res3[i].shape, res3[i][:2]))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    output: 'output_label' shape=(38,) values=[1 2]...
    output: 'i0' shape=(38,) values=[0.01347698 0.00026752]...
    output: 'i1' shape=(38,) values=[0.6533273  0.14112625]...
    output: 'i2' shape=(38,) values=[0.33319578 0.8586063 ]...


.. GENERATED FROM PYTHON SOURCE LINES 98-100

Let's compare prediction time
+++++++++++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 100-122

.. code-block:: default


    X32 = X_test.astype(numpy.float32)

    print("Time with ZipMap:")
    print(repeat(lambda: sess.run(None, {'float_input': X32}),
                 number=100, repeat=10))

    print("Time without ZipMap:")
    print(repeat(lambda: sess2.run(None, {'float_input': X32}),
                 number=100, repeat=10))

    print("Time without ZipMap but with columns:")
    print(repeat(lambda: sess3.run(None, {'float_input': X32}),
                 number=100, repeat=10))

    # The prediction is much faster without ZipMap
    # on this example.
    # The gain is even larger when the classes
    # are described with strings rather than integers,
    # because the final result (a list of dictionaries) may copy
    # the same information many times with onnxruntime.


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Time with ZipMap:
    [0.0320822442881763, 0.03609319170936942, 0.019995237234979868, 0.019970526918768883, 0.01996899675577879, 0.019931857008486986, 0.01996868709102273, 0.019934366922825575, 0.019932357594370842, 0.019963126629590988]
    Time without ZipMap:
    [0.012162216939032078, 0.012008977122604847, 0.012015177868306637, 0.011998637113720179, 0.012007178273051977, 0.011997698340564966, 0.011995207984000444, 0.012001477647572756, 0.011993267107754946, 0.012001668103039265]
    Time without ZipMap but with columns:
    [0.018674788996577263, 0.018497281707823277, 0.018507812172174454, 0.018481571692973375, 0.018697800114750862, 0.018521700985729694, 0.018493331968784332, 0.018508970737457275, 0.018511012196540833, 0.018496121745556593]


.. GENERATED FROM PYTHON SOURCE LINES 123-124

**Versions used for this example**

.. GENERATED FROM PYTHON SOURCE LINES 124-130

.. code-block:: default


    print("numpy:", numpy.__version__)
    print("scikit-learn:", sklearn.__version__)
    print("onnx: ", onnx.__version__)
    print("onnxruntime: ", rt.__version__)
    print("skl2onnx: ", skl2onnx.__version__)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    numpy: 1.23.5
    scikit-learn: 1.2.2
    onnx:  1.13.1
    onnxruntime:  1.14.1
    skl2onnx:  1.14.0


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.869 seconds)


.. _sphx_glr_download_auto_examples_plot_convert_zipmap.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_convert_zipmap.py <plot_convert_zipmap.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_convert_zipmap.ipynb <plot_convert_zipmap.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
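
When the dictionary form is needed only occasionally, it can be rebuilt
on the Python side from the raw probability matrix instead of inside the
ONNX graph. The following is a minimal sketch using plain NumPy.
``probas_to_dicts`` is a hypothetical helper and not part of *sklearn-onnx*,
and the sample values are rounded from the outputs printed earlier in
this example.

.. code-block:: python

    import numpy


    def probas_to_dicts(probas, classes):
        """Map each row of a probability matrix to a {class: probability} dict.

        Hypothetical helper written for illustration only.
        """
        return [dict(zip(classes, (float(p) for p in row))) for row in probas]


    # Two rows rounded from the output obtained with 'zipmap': False.
    probas = numpy.array([[0.0135, 0.6533, 0.3332],
                          [0.0003, 0.1411, 0.8586]], dtype=numpy.float32)
    # Assumption: for the fitted model these would come from clr.classes_.
    classes = [0, 1, 2]

    as_dicts = probas_to_dicts(probas, classes)
    print(as_dicts[0])

    # With 'zipmap': 'columns', each class arrives as a separate 1-D output.
    # The columns can be stacked back into one (n_samples, n_classes) matrix.
    columns = [probas[:, i] for i in range(probas.shape[1])]
    stacked = numpy.column_stack(columns)
    print(stacked.shape)

Doing the mapping in Python keeps the converted model's output a plain
tensor, so the dictionary cost is paid only by callers that actually need
the class names.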