.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_investigate_pipeline.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_plot_investigate_pipeline.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_investigate_pipeline.py:


Investigate a pipeline
======================

The following example shows how to look inside a converted
model and find errors at each step of the pipeline.

.. contents::
    :local:

Create a pipeline
+++++++++++++++++

We reuse the pipeline implemented in example
`Pipelining: chaining a PCA and a logistic regression `_.
There is one change: the
`ONNX-ML Imputer `_
does not handle string types, so that step cannot be part of the
final ONNX pipeline and must be removed.
Look for the comment starting with ``---`` below.

.. GENERATED FROM PYTHON SOURCE LINES 28-54

.. code-block:: default


    import skl2onnx
    import onnx
    import sklearn
    import numpy
    import pickle
    from skl2onnx.helpers import collect_intermediate_steps
    import onnxruntime as rt
    from onnxconverter_common.data_types import FloatTensorType
    from skl2onnx import convert_sklearn
    import numpy as np
    import pandas as pd

    from sklearn import datasets
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    # Two-step pipeline: PCA followed by a logistic regression.
    pipe = Pipeline(steps=[('pca', PCA()),
                           ('logistic', LogisticRegression())])

    # Train on the first 1000 samples of the digits dataset.
    digits = datasets.load_digits()
    X_digits = digits.data[:1000]
    y_digits = digits.target[:1000]

    pipe.fit(X_digits, y_digits)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    somewhere/.local/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:458: ConvergenceWarning: lbfgs failed to converge (status=1):
    STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

    Increase the number of iterations (max_iter) or scale the data as shown in:
        https://scikit-learn.org/stable/modules/preprocessing.html
    Please also refer to the documentation for alternative solver options:
        https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
      n_iter_i = _check_optimize_result(


.. raw:: html

    Pipeline(steps=[('pca', PCA()), ('logistic', LogisticRegression())])

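
Printed arrays are hard to compare by eye. The sketch below shows one
possible way to measure the discrepancy between a scikit-learn output and
the matching ONNX output. The helper ``max_abs_diff`` and the tolerance
``1e-5`` are assumptions made for this illustration; they are not part of
*skl2onnx*. The sample values are copied from the ``predict_proba``
outputs printed in the next section.

.. code-block:: python

    import numpy as np


    def max_abs_diff(expected, got):
        """Return the largest absolute difference between two arrays.

        Both inputs are cast to float64 so that a float32 ONNX output
        can be compared with a float64 scikit-learn output.
        """
        expected = np.asarray(expected, dtype=np.float64)
        got = np.asarray(got, dtype=np.float64)
        if expected.shape != got.shape:
            raise ValueError(
                "shape mismatch: %r != %r" % (expected.shape, got.shape))
        return float(np.abs(expected - got).max())


    # Two probabilities taken from the scikit-learn and ONNX outputs
    # of this example (first row, columns 0 and 5).
    skl = np.array([[9.99998536e-01, 1.21314675e-06]])
    onx = np.array([[9.99998569e-01, 1.21315134e-06]], dtype=np.float32)

    diff = max_abs_diff(skl, onx)
    print("max abs diff:", diff)

    # Hypothetical tolerance for float32 computations; adjust as needed.
    assert diff < 1e-5

The same helper could be applied to every intermediate output to locate
the first step whose discrepancy exceeds the chosen tolerance.
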
.. GENERATED FROM PYTHON SOURCE LINES 55-57

Conversion to ONNX
++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 57-71

.. code-block:: default


    initial_types = [('input', FloatTensorType((None, X_digits.shape[1])))]
    model_onnx = convert_sklearn(pipe, initial_types=initial_types,
                                 target_opset=12)

    sess = rt.InferenceSession(model_onnx.SerializeToString())
    print("skl predict_proba")
    print(pipe.predict_proba(X_digits[:2]))
    # The second ONNX output holds the probabilities.
    onx_pred = sess.run(None, {'input': X_digits[:2].astype(np.float32)})[1]
    df = pd.DataFrame(onx_pred)
    print("onnx predict_proba")
    print(df.values)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    skl predict_proba
    [[9.99998536e-01 5.99063276e-19 3.48549017e-10 1.55765751e-08
      3.32559797e-10 1.21314675e-06 3.98959988e-08 1.22513856e-07
      2.23871277e-08 4.98148546e-08]
     [1.47648456e-14 9.99999301e-01 1.05811967e-10 7.49298733e-13
      2.48627429e-07 8.75686427e-12 5.39025146e-11 2.95899945e-11
      4.50528884e-07 1.30607495e-13]]
    onnx predict_proba
    [[9.99998569e-01 5.99060278e-19 3.48550383e-10 1.55766511e-08
      3.32561811e-10 1.21315134e-06 3.98961149e-08 1.22514820e-07
      2.23872494e-08 4.98151529e-08]
     [1.47648956e-14 9.99999285e-01 1.05811790e-10 7.49297488e-13
      2.48627401e-07 8.75685548e-12 5.39024415e-11 2.95900075e-11
      4.50528205e-07 1.30607344e-13]]


.. GENERATED FROM PYTHON SOURCE LINES 72-80

Intermediate steps
++++++++++++++++++

Let's imagine the final output is wrong and we need
to look into each component of the pipeline to find
which one is failing. The following method modifies
the scikit-learn pipeline to capture the intermediate
outputs and produces a smaller ONNX graph for every operator.

.. GENERATED FROM PYTHON SOURCE LINES 80-100 ..
code-block:: default steps = collect_intermediate_steps(pipe, "pipeline", initial_types) assert len(steps) == 2 pipe.predict_proba(X_digits[:2]) for i, step in enumerate(steps): onnx_step = step['onnx_step'] sess = rt.InferenceSession(onnx_step.SerializeToString()) onnx_outputs = sess.run(None, {'input': X_digits[:2].astype(np.float32)}) skl_outputs = step['model']._debug.outputs print("step 1", type(step['model'])) print("skl outputs") print(skl_outputs) print("onnx outputs") print(onnx_outputs) .. rst-class:: sphx-glr-script-out .. code-block:: none step 1 skl outputs {'transform': array([[-9.78697129e+00, 7.22639567e+00, -2.16935601e+01, 1.13765854e+01, -3.54566122e+00, -5.59543345e+00, 4.71459904e+00, 4.29410146e+00, -5.71520266e+00, -3.31533698e+00, -3.42040920e-01, 2.90474751e+00, -3.18177631e-01, -6.66363079e-01, -2.82714171e+00, 5.91632481e+00, -9.69544780e-01, 1.92676767e+00, 1.71450677e+00, 9.60454853e-01, 3.81570991e-01, -1.37130203e+00, 4.29353551e+00, 2.32392659e+00, 7.13256034e-01, 3.00982060e+00, -1.98303620e+00, -4.81811365e-01, -1.90930400e-01, 2.03950266e+00, 1.59803428e+00, -1.46831581e+00, -1.70903280e+00, 7.93109126e-02, -1.62244448e-01, 5.10619572e-02, -6.63308841e-01, 1.35869345e+00, -1.03930533e+00, 2.09485311e+00, 2.15669105e+00, -7.78040093e-02, -4.01347652e-02, 8.40159293e-01, -4.74891758e-01, -1.14564701e-01, -5.31817617e-02, -6.87010227e-01, -1.29090165e-01, 2.12032919e-01, 3.63901656e-01, -1.29285214e-01, -8.14384613e-02, -3.82919696e-02, -9.76885583e-03, -1.39046240e-02, 1.59100433e-03, -2.87444919e-03, 5.75119957e-03, 1.85595427e-03, -5.00911047e-03, -2.97522053e-16, -3.62721779e-16, -9.16970102e-16], [ 1.54267314e+01, -4.91291516e+00, 1.74676972e+01, -1.13960509e+01, 5.64555024e+00, -5.73696034e+00, -2.08026490e+00, 5.23721537e+00, 3.37859393e+00, 3.60754149e+00, 2.90967608e+00, -3.75628331e+00, -1.21238177e+00, -5.21796290e+00, -4.95051435e+00, -4.01835168e+00, -2.97046115e+00, -5.64772188e+00, 5.61898054e+00, -4.32016109e+00, 
1.97701819e+00, -3.39030059e+00, -5.67779351e-01, 6.70107684e-01, 6.31443589e+00, 8.65991552e-01, -1.58633137e-01, -3.52940090e+00, -6.81737794e-01, 2.47187038e+00, 1.21588602e+00, -2.22346979e+00, 1.37364649e+00, -1.79895009e+00, 3.03710592e+00, -2.63278986e+00, 3.68918985e+00, -6.08509461e-01, 2.45039011e-01, -6.63479061e-01, -1.50727140e+00, 1.10449110e+00, -4.58384385e-01, 3.40399894e-01, -2.67878895e-01, -1.87647893e+00, -2.04332870e-01, 4.61919057e-01, -2.44538953e-02, 8.66380644e-04, -7.56583008e-02, 1.91237218e-01, -4.73950435e-02, 2.74122911e-02, 4.32524378e-03, -3.66956686e-03, -1.88790754e-03, 5.22119207e-03, -1.86775268e-03, -5.07041881e-03, -1.70805502e-03, -1.36935166e-15, 1.87600612e-15, 2.24048193e-16]])} onnx outputs [array([[-9.78696918e+00, 7.22639418e+00, -2.16935596e+01, 1.13765869e+01, -3.54566169e+00, -5.59543419e+00, 4.71459913e+00, 4.29410172e+00, -5.71520233e+00, -3.31533766e+00, -3.42040956e-01, 2.90474820e+00, -3.18177521e-01, -6.66362762e-01, -2.82714176e+00, 5.91632557e+00, -9.69544411e-01, 1.92676878e+00, 1.71450722e+00, 9.60455000e-01, 3.81571233e-01, -1.37130189e+00, 4.29353571e+00, 2.32392645e+00, 7.13255644e-01, 3.00982046e+00, -1.98303699e+00, -4.81811345e-01, -1.90929428e-01, 2.03950262e+00, 1.59803391e+00, -1.46831608e+00, -1.70903313e+00, 7.93111920e-02, -1.62244320e-01, 5.10618053e-02, -6.63308740e-01, 1.35869300e+00, -1.03930485e+00, 2.09485340e+00, 2.15669107e+00, -7.78042451e-02, -4.01348546e-02, 8.40159178e-01, -4.74891752e-01, -1.14564836e-01, -5.31820133e-02, -6.87010109e-01, -1.29090160e-01, 2.12032914e-01, 3.63901585e-01, -1.29285201e-01, -8.14384818e-02, -3.82919535e-02, -9.76885855e-03, -1.39046190e-02, 1.59100478e-03, -2.87444773e-03, 5.75120095e-03, 1.85595371e-03, -5.00910869e-03, -2.97522440e-16, -3.62721478e-16, -9.16970234e-16], [ 1.54267330e+01, -4.91291523e+00, 1.74676952e+01, -1.13960505e+01, 5.64554977e+00, -5.73695993e+00, -2.08026505e+00, 5.23721552e+00, 3.37859440e+00, 3.60754132e+00, 2.90967607e+00, 
-3.75628376e+00, -1.21238267e+00, -5.21796274e+00, -4.95051432e+00, -4.01835108e+00, -2.97046089e+00, -5.64772320e+00, 5.61898041e+00, -4.32016087e+00, 1.97701883e+00, -3.39030123e+00, -5.67779183e-01, 6.70107961e-01, 6.31443739e+00, 8.65990698e-01, -1.58633396e-01, -3.52940130e+00, -6.81737602e-01, 2.47187066e+00, 1.21588552e+00, -2.22346997e+00, 1.37364626e+00, -1.79894996e+00, 3.03710651e+00, -2.63278961e+00, 3.68919039e+00, -6.08509362e-01, 2.45039523e-01, -6.63479626e-01, -1.50727117e+00, 1.10449147e+00, -4.58384544e-01, 3.40399921e-01, -2.67879218e-01, -1.87647831e+00, -2.04333454e-01, 4.61918980e-01, -2.44538486e-02, 8.66428018e-04, -7.56583139e-02, 1.91237226e-01, -4.73950393e-02, 2.74122953e-02, 4.32525249e-03, -3.66956624e-03, -1.88790716e-03, 5.22119366e-03, -1.86775310e-03, -5.07041905e-03, -1.70805631e-03, -1.36935253e-15, 1.87600603e-15, 2.24048182e-16]], dtype=float32)] step 1 skl outputs {'decision_function': array([[9.99998536e-01, 5.99063276e-19, 3.48549017e-10, 1.55765751e-08, 3.32559797e-10, 1.21314675e-06, 3.98959988e-08, 1.22513856e-07, 2.23871277e-08, 4.98148546e-08], [1.47648456e-14, 9.99999301e-01, 1.05811967e-10, 7.49298733e-13, 2.48627429e-07, 8.75686427e-12, 5.39025146e-11, 2.95899945e-11, 4.50528884e-07, 1.30607495e-13]]), 'predict_proba': array([[9.99998536e-01, 5.99063276e-19, 3.48549017e-10, 1.55765751e-08, 3.32559797e-10, 1.21314675e-06, 3.98959988e-08, 1.22513856e-07, 2.23871277e-08, 4.98148546e-08], [1.47648456e-14, 9.99999301e-01, 1.05811967e-10, 7.49298733e-13, 2.48627429e-07, 8.75686427e-12, 5.39025146e-11, 2.95899945e-11, 4.50528884e-07, 1.30607495e-13]])} onnx outputs [array([0, 1], dtype=int64), array([[9.9999857e-01, 5.9906028e-19, 3.4855038e-10, 1.5576651e-08, 3.3256181e-10, 1.2131513e-06, 3.9896115e-08, 1.2251482e-07, 2.2387249e-08, 4.9815153e-08], [1.4764896e-14, 9.9999928e-01, 1.0581179e-10, 7.4929749e-13, 2.4862740e-07, 8.7568555e-12, 5.3902442e-11, 2.9590008e-11, 4.5052820e-07, 1.3060734e-13]], dtype=float32)] .. 
GENERATED FROM PYTHON SOURCE LINES 101-108

Pickle
++++++

Each step is a separate model in the pipeline.
It can be pickled independently from the others.
The attribute *_debug* contains all the information
needed to *replay* the prediction of the model.

.. GENERATED FROM PYTHON SOURCE LINES 108-126

.. code-block:: default


    # Keep the model together with the inputs and outputs recorded
    # in *_debug* so that the prediction can be replayed later.
    to_save = {
        'model': steps[1]['model'],
        'data_input': steps[1]['model']._debug.inputs,
        'data_output': steps[1]['model']._debug.outputs,
        'inputs': steps[1]['inputs'],
        'outputs': steps[1]['outputs'],
    }
    # Remove the debugging attribute before pickling the model.
    del steps[1]['model']._debug

    with open('classifier.pkl', 'wb') as f:
        pickle.dump(to_save, f)

    with open('classifier.pkl', 'rb') as f:
        restored = pickle.load(f)

    print(restored['model'].predict_proba(restored['data_input']['predict_proba']))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[9.99998536e-01 5.99063276e-19 3.48549017e-10 1.55765751e-08
      3.32559797e-10 1.21314675e-06 3.98959988e-08 1.22513856e-07
      2.23871277e-08 4.98148546e-08]
     [1.47648456e-14 9.99999301e-01 1.05811967e-10 7.49298733e-13
      2.48627429e-07 8.75686427e-12 5.39025146e-11 2.95899945e-11
      4.50528884e-07 1.30607495e-13]]


.. GENERATED FROM PYTHON SOURCE LINES 127-128

**Versions used for this example**

.. GENERATED FROM PYTHON SOURCE LINES 128-134

.. code-block:: default


    print("numpy:", numpy.__version__)
    print("scikit-learn:", sklearn.__version__)
    print("onnx: ", onnx.__version__)
    print("onnxruntime: ", rt.__version__)
    print("skl2onnx: ", skl2onnx.__version__)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    numpy: 1.23.5
    scikit-learn: 1.2.2
    onnx:  1.13.1
    onnxruntime:  1.14.1
    skl2onnx:  1.14.0


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 0.913 seconds)


.. _sphx_glr_download_auto_examples_plot_investigate_pipeline.py:

.. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_investigate_pipeline.py ` ..
container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_investigate_pipeline.ipynb ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_