.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_benchmark_cdist.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_benchmark_cdist.py:

.. _l-benchmark-cdist:

Compare CDist with scipy
========================

The following example focuses on one particular operator,
CDist, and compares its execution time between
*onnxruntime* and *scipy*.

.. contents::
    :local:

ONNX Graph with CDist
+++++++++++++++++++++

The `cdist `_ function computes the pairwise distances
between two sets of observations.

.. GENERATED FROM PYTHON SOURCE LINES 24-42

.. code-block:: default

    from pprint import pprint
    from timeit import Timer
    import numpy as np
    from scipy.spatial.distance import cdist
    from tqdm import tqdm
    from pandas import DataFrame
    import onnx
    import onnxruntime as rt
    from onnxruntime import InferenceSession
    import skl2onnx
    from skl2onnx.algebra.custom_ops import OnnxCDist
    from skl2onnx.common.data_types import FloatTensorType

    X = np.ones((2, 4), dtype=np.float32)
    Y = np.ones((3, 4), dtype=np.float32)
    Y *= 2
    print(cdist(X, Y, metric='euclidean'))

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [[2. 2. 2.]
     [2. 2. 2.]]

.. GENERATED FROM PYTHON SOURCE LINES 43-44

ONNX

.. GENERATED FROM PYTHON SOURCE LINES 44-52

.. code-block:: default

    op = OnnxCDist('X', 'Y', op_version=12, output_names=['Z'],
                   metric='euclidean')
    onx = op.to_onnx({'X': X, 'Y': Y},
                     outputs=[('Z', FloatTensorType())])
    print(onx)

.. rst-class:: sphx-glr-script-out

..
.. code-block:: none

    ir_version: 8
    producer_name: "skl2onnx"
    producer_version: "1.14.0"
    domain: "ai.onnx"
    model_version: 0
    graph {
      node {
        input: "X"
        input: "Y"
        output: "Z"
        name: "CD_CDist"
        op_type: "CDist"
        attribute {
          name: "metric"
          s: "euclidean"
          type: STRING
        }
        domain: "com.microsoft"
      }
      name: "OnnxCDist"
      input {
        name: "X"
        type {
          tensor_type {
            elem_type: 1
            shape {
              dim {
              }
              dim {
                dim_value: 4
              }
            }
          }
        }
      }
      input {
        name: "Y"
        type {
          tensor_type {
            elem_type: 1
            shape {
              dim {
              }
              dim {
                dim_value: 4
              }
            }
          }
        }
      }
      output {
        name: "Z"
        type {
          tensor_type {
            elem_type: 1
          }
        }
      }
    }
    opset_import {
      domain: "com.microsoft"
      version: 1
    }

.. GENERATED FROM PYTHON SOURCE LINES 53-58

CDist and onnxruntime
+++++++++++++++++++++

We compute the output of the CDist operator
with onnxruntime.

.. GENERATED FROM PYTHON SOURCE LINES 58-64

.. code-block:: default

    sess = InferenceSession(onx.SerializeToString(),
                            providers=["CPUExecutionProvider"])
    res = sess.run(None, {'X': X, 'Y': Y})
    print(res)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [array([[1.9999999, 1.9999999, 1.9999999],
           [1.9999999, 2.       , 2.       ]], dtype=float32)]

.. GENERATED FROM PYTHON SOURCE LINES 65-69

Benchmark
+++++++++

Let's compare onnxruntime and scipy.

.. GENERATED FROM PYTHON SOURCE LINES 69-86

.. code-block:: default

    def measure_time(name, stmt, context, repeat=100, number=20):
        tim = Timer(stmt, globals=context)
        res = np.array(
            tim.repeat(repeat=repeat, number=number))
        res /= number
        mean = np.mean(res)
        dev = np.mean(res ** 2)
        dev = (dev - mean**2) ** 0.5
        return dict(
            average=mean, deviation=dev, min_exec=np.min(res),
            max_exec=np.max(res), repeat=repeat, number=number,
            nrows=context['X'].shape[0], ncols=context['Y'].shape[1],
            name=name)

.. GENERATED FROM PYTHON SOURCE LINES 87-88

scipy

.. GENERATED FROM PYTHON SOURCE LINES 88-95

.. code-block:: default

    time_scipy = measure_time(
        "scipy", "cdist(X, Y)",
        context={'cdist': cdist, 'X': X, 'Y': Y})
    pprint(time_scipy)

.. rst-class:: sphx-glr-script-out

..
.. code-block:: none

    {'average': 6.041005393490195e-05,
     'deviation': 7.657981508466052e-05,
     'max_exec': 0.000649342848919332,
     'min_exec': 4.6733045019209384e-05,
     'name': 'scipy',
     'ncols': 4,
     'nrows': 2,
     'number': 20,
     'repeat': 100}

.. GENERATED FROM PYTHON SOURCE LINES 96-97

onnxruntime

.. GENERATED FROM PYTHON SOURCE LINES 97-103

.. code-block:: default

    time_ort = measure_time(
        "ort", "sess.run(None, {'X': X, 'Y': Y})",
        context={'sess': sess, 'X': X, 'Y': Y})
    pprint(time_ort)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    {'average': 8.588387724012136e-05,
     'deviation': 8.050225022119505e-07,
     'max_exec': 9.149259421974421e-05,
     'min_exec': 8.511459454894066e-05,
     'name': 'ort',
     'ncols': 4,
     'nrows': 2,
     'number': 20,
     'repeat': 100}

.. GENERATED FROM PYTHON SOURCE LINES 104-105

Longer benchmark

.. GENERATED FROM PYTHON SOURCE LINES 105-129

.. code-block:: default

    metrics = []
    for dim in tqdm([10, 100, 1000, 10000]):
        # The number of columns must stay fixed; changing it
        # would require creating a new graph.
        X = np.random.randn(dim, 4).astype(np.float32)
        Y = np.random.randn(10, 4).astype(np.float32)

        time_scipy = measure_time(
            "scipy", "cdist(X, Y)",
            context={'cdist': cdist, 'X': X, 'Y': Y})
        time_ort = measure_time(
            "ort", "sess.run(None, {'X': X, 'Y': Y})",
            context={'sess': sess, 'X': X, 'Y': Y})
        metric = dict(N=dim, scipy=time_scipy['average'],
                      ort=time_ort['average'])
        metrics.append(metric)

    df = DataFrame(metrics)
    df['scipy/ort'] = df['scipy'] / df['ort']
    print(df)

    df.plot(x='N', y=['scipy/ort'])

.. image-sg:: /auto_examples/images/sphx_glr_plot_benchmark_cdist_001.png
   :alt: plot benchmark cdist
   :srcset: /auto_examples/images/sphx_glr_plot_benchmark_cdist_001.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    0%|          | 0/4 [00:00

.. container:: sphx-glr-download sphx-glr-download-jupyter

    :download:`Download Jupyter notebook: plot_benchmark_cdist.ipynb `

.. only:: html

    .. rst-class:: sphx-glr-signature

        `Gallery generated by Sphinx-Gallery `_