.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "gyexamples/plot_op_add.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_gyexamples_plot_op_add.py:

.. _l-b-add:

Compares implementations of Add
===============================

This example compares the *numpy* implementation of the addition
to the :epkg:`onnxruntime` implementation.
Function :epkg:`numpy:add` is chained three times so that the cost
of copying the data from Python to an external library weighs less
in the measured time.
If available, :epkg:`tensorflow` and :epkg:`pytorch` are included as well.
The numpy implementation is not optimal: it allocates more buffers
than necessary because parameter *out* is not used to reuse them.

.. contents::
    :local:

.. GENERATED FROM PYTHON SOURCE LINES 21-32

.. code-block:: default

    import numpy
    import pandas
    import matplotlib.pyplot as plt
    from onnxruntime import InferenceSession
    from skl2onnx.common.data_types import FloatTensorType
    from skl2onnx.algebra.onnx_ops import OnnxAdd
    from cpyquickhelper.numbers import measure_time
    from tqdm import tqdm
    from mlprodict.testing.experimental_c_impl.experimental_c import code_optimisation
    print(code_optimisation())

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    AVX-omp=8

.. GENERATED FROM PYTHON SOURCE LINES 33-35

Add implementations
+++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 35-162

.. code-block:: default

    # tensorflow and pytorch are optional: they are benchmarked only if installed.
    try:
        from tensorflow.math import add as tf_add
        from tensorflow import convert_to_tensor
    except ImportError:
        tf_add = None
    try:
        from torch import add as torch_add, from_numpy
    except ImportError:
        torch_add = None


    def build_ort_add(op_version=12):
        # ONNX graph computing z = ((x + y) + y) + y
        node1 = OnnxAdd('x', 'y', op_version=op_version)
        node2 = OnnxAdd(node1, 'y', op_version=op_version)
        node = OnnxAdd(node2, 'y', op_version=op_version, output_names=['z'])
        onx = node.to_onnx(inputs=[('x', FloatTensorType()),
                                   ('y', FloatTensorType())],
                           target_opset=op_version)
        sess = InferenceSession(onx.SerializeToString())
        return lambda x, y: sess.run(None, {'x': x, 'y': y})


    def loop_fct(fct, xs, ys):
        # Applies fct to every pair of inputs.
        for x, y in zip(xs, ys):
            fct(x, y)


    def benchmark_op(repeat=5, number=2, name="Add", shape_fcts=None):
        if shape_fcts is None:
            def shape_fct(dim):
                return (5, dim, dim)
            shape_fcts = (shape_fct, shape_fct)
        ort_fct = build_ort_add()
        res = []
        for dim in tqdm([8, 16, 32, 64, 100, 128, 200,
                         256, 400, 512, 1024, 1536, 2048, 2560]):
            shape1 = shape_fcts[0](dim)
            shape2 = shape_fcts[1](dim)
            # fewer arrays for the largest dimensions
            n_arrays = 16 if dim < 512 else 4
            if len(shape1) > 3:
                n_arrays = n_arrays // 4
            xs = [numpy.random.rand(*shape1).astype(numpy.float32)
                  for _ in range(n_arrays)]
            ys = [numpy.random.rand(*shape2).astype(numpy.float32)
                  for _ in range(n_arrays)]
            info = dict(shape1=shape1, shape2=shape2)

            # numpy
            ctx = dict(
                xs=xs, ys=ys,
                fct=lambda x, y: numpy.add(numpy.add(numpy.add(x, y), y), y),
                loop_fct=loop_fct)
            obs = measure_time(
                "loop_fct(fct, xs, ys)",
                div_by_number=True, context=ctx, repeat=repeat, number=number)
            obs['dim'] = dim
            obs['fct'] = 'numpy'
            obs.update(info)
            res.append(obs)

            # onnxruntime
            ctx['fct'] = ort_fct
            obs = measure_time(
                "loop_fct(fct, xs, ys)",
                div_by_number=True, context=ctx, repeat=repeat, number=number)
            obs['dim'] = dim
            obs['fct'] = 'ort'
            obs.update(info)
            res.append(obs)

            if tf_add is not None:
                # tensorflow
                ctx['fct'] = lambda x, y: tf_add(tf_add(tf_add(x, y), y), y)
                ctx['xs'] = [convert_to_tensor(x) for x in xs]
                ctx['ys'] = [convert_to_tensor(y) for y in ys]
                obs = measure_time(
                    "loop_fct(fct, xs, ys)",
                    div_by_number=True, context=ctx, repeat=repeat, number=number)
                obs['dim'] = dim
                obs['fct'] = 'tf'
                obs.update(info)
                res.append(obs)

            if torch_add is not None:
                # torch
                ctx['fct'] = lambda x, y: torch_add(
                    torch_add(torch_add(x, y), y), y)
                ctx['xs'] = [from_numpy(x) for x in xs]
                ctx['ys'] = [from_numpy(y) for y in ys]
                obs = measure_time(
                    "loop_fct(fct, xs, ys)",
                    div_by_number=True, context=ctx, repeat=repeat, number=number)
                obs['dim'] = dim
                obs['fct'] = 'torch'
                obs.update(info)
                res.append(obs)

        # Dataframes
        shape1_name = str(shape1).replace(str(dim), "N")
        shape2_name = str(shape2).replace(str(dim), "N")
        df = pandas.DataFrame(res)
        df.columns = [_.replace('dim', 'N') for _ in df.columns]
        # keyword arguments: recent pandas versions no longer accept
        # positional arguments in DataFrame.pivot
        piv = df.pivot(index='N', columns='fct', values='average')

        # speedup relative to numpy
        rs = piv.copy()
        for c in ['ort', 'torch', 'tf']:
            if c in rs.columns:
                rs[c] = rs['numpy'] / rs[c]
        rs['numpy'] = 1.

        # Graphs.
        fig, ax = plt.subplots(1, 2, figsize=(12, 4))
        piv.plot(logx=True, logy=True, ax=ax[0],
                 title=f"{name} benchmark\n{shape1_name} + {shape2_name} lower better")
        ax[0].legend(prop={"size": 9})
        rs.plot(logx=True, logy=True, ax=ax[1],
                title=f"{name} Speedup, baseline=numpy\n{shape1_name} + {shape2_name}"
                      " higher better")
        ax[1].plot([min(rs.index), max(rs.index)], [0.5, 0.5], 'g--')
        ax[1].plot([min(rs.index), max(rs.index)], [2., 2.], 'g--')
        ax[1].legend(prop={"size": 9})
        return df, rs, ax


    dfs = []

.. GENERATED FROM PYTHON SOURCE LINES 163-165

(5, N, N) + (5, N, N)
+++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 165-170

.. code-block:: default

    df, piv, ax = benchmark_op()
    dfs.append(df)
    df.pivot(index="fct", columns="N", values="average")

.. image-sg:: /gyexamples/images/sphx_glr_plot_op_add_001.png
   :alt: Add benchmark (5, N, N) + (5, N, N) lower better, Add Speedup, baseline=numpy (5, N, N) + (5, N, N) higher better
   :srcset: /gyexamples/images/sphx_glr_plot_op_add_001.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    N             8        16        32        64       100       128       200       256       400       512      1024      1536      2048      2560
    fct
    numpy  0.000266  0.000338  0.000490  0.001203  0.002968  0.006525  0.020246  0.033073  0.076052  0.034979  0.124223  0.277049  0.488542  0.891990
    ort    0.000946  0.001023  0.001215  0.002165  0.005590  0.010304  0.014817  0.025639  0.056736  0.023845  0.093362  0.195504  0.342244  0.535457
    torch  0.009631  0.000759  0.000988  0.001892  0.285306  0.334686  0.338003  0.289207  0.368227  0.096349  0.143716  0.200749  0.313160  0.521078


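The introduction notes that the numpy baseline allocates a new buffer for
each of the three additions because parameter *out* is not used. The sketch
below illustrates what reusing a buffer could look like. Function ``add3_out``
is hypothetical and is not part of the benchmark; the timings in this page
were all measured with the expression used in ``add3_naive``.

.. code-block:: python

    import numpy


    def add3_naive(x, y):
        # Same expression as the numpy baseline of the benchmark:
        # every call to numpy.add returns a newly allocated array.
        return numpy.add(numpy.add(numpy.add(x, y), y), y)


    def add3_out(x, y, out=None):
        # Hypothetical variant: a single buffer receives the three results.
        if out is None:
            shape = numpy.broadcast_shapes(x.shape, y.shape)
            out = numpy.empty(shape, dtype=numpy.result_type(x, y))
        numpy.add(x, y, out=out)
        numpy.add(out, y, out=out)
        numpy.add(out, y, out=out)
        return out


    x = numpy.random.rand(5, 8, 8).astype(numpy.float32)
    y = numpy.random.rand(5, 8, 8).astype(numpy.float32)
    buffer = numpy.empty_like(x)
    res = add3_out(x, y, out=buffer)

When the caller provides ``buffer``, no new array is allocated by the three
additions. Whether this changes the ranking of the benchmark was not measured
here.
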
.. GENERATED FROM PYTHON SOURCE LINES 171-173

(5, N, N) + (5, N, 1)
+++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 173-181

.. code-block:: default

    shape_fcts = (lambda dim: (5, dim, dim),
                  lambda dim: (5, dim, 1))

    df, piv, ax = benchmark_op(shape_fcts=shape_fcts)
    dfs.append(df)
    df.pivot(index="fct", columns="N", values="average")

.. image-sg:: /gyexamples/images/sphx_glr_plot_op_add_002.png
   :alt: Add benchmark (5, N, N) + (5, N, 1) lower better, Add Speedup, baseline=numpy (5, N, N) + (5, N, 1) higher better
   :srcset: /gyexamples/images/sphx_glr_plot_op_add_002.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    N             8        16        32        64       100       128       200       256       400       512      1024      1536      2048      2560
    fct
    numpy  0.000686  0.000781  0.001101  0.002378  0.003684  0.006964  0.019006  0.031303  0.067944  0.027432  0.119001  0.257607  0.458008  0.869021
    ort    0.001072  0.001217  0.001615  0.002975  0.006414  0.009735  0.012881  0.019354  0.048007  0.020406  0.083684  0.166027  0.290346  0.445782
    torch  0.000857  0.000955  0.001425  0.003189  0.237103  0.179329  0.223975  0.337264  0.346147  0.090481  0.127216  0.181796  0.404321  0.446880


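In this configuration the second input has a last dimension equal to 1, so the
addition relies on broadcasting. The sketch below only checks the output shape
produced by numpy for the four pairs of shapes used in this example. ``N = 8``
is an arbitrary value chosen for the illustration, and the ONNX operator Add is
assumed to follow the same numpy-style broadcasting rules.

.. code-block:: python

    import numpy

    N = 8  # arbitrary dimension, the benchmark goes from 8 to 2560
    # (shape of x, shape of y) for the four benchmarks of this example
    cases = [
        ((5, N, N), (5, N, N)),
        ((5, N, N), (5, N, 1)),
        ((5, N, N), (5, 1, N)),
        ((5, N, 5, N), (1, N, 1, 1)),
    ]
    shapes = []
    for shape1, shape2 in cases:
        x = numpy.zeros(shape1, dtype=numpy.float32)
        y = numpy.ones(shape2, dtype=numpy.float32)
        z = numpy.add(x, y)
        shapes.append(z.shape)
        print(shape1, "+", shape2, "->", z.shape)

In every case the result has the shape of the first input: only the second
input is expanded.
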
.. GENERATED FROM PYTHON SOURCE LINES 182-184

(5, N, N) + (5, 1, N)
+++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 184-192

.. code-block:: default

    shape_fcts = (lambda dim: (5, dim, dim),
                  lambda dim: (5, 1, dim))

    df, piv, ax = benchmark_op(shape_fcts=shape_fcts)
    dfs.append(df)
    df.pivot(index="fct", columns="N", values="average")

.. image-sg:: /gyexamples/images/sphx_glr_plot_op_add_003.png
   :alt: Add benchmark (5, N, N) + (5, 1, N) lower better, Add Speedup, baseline=numpy (5, N, N) + (5, 1, N) higher better
   :srcset: /gyexamples/images/sphx_glr_plot_op_add_003.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    N             8        16        32        64       100       128       200       256       400       512      1024      1536      2048      2560
    fct
    numpy  0.000692  0.000814  0.001024  0.001922  0.003605  0.007358  0.019055  0.031873  0.066891  0.030239  0.103410  0.234905  0.502303  0.717315
    ort    0.001150  0.001344  0.001667  0.003009  0.006611  0.010113  0.012950  0.020912  0.047534  0.019450  0.074938  0.164812  0.297489  0.449658
    torch  0.000859  0.000915  0.001113  0.001919  0.177483  0.069487  0.178411  0.204143  0.289458  0.091763  0.118653  0.164023  0.299025  0.578191


.. GENERATED FROM PYTHON SOURCE LINES 193-195

(5, N, 5, N) + (1, N, 1, 1)
+++++++++++++++++++++++++++

.. GENERATED FROM PYTHON SOURCE LINES 195-203

.. code-block:: default

    shape_fcts = (lambda dim: (5, dim, 5, dim),
                  lambda dim: (1, dim, 1, 1))

    df, piv, ax = benchmark_op(shape_fcts=shape_fcts)
    dfs.append(df)
    df.pivot(index="fct", columns="N", values="average")

.. image-sg:: /gyexamples/images/sphx_glr_plot_op_add_004.png
   :alt: Add benchmark (5, N, 5, N) + (1, N, 1, 1) lower better, Add Speedup, baseline=numpy (5, N, 5, N) + (1, N, 1, 1) higher better
   :srcset: /gyexamples/images/sphx_glr_plot_op_add_004.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    N             8        16        32        64       100       128       200       256       400       512      1024      1536      2048      2560
    fct
    numpy  0.000206  0.000279  0.000553  0.002425  0.005651  0.009622  0.020826  0.035399  0.083106  0.032821  0.129877  0.482247  0.750446  1.127728
    ort    0.000302  0.000375  0.000714  0.001707  0.003838  0.006104  0.015999  0.025792  0.057261  0.022947  0.094185  0.286709  0.605408  1.185481
    torch  0.000275  0.000400  0.000943  0.083589  0.026481  0.074657  0.091110  0.091277  0.114947  0.034710  0.116979  0.281791  0.495723  0.751140


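The right-hand graph of each benchmark shows a speedup relative to numpy, that
is, the numpy time divided by the time of the other library. The sketch below
redoes this computation on three columns copied from the table above
(N = 8, 256 and 2560). It uses keyword arguments for ``DataFrame.pivot``; the
variable names mirror those of ``benchmark_op`` but the snippet is only an
illustration.

.. code-block:: python

    import pandas

    # Averages copied from the (5, N, 5, N) + (1, N, 1, 1) table.
    rows = [
        dict(N=8, fct='numpy', average=0.000206),
        dict(N=8, fct='ort', average=0.000302),
        dict(N=8, fct='torch', average=0.000275),
        dict(N=256, fct='numpy', average=0.035399),
        dict(N=256, fct='ort', average=0.025792),
        dict(N=256, fct='torch', average=0.091277),
        dict(N=2560, fct='numpy', average=1.127728),
        dict(N=2560, fct='ort', average=1.185481),
        dict(N=2560, fct='torch', average=0.751140),
    ]
    df = pandas.DataFrame(rows)
    piv = df.pivot(index='N', columns='fct', values='average')

    # Speedup: a value above 1 means faster than numpy, below 1 means slower.
    rs = piv.copy()
    for c in ['ort', 'torch']:
        rs[c] = rs['numpy'] / rs[c]
    rs['numpy'] = 1.
    print(rs)

On these three columns, numpy is the fastest for N = 8, onnxruntime is ahead
for N = 256 and pytorch is ahead for N = 2560.
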
.. GENERATED FROM PYTHON SOURCE LINES 204-213

Conclusion
++++++++++

It is difficult to draw a final conclusion because adding two vectors
costs about the same order of magnitude as copying the data between Python
and the C++ code of onnxruntime, pytorch or tensorflow.
numpy is much faster on small vectors.
onnxruntime, pytorch and tensorflow are not optimized for this case
because it is not very common in deep learning.

.. GENERATED FROM PYTHON SOURCE LINES 213-221

.. code-block:: default

    merged = pandas.concat(dfs)
    name = "add"
    merged.to_csv(f"plot_{name}.csv", index=False)
    merged.to_excel(f"plot_{name}.xlsx", index=False)
    plt.savefig(f"plot_{name}.png")

    plt.show()

.. image-sg:: /gyexamples/images/sphx_glr_plot_op_add_005.png
   :alt: plot op add
   :srcset: /gyexamples/images/sphx_glr_plot_op_add_005.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 5 minutes 6.325 seconds)

.. _sphx_glr_download_gyexamples_plot_op_add.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_op_add.py `

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_op_add.ipynb `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_