Compares filtering implementations (numpy, cython)

This benchmark looks at different ways to implement thresholding: every value of a vector greater than mx is replaced by mx. It compares several implementations to numpy.
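As an illustration of the operation being benchmarked, here is a minimal sketch of three equivalent in-place thresholding functions. The names threshold_loop, threshold_mask and threshold_minimum are hypothetical and are not part of td3a_cpp; they only show what "replace every value greater than mx by mx" means on a small vector.

```python
import numpy


def threshold_loop(va, mx):
    # Pure-Python reference: cap each element at mx, in place.
    for i in range(va.shape[0]):
        if va[i] > mx:
            va[i] = mx


def threshold_mask(va, mx):
    # Boolean-mask assignment, in place.
    va[va > mx] = mx


def threshold_minimum(va, mx):
    # Element-wise minimum written back into va, in place.
    numpy.minimum(va, mx, out=va)


va = numpy.array([-1.5, 0.2, 3.0, -0.1, 0.7])
expected = numpy.array([-1.5, 0.0, 0.0, -0.1, 0.0])
for fct in (threshold_loop, threshold_mask, threshold_minimum):
    copy = va.copy()
    fct(copy, 0.0)
    # All three variants produce the same capped vector.
    assert numpy.array_equal(copy, expected)
```

All three functions modify their argument and return nothing, which matches the calling convention fil(va, mx) used in the benchmark.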

import pprint
import numpy
import matplotlib.pyplot as plt
from pandas import DataFrame
from td3a_cpp.tutorial.experiment_cython import (
    pyfilter_dmax, filter_dmax_cython,
    filter_dmax_cython_optim,
    cyfilter_dmax,
    cfilter_dmax, cfilter_dmax2,
    cfilter_dmax16, cfilter_dmax4
)
from td3a_cpp.tools import measure_time_dim


def get_vectors(fct, n, h=200, dtype=numpy.float64):
    # One context per vector size, from 10 up to n (excluded) in steps of h.
    ctxs = [dict(va=numpy.random.randn(size).astype(dtype),
                 fil=fct,
                 mx=numpy.float64(0),
                 x_name=size)
            for size in range(10, n, h)]
    return ctxs


def numpy_filter(va, mx):
    va[va > mx] = mx


all_res = []
for fct in [numpy_filter,
            pyfilter_dmax, filter_dmax_cython,
            filter_dmax_cython_optim,
            cyfilter_dmax,
            cfilter_dmax, cfilter_dmax2,
            cfilter_dmax16, cfilter_dmax4]:

    print(fct)
    ctxs = get_vectors(fct, 1000 if fct == pyfilter_dmax else 40000)
    res = list(measure_time_dim('fil(va, mx)', ctxs, verbose=1))
    for r in res:
        r['fct'] = fct.__name__
    all_res.extend(res)

pprint.pprint(all_res[:2])

Out:

<function numpy_filter at 0x7f4ca951aaf0>

100%|##########| 200/200 [00:03<00:00, 51.19it/s]
<built-in function pyfilter_dmax>

100%|##########| 5/5 [00:00<00:00, 19.42it/s]
<built-in function filter_dmax_cython>

100%|##########| 200/200 [00:03<00:00, 59.24it/s]
<built-in function filter_dmax_cython_optim>

100%|##########| 200/200 [00:03<00:00, 59.13it/s]
<built-in function cyfilter_dmax>

100%|##########| 200/200 [00:03<00:00, 57.89it/s]
<built-in function cfilter_dmax>

100%|##########| 200/200 [00:03<00:00, 57.06it/s]
<built-in function cfilter_dmax2>

100%|##########| 200/200 [00:02<00:00, 83.26it/s]
<built-in function cfilter_dmax16>

100%|##########| 200/200 [00:03<00:00, 55.51it/s]
<built-in function cfilter_dmax4>

100%|##########| 200/200 [00:03<00:00, 52.25it/s]
[{'average': 1.4140732004307213e-05,
  'context_size': 232,
  'deviation': 3.3578626948943653e-07,
  'fct': 'numpy_filter',
  'max_exec': 1.4873840264044702e-05,
  'min_exec': 1.3905640225857496e-05,
  'number': 50,
  'repeat': 10,
  'x_name': 10},
 {'average': 1.4223869889974595e-05,
  'context_size': 232,
  'deviation': 1.4830254474728954e-07,
  'fct': 'numpy_filter',
  'max_exec': 1.461125968489796e-05,
  'min_exec': 1.4097659732215107e-05,
  'number': 50,
  'repeat': 10,
  'x_name': 210}]
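Each record holds an average, a deviation, the minimum and maximum execution times, and the repeat and number parameters. The sketch below shows one way such statistics can be gathered with the standard timeit module. The helper measure is hypothetical: it is an assumption about what this kind of measurement looks like, not the implementation of measure_time_dim.

```python
import timeit
import numpy


def measure(stmt, context, repeat=10, number=50):
    # Hypothetical helper, not td3a_cpp's implementation: run `stmt`
    # `number` times per run, over `repeat` runs, and summarise the
    # per-call durations in seconds.
    runs = numpy.array(
        timeit.repeat(stmt, globals=context, repeat=repeat, number=number))
    per_call = runs / number
    return dict(average=per_call.mean(), deviation=per_call.std(),
                min_exec=per_call.min(), max_exec=per_call.max(),
                repeat=repeat, number=number)


def numpy_filter(va, mx):
    va[va > mx] = mx


ctx = dict(va=numpy.random.randn(1000), fil=numpy_filter,
           mx=numpy.float64(0))
res = measure('fil(va, mx)', ctx)
print(res)
```

Because the filter works in place, only the first call actually changes va; later calls find no value above mx. This is worth keeping in mind when reading the timings of implementations whose cost depends on how many values are replaced.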

Let’s display the results.

cc = DataFrame(all_res)
cc['N'] = cc['x_name']

fig, ax = plt.subplots(2, 2, figsize=(10, 10))
# pandas >= 2.0 only accepts keyword arguments in DataFrame.pivot.
cc[cc.N <= 1100].pivot(index='N', columns='fct', values='average').plot(
    logy=True, ax=ax[0, 0])
cc[cc.fct != 'pyfilter_dmax'].pivot(
    index='N', columns='fct', values='average').plot(
    logy=True, ax=ax[0, 1])
cc[cc.fct != 'pyfilter_dmax'].pivot(
    index='N', columns='fct', values='average').plot(
    logy=True, logx=True, ax=ax[1, 1])
cc[(cc.fct.str.contains('cfilter') |
    cc.fct.str.contains('numpy'))].pivot(
    index='N', columns='fct', values='average').plot(
    logy=True, ax=ax[1, 0])
ax[0, 0].set_title("Comparison of filter implementations")
ax[0, 1].set_title("Comparison of filter implementations\n"
                   "without pyfilter_dmax")

[Figure: panels titled "Comparison of filter implementations" and "Comparison of filter implementations without pyfilter_dmax"]

Out:

Text(0.5, 1.0, 'Comparison of filter implementations\nwithout pyfilter_dmax')

The results depend on the machine, its number of cores, and the compilation settings of numpy and of this module.
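When comparing timings across machines, it can help to record these factors next to the results. The following is a minimal sketch using only the standard library and numpy; the dictionary keys are arbitrary names chosen for this illustration, and os.cpu_count reports logical CPUs, which may differ from the number of physical cores.

```python
import os
import platform
import numpy

env = {
    "machine": platform.machine(),
    "logical_cpus": os.cpu_count(),  # may be None if undetermined
    "python": platform.python_version(),
    "numpy": numpy.__version__,
}
print(env)

# Prints how numpy was built (compilers, BLAS/LAPACK libraries).
numpy.show_config()
```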

plt.show()

Total running time of the script: ( 0 minutes 33.766 seconds)
