Profile a function

pyinstrument is a profiler for python functions. It tells how long the program stays in every function. Because numerical functions are usually very short, it is usually helpful to call them multiple times before the cause becomes significant.

from sys import executable
import sys
import numpy
from pyinstrument import Profiler
from pyquickhelper.loghelper import run_cmd
from td3a_cpp.tutorial import pydot, cblas_ddot


va = numpy.random.randn(100000).astype(numpy.float64)
vb = numpy.random.randn(100000).astype(numpy.float64)


def f1_python(va, vb, n=10):
    # pydot is really too slow, let's run 10 times only
    for i in range(n):
        pydot(va, vb)


def f2_numpy(va, vb, n=100000):
    for i in range(n):
        numpy.dot(va, vb)


def f3_blas(va, vb, n=100000):
    for i in range(n):
        cblas_ddot(va, vb)


if '--pyspy' in sys.argv:
    # When called with option --pyspy
    f1_python(va, vb)
    f2_numpy(va, vb)
    f3_blas(va, vb)
    profiler = None
else:
    profiler = Profiler()
    profiler.start()

    f1_python(va, vb)
    f2_numpy(va, vb)
    f3_blas(va, vb)

    profiler.stop()

    print(profiler.output_text(unicode=False, color=False))

Out:

  _     ._   __/__   _ _  _  _ _/_   Recorded: 10:02:18 AM Samples:  12687
 /_//_/// /_\ / //_// / //_'/ //     Duration: 13.769    CPU time: 91.850
/   _/                      v3.4.1

Program: somewheretd3a_cpp_39_std/td3a_cpp/examples/plot_profile.py

13.768 <module>  plot_profile.py:19
|- 7.193 f3_blas  plot_profile.py:42
|     [4 frames hidden]  plot_profile, <built-in>
|        7.047 cblas_ddot  <built-in>:0
|- 5.756 f2_numpy  plot_profile.py:37
|     [8 frames hidden]  plot_profile, <__array_function__ int...
|        5.118 implement_array_function  <built-in>:0
`- 0.819 f1_python  plot_profile.py:31
      [5 frames hidden]  plot_profile, td3a_cpp

numpy is much faster and does not always appear if pydot is run the same number of times. The program is spied on a given number of times per seconds, each time the system records which function the program is executing. At the end, the profiler is able to approximatvely tell how long the program stayed in every function. If a function does not appear, it means it was never executed or too fast to be caught. An HTML report can be generated.

if profiler is not None:
    with open("dot_pyinstrument.html", "w", encoding="utf-8") as f:
        f.write(profiler.output_html())

See Visual outputs from example ‘Profile a function’. pyinstrument does not measure native function (C++) very well. For this module py-spy is more efficient but it only works in command line as the package itself is written in RUST (see Taking ML to production with Rust: a 25x speedup).

if profiler is not None:
    cmd = ("py-spy record --native --function --rate=10 "
           "-o dotpyspy.svg -- {0} plot_profile.py --pyspy").format(executable)
    run_cmd(cmd, wait=True, fLOG=print)

Out:

[run_cmd] execute py-spy record --native --function --rate=10 -o dotpyspy.svg -- somewheretd3a_cpp_39_std/_venv/bin/python plot_profile.py --pyspy
end of execution py-spy record --native --function --rate=10 -o dotpyspy.svg -- somewheretd3a_cpp_39_std/_venv/bin/python plot_profile.py --pyspy

See Visual outputs from example ‘Profile a function’. We see that cblas_ddot() and numpy.dot uses the same C function but the wrapping is not the same and numpy is more efficient.

Total running time of the script: ( 0 minutes 26.867 seconds)

Gallery generated by Sphinx-Gallery