Note
Go to the end to download the full example code
Profile a function¶
pyinstrument is a profiler for python functions. It tells how long the program stays in every function. Because numerical functions are usually very short, it is usually helpful to call them multiple times before the cause becomes significant.
from sys import executable
import sys
import numpy
from pyinstrument import Profiler
from pyquickhelper.loghelper import run_cmd
from td3a_cpp.tutorial import pydot, cblas_ddot
va = numpy.random.randn(100000).astype(numpy.float64)
vb = numpy.random.randn(100000).astype(numpy.float64)
def f1_python(va, vb, n=10):
# pydot is really too slow, let's run 10 times only
for i in range(n):
pydot(va, vb)
def f2_numpy(va, vb, n=100000):
for i in range(n):
numpy.dot(va, vb)
def f3_blas(va, vb, n=100000):
for i in range(n):
cblas_ddot(va, vb)
if '--pyspy' in sys.argv:
# When called with option --pyspy
f1_python(va, vb)
f2_numpy(va, vb)
f3_blas(va, vb)
profiler = None
else:
profiler = Profiler()
profiler.start()
f1_python(va, vb)
f2_numpy(va, vb)
f3_blas(va, vb)
profiler.stop()
print(profiler.output_text(unicode=False, color=False))
_ ._ __/__ _ _ _ _ _/_ Recorded: 05:37:11 AM Samples: 9232
/_//_/// /_\ / //_// / //_'/ // Duration: 9.377 CPU time: 65.105
/ _/ v4.3.0
Program: somewheretd3a_cpp_39_std/td3a_cpp/examples/plot_profile.py
9.376 <module> plot_profile.py:18
|- 4.007 f3_blas plot_profile.py:41
| [4 frames hidden] plot_profile, <built-in>
| 3.837 cblas_ddot <built-in>:0
|- 3.963 f2_numpy plot_profile.py:36
| [4 frames hidden] plot_profile, numpy
| 3.869 [self]
`- 1.405 f1_python plot_profile.py:30
[5 frames hidden] plot_profile, td3a_cpp
numpy is much faster and does not always appear if pydot is run the same number of times. The program is spied on a given number of times per seconds, each time the system records which function the program is executing. At the end, the profiler is able to approximatvely tell how long the program stayed in every function. If a function does not appear, it means it was never executed or too fast to be caught. An HTML report can be generated.
if profiler is not None:
with open("dot_pyinstrument.html", "w", encoding="utf-8") as f:
f.write(profiler.output_html())
See Visual outputs from example ‘Profile a function’. pyinstrument does not measure native function (C++) very well. For this module py-spy is more efficient but it only works in command line as the package itself is written in RUST (see Taking ML to production with Rust: a 25x speedup).
if profiler is not None:
cmd = ("py-spy record --native --function --rate=10 "
"-o dotpyspy.svg -- {0} plot_profile.py --pyspy").format(executable)
run_cmd(cmd, wait=True, fLOG=print)
[run_cmd] execute py-spy record --native --function --rate=10 -o dotpyspy.svg -- somewheretd3a_cpp_39_std/_venv/bin/python plot_profile.py --pyspy
end of execution py-spy record --native --function --rate=10 -o dotpyspy.svg -- somewheretd3a_cpp_39_std/_venv/bin/python plot_profile.py --pyspy
See Visual outputs from example ‘Profile a function’.
We see that cblas_ddot
and numpy.dot()
uses
the same C function but the wrapping is not the same
and numpy is more efficient.
Total running time of the script: ( 0 minutes 22.328 seconds)