.. blogpost::
    :title: RandomForestClassifier - prediction for one observation
    :keywords: scikit-learn, py-spy, benchmark, one-off prediction
    :date: 2019-12-04
    :categories: benchmark

    I was meeting with Olivier Grisel this morning and
    we were wondering why :epkg:`scikit-learn` was slow to compute the
    prediction of a random forest for one observation
    compared to what :epkg:`onnxruntime` does,
    and more specifically compared to some optimized C++ code inspired
    by :epkg:`onnxruntime`.
    We used :epkg:`py-spy` and wrote the following script:

    ::

        import numpy as np
        from joblib import Memory
        import sklearn
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from skl2onnx import to_onnx
        from mlprodict.onnxrt import OnnxInference

        # cache the trained model on disk so that repeated runs skip training
        m = Memory(location="c:\\temp", mmap_mode='r')


        @m.cache
        def make_model():
            X, y = make_classification(
                n_features=2, n_redundant=0, n_informative=2,
                random_state=1, n_clusters_per_class=1, n_samples=1000)
            rf = RandomForestClassifier(
                max_depth=5, n_estimators=100, max_features=1)
            rf.fit(X, y)
            onx = to_onnx(rf, X.astype(np.float32))
            onxb = onx.SerializeToString()
            return rf, X, onxb


        rf, X, onxb = make_model()
        X = X.astype(np.float32)
        oinf = OnnxInference(onxb, runtime="python")
        oinf_ort = OnnxInference(onxb, runtime="onnxruntime1")


        def f1():
            # one observation per call
            for _ in range(0, 10):
                for i in range(0, X.shape[0]):
                    y = oinf.run({'X': X[i: i+1]})


        def f2():
            # one observation per call
            with sklearn.config_context(assume_finite=True):
                for _ in range(0, 10):
                    for i in range(0, X.shape[0]):
                        rf.predict(X[i: i+1])


        f1()  # C++ code
        f2()  # scikit-learn

    The script is run with the following command line:

    ::

        py-spy record --native --function --rate=10 -o demo.svg -- python demo.py

    The following image is a snapshot of the final result:

    .. image:: rf1.png

    It shows that, on my machine, :epkg:`scikit-learn` is slowed down
    by :epkg:`joblib`, which is not really useful for this small dataset.
    ``check_is_fitted`` takes some time even though
    ``sklearn.config_context(assume_finite=True)`` was used.
    We then modified the script to compare this outcome
    to what we would get when predicting 1000 observations
    in a single call.

    ::

        import numpy as np
        from joblib import Memory
        import sklearn
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from skl2onnx import to_onnx
        from mlprodict.onnxrt import OnnxInference

        # cache the trained model on disk so that repeated runs skip training
        m = Memory(location="c:\\temp", mmap_mode='r')


        @m.cache
        def make_model():
            X, y = make_classification(
                n_features=2, n_redundant=0, n_informative=2,
                random_state=1, n_clusters_per_class=1, n_samples=1000)
            rf = RandomForestClassifier(
                max_depth=5, n_estimators=100, max_features=1)
            rf.fit(X, y)
            onx = to_onnx(rf, X.astype(np.float32))
            onxb = onx.SerializeToString()
            return rf, X, onxb


        rf, X, onxb = make_model()
        X = X.astype(np.float32)
        oinf = OnnxInference(onxb, runtime="python")


        def f1_1000():
            # the whole dataset (1000 observations) per call
            for _ in range(0, 5000):
                y = oinf.run({'X': X})


        def f2_1000():
            # the whole dataset (1000 observations) per call
            with sklearn.config_context(assume_finite=True):
                for _ in range(0, 5000):
                    rf.predict(X)


        f1_1000()  # C++
        f2_1000()  # scikit-learn

    Both versions spend a similar amount of time in the functions
    that compute the predictions, but :epkg:`joblib`
    still adds some extra time.

    .. image:: rf123.png

    Figures about other classifiers can be found at
    *Prediction time scikit-learn / onnxruntime for common datasets*.
    It shows the prediction times on the *breast cancer*
    and *digits* datasets.
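
To put a number on the fixed per-call cost discussed above without a profiler, one option is to time the same ``predict`` callable once row by row and once on the whole batch. The sketch below is only an illustration: ``time_per_row`` and ``CountingModel`` are hypothetical names that do not come from the script above, and the stand-in model merely counts calls so that the block runs without scikit-learn installed.

```python
import time


def time_per_row(predict, X, repeat=3):
    """Return (one_by_one, batch): best average seconds per row.

    `predict` is any callable taking a 2D batch (for instance a model's
    predict method); `X` only needs to support len() and slicing.
    """
    n = len(X)
    best_single = float("inf")
    best_batch = float("inf")
    for _ in range(repeat):
        # one observation per call: the per-call overhead is paid n times
        begin = time.perf_counter()
        for i in range(n):
            predict(X[i:i + 1])
        best_single = min(best_single, time.perf_counter() - begin)

        # all observations in one call: the overhead is paid once
        begin = time.perf_counter()
        predict(X)
        best_batch = min(best_batch, time.perf_counter() - begin)
    return best_single / n, best_batch / n


class CountingModel:
    """Hypothetical stand-in for a fitted model; it only counts calls."""

    def __init__(self):
        self.calls = 0

    def predict(self, X):
        self.calls += 1
        return [0 for _ in X]


if __name__ == "__main__":
    model = CountingModel()
    data = [[0.0, 1.0]] * 1000
    single, batch = time_per_row(model.predict, data)
    print("one by one: %.2e s/row, batch: %.2e s/row" % (single, batch))
```

With the fitted forest from the script, ``time_per_row(rf.predict, X)`` would give the two per-row averages; a large gap between them points at per-call overhead rather than at the tree traversal itself.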