
.. _sklearngrammarlrrst:

=====================================
Converts a logistic regression into C
=====================================

.. only:: html

    **Links:** :download:`notebook `, :downloadlink:`html `, :download:`PDF `, :download:`python `, :downloadlink:`slides `, :githublink:`GitHub|_doc/notebooks/sklearn_grammar_lr.ipynb|*`

The logistic regression is trained in Python and executed in C.

.. code:: ipython3

    from jyquickhelper import add_notebook_menu
    add_notebook_menu()

.. contents::
    :local:

Train a logistic regression
---------------------------

.. code:: ipython3

    from sklearn.linear_model import LogisticRegression
    from sklearn.datasets import load_iris
    iris = load_iris()
    X = iris.data[:, :2]
    y = iris.target
    y[y == 2] = 1
    lr = LogisticRegression()
    lr.fit(X, y)

.. parsed-literal::

    LogisticRegression()

Export into C
-------------

.. code:: ipython3

    # grammar is the expected scoring model.
    from mlprodict.grammar_sklearn import sklearn2graph
    gr = sklearn2graph(lr, output_names=['Prediction', 'Score'])
    gr

We can check the score the function should produce. Types are strict: the
features must be ``float32``.

.. code:: ipython3

    import numpy
    X = numpy.array([[numpy.float32(1), numpy.float32(2)]])
    e2 = gr.execute(Features=X[0, :])
    print(e2)

.. parsed-literal::

    [  0.       -11.264062]

We compare with scikit-learn.

.. code:: ipython3

    lr.decision_function(X[0:1, :])

.. parsed-literal::

    array([-11.26406172])

Conversion into C:

.. code:: ipython3

    res = gr.export(lang='c', hook={'array': lambda v: v.tolist(), 'float32': lambda v: float(v)})
    print(res["code"])

.. parsed-literal::

    int LogisticRegression (float* pred, float* Features)
    {
        // 2290909222952-LogisticRegression - children
        // 2290909222728-concat - children
        // 2290909222672-sign - children
        // 2290909222616-+ - children
        // 2290909222560-adot - children
        float pred0c0c00c0[2] = {(float)3.3882975578308105, (float)-3.164527654647827};
        float* pred0c0c00c1 = Features;
        // 2290909222560-adot - itself
        float pred0c0c00;
        adot_float_float(&pred0c0c00, pred0c0c00c0, pred0c0c00c1, 2);
        // 2290909222560-adot - done
        float pred0c0c01 = (float)-8.323304176330566;
        // 2290909222616-+ - itself
        float pred0c0c0 = pred0c0c00 + pred0c0c01;
        // 2290909222616-+ - done
        // 2290909222672-sign - itself
        float pred0c0;
        sign_float(&pred0c0, pred0c0c0);
        // 2290909222672-sign - done
        // 2290909222728-concat - itself
        float pred0[2];
        concat_float_float(pred0, pred0c0, pred0c0c0);
        // 2290909222728-concat - done
        memcpy(pred, pred0, 2*sizeof(float));
        // 2290909222952-LogisticRegression - itself
        return 0;
        // 2290909222952-LogisticRegression - done
    }

We compile and execute the code with the ``cffi`` module.

.. code:: ipython3

    from mlprodict.grammar_sklearn.cc import compile_c_function
    fct = compile_c_function(res["code"], 2)
    fct

.. parsed-literal::

    .wrapper_float(features, output=None)>

.. code:: ipython3

    e2 = fct(X[0, :])
    e2

.. parsed-literal::

    array([  0.      , -11.264062], dtype=float32)

Time comparison
---------------

.. code:: ipython3

    %timeit lr.decision_function(X[0:1, :])

.. parsed-literal::

    64.9 µs ± 5.84 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

.. code:: ipython3

    %timeit fct(X[0, :])

.. parsed-literal::

    6.17 µs ± 380 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The speedup is significant on this example: about 6 µs per call against
about 65 µs for scikit-learn. It could be faster still by removing part of
the Python wrapper and optimizing the code produced by ``cffi``. We can also
avoid allocating the output array by reusing an existing one.

.. code:: ipython3

    out = fct(X[0, :])

.. code:: ipython3

    %timeit fct(X[0, :], out)

.. parsed-literal::

    6.33 µs ± 430 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
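
As a cross-check, the arithmetic performed by the generated C function can be reproduced with NumPy alone. The sketch below is not part of mlprodict: ``score_and_label`` is a hypothetical helper, the coefficients and intercept are copied from the generated C code above, and the behaviour of ``sign_float`` (1 for a positive score, 0 otherwise) is an assumption that is consistent with the prediction ``0`` printed earlier.

```python
import numpy as np

# Coefficients and intercept copied from the generated C code above.
COEF = np.array([3.3882975578308105, -3.164527654647827], dtype=np.float32)
INTERCEPT = np.float32(-8.323304176330566)


def score_and_label(features):
    """Hypothetical helper mirroring the adot, +, sign and concat steps."""
    features = np.asarray(features, dtype=np.float32)
    # adot followed by +: dot product of coefficients and features, plus intercept.
    score = np.dot(COEF, features) + INTERCEPT
    # Assumption: sign_float yields 1 for a positive score and 0 otherwise.
    label = np.float32(1.0 if score > 0 else 0.0)
    # concat: the output holds the prediction first, then the score.
    return np.array([label, score], dtype=np.float32)


result = score_and_label([1.0, 2.0])
print(result)
```

For the features ``[1, 2]`` used throughout this page, the score should agree with the value returned by ``gr.execute`` and by the compiled function up to ``float32`` rounding.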