This notebook shows how to write Python functions similar to the ones numpy offers, and obtain a function which can be converted into ONNX.
from jyquickhelper import add_notebook_menu
add_notebook_menu()
%load_ext mlprodict
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
import numpy
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler
from sklearn.linear_model import LogisticRegression
pipe = make_pipeline(
    FunctionTransformer(numpy.log),
    StandardScaler(),
    LogisticRegression())
pipe.fit(X_train, y_train)
Pipeline(steps=[('functiontransformer', FunctionTransformer(func=<ufunc 'log'>)), ('standardscaler', StandardScaler()), ('logisticregression', LogisticRegression())])
Let's convert it into ONNX.
from mlprodict.onnx_conv import to_onnx
try:
    onx = to_onnx(pipe, X_train.astype(numpy.float64))
except (RuntimeError, TypeError) as e:
    print(e)
FunctionTransformer is not supported unless the transform function is None (= identity). You may raise an issue at https://github.com/onnx/sklearn-onnx/issues.
The pipeline cannot be converted because the converter does not know how to convert the function (numpy.log) held by FunctionTransformer into ONNX. One way to avoid that is to replace it with a function log defined with ONNX operators and executed with an ONNX runtime.
import mlprodict.npy.numpy_onnx_pyrt as npnxrt
pipe = make_pipeline(
    FunctionTransformer(npnxrt.log),
    StandardScaler(),
    LogisticRegression())
pipe.fit(X_train, y_train)
Pipeline(steps=[('functiontransformer', FunctionTransformer(func=<mlprodict.npy.onnx_numpy_wrapper.onnxnumpy_nb_log_None_None object at 0x0000028680B88A90>)), ('standardscaler', StandardScaler()), ('logisticregression', LogisticRegression())])
onx = to_onnx(pipe, X_train.astype(numpy.float64), rewrite_ops=True)
%onnxview onx
The operator Log now belongs to the graph. There is some overhead when using this function on small matrices; the gap is much smaller on big matrices.
%timeit numpy.log(X_train)
4.61 µs ± 821 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit npnxrt.log(X_train)
15.5 µs ± 723 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
What about more complex functions? That is a bit more involved: the previous syntax does not work.
def custom_fct(x):
    return npnxrt.log(x + 1)
pipe = make_pipeline(
    FunctionTransformer(custom_fct),
    StandardScaler(),
    LogisticRegression())
pipe.fit(X_train, y_train)
Pipeline(steps=[('functiontransformer', FunctionTransformer(func=<function custom_fct at 0x00000286D5302700>)), ('standardscaler', StandardScaler()), ('logisticregression', LogisticRegression())])
try:
    onx = to_onnx(pipe, X_train.astype(numpy.float64), rewrite_ops=True)
except TypeError as e:
    print(e)
FunctionTransformer is not supported unless the transform function is of type <class 'function'> wrapped with onnxnumpy.
The syntax is different.
from typing import Any
from mlprodict.npy import onnxnumpy_default, NDArray
import mlprodict.npy.numpy_onnx_impl as npnx
@onnxnumpy_default
def custom_fct(x: NDArray[(None, None), numpy.float64]) -> NDArray[(None, None), numpy.float64]:
    return npnx.log(x + numpy.float64(1))
pipe = make_pipeline(
    FunctionTransformer(custom_fct),
    StandardScaler(),
    LogisticRegression())
pipe.fit(X_train, y_train)
Pipeline(steps=[('functiontransformer', FunctionTransformer(func=<mlprodict.npy.onnx_numpy_wrapper.onnxnumpy_custom_fct_None_None object at 0x00000286824D28B0>)), ('standardscaler', StandardScaler()), ('logisticregression', LogisticRegression())])
onx = to_onnx(pipe, X_train.astype(numpy.float64), rewrite_ops=True)
%onnxview onx
Let's compare the time to numpy.
def custom_numpy_fct(x):
    return numpy.log(x + numpy.float64(1))
%timeit custom_numpy_fct(X_train)
6.15 µs ± 116 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit custom_fct(X_train)
22.1 µs ± 2.21 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
The new function is slower, but the gap narrows on bigger matrices. The default ONNX runtime has a significant fixed cost compared to the cost of a couple of numpy operations on small matrices.
bigx = numpy.random.rand(10000, X_train.shape[1])
%timeit custom_numpy_fct(bigx)
433 µs ± 53.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit custom_fct(bigx)
357 µs ± 13.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
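The fixed-overhead effect can be reproduced with a toy measurement using only numpy and the standard library (a sketch; the absolute numbers are machine-dependent, only the trend matters):

```python
import timeit
import numpy

def fct(x):
    return numpy.log(x + 1.0)

small = numpy.random.rand(100, 4)
big = numpy.random.rand(10000, 4)

# same number of calls on both sizes: the per-call python overhead
# is identical, only the numpy work grows with the input
t_small = timeit.timeit(lambda: fct(small), number=1000)
t_big = timeit.timeit(lambda: fct(big), number=1000)
print(t_big / t_small)
```

The big matrix holds 100x more elements but does not take 100x longer per call: the fixed per-call cost is amortized on bigger inputs, which is why the ONNX version catches up as matrices grow.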
The following function is equivalent to the modulus of the output of an FFT transform. The matrix $M_{kn}$ is defined by $M_{kn}=(\exp(-2i\pi kn/N))_{kn}$. Complex features are then obtained by computing $MX$. Taking the modulus leads to real features: $\sqrt{Re(MX)^2 + Im(MX)^2}$. That's what the following function does.
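Before rewriting it, the formula can be checked directly with numpy (a small standalone sketch; `numpy.fft.fft` is only used here as a reference):

```python
import numpy

# DFT matrix M_{kn} = exp(-2i*pi*k*n/N) for N = 4
N = 4
n = numpy.arange(N)
k = n.reshape((-1, 1))
M = numpy.exp(-2j * numpy.pi * k * n / N)

# the modulus of M X coincides with the modulus of numpy's reference FFT
x = numpy.random.randn(3, N)
mod = numpy.abs(M @ x.T).T
assert numpy.allclose(mod, numpy.abs(numpy.fft.fft(x, axis=1)))
```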
def custom_fft_abs_py(x):
    "onnx fft + abs python"
    # see https://jakevdp.github.io/blog/
    # 2013/08/28/understanding-the-fft/
    dim = x.shape[1]
    n = numpy.arange(dim)
    k = n.reshape((-1, 1)).astype(numpy.float64)
    kn = k * n * (-numpy.pi * 2 / dim)
    kn_cos = numpy.cos(kn)
    kn_sin = numpy.sin(kn)
    ekn = numpy.empty((2,) + kn.shape, dtype=x.dtype)
    ekn[0, :, :] = kn_cos
    ekn[1, :, :] = kn_sin
    res = numpy.dot(ekn, x.T)
    tr = res ** 2
    mod = tr[0, :, :] + tr[1, :, :]
    return numpy.sqrt(mod).T
x = numpy.random.randn(3, 4).astype(numpy.float32)
custom_fft_abs_py(x)
array([[2.2948687 , 1.0178018 , 0.14858188, 1.0178018 ],
       [2.4203196 , 2.2417266 , 1.7303984 , 2.2417266 ],
       [3.0329118 , 1.2578778 , 0.75200284, 1.2578778 ]], dtype=float32)
This function cannot be exported into ONNX unless it is rewritten with ONNX operators. This is where the numpy API for ONNX helps speed up the process.
from mlprodict.npy import onnxnumpy_default, onnxnumpy_np, NDArray
import mlprodict.npy.numpy_onnx_impl as nxnp
def _custom_fft_abs(x):
    dim = x.shape[1]
    n = nxnp.arange(0, dim).astype(numpy.float32)
    k = n.reshape((-1, 1))
    kn = (k * (n * numpy.float32(-numpy.pi * 2))) / dim.astype(numpy.float32)
    kn3 = nxnp.expand_dims(kn, 0)
    kn_cos = nxnp.cos(kn3)
    kn_sin = nxnp.sin(kn3)
    ekn = nxnp.vstack(kn_cos, kn_sin)
    res = nxnp.dot(ekn, x.T)
    tr = res ** 2
    mod = tr[0, :, :] + tr[1, :, :]
    return nxnp.sqrt(mod).T
@onnxnumpy_default
def custom_fft_abs(x: NDArray[Any, numpy.float32],
                   ) -> NDArray[Any, numpy.float32]:
    "onnx fft + abs"
    return _custom_fft_abs(x)
custom_fft_abs(x)
array([[2.2948687 , 1.0178018 , 0.14858188, 1.0178018 ],
       [2.4203196 , 2.2417266 , 1.7303984 , 2.2417266 ],
       [3.0329118 , 1.257878  , 0.75200284, 1.2578778 ]], dtype=float32)
custom_fft_abs is not a function but a class holding an ONNX graph. Its method __call__ executes the ONNX graph with a Python runtime.
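The pattern can be sketched in plain Python (a hypothetical simplification: `OnnxWrapper` and its lazy `compiled` attribute are illustrative, not mlprodict's actual classes):

```python
import numpy

class OnnxWrapper:
    "Callable object standing in for a function compiled into an ONNX graph."

    def __init__(self, fct):
        self.fct = fct
        self.compiled = None  # in mlprodict this slot would hold the ONNX graph

    def __call__(self, x):
        # the real wrapper builds the ONNX graph once and then runs it;
        # here the python function itself plays the role of the compiled graph
        if self.compiled is None:
            self.compiled = self.fct
        return self.compiled(x)

log_fct = OnnxWrapper(numpy.log)
# log_fct is a class instance, yet callable like the original function
assert numpy.allclose(log_fct(numpy.array([1.0, numpy.e])), [0.0, 1.0])
```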
%onnxview custom_fft_abs.compiled.onnx_
Every intermediate output can be logged.
custom_fft_abs(x, verbose=1, fLOG=print)
-- OnnxInference: run 39 nodes
Onnx-Shape(x) -> Sh_shape0
 +kr='Sh_shape0': (2,) (dtype=int64 min=3 max=4)
Onnx-Slice(Sh_shape0, Sl_Slicecst, Sl_Slicecst1, Sl_Slicecst2) -> Sl_output01
 +kr='Sl_output01': (1,) (dtype=int64 min=4 max=4)
Onnx-Identity(Sl_Slicecst2) -> Sq_Squeezecst
 +kr='Sq_Squeezecst': (1,) (dtype=int64 min=0 max=0)
Onnx-Squeeze(Sl_output01, Sq_Squeezecst) -> Sq_squeezed01
 +kr='Sq_squeezed01': () (dtype=int64 min=4 max=4)
Onnx-Identity(Sl_Slicecst2) -> Su_Subcst
 +kr='Su_Subcst': (1,) (dtype=int64 min=0 max=0)
Onnx-Sub(Sq_squeezed01, Su_Subcst) -> Su_C0
 +kr='Su_C0': (1,) (dtype=int64 min=4 max=4)
Onnx-ConstantOfShape(Su_C0) -> Co_output01
 +kr='Co_output01': (4,) (dtype=int64 min=1 max=1)
Onnx-Identity(Sl_Slicecst2) -> Cu_CumSumcst
 +kr='Cu_CumSumcst': (1,) (dtype=int64 min=0 max=0)
Onnx-CumSum(Co_output01, Cu_CumSumcst) -> Cu_y0
 +kr='Cu_y0': (4,) (dtype=int64 min=1 max=4)
Onnx-Add(Cu_y0, Ad_Addcst) -> Ad_C01
 +kr='Ad_C01': (4,) (dtype=int64 min=0 max=3)
Onnx-Cast(Ad_C01) -> Ca_output0
 +kr='Ca_output0': (4,) (dtype=float32 min=0.0 max=3.0)
Onnx-Reshape(Ca_output0, Re_Reshapecst) -> Re_reshaped0
 +kr='Re_reshaped0': (4, 1) (dtype=float32 min=0.0 max=3.0)
Onnx-Mul(Ca_output0, Mu_Mulcst) -> Mu_C01
 +kr='Mu_C01': (4,) (dtype=float32 min=-18.84955596923828 max=-0.0)
Onnx-Mul(Re_reshaped0, Mu_C01) -> Mu_C0
 +kr='Mu_C0': (4, 4) (dtype=float32 min=-56.548667907714844 max=-0.0)
Onnx-Cast(Sq_squeezed01) -> Ca_output01
 +kr='Ca_output01': () (dtype=float32 min=4.0 max=4.0)
Onnx-Div(Mu_C0, Ca_output01) -> Di_C0
 +kr='Di_C0': (4, 4) (dtype=float32 min=-14.137166976928711 max=-0.0)
Onnx-Identity(Sl_Slicecst2) -> Un_Unsqueezecst
 +kr='Un_Unsqueezecst': (1,) (dtype=int64 min=0 max=0)
Onnx-Unsqueeze(Di_C0, Un_Unsqueezecst) -> Un_expanded0
 +kr='Un_expanded0': (1, 4, 4) (dtype=float32 min=-14.137166976928711 max=-0.0)
Onnx-Cos(Un_expanded0) -> Co_output0
 +kr='Co_output0': (1, 4, 4) (dtype=float32 min=-1.0 max=1.0)
Onnx-Sin(Un_expanded0) -> Si_output0
 +kr='Si_output0': (1, 4, 4) (dtype=float32 min=-1.0 max=1.0)
Onnx-Concat(Co_output0, Si_output0) -> Co_concat_result0
 +kr='Co_concat_result0': (2, 4, 4) (dtype=float32 min=-1.0 max=1.0)
Onnx-Transpose(x) -> Tr_transposed0
 +kr='Tr_transposed0': (4, 3) (dtype=float32 min=-2.0982813835144043 max=1.0294874906539917)
Onnx-MatMul(Co_concat_result0, Tr_transposed0) -> Ma_Y0
 +kr='Ma_Y0': (2, 4, 3) (dtype=float32 min=-3.032911777496338 max=2.2948687076568604)
Onnx-Pow(Ma_Y0, Po_Powcst) -> Po_Z0
 +kr='Po_Z0': (2, 4, 3) (dtype=float32 min=0.0 max=9.198554039001465)
Onnx-Identity(Sl_Slicecst2) -> Sl_Slicecst3
 +kr='Sl_Slicecst3': (1,) (dtype=int64 min=0 max=0)
Onnx-Identity(Sl_Slicecst) -> Sl_Slicecst4
 +kr='Sl_Slicecst4': (1,) (dtype=int64 min=1 max=1)
Onnx-Identity(Sl_Slicecst2) -> Sl_Slicecst5
 +kr='Sl_Slicecst5': (1,) (dtype=int64 min=0 max=0)
Onnx-Slice(Po_Z0, Sl_Slicecst3, Sl_Slicecst4, Sl_Slicecst5) -> Sl_output0
 +kr='Sl_output0': (1, 4, 3) (dtype=float32 min=0.020312773063778877 max=9.198554039001465)
Onnx-Identity(Sl_Slicecst2) -> Sq_Squeezecst1
 +kr='Sq_Squeezecst1': (1,) (dtype=int64 min=0 max=0)
Onnx-Squeeze(Sl_output0, Sq_Squeezecst1) -> Sq_squeezed0
 +kr='Sq_squeezed0': (4, 3) (dtype=float32 min=0.020312773063778877 max=9.198554039001465)
Onnx-Identity(Sl_Slicecst) -> Sl_Slicecst6
 +kr='Sl_Slicecst6': (1,) (dtype=int64 min=1 max=1)
Onnx-Identity(Sl_Slicecst1) -> Sl_Slicecst7
 +kr='Sl_Slicecst7': (1,) (dtype=int64 min=2 max=2)
Onnx-Identity(Sl_Slicecst2) -> Sl_Slicecst8
 +kr='Sl_Slicecst8': (1,) (dtype=int64 min=0 max=0)
Onnx-Slice(Po_Z0, Sl_Slicecst6, Sl_Slicecst7, Sl_Slicecst8) -> Sl_output02
 +kr='Sl_output02': (1, 4, 3) (dtype=float32 min=0.0 max=4.499505996704102)
Onnx-Identity(Sl_Slicecst2) -> Sq_Squeezecst2
 +kr='Sq_Squeezecst2': (1,) (dtype=int64 min=0 max=0)
Onnx-Squeeze(Sl_output02, Sq_Squeezecst2) -> Sq_squeezed02
 +kr='Sq_squeezed02': (4, 3) (dtype=float32 min=0.0 max=4.499505996704102)
Onnx-Add(Sq_squeezed0, Sq_squeezed02) -> Ad_C0
 +kr='Ad_C0': (4, 3) (dtype=float32 min=0.022076575085520744 max=9.198554039001465)
Onnx-Sqrt(Ad_C0) -> Sq_Y0
 +kr='Sq_Y0': (4, 3) (dtype=float32 min=0.1485818773508072 max=3.032911777496338)
Onnx-Transpose(Sq_Y0) -> y
 +kr='y': (3, 4) (dtype=float32 min=0.1485818773508072 max=3.032911777496338)
array([[2.2948687 , 1.0178018 , 0.14858188, 1.0178018 ],
       [2.4203196 , 2.2417266 , 1.7303984 , 2.2417266 ],
       [3.0329118 , 1.257878  , 0.75200284, 1.2578778 ]], dtype=float32)
%timeit custom_fft_abs_py(x)
25.3 µs ± 5.76 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit custom_fft_abs(x)
288 µs ± 12.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Again, the gap is smaller on bigger matrices. It cannot be faster with the default runtime since that runtime also relies on numpy. It is another story with onnxruntime (see below).
bigx = numpy.random.randn(10000, x.shape[1]).astype(numpy.float32)
%timeit custom_fft_abs_py(bigx)
1.72 ms ± 32.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit custom_fft_abs(bigx)
5.32 ms ± 206 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The python runtime relies on numpy but is usually quite slow because it has to walk through the graph structure node by node. onnxruntime is faster.
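Why a numpy-based graph interpreter cannot beat numpy itself can be illustrated with a toy mini-runtime (a hypothetical sketch, not mlprodict's actual implementation):

```python
import numpy

# a tiny "graph": y = sqrt(x * x + 1), stored as a list of nodes
graph = [
    ("Mul", ["x", "x"], "t1"),
    ("Add", ["t1", "one"], "t2"),
    ("Sqrt", ["t2"], "y"),
]
ops = {"Mul": numpy.multiply, "Add": numpy.add, "Sqrt": numpy.sqrt}

def run(graph, feeds):
    # walks the graph node by node and dispatches to numpy: every node
    # pays python-level dispatch costs on top of the numpy call itself
    results = dict(feeds)
    for op, inputs, output in graph:
        results[output] = ops[op](*[results[name] for name in inputs])
    return results

x = numpy.array([3.0, 4.0])
out = run(graph, {"x": x, "one": numpy.float64(1)})
assert numpy.allclose(out["y"], numpy.sqrt(x * x + 1))
```

Each node costs at least as much as the direct numpy call, plus the interpretation overhead, which is what a compiled runtime such as onnxruntime avoids.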
@onnxnumpy_np(runtime='onnxruntime')
def custom_fft_abs_ort(x: NDArray[Any, numpy.float32],
                       ) -> NDArray[Any, numpy.float32]:
    "onnx fft + abs"
    return _custom_fft_abs(x)
custom_fft_abs_ort(x)
array([[2.2948687 , 1.0178018 , 0.14858188, 1.0178018 ],
       [2.4203196 , 2.2417266 , 1.7303984 , 2.2417266 ],
       [3.0329118 , 1.257878  , 0.75200284, 1.2578778 ]], dtype=float32)
%timeit custom_fft_abs_ort(x)
149 µs ± 54 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
onnxruntime is faster than numpy in this case.
%timeit custom_fft_abs_ort(bigx)
217 µs ± 18.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
The conversion to ONNX fails if the python function is used.
from mlprodict.onnx_conv import to_onnx
tr = FunctionTransformer(custom_fft_abs_py)
tr.fit(x)
try:
    onnx_model = to_onnx(tr, x)
except Exception as e:
    print(e)
FunctionTransformer is not supported unless the transform function is of type <class 'function'> wrapped with onnxnumpy.
Now with the ONNX version. But first, the converter for FunctionTransformer needs to be overwritten to handle this functionality, which is not available in sklearn-onnx. These overwritten converters are automatically used by function to_onnx from mlprodict.
tr = FunctionTransformer(custom_fft_abs)
tr.fit(x)
onnx_model = to_onnx(tr, x)
from mlprodict.onnxrt import OnnxInference
oinf = OnnxInference(onnx_model)
y_onx = oinf.run({'X': x})
y_onx['variable']
array([[2.2948687 , 1.0178018 , 0.14858188, 1.0178018 ],
       [2.4203196 , 2.2417266 , 1.7303984 , 2.2417266 ],
       [3.0329118 , 1.257878  , 0.75200284, 1.2578778 ]], dtype=float32)