Introduction to a numpy API for ONNX: FunctionTransformer

This notebook shows how to write Python functions with an API similar to the one numpy offers and obtain functions which can be converted into ONNX.

A pipeline with FunctionTransformer
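
A minimal sketch of such a pipeline, assuming scikit-learn is installed. The iris dataset, the scaler and the logistic regression are illustrative choices and may differ from the ones this notebook was run with; only FunctionTransformer wrapping numpy.log comes from the text.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Illustrative data: every iris feature is strictly positive, so log is defined.
X, y = load_iris(return_X_y=True)
X = X.astype(np.float32)

pipe = make_pipeline(
    FunctionTransformer(np.log),        # the step holding a numpy function
    StandardScaler(),
    LogisticRegression(max_iter=1000),
)
pipe.fit(X, y)

# Class probabilities for the first two observations.
proba = pipe.predict_proba(X[:2])
print(proba.shape)
```

The first step applies numpy.log to every feature; this is the part a converter has to translate into ONNX operators.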

Let's convert it into ONNX.

Use ONNX instead of numpy

The pipeline cannot be converted because the converter does not know how to convert the function (numpy.log) held by FunctionTransformer into ONNX. One way to avoid that is to replace it with a function log defined with ONNX operators and executed with an ONNX runtime.

The operator Log belongs to the graph. Using this function adds some overhead on small matrices. The gap is much smaller on big matrices.

Slightly more complex functions with a FunctionTransformer

What about more complex functions? They require a bit more work: the previous syntax does not work.

The syntax is different.

Let's compare the execution time with numpy.

The new function is slower but the gap is much smaller on bigger matrices. The default ONNX runtime has a significant cost compared to the cost of a couple of operations on small matrices.

Function transformer with FFT

The following function is equivalent to the modulus of the output of an FFT. The matrix $M$ is defined by $M=(\exp(-2i\pi kn/N))_{kn}$, with $N$ the length of the signal. Complex features are then obtained by computing $MX$. Taking the modulus leads to real features: $\sqrt{\mathrm{Re}(MX)^2 + \mathrm{Im}(MX)^2}$. That's what the following function does.

numpy implementation
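
A possible numpy sketch of the formula above, not necessarily the exact function this notebook used. It assumes every row of x is one signal of length $N$, so the product is written with the signals on the left; the name fft_abs is chosen for the example. The real and imaginary parts are computed separately with cosine and sine, which avoids complex numbers altogether.

```python
import numpy as np


def fft_abs(x):
    """Modulus of the discrete Fourier transform of every row of x."""
    n_features = x.shape[1]
    n = np.arange(n_features)
    k = n.reshape((-1, 1))
    # angle[k, n] = -2 * pi * k * n / N, the argument of M_kn
    angle = -2 * np.pi * k * n / n_features
    # Real and imaginary parts of the transform of each row.
    real = x @ np.cos(angle).T
    imag = x @ np.sin(angle).T
    return np.sqrt(real ** 2 + imag ** 2)


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8))
expected = np.abs(np.fft.fft(x, axis=1))
print(np.abs(fft_abs(x) - expected).max())
```

Only additions, products, cos, sin and sqrt are involved, all of which have an ONNX counterpart.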

ONNX implementation

This function cannot be exported into ONNX unless it is written with ONNX operators. This is where the numpy API for ONNX helps speed up the process.

custom_fft_abs is not a function but a class holding an ONNX graph. Its method __call__ executes the ONNX graph with a python runtime.

Every intermediate output can be logged.
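
The idea of a callable object which holds a graph and can log every intermediate result can be illustrated with a toy class. This is a hypothetical stand-in written for this example (TinyGraph and its node format are invented here); it is not the class used by the numpy API for ONNX and it does not build a real ONNX graph.

```python
import numpy as np


class TinyGraph:
    """Toy graph holder: a list of nodes executed one after the other."""

    def __init__(self, nodes, output):
        # Each node is a tuple (result_name, function, input_names).
        self.nodes = nodes
        self.output = output

    def __call__(self, x, verbose=False):
        results = {"X": x}
        for name, fct, inputs in self.nodes:
            results[name] = fct(*[results[i] for i in inputs])
            if verbose:
                # Log every intermediate output while walking the graph.
                print(f"{name}: shape={results[name].shape}")
        return results[self.output]


# log(1 + x) written as a graph with two nodes.
log1p_graph = TinyGraph(
    nodes=[
        ("shifted", lambda a: a + 1, ["X"]),
        ("Y", np.log, ["shifted"]),
    ],
    output="Y",
)

x = np.array([[0.0, 1.0], [2.0, 3.0]])
y = log1p_graph(x, verbose=True)
```

Walking the list of nodes and storing every result in a dictionary is also what makes such a python runtime slower than a direct numpy call on small inputs.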

Again, the gap is smaller on bigger matrices. It cannot be faster with the default runtime as it is also using numpy. That's another story with onnxruntime (see below).

Using onnxruntime

The python runtime uses numpy but is usually quite slow because it needs to walk through the graph structure. onnxruntime is faster.

onnxruntime is faster than numpy in this case.

Inside a FunctionTransformer

The conversion to ONNX fails if the python function is used.

Now with the ONNX version. Before that, the converter for FunctionTransformer needs to be overwritten to handle this functionality, which is not available in sklearn-onnx. These versions are automatically used by the function to_onnx from mlprodict.