# Experimental implementations

## Helpers

`mlprodict.testing.experimental_c_impl.experimental_c.code_optimisation`

()

code_optimisation() -> str

Returns a string giving some insights about optimisations.

## Implementation of ONNX operators

Experimental implementations of the algorithms behind some ONNX operators.

### Conv

Function im2col transforms an image in order to replace a convolution by a matrix multiplication.

`mlprodict.onnxrt.ops_cpu.op_conv_helper.im2col`

(*data*, *kernel_shape* = None, *fill_value* = 0)

Returns the result of im2col on an image NHCW where N is 1. The function is equivalent to `torch.nn.Unfold()` (but with padding=1 on all dimensions).
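The idea of replacing a convolution with a matrix multiplication can be sketched with NumPy alone. This is only an illustration under simplifying assumptions (a single 2D channel, stride 1, no padding, unlike the functions above); the helpers `im2col_2d_sketch` and `conv2d_direct` are hypothetical names and are not part of mlprodict.

```python
import numpy as np


def im2col_2d_sketch(image, kernel_shape):
    """Stacks every kh x kw patch of a 2D image as one row (no padding, stride 1)."""
    kh, kw = kernel_shape
    h, w = image.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((out_h * out_w, kh * kw), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = image[i:i + kh, j:j + kw].ravel()
    return cols, (out_h, out_w)


def conv2d_direct(image, kernel):
    """Reference result (cross-correlation, as in deep learning) with explicit loops."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.result_type(image, kernel))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


image = np.arange(20, dtype=np.float64).reshape(4, 5)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])

cols, out_shape = im2col_2d_sketch(image, kernel.shape)
# The convolution becomes one matrix-vector product over the patch matrix.
via_matmul = (cols @ kernel.ravel()).reshape(out_shape)
```

Each row of `cols` holds one flattened patch, so the double loop of the direct version collapses into a single product that a BLAS routine can run.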

`mlprodict.onnxrt.ops_cpu.op_conv_helper.im2col_naive_implementation`

(*data*, *kernel_shape*, *fill_value* = 0)

Naive implementation for im2col or `torch.nn.Unfold()` (but with padding=1).

`mlprodict.onnxrt.ops_cpu.op_conv_helper.im2col_nn`

(*res*)

Functions `nn_im2col_2d` and `im2col()` return the same results but with different shapes. This function converts a result from `nn_im2col_2d` into the same shape as a result returned by `im2col()`.

`mlprodict.onnxrt.ops_cpu.op_conv_helper.im2col_recursive`

(*data*, *kernel_shape*, *fill_value* = 0, *fall_back_dim* = 2)

Recursive implementation, falls back to `im2col_naive_implementation` for dimension <= fall_back_dim. The function is equivalent to `torch.nn.Unfold()` (but with padding=1 on all dimensions).

`mlprodict.onnxrt.ops_cpu.op_conv_helper.nn_im2col_2d`

(*data*, *kernel_shape*, *dilations*, *padding*, *fill_value* = 0)

C++ implementation for im2col or `torch.nn.Unfold()`.

`mlprodict.onnxrt.ops_cpu.op_conv_helper.nn_col2im_2d`

(*data*, *output_shape*, *kernel_shape*, *dilations*, *padding*)

C++ implementation for col2im or `torch.nn.Fold()`.

### Einsum

`mlprodict.testing.einsum`

(*equation*, *inputs*, *optimize* = False, *runtime* = 'batch_dot', *cache* = True, *opset* = None, *decompose* = True, *strategy* = None, *verbose* = None)

Proposes a new implementation of numpy.einsum. It does not allow expressions using an ellipsis (`...`) and expects an explicit right member (the output indices after `->`).
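The two restrictions can be illustrated with `numpy.einsum` itself. The snippet below does not call mlprodict; it only shows which equation forms NumPy accepts and which of them carries the explicit right member this implementation expects.

```python
import numpy as np

x = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
y = np.arange(20, dtype=np.float64).reshape(4, 5)

# Explicit form: the output indices are spelled out after "->".
explicit = np.einsum("abc,cd->abd", x, y)

# Forms that numpy accepts but that the description above rules out.
implicit = np.einsum("abc,cd", x, y)              # no right member
with_ellipsis = np.einsum("...c,cd->...d", x, y)  # uses an ellipsis
```

All three calls compute the same product here, so an equation in one of the last two forms can be rewritten by hand into the explicit form before it is passed on.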

`mlprodict.testing.einsum.einsum_fct.CachedEinsum`

(*self*, *equation*, *runtime* = 'batch_dot', *opset* = None, *optimize* = False, *dtype* = <class 'numpy.float64'>, *decompose* = True, *strategy* = None, *verbose* = None, *key* = None)

Stores all the information necessary to cache the preprocessing of an einsum equation.

`mlprodict.testing.einsum.optimize_decompose_einsum_equation`

(*equation*, *dtype*, *optimize* = False, *runtime* = 'batch_dot', *cache* = True, *opset* = None, *decompose* = True, *strategy* = None, *verbose* = None)

Proposes a new implementation of numpy.einsum. It does not allow expressions using an ellipsis (`...`) and expects an explicit right member (the output indices after `->`).

`mlprodict.testing.einsum.einsum_impl.analyse_einsum_equation`

(*equation*)

Analyses an einsum equation.
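As a rough idea of what analysing an equation can involve, here is a minimal pure-Python sketch. The function `parse_einsum_equation` and the dictionary it returns are hypothetical; the actual return value of `analyse_einsum_equation` is not described here and may differ.

```python
def parse_einsum_equation(equation):
    """Splits 'abc,cd->abd' into input terms, output term and index roles."""
    if "->" not in equation:
        raise ValueError("an explicit right member ('->...') is expected")
    if "..." in equation:
        raise ValueError("ellipsis is not supported in this sketch")
    left, output = equation.replace(" ", "").split("->")
    inputs = left.split(",")
    letters = sorted(set("".join(inputs)))
    # Indices absent from the output are summed over.
    summed = [c for c in letters if c not in output]
    # Position of every letter in every input, -1 when the letter is absent.
    positions = [[term.find(c) for c in letters] for term in inputs]
    return {"inputs": inputs, "output": output, "letters": letters,
            "summed": summed, "positions": positions}


info = parse_einsum_equation("abc,cd->abd")
```

The `positions` table records, for each input, where every letter sits, which is the kind of information needed to plan transposes and reductions.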

`mlprodict.testing.einsum.einsum_impl.apply_sequence`

`mlprodict.testing.einsum.decompose_einsum_equation`

(*equation*, *shapes*, *strategy* = 'simple', *clean* = False, *verbose* = False)

Decomposes an equation used in numpy.einsum knowing the input shapes. It returns a sequence of operations to do to compute the results.
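To give a sense of what a sequence of operations can look like, the sketch below decomposes the single equation `abc,cd->abd` by hand into a reshape, a matrix multiplication and a second reshape. It is an assumption-laden illustration written with NumPy only; the operations actually produced by `decompose_einsum_equation` may be different, and `einsum_abc_cd_abd_decomposed` is a hypothetical name.

```python
import numpy as np


def einsum_abc_cd_abd_decomposed(x, y):
    """Computes 'abc,cd->abd' as reshape -> matmul -> reshape."""
    a, b, c = x.shape
    c2, d = y.shape
    if c != c2:
        raise ValueError("shared index 'c' must have the same length")
    flat = x.reshape(a * b, c)      # 1. merge the batch-like axes a and b
    prod = flat @ y                 # 2. contract over c with a plain matmul
    return prod.reshape(a, b, d)    # 3. restore axes a and b


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4))
y = rng.standard_normal((4, 5))
decomposed = einsum_abc_cd_abd_decomposed(x, y)
```

The decomposition depends on the input shapes, which is why the shapes must be known before the sequence can be built.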

`mlprodict.testing.experimental_c_impl.experimental_c.custom_einsum_float`

(*equation*, *x*, *y*, *nthread* = 0)

custom_einsum_float(equation: str, x: numpy.ndarray[numpy.float32], y: numpy.ndarray[numpy.float32], nthread: int = 0) -> numpy.ndarray[numpy.float32]

Custom C++ implementation of operator einsum with float. The function only works with contiguous arrays. It does not perform any explicit transposes. It does not support the diagonal operator (repetition of the same letter). See the Python version `custom_einsum`.

`mlprodict.testing.experimental_c_impl.experimental_c.custom_einsum_double`

(*equation*, *x*, *y*, *nthread* = 0)

custom_einsum_double(equation: str, x: numpy.ndarray[numpy.float64], y: numpy.ndarray[numpy.float64], nthread: int = 0) -> numpy.ndarray[numpy.float64]

Custom C++ implementation of operator einsum with double. The function only works with contiguous arrays. It does not perform any explicit transposes. It does not support the diagonal operator (repetition of the same letter). See the Python version `custom_einsum`.
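The constraints stated for both C++ functions (contiguous arrays, no repeated letter) can be checked before the call. The helper below is a hypothetical sketch, not part of mlprodict, and assumes that "contiguous" means C-contiguous (row-major); pass `dtype=np.float64` when targeting the double version.

```python
import numpy as np


def prepare_for_custom_einsum(equation, x, y, dtype=np.float32):
    """Rejects repeated letters and returns C-contiguous arrays of the given dtype."""
    left = equation.split("->")[0]
    for term in left.split(","):
        if len(set(term)) != len(term):
            raise ValueError(f"diagonal term {term!r} is not supported")
    # ascontiguousarray copies only when the layout or dtype requires it.
    return (np.ascontiguousarray(x, dtype=dtype),
            np.ascontiguousarray(y, dtype=dtype))


x = np.arange(6, dtype=np.float64).reshape(2, 3).T   # a transposed, non-contiguous view
y = np.ones((2, 4), dtype=np.float64)
cx, cy = prepare_for_custom_einsum("ab,bc->ac", x, y)
```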

`mlprodict.testing.einsum.einsum_benchmark`

(*equation* = 'abc,cd->abd', *shape* = 30, *perm* = False, *runtime* = 'python', *use_tqdm* = False, *number* = 5, *repeat* = 5, *opset* = 17)

Investigates whether decomposing an einsum equation is faster than evaluating it directly.

`mlprodict.testing.einsum.numpy_diagonal`

(*m*, *axis*, *axes*)

Extracts diagonal coefficients from an array.

`mlprodict.testing.einsum.numpy_extended_dot`

(*m1*, *m2*, *axes*, *left*, *right*, *verbose* = False)

Extended version of a matrix multiplication (numpy.dot) with two matrices `m1`, `m2` of the same dimensions. Loops over `left` axes for `m1` and `right` axes for `m2`, summation is done over `axes`. Other axes must be empty. This multiplication combines matrix multiplication (dot) and broadcasted multiplication term by term.

`mlprodict.testing.einsum.numpy_extended_dot_python`

(*m1*, *m2*, *axes*, *left*, *right*, *verbose* = False)

Implementation of `numpy_extended_dot` in pure python. This implementation is not efficient but shows how to implement this operation without numpy.einsum.

`mlprodict.testing.einsum.numpy_extended_dot_matrix`

(*m1*, *m2*, *axes*, *left*, *right*, *verbose* = False)

Implementation of `numpy_extended_dot` using dot product, multiplication, transpose and reduction but not a custom python implementation like `numpy_extended_dot_python`.

`mlprodict.testing.einsum.einsum_impl_ext.numpy_extended_dot_ouput_shape`

(*m1*, *m2*, *axes*, *left*, *right*)

Computes the output shape of results produced by function `numpy_extended_dot` or `numpy_extended_dot_python`.

### Pad

`mlprodict.testing.experimental.custom_pad`

(*arr*, *paddings*, *constant* = 0, *verbose* = False)

Implements function pad in python, only the constant version.
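Constant padding can be written in a few lines of NumPy. The sketch below is not the code of `custom_pad`; the function name `constant_pad_sketch` is hypothetical, and the padding format (one `(before, after)` pair per axis, as in `numpy.pad`) is an assumption, since the format expected by `paddings` is not stated here.

```python
import numpy as np


def constant_pad_sketch(arr, pad_width, constant=0):
    """Pads arr with a constant; pad_width is a list of (before, after) pairs."""
    new_shape = tuple(n + before + after
                      for n, (before, after) in zip(arr.shape, pad_width))
    out = np.full(new_shape, constant, dtype=arr.dtype)
    # Copy the original values into the interior of the padded array.
    interior = tuple(slice(before, before + n)
                     for n, (before, after) in zip(arr.shape, pad_width))
    out[interior] = arr
    return out


arr = np.arange(6).reshape(2, 3)
padded = constant_pad_sketch(arr, [(1, 0), (2, 1)], constant=-1)
```

Allocating the full output once and copying the input into a slice avoids building the result axis by axis.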

### ReduceSum

`mlprodict.testing.experimental_c_impl.experimental_c.custom_reducesum_rk_double`

(*x*, *nthread* = 0)

custom_reducesum_rk_double(x: numpy.ndarray[numpy.float64], nthread: int = 0) -> numpy.ndarray[numpy.float64]

Custom C++ implementation of operator ReduceSum with double when the reduced matrix has two dimensions and the reduced axis is the first one. `x` is the reduced matrix. `nthread` specifies the number of threads used to distribute the computation. A negative value means the OMP default.

`mlprodict.testing.experimental_c_impl.experimental_c.custom_reducesum_rk_float`

(*x*, *nthread* = 0)

custom_reducesum_rk_float(x: numpy.ndarray[numpy.float32], nthread: int = 0) -> numpy.ndarray[numpy.float32]

Custom C++ implementation of operator ReduceSum with float when the reduced matrix has two dimensions and the reduced axis is the first one. `x` is the reduced matrix. `nthread` specifies the number of threads used to distribute the computation. A negative value means the OMP default.
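The case handled by both functions, a 2D matrix reduced over its first axis, can be sketched in NumPy as a row-by-row accumulation. This is a single-threaded illustration only: `reducesum_first_axis_sketch` is a hypothetical name, the thread distribution of the C++ code is not reproduced, and returning a 1D result (no kept dimension) is an assumption.

```python
import numpy as np


def reducesum_first_axis_sketch(x):
    """Sums a 2D matrix over axis 0 by accumulating one row at a time."""
    if x.ndim != 2:
        raise ValueError("a 2D matrix is expected")
    out = np.zeros(x.shape[1], dtype=x.dtype)
    for row in x:
        # Each row is contiguous in a C-ordered array, so memory is read sequentially.
        out += row
    return out


x = np.arange(12, dtype=np.float64).reshape(3, 4)
total = reducesum_first_axis_sketch(x)
```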