module onnxrt.validate.validate_helper#

Short summary#

module mlprodict.onnxrt.validate.validate_helper

Validates runtime for many scikit-learn operators. The submodule relies on onnxconverter_common and sklearn-onnx.

source on GitHub

Classes#

class                    truncated documentation
RuntimeBadResultsError   Raised when the results are too different from scikit-learn.

Functions#

function                 truncated documentation
_dictionary2str
_dispsimple
_get_problem_data
_measure_time            Measures the execution time for a function.
_merge_options
_multiply_time_kwargs    Multiplies values in time_kwargs following strategy time_kwargs_fact for a given model inst.
_shape_exc
default_time_kwargs      Returns default values number and repeat to measure the execution of a function.
dump_into_folder         Dumps information when an error was detected using pickle.
measure_time             Measures a statement and returns the results as a dictionary.
modules_list             Returns modules and versions currently used.
sklearn_operators        Builds the list of operators from scikit-learn. The function goes through the list of submodules and gets …

Methods#

method                   truncated documentation
__init__

Documentation#

exception mlprodict.onnxrt.validate.validate_helper.RuntimeBadResultsError(msg, obs)#

Bases: RuntimeError

Raised when the results are too different from scikit-learn.

Parameters:
  • msg – message to display

  • obs – observations

__init__(msg, obs)#
Parameters:
  • msg – message to display

  • obs – observations
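
The exception carries both a message and the observations that triggered it. The sketch below is a stand-in class that mirrors the documented signature (msg, obs); storing obs as an attribute and the helper check_close are assumptions made for illustration, not the library's implementation.

```python
class BadResultsErrorSketch(RuntimeError):
    """Stand-in mirroring the documented signature (msg, obs)."""

    def __init__(self, msg, obs):
        super().__init__(msg)
        # Assumption: the observations are kept on the instance for inspection.
        self.obs = obs


def check_close(expected, got, tol=1e-5):
    """Hypothetical check: raises when two lists of floats differ too much."""
    diff = max(abs(e - g) for e, g in zip(expected, got))
    if diff > tol:
        raise BadResultsErrorSketch(
            f"max difference {diff:g} exceeds {tol:g}",
            {"expected": expected, "got": got, "max_diff": diff})
    return diff


try:
    check_close([0.1, 0.2], [0.1, 0.25])
except BadResultsErrorSketch as exc:
    caught = exc  # the observations remain available through caught.obs
```

Catching the error as a RuntimeError also works, since that is the documented base class.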

mlprodict.onnxrt.validate.validate_helper._dictionary2str(di)#
mlprodict.onnxrt.validate.validate_helper._dispsimple(arr, fLOG)#
mlprodict.onnxrt.validate.validate_helper._get_problem_data(prob, n_features)#
mlprodict.onnxrt.validate.validate_helper._measure_time(fct, repeat=1, number=1, first_run=True)#

Measures the execution time for a function.

Parameters:
  • fct – function to measure

  • repeat – number of times to repeat

  • number – number of calls between two measures

  • first_run – if True, runs the function once before measuring

Returns:

last result, average, values
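
The parameters above can be read as a nested timing loop. The following is a minimal sketch under that reading, not the library's code: the name measure_time_sketch is hypothetical, and expressing the average per call is an assumption.

```python
from time import perf_counter


def measure_time_sketch(fct, repeat=1, number=1, first_run=True):
    """Returns (last result, average duration of one call, raw durations)."""
    res = None
    if first_run:
        res = fct()  # warm-up call, not measured
    values = []
    for _r in range(repeat):
        begin = perf_counter()
        for _n in range(number):
            res = fct()
        # one measure covers `number` consecutive calls
        values.append(perf_counter() - begin)
    # Assumption: the average is expressed per call.
    average = sum(values) / (repeat * number)
    return res, average, values


res, average, values = measure_time_sketch(
    lambda: sum(range(1000)), repeat=3, number=5)
```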

mlprodict.onnxrt.validate.validate_helper._merge_options(all_conv_options, aoptions)#
mlprodict.onnxrt.validate.validate_helper._multiply_time_kwargs(time_kwargs, time_kwargs_fact, inst)#

Multiplies values in time_kwargs following strategy time_kwargs_fact for a given model inst.

Parameters:
  • time_kwargs – see below

  • time_kwargs_fact – see below

  • inst – scikit-learn model

Returns:

new time_kwargs

Possible values for time_kwargs_fact:

  • an integer: multiplies number by this value

  • 'lin': multiplies number for linear models by a factor which depends on the number of rows n to process (proportional to 1/log10(n))

<<<

from pprint import pprint
from sklearn.linear_model import LinearRegression
from mlprodict.onnxrt.validate.validate_helper import (
    default_time_kwargs, _multiply_time_kwargs)

lr = LinearRegression()
kw = default_time_kwargs()
pprint(kw)

kw2 = _multiply_time_kwargs(kw, 'lin', lr)
pprint(kw2)

>>>

    {1: {'number': 15, 'repeat': 20},
     10: {'number': 10, 'repeat': 20},
     100: {'number': 4, 'repeat': 10},
     1000: {'number': 4, 'repeat': 4},
     10000: {'number': 2, 'repeat': 2}}
    {1: {'number': 150, 'repeat': 20},
     10: {'number': 100, 'repeat': 20},
     100: {'number': 20, 'repeat': 10},
     1000: {'number': 12, 'repeat': 4},
     10000: {'number': 6, 'repeat': 2}}
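
The integer strategy is the simpler of the two and can be sketched as follows. This is an illustration only: multiply_number_sketch is a hypothetical name, the 'lin' strategy is not covered, and the defaults are copied from the output shown above.

```python
import copy


def multiply_number_sketch(time_kwargs, factor):
    """Returns a copy of time_kwargs with every 'number' multiplied by factor."""
    modified = copy.deepcopy(time_kwargs)  # the input is left untouched
    for settings in modified.values():
        settings["number"] *= factor
    return modified


# Defaults copied from the documented output of default_time_kwargs().
defaults = {
    1: {"number": 15, "repeat": 20},
    10: {"number": 10, "repeat": 20},
    100: {"number": 4, "repeat": 10},
    1000: {"number": 4, "repeat": 4},
    10000: {"number": 2, "repeat": 2},
}
scaled = multiply_number_sketch(defaults, 2)
```

Only number changes; repeat keeps its original value.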

mlprodict.onnxrt.validate.validate_helper._shape_exc(obj)#
mlprodict.onnxrt.validate.validate_helper.default_time_kwargs()#

Returns default values number and repeat to measure the execution of a function.

<<<

from mlprodict.onnxrt.validate.validate_helper import default_time_kwargs
import pprint
pprint.pprint(default_time_kwargs())

>>>

    {1: {'number': 15, 'repeat': 20},
     10: {'number': 10, 'repeat': 20},
     100: {'number': 4, 'repeat': 10},
     1000: {'number': 4, 'repeat': 4},
     10000: {'number': 2, 'repeat': 2}}

Keys define the number of rows; values define number and repeat.
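
Assuming number and repeat are used the way Timer.repeat uses them (each of the repeat measures runs the statement number times), the defaults translate into the following total number of calls per row count. The dictionary is copied from the output above.

```python
# Defaults copied from the documented output of default_time_kwargs().
defaults = {
    1: {"number": 15, "repeat": 20},
    10: {"number": 10, "repeat": 20},
    100: {"number": 4, "repeat": 10},
    1000: {"number": 4, "repeat": 4},
    10000: {"number": 2, "repeat": 2},
}

# Assumption: total calls = number * repeat for each number of rows.
total_calls = {rows: kw["number"] * kw["repeat"]
               for rows, kw in defaults.items()}
```

Under that assumption the defaults spend fewer calls on larger inputs, from 300 calls for a single row down to 4 calls for 10000 rows.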

mlprodict.onnxrt.validate.validate_helper.dump_into_folder(dump_folder, obs_op=None, is_error=True, **kwargs)#

Dumps information when an error was detected using pickle.

Parameters:
  • dump_folder – folder where the information is dumped

  • obs_op – information to dump

  • is_error – is it an error or not?

  • kwargs – additional parameters

Returns:

name
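
A minimal sketch of dumping such information with pickle is given below. It only mirrors the documented signature: the function name, the file naming scheme and the content of the pickled payload are assumptions.

```python
import os
import pickle
import tempfile


def dump_into_folder_sketch(dump_folder, obs_op=None, is_error=True, **kwargs):
    """Pickles obs_op and kwargs into dump_folder, returns the file name."""
    os.makedirs(dump_folder, exist_ok=True)
    # Hypothetical naming scheme: a prefix plus the number of existing files.
    prefix = "dump-ERROR" if is_error else "dump-i"
    index = len(os.listdir(dump_folder))
    name = os.path.join(dump_folder, f"{prefix}-{index}.pkl")
    payload = dict(kwargs, obs_op=obs_op)
    with open(name, "wb") as f:
        pickle.dump(payload, f)
    return name


with tempfile.TemporaryDirectory() as folder:
    name = dump_into_folder_sketch(
        folder, obs_op={"name": "LinearRegression"}, exc="mismatch")
    with open(name, "rb") as f:
        restored = pickle.load(f)  # reads the dumped information back
```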

mlprodict.onnxrt.validate.validate_helper.measure_time(stmt, x, repeat=10, number=50, div_by_number=False, first_run=True, max_time=None)#

Measures a statement and returns the results as a dictionary.

Parameters:
  • stmt – string

  • x – matrix

  • repeat – number of experiments the result is averaged over

  • number – number of executions in a row (within one experiment)

  • div_by_number – divide by the number of executions

  • first_run – if True, runs the function once before measuring

  • max_time – executes the statement until the total time goes beyond this value (approximately); repeat is ignored and div_by_number must be set to True

Returns:

dictionary

See Timer.repeat for a better understanding of parameters repeat and number. Unless div_by_number is True, the function returns a duration corresponding to number times the execution of the main statement.
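
The relation to Timer.repeat can be sketched with the standard library alone. This is not the library's implementation: the function name and the keys of the returned dictionary are hypothetical, the statement is assumed to refer to its input as x, and max_time and first_run are left out.

```python
import statistics
from timeit import Timer


def measure_statement_sketch(stmt, x, repeat=10, number=50,
                             div_by_number=False):
    """Times stmt (which refers to x) and returns summary statistics."""
    timer = Timer(stmt, globals={"x": x})
    # repeat measures, each one covering `number` executions of stmt
    durations = timer.repeat(repeat=repeat, number=number)
    if div_by_number:
        durations = [d / number for d in durations]
    # Hypothetical keys, chosen for this sketch only.
    return {
        "average": statistics.mean(durations),
        "deviation": statistics.pstdev(durations),
        "min_exec": min(durations),
        "max_exec": max(durations),
        "repeat": repeat,
        "number": number,
    }


stats = measure_statement_sketch(
    "sorted(x)", list(range(1000, 0, -1)),
    repeat=5, number=20, div_by_number=True)
```

With div_by_number=False every duration covers number executions, which matches the remark above about the returned duration.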

mlprodict.onnxrt.validate.validate_helper.modules_list()#

Returns modules and versions currently used.

<<<

from mlprodict.onnxrt.validate.validate_helper import modules_list
from pyquickhelper.pandashelper import df2rst
from pandas import DataFrame
print(df2rst(DataFrame(modules_list())))

>>>

    name         version
    mlprodict    0.9.1887
    numpy        1.23.5
    onnx         1.12.0
    onnxmltools  1.11.1
    onnxruntime  1.13.1
    pandas       1.5.2
    scipy        1.9.3
    skl2onnx     1.13
    sklearn      1.1.1
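
A list of this shape can be collected with the standard library. The sketch below is an assumption about how such a list might be built, not the library's code: modules_list_sketch is a hypothetical name, and modules that are missing or expose no __version__ are skipped.

```python
import importlib


def modules_list_sketch(names):
    """Returns [{'name': ..., 'version': ...}] for the importable modules."""
    rows = []
    for name in sorted(names):
        try:
            mod = importlib.import_module(name)
        except ImportError:
            continue  # module not installed: skipped
        version = getattr(mod, "__version__", None)
        if version is not None:
            rows.append({"name": name, "version": str(version)})
    return rows


# The result depends on what is installed in the current environment.
rows = modules_list_sketch(["numpy", "pandas", "sklearn"])
```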

mlprodict.onnxrt.validate.validate_helper.sklearn_operators(subfolder=None, extended=False, experimental=True)#

Builds the list of operators from scikit-learn. The function goes through the list of submodules and gets the list of classes which inherit from sklearn.base.BaseEstimator.

Parameters:
  • subfolder – look into only one subfolder

  • extended – extends the list to the list of operators this package implements a converter for

  • experimental – includes experimental modules from scikit-learn (see sklearn.experimental)

Returns:

the list of found operators
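
The discovery step described above (walk a module, keep the classes which inherit from a base class) can be sketched generically with inspect. The helper find_subclasses and the keys of its result are hypothetical, and the demonstration uses the standard library so that it runs without scikit-learn.

```python
import collections
import inspect


def find_subclasses(module, base):
    """Lists the public classes of module which inherit from base."""
    found = []
    for name, obj in inspect.getmembers(module, inspect.isclass):
        if name.startswith("_") or obj is base:
            continue
        if issubclass(obj, base):
            found.append({"name": name, "class": obj})
    return found


# Demonstration on the standard library: dict subclasses in collections.
# With scikit-learn installed, the same helper could be called as
# find_subclasses(sklearn.linear_model, sklearn.base.BaseEstimator).
found = find_subclasses(collections, dict)
names = [item["name"] for item in found]
```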
