module asv_benchmark.create_asv

Short summary

module mlprodict.asv_benchmark.create_asv

Functions to create a benchmark based on asv for many regressors and classifiers.

source on GitHub

Functions

_create_asv_benchmark_file – Creates a benchmark file based on the information received through the arguments. It uses one of the templates like …

_enumerate_asv_benchmark_all_models – Loops over all possible models and fills a folder with benchmarks following asv concepts.

create_asv_benchmark – Creates an asv benchmark in a folder but does not run it.

Documentation

Functions to create a benchmark based on asv for many regressors and classifiers.

source on GitHub

mlprodict.asv_benchmark.create_asv._create_asv_benchmark_file(location, model, scenario, optimisations, new_conv_options, extra, dofit, problem, runtime, X_train, X_test, y_train, y_test, Xort_test, init_types, conv_options, method_name, n_features, dims, opsets, output_index, predict_kwargs, prefix_import, exc, execute=False, location_pyspy=None, patterns=None)

Creates a benchmark file based on the information received through the arguments. It uses one of the templates like TemplateBenchmarkClassifier or TemplateBenchmarkRegressor.

source on GitHub

mlprodict.asv_benchmark.create_asv._enumerate_asv_benchmark_all_models(location, opset_min=10, opset_max=None, runtime=('scikit-learn', 'python'), models=None, skip_models=None, extended_list=True, n_features=None, dtype=None, verbose=0, filter_exp=None, dims=None, filter_scenario=None, exc=True, flat=False, execute=False, dest_pyspy=None, fLOG=<built-in function print>)

Loops over all possible models and fills a folder with benchmarks following asv concepts.

Parameters:
  • n_features – number of features to try; it changes the default number of features for a specific problem and can also be a comma-separated list

  • dims – number of observations to try

  • verbose – integer from 0 (no output) to 2 (full verbosity)

  • opset_min – tries every conversion from this minimum opset

  • opset_max – tries every conversion up to maximum opset

  • runtime – runtime to check: scikit-learn, python; onnxruntime1 checks onnxruntime, onnxruntime2 checks every ONNX node independently with onnxruntime; several runtimes can be checked at the same time if the value is a comma-separated list

  • models – list of models to test or empty string to test them all

  • skip_models – models to skip

  • extended_list – extends the list of scikit-learn converters with converters implemented in this module

  • dtype – ‘32’ or ‘64’ or None for both, limits the tests to one specific number type

  • fLOG – logging function

  • filter_exp – function which tells whether an experiment must be run, None to run them all; it takes (model, problem) as input (see the sketch after this list)

  • filter_scenario – second function which tells whether an experiment must be run, None to run them all; it takes (model, problem, scenario, extra) as input

  • exc – if False, raises warnings instead of exceptions whenever possible

  • flat – if True, one folder for all files, otherwise subfolders

  • execute – execute each script to make sure imports are correct

  • dest_pyspy – add a file to profile the prediction function with py-spy
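
Both filter functions are plain callables returning a boolean. A minimal sketch, assuming ‘b-cl’ is the label used for binary-classification problems (the exact model and problem values passed in depend on the installed converters):

    def filter_exp(model, problem):
        # Keep only binary-classification experiments; 'b-cl' is the
        # assumed problem label for binary classification.
        return problem == "b-cl"

    def filter_scenario(model, problem, scenario, extra):
        # Run only the default scenario and skip configurations with
        # extra converter options.
        return scenario == "default" and not extra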

source on GitHub

mlprodict.asv_benchmark.create_asv.create_asv_benchmark(location, opset_min=-1, opset_max=None, runtime=('scikit-learn', 'python_compiled'), models=None, skip_models=None, extended_list=True, dims=(1, 10, 100, 10000), n_features=(4, 20), dtype=None, verbose=0, fLOG=<built-in function print>, clean=True, conf_params=None, filter_exp=None, filter_scenario=None, flat=False, exc=False, build=None, execute=False, add_pyspy=False, env=None, matrix=None)

Creates an asv benchmark in a folder but does not run it.

Parameters:
  • n_features – number of features to try; it changes the default number of features for a specific problem and can also be a comma-separated list

  • dims – number of observations to try

  • verbose – integer from 0 (no output) to 2 (full verbosity)

  • opset_min – tries every conversion from this minimum opset, -1 to get the current opset defined by module onnx

  • opset_max – tries every conversion up to maximum opset, -1 to get the current opset defined by module onnx

  • runtime – runtime to check: scikit-learn, python; python_compiled compiles the graph structure and is more efficient when the number of observations is small; onnxruntime1 checks onnxruntime, onnxruntime2 checks every ONNX node independently with onnxruntime; several runtimes can be checked at the same time if the value is a comma-separated list

  • models – list of models to test or empty string to test them all

  • skip_models – models to skip

  • extended_list – extends the list of scikit-learn converters with converters implemented in this module

  • dtype – ‘32’ or ‘64’ or None for both, limits the tests to one specific number type

  • fLOG – logging function

  • clean – clean the folder first, otherwise overwrites the content

  • conf_params – overwrites some of the configuration parameters, see the default configuration and the override sketch below

  • filter_exp – function which tells whether an experiment must be run, None to run them all; it takes (model, problem) as input

  • filter_scenario – second function which tells whether an experiment must be run, None to run them all; it takes (model, problem, scenario, extra) as input

  • flat – if True, one folder for all files, otherwise subfolders

  • exc – if False, raises warnings instead of exceptions whenever possible

  • build – where to put the outputs

  • execute – execute each script to make sure imports are correct

  • add_pyspy – add an extra folder with code to profile each configuration

  • env – None to use the default configuration or ‘same’ to use the current one

  • matrix – specifies versions for a module, for example {'onnxruntime': ['1.1.1', '1.1.2']}; if a package name starts with ‘~’, the package is removed

Returns:

created files
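
A minimal usage sketch: the destination folder and model names below are placeholders, and the call only generates the benchmark files without running them:

    from mlprodict.asv_benchmark.create_asv import create_asv_benchmark

    # Generate benchmark files for two scikit-learn models into a local
    # folder; "asv_bench" is a placeholder destination.
    created = create_asv_benchmark(
        "asv_bench",
        models=["LogisticRegression", "DecisionTreeClassifier"],
        runtime=("scikit-learn", "python_compiled"),
        dims=(1, 100),
        n_features=(4,),
        verbose=1,
    )
    print("%d files created" % len(created))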

The default configuration is the following:

<<<

import pprint
from mlprodict.asv_benchmark.create_asv import default_asv_conf

pprint.pprint(default_asv_conf)

>>>

    {'benchmark_dir': 'benches',
     'branches': ['master'],
     'build_command': ['python setup.py build',
                       'PIP_NO_BUILD_ISOLATION=false python -mpip wheel --no-deps '
                       '--no-index -w {build_cache_dir} {build_dir}'],
     'env_dir': 'env',
     'environment_type': 'virtualenv',
     'html_dir': 'html',
     'install_command': ['python -mpip install {wheel_file}'],
     'install_timeout': 600,
     'matrix': {'Pillow': [],
                'cython': [],
                'jinja2': [],
                'joblib': [],
                'lightgbm': [],
                'mlinsights': [],
                'numpy': [],
                'onnx': ['http://localhost:8067/simple/'],
                'onnxconverter-common': ['http://localhost:8067/simple/'],
                'onnxruntime': ['http://localhost:8067/simple/'],
                'pandas': [],
                'pybind11': [],
                'pyquickhelper': [],
                'scikit-learn': ['http://localhost:8067/simple/'],
                'scipy': [],
                'skl2onnx': ['http://localhost:8067/simple/'],
                'xgboost': []},
     'project': 'mlprodict',
     'project_url': 'http://www.xavierdupre.fr/app/mlprodict/helpsphinx/index.html',
     'repo': 'https://github.com/sdpython/mlprodict.git',
     'repo_subdir': '',
     'results_dir': 'results',
     'show_commit_url': 'https://github.com/sdpython/mlprodict/commit/',
     'uninstall_command': ['return-code=any python -mpip uninstall -y {project}'],
     'version': 1}
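
Any of these keys can be overridden through conf_params, and matrix pins package versions as described above. A minimal sketch with placeholder values:

    from mlprodict.asv_benchmark.create_asv import create_asv_benchmark

    # Overwrite two configuration keys and pin onnxruntime versions;
    # all values below are placeholders for illustration.
    create_asv_benchmark(
        "asv_bench",
        conf_params={"install_timeout": 1200, "branches": ["main"]},
        matrix={"onnxruntime": ["1.1.1", "1.1.2"]},
    )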

The benchmark does not seem to work well with the setting --environment existing:same: the publishing step fails.

source on GitHub