module benchhelper.grid_benchmark

Inheritance diagram of pyquickhelper.benchhelper.grid_benchmark

Short summary

module pyquickhelper.benchhelper.grid_benchmark

Grid benchmark.

source on GitHub

Classes


GridBenchMark

Compares a couple of machine learning models.

Properties


Appendix

Returns the appendix.

Graphs

Returns images of graphs.

Metadata

Returns the metadata.

Metrics

Returns the metrics.

Name

Returns the name of the benchmark.

Methods


__init__

Initialisation.

bench

Runs an experiment multiple times; the parameter di is the dataset to use.

bench_experiment

Function to overload.

init

Skips it.

init_main

Initialisation.

predict_score_experiment

Function to overload.

preprocess_dataset

Splits the dataset into train and test.

run

Runs the benchmark.

Documentation

Grid benchmark.


class pyquickhelper.benchhelper.grid_benchmark.GridBenchMark(name, datasets, clog=None, fLOG=<function noLOG>, path_to_images='.', cache_file=None, repetition=1, progressbar=None, **params)[source]

Bases: pyquickhelper.benchhelper.benchmark.BenchMark

Compares a couple of machine learning models.


Initialisation.

Parameters
  • name – name of the test

  • datasets – list of dictionaries of dataframes

  • clog – see CustomLog or string

  • fLOG – logging function

  • params – extra parameters

  • path_to_images – path to images

  • cache_file – cache file

  • repetition – number of repetitions of the experiment (to get a confidence interval)

  • progressbar – relies on tqdm, example tnrange

If cache_file is specified, the class stores the results of the method bench there. On a second run, it loads the cache and only executes modified or new runs (listed in param_list).

datasets should be a dictionary with dataframes as values, with the following keys:

  • 'X': features

  • 'Y': labels (optional)

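The grid pattern described above (datasets as dictionaries with at least an 'X' key and an optional 'Y' key, every experiment repeated repetition times on every dataset) can be sketched without pyquickhelper. This is a minimal illustration of the pattern, not the library's implementation; the name run_grid is hypothetical:

```python
# Hypothetical sketch of the grid-benchmark loop: every experiment runs
# on every dataset `repetition` times, and each run is recorded.
def run_grid(datasets, experiments, repetition=1):
    results = []
    for ds_name, info in datasets.items():
        # The documented contract: each dataset has at least key 'X'.
        assert "X" in info, "each dataset needs at least key 'X'"
        for exp_name, experiment in experiments.items():
            for rep in range(repetition):
                metric = experiment(info)
                results.append(dict(dataset=ds_name,
                                    experiment=exp_name,
                                    repetition=rep,
                                    metric=metric))
    return results

# A trivial "experiment": the mean of the features.
datasets = {"ds1": {"X": [1.0, 2.0, 3.0], "Y": [0, 1, 0]}}
experiments = {"mean": lambda info: sum(info["X"]) / len(info["X"])}
res = run_grid(datasets, experiments, repetition=2)
```

In the real class, the per-experiment logic lives in the overloadable methods bench_experiment and predict_score_experiment rather than in plain callables.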

__init__(name, datasets, clog=None, fLOG=<function noLOG>, path_to_images='.', cache_file=None, repetition=1, progressbar=None, **params)[source]

Initialisation.

Parameters
  • name – name of the test

  • datasets – list of dictionaries of dataframes

  • clog – see CustomLog or string

  • fLOG – logging function

  • params – extra parameters

  • path_to_images – path to images

  • cache_file – cache file

  • repetition – number of repetitions of the experiment (to get a confidence interval)

  • progressbar – relies on tqdm, example tnrange

If cache_file is specified, the class stores the results of the method bench there. On a second run, it loads the cache and only executes modified or new runs (listed in param_list).

datasets should be a dictionary with dataframes as values, with the following keys:

  • 'X': features

  • 'Y': labels (optional)


bench(**params)[source]

Runs an experiment multiple times; the parameter di is the dataset to use.


bench_experiment(info, **params)[source]

Function to overload.

Parameters
  • info – dictionary with at least key 'X'

  • params – additional parameters

Returns

output of the experiment


init()[source]

Skips it.


init_main()[source]

Initialisation.


predict_score_experiment(info, output, **params)[source]

Function to overload.

Parameters
  • info – dictionary with at least key 'X'

  • output – output of the benchmark

  • params – additional parameters

Returns

output of the experiment, tuple of dictionaries


preprocess_dataset(dsi, **params)[source]

Splits the dataset into train and test.

Parameters
  • dsi – dataset index

  • params – additional parameters

Returns

list of tuples (dataset (same structure as info), dictionary for metrics, parameters)

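As a hedged illustration of the documented return shape — a list of (dataset-like info, metrics dictionary, parameters) tuples — here is a standalone train/test split in the same spirit. The function name mirrors the method but this is a sketch, not the library's implementation:

```python
import random

def preprocess_dataset(info, test_ratio=0.25, seed=0, **params):
    """Illustrative split: returns the documented shape, a list of
    (dataset-like info, dictionary for metrics, parameters) tuples."""
    rnd = random.Random(seed)
    n = len(info["X"])
    # Pick at least one test row, deterministically for a given seed.
    test = set(rnd.sample(range(n), max(1, int(n * test_ratio))))
    split = {}
    for key, values in info.items():
        split[key + "_train"] = [v for i, v in enumerate(values) if i not in test]
        split[key + "_test"] = [v for i, v in enumerate(values) if i in test]
    metrics = {"n_train": n - len(test), "n_test": len(test)}
    return [(split, metrics, params)]

out = preprocess_dataset({"X": [1, 2, 3, 4], "Y": [0, 1, 0, 1]})
```

Returning the split alongside a metrics dictionary lets the caller feed each tuple straight into an experiment while keeping bookkeeping (sizes, parameters) attached to the data.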

run(params_list)[source]

Runs the benchmark.
