module benchmark.bench_helper
Short summary
module pymlbenchmark.benchmark.bench_helper
Implements a benchmark about performance.
Functions
function | truncated documentation |
---|---|
bench_pivot | Merges all results for one set of parameters in one row. |
enumerate_options | Enumerates all possible options. |
remove_almost_nan_columns | Automatically removes columns with more than 1/3 nan values. |
Documentation
- pymlbenchmark.benchmark.bench_helper.bench_pivot(data, experiment='lib', value='mean', index=None)
Merges all results for one set of parameters in one row.
- Parameters:
data – DataFrame
experiment – column which identifies an experiment
value – column holding the value to plot
index – set of parameters that identifies an experiment; if None, it is guessed
- Returns:
<<<
import pandas
from pymlbenchmark.datasets import experiment_results
from pymlbenchmark.benchmark.bench_helper import bench_pivot

df = experiment_results('onnxruntime_LogisticRegression')
piv = bench_pivot(df)
print(piv.head())
>>>
lib                                          ort       skl
N count dim fit_intercept method
1 100   1   False         predict       0.000021  0.000041
                          predict_proba 0.000023  0.000049
            True          predict       0.000025  0.000071
                          predict_proba 0.000026  0.000051
        5   False         predict       0.000022  0.000042
- pymlbenchmark.benchmark.bench_helper.enumerate_options(options, filter_fct=None)
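To make the shape of this result easier to picture, here is a minimal sketch of a similar pivot written directly with `pandas.DataFrame.pivot_table`. It is not the implementation of `bench_pivot`: the function name `pivot_sketch`, the rule used to guess `index`, and the timing values are all assumptions made for illustration.

```python
import pandas as pd

# Made-up timings, for illustration only (not real benchmark results).
df = pd.DataFrame({
    "lib": ["skl", "ort", "skl", "ort"],
    "N": [1, 1, 10, 10],
    "method": ["predict", "predict", "predict", "predict"],
    "mean": [4.0, 2.0, 8.0, 3.0],
})


def pivot_sketch(data, experiment="lib", value="mean", index=None):
    """Hypothetical stand-in: one row per parameter set, one column per experiment."""
    if index is None:
        # Assumption: every column other than `experiment` and `value`
        # is treated as a parameter of the experiment.
        index = [c for c in data.columns if c not in (experiment, value)]
    return data.pivot_table(index=index, columns=experiment, values=value)


piv = pivot_sketch(df)
print(piv)
```

Each distinct combination of `N` and `method` becomes one row, and each value of `lib` becomes one column, which mirrors the layout of the output shown above.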
Enumerates all possible options.
- Parameters:
options – dictionary {name: list of values}
filter_fct – filters out some configurations
- Returns:
list of dictionaries {name: value}
<<<
from pymlbenchmark.benchmark.bench_helper import enumerate_options

options = dict(c1=[0, 1], c2=["aa", "bb"])

for row in enumerate_options(options):
    print("no-filter", row)


def filter_out(**opt):
    return not (opt["c1"] == 1 and opt["c2"] == "aa")


for row in enumerate_options(options, filter_out):
    print("filter", row)
>>>
no-filter {'c1': 0, 'c2': 'aa'}
no-filter {'c1': 0, 'c2': 'bb'}
no-filter {'c1': 1, 'c2': 'aa'}
no-filter {'c1': 1, 'c2': 'bb'}
filter {'c1': 0, 'c2': 'aa'}
filter {'c1': 0, 'c2': 'bb'}
filter {'c1': 1, 'c2': 'bb'}
- pymlbenchmark.benchmark.bench_helper.remove_almost_nan_columns(df, keep=None, fill_keep=True)
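The behaviour shown above can be reproduced with a Cartesian product over the option values. The following is a minimal sketch built on `itertools.product`, not the library's own code; the name `enumerate_options_sketch` is hypothetical, and calling `filter_fct` with keyword arguments is an assumption based on the `filter_out(**opt)` example.

```python
from itertools import product


def enumerate_options_sketch(options, filter_fct=None):
    """Hypothetical stand-in: yields one {name: value} dict per combination."""
    names = list(options)
    for values in product(*(options[name] for name in names)):
        row = dict(zip(names, values))
        # Assumption: the filter receives the configuration as keyword
        # arguments and returns True for configurations to keep.
        if filter_fct is None or filter_fct(**row):
            yield row


options = dict(c1=[0, 1], c2=["aa", "bb"])


def filter_out(**opt):
    return not (opt["c1"] == 1 and opt["c2"] == "aa")


all_rows = list(enumerate_options_sketch(options))
kept_rows = list(enumerate_options_sketch(options, filter_out))
print(len(all_rows), len(kept_rows))
```

Writing the sketch as a generator avoids building every combination in memory when the option lists are long.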
Automatically removes columns with more than 1/3 nan values.
- Parameters:
df – dataframe
keep – columns to skip
fill_keep – if not None, fill nan value
- Returns:
clean dataframe
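As an illustration of the 1/3 rule, the sketch below drops every column whose fraction of NaN values exceeds a threshold. It is an approximation under stated assumptions, not the library's code: the name `remove_almost_nan_columns_sketch` and the `threshold` parameter are hypothetical, `keep` is read as "columns that are never dropped", and the `fill_keep` behaviour is left out.

```python
import numpy as np
import pandas as pd


def remove_almost_nan_columns_sketch(df, keep=None, threshold=1 / 3):
    """Hypothetical stand-in: drops columns with more than `threshold` NaN."""
    # Assumption: columns listed in `keep` are never dropped.
    keep = set(keep or [])
    # Fraction of NaN values in each column.
    nan_ratio = df.isna().mean()
    cols = [c for c in df.columns if c in keep or nan_ratio[c] <= threshold]
    return df[cols]


df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],               # no NaN: kept
    "b": [np.nan, np.nan, 1.0, 2.0],         # 1/2 NaN: dropped
    "c": [np.nan, 5.0, 6.0, 7.0],            # 1/4 NaN: kept
    "d": [np.nan, np.nan, np.nan, np.nan],   # all NaN, but listed in keep
})

clean = remove_almost_nan_columns_sketch(df, keep=["d"])
print(list(clean.columns))
```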