module onnx_conv.helpers.lgbm_helper
Short summary#
module mlprodict.onnx_conv.helpers.lgbm_helper
Helpers to speed up the conversion of LightGBM models or to transform them.
Functions#
function | truncated documentation |
---|---|
dump_booster_model | Dumps Booster to JSON format. Parameters: self: booster, num_iteration: int or None, optional … |
dump_lgbm_booster | Dumps a LightGBM booster into JSON. |
modify_tree_for_rule_in_set | LightGBM sometimes produces a tree with a node set to use rule … |
restore_lgbm_info | Restores speed-up information to help modify the structure of the tree. |
Documentation#
Helpers to speed up the conversion of LightGBM models or to transform them.
- mlprodict.onnx_conv.helpers.lgbm_helper.dump_booster_model(self, num_iteration=None, start_iteration=0, importance_type='split', verbose=0)#
Dumps Booster to JSON format.
Parameters#
- self : booster
- num_iteration : int or None, optional (default=None)
Index of the iteration that should be dumped. If None, the best iteration is dumped if it exists; otherwise, all iterations are dumped. If <= 0, all iterations are dumped.
- start_iteration : int, optional (default=0)
Start index of the iteration that should be dumped.
- importance_type : string, optional (default="split")
What type of feature importance should be dumped. If "split", the result contains the number of times the feature is used in the model. If "gain", the result contains the total gain of the splits which use the feature.
- verbose : displays progress (useful for big trees)
Returns#
- json_repr : dict
JSON format of Booster.
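The difference between the two importance_type values can be illustrated on a dumped tree. The following is a minimal sketch, not the code used by lightgbm or mlprodict: feature_importance is a hypothetical helper, and the small tree is made-up data that only reuses the keys shown elsewhere on this page (split_feature, split_gain, left_child, right_child).

```python
from collections import defaultdict


def feature_importance(tree, importance_type="split"):
    """Accumulate per-feature importance over one dumped tree (nested dicts).

    Hypothetical helper for illustration only.
    """
    if importance_type not in ("split", "gain"):
        raise ValueError(f"unknown importance_type: {importance_type!r}")
    totals = defaultdict(float)
    stack = [tree]
    while stack:
        node = stack.pop()
        if "split_feature" not in node:
            continue  # leaves carry no split information
        if importance_type == "split":
            # "split": count how many times the feature is used
            totals[node["split_feature"]] += 1
        else:
            # "gain": sum the gains of the splits using the feature
            totals[node["split_feature"]] += node["split_gain"]
        stack.append(node["left_child"])
        stack.append(node["right_child"])
    return dict(totals)


# Made-up tree: feature 24 is used twice, feature 3 once.
tree = {
    "split_feature": 24, "split_gain": 12.5,
    "left_child": {"leaf_value": 0.1},
    "right_child": {
        "split_feature": 3, "split_gain": 2.0,
        "left_child": {
            "split_feature": 24, "split_gain": 0.5,
            "left_child": {"leaf_value": 0.0},
            "right_child": {"leaf_value": 0.2},
        },
        "right_child": {"leaf_value": 0.3},
    },
}

split_importance = feature_importance(tree, "split")
gain_importance = feature_importance(tree, "gain")
```

With this tree, the "split" variant counts occurrences of each feature while the "gain" variant adds up split_gain, so a feature used rarely but with large gains can rank differently under the two settings.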
Note
This function is inspired by lightgbm's dump_model. It creates an intermediate structure to speed up the conversion of such a model into ONNX. The function overrides json.load to extract nodes quickly.
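One way to extract nodes while the JSON text is being parsed is the object_hook argument of the standard json module. The sketch below illustrates that general technique only; whether dump_booster_model works this way internally is an assumption. load_and_collect_nodes is a hypothetical name and the model text is made-up data.

```python
import json


def load_and_collect_nodes(text):
    """Parse JSON and collect every split node in a single pass.

    Hypothetical helper: a node is taken to be any JSON object that has
    a 'split_feature' key.
    """
    nodes = []

    def hook(obj):
        # json calls this for every decoded object, innermost objects first.
        if "split_feature" in obj:
            nodes.append(obj)
        return obj  # returning the same dict keeps the parsed structure intact

    model = json.loads(text, object_hook=hook)
    return model, nodes


# Made-up model text with two split nodes and three leaves.
text = json.dumps({
    "tree_info": [{
        "tree_index": 0,
        "tree_structure": {
            "split_index": 0, "split_feature": 24, "threshold": "10||12",
            "decision_type": "==",
            "left_child": {"leaf_index": 0, "leaf_value": 0.1},
            "right_child": {
                "split_index": 1, "split_feature": 3, "threshold": 0.5,
                "decision_type": "<=",
                "left_child": {"leaf_index": 1, "leaf_value": 0.2},
                "right_child": {"leaf_index": 2, "leaf_value": 0.3},
            },
        },
    }]
})

model, nodes = load_and_collect_nodes(text)
```

Because the hook returns the decoded dictionary itself, the collected nodes are the same objects as those inside the parsed model, so they can later be modified in place without walking the tree again.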
- mlprodict.onnx_conv.helpers.lgbm_helper.dump_lgbm_booster(booster, verbose=0)#
Dumps a LightGBM booster into JSON.
- Parameters:
booster – LightGBM booster
verbose – verbosity
- Returns:
JSON dump as a dictionary, with more information
- mlprodict.onnx_conv.helpers.lgbm_helper.modify_tree_for_rule_in_set(gbm, use_float=False, verbose=0, count=0, info=None)#
LightGBM sometimes produces a tree with a node set to use rule == against a set of values (= in set); the values are separated by ||. This function unfolds these nodes.
- Parameters:
gbm – a tree coming from lightgbm dump
use_float – use float; otherwise try int first, then float if that does not work
verbose – verbosity, use tqdm to show progress
count – number of nodes already changed (origin) before this call
info – additional information to speed up this search
- Returns:
number of changed nodes (including count)
A child looks like the following:
<<<
import pprint
from mlprodict.onnx_conv.operator_converters.conv_lightgbm import modify_tree_for_rule_in_set

# A node using rule == against the set of values 10, 12 and 13.
tree = {'decision_type': '==',
        'default_left': True,
        'internal_count': 6805,
        'internal_value': 0.117558,
        'left_child': {'leaf_count': 4293,
                       'leaf_index': 18,
                       'leaf_value': 0.003519117642745049},
        'missing_type': 'None',
        'right_child': {'leaf_count': 2512,
                        'leaf_index': 25,
                        'leaf_value': 0.012305307958365394},
        'split_feature': 24,
        'split_gain': 12.233599662780762,
        'split_index': 24,
        'threshold': '10||12||13'}

# The tree is modified in place.
modify_tree_for_rule_in_set(tree)

pprint.pprint(tree)
>>>
{'decision_type': '==',
 'default_left': True,
 'internal_count': 6805,
 'internal_value': 0.117558,
 'left_child': {'leaf_count': 4293,
                'leaf_index': 18,
                'leaf_value': 0.003519117642745049},
 'missing_type': 'None',
 'right_child': {'decision_type': '==',
                 'default_left': True,
                 'internal_count': 6805,
                 'internal_value': 0.117558,
                 'left_child': {'leaf_count': 4293,
                                'leaf_index': 18,
                                'leaf_value': 0.003519117642745049},
                 'missing_type': 'None',
                 'right_child': {'decision_type': '==',
                                 'default_left': True,
                                 'internal_count': 6805,
                                 'internal_value': 0.117558,
                                 'left_child': {'leaf_count': 4293,
                                                'leaf_index': 18,
                                                'leaf_value': 0.003519117642745049},
                                 'missing_type': 'None',
                                 'right_child': {'leaf_count': 2512,
                                                 'leaf_index': 25,
                                                 'leaf_value': 0.012305307958365394},
                                 'split_feature': 24,
                                 'split_gain': 12.233599662780762,
                                 'split_index': 24,
                                 'threshold': 13},
                 'split_feature': 24,
                 'split_gain': 12.233599662780762,
                 'split_index': 24,
                 'threshold': 12},
 'split_feature': 24,
 'split_gain': 12.233599662780762,
 'split_index': 24,
 'threshold': 10}
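The transformation shown in the output above can be approximated in pure Python. This is a minimal sketch under stated assumptions, not the mlprodict implementation: unfold_in_set_node and parse_threshold are hypothetical names, the sketch handles a single node without recursing into its children, and it ignores the count and info arguments of the real function.

```python
import copy


def parse_threshold(value, use_float=False):
    """Convert one textual threshold: int first, then float, unless use_float."""
    if use_float:
        return float(value)
    try:
        return int(value)
    except ValueError:
        return float(value)


def unfold_in_set_node(node, use_float=False):
    """Rewrite in place a '==' node whose threshold is 'a||b||c' as a chain
    of single-value '==' nodes. Returns the number of nodes rewritten (0 or 1).
    """
    threshold = node.get("threshold")
    if node.get("decision_type") != "==" or not isinstance(threshold, str):
        return 0
    values = [parse_threshold(v, use_float) for v in threshold.split("||")]
    # Build the chain from the last value backwards; the original right
    # child ends up at the very end of the chain.
    tail = node["right_child"]
    for value in reversed(values[1:]):
        link = {k: copy.deepcopy(v) for k, v in node.items()
                if k != "right_child"}
        link["threshold"] = value
        link["right_child"] = tail
        tail = link
    node["threshold"] = values[0]
    node["right_child"] = tail
    return 1


tree = {'decision_type': '==', 'default_left': True,
        'internal_count': 6805, 'internal_value': 0.117558,
        'left_child': {'leaf_count': 4293, 'leaf_index': 18,
                       'leaf_value': 0.003519117642745049},
        'missing_type': 'None',
        'right_child': {'leaf_count': 2512, 'leaf_index': 25,
                        'leaf_value': 0.012305307958365394},
        'split_feature': 24, 'split_gain': 12.233599662780762,
        'split_index': 24, 'threshold': '10||12||13'}

changed = unfold_in_set_node(tree)
```

Each value of the set becomes its own equality test: a match goes to a copy of the original left child, a miss falls through to the next test, and the last miss reaches the original right child.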
- mlprodict.onnx_conv.helpers.lgbm_helper.restore_lgbm_info(tree)#
Restores speed-up information to help modify the structure of the tree.