Convert a pipeline with a LightGBM model¶
sklearn-onnx only converts scikit-learn models into ONNX, but many libraries implement the scikit-learn API so that their models can be included in a scikit-learn pipeline. This example considers a pipeline that includes a LightGBM model. sklearn-onnx can convert the whole pipeline as long as it knows the converter associated with LGBMClassifier. Let's see how to do it.
Train a LightGBM classifier¶
from pyquickhelper.helpgen.graphviz_helper import plot_graphviz
from mlprodict.onnxrt import OnnxInference
import onnxruntime as rt
from skl2onnx import convert_sklearn, update_registered_converter
from skl2onnx.common.shape_calculator import calculate_linear_classifier_output_shapes # noqa
from onnxmltools.convert.lightgbm.operator_converters.LightGbm import convert_lightgbm # noqa
from skl2onnx.common.data_types import FloatTensorType
import numpy
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from lightgbm import LGBMClassifier
data = load_iris()
X = data.data[:, :2]
y = data.target
ind = numpy.arange(X.shape[0])
numpy.random.shuffle(ind)
X = X[ind, :].copy()
y = y[ind].copy()
pipe = Pipeline([('scaler', StandardScaler()),
                 ('lgbm', LGBMClassifier(n_estimators=3))])
pipe.fit(X, y)
Out:
Pipeline(steps=[('scaler', StandardScaler()),
('lgbm', LGBMClassifier(n_estimators=3))])
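The shuffle above draws from NumPy's global random state, so the row order, and with it the printed predictions, can change from run to run. A minimal sketch of a reproducible variant follows, assuming a seed of 0 (an arbitrary choice, not part of the original example) and a small stand-in array in place of the iris data:

```python
import numpy

# Stand-in data (hypothetical): row i holds [2*i, 2*i + 1] and has label i,
# so the pairing between X and y is easy to verify after shuffling.
X = numpy.arange(20).reshape(10, 2)
y = numpy.arange(10)

# A seeded generator makes the permutation repeatable across runs.
rng = numpy.random.default_rng(0)  # seed 0 is an arbitrary assumption
ind = rng.permutation(X.shape[0])

# Index both arrays with the same permutation to keep rows and labels paired.
X_shuffled = X[ind, :].copy()
y_shuffled = y[ind].copy()

print(ind)
```

Swapping the stand-in arrays for `data.data[:, :2]` and `data.target` should give the same pipeline input on every run.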
Register the converter for LGBMClassifier¶
The converter is implemented in onnxmltools: onnxmltools…LightGbm.py. The shape calculator is in onnxmltools…Classifier.py.
update_registered_converter(
    LGBMClassifier, 'LightGbmLGBMClassifier',
    calculate_linear_classifier_output_shapes, convert_lightgbm,
    options={'nocl': [True, False], 'zipmap': [True, False, 'columns']})
Convert again¶
Compare the predictions¶
Predictions with LightGBM.
Out:
predict [1 1 2 0 1]
predict_proba [[0.28897427 0.42250751 0.28851822]]
Predictions with onnxruntime.
sess = rt.InferenceSession("pipeline_lightgbm.onnx")
pred_onx = sess.run(None, {"input": X[:5].astype(numpy.float32)})
print("predict", pred_onx[0])
print("predict_proba", pred_onx[1][:1])
Out:
predict [1 1 2 0 1]
predict_proba [[0.2889743 0.42250752 0.28851822]]
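The two sets of probabilities agree up to rounding. The input was cast to float32 before being passed to onnxruntime, while LightGBM returns double precision. A small sketch of a numerical check, using the values printed above and a relative tolerance of 1e-5 (an assumed threshold, not one prescribed by either library):

```python
import numpy

# Probabilities copied from the two outputs printed above.
proba_lgbm = numpy.array([[0.28897427, 0.42250751, 0.28851822]])
proba_onnx = numpy.array([[0.2889743, 0.42250752, 0.28851822]],
                         dtype=numpy.float32)

# Raises AssertionError if any entry differs by more than the tolerance.
numpy.testing.assert_allclose(proba_lgbm, proba_onnx, rtol=1e-5)

# Largest absolute discrepancy between the two runtimes.
max_abs_diff = numpy.abs(proba_lgbm - proba_onnx).max()
print("max abs difference:", max_abs_diff)
```

In a full script, `pipe.predict_proba(X[:1])` and `pred_onx[1][:1]` could be compared the same way, provided the ONNX probabilities are first turned into an array (with the default ZipMap output they come back as a list of dictionaries).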
Final graph¶
oinf = OnnxInference(model_onnx)
ax = plot_graphviz(oinf.to_dot())
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)