RSS blog page - 1/1 Blog onnx (5)


The bug which makes you waste time


It is not a bug, but it made me waste significant time just to understand what was going on. asv refused to detect the benchmark I was trying to set up just because the filenames contained dots. So, for asv, don't add a file whose name contains dots; use a name without dots instead. A couple of benchmarks to try: bench1, bench2.
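
A quick way to catch the problem before running asv is to scan the benchmark folder for file names that contain an extra dot. The helper below is only a sketch: the function name and the sample file names are made up for illustration, and it only looks at the top level of the folder.

```python
from pathlib import Path
import tempfile


def dotted_benchmark_files(folder):
    """Return the .py files in `folder` whose name still contains a dot
    once the final .py extension is removed (e.g. bench.v1.py)."""
    return sorted(
        path.name
        for path in Path(folder).glob("*.py")
        if "." in path.stem
    )


# Small demonstration on a temporary folder (hypothetical file names).
with tempfile.TemporaryDirectory() as folder:
    for name in ["bench1.py", "bench2.py", "bench.v1.py"]:
        (Path(folder) / name).touch()
    print(dotted_benchmark_files(folder))
```

Any file reported by this helper is a candidate for renaming, for example by replacing the extra dots with underscores.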


Operator CDist


The notebook Pairwise distances with ONNX (pdist) shows how much slower an ONNX implementation of function cdist is: from 3 to 10 times slower. One way to optimize the converted model is to create a dedicated operator, such as one for function cdist. The tutorial Converters with options explains how to tell function to_onnx to use the custom operator CDist.


Float, double with ONNX


Replicating what a library does, scikit-learn for example, is different from implementing a function defined in a paper. Every trick needs to be replicated. scikit-learn trees implement a prediction function which takes float features and compares them to double thresholds. Knowing that ONNX assumes comparisons only happen between numbers of the same type, you end up with discrepancies.
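
The effect can be reproduced with a single value. The sketch below uses only the standard library; the threshold 0.1 and the helper to_float32 are chosen for illustration, and the rule "feature <= threshold goes left" is assumed for the tree.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


threshold = 0.1            # double threshold, as kept by the trained tree
feature = to_float32(0.1)  # the feature has already been cast to float32

# Mixed comparison: float32 feature against the double threshold.
# float32(0.1) is slightly larger than the double 0.1, so this is False.
goes_left_mixed = feature <= threshold

# Same-type comparison: the threshold is cast to float32 first,
# both sides become equal, so this is True.
goes_left_float32 = feature <= to_float32(threshold)

print(goes_left_mixed, goes_left_float32)
```

The same observation follows a different branch depending on the type of the threshold, which is enough to change the prediction.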


ONNX updates


The Python runtime is now almost complete for all the supported numerical operators implemented in sklearn-onnx. A couple of notebooks introduce a couple of ways to investigate issues, to benchmark ONNX models with onnxruntime or the Python runtime, and to check the differences between runtimes for the same model. It also extends ONNX with operators not in the specification, to test some assumptions and check whether they are more efficient. Notebook Precision loss due to float32 conversion with ONNX introduces a way to estimate the margins introduced by the conversion from double to single precision. There is also a function to convert a numpy function into ONNX (see From numpy to ONNX). Its coverage is probably low but it will improve.


ONNX, runtime


Somebody asked me one day if it would be difficult to write a runtime for ONNX in Rust. I just replied that it should not take that long, but it would require implementing a way to go through the nodes of the ONNX graph and having an implementation for every ONNX operator.
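
The two pieces mentioned above, a loop over the nodes and one implementation per operator, can be sketched in a few lines of Python. This is a toy: the graph is a plain list of tuples rather than a real ONNX file, values are scalars rather than tensors, and only three operators (Add, Mul, Exp) are covered.

```python
import math

# One implementation per operator type (a real runtime needs all of them).
OPERATORS = {
    "Add": lambda a, b: a + b,
    "Mul": lambda a, b: a * b,
    "Exp": lambda a: math.exp(a),
}

# Toy graph: (operator type, input names, output name), in topological order.
GRAPH = [
    ("Mul", ["x", "w"], "xw"),
    ("Add", ["xw", "b"], "z"),
    ("Exp", ["z"], "y"),
]


def run(graph, inputs, output_name):
    """Go through the nodes in order and store every intermediate result."""
    values = dict(inputs)
    for op_type, input_names, output in graph:
        if op_type not in OPERATORS:
            raise NotImplementedError(f"No implementation for {op_type!r}")
        args = [values[name] for name in input_names]
        values[output] = OPERATORS[op_type](*args)
    return values[output_name]


result = run(GRAPH, {"x": 2.0, "w": 3.0, "b": -6.0}, "y")
print(result)  # exp(2 * 3 - 6) = exp(0) = 1.0
```

Most of the work in a real runtime is in the operator table, not in the loop.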


ONNX, runtime, converters


I have recently been working on sklearn-onnx to write converters from scikit-learn operators to the ONNX serialization format. I was talking about that a month ago and somebody asked me if there was a runtime implemented in Rust. Not that I know of, but I said it would not be too complex to implement one.


XGBoost into python code


Package pyxgboost converts a tree from xgboost into Python code. Python is still needed if the model has to be deployed, but it should be faster for small models.
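
The idea of turning a tree into code can be illustrated with a small generator. This is not the pyxgboost API: the nested dictionary layout, the function names and the sample tree are all hypothetical, and the split rule "feature < threshold takes the yes branch" is an assumption of the sketch.

```python
def tree_to_code(node, indent=1):
    """Recursively emit nested if/else statements for one tree."""
    pad = "    " * indent
    if "leaf" in node:
        return f"{pad}return {node['leaf']!r}\n"
    return (
        f"{pad}if x[{node['feature']}] < {node['threshold']!r}:\n"
        + tree_to_code(node["yes"], indent + 1)
        + f"{pad}else:\n"
        + tree_to_code(node["no"], indent + 1)
    )


def compile_tree(tree, name="predict"):
    """Build the source of a function and compile it with exec."""
    source = f"def {name}(x):\n" + tree_to_code(tree)
    namespace = {}
    exec(source, namespace)
    return source, namespace[name]


# Hypothetical tree with two splits and three leaves.
TREE = {
    "feature": 0, "threshold": 0.5,
    "yes": {"leaf": -1.0},
    "no": {
        "feature": 1, "threshold": 2.0,
        "yes": {"leaf": 0.25},
        "no": {"leaf": 1.5},
    },
}

source, predict = compile_tree(TREE)
print(source)
print(predict([0.7, 3.0]))
```

The generated function contains only comparisons and returns, so no tree structure has to be walked at prediction time.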


Similar projects


I would not say this module is actively maintained. It was more fun to have the idea and to test it on some simple models than to extend its coverage to all available models in scikit-learn. Some alternatives exist but they are still ongoing work. sklearn-porter proposes to produce code in many languages: C++, JavaScript, PHP, Java, Ruby, Go. It only includes learners and not transforms. onnx proposes to convert any model into a unified format. This module implements the format; onnxmltools and winmltools do the conversion of many models from scikit-learn, xgboost, lightgbm. The produced file can be used to run predictions on GPU and Windows with a dedicated runtime.


Why mlprodict?


What about predicting with scikit-learn with C but still training in Python?


2018-04 (1) 2019-06 (2) 2019-08 (2) 2019-09 (1) 2019-10 (1)