:orphan:

|rss_image| **blog page - 1/1** :ref:`Blog ` :ref:`machine_learning (5) `

.. |rss_image| image:: feed-icon-16x16.png
    :target: ../_downloads/rss.xml
    :alt: RSS

----

.. index:: blog

.. _ap-main-0:

blog page - 1/1
+++++++++++++++

.. blogpostagg::
    :title: scikit-learn 0.23
    :date: 2021-01-03
    :keywords: scikit-learn,0.23,0.24
    :categories: scikit-learn
    :rawfile: 2021/2021-01-03_skl.rst

    The unit tests are run against :epkg:`scikit-learn` 0.23 and 0.24.
    Some unit tests fail with version 0.23. They were disabled
    rather than investigated, since the cause does not appear
    with the latest version. This affects all classes inheriting from
    :class:`SkBase ` where a model using it is trained.
    The issue happens in :epkg:`joblib`.

.. blogpostagg::
    :title: scikit-learn internal API
    :date: 2020-09-02
    :keywords: API
    :categories: scikit-learn
    :rawfile: 2020/2020-09-02_api.rst

    The signature of method `impurity_improvement `_
    will change in version 0.24. Handling two versions of
    scikit-learn is usually easy, even when the method is overloaded
    in a class, except that this method is implemented in :epkg:`cython`.
    The method must be overloaded the same way, with the same signature.
    The way it was handled is implemented in PR `88 `_.

    ...

.. blogpostagg::
    :title: Nogil, numpy, cython
    :date: 2019-03-25
    :keywords: nogil,numpy,blas,lapack
    :categories: cython
    :rawfile: 2019/2019-03-25_nogil.rst

    I had to implement a custom criterion to optimize a decision tree
    and I wanted to leverage :epkg:`scikit-learn` instead of
    rewriting my own. Version 0.21 of :epkg:`scikit-learn`
    introduced some changes in the API which make it possible
    to overload an existing criterion and replace some of the logic
    with another one: `_criterion.pyx `_.
    The purpose was to show that a fast implementation requires
    some tricks (see :ref:`piecewiselinearregressioncriterionrst`,
    and `piecewise_tree_regression_criterion.pyx `_,
    `piecewise_tree_regression_criterion_fast.pyx `_
    for the code).
    Other than that, every function to overload is marked as :epkg:`nogil`.
    Every function or method marked as *nogil* runs without holding
    the :epkg:`GIL` (see also :epkg:`PEP-0311`), which means no
    :epkg:`python` object can be created in that method.
    In fact, no :epkg:`python` function can be called inside a
    :epkg:`Cython` method protected with *nogil*.
    The consequence is that no :epkg:`numpy` method can be called.

    ...

.. blogpostagg::
    :title: Faster Polynomial Features
    :date: 2019-02-15
    :keywords: scikit-learn,polynomial features
    :categories: machine learning
    :rawfile: 2019/2019-02-15_poly.rst

    The current implementation of `PolynomialFeatures `_
    in *scikit-learn* computes each new feature independently,
    which increases the amount of data exchanged between
    *numpy* and *Python*. The idea of the implementation in
    :class:`ExtendedFeatures `
    is to reduce this amount with broadcast multiplications.
    The second optimization comes from transposing the matrix:
    dense matrices are organized by rows in memory, so it is faster
    to multiply two rows than two columns.
    See :ref:`fasterpolynomialfeaturesrst`.

.. blogpostagg::
    :title: Piecewise Linear Regression
    :date: 2019-02-10
    :keywords: scikit-learn,linear regression,piecewise
    :categories: machine learning
    :rawfile: 2019/2019-02-10_piecewise.rst

    I decided to turn one of the notebooks I wrote about
    `Piecewise Linear Regression `_
    into something usable that follows the *scikit-learn* API:
    :class:`PiecewiseRegression `
    and another notebook :ref:`piecewiselinearregressionrst`.

.. blogpostagg::
    :title: Predictable t-SNE
    :date: 2019-02-01
    :keywords: scikit-learn,t-SNE
    :categories: machine learning
    :rawfile: 2019/2019-02-01_tsne.rst

    :epkg:`t-SNE` is quite an interesting tool to visualize
    data on a map but it has one drawback:
    results are not reproducible. It is much more powerful
    than a :epkg:`PCA` but the results are difficult to
    interpret. Based on some experiments, if :epkg:`t-SNE` manages
    to separate classes, there is a good chance that
    a classifier can get good performance. Anyhow, I implemented
    a regressor which approximates the :epkg:`t-SNE` outputs
    so that they can be used as features for a further classifier.
    I created a notebook :ref:`predictabletsnerst`
    and a new transform :class:`PredictableTSNE `.

.. blogpostagg::
    :title: Pipeline visualization
    :date: 2019-02-01
    :keywords: scikit-learn,pipeline
    :categories: machine learning
    :rawfile: 2019/2019-02-01_pipeline.rst

    :epkg:`scikit-learn` introduced a nice feature to
    process mixed-type columns in a single pipeline which follows
    the :epkg:`scikit-learn` API:
    `ColumnTransformer `_,
    `FeatureUnion `_ and
    `Pipeline `_.
    The ideas are not new but they are finally taking shape in
    :epkg:`scikit-learn`.

    ...

.. blogpostagg::
    :title: Quantile regression with scikit-learn.
    :date: 2018-05-07
    :keywords: scikit-learn,quantile regression
    :categories: machine learning
    :rawfile: 2018/2018-05-07_quantile_regression.rst

    :epkg:`scikit-learn` does not have any quantile regression.
    :epkg:`statsmodels` does have one, `QuantReg `_,
    but I wanted to try something I did for my teachings,
    `Régression Quantile `_,
    based on `Iteratively reweighted least squares `_.
    I thought it was a good case study to turn a simple algorithm into
    a learner :epkg:`scikit-learn` can reuse in a pipeline.
    The notebook :ref:`quantileregressionrst` demonstrates it
    and it is implemented in
    :class:`QuantileLinearRegression `.

.. blogpostagg::
    :title: Function to get insights on machine learned models
    :date: 2017-11-18
    :keywords: reference,blog,post
    :categories: blog
    :rawfile: 2017/2017-10-18_first_day.rst

    Machine learned models are black boxes.
    The module tries to implement some functions
    to get insights on machine learned models.

----

|rss_image| **blog page - 1/1** :ref:`2018-05 (1) ` :ref:`2019-02 (4) ` :ref:`2019-03 (1) ` :ref:`2020-09 (1) ` :ref:`2021-01 (1) `