Faster Polynomial Features¶
The current implementation of PolynomialFeatures
in scikit-learn computes each new feature
independently, which increases the amount of
data exchanged between numpy and Python.
The idea of the new implementation
is to reduce this amount with broadcast multiplications.
The second optimization comes from transposing the matrix:
dense matrices are stored row by row in memory, so
it is faster to multiply two rows than two columns.
See Faster Polynomial Features.
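Below is a minimal sketch of both tricks for degree 2; the helper name is hypothetical and this is not the library's actual code, only an illustration:

```python
import numpy as np

def poly_features_degree2(X):
    # Hypothetical helper: degree-2 polynomial features built with
    # one broadcast multiplication per input feature instead of one
    # Python-level multiplication per output column.
    # Working on the transpose makes every product a row-times-row
    # operation, which is faster on C-ordered (row-major) arrays.
    XT = np.ascontiguousarray(np.asarray(X, dtype=float).T)
    n_features, n_samples = XT.shape
    rows = [np.ones((1, n_samples)), XT]
    for i in range(n_features):
        # All products x_i * x_j for j >= i in a single broadcast.
        rows.append(XT[i:i + 1] * XT[i:])
    return np.vstack(rows).T
```

For degree 2 this should produce the same columns as PolynomialFeatures(degree=2).fit_transform(X), but with a number of numpy calls linear in the number of features rather than quadratic.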
Piecewise Linear Regression¶
I decided to revisit one of the notebooks I wrote about
Piecewise Linear Regression
and turn the code into something usable that follows
the scikit-learn API.
I also added another notebook, Piecewise linear regression with scikit-learn predictors.
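To make the idea concrete, here is a rough, hypothetical sketch of what such an estimator can look like, assuming a simple strategy (a shallow decision tree defines the pieces, a linear regression is fitted on each piece); it is not the notebook's actual code:

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

class SimplePiecewiseRegressor(BaseEstimator, RegressorMixin):
    # Hypothetical sketch: a decision tree splits the input space,
    # then one linear regression is fitted per leaf.
    def __init__(self, max_leaf_nodes=4):
        self.max_leaf_nodes = max_leaf_nodes

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        self.binner_ = DecisionTreeRegressor(
            max_leaf_nodes=self.max_leaf_nodes).fit(X, y)
        leaves = self.binner_.apply(X)
        self.models_ = {
            leaf: LinearRegression().fit(X[leaves == leaf], y[leaves == leaf])
            for leaf in np.unique(leaves)}
        return self

    def predict(self, X):
        X = np.asarray(X)
        leaves = self.binner_.apply(X)
        pred = np.empty(X.shape[0])
        for leaf, model in self.models_.items():
            mask = leaves == leaf
            if mask.any():
                pred[mask] = model.predict(X[mask])
        return pred
```

Because it implements fit and predict and inherits from BaseEstimator, such a learner can be cross-validated or placed in a pipeline like any other scikit-learn regressor.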
Predictable t-SNE¶
t-SNE is quite an interesting tool to
visualize data on a map, but it has one drawback:
its results are not reproducible. It is much more powerful
than a PCA, but its output is difficult to
interpret. Based on some experiments, if t-SNE
manages to separate classes, there is a good chance that
a classifier can reach good performance. Anyhow, I implemented
a regressor which approximates the t-SNE outputs
so that they can be used as features for a further classifier.
I created a notebook, Predictable t-SNE, and a new transform.
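A minimal sketch of that idea, assuming nothing about the actual transform: fit t-SNE once, then train a multi-output regressor (an MLP here, an arbitrary choice) to map the original features to the t-SNE coordinates, so new samples get a deterministic embedding:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.neural_network import MLPRegressor

X, y = load_digits(return_X_y=True)

# The expensive, non-reproducible step runs only once, on training data.
embedding = TSNE(n_components=2).fit_transform(X)

# A regressor learns the mapping from features to t-SNE coordinates.
mapper = MLPRegressor(hidden_layer_sizes=(100,), max_iter=1000)
mapper.fit(X, embedding)

# New samples are projected deterministically, without rerunning t-SNE,
# and the 2D output can feed a downstream classifier.
X_new_2d = mapper.predict(X[:5])
```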
Quantile regression with scikit-learn¶
scikit-learn does not have any quantile regression.
statsmodels does have one,
but I wanted to try something I did for my teachings,
based on Iteratively reweighted least squares.
I thought it was a good case study for turning a simple algorithm into
a learner scikit-learn can reuse in a pipeline.
The notebook Quantile Regression demonstrates it,
together with the corresponding implementation.
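A hypothetical sketch of the approach, not the author's exact code: each round fits a weighted least squares, then reweights by the inverse absolute residual (with an asymmetric factor for quantiles other than the median) so that the squared loss mimics the pinball loss:

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.linear_model import LinearRegression

class IRLSQuantileRegressor(BaseEstimator, RegressorMixin):
    # Hypothetical sketch of quantile regression via iteratively
    # reweighted least squares (IRLS).
    def __init__(self, quantile=0.5, max_iter=50, eps=1e-6):
        self.quantile = quantile
        self.max_iter = max_iter
        self.eps = eps

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y, dtype=float)
        model = LinearRegression()
        weights = np.ones(len(y))
        for _ in range(self.max_iter):
            model.fit(X, y, sample_weight=weights)
            residuals = y - model.predict(X)
            # Weight w = factor / |r| turns the weighted squared loss
            # w * r**2 into factor * |r|, i.e. the pinball loss; the
            # asymmetric factor targets the requested quantile.
            factor = np.where(residuals >= 0, self.quantile, 1 - self.quantile)
            weights = factor / np.maximum(np.abs(residuals), self.eps)
        self.model_ = model
        return self

    def predict(self, X):
        return self.model_.predict(np.asarray(X))
```

With fit and predict in place and BaseEstimator inherited, this learner drops directly into a Pipeline or GridSearchCV.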