
# 2019-02 - 1/1

## Faster Polynomial Features

2019-02-15

The current implementation of `PolynomialFeatures`
in *scikit-learn* computes each new feature
independently, which increases the amount of
data exchanged between *numpy* and *Python*.
The implementation in `ExtendedFeatures`
reduces this exchange by using broadcast multiplications.
The second optimization comes from transposing the matrix:
dense matrices are stored by rows in memory, so
it is faster to multiply two rows than two columns.
See Faster Polynomial Features.
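
To make the two optimizations concrete, here is a minimal sketch of degree-2 polynomial features computed on the transposed matrix with one broadcast multiplication per feature. The function name `poly2_features_broadcast` is hypothetical and this is not the actual `ExtendedFeatures` code, only an illustration of the idea under the assumption of dense input and degree 2.

```python
import numpy as np


def poly2_features_broadcast(X):
    """Degree-2 polynomial features: bias, linear terms, pairwise products.

    Hypothetical helper written for illustration only.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    n_out = 1 + n_features + n_features * (n_features + 1) // 2

    # Transposed layout: every feature is a contiguous row in memory.
    XT = np.empty((n_out, n_samples), dtype=float)
    XT[0, :] = 1.0                    # bias
    XT[1:1 + n_features, :] = X.T     # linear terms

    pos = 1 + n_features
    for i in range(n_features):
        # A single broadcast multiplication gives x_i * x_j for all j >= i,
        # instead of one numpy call per (i, j) pair.
        block = XT[1 + i, :] * XT[1 + i:1 + n_features, :]
        XT[pos:pos + block.shape[0], :] = block
        pos += block.shape[0]

    # Transpose back to the usual (n_samples, n_features_out) layout.
    return XT.T


features = poly2_features_broadcast([[1, 2], [3, 4]])
```

The Python loop runs once per input feature rather than once per output feature, which is where the reduction in exchanges between *numpy* and *Python* comes from in this sketch.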

## Piecewise Linear Regression

2019-02-10

I decided to turn one of the notebooks I wrote about
Piecewise Linear Regression
into something usable that follows
the *scikit-learn* API:
`PiecewiseRegression`,
together with another notebook, Piecewise linear regression with scikit-learn predictors.
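
As a rough illustration of what a piecewise linear regressor following the *scikit-learn* API can look like, the sketch below lets a decision tree choose the buckets and fits one linear model per bucket. The class name `SketchPiecewiseRegressor`, the choice of a tree as the bucketing step, and the `min_samples_leaf` default are all assumptions made for this example, not a description of the actual implementation.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor


class SketchPiecewiseRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical sketch: a tree picks buckets, one linear model per bucket."""

    def __init__(self, min_samples_leaf=20):
        self.min_samples_leaf = min_samples_leaf

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # The tree is only used to split the feature space into buckets.
        self.binner_ = DecisionTreeRegressor(
            min_samples_leaf=self.min_samples_leaf, random_state=0)
        self.binner_.fit(X, y)
        leaves = self.binner_.apply(X)
        # One linear regression per leaf (bucket).
        self.models_ = {}
        for leaf in np.unique(leaves):
            mask = leaves == leaf
            self.models_[leaf] = LinearRegression().fit(X[mask], y[mask])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        leaves = self.binner_.apply(X)
        pred = np.empty(X.shape[0], dtype=float)
        for leaf, model in self.models_.items():
            mask = leaves == leaf
            if mask.any():
                pred[mask] = model.predict(X[mask])
        return pred


# Toy data: y = |x| is linear on each side of 0 but not globally.
X = np.linspace(-1, 1, 200).reshape(-1, 1)
y = np.abs(X).ravel()
model = SketchPiecewiseRegressor().fit(X, y)
```

Because the class implements `fit` and `predict` and stores its hyperparameter in `__init__`, it can be used inside a `Pipeline` or a grid search like any other regressor.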

## Predictable t-SNE

2019-02-01

t-SNE is quite an interesting tool to
visualize data on a map but it has one drawback:
results are not reproducible. It is much more powerful
than a PCA but the results are difficult to
interpret. Based on some experiments, if t-SNE
manages to separate classes, there is a good chance that
a classifier can achieve good performance. Anyhow, I implemented
a regressor which approximates the t-SNE outputs
so that they can be used as features for a further classifier.
I created a notebook, Predictable t-SNE, and a new transform,
`PredictableTSNE`.
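
The idea of approximating t-SNE outputs with a regressor can be sketched in a few lines. The example below is not the `PredictableTSNE` code: the digits dataset, the subsample size, and the choice of `KNeighborsRegressor` are assumptions picked to keep the sketch short and fast.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Small subsample so that t-SNE runs quickly (arbitrary choice).
X, y = load_digits(return_X_y=True)
X_train, X_test = train_test_split(X[:500], test_size=0.2, random_state=0)

# Step 1: t-SNE embeds the training data only; scikit-learn's TSNE
# exposes fit_transform but no transform for new points.
embedding = TSNE(n_components=2, random_state=0).fit_transform(X_train)

# Step 2: a regressor learns the mapping original features -> embedding.
approx = KNeighborsRegressor(n_neighbors=5).fit(X_train, embedding)

# Step 3: the regressor places unseen points on the same map,
# and its outputs can feed a downstream classifier.
X_test_2d = approx.predict(X_test)
```

Any multi-output regressor could replace the nearest-neighbors model here; the quality of the approximation depends on that choice and should be checked on held-out data.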

## Pipeline visualization

2019-02-01

scikit-learn introduced a nice feature to process mixed-type columns in a single pipeline that follows the scikit-learn API: ColumnTransformer, FeatureUnion and Pipeline. The ideas are not new, but they are finally taking place in scikit-learn.
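
A minimal sketch of such a mixed-type pipeline is shown below. The toy data, the column positions, and the choice of `StandardScaler`, `OneHotEncoder` and `LogisticRegression` are invented for illustration; columns are selected by index so the example only needs *numpy*.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Invented toy data: two numeric columns and one categorical column.
X = np.array([
    [25, 50000.0, "paris"],
    [32, 64000.0, "lyon"],
    [47, 58000.0, "paris"],
    [51, 72000.0, "nice"],
    [38, 61000.0, "lyon"],
    [29, 45000.0, "nice"],
], dtype=object)
y = np.array([0, 1, 0, 1, 1, 0])

# Each group of columns gets its own preprocessing.
preprocess = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), [0, 1]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), [2]),
    ],
    sparse_threshold=0,  # always return a dense array
)

# The preprocessing and the model form a single estimator.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression()),
])
model.fit(X, y)
predictions = model.predict(X)
```

The nested structure (a `Pipeline` containing a `ColumnTransformer` containing transformers) is exactly what makes a graphical view of the pipeline useful.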

…
