.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "gyexamples/ml_basic/plot_normalise.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_gyexamples_ml_basic_plot_normalise.py:


Normalisation
=============

A few lines of code to normalise data. The preprocessing page of the
:epkg:`scikit-learn` documentation lists all the preprocessing steps
the library implements.

.. contents::
    :local:

.. GENERATED FROM PYTHON SOURCE LINES 18-19

A dataset.

.. GENERATED FROM PYTHON SOURCE LINES 19-30

.. code-block:: default

    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import Normalizer
    from sklearn.preprocessing import normalize
    from papierstat.datasets import load_wines_dataset

    df = load_wines_dataset()
    # Features: every column except the target and the wine colour.
    X = df.drop(['quality', 'color'], axis=1)
    y = df['quality']
    print(X.head())


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

       fixed_acidity  volatile_acidity  citric_acid  ...    pH  sulphates  alcohol
    0            7.4              0.70         0.00  ...  3.51       0.56      9.4
    1            7.8              0.88         0.00  ...  3.20       0.68      9.8
    2            7.8              0.76         0.04  ...  3.26       0.65      9.8
    3           11.2              0.28         0.56  ...  3.16       0.58      9.8
    4            7.4              0.70         0.00  ...  3.51       0.56      9.4

    [5 rows x 11 columns]


.. GENERATED FROM PYTHON SOURCE LINES 31-33

Naive normalisation
-------------------

.. GENERATED FROM PYTHON SOURCE LINES 33-36

.. code-block:: default

    X_norm = normalize(X)
    print(X_norm[:5])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[1.95152519e-01 1.84603734e-02 0.00000000e+00 5.01067279e-02
      2.00426911e-03 2.90091582e-01 8.96646709e-01 2.63139437e-02
      9.25655867e-02 1.47682987e-02 2.47896443e-01]
     [1.07241243e-01 1.20990121e-02 0.00000000e+00 3.57470811e-02
      1.34738998e-03 3.43721934e-01 9.21174782e-01 1.37048809e-02
      4.39964075e-02 9.34923659e-03 1.34738998e-01]
     [1.35456648e-01 1.31983400e-02 6.94649475e-04 3.99423448e-02
      1.59769379e-03 2.60493553e-01 9.37776791e-01 1.73141382e-02
      5.66139322e-02 1.12880540e-02 1.70189121e-01]
     [1.74366737e-01 4.35916843e-03 8.71833685e-03 2.95800715e-02
      1.16763440e-03 2.64663797e-01 9.34107520e-01 1.55373217e-02
      4.91963294e-02 9.02970603e-03 1.52570895e-01]
     [1.95152519e-01 1.84603734e-02 0.00000000e+00 5.01067279e-02
      2.00426911e-03 2.90091582e-01 8.96646709e-01 2.63139437e-02
      9.25655867e-02 1.47682987e-02 2.47896443e-01]]


.. GENERATED FROM PYTHON SOURCE LINES 37-45

Supervised normalisation
------------------------

A common mistake is to normalise before splitting the data
into training and test sets. This means that test data
is used to estimate coefficients of the overall model,
which includes the preprocessing steps.

.. GENERATED FROM PYTHON SOURCE LINES 45-50

.. code-block:: default

    # Mistake: the normaliser is fitted on the full dataset, test rows included.
    norm = Normalizer()
    X_norm = norm.fit_transform(X)


.. GENERATED FROM PYTHON SOURCE LINES 51-55

This approach raises a methodological problem: any statistics
used to normalise, such as a mean and a variance, must be
estimated on the training set only.
We split the data first.

.. GENERATED FROM PYTHON SOURCE LINES 55-57

.. code-block:: default

    X_train, X_test, y_train, y_test = train_test_split(X, y)


.. GENERATED FROM PYTHON SOURCE LINES 58-59

Then we normalise.

.. GENERATED FROM PYTHON SOURCE LINES 59-63

.. code-block:: default

    norm = Normalizer()
    # Fit on the training set only, then reuse the fitted object on the test set.
    X_train_norm = norm.fit_transform(X_train)
    X_test_norm = norm.transform(X_test)


.. GENERATED FROM PYTHON SOURCE LINES 64-66

This way, the same normalisation is applied to both
the training set and the test set.
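The reason for splitting before fitting can be illustrated without any dependency. In scikit-learn, ``Normalizer`` and ``normalize`` rescale each row to unit norm independently of the other rows, so the leakage issue mostly concerns preprocessing steps that estimate statistics across rows, such as a mean and a variance. The sketch below is only an illustration under stated assumptions: ``l2_normalize_rows`` and ``MeanVarianceScaler`` are hypothetical helpers written for this page, not scikit-learn objects, and the toy data is made up.

```python
import math
import statistics


def l2_normalize_rows(rows):
    """Scale each row to unit Euclidean norm (row-wise, no fitted state)."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        # Leave all-zero rows untouched to avoid dividing by zero.
        out.append([v / norm for v in row] if norm > 0 else list(row))
    return out


class MeanVarianceScaler:
    """Hypothetical column-wise scaler with a fit/transform interface."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.means_ = [statistics.fmean(c) for c in columns]
        # Population standard deviation; constant columns fall back to 1.0.
        self.stds_ = [statistics.pstdev(c) or 1.0 for c in columns]
        return self

    def transform(self, rows):
        return [
            [(v - m) / s for v, m, s in zip(row, self.means_, self.stds_)]
            for row in rows
        ]


# Toy data (made up for illustration): two features per row.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[10.0, 100.0]]

# Row-wise normalisation gives the same result whether or not the
# test row is processed together with the training rows.
together = l2_normalize_rows(train + test)
separately = l2_normalize_rows(train) + l2_normalize_rows(test)
rowwise_is_stateless = together == separately

# Column statistics, in contrast, change if the test row leaks into fit.
scaler_train_only = MeanVarianceScaler().fit(train)
scaler_leaky = MeanVarianceScaler().fit(train + test)

# Correct order: statistics come from the training rows only,
# and the same fitted scaler is then applied to the test rows.
train_scaled = scaler_train_only.transform(train)
test_scaled = scaler_train_only.transform(test)

print(rowwise_is_stateless)
print(scaler_train_only.means_, scaler_leaky.means_)
```

With a scaler of this kind, fitting on ``train + test`` shifts the estimated means towards the test row, which is exactly the leakage the split-then-fit order avoids.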