Correlations

Plots the correlations between the variables of a dataset.

from seaborn import clustermap

Loading the data

from papierstat.datasets import load_wines_dataset
df = load_wines_dataset()
print(df.head(n=2).T)
                           0       1
fixed_acidity            7.4     7.8
volatile_acidity         0.7    0.88
citric_acid              0.0     0.0
residual_sugar           1.9     2.6
chlorides              0.076   0.098
free_sulfur_dioxide     11.0    25.0
total_sulfur_dioxide    34.0    67.0
density               0.9978  0.9968
pH                      3.51     3.2
sulphates               0.56    0.68
alcohol                  9.4     9.8
quality                    5       5
color                    red     red

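Correlations with seaborn. The color column holds strings such as 'red', so only the numeric columns are passed to corr(). Calling df.corr() on the full frame, as the original script did, fails with the traceback recorded below.

# Keep only the numeric columns: 'color' is a string column and cannot be converted to float.
clustermap(df.select_dtypes(include="number").corr(), center=0, cmap="vlag", linewidths=.75, figsize=(4, 4))

# plt.show()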
Traceback (most recent call last):
  File "somewhere/workspace/papierstat/papierstat_UT_39_std/_doc/examples/plots/plot_correlations.py", line 21, in <module>
    clustermap(df.corr(), center=0, cmap="vlag", linewidths=.75, figsize=(4, 4))
  File "somewhere/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 10054, in corr
    mat = data.to_numpy(dtype=float, na_value=np.nan, copy=False)
  File "somewhere/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 1838, in to_numpy
    result = self._mgr.as_array(dtype=dtype, copy=copy, na_value=na_value)
  File "somewhere/.local/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 1732, in as_array
    arr = self._interleave(dtype=dtype, na_value=na_value)
  File "somewhere/.local/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 1794, in _interleave
    result[rl.indexer] = arr
ValueError: could not convert string to float: 'red'
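
The ValueError above comes from the string column color, which pandas cannot convert to float when it builds the correlation matrix. The sketch below shows two possible ways around it on a small toy frame. The frame, its column names x, y, z and the is_red indicator are invented for illustration and are not taken from the wines dataset. It only assumes that pandas is installed.

```python
import pandas as pd

# Toy data invented for illustration; these are not rows from the wines dataset.
toy = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [4.0, 3.0, 2.0, 1.0],
    "color": ["red", "red", "white", "white"],
})

# List the columns that are not numeric (here only 'color').
non_numeric = list(toy.select_dtypes(exclude="number").columns)

# Option 1: compute the correlations on the numeric columns only.
corr = toy.select_dtypes(include="number").corr()

# Option 2 (assumption: color matters for the analysis): encode it as a
# 0/1 indicator so that it appears in the correlation matrix as well.
encoded = toy.drop(columns="color").assign(
    is_red=(toy["color"] == "red").astype(int)
)
corr_with_color = encoded.corr()

print(non_numeric)
print(corr.round(2))
print(corr_with_color.round(2))
```

Either matrix can then be passed to clustermap in place of df.corr(). Option 1 silently ignores the color, while option 2 keeps it at the cost of choosing an encoding.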

