2A.ml - Time series - solution


Predictions on time series.

from jyquickhelper import add_notebook_menu
add_notebook_menu()
%matplotlib inline

A time series

We load the daily number of sessions of a website.

import pandas
data = pandas.read_csv("xavierdupre_sessions.csv", sep="\t")
data.set_index("Date", inplace=True)
data.head()
            Sessions
Date
28/10/2010         7
29/10/2010         6
30/10/2010         4
31/10/2010         6
01/11/2010         2
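The dates above are kept as plain strings in day/month/year order. A minimal sketch, assuming a small inline sample that copies the five rows shown (not the real file), of how the index could be converted to real dates so that calendar information such as the day of the week becomes available:

```python
import io
import pandas

# Hypothetical inline sample copying the first rows displayed above.
sample = (
    "Date\tSessions\n"
    "28/10/2010\t7\n29/10/2010\t6\n30/10/2010\t4\n"
    "31/10/2010\t6\n01/11/2010\t2\n"
)

df = pandas.read_csv(io.StringIO(sample), sep="\t")
# Dates are written day/month/year, so the format is stated explicitly
# instead of letting pandas guess it.
df["Date"] = pandas.to_datetime(df["Date"], format="%d/%m/%Y")
df = df.set_index("Date")

# A DatetimeIndex exposes the day of the week (Monday=0, Sunday=6).
df["weekday"] = df.index.dayofweek
print(df)
```

Whether this conversion is worth doing depends on the rest of the analysis; the plots in this notebook work with the string index as well.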
data.plot(figsize=(12,4));
../_images/td2a_timeseries_correction_5_1.png
data[-365:].plot(figsize=(12,4));
../_images/td2a_timeseries_correction_6_1.png

Removing the seasonality without knowing it

With fit_seasons.

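from seasonal import fit_seasons
# fit_seasons estimates the period itself and returns the seasonal offsets
# together with the trend.
cv_seasons, trend = fit_seasons(data["Sessions"])
print(cv_seasons)
# cv_seasons holds one value per position in the period, not one per row,
# so it cannot be assigned directly as a column:
# data["cs_seasons"] = cv_seasons
data["trendcs"] = trend
# "trendsea" is not defined in this section; it is plotted only if present.
cols = [c for c in ["Sessions", "trendcs", "trendsea"] if c in data.columns]
data[-365:].plot(y=cols, figsize=(14,4));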
[ 26.66213008  16.33420353 -86.59519495 -73.57497492  33.23110565
  52.87820674  30.87516435]
../_images/td2a_timeseries_correction_19_2.png
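To make the idea of seasonal offsets concrete, here is a minimal numpy sketch on synthetic data. It is not the algorithm used by fit_seasons: the period is assumed known and odd, the trend is a plain centered moving average, and the function name estimate_seasonal_offsets is hypothetical.

```python
import numpy as np

def estimate_seasonal_offsets(values, period):
    """Average deviation from a centered moving-average trend, per phase."""
    values = np.asarray(values, dtype=float)
    # Centered moving average over one full period (odd period assumed).
    kernel = np.ones(period) / period
    trend = np.convolve(values, kernel, mode="valid")
    half = period // 2
    # Drop the edges so that values and trend are aligned.
    detrended = values[half:len(values) - half] - trend
    phases = (np.arange(len(detrended)) + half) % period
    offsets = np.array([detrended[phases == p].mean() for p in range(period)])
    # Center the offsets so that they sum to zero over one period.
    return offsets - offsets.mean()

# Synthetic series: linear trend + known weekly pattern + small noise.
rng = np.random.default_rng(0)
pattern = np.array([3.0, 2.0, -8.0, -7.0, 3.0, 5.0, 2.0])
n = 7 * 60
series = 0.05 * np.arange(n) + np.tile(pattern, 60) + rng.normal(0, 0.1, n)

offsets = estimate_seasonal_offsets(series, period=7)
# Subtract the offset matching each observation's position in the week.
deseasonalized = series - offsets[np.arange(n) % 7]
print(np.round(offsets, 2))
```

The recovered offsets should be close to the pattern that was injected, which is the same kind of object as the seven values printed by fit_seasons above.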

Autocorrelogram

This section follows the example Autoregressive Moving Average (ARMA): Sunspots data.

import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
fig = plt.figure(figsize=(12,8))
ax1 = fig.add_subplot(211)
fig = plot_acf(data["Sessions"], lags=40, ax=ax1)
ax2 = fig.add_subplot(212)
fig = plot_pacf(data["Sessions"], lags=40, ax=ax2);
../_images/td2a_timeseries_correction_21_0.png

The correlogram confirms a period of 7, that is, a weekly seasonality.
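The same reading can be done numerically. The sketch below, on synthetic data with an injected weekly pattern, computes the sample autocorrelation with numpy and looks for the lag with the highest value; the helper autocorrelation is hypothetical and is not the statsmodels implementation.

```python
import numpy as np

def autocorrelation(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[:len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

# Synthetic series: a weekly pattern repeated 60 times, plus noise.
rng = np.random.default_rng(0)
pattern = np.array([3.0, 2.0, -8.0, -7.0, 3.0, 5.0, 2.0])
series = np.tile(pattern, 60) + rng.normal(0, 0.5, 7 * 60)

acf = autocorrelation(series, max_lag=20)
# Lag 0 is always 1, so the search starts at lag 1.
best_lag = 1 + int(np.argmax(acf[1:]))
print(best_lag, np.round(acf[[7, 14]], 3))
```

On a series with a strong weekly component, the highest autocorrelation is expected at a multiple of 7, which is what the peaks in the plot express.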