Note
Go to the end to download the full example code
Estimation des paramètres d’un modèle SIRD étendu pour la France¶
Le modèle proposé dans l’exemple Estimation des paramètres d’un modèle SIRD pour la France
ne fonctionne pas très bien. Les données collectées sont erronées
pour le recensement des personnes infectées. Comme les malades n’étaient
pas testées au début de l’épidémie, le nombre officiel de personnes
contaminées est en-deça de la réalité. On ajoute un paramètre
pour tenir compte dela avec le modèle CovidSIRDc
. Le modèle suppose
en outre que la contagion est la même tout au long de la
période d’étude alors que les mesures de confinement, le port du
masque impacte significativement le coefficient de propagation.
L’épidémie peut sans doute être modélisée avec un modèle SIRD
mais sur une courte période.
Récupération des données¶
from datetime import timedelta
import warnings
import numpy
import pandas
from pandas import to_datetime
import matplotlib.gridspec as gridspec
import matplotlib.pyplot as plt
from aftercovid.data import extract_hopkins_data, preprocess_hopkins_data
from aftercovid.models import CovidSIRDc, rolling_estimation
data = extract_hopkins_data()
diff, df = preprocess_hopkins_data(data)
df.tail()
Graphes.
fig = plt.figure(tight_layout=True, figsize=(12, 10))
gs = gridspec.GridSpec(2, 2)
axs = []
ax = fig.add_subplot(gs[0, :])
df.plot(logy=True, title="Données COVID", ax=ax)
axs.append(ax)
ax = fig.add_subplot(gs[1, 0])
df[['recovered', 'confirmed', 'infected']].diff().plot(
title="Différences", ax=ax)
axs.append(ax)
ax = fig.add_subplot(gs[1, 1])
df[['deaths']].diff().plot(title="Différences", ax=ax)
axs.append(ax)
for a in axs:
for tick in a.get_xticklabels():
tick.set_rotation(30)
Estimation d’un modèle¶
L’approche sur une fenêtre glissante suggère que le modèle n’est pas bon pour approcher les données sur toute une période, mais que sur une période courte, le vrai modèle peut être approché par un modèle plus simple. On note \(W^*(t)\) les paramètres optimaux au temps t, on étudie les courbes \(t \rightarrow W^*(t)\) pour voir comment évolue ces paramètres.
['S', 'I', 'R', 'D']
[[6.7e+07 0.0e+00 0.0e+00 0.0e+00]
[6.7e+07 1.0e+00 0.0e+00 0.0e+00]
[6.7e+07 1.0e+00 0.0e+00 0.0e+00]
[6.7e+07 2.0e+00 0.0e+00 0.0e+00]
[6.7e+07 2.0e+00 0.0e+00 0.0e+00]]
Estimation.
k=7 iter=500 loss=1704785237.333 l1=0.000852 coef=[5.51244745e-02 5.19949621e-02 1.00000000e-03 0.00000000e+00
8.09872506e-01 1.00926077e+00] R0=1.0401832984503225 lr=1e-06 cn=1.2087649340185227e-07
k=14 iter=500 loss=1165412839.619 l1=0.000698 coef=[0.04327246 0.06082092 0.001 0. 0.96710839 0.92469848] R0=0.6999646643762756 lr=1e-06 cn=1.4434453617564753e-07
k=21 iter=500 loss=1178844208.762 l1=0.000698 coef=[ 5.39778013e-02 6.53644908e-02 1.00000000e-03 -9.52657144e-02
9.07499570e-01 1.04273666e+00] R0=0.8133536580109411 lr=1e-06 cn=-1.4069096876251338e-05
k=28 iter=500 loss=1359292220.952 l1=0.000703 coef=[0.12636037 0.06908108 0.001 0. 0.1379203 0.81568362] R0=1.8030595968380614 lr=1e-06 cn=2.05851187091254e-08
k=35 iter=500 loss=1210804321.524 l1=0.000726 coef=[ 1.67114193e-01 5.15415035e-02 1.00088441e-03 -1.41187830e-07
1.68445832e-05 1.50259685e+00] R0=3.1805595338150088 lr=1e-06 cn=-1.8537620774911145e-11
k=42 iter=500 loss=1611008585.143 l1=0.000867 coef=[1.67948691e-01 4.77998950e-02 1.00000000e-03 0.00000000e+00
5.39498419e-08 1.75145072e+00] R0=3.441578943996731 lr=1e-06 cn=8.05221520493982e-15
k=49 iter=500 loss=1810886265.905 l1=0.000938 coef=[1.50366562e-01 5.08104331e-02 1.00000000e-03 0.00000000e+00
2.78612093e-07 1.51619584e+00] R0=2.9022448343018 lr=1e-06 cn=4.158389442713024e-14
k=56 iter=500 loss=2028077641.143 l1=0.000931 coef=[1.34455267e-01 6.04573136e-02 1.00000000e-03 0.00000000e+00
1.45366213e-04 1.08307078e+00] R0=2.1877830108133733 lr=1e-06 cn=2.1696449752151045e-11
k=63 iter=500 loss=1895819361.524 l1=0.000928 coef=[0.06683979 0.04910923 0.001 0. 0.6309379 0.93798974] R0=1.333881793864141 lr=1e-06 cn=9.416983536806788e-08
k=70 iter=500 loss=1862425843.810 l1=0.000939 coef=[ 4.02069759e-02 5.56717334e-02 1.00000000e-03 -4.45928894e-02
1.04825453e+00 1.06866699e+00] R0=0.7094714316475454 lr=1e-06 cn=-6.492543582643944e-06
k=77 iter=500 loss=1496359399.619 l1=0.00083 coef=[ 3.00514092e-02 6.35628877e-02 1.00000000e-03 -5.82990282e-01
1.01932781e+00 8.97664486e-01] R0=0.46545949661559693 lr=1e-06 cn=-8.677432296159342e-05
k=84 iter=500 loss=928947346.286 l1=0.000553 coef=[ 1.75432863e-02 8.83467338e-02 1.00008203e-03 -7.35717196e-01
8.50612485e-01 1.01818909e+00] R0=0.19635043627184168 lr=1e-06 cn=-0.00010957177114874905
k=91 iter=500 loss=356274102.857 l1=0.000328 coef=[ 0.02919898 0.10103968 0.00100048 -0.56935693 0.36332277 0.37792229] R0=0.28615186874765003 lr=1e-06 cn=-8.483943987837672e-05
k=98 iter=500 loss=200768146.286 l1=0.000254 coef=[ 0.01199626 0.0940425 0.00100091 -0.86020174 0.08672528 0.35974727] R0=0.12621875893417625 lr=1e-06 cn=-0.00012824698718629722
k=102 iter=500 loss=87205924.571 l1=0.000148 coef=[ 9.67416281e-02 8.87796396e-02 1.00011945e-03 -1.19470285e-08
1.53679117e-02 2.93083428e-01] R0=1.0775438587486592 lr=1e-06 cn=2.2919368101045423e-09
k=104 iter=500 loss=71490560.000 l1=0.000144 coef=[ 0.03444519 0.06742691 0.00100005 -0.67670331 0.61675021 0.95918409] R0=0.5033862442849031 lr=1e-06 cn=-0.00010080744146427216
k=106 iter=500 loss=14579483.429 l1=7.38e-05 coef=[ 5.29179598e-02 5.69480772e-02 1.00000000e-03 -2.20149647e-09
8.33604622e-01 9.43033687e-01] R0=0.9131961301278865 lr=0.0001 cn=1.2441827207360772e-07
k=108 iter=16 loss=10862566.857 l1=6.23e-05 coef=[0.06011852 0.05286451 0.001 0. 0.75210348 0.90660207] R0=1.1161063325661933 lr=1e-06 cn=1.1225425011266694e-07
k=110 iter=20 loss=11031040.762 l1=6.31e-05 coef=[6.14755448e-02 5.56652277e-02 1.00000000e-03 0.00000000e+00
7.11761267e-01 1.12668523e+00] R0=1.0848901042389643 lr=1e-06 cn=1.0623302493084362e-07
k=112 iter=500 loss=4659655.619 l1=4.39e-05 coef=[0.07431363 0.05412434 0.001 0. 0.4438066 0.53382489] R0=1.348109171997623 lr=0.0001 cn=6.623979042994271e-08
k=114 iter=13 loss=5098683.429 l1=4.49e-05 coef=[0.07741925 0.05454134 0.001 0. 0.45302667 0.67254934] R0=1.3939031747957633 lr=1e-06 cn=6.761592064674274e-08
k=116 iter=28 loss=5023411.429 l1=4.34e-05 coef=[0.08240906 0.05512022 0.001 0. 0.38079808 0.6240614 ] R0=1.4684378411188848 lr=1e-06 cn=5.683553430289065e-08
k=118 iter=423 loss=4922498.286 l1=4.29e-05 coef=[0.08365808 0.05553728 0.001 0. 0.3474347 0.62291308] R0=1.4796977738663863 lr=0.0001 cn=5.185592576587513e-08
k=120 iter=53 loss=3336917.714 l1=3.67e-05 coef=[0.10285107 0.0565727 0.001 0. 0.20173681 0.49140301] R0=1.7864553656131994 lr=1e-06 cn=3.010997156098567e-08
k=122 iter=7 loss=3278822.095 l1=3.59e-05 coef=[0.11171896 0.05711476 0.001 0. 0.14461697 0.49822973] R0=1.9223854012980193 lr=1e-06 cn=2.1584622533530204e-08
k=123 iter=500 loss=3303460.571 l1=3.6e-05 coef=[0.11124199 0.05730113 0.001 0. 0.15801755 0.521657 ] R0=1.9080590076706614 lr=0.0001 cn=2.3584709117652187e-08
k=124 iter=475 loss=3539710.476 l1=3.76e-05 coef=[0.1110337 0.05755336 0.001 0. 0.15477681 0.58551918] R0=1.8962822278431477 lr=0.0001 cn=2.3101016726841352e-08
k=125 iter=207 loss=3544424.000 l1=3.77e-05 coef=[0.11122954 0.05724697 0.001 0. 0.15228159 0.56742783] R0=1.9096193502222152 lr=0.0001 cn=2.272859563035071e-08
k=126 iter=4 loss=3542836.952 l1=3.89e-05 coef=[0.12769099 0.05983655 0.001 0. 0.08778711 0.47485418] R0=2.0989188727146773 lr=1e-06 cn=1.3102554100980128e-08
k=127 iter=247 loss=3861068.952 l1=4.06e-05 coef=[0.1361947 0.06070814 0.001 0. 0.04790946 0.51141619] R0=2.2070781900736423 lr=0.0001 cn=7.150665309228099e-09
k=128 iter=52 loss=3870073.524 l1=3.98e-05 coef=[0.14485405 0.06274423 0.001 0. 0.00852612 0.52687529] R0=2.2724262377044973 lr=1e-06 cn=1.2725554330299687e-09
k=129 iter=9 loss=3967654.857 l1=3.98e-05 coef=[1.48831720e-01 6.16830460e-02 1.00000000e-03 0.00000000e+00
6.69211248e-05 5.51805375e-01] R0=2.374353659187325 lr=1e-06 cn=9.988227574941034e-12
Saving the results.
df.index = to_datetime(df.index)
dfcoef.index = to_datetime(dfcoef.index)
df.to_csv("plot_covid_france_sird_cst.data.csv", index=True)
dfcoef.to_csv("plot_covid_france_sird_cst.model.csv", index=True)
somewhereaftercovid_39_std/aftercovid/examples/plot_covid_france_sird_cst.py:103: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
df.index = to_datetime(df.index)
somewhereaftercovid_39_std/aftercovid/examples/plot_covid_france_sird_cst.py:104: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
dfcoef.index = to_datetime(dfcoef.index)
Fin de la période.
dfcoef.tail(n=10)
Statistiques.
Fin de la période.
df.tail(n=10)
Statistiques.
Graphe.
dfcoef['R0=1'] = 1
dfcoef['a_'] = -dfcoef['a']
dfcoef['cn'] = -dfcoef['correctness']
fig, ax = plt.subplots(2, 3, figsize=(14, 6), sharex=True)
with warnings.catch_warnings():
warnings.simplefilter("ignore", DeprecationWarning)
dfcoef[["mu", "nu"]].plot(ax=ax[0, 0], logy=True)
dfcoef[["beta"]].plot(ax=ax[0, 1], logy=True)
dfcoef[["loss_l1"]].plot(ax=ax[1, 0], logy=True)
dfcoef[["R0", "R0=1"]].plot(ax=ax[0, 2])
dfcoef[["a_", "b", "c", "cn"]].plot(ax=ax[1, 2])
ax[0, 2].set_ylim(0, 5)
df.iloc[-Nlast:].drop('safe', axis=1).plot(ax=ax[1, 1], logy=True)
for i in range(ax.shape[1]):
for tick in ax[1, i].get_xticklabels():
tick.set_rotation(30)
fig.suptitle('Estimation de R0 tout au long de la période', fontsize=12)
plt.show()
Graphe sur les dernières valeurs.
dfcoeflast = dfcoef.iloc[-30:, :]
dflast = df.iloc[-30:, :]
fig, ax = plt.subplots(2, 3, figsize=(14, 6), sharex=True)
with warnings.catch_warnings():
warnings.simplefilter("ignore", DeprecationWarning)
dfcoeflast[["mu", "nu"]].plot(ax=ax[0, 0], logy=True)
dfcoeflast[["beta"]].plot(ax=ax[0, 1], logy=True)
dfcoeflast[["loss_l1"]].plot(ax=ax[1, 0], logy=True)
dfcoeflast[["R0", "R0=1"]].plot(ax=ax[0, 2])
ax[0, 2].set_ylim(0, 5)
dfcoeflast[["a_", "b", "c", "cn"]].plot(ax=ax[1, 2])
dflast.drop('safe', axis=1).plot(ax=ax[1, 1], logy=True)
for i in range(ax.shape[1]):
for tick in ax[1, i].get_xticklabels():
tick.set_rotation(30)
fig.suptitle('Estimation de R0 sur la fin de la période', fontsize=12)
plt.show()
Prédictions¶
Ce ne sont pas seulement des prédictions
mais des prédictions en boucle à partir d’un état
initial. On utilise la méthode simulate
.
predictions = model.simulate(X[-30:], 21)
df = pandas.DataFrame(
predictions[9], columns='Spred Ipred Rpred Dpred'.split()).set_index(
pandas.to_datetime(dates[-21:]))
df.tail()
somewhereaftercovid_39_std/aftercovid/examples/plot_covid_france_sird_cst.py:186: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
pandas.to_datetime(dates[-21:]))
On représente les prédictions. Ipred est la valeur prédite, Itrue est la valeur connue. Idem pour Dpred et Dtrue.
fig, ax = plt.subplots(2, 3, figsize=(14, 6))
df = pandas.DataFrame(
predictions[9], columns='S Ipred R Dpred'.split()).set_index(
pandas.to_datetime(dates[-21:]))
df[["Ipred"]].plot(ax=ax[0, 0])
df[["Dpred"]].plot(ax=ax[1, 0])
df = pandas.DataFrame(X[-21:], columns='S Itrue R Dtrue'.split()
).set_index(pandas.to_datetime(dates[-21:]))
df[["Itrue"]].plot(ax=ax[0, 0])
df[["Dtrue"]].plot(ax=ax[1, 0])
ax[0, 0].set_title(f"Prediction à partir de {dates[-21]}")
dt = pandas.to_datetime(dates[-7])
dates2 = pandas.to_datetime([dt + timedelta(i) for i in range(21)])
df = pandas.DataFrame(
predictions[-7], columns='S Ipred R Dpred'.split()).set_index(dates2)
df[["Ipred"]].plot(ax=ax[0, 1])
df[["Dpred"]].plot(ax=ax[1, 1])
df = pandas.DataFrame(X[-7:], columns='S Itrue R Dtrue'.split()
).set_index(pandas.to_datetime(dates[-7:]))
df[["Itrue"]].plot(ax=ax[0, 1])
df[["Dtrue"]].plot(ax=ax[1, 1])
ax[0, 1].set_title(f"Prediction à partir de {dates[-7]}")
dt = pandas.to_datetime(dates[-1])
dates2 = pandas.to_datetime([dt + timedelta(i) for i in range(21)])
df = pandas.DataFrame(
predictions[-7], columns='S Ipred R Dpred'.split()).set_index(dates2)
df[["Ipred"]].plot(ax=ax[0, 2])
df[["Dpred"]].plot(ax=ax[1, 2])
ax[0, 1].set_title(f"Prediction à partir de {dates[-1]}")
plt.show()
somewhereaftercovid_39_std/aftercovid/examples/plot_covid_france_sird_cst.py:198: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
pandas.to_datetime(dates[-21:]))
somewhereaftercovid_39_std/aftercovid/examples/plot_covid_france_sird_cst.py:202: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
).set_index(pandas.to_datetime(dates[-21:]))
somewhereaftercovid_39_std/aftercovid/examples/plot_covid_france_sird_cst.py:214: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
).set_index(pandas.to_datetime(dates[-7:]))
Ces prédictions varient beaucoup car une petite imprécision sur l’estimation a beaucoup d’impact à moyen terme.
Total running time of the script: ( 25 minutes 7.225 seconds)