module datasets.eurostat

Short summary

module sparkouille.datasets.eurostat

Datasets from Eurostat.







sparkouille.datasets.eurostat.table_mortalite_euro_stat(url='', name='demo_mlifetable.tsv.gz', final_name='mortalite.txt', whereTo='.', stop_at=None, fLOG=<function noLOG>)

This function retrieves a mortality table from Eurostat through table de mortalité (mortality table). That link is currently broken: data-publica no longer provides this database, so a copy is provided instead.

  • url – data source

  • name – data table name

  • final_name – name of the file into which the compressed data is uncompressed

  • whereTo – folder where the downloaded data is stored

  • stop_at – the overall process is quite long; if not None, only the first rows are kept

  • fLOG – logging function
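
The parameters whereTo and final_name describe a download-once pattern: the archive is uncompressed into whereTo/final_name, and nothing is redone if that file is already there. The following is only a minimal sketch of that idea, not the library's code. The helper ensure_uncompressed is hypothetical, and the archive is created locally so that no network access is needed.

```python
import gzip
import os
import tempfile


def ensure_uncompressed(gz_path, final_name, where_to="."):
    """Uncompress gz_path into where_to/final_name unless it already exists.

    Hypothetical helper, not part of sparkouille. Returns the path of the
    uncompressed file and a flag telling whether any work was done.
    """
    target = os.path.join(where_to, final_name)
    if os.path.exists(target):
        # The file is already there: skip the expensive step.
        return target, False
    with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
        dst.write(src.read())
    return target, True


# Demonstration on a tiny local archive standing in for the real download.
demo_dir = tempfile.mkdtemp()
demo_gz = os.path.join(demo_dir, "demo_mlifetable.tsv.gz")
with gzip.open(demo_gz, "wb") as f:
    f.write(b"indic_de,sex,age,geo\\time\t2013\n")

first = ensure_uncompressed(demo_gz, "mortalite.txt", demo_dir)
second = ensure_uncompressed(demo_gz, "mortalite.txt", demo_dir)
print(first[1], second[1])  # True False
```

The second call returns immediately because mortalite.txt already exists in the target folder.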



The function first checks whether the file final_name exists. If it does, the data is not downloaded again. The header has an unusual format: several dimensions are packed into the first column, separated by commas:

indic_de,sex,age,geo\time    2013     2012     2011     2010     2009

We need to preprocess the data to split this information into columns. The overall process takes 4-5 minutes: about 10 seconds to download (< 10 MB) and 4-5 minutes to preprocess the data (this step could be improved). The processed data contains the following columns:

['annee', 'valeur', 'age', 'age_num', 'indicateur', 'genre', 'pays']
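
To illustrate the preprocessing, here is a minimal sketch that turns the wide raw layout into long rows with the columns listed above. It is not the library's implementation. The sample values are made up, the helpers wide_to_long and age_to_num are hypothetical, and two things are assumptions about the raw file: that ":" marks a missing value, and that age codes look like Y_LT1, Y1 and Y_GE85.

```python
import csv
import io

# Made-up rows in the raw layout; the values are illustrative, not real data.
raw = (
    "indic_de,sex,age,geo\\time\t2013 \t2012 \n"
    "LIFEXP,F,Y1,FR\t84.5 \t84.1 \n"
    "LIFEXP,M,Y_GE85,FR\t: \t5.9 \n"
)


def age_to_num(code):
    """Turn an age code into a number (the code format is an assumption)."""
    if code.startswith("Y_LT"):
        return 0  # "less than one year" counted as age 0
    if code.startswith("Y_GE"):
        return int(code[4:])  # open-ended group keeps its lower bound
    return int(code.lstrip("Y"))


def wide_to_long(text):
    """Yield one dict per (row, year) pair, with the packed key split apart."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    years = [h.strip() for h in header[1:]]
    for row in reader:
        # The first cell packs four dimensions separated by commas.
        indicateur, genre, age, pays = row[0].split(",")
        for annee, cell in zip(years, row[1:]):
            cell = cell.strip()
            if cell == ":":
                # Assumption: ":" marks a missing value in the raw file.
                continue
            yield {
                "annee": int(annee),
                "valeur": float(cell),
                "age": age,
                "age_num": age_to_num(age),
                "indicateur": indicateur,
                "genre": genre,
                "pays": pays,
            }


rows = list(wide_to_long(raw))
print(len(rows))  # 3
print(rows[2])
```

Each raw line produces one output row per year with a value, which is why the long table is much larger than the raw file.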

The columns age and age_num look alike. age_num is numeric and equals age, except when age_num is 85, because everybody above that age falls into the same category. The table contains many indicators:

  • PROBSURV: Probability of surviving between two exact ages (px)

  • LIFEXP: Life expectancy at exact age (ex)

  • SURVIVORS: Number of survivors at exact age (lx)

  • PYLIVED: Number of person-years lived between two exact ages (Lx)

  • DEATHRATE: Death rate at age x (Mx)

  • PROBDEATH: Probability of dying between two exact ages (qx)

  • TOTPYLIVED: Total number of person-years lived after exact age (Tx)
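
These indicators are linked by the usual life-table identities: qx = 1 - px, l(x+1) = lx * px, Tx is the sum of Ly for y >= x, and ex = Tx / lx. The sketch below checks them on a made-up four-age table. The numbers are not Eurostat data, and the midpoint formula for Lx is a simplifying assumption rather than a statement of how Eurostat computes it.

```python
# Made-up survival probabilities for four consecutive ages (not real data).
px = [0.99, 0.98, 0.95, 0.90]

# PROBDEATH follows directly from PROBSURV.
qx = [1 - p for p in px]

# SURVIVORS, starting from an arbitrary radix of 100000 people.
lx = [100000.0]
for p in px:
    lx.append(lx[-1] * p)

# PYLIVED, assuming deaths occur on average halfway through each year.
Lx = [(a + b) / 2 for a, b in zip(lx, lx[1:])]

# TOTPYLIVED: person-years from each age up to the end of this toy table.
Tx = [sum(Lx[i:]) for i in range(len(Lx))]

# LIFEXP, truncated to the four ages covered by the toy table.
ex = [t / l for t, l in zip(Tx, lx)]
print(round(ex[0], 3))
```

Because the toy table stops after four ages, ex here is a truncated life expectancy. A real table runs up to the open-ended last age group.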
