module data.population#

Short summary#

module actuariat_python.data.population

Various functions to download population data.

source on GitHub

Functions#

function – truncated documentation
  • fecondite_france – Downloads the fertility table for France (Excel format).

  • population_france_year – Downloads the data for the French population from the INSEE website.

  • table_mortalite_euro_stat – This function retrieves mortality tables from Eurostat through table de mortalité

  • table_mortalite_france_00_02 – Downloads mortality tables for France, assuming they are available in Excel format.

Documentation#

Various functions to download population data.


actuariat_python.data.population.fecondite_france(url=None)#

Downloads the fertility table for France (Excel format).

Parameters:

url – source (URL or file)

Returns:

DataFrame

By default, the data comes from a local file, a copy of INSEE: Fécondité selon l’âge détaillé de la mère. pandas cannot read the original file, so it is converted first. See also INSEE Bilan Démographique 2016.


actuariat_python.data.population.population_france_year(url='https://www.insee.fr/fr/statistiques/fichier/1892086/pop-totale-france.xls', sheet_name=0, year=2020)#

Downloads the data for the French population from the INSEE website.

Parameters:
  • url – url

  • sheet_name – sheet index

  • year – last year to find

Returns:

DataFrame

The sheet index is 0 for the whole of France and 1 for metropolitan France. The last row aggregates multiple ages under the label 1914 ou avant (1914 or earlier); it stays aggregated, but its label is changed to 1914. Likewise, 100 ou plus (100 or more) is replaced by 100.

By default, the data comes from INSEE, Bilan Démographique.

2017/01: pandas does not seem to be able to read the file (old format). Convert it to a text file with Excel first.
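The relabeling described above (1914 ou avant becomes 1914, 100 ou plus becomes 100) can be sketched as follows. This is a minimal illustration, not the code used by population_france_year; the helper name normalize_label is hypothetical.

```python
import re


def normalize_label(label):
    """Map a row label to an integer.

    'NNNN ou avant' -> NNNN and 'NNN ou plus' -> NNN; plain numeric
    labels are converted as-is.
    Hypothetical helper, not part of actuariat_python.
    """
    text = str(label).strip()
    match = re.fullmatch(r"(\d+)\s+ou\s+(?:avant|plus)", text)
    if match:
        # Aggregated row: keep only the boundary value as the label.
        return int(match.group(1))
    # Excel cells are often read as floats (e.g. 2015.0), hence float first.
    return int(float(text))


labels = [normalize_label(x) for x in ["1914 ou avant", "100 ou plus", 42]]
```

The aggregated rows keep their aggregated values; only the label changes, so that the column can be treated as numeric.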


actuariat_python.data.population.table_mortalite_euro_stat(url='http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file=data/', name='demo_mlifetable.tsv.gz', final_name='mortalite.txt', whereTo='.', stop_at=None, fLOG=<function noLOG>)#

This function retrieves mortality tables from Eurostat through table de mortalité (this link is currently broken: data-publica no longer provides this database, so a copy is provided).

Parameters:
  • url – data source

  • name – data table name

  • final_name – the data is compressed, it needs to be uncompressed into a file, this parameter defines its name

  • whereTo – folder where the downloaded data is stored

  • stop_at – the overall process is quite long; if not None, only the first rows are kept

  • fLOG – logging function

Returns:

DataFrame

The function first checks whether the file final_name exists. If it does, the data is not downloaded again.
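The download-once behaviour can be sketched as below. This is an illustration of the idea under simple assumptions, not the function's actual code: fetch_once and fake_download are hypothetical names, and the real function also decompresses and preprocesses the file.

```python
import os
import tempfile


def fetch_once(final_name, download):
    """Call download(final_name) only if final_name does not exist yet.

    Returns True when a download was triggered, False when the existing
    file was reused. `download` is any callable that writes the file.
    Hypothetical helper, not part of actuariat_python.
    """
    if os.path.exists(final_name):
        return False
    download(final_name)
    return True


# Demo with a stand-in downloader that records its calls.
calls = []


def fake_download(path):
    calls.append(path)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("placeholder")


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "mortalite.txt")
    first = fetch_once(target, fake_download)   # file missing: downloads
    second = fetch_once(target, fake_download)  # file present: skipped
```

One consequence of this design is that a stale or truncated file is reused as well; deleting final_name forces a fresh download.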

The header has an unusual format: the coordinates are packed into the first column, separated by commas:

indic_de,sex,age,geo\time    2013     2012     2011     2010     2009

We need to preprocess the data to split this information into columns. The whole process takes 4-5 minutes: about 10 seconds to download the file (< 10 MB) and 4-5 minutes to preprocess it (this could be improved). The processed data contains the following columns:

['annee', 'valeur', 'age', 'age_num', 'indicateur', 'genre', 'pays']
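As a rough sketch of this preprocessing, the snippet below splits the comma-separated key and turns each year column into its own record with the column names listed above. It is not the library's implementation. The function names, the sample values, the age codes (Y_LT1, Y12, Y_GE85), the ":" marker for missing values, and the trailing flag letters are all assumptions about the file format.

```python
def parse_age(code):
    """Return a numeric age for an age code.

    Assumed codes: 'Y_LT1' (under 1), 'Y12' (age 12), 'Y_GE85' (85 and over).
    """
    if code == "Y_LT1":
        return 0
    if code.startswith("Y_GE"):
        return int(code[4:])
    return int(code[1:])


def melt_lines(lines):
    """Yield one dict per (key, year) cell of a tab-separated table."""
    rows = iter(lines)
    header = next(rows).rstrip("\n").split("\t")
    years = [int(cell.strip()) for cell in header[1:]]
    for line in rows:
        cells = line.rstrip("\n").split("\t")
        # The first cell packs four coordinates separated by commas.
        indicateur, genre, age, pays = cells[0].split(",")
        for year, cell in zip(years, cells[1:]):
            parts = cell.split()
            if not parts or parts[0].startswith(":"):
                continue  # assumed marker for a missing value
            yield {
                "annee": year,
                "valeur": float(parts[0]),  # any trailing flag is ignored
                "age": age,
                "age_num": parse_age(age),
                "indicateur": indicateur,
                "genre": genre,
                "pays": pays,
            }


# Made-up sample rows, for illustration only.
sample = [
    "indic_de,sex,age,geo\\time\t2013 \t2012 \n",
    "LIFEXP,F,Y_GE85,FR\t7.5 \t7.4 e\n",
    "LIFEXP,F,Y_LT1,FR\t: \t85.0 \n",
]
records = list(melt_lines(sample))
```

Processing the file line by line like this keeps memory use low; the resulting records can then be loaded into a DataFrame.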

Columns age and age_num look alike. age_num is the numeric version of age; the two agree except when age_num is 85, because everybody from that age onward falls into the same category. The table contains many indicators:

  • PROBSURV: probability of surviving between two exact ages (px)

  • LIFEXP: life expectancy at a given exact age (ex)

  • SURVIVORS: number of survivors at a given exact age (lx)

  • PYLIVED: number of person-years lived between two exact ages (Lx)

  • DEATHRATE: death rate at age x (Mx)

  • PROBDEATH: probability of dying between two exact ages (qx)

  • TOTPYLIVED: total number of person-years lived after a given exact age (Tx)
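These indicators are linked by the usual life-table identities: px = 1 - qx, l(x+1) = lx * px, Tx is the sum of Ly for y >= x, and ex = Tx / lx. The sketch below rebuilds the other columns from qx. It is only an illustration: the function name life_table is hypothetical, the approximation Lx = (lx + l(x+1)) / 2 assumes deaths are spread evenly over each year, and Eurostat's own methodology (notably for the first and last ages) may differ.

```python
def life_table(qx, radix=100000.0):
    """Build px, lx, Lx, Tx and ex columns from death probabilities qx.

    qx[i] is the probability of dying between exact ages i and i + 1.
    Hypothetical helper; the table is truncated at the last age given.
    """
    px = [1.0 - q for q in qx]
    # Survivors: start from the radix and apply survival probabilities.
    lx = [radix]
    for p in px:
        lx.append(lx[-1] * p)
    # Person-years lived, assuming deaths are spread evenly over each year.
    Lx = [(lx[i] + lx[i + 1]) / 2.0 for i in range(len(qx))]
    # Total person-years lived after age x: cumulative sum from the oldest age.
    Tx = []
    total = 0.0
    for value in reversed(Lx):
        total += value
        Tx.append(total)
    Tx.reverse()
    # Life expectancy at exact age x.
    ex = [t / l if l > 0 else 0.0 for t, l in zip(Tx, lx)]
    return {"px": px, "lx": lx[:-1], "Lx": Lx, "Tx": Tx, "ex": ex}


# Toy input, not real data: half the cohort dies in year 0, the rest in year 1.
table = life_table([0.5, 1.0])
```

Such a reconstruction is mostly useful as a consistency check between the indicator series for a given country, gender and year.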


actuariat_python.data.population.table_mortalite_france_00_02(homme=None, femme=None)#

Downloads mortality tables for France, assuming they are available in Excel format.

Parameters:
  • homme – table for men

  • femme – table for women

Returns:

DataFrame

The final DataFrame merges both sheets. The data comes from Institut des Actuaires: Références de mortalité or Références techniques.
