Data sources

Wikipedia

mlstatpy.data.wikipedia.download_dump(country, name, folder=".", unzip=True, timeout=-1, overwrite=False, fLOG=noLOG)

Downloads Wikipedia dumps from https://dumps.wikimedia.org/frwiki/latest/.

mlstatpy.data.wikipedia.download_pageviews(dt, folder=".", unzip=True, timeout=-1, overwrite=False, fLOG=noLOG)

Downloads Wikipedia page counts for a given date and hour. The URL follows the pattern:

https://dumps.wikimedia.org/other/pageviews/%Y/%Y-%m/pagecounts-%Y%m%d-%H0000.gz
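The pattern above uses strftime-style fields, so it can be expanded with the standard library. The following is a minimal sketch of that expansion only; the helper name build_pageviews_url is hypothetical and is not part of mlstatpy, and the sketch does not download anything.

```python
from datetime import datetime

# Pattern quoted in the documentation above; strftime fills in the date fields.
PAGEVIEWS_URL_PATTERN = (
    "https://dumps.wikimedia.org/other/pageviews/"
    "%Y/%Y-%m/pagecounts-%Y%m%d-%H0000.gz"
)


def build_pageviews_url(dt: datetime) -> str:
    """Hypothetical helper: expand the URL pattern for a given date and hour."""
    # Minutes and seconds of dt are ignored: the pattern hard-codes "0000".
    return dt.strftime(PAGEVIEWS_URL_PATTERN)


if __name__ == "__main__":
    # 2 January 2016, 03:00
    print(build_pageviews_url(datetime(2016, 1, 2, 3)))
    # → https://dumps.wikimedia.org/other/pageviews/2016/2016-01/pagecounts-20160102-030000.gz
```

Because the hour is the finest granularity in the pattern, two datetime values within the same hour map to the same file.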

mlstatpy.data.wikipedia.download_titles(country, folder=".", unzip=True, timeout=-1, overwrite=False, fLOG=noLOG)

Downloads Wikipedia titles from https://dumps.wikimedia.org/frwiki/latest/latest-all-titles-in-ns0.gz.

mlstatpy.data.wikipedia.enumerate_titles(filename, norm=True, encoding="utf8")

Enumerates titles from a file.
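The documentation does not say how titles are stored or what norm does. As an illustration only, the sketch below assumes an unzipped file with one title per line, and assumes that normalization means replacing underscores with spaces, as Wikipedia titles use underscores in URLs. The function iter_titles is hypothetical; it mirrors the signature above but is not the mlstatpy implementation.

```python
from typing import Iterator


def iter_titles(filename: str, norm: bool = True,
                encoding: str = "utf8") -> Iterator[str]:
    """Hypothetical sketch: lazily yield one title per non-empty line."""
    with open(filename, "r", encoding=encoding) as f:
        for line in f:
            title = line.strip("\r\n ")
            if not title:
                continue  # skip blank lines
            if norm:
                # Assumed normalization: underscores become spaces.
                title = title.replace("_", " ")
            yield title
```

A generator keeps memory use flat, which matters because a full titles file can hold millions of lines.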
