module datasource.http_retrieve

Short summary

module pyensae.datasource.http_retrieve

Various functions to retrieve data from a website, typically a reference website.

source on GitHub

Classes

class                   truncated documentation
DownloadDataException   raised when data cannot be downloaded
RetrieveDataException   raised when data cannot be downloaded

Functions

function            truncated documentation
download_data       Retrieves a module given its name, a text file or a zip file, looks for it on http://www.xavierdupre.fr/...
remove_empty_line   Removes empty lines from an imported file.

Documentation

exception pyensae.datasource.http_retrieve.DownloadDataException

Bases: Exception

raised when data cannot be downloaded

exception pyensae.datasource.http_retrieve.RetrieveDataException

Bases: Exception

raised when data cannot be downloaded

pyensae.datasource.http_retrieve.download_data(name, moduleName=None, url=None, glo=None, loc=None, whereTo='.', website='xd', timeout=None, retry=2, silent=False, fLOG=<function noLOG>)

Retrieves a module given its name, a text file or a zip file. The function looks for it on http://www.xavierdupre.fr/... (website), copies the file locally (see whereTo) and uncompresses it if it is a zip file (or a tar.gz file). In most cases, this function can be replaced by urllib.request.urlretrieve:

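import urllib.request

# address of the file to download (left elided here)
url = 'https://...'
# local path where the downloaded file is saved
dest = "downloaded_file.bin"
urllib.request.urlretrieve(url, dest)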
Parameters:
  • name – (str) name of the file to download

  • moduleName – (str|None) alias for the module, as in import name as moduleName, used if name is a module

  • url – (str|list|None) link to the website to use (or the websites if list)

  • glo – (dict|None) if None, it is replaced by globals()

  • loc – (dict|None) if None, it is replaced by locals()

  • whereTo – specify a folder where downloaded files will be placed

  • website – website to look for

  • timeout – timeout (seconds) when establishing the connection (see urlopen)

  • retry – number of retries in case of failure when downloading the data

  • silent – if True, converts some exceptions into warnings when extracting a tar file

  • fLOG – logging function

Returns:

modules or list of files
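The moduleName, glo and loc parameters mirror the statement import name as moduleName. The following is a minimal sketch of that idea only, assuming the module is already importable; import_as is a hypothetical helper written for illustration and is not part of pyensae.

```python
import importlib


def import_as(name, alias=None, namespace=None):
    """Import module `name` and bind it in `namespace` under `alias`.

    Hypothetical helper: it only illustrates the effect of
    `import name as alias`; it is not pyensae's implementation.
    """
    if namespace is None:
        # Assumption: fall back to this module's globals, in the spirit
        # of what the documentation describes for glo=None.
        namespace = globals()
    module = importlib.import_module(name)
    namespace[alias or name] = module
    return module


ns = {}
import_as("json", alias="js", namespace=ns)
print(ns["js"].dumps({"a": 1}))
```

Passing an explicit dictionary keeps the binding out of the caller's global namespace, which is probably the safer choice in a notebook.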

By extension, this function also downloads various zip files and decompresses them. If the file was already downloaded, the function does not download it again.
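The "download once, retry on failure" behaviour can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code of download_data: fetch_once and DownloadError are hypothetical names, and reading retry as "one initial attempt plus retry further attempts" is a guess.

```python
import os
import urllib.error
import urllib.request


class DownloadError(Exception):
    """Hypothetical stand-in for an error such as DownloadDataException."""


def fetch_once(url, dest, retry=2, timeout=None):
    """Download `url` to `dest` unless `dest` already exists.

    Returns True if a download happened, False if the file was reused.
    """
    if os.path.exists(dest):
        return False  # already downloaded, nothing to do
    last_error = None
    # Assumption: one initial attempt plus `retry` further attempts.
    for _ in range(retry + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                data = response.read()
            with open(dest, "wb") as f:
                f.write(data)
            return True
        except (urllib.error.URLError, OSError) as exc:
            last_error = exc  # keep the last failure for the final error
    raise DownloadError(f"unable to download {url!r}") from last_error
```

The content is read fully before the destination file is opened, so a failed attempt does not leave a partial file that a later call would mistake for a completed download.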

Download data for a practical lesson

from pyensae.datasource import download_data
download_data('voeux.zip', website='xd')

Download data from a website

download_data("facebook.tar.gz", website="http://snap.stanford.edu/data/")
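Once an archive such as a zip or tar.gz file is on disk, decompressing it can be done with the standard zipfile and tarfile modules. The sketch below shows one possible way; unpack is a hypothetical helper, and returning the list of member names is an assumption made for illustration.

```python
import tarfile
import zipfile


def unpack(archive, where_to="."):
    """Extract a zip or tar archive into `where_to`, return member names.

    Only use this on archives you trust: extractall writes whatever
    paths the archive declares.
    """
    if zipfile.is_zipfile(archive):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(where_to)
            return zf.namelist()
    if tarfile.is_tarfile(archive):
        # mode "r:*" lets tarfile detect the compression (gz, bz2, xz)
        with tarfile.open(archive, "r:*") as tf:
            tf.extractall(where_to)
            return tf.getnames()
    return [archive]  # not an archive: leave the file as it is
```

Detecting the format from the file content rather than from its extension means a misnamed archive is still handled.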

If this does not work, consider using standard Python instead; see Download a file from Dropbox with Python.

Changed in version 1.1: Parameters retry, silent were added.

Changed in version 1.2: Parameter url can be a list. The function tries the first one which contains the file.
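Trying several websites in order, as described for a list-valued url, could look like the sketch below. It is written under assumptions: first_available is a hypothetical helper, the file URL is formed by joining the base URL and the file name with a slash, and a website "contains the file" when opening that URL succeeds.

```python
import urllib.error
import urllib.request


def first_available(name, base_urls, timeout=None):
    """Return (url, content) for the first base URL that serves `name`."""
    if isinstance(base_urls, str):
        base_urls = [base_urls]  # accept a single URL as well as a list
    errors = []
    for base in base_urls:
        url = base.rstrip("/") + "/" + name
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return url, response.read()
        except (urllib.error.URLError, OSError) as exc:
            errors.append((url, exc))  # remember the failure, try the next
    tried = ", ".join(url for url, _ in errors)
    raise FileNotFoundError(f"{name!r} not found; tried: {tried}")
```

Collecting the failures instead of raising immediately lets the final error report every location that was tried.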

pyensae.datasource.http_retrieve.remove_empty_line(file)

Removes empty lines from an imported file.

Parameters:
  • file – local file name
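The exact rules applied by remove_empty_line are not described here. As a rough illustration of the idea, the sketch below rewrites a text file without its blank lines; strip_blank_lines is a hypothetical helper, and treating whitespace-only lines as empty and assuming UTF-8 encoding are both assumptions.

```python
def strip_blank_lines(path, encoding="utf-8"):
    """Rewrite `path` without blank lines; return how many were removed."""
    with open(path, "r", encoding=encoding) as f:
        lines = f.readlines()
    # Assumption: a line containing only whitespace counts as empty.
    kept = [line for line in lines if line.strip()]
    with open(path, "w", encoding=encoding) as f:
        f.writelines(kept)
    return len(lines) - len(kept)
```

The file is modified in place, so keep a copy if the original layout matters.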
