module faq.faq_web
Short summary
module ensae_teaching_cs.faq.faq_web
A few functions for web scraping
Functions
function | truncated documentation |
---|---|
_get_selenium_browser | Returns the associated driver with some custom settings. The function automatically gets chromedriver if not present … |
webhtml | Uses the module selenium to retrieve the html content of a website. |
webshot | Uses the module selenium to take a picture of a website. If url and img are lists, the function goes … |
Documentation
A few functions for web scraping
- ensae_teaching_cs.faq.faq_web._get_selenium_browser(navigator, fLOG=<function noLOG>)
Returns the associated driver with some custom settings.
The function automatically gets chromedriver if not present (Windows only). On Linux, package chromium-driver should be installed:
apt-get install chromium-driver
Issue with Selenium and Firefox
Firefox >= v47 does not work on Windows. See Selenium WebDriver and Firefox 47.
See ChromeDriver download and Error message: “chromedriver” executable needs to be available in the path.
See Selenium - Remote WebDriver example, see also Running the remote driver with Selenium and python.
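As an illustration of how a driver can be chosen from a navigator name, here is a minimal sketch. It is not the code of _get_selenium_browser: the names driver_class_name and make_driver and the mapping are hypothetical, and the mapping covers only the navigators listed in the parameter descriptions on this page. Selenium is imported lazily so the name lookup can be used without a browser installed.

```python
# Hypothetical mapping from a navigator name to the name of a
# selenium.webdriver class (assumption: only these three are covered).
_DRIVER_CLASSES = {"chrome": "Chrome", "firefox": "Firefox", "ie": "Ie"}


def driver_class_name(navigator):
    """Return the selenium.webdriver class name for a navigator name."""
    key = navigator.strip().lower()
    if key not in _DRIVER_CLASSES:
        raise ValueError("unsupported navigator: {!r}".format(navigator))
    return _DRIVER_CLASSES[key]


def make_driver(navigator):
    """Open a browser session; needs selenium and the matching driver binary."""
    from selenium import webdriver  # imported here so the lookup works without selenium

    return getattr(webdriver, driver_class_name(navigator))()
```

Calling make_driver("chrome") fails with the “chromedriver” error quoted above when the executable is not on the path, which is the situation the automatic download is meant to avoid.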
- ensae_teaching_cs.faq.faq_web.webhtml(url, navigator='opera', fLOG=<function noLOG>)
Uses the module selenium to retrieve the html content of a website.
- Parameters:
url – url
navigator – browser to use: firefox, chrome (ie does not work well)
fLOG – logging function
- Returns:
list of [(url, html)]
Check the list of available webdriver at selenium/webdriver and add one to the code if needed.
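The retrieval step can be sketched directly with the Selenium API. This is an assumption about the general approach, not the implementation of webhtml: fetch_html is a hypothetical helper that takes an already opened driver and relies only on the driver's get method and page_source attribute.

```python
def fetch_html(urls, driver):
    """Return a list of (url, html) pairs using an already opened driver."""
    if isinstance(urls, str):
        urls = [urls]  # accept a single url as well as a list
    pages = []
    for url in urls:
        driver.get(url)  # navigate to the page
        pages.append((url, driver.page_source))  # html as seen by the browser
    return pages


# Usage with a real browser (requires selenium and a driver binary):
# from selenium import webdriver
# driver = webdriver.Firefox()
# try:
#     pages = fetch_html("https://www.python.org", driver)
# finally:
#     driver.quit()
```

Passing the driver in, rather than creating it inside the helper, keeps one browser session for all urls and lets the caller decide when to close it.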
- ensae_teaching_cs.faq.faq_web.webshot(img, url, navigator='opera', add_date=False, size=None, fLOG=<function noLOG>)
Uses the module selenium to take a picture of a website. If url and img are lists, the function goes through all the urls and saves one webshot per url.
- Parameters:
img – image name or list of image names
url – url or list of urls
navigator – browser to use: firefox, chrome (ie does not work well)
add_date – add a date to the image filename
size – to resize the webshot (if not None)
fLOG – logging function
- Returns:
list of [(url, image name)]
Check the list of available webdriver at selenium/webdriver and add one to the code if needed.
Chrome requires the chromedriver. See function install_chromedriver.
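The loop over paired image names and urls can be sketched as follows. This is a hedged illustration, not the code of webshot: take_webshots and dated_name are hypothetical names, the date format placed before the extension is an assumption (the page does not say how add_date changes the filename), and the size parameter is left out. Only the driver's get and save_screenshot methods are used.

```python
import datetime
import os


def dated_name(img, when):
    """Insert a date before the extension (assumed format: name.YYYY-MM-DD.ext)."""
    root, ext = os.path.splitext(img)
    return "{}.{}{}".format(root, when.strftime("%Y-%m-%d"), ext)


def take_webshots(img, url, driver, add_date=False, when=None):
    """Save one screenshot per url and return a list of (url, image name)."""
    if isinstance(url, str):
        img, url = [img], [url]  # single image name and single url
    if len(img) != len(url):
        raise ValueError("img and url must have the same length")
    when = when or datetime.date.today()
    results = []
    for name, page in zip(img, url):
        if add_date:
            name = dated_name(name, when)
        driver.get(page)              # load the page
        driver.save_screenshot(name)  # write the image to disk
        results.append((page, name))
    return results
```

The when argument exists only to make the dated filename reproducible; by default the current date is used.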