module hackathon.web_search_helper

Short summary

module ensae_projects.hackathon.web_search_helper

Helpers for the 2018 hackathon, related to searching the internet.

Functions

function             truncated documentation
extract_bing_result  Extract the first results from a search page assuming it comes from Bing Image.
query_bing_image     Returns the search page from Bing Image for a specific query.

Documentation

ensae_projects.hackathon.web_search_helper.extract_bing_result(search_page, filter_fct=<function <lambda>>)

Extract the first results from a search page assuming it comes from Bing Image.

Parameters:
  • search_page – content of a Bing Image search page (or a filename)
  • filter_fct – predicate applied to each url, filter(u) --> True or False; a url is removed when the function returns False
Returns:

a list of urls
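
The filter_fct contract described above (True keeps a url, False removes it) can be illustrated with a minimal sketch. The names keep_image_url and apply_filter are hypothetical and exist only for this example; apply_filter is a stand-in for the filtering step and is not the library's code.

```python
from urllib.parse import urlparse


def keep_image_url(url):
    """Hypothetical predicate: True keeps the url, False removes it."""
    # Look only at the path so that query strings do not hide the extension.
    path = urlparse(url).path.lower()
    return path.endswith((".jpg", ".jpeg", ".png"))


def apply_filter(urls, filter_fct):
    """Stand-in for the filtering step (an assumption, not the library's code)."""
    return [u for u in urls if filter_fct(u)]


sample_urls = [
    "https://example.com/cat.jpg",
    "https://example.com/page.html",
    "https://example.com/dog.PNG?size=large",
]
kept = apply_filter(sample_urls, keep_image_url)
```

A predicate of this shape would be passed as filter_fct=keep_image_url when calling extract_bing_result.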

ensae_projects.hackathon.web_search_helper.query_bing_image(query, folder_cache='cache_search_page', filter_fct=<function <lambda>>, add_options=False, use_selenium=False, navigator=None, fLOG=None)

Returns the search page from Bing Image for a specific query.

Parameters:
  • query – search query
  • folder_cache – folder used to store the result page, or to retrieve it if the query was already searched for
  • filter_fct – predicate applied to each url, filter(u) --> True or False; a url is removed when the function returns False
  • add_options – add options to the search url
  • use_selenium – relies on webhtml
  • navigator – see webhtml
  • fLOG – logging function
Returns:

list of urls
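
The role of folder_cache (store the result page, reuse it when the same query comes back) can be sketched as follows. This is only an illustration of the caching idea: cached_page, fake_fetch and the file naming scheme (SHA-1 of the query) are assumptions made for this example, not the scheme used by query_bing_image.

```python
import hashlib
import os
import tempfile


def cached_page(query, folder_cache, fetch):
    """Return the page for `query`, calling `fetch` only on a cache miss.

    Hypothetical helper: the file name (SHA-1 of the query) is an
    assumption made for this sketch.
    """
    os.makedirs(folder_cache, exist_ok=True)
    name = hashlib.sha1(query.encode("utf-8")).hexdigest() + ".html"
    path = os.path.join(folder_cache, name)
    if os.path.exists(path):
        # Cache hit: reuse the page stored by an earlier call.
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    # Cache miss: fetch the page, then store it for later calls.
    content = fetch(query)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return content


calls = []


def fake_fetch(query):
    """Stand-in for a real download; records each call."""
    calls.append(query)
    return "<html>results for %s</html>" % query


with tempfile.TemporaryDirectory() as folder:
    first = cached_page("blue car", folder, fake_fetch)
    second = cached_page("blue car", folder, fake_fetch)
```

In this sketch the second call reads the stored file, so fake_fetch runs only once for the repeated query.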
