module ipythonhelper.run_notebook

Short summary

module pyquickhelper.ipythonhelper.run_notebook

Functions to run notebooks.

source on GitHub

Functions

function – truncated documentation

  • _cache_url_to_file – Downloads file corresponding to url stored in cache_urls.

  • _existing_dump – Loads an existing dump.

  • _get_dump_default_path – Proposes a default location to dump results about notebooks execution.

  • badge_notebook_coverage – Builds a badge reporting on the notebook coverage. It gives the proportion of run cells.

  • execute_notebook_list – Executes a list of notebooks.

  • execute_notebook_list_finalize_ut – Checks the list of results and raises an exception if one failed. This is meant to be used in unit tests.

  • get_additional_paths – Returns a list of paths to add before running the notebooks for a given list of modules.

  • notebook_coverage – Extracts a list of notebooks and merges with a list of runs dumped by function execute_notebook_list_finalize_ut(). …

  • retrieve_notebooks_in_folder – Retrieves notebooks in a test folder.

  • run_notebook – Runs a notebook end to end; it is inspired by the module runipy.

Documentation

Functions to run notebooks.

pyquickhelper.ipythonhelper.run_notebook._cache_url_to_file(cache_urls, folder, fLOG=<function noLOG>)

Downloads file corresponding to url stored in cache_urls.

Parameters
  • cache_urls – list of urls

  • folder – where to store the cached files

  • fLOG – logging function

Returns

dictionary { url: file }

The function detects if the file was already downloaded. In that case, it does not do it a second time.
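
The caching behaviour described above can be sketched as follows. This is only an illustration, not the library's code: the helper name `cache_urls_to_files`, the `fetch` callable and the rule used to name the local file are assumptions introduced for the example.

```python
import os
import tempfile


def cache_urls_to_files(cache_urls, folder, fetch):
    """Return {url: local_file}, downloading only the files that are missing.

    Hypothetical sketch: `fetch(url)` is an assumed callable returning bytes.
    """
    os.makedirs(folder, exist_ok=True)
    result = {}
    for url in cache_urls:
        # Assumption: the last path component is used as the local name.
        name = url.rstrip("/").split("/")[-1]
        dest = os.path.abspath(os.path.join(folder, name))
        if not os.path.exists(dest):
            # Download only when the file is not already in the cache.
            with open(dest, "wb") as f:
                f.write(fetch(url))
        result[url] = dest
    return result


# Usage with a fake downloader, so nothing is fetched from the network.
calls = []


def fake_fetch(url):
    calls.append(url)
    return b"a,b\n1,2\n"


with tempfile.TemporaryDirectory() as tmp:
    urls = ["https://example.com/data/table.csv"]
    first = cache_urls_to_files(urls, tmp, fake_fetch)
    second = cache_urls_to_files(urls, tmp, fake_fetch)
```

The second call finds the file on disk and does not call `fake_fetch` again, which mirrors the "does not do it a second time" behaviour.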

pyquickhelper.ipythonhelper.run_notebook._existing_dump(dump)

Loads an existing dump.

Parameters

dump – filename

Returns

pandas.DataFrame

pyquickhelper.ipythonhelper.run_notebook._get_dump_default_path(dump)

Proposes a default location to dump results about notebooks execution.

Parameters

dump – location of the dump or module.

Returns

location of the dump

The result might be equal to the input if dump is already a path.

pyquickhelper.ipythonhelper.run_notebook.badge_notebook_coverage(df, image_name)

Builds a badge reporting on the notebook coverage. It gives the proportion of run cells.

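Parameters
  • df – dataframe of notebook runs, as returned by notebook_coverage

  • image_name – name of the image file to produce

Returns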

coverage estimation

The function relies on module Pillow.

pyquickhelper.ipythonhelper.run_notebook.execute_notebook_list(folder, notebooks, clean_function=None, valid=None, fLOG=<function noLOG>, additional_path=None, deepfLOG=<function noLOG>, kernel_name='python', log_level='30', extended_args=None, cache_urls=None, replacements=None, detailed_log=None, startup_timeout=300)

Executes a list of notebooks.

Parameters
  • folder – folder (where to execute the notebook, current folder for the notebook)

  • notebooks – list of notebooks to execute (or a list of tuple(notebook, code which initializes the notebook))

  • clean_function – function which transforms the code before running it

  • valid – if not None, valid is a function which returns whether or not the cell should be executed; if the function returns None, the execution of the notebook stops and the remaining cells are skipped

  • fLOG – logging function

  • deepfLOG – logging function used to run the notebook

  • additional_path – path to add to sys.path before running the notebook

  • kernel_name – kernel name, it can be None

  • log_level – Choices: (0, 10, 20, 30=default, 40, 50, ‘DEBUG’, ‘INFO’, ‘WARN’, ‘ERROR’, ‘CRITICAL’)

  • extended_args – other arguments to pass to the command line (‘--KernelManager.autorestart=True’ for example), see Jupyter Notebook Arguments for a full list

  • cache_urls – list of urls to cache

  • replacements – additional replacements

  • detailed_log – detailed log

  • startup_timeout – wait for this long for the kernel to be ready, see wait_for_ready
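
The log_level parameter accepts both numbers and names, and its default is the string '30'. A minimal sketch of how such a value could be normalized to an integer, assuming the names map to the levels of the standard `logging` module; `normalize_log_level` is a hypothetical helper and not part of pyquickhelper:

```python
import logging

# Names listed in the documentation, mapped to the standard logging levels.
_LEVEL_NAMES = {
    "DEBUG": logging.DEBUG,        # 10
    "INFO": logging.INFO,          # 20
    "WARN": logging.WARNING,       # 30
    "ERROR": logging.ERROR,        # 40
    "CRITICAL": logging.CRITICAL,  # 50
}


def normalize_log_level(log_level):
    """Convert 30, '30' or 'WARN' to the integer level 30 (hypothetical helper)."""
    if isinstance(log_level, int):
        return log_level
    text = str(log_level).strip()
    if text.isdigit():
        return int(text)
    # Raises KeyError for a name which is not one of the documented choices.
    return _LEVEL_NAMES[text.upper()]


levels = [normalize_log_level(v) for v in (10, "30", "WARN", "critical")]
```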

Returns

dictionary of dictionaries { notebook_name: {  } }

If isSuccess is False, statistics contains the execution time, output is the exception raised during the execution.

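The signature of function valid_cell is:

def valid_cell(cell):
    # Return True to execute the cell, False to skip it,
    # or None to stop the execution of the notebook before this cell.
    return True

The signature of function clean_function is:

def clean_function(cell):
    # Return the code to execute in place of the original cell content.
    new_cell_content = cell
    return new_cell_content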
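
As an illustration of the two callbacks, here is a small sketch under stated assumptions: each cell is passed as a plain string of code, and `select_cells` is a hypothetical driver written for this example, not the loop the library actually runs.

```python
def valid_cell(cell):
    """Run cells until one contains the marker '# STOP'; skip magic commands."""
    if "# STOP" in cell:
        return None   # stop the notebook before this cell
    if cell.lstrip().startswith("%"):
        return False  # skip this cell only
    return True


def clean_function(cell):
    """Return the cell's code without the lines calling plt.show()."""
    lines = [line for line in cell.split("\n") if "plt.show()" not in line]
    return "\n".join(lines)


def select_cells(cells, valid=None, clean_function=None):
    """Hypothetical driver: return the code that would be executed."""
    selected = []
    for cell in cells:
        if valid is not None:
            decision = valid(cell)
            if decision is None:
                break      # None stops the whole notebook
            if not decision:
                continue   # False skips the current cell
        selected.append(clean_function(cell) if clean_function else cell)
    return selected


cells = [
    "import math",
    "%matplotlib inline",
    "x = 1\nplt.show()",
    "# STOP\ny = 2",
    "z = 3",
]
executed = select_cells(cells, valid_cell, clean_function)
```

With these callbacks only the first and third cells are kept, and the third one loses its `plt.show()` line.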

The execution of a notebook might fail because it relies on remote data specified by url. The function downloads the data first and stores it in folder working_dir (must not be None). The url string is replaced by the absolute path to the file.
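
The replacement of a url by a local path can be pictured with the sketch below. It assumes a dictionary { url: file } such as the one returned by _cache_url_to_file; the function name `replace_cached_urls` and the sample paths are hypothetical.

```python
def replace_cached_urls(code, url_to_file):
    """Replace each cached url found in `code` by its local file path."""
    # Longest urls first, so that a url which is a prefix of another one
    # does not break the longer replacement.
    for url in sorted(url_to_file, key=len, reverse=True):
        code = code.replace(url, url_to_file[url])
    return code


mapping = {"https://example.com/data.csv": "/tmp/work/data.csv"}
cell = 'df = read_csv("https://example.com/data.csv")'
new_cell = replace_cached_urls(cell, mapping)
```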

Changed in version 1.8: Parameters detailed_log, startup_timeout were added.

pyquickhelper.ipythonhelper.run_notebook.execute_notebook_list_finalize_ut(res, dump=None, fLOG=<function noLOG>)

Checks the list of results and raises an exception if one failed. This is meant to be used in unit tests.

Parameters
  • res – output of execute_notebook_list

  • dump – if not None, dump the results of the execution in a flat file

  • fLOG – logging function

The dump relies on pandas and appends the results to a previous dump. If dump is a module, the function stores the output of the execution in a default location only if the process does not run on travis or appveyor. The default location is something like:

    /var/lib/jenkins/workspace/pyquickhelper/pyquickhelper_UT_37_std/_doc/sphinxdoc/source/pyquickhelper/../../../_notebook_dumps/notebook.pyquickhelper.txt

pyquickhelper.ipythonhelper.run_notebook.get_additional_paths(modules)

Returns a list of paths to add before running the notebooks for a given list of modules.

Returns

list of paths

pyquickhelper.ipythonhelper.run_notebook.notebook_coverage(module_or_path, dump=None, too_old=30)

Extracts a list of notebooks and merges with a list of runs dumped by function execute_notebook_list_finalize_ut.

Parameters
  • module_or_path – a module or a path

  • dump – dump (or None to get the location by default)

  • too_old – drop executions older than too_old days from now

Returns

dataframe

If module_or_path is a module, the function will get a list of notebooks assuming it follows the same design as pyquickhelper.
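
The effect of too_old can be illustrated with plain Python records. This is a sketch only: the library works on a dataframe, whereas the example assumes a list of dictionaries with a hypothetical 'date' key, and `drop_old_runs` is a name made up for the illustration.

```python
from datetime import datetime, timedelta


def drop_old_runs(runs, too_old=30, now=None):
    """Keep the runs whose 'date' is at most `too_old` days before `now`."""
    now = now or datetime.now()
    limit = now - timedelta(days=too_old)
    return [run for run in runs if run["date"] >= limit]


runs = [
    {"notebook": "a.ipynb", "date": datetime(2020, 2, 20)},
    {"notebook": "b.ipynb", "date": datetime(2020, 1, 1)},
]
# A fixed reference date keeps the example reproducible.
recent = drop_old_runs(runs, too_old=30, now=datetime(2020, 3, 1))
```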

pyquickhelper.ipythonhelper.run_notebook.retrieve_notebooks_in_folder(folder, posreg='.*[.]ipynb$', negreg=None)

Retrieves notebooks in a test folder.

Parameters
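  • folder – folder

  • posreg – regular expression a notebook name must match

  • negreg – regular expression for the notebook names to exclude (None to exclude nothing)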

Returns

list of found notebooks
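
A minimal sketch of this kind of filtering, assuming posreg selects file names and negreg, when given, excludes them. That reading is inferred from the parameter names; the helper `find_notebooks` is hypothetical and, unlike the library function may do, it only looks at the top level of the folder.

```python
import os
import re
import tempfile


def find_notebooks(folder, posreg=".*[.]ipynb$", negreg=None):
    """Return the files of `folder` matching posreg and not matching negreg."""
    pos = re.compile(posreg)
    neg = re.compile(negreg) if negreg else None
    found = []
    for name in sorted(os.listdir(folder)):
        if not pos.search(name):
            continue  # not a notebook
        if neg is not None and neg.search(name):
            continue  # explicitly excluded
        found.append(os.path.join(folder, name))
    return found


with tempfile.TemporaryDirectory() as tmp:
    for name in ["a.ipynb", "b_long.ipynb", "notes.txt"]:
        open(os.path.join(tmp, name), "w").close()
    every = [os.path.basename(p) for p in find_notebooks(tmp)]
    short = [os.path.basename(p) for p in find_notebooks(tmp, negreg="_long")]
```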

pyquickhelper.ipythonhelper.run_notebook.run_notebook(filename, profile_dir=None, working_dir=None, skip_exceptions=False, outfilename=None, encoding='utf8', additional_path=None, valid=None, clean_function=None, code_init=None, fLOG=<function noLOG>, kernel_name='python', log_level='30', extended_args=None, cache_urls=None, replacements=None, detailed_log=None, startup_timeout=300)

Runs a notebook end to end; it is inspired by the module runipy.

Parameters
  • filename – notebook filename

  • profile_dir – profile directory

  • working_dir – working directory

  • skip_exceptions – skip exceptions

  • outfilename – if not None, saves the output in this notebook

  • encoding – encoding for the notebooks

  • additional_path – additional paths for import

  • valid – if not None, valid is a function which returns whether or not the cell should be executed; if the function returns None, the execution of the notebook stops and the remaining cells are skipped

  • clean_function – function which cleans a cell’s code before executing it (None to skip the cleaning)

  • code_init – code to run before the execution of the notebook as if it was a cell

  • fLOG – logging function

  • kernel_name – kernel name, it can be None

  • log_level – Choices: (0, 10, 20, 30=default, 40, 50, ‘DEBUG’, ‘INFO’, ‘WARN’, ‘ERROR’, ‘CRITICAL’)

  • extended_args – other arguments to pass to the command line (‘--KernelManager.autorestart=True’ for example), see Jupyter Notebook Arguments for a full list

  • cache_urls – list of urls to cache

  • replacements – list of additional replacements, list of tuple

  • detailed_log – a second function to log more information when executing the notebook, this should be a function with the same signature as print or None

  • startup_timeout – wait for this long for the kernel to be ready, see wait_for_ready

Returns

tuple (statistics, output)

Warning

The function calls basicConfig.

Run a notebook end to end

from pyquickhelper.ipythonhelper import run_notebook
run_notebook("source.ipynb", working_dir="temp",
             outfilename="modified.ipynb",
             additional_path=["custom_path"])

The function adds the local variable theNotebook with the absolute file name of the notebook.

The execution of a notebook might fail because it relies on remote data specified by url. The function downloads the data first and stores it in folder working_dir (must not be None). The url string is replaced by the absolute path to the file.

Changed in version 1.8: Parameters detailed_log, startup_timeout were added.
