Helpers to play with files

java

pyenbc.filehelper.download_java_standalone (version = '2.7.1-rc2')

Downloads the standalone jython if it does not exist. The version should default to JYTHON_VERSION in order to match the cluster's version.

pyenbc.filehelper.maven_helper.download_jar_from_maven (group, lib, version, location, overwrite = False)

Downloads a jar file from Maven.
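As a rough illustration of how the group, lib, and version arguments can map to a download location, the sketch below builds a jar URL following the standard Maven 2 repository layout. The helper name `maven_jar_url` and the default repository URL (Maven Central) are assumptions made for this example; the library's actual implementation may differ.

```python
def maven_jar_url(group, lib, version,
                  repository="https://repo1.maven.org/maven2"):
    """Builds the URL of a jar in a Maven 2 layout repository.

    Hypothetical helper, not part of pyenbc.
    """
    # In the standard layout, dots in the group id become path separators.
    group_path = group.replace(".", "/")
    return f"{repository}/{group_path}/{lib}/{version}/{lib}-{version}.jar"


print(maven_jar_url("org.apache.pig", "pig", "0.17.0"))
# https://repo1.maven.org/maven2/org/apache/pig/pig/0.17.0/pig-0.17.0.jar
```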

pyenbc.filehelper.pig_helper.download_pig_standalone (pig_version = '0.17.0', hadoop_version = '3.3.0', fLOG = noLOG)

Downloads the standalone :epkg:`pig` if it does not exist. hadoop_version should default to HADOOP_VERSION in order to match the cluster's version.

pyenbc.filehelper.pig_helper.get_hadoop_jars ()

Returns the list of jars to include into the command line in order to run :epkg:`HADOOP`.

pyenbc.filehelper.pig_helper.get_hadoop_path ()

Assumes that a folder pig hadoopjar is present in this directory and returns that folder.

pyenbc.filehelper.jython_helper.get_java_cmd ()

Returns the java path.

pyenbc.filehelper.jython_helper.get_java_path ()

Returns the java path.

raises FileNotFoundError:

if java is not found
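The lookup-or-raise behaviour described above can be sketched as follows. This is only one plausible strategy (check JAVA_HOME first, then PATH); the function name `find_java` and its `environ` parameter are hypothetical and do not describe how pyenbc actually locates java.

```python
import os
import shutil


def find_java(environ=None):
    """Returns the path of the java executable or raises FileNotFoundError.

    Hypothetical helper, not part of pyenbc.
    """
    env = os.environ if environ is None else environ
    # First try JAVA_HOME/bin/java (java.exe on Windows).
    java_home = env.get("JAVA_HOME")
    if java_home:
        exe = "java.exe" if os.name == "nt" else "java"
        candidate = os.path.join(java_home, "bin", exe)
        if os.path.isfile(candidate):
            return candidate
    # Then fall back to searching the PATH.
    found = shutil.which("java", path=env.get("PATH"))
    if found is None:
        raise FileNotFoundError("java was not found in JAVA_HOME or on PATH")
    return found
```

A caller that only needs a yes/no answer can wrap the call in a `try`/`except FileNotFoundError` block.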

pyenbc.filehelper.get_jython_jar ()

Assumes that a file jython-standalone-x.x.x.jar is present in this directory and returns that file.
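A minimal sketch of this kind of lookup, assuming the jar follows the jython-standalone-x.x.x.jar naming pattern mentioned above. The function `find_jython_jar` and its `folder` argument are hypothetical; the real function takes no argument and looks in its own directory.

```python
import glob
import os


def find_jython_jar(folder):
    """Returns a jython-standalone-*.jar found in folder.

    Hypothetical helper, not part of pyenbc.
    """
    pattern = os.path.join(folder, "jython-standalone-*.jar")
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"no jython-standalone jar in {folder!r}")
    # If several jars match, keep the last one in lexicographic order.
    return matches[-1]
```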

pyenbc.filehelper.pig_helper.get_pig_jars ()

Returns the list of jars to include into the command line in order to run :epkg:`PIG`.
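To show how such a list of jars is typically used on a command line, the sketch below collects the jar files under a folder and joins them into a Java classpath string. Both helpers (`list_jars`, `classpath`) are hypothetical names introduced for this example and are not taken from pyenbc.

```python
import os


def list_jars(folder):
    """Returns the sorted paths of all .jar files under folder (recursive)."""
    jars = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.endswith(".jar"):
                jars.append(os.path.join(root, name))
    return sorted(jars)


def classpath(jars):
    """Joins jar paths with the platform separator (':' or ';')."""
    return os.pathsep.join(jars)
```

The resulting string can then be passed to `java -cp`.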

pyenbc.filehelper.pig_helper.get_pig_path ()

Assumes that a folder pig pigjar is present in this directory and returns that folder.

pyenbc.filehelper.is_java_installed (fLOG = noLOG)

Checks if :epkg:`java` is installed.

pyenbc.filehelper.run_jython (pyfile, argv = None, jython_path = None, sin = None, timeout = None, fLOG = noLOG)

Runs a jython script and returns the standard output and error.
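The general pattern of running an external command with optional standard input and a timeout, then returning its output and error streams, can be sketched with the standard library as below. The helper `run_command` is hypothetical and is not the library's implementation; for a jython script, the command would plausibly look like `["java", "-jar", jython_jar, pyfile, *argv]`, which is an assumption about how the standalone jar is invoked.

```python
import subprocess
import sys


def run_command(cmd, sin=None, timeout=None):
    """Runs cmd and returns (stdout, stderr) as text.

    Hypothetical helper, not part of pyenbc.
    sin is sent to the process' standard input when given;
    subprocess.TimeoutExpired is raised if timeout (seconds) is exceeded.
    """
    completed = subprocess.run(
        cmd, input=sin, capture_output=True, text=True, timeout=timeout
    )
    return completed.stdout, completed.stderr


# Usage with the current Python interpreter standing in for java.
out, err = run_command([sys.executable, "-c", "print('ok')"], timeout=60)
```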

pyenbc.filehelper.pig_helper.run_pig (pigfile, argv = None, pig_path = None, hadoop_path = None, jython_path = None, timeout = None, logpath = 'logs', pig_version = '0.17.0', hadoop_version = '3.3.0', jar_no_hadoop = True, fLOG = noLOG)

Runs a :epkg:`pig` script and returns the standard output and error.