Inference

Options 

RunOptions 

class onnxruntime.RunOptions(self: onnxruntime.capi.onnxruntime_pybind11_state.RunOptions)

Configuration information for a single Run.

add_run_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.RunOptions, arg0: str, arg1: str) → None: Set a single run configuration entry as a pair of strings.

get_run_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.RunOptions, arg0: str) → str: Get a single run configuration value using the given configuration key.

property log_severity_level: Log severity level for a particular Run() invocation. 0:Verbose, 1:Info, 2:Warning, 3:Error, 4:Fatal. Default is 2.

property log_verbosity_level: VLOG level if DEBUG build and run_log_severity_level is 0. Applies to a particular Run() invocation. Default is 0.

property logid: To identify logs generated by a particular Run() invocation.

property only_execute_path_to_fetches: Only execute the nodes needed by fetch list

property synchronize_execution_providers: Synchronize execution providers after executing session.

property terminate: Set to True to terminate any currently executing calls that are using this RunOptions instance. The individual calls will exit gracefully and return an error status.

property training_mode: Choose to run in training or inferencing mode

SessionOptions 

class onnxruntime.SessionOptions(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions)

Configuration information for a session.

add_external_initializers(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: list, arg1: list) → None

add_free_dimension_override_by_denotation(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: int) → None: Specify the dimension size for each denotation associated with an input’s free dimension.

add_free_dimension_override_by_name(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: int) → None: Specify values of named dimensions within model inputs.

add_initializer(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: object) → None

add_session_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: str) → None: Set a single session configuration entry as a pair of strings.

property enable_cpu_mem_arena: Enables the memory arena on CPU. Arena may pre-allocate memory for future usage. Set this option to false if you don’t want it. Default is True.

property enable_mem_pattern: Enable the memory pattern optimization. Default is true.

property enable_mem_reuse: Enable the memory reuse optimization. Default is true.

property enable_profiling: Enable profiling for this session. Default is false.

property execution_mode: Sets the execution mode. Default is sequential.

property execution_order: Sets the execution order. Default is basic topological order.

get_session_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str) → str: Get a single session configuration value using the given configuration key.

property graph_optimization_level: Graph optimization level for this session.

property inter_op_num_threads: Sets the number of threads used to parallelize the execution of the graph (across nodes). Default is 0 to let onnxruntime choose.

property intra_op_num_threads: Sets the number of threads used to parallelize the execution within nodes. Default is 0 to let onnxruntime choose.

property log_severity_level: Log severity level. Applies to session load, initialization, etc. 0:Verbose, 1:Info, 2:Warning. 3:Error, 4:Fatal. Default is 2.

property log_verbosity_level: VLOG level if DEBUG build and session_log_severity_level is 0. Applies to session load, initialization, etc. Default is 0.

property logid: Logger id to use for session output.

property optimized_model_filepath: File path to serialize optimized model to. Optimized model is not serialized unless optimized_model_filepath is set. Serialized model format will default to ONNX unless add_session_config_entry is used to set ‘session.save_model_format’ to ‘ORT’, or there is no ‘session.save_model_format’ config entry and optimized_model_filepath ends in ‘.ort’ (case insensitive).

property profile_file_prefix: The prefix of the profile file. The current time will be appended to the file name.

register_custom_ops_library(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str) → None: Specify the path to the shared library containing the custom op kernels required to run a model.

property use_deterministic_compute: Whether to use deterministic compute. Default is false.

InferenceSession 

class onnxruntime.InferenceSession(path_or_bytes, sess_options=None, providers=None, provider_options=None, **kwargs)[source]

This is the main class used to run a model.

Parameters:

path_or_bytes – filename or serialized ONNX or ORT format model in a byte string
sess_options – session options
providers – Optional sequence of providers in order of decreasing precedence. Values can either be provider names or tuples of (provider name, options dict). If not provided, then all available providers are used with the default precedence.
provider_options – Optional sequence of options dicts corresponding to the providers listed in ‘providers’.

The model type will be inferred unless explicitly set in the SessionOptions. To explicitly set:

so = onnxruntime.SessionOptions()
# so.add_session_config_entry('session.load_model_format', 'ONNX') or
so.add_session_config_entry('session.load_model_format', 'ORT')

A file extension of ‘.ort’ will be inferred as an ORT format model. All other filenames are assumed to be ONNX format models.

‘providers’ can contain either names or names and options. When any options are given in ‘providers’, ‘provider_options’ should not be used.

The list of providers is ordered by precedence. For example [‘CUDAExecutionProvider’, ‘CPUExecutionProvider’] means execute a node using CUDAExecutionProvider if capable, otherwise execute using CPUExecutionProvider.

disable_fallback(): Disable session.run() fallback mechanism.

enable_fallback(): Enable session.Run() fallback mechanism. If session.Run() fails due to an internal Execution Provider failure, reset the Execution Providers enabled for this session. If GPU is enabled, fall back to CUDAExecutionProvider. otherwise fall back to CPUExecutionProvider.

end_profiling()

End profiling and return results in a file.

The results are stored in a file if the option onnxruntime.SessionOptions.enable_profiling is set.

get_inputs(): Return the inputs metadata as a list of onnxruntime.NodeArg.

get_modelmeta(): Return the metadata. See onnxruntime.ModelMetadata.

get_outputs(): Return the outputs metadata as a list of onnxruntime.NodeArg.

get_overridable_initializers(): Return the inputs (including initializers) metadata as a list of onnxruntime.NodeArg.

get_profiling_start_time_ns(): Return the profiling start time in nanoseconds. The value is comparable to time.monotonic_ns(), which is available from Python 3.7. On some platforms, this timer may not be precise to the nanosecond; for instance, on Windows and macOS the precision is about 100 ns.

get_provider_options(): Return registered execution providers’ configurations.

get_providers(): Return list of registered execution providers.

get_session_options(): Return the session options. See onnxruntime.SessionOptions.

io_binding(): Return an onnxruntime.IOBinding object`.

run(output_names, input_feed, run_options=None)

Compute the predictions.

Parameters:

output_names – name of the outputs
input_feed – dictionary { input_name: input_value }
run_options – See onnxruntime.RunOptions.

Returns:

List of results. Each result is either a numpy array, a sparse tensor, a list or a dictionary.

sess.run([output_name], {input_name: x})
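To make the call above concrete, here is a small sketch that looks up the input and output names from the session metadata instead of hard-coding them. The helper name `run_all_outputs` is hypothetical, and the sketch assumes a model with a single input.

```python
def run_all_outputs(sess, x, run_options=None):
    """Run a single-input model and return its outputs keyed by output name."""
    input_name = sess.get_inputs()[0].name               # assumes exactly one input
    output_names = [out.name for out in sess.get_outputs()]
    results = sess.run(output_names, {input_name: x}, run_options)
    # Results come back in the same order as output_names.
    return dict(zip(output_names, results))
```

Here `sess` is expected to be an onnxruntime.InferenceSession and `x` a value whose type and shape match the model input.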

run_with_iobinding(iobinding, run_options=None)

Compute the predictions.

Parameters:

iobinding – the iobinding object that has the graph inputs/outputs bound.
run_options – See onnxruntime.RunOptions.
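The sketch below outlines one possible way to use run_with_iobinding() with CPU data. The methods `bind_cpu_input`, `bind_output` and `copy_outputs_to_cpu` belong to the onnxruntime.IOBinding class, which is not documented in this section, so treat their use here as an assumption to check against that class's reference. The helper name `run_with_cpu_binding` is hypothetical.

```python
def run_with_cpu_binding(sess, x, run_options=None):
    """Bind one CPU input and all outputs, run, and copy the outputs back."""
    binding = sess.io_binding()
    binding.bind_cpu_input(sess.get_inputs()[0].name, x)  # assumes a single input
    for out in sess.get_outputs():
        # No device given: this sketch relies on bind_output's default placement.
        binding.bind_output(out.name)
    sess.run_with_iobinding(binding, run_options)
    return binding.copy_outputs_to_cpu()
```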

run_with_ort_values(output_names, input_dict_ort_values, run_options=None)

Compute the predictions.

Parameters:

output_names – name of the outputs
input_dict_ort_values – dictionary { input_name: input_ort_value }. See the OrtValue class for how to create an OrtValue from a numpy array or SparseTensor.
run_options – See onnxruntime.RunOptions.

Returns:

an array of OrtValue

sess.run_with_ort_values([output_name], {input_name: x})

run_with_ortvaluevector(run_options, feed_names, feeds, fetch_names, fetches, fetch_devices)

Compute the predictions similar to other run_*() methods but with minimal C++/Python conversion overhead.

Parameters:

run_options – See onnxruntime.RunOptions.
feed_names – list of input names.
feeds – list of input OrtValue.
fetch_names – list of output names.
fetches – list of output OrtValue.
fetch_devices – list of output devices.

set_providers(providers=None, provider_options=None)

Register the input list of execution providers. The underlying session is re-created.

Parameters:

providers – Optional sequence of providers in order of decreasing precedence. Values can either be provider names or tuples of (provider name, options dict). If not provided, then all available providers are used with the default precedence.
provider_options – Optional sequence of options dicts corresponding to the providers listed in ‘providers’.

‘providers’ can contain either names or names and options. When any options are given in ‘providers’, ‘provider_options’ should not be used.

The list of providers is ordered by precedence. For example [‘CUDAExecutionProvider’, ‘CPUExecutionProvider’] means execute a node using CUDAExecutionProvider if capable, otherwise execute using CPUExecutionProvider.
