Inference

Options

RunOptions

class onnxruntime.RunOptions(self: onnxruntime.capi.onnxruntime_pybind11_state.RunOptions)

Configuration information for a single Run.

add_run_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.RunOptions, arg0: str, arg1: str) None

Set a single run configuration entry as a pair of strings.

get_run_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.RunOptions, arg0: str) str

Get a single run configuration value using the given configuration key.

property log_severity_level

Log severity level for a particular Run() invocation. 0:Verbose, 1:Info, 2:Warning, 3:Error, 4:Fatal. Default is 2.

property log_verbosity_level

VLOG level if DEBUG build and run_log_severity_level is 0. Applies to a particular Run() invocation. Default is 0.

property logid

To identify logs generated by a particular Run() invocation.

property only_execute_path_to_fetches

Only execute the nodes needed to compute the fetch list.

property synchronize_execution_providers

Synchronize execution providers after executing session.

property terminate

Set to True to terminate any currently executing calls that are using this RunOptions instance. The individual calls will exit gracefully and return an error status.

property training_mode

Choose whether to run in training or inference mode.

SessionOptions

class onnxruntime.SessionOptions(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions)

Configuration information for a session.

add_external_initializers(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: list, arg1: list) None

add_free_dimension_override_by_denotation(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: int) None

Specify the dimension size for each denotation associated with an input’s free dimension.

add_free_dimension_override_by_name(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: int) None

Specify values of named dimensions within model inputs.

add_initializer(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: object) None

add_session_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: str) None

Set a single session configuration entry as a pair of strings.

property enable_cpu_mem_arena

Enables the memory arena on CPU. Arena may pre-allocate memory for future usage. Set this option to false if you don’t want it. Default is True.

property enable_mem_pattern

Enable the memory pattern optimization. Default is true.

property enable_mem_reuse

Enable the memory reuse optimization. Default is true.

property enable_profiling

Enable profiling for this session. Default is false.

property execution_mode

Sets the execution mode. Default is sequential.

property execution_order

Sets the execution order. Default is basic topological order.

get_session_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str) str

Get a single session configuration value using the given configuration key.

property graph_optimization_level

Graph optimization level for this session.

property inter_op_num_threads

Sets the number of threads used to parallelize the execution of the graph (across nodes). Default is 0 to let onnxruntime choose.

property intra_op_num_threads

Sets the number of threads used to parallelize the execution within nodes. Default is 0 to let onnxruntime choose.

property log_severity_level

Log severity level. Applies to session load, initialization, etc. 0:Verbose, 1:Info, 2:Warning, 3:Error, 4:Fatal. Default is 2.

property log_verbosity_level

VLOG level if DEBUG build and session_log_severity_level is 0. Applies to session load, initialization, etc. Default is 0.

property logid

Logger id to use for session output.

property optimized_model_filepath

File path to serialize optimized model to. Optimized model is not serialized unless optimized_model_filepath is set. Serialized model format will default to ONNX unless:
  • add_session_config_entry is used to set ‘session.save_model_format’ to ‘ORT’, or

  • there is no ‘session.save_model_format’ config entry and optimized_model_filepath ends in ‘.ort’ (case insensitive).

property profile_file_prefix

The prefix of the profile file. The current time will be appended to the file name.

register_custom_ops_library(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str) None

Specify the path to the shared library containing the custom op kernels required to run a model.

property use_deterministic_compute

Whether to use deterministic compute. Default is false.

InferenceSession

class onnxruntime.InferenceSession(path_or_bytes, sess_options=None, providers=None, provider_options=None, **kwargs)

This is the main class used to run a model.

Parameters:
  • path_or_bytes – filename or serialized ONNX or ORT format model in a byte string

  • sess_options – session options

  • providers – Optional sequence of providers in order of decreasing precedence. Values can either be provider names or tuples of (provider name, options dict). If not provided, then all available providers are used with the default precedence.

  • provider_options – Optional sequence of options dicts corresponding to the providers listed in ‘providers’.

The model type will be inferred unless explicitly set in the SessionOptions. To explicitly set:

import onnxruntime

so = onnxruntime.SessionOptions()
# Pass 'ONNX' instead of 'ORT' to force the ONNX format.
so.add_session_config_entry('session.load_model_format', 'ORT')

A file extension of ‘.ort’ will be inferred as an ORT format model. All other filenames are assumed to be ONNX format models.

‘providers’ can contain either names or names and options. When any options are given in ‘providers’, ‘provider_options’ should not be used.

The list of providers is ordered by precedence. For example [‘CUDAExecutionProvider’, ‘CPUExecutionProvider’] means execute a node using CUDAExecutionProvider if capable, otherwise execute using CPUExecutionProvider.

disable_fallback()

Disable session.run() fallback mechanism.

enable_fallback()

Enable session.Run() fallback mechanism. If session.Run() fails due to an internal Execution Provider failure, reset the Execution Providers enabled for this session. If GPU is enabled, fall back to CUDAExecutionProvider; otherwise fall back to CPUExecutionProvider.

end_profiling()

End profiling and return the name of the file containing the results.

The results are written to a file only if the option onnxruntime.SessionOptions.enable_profiling is set.

get_inputs()

Return the inputs metadata as a list of onnxruntime.NodeArg.

get_modelmeta()

Return the metadata. See onnxruntime.ModelMetadata.

get_outputs()

Return the outputs metadata as a list of onnxruntime.NodeArg.

get_overridable_initializers()

Return the inputs (including initializers) metadata as a list of onnxruntime.NodeArg.

get_profiling_start_time_ns()

Return the profiling start time in nanoseconds. The value is comparable to time.monotonic_ns() (available since Python 3.7). On some platforms, this timer may not have nanosecond precision; for instance, on Windows and macOS the precision is about 100 ns.

get_provider_options()

Return registered execution providers’ configurations.

get_providers()

Return list of registered execution providers.

get_session_options()

Return the session options. See onnxruntime.SessionOptions.

io_binding()

Return an onnxruntime.IOBinding object.

run(output_names, input_feed, run_options=None)

Compute the predictions.

Parameters:
  • output_names – name of the outputs

  • input_feed – dictionary { input_name: input_value }

  • run_options – See onnxruntime.RunOptions.

Returns:

list of results, every result is either a numpy array, a sparse tensor, a list or a dictionary.

sess.run([output_name], {input_name: x})

run_with_iobinding(iobinding, run_options=None)

Compute the predictions.

Parameters:
  • iobinding – the iobinding object that has graph inputs/outputs bound.

  • run_options – See onnxruntime.RunOptions.

run_with_ort_values(output_names, input_dict_ort_values, run_options=None)

Compute the predictions.

Parameters:
  • output_names – name of the outputs

  • input_dict_ort_values – dictionary { input_name: input_ort_value }. See the OrtValue class for how to create an OrtValue from a numpy array or SparseTensor.

  • run_options – See onnxruntime.RunOptions.

Returns:

an array of OrtValue

sess.run_with_ort_values([output_name], {input_name: x})

run_with_ortvaluevector(run_options, feed_names, feeds, fetch_names, fetches, fetch_devices)

Compute the predictions similar to other run_*() methods but with minimal C++/Python conversion overhead.

Parameters:
  • run_options – See onnxruntime.RunOptions.

  • feed_names – list of input names.

  • feeds – list of input OrtValue.

  • fetch_names – list of output names.

  • fetches – list of output OrtValue.

  • fetch_devices – list of output devices.

set_providers(providers=None, provider_options=None)

Register the input list of execution providers. The underlying session is re-created.

Parameters:
  • providers – Optional sequence of providers in order of decreasing precedence. Values can either be provider names or tuples of (provider name, options dict). If not provided, then all available providers are used with the default precedence.

  • provider_options – Optional sequence of options dicts corresponding to the providers listed in ‘providers’.

‘providers’ can contain either names or names and options. When any options are given in ‘providers’, ‘provider_options’ should not be used.

The list of providers is ordered by precedence. For example [‘CUDAExecutionProvider’, ‘CPUExecutionProvider’] means execute a node using CUDAExecutionProvider if capable, otherwise execute using CPUExecutionProvider.