Inference
Options
RunOptions
- class onnxruntime.RunOptions(self: onnxruntime.capi.onnxruntime_pybind11_state.RunOptions)
Configuration information for a single Run.
- add_run_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.RunOptions, arg0: str, arg1: str) None
Set a single run configuration entry as a pair of strings.
- get_run_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.RunOptions, arg0: str) str
Get a single run configuration value using the given configuration key.
- property log_severity_level
Log severity level for a particular Run() invocation. 0:Verbose, 1:Info, 2:Warning, 3:Error, 4:Fatal. Default is 2.
- property log_verbosity_level
VLOG level if DEBUG build and run_log_severity_level is 0. Applies to a particular Run() invocation. Default is 0.
- property logid
To identify logs generated by a particular Run() invocation.
- property only_execute_path_to_fetches
Only execute the nodes needed by the fetch list.
- property synchronize_execution_providers
Synchronize execution providers after executing the session.
- property terminate
Set to True to terminate any currently executing calls that are using this RunOptions instance. The individual calls will exit gracefully and return an error status.
- property training_mode
Choose whether to run in training or inferencing mode.
SessionOptions
- class onnxruntime.SessionOptions(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions)
Configuration information for a session.
- add_external_initializers(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: list, arg1: list) None
- add_free_dimension_override_by_denotation(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: int) None
Specify the dimension size for each denotation associated with an input’s free dimension.
- add_free_dimension_override_by_name(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: int) None
Specify values of named dimensions within model inputs.
- add_initializer(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: object) None
- add_session_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str, arg1: str) None
Set a single session configuration entry as a pair of strings.
- property enable_cpu_mem_arena
Enables the memory arena on CPU. The arena may pre-allocate memory for future usage. Set this option to False to disable it. Default is True.
- property enable_mem_pattern
Enable the memory pattern optimization. Default is True.
- property enable_mem_reuse
Enable the memory reuse optimization. Default is True.
- property enable_profiling
Enable profiling for this session. Default is False.
- property execution_mode
Sets the execution mode. Default is sequential.
- property execution_order
Sets the execution order. Default is basic topological order.
- get_session_config_entry(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str) str
Get a single session configuration value using the given configuration key.
- property graph_optimization_level
Graph optimization level for this session.
- property inter_op_num_threads
Sets the number of threads used to parallelize the execution of the graph (across nodes). Default is 0 to let onnxruntime choose.
- property intra_op_num_threads
Sets the number of threads used to parallelize the execution within nodes. Default is 0 to let onnxruntime choose.
- property log_severity_level
Log severity level. Applies to session load, initialization, etc. 0:Verbose, 1:Info, 2:Warning, 3:Error, 4:Fatal. Default is 2.
- property log_verbosity_level
VLOG level if DEBUG build and session_log_severity_level is 0. Applies to session load, initialization, etc. Default is 0.
- property logid
Logger id to use for session output.
- property optimized_model_filepath
File path to serialize the optimized model to. The optimized model is not serialized unless optimized_model_filepath is set. The serialized model format defaults to ONNX unless:
- add_session_config_entry is used to set ‘session.save_model_format’ to ‘ORT’, or
- there is no ‘session.save_model_format’ config entry and optimized_model_filepath ends in ‘.ort’ (case insensitive)
- property profile_file_prefix
The prefix of the profile file. The current time will be appended to the file name.
- register_custom_ops_library(self: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg0: str) None
Specify the path to the shared library containing the custom op kernels required to run a model.
- property use_deterministic_compute
Whether to use deterministic compute. Default is False.
InferenceSession
- class onnxruntime.InferenceSession(path_or_bytes, sess_options=None, providers=None, provider_options=None, **kwargs)[source]
This is the main class used to run a model.
- Parameters:
path_or_bytes – path to the model file, or a serialized ONNX or ORT format model as a byte string
sess_options – session options
providers – Optional sequence of providers in order of decreasing precedence. Values can either be provider names or tuples of (provider name, options dict). If not provided, then all available providers are used with the default precedence.
provider_options – Optional sequence of options dicts corresponding to the providers listed in ‘providers’.
The model type will be inferred unless explicitly set in the SessionOptions. To explicitly set:
so = onnxruntime.SessionOptions()
# so.add_session_config_entry('session.load_model_format', 'ONNX') or
# so.add_session_config_entry('session.load_model_format', 'ORT')
A file extension of ‘.ort’ will be inferred as an ORT format model. All other filenames are assumed to be ONNX format models.
‘providers’ can contain either names or names and options. When any options are given in ‘providers’, ‘provider_options’ should not be used.
The list of providers is ordered by precedence. For example [‘CUDAExecutionProvider’, ‘CPUExecutionProvider’] means execute a node using CUDAExecutionProvider if capable, otherwise execute using CPUExecutionProvider.
- disable_fallback()
Disable session.run() fallback mechanism.
- enable_fallback()
Enable session.run() fallback mechanism. If session.run() fails due to an internal Execution Provider failure, reset the Execution Providers enabled for this session. If GPU is enabled, fall back to CUDAExecutionProvider; otherwise fall back to CPUExecutionProvider.
- end_profiling()
End profiling and return results in a file.
The results are stored in a file if the option onnxruntime.SessionOptions.enable_profiling is set.
- get_inputs()
Return the inputs metadata as a list of onnxruntime.NodeArg.
- get_modelmeta()
Return the metadata. See onnxruntime.ModelMetadata.
- get_outputs()
Return the outputs metadata as a list of onnxruntime.NodeArg.
- get_overridable_initializers()
Return the inputs (including initializers) metadata as a list of onnxruntime.NodeArg.
- get_profiling_start_time_ns()
Return the nanoseconds of profiling’s start time. Comparable to time.monotonic_ns() after Python 3.3. On some platforms, this timer may not be as precise as nanoseconds; for instance, on Windows and macOS, the precision will be ~100ns.
- get_provider_options()
Return registered execution providers’ configurations.
- get_providers()
Return list of registered execution providers.
- get_session_options()
Return the session options. See onnxruntime.SessionOptions.
- io_binding()
Return an onnxruntime.IOBinding object.
- run(output_names, input_feed, run_options=None)
Compute the predictions.
- Parameters:
output_names – name of the outputs
input_feed – dictionary { input_name: input_value }
run_options – See onnxruntime.RunOptions.
- Returns:
list of results, every result is either a numpy array, a sparse tensor, a list or a dictionary.
sess.run([output_name], {input_name: x})
- run_with_iobinding(iobinding, run_options=None)
Compute the predictions.
- Parameters:
iobinding – the iobinding object that has graph inputs/outputs bound.
run_options – See onnxruntime.RunOptions.
- run_with_ort_values(output_names, input_dict_ort_values, run_options=None)
Compute the predictions.
- Parameters:
output_names – name of the outputs
input_dict_ort_values – dictionary { input_name: input_ort_value }. See the OrtValue class for how to create an OrtValue from a numpy array or a SparseTensor.
run_options – See onnxruntime.RunOptions.
- Returns:
an array of OrtValue
sess.run_with_ort_values([output_name], {input_name: x})
- run_with_ortvaluevector(run_options, feed_names, feeds, fetch_names, fetches, fetch_devices)
Compute the predictions similar to other run_*() methods but with minimal C++/Python conversion overhead.
- Parameters:
run_options – See onnxruntime.RunOptions.
feed_names – list of input names.
feeds – list of input OrtValue.
fetch_names – list of output names.
fetches – list of output OrtValue.
fetch_devices – list of output devices.
- set_providers(providers=None, provider_options=None)
Register the input list of execution providers. The underlying session is re-created.
- Parameters:
providers – Optional sequence of providers in order of decreasing precedence. Values can either be provider names or tuples of (provider name, options dict). If not provided, then all available providers are used with the default precedence.
provider_options – Optional sequence of options dicts corresponding to the providers listed in ‘providers’.
‘providers’ can contain either names or names and options. When any options are given in ‘providers’, ‘provider_options’ should not be used.
The list of providers is ordered by precedence. For example [‘CUDAExecutionProvider’, ‘CPUExecutionProvider’] means execute a node using CUDAExecutionProvider if capable, otherwise execute using CPUExecutionProvider.