
.. _l-ortvalue-doc:

========
OrtValue
========

:epkg:`numpy` has its :class:`numpy.ndarray`,
:epkg:`pytorch` has its :class:`torch.Tensor`.
:epkg:`onnxruntime` has its `OrtValue`. Unlike the other two
frameworks, `OrtValue` does not support simple operations such
as addition, subtraction, multiplication or division.
It can only be consumed by :epkg:`onnxruntime` or converted
into another object such as :class:`numpy.ndarray`.
An `OrtValue` can hold more than a dense tensor: it can also
be a sparse tensor, a sequence of tensors or a map of tensors.
Like :class:`torch.Tensor`, the data can be located on CPU, CUDA, ...

.. contents::
    :local:

.. note::
    :epkg:`onnxruntime` implements a C class named `OrtValue`,
    referred to as :epkg:`C_OrtValue`, and a python wrapper for it
    also named :epkg:`OrtValue`.
    This documentation uses :epkg:`C_OrtValue` directly.
    The wrapper usually calls the same C functions.
    The same goes for :epkg:`OrtDevice` and :epkg:`C_OrtDevice`.
    They can be imported like this:

    ::

        from onnxruntime.capi._pybind_state import (
            OrtValue as C_OrtValue,
            OrtDevice as C_OrtDevice)

.. _l-doc-device:

Device
======

A device is associated with a tensor. It indicates where the data
is stored. It is defined by:

* a device type: CPU, CUDA, FPGA
* a device index: if there are many devices of the same type,
  it tells which one is used.
* an allocator: it is possible to change the way
  memory is allocated.

The next example shows how to create a CPU device.

.. runpython::
    :showcode:

    from onnxruntime.capi._pybind_state import (
        OrtDevice as C_OrtDevice)

    ort_device = C_OrtDevice(
        C_OrtDevice.cpu(), C_OrtDevice.default_memory(), 0)
    print(ort_device)
    print(ort_device.device_type(), C_OrtDevice.cpu())

And the next one shows how to create a CUDA device.

.. runpython::
    :showcode:

    from onnxruntime.capi._pybind_state import (
        OrtDevice as C_OrtDevice)

    ort_device = C_OrtDevice(
        C_OrtDevice.cuda(), C_OrtDevice.default_memory(), 0)
    print(ort_device)
    print(ort_device.device_type(), C_OrtDevice.cuda())

The class has three methods:

* *device_type()*: returns the device type
* *device_id()*: returns the device index
* *device_mem_type()*: *not available yet*

Memory Allocator
================

.. index:: incomplete

**to be continued later**

OrtValue
========

This class is a generic type. It hides any type supported by
:epkg:`onnxruntime`: a tensor, a sparse tensor, a sequence of tensors
or a map of tensors. From Python's point of view, it is only a container.
It is only possible to export it, convert it or get information about it.
The only way to manipulate an *OrtValue* is to go through an ONNX graph
loaded by an :epkg:`InferenceSession`.

The following sections refer to the C implementation, :epkg:`C_OrtValue`.

Creation from numpy
+++++++++++++++++++

The easiest way is to create a :epkg:`C_OrtValue` from a
:class:`numpy.ndarray`. The next example does that on CPU.
However, even that simple example hides some important details.

.. runpython::
    :showcode:

    import numpy
    from onnxruntime.capi._pybind_state import (  # pylint: disable=E0611
        OrtValue as C_OrtValue,
        OrtDevice as C_OrtDevice,
        OrtMemType)
    from onnxcustom.utils.print_helper import str_ortvalue

    vect = numpy.array([100, 100], dtype=numpy.float32)

    device = C_OrtDevice(C_OrtDevice.cpu(), OrtMemType.DEFAULT, 0)
    ort_value = C_OrtValue.ortvalue_from_numpy(vect, device)
    print(ort_value)
    print(str_ortvalue(ort_value))

    # Data pointers?
    print(ort_value.data_ptr())
    print(vect.__array_interface__['data'])

The last two lines show that both objects point to the same location.
To avoid copying the data, :epkg:`onnxruntime` only creates a structure
wrapping the same memory buffer. As a result, the numpy array
must **remain alive** as long as the instance of `C_OrtValue` is.
If it does not, the program usually crashes with a segmentation
fault and no exception.

Creation from a new buffer
++++++++++++++++++++++++++

Method `ortvalue_from_shape_and_type` can create a new
:epkg:`C_OrtValue` owning its buffer.

.. runpython::
    :showcode:

    import numpy
    from onnxruntime.capi._pybind_state import (  # pylint: disable=E0611
        OrtValue as C_OrtValue,
        OrtDevice as C_OrtDevice,
        OrtMemType)
    from onnxcustom.utils.print_helper import str_ortvalue

    device = C_OrtDevice(C_OrtDevice.cpu(), OrtMemType.DEFAULT, 0)
    ort_value = C_OrtValue.ortvalue_from_shape_and_type(
        [100, 100], numpy.float32, device)
    print(ort_value)
    print(str_ortvalue(ort_value))

    # Address can be given to another C function to populate the buffer.
    print(ort_value.data_ptr())

Export to numpy
+++++++++++++++

Unless it is reused by another library or :epkg:`onnxruntime` itself,
the only way to access the data it contains is to create a numpy array
with method `numpy`.

.. runpython::
    :showcode:

    import numpy
    from onnxruntime.capi._pybind_state import (  # pylint: disable=E0611
        OrtValue as C_OrtValue,
        OrtDevice as C_OrtDevice,
        OrtMemType)
    from onnxcustom.utils.print_helper import str_ortvalue

    vect = numpy.array([100, 100], dtype=numpy.float32)

    device = C_OrtDevice(C_OrtDevice.cpu(), OrtMemType.DEFAULT, 0)
    ort_value = C_OrtValue.ortvalue_from_numpy(vect, device)
    print(ort_value)
    print(str_ortvalue(ort_value))

    # Data pointers?
    print(ort_value.data_ptr())
    print(vect.__array_interface__['data'])

    # to numpy
    vect2 = ort_value.numpy()
    print(vect2.__array_interface__['data'])

Method `numpy` makes a copy. The next section gives more details
about avoiding that copy.

DLPack
======

:epkg:`DLPack` is a protocol designed to avoid copying memory
when data is created by one framework and used by another one.
The safest way for the receiving framework is to copy the data
entirely into its own containers. But that is costly if the data
is big, and may even be difficult if the data is big compared
to the memory size.
The DLPack structure describes a tensor, or a multidimensional vector
with a specific element type and a specific shape. It also keeps
the location or device where the data is (CPU, CUDA, ...).
When a library B receives a DLPack structure from a library A, it:

* creates its own structure to store any information it needs
* deletes the structure it received by calling a destructor
  stored in the structure itself.

Library B takes ownership of the data and is now responsible
for its deletion unless a library C requests its ownership
through a DLPack structure as well.

:epkg:`pytorch` implements this through two functions
`to_dlpack` and `from_dlpack` (see `torch.utils.dlpack `_).
:epkg:`numpy` implements it as well. The changes were merged in
`PR 19083 `_.
:epkg:`onnxruntime-training` implements a couple of scenarios based
on :epkg:`pytorch` and needs this protocol to avoid unnecessary
data transfer.

Conversion
++++++++++

Method `to_dlpack` exports a :epkg:`C_OrtValue` into a DLPack structure.
Static method `from_dlpack` creates a :epkg:`C_OrtValue` from a DLPack
structure. Every time one of these methods is used, the previous container
loses ownership to the next one. Only the new container must be used
from then on. It becomes responsible for the data deletion.

.. runpython::
    :showcode:

    import numpy
    from onnxruntime.capi._pybind_state import (  # pylint: disable=E0611
        OrtValue as C_OrtValue,
        OrtDevice as C_OrtDevice,
        OrtMemType)
    from onnxcustom.utils.print_helper import str_ortvalue

    vect = numpy.array([100, 100], dtype=numpy.float32)

    device = C_OrtDevice(C_OrtDevice.cpu(), OrtMemType.DEFAULT, 0)
    ort_value = C_OrtValue.ortvalue_from_numpy(vect, device)
    print("ptr", ort_value.data_ptr())

    # export
    dlp = ort_value.to_dlpack()
    print(dlp)

    # import back into onnxruntime
    ort_value_back = C_OrtValue.from_dlpack(dlp, False)
    # dlp structure is no longer valid
    print("ptr", ort_value_back.data_ptr())
    print(str_ortvalue(ort_value_back))

.. index:: incomplete

**to be continued later**

See `PR 9610 `_.

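
The zero-copy exchange described in this section can also be observed
without :epkg:`onnxruntime`. The following sketch is only an illustration:
it relies on `numpy.from_dlpack`, assumed to be available (numpy 1.22
or later), it does not go through :epkg:`C_OrtValue`, and the variable
names are chosen for this example.

.. code-block:: python

    import numpy

    vect = numpy.array([100, 100], dtype=numpy.float32)

    # numpy.from_dlpack consumes the DLPack structure exported by `vect`
    # and wraps the same memory buffer instead of copying it.
    shared = numpy.from_dlpack(vect)

    # Both arrays point to the same location.
    same_buffer = numpy.shares_memory(vect, shared)
    print(same_buffer)

    # A change made through the original array is visible
    # through the array created from the DLPack structure.
    vect[0] = -1
    print(shared)

Modifying the source array rather than the imported one keeps the sketch
independent of whether the imported array is writable.
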

OrtValueVector
++++++++++++++

This container is equivalent to a list of :epkg:`C_OrtValue`.
It optimizes the conversion to DLPack structures
(see `PR 9610 `_).

.. index:: incomplete

**to be continued later**

Boolean ambiguity
+++++++++++++++++

A boolean tensor is usually represented as a vector of unsigned bytes.
The DLPack structure does not actually store this information, so there
is no way to distinguish a boolean tensor from a tensor of unsigned bytes.
That's why method `from_dlpack` has an additional parameter.
You can read more about this in `issue 75 `_.

Sparse Tensors
==============

Sparse tensors only represent 2D matrices. In standard machine learning,
they are a much more efficient way to represent categories
or text features. This structure is usually created by an operator
such as :epkg:`OneHotEncoder` or :epkg:`TfIdfVectorizer`.

CSR
+++

The following example shows how to create a sparse tensor
(C version, :epkg:`C_SparseTensor`) from a :epkg:`CSR` matrix
and how to convert it back to this format.

.. runpython::
    :showcode:

    import numpy
    from scipy.sparse import csr_matrix
    from onnxruntime.capi._pybind_state import (
        SparseTensor as C_SparseTensor,
        OrtDevice as C_OrtDevice)

    ort_device = C_OrtDevice(
        C_OrtDevice.cpu(), C_OrtDevice.default_memory(), 0)

    dense = (numpy.random.randn(100, 10) >= 2).astype(numpy.float32)
    print("sparse ratio:", dense.sum() * 1.0 / dense.size)

    csr = csr_matrix(dense)
    print("csr_matrix:")
    print(csr)

    ort_sparse = C_SparseTensor.sparse_csr_from_numpy(
        csr.shape, csr.data, csr.indices, csr.indptr, ort_device)

    print("ort_sparse.values() ->", ort_sparse.values())

    # Back to csr_matrix.
    ort_csr = ort_sparse.get_csrc_data()
    csr2 = csr_matrix(
        (ort_sparse.values(), ort_csr.inner(), ort_csr.outer()),
        shape=ort_sparse.dense_shape())
    print("retrieved:")
    print(csr2)

COO
+++

The previous example is changed to do the same with format :epkg:`COO`.

.. runpython::
    :showcode:

    import numpy
    from scipy.sparse import coo_matrix
    from onnxruntime.capi._pybind_state import (
        SparseTensor as C_SparseTensor,
        OrtDevice as C_OrtDevice)

    ort_device = C_OrtDevice(
        C_OrtDevice.cpu(), C_OrtDevice.default_memory(), 0)

    dense = (numpy.random.randn(100, 10) >= 2).astype(numpy.float32)
    print("sparse ratio:", dense.sum() * 1.0 / dense.size)

    coo = coo_matrix(dense)
    print("coo_matrix:")
    print(coo)

    ort_sparse = C_SparseTensor.sparse_coo_from_numpy(
        coo.shape, coo.data,
        numpy.hstack([coo.row.reshape((-1, 1)),
                      coo.col.reshape((-1, 1))]),
        ort_device)

    print("ort_sparse.values() ->", ort_sparse.values())

    # Back to coo_matrix.
    ort_coo = ort_sparse.get_coo_data()
    indices = ort_coo.indices()
    coo2 = coo_matrix(
        (ort_sparse.values(), (indices[:, 0], indices[:, 1])),
        shape=ort_sparse.dense_shape())
    print("retrieved:")
    print(coo2)
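
The index layout used in the COO example can be checked with numpy alone.
The following sketch is only an illustration and does not call
:epkg:`onnxruntime`: it builds the values and a two-column index array
the same way the example stacks ``coo.row`` and ``coo.col``, then rebuilds
the dense matrix from them. The small fixed matrix and the variable names
are assumptions made for this illustration.

.. code-block:: python

    import numpy

    dense = numpy.array([[0, 1, 0],
                         [2, 0, 0],
                         [0, 0, 3]], dtype=numpy.float32)

    # Row and column positions of the non-zero values.
    row, col = numpy.nonzero(dense)
    values = dense[row, col]

    # One line per non-zero value: [row index, column index].
    indices = numpy.hstack([row.reshape((-1, 1)),
                            col.reshape((-1, 1))])
    print(indices.shape)

    # Back to a dense matrix.
    rebuilt = numpy.zeros(dense.shape, dtype=dense.dtype)
    rebuilt[indices[:, 0], indices[:, 1]] = values
    print(numpy.array_equal(dense, rebuilt))

Splitting the index array into its two columns is the same step the example
uses to go back to a ``coo_matrix``.
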