DequantizeLinear

DequantizeLinear - 13

Version

  • name: DequantizeLinear (GitHub)

  • domain: main

  • since_version: 13

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 13.

Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is y = (x - x_zero_point) * x_scale. ‘x_scale’ and ‘x_zero_point’ must have the same shape: either a scalar for per-tensor (per-layer) quantization, or a 1-D tensor for per-axis quantization. ‘x_zero_point’ and ‘x’ must have the same type. ‘x’ and ‘y’ must have the same shape. When dequantizing int32, there is no zero point (the zero point is expected to be 0).
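
The formula above can be sketched in NumPy as follows. This is a minimal illustration, not the ONNX reference implementation: the helper name `dequantize_linear` is hypothetical, and the choice to subtract in float32 (so that 8-bit inputs cannot wrap around) is an assumption of this sketch.

```python
import numpy as np

def dequantize_linear(x, x_scale, x_zero_point=None, axis=1):
    """Hypothetical helper mirroring y = (x - x_zero_point) * x_scale."""
    x = np.asarray(x)
    x_scale = np.asarray(x_scale, dtype=np.float32)
    if x_zero_point is None:
        # An omitted zero point is treated as 0.
        x_zero_point = np.zeros_like(x_scale, dtype=x.dtype)
    x_zero_point = np.asarray(x_zero_point)
    if x_scale.shape != x_zero_point.shape:
        raise ValueError("x_scale and x_zero_point must have the same shape")
    if x_scale.ndim == 1:
        # Per-axis case: reshape the 1-D parameters so they broadcast along `axis`.
        shape = [1] * x.ndim
        shape[axis] = x_scale.shape[0]
        x_scale = x_scale.reshape(shape)
        x_zero_point = x_zero_point.reshape(shape)
    # Subtract in float32 (assumption of this sketch) to avoid integer wraparound.
    return (x.astype(np.float32) - x_zero_point.astype(np.float32)) * x_scale

# Per-tensor case: scalar scale and zero point.
y = dequantize_linear(
    np.array([0, 3, 128, 255], dtype=np.uint8), np.float32(2), np.uint8(128)
)
print(y.tolist())  # [-256.0, -250.0, 0.0, 254.0]
```

Passing 1-D `x_scale` and `x_zero_point` arrays whose length equals the size of dimension `axis` of `x` exercises the per-axis branch.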

Attributes

  • axis: (Optional) The axis of the dequantizing dimension of the input tensor. Ignored for per-tensor quantization. A negative value counts dimensions from the back. The accepted range is [-r, r-1], where r = rank(input). Default value is 1.
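
As an illustration of the accepted range and of negative values, the sketch below normalizes an axis and derives the shape to which a 1-D scale or zero point would be reshaped for broadcasting. Both function names are hypothetical and exist only for this example.

```python
def normalize_axis(axis, rank):
    """Map an axis in [-rank, rank - 1] to its non-negative equivalent."""
    if not -rank <= axis <= rank - 1:
        raise ValueError(f"axis {axis} is outside [-{rank}, {rank - 1}]")
    return axis + rank if axis < 0 else axis

def per_axis_broadcast_shape(rank, axis, size):
    """Shape that lets a 1-D parameter of length `size` broadcast along `axis`."""
    shape = [1] * rank
    shape[normalize_axis(axis, rank)] = size
    return tuple(shape)

print(normalize_axis(-3, 4))              # 1
print(per_axis_broadcast_shape(4, 1, 3))  # (1, 3, 1, 1)
```

For a rank-4 input, axis=-3 and axis=1 therefore select the same dimension.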

Inputs

Between 2 and 3 inputs.

  • x (heterogeneous) - T: N-D quantized input tensor to be de-quantized.

  • x_scale (heterogeneous) - tensor(float): Scale for input ‘x’. It can be a scalar, which means per-tensor/per-layer dequantization, or a 1-D tensor for per-axis dequantization.

  • x_zero_point (optional, heterogeneous) - T: Zero point for input ‘x’. Its shape must match x_scale. Optional; the zero point is 0 when not specified.
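
When only the two required inputs are given, the zero point is 0 and the formula reduces to y = x * x_scale. A small NumPy sketch with int32 data (the values are made up for illustration):

```python
import numpy as np

# int32 input, e.g. accumulator values; no zero point is supplied.
x = np.array([-1000, 0, 250, 4096], dtype=np.int32)
x_scale = np.float32(0.5)

# With the zero point omitted (treated as 0), dequantization is a plain scaling.
y = x.astype(np.float32) * x_scale
print(y.tolist())  # [-500.0, 0.0, 125.0, 2048.0]
```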

Outputs

  • y (heterogeneous) - tensor(float): N-D full-precision output tensor. It has the same shape as input ‘x’.

Type Constraints

  • T in ( tensor(int32), tensor(int8), tensor(uint8) ): Constrain ‘x_zero_point’ and ‘x’ to 8-bit or 32-bit integer tensors.
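
The constraints above could be checked on NumPy inputs before building a model, roughly as sketched here. The helper `check_types` is hypothetical and is not part of the onnx package; it only mirrors the rules listed in this section.

```python
import numpy as np

# Element types allowed for T in this operator version.
ALLOWED_T = (np.dtype(np.int32), np.dtype(np.int8), np.dtype(np.uint8))

def check_types(x, x_scale, x_zero_point=None):
    """Raise TypeError if the inputs violate the type constraints listed above."""
    if x.dtype not in ALLOWED_T:
        raise TypeError(f"x has unsupported type {x.dtype}")
    if x_scale.dtype != np.float32:
        raise TypeError("x_scale must be float32")
    if x_zero_point is not None and x_zero_point.dtype != x.dtype:
        raise TypeError("x_zero_point must have the same type as x")

# Passes silently: uint8 data, float32 scale, uint8 zero point.
check_types(np.array([0, 3], dtype=np.uint8), np.float32(2), np.uint8(128))
```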

Examples

default

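import numpy as np
import onnx
from onnx.backend.test.case.node import expect  # test helper from the ONNX repository

node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "x_zero_point"],
    outputs=["y"],
)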

# Scalar zero point and scale (per-tensor dequantization).
x = np.array([0, 3, 128, 255]).astype(np.uint8)
x_scale = np.float32(2)
x_zero_point = np.uint8(128)
# Expected output: (x - x_zero_point) * x_scale.
y = np.array([-256, -250, 0, 254], dtype=np.float32)

expect(
    node,
    inputs=[x, x_scale, x_zero_point],
    outputs=[y],
    name="test_dequantizelinear",
)

_axis

# No axis attribute is set, so the default axis=1 applies.
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "x_zero_point"],
    outputs=["y"],
)

# 1-D tensor zero point and scale of size equal to axis 1 of the input tensor
x = np.array(
    [
        [
            [[3, 89], [34, 200], [74, 59]],
            [[5, 24], [24, 87], [32, 13]],
            [[245, 99], [4, 142], [121, 102]],
        ],
    ],
    dtype=np.uint8,
)
x_scale = np.array([2, 4, 5], dtype=np.float32)
x_zero_point = np.array([84, 24, 196], dtype=np.uint8)
y = (
    x.astype(np.float32) - x_zero_point.reshape(1, 3, 1, 1).astype(np.float32)
) * x_scale.reshape(1, 3, 1, 1)

expect(
    node,
    inputs=[x, x_scale, x_zero_point],
    outputs=[y],
    name="test_dequantizelinear_axis",
)

Differences

Changes from version 10 to version 13:

  • Summary: ‘x_scale’ and ‘x_zero_point’ are no longer required to be scalars. They must have the same shape and can be either a scalar for per-tensor / per-layer quantization, or a 1-D tensor for per-axis quantization.

  • Attributes: the optional axis attribute is added (default value 1).

  • x_scale: it can be a scalar or a 1-D tensor for per-axis dequantization. In version 10 it is a scalar.

  • x_zero_point: its shape must match x_scale. In version 10 it is a scalar.

  • The number of inputs, the Outputs, and the Type Constraints are unchanged.

DequantizeLinear - 10

Version

  • name: DequantizeLinear (GitHub)

  • domain: main

  • since_version: 10

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 10.

Summary

The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full-precision tensor. The dequantization formula is y = (x - x_zero_point) * x_scale. ‘x_scale’ and ‘x_zero_point’ are both scalars. ‘x_zero_point’ and ‘x’ must have the same type. ‘x’ and ‘y’ must have the same shape. When dequantizing int32, there is no zero point (the zero point is expected to be 0).
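
Because this version accepts only scalar parameters, a NumPy sketch of its behavior needs no axis handling. The helper name `dequantize_linear_v10` is hypothetical, and the explicit scalar check is an assumption about how one might enforce the rule, not a description of any particular runtime.

```python
import numpy as np

def dequantize_linear_v10(x, x_scale, x_zero_point=None):
    """Hypothetical per-tensor-only helper: y = (x - x_zero_point) * x_scale."""
    x_scale = np.asarray(x_scale, dtype=np.float32)
    # An omitted zero point is treated as 0.
    zero_point = np.asarray(0 if x_zero_point is None else x_zero_point)
    if x_scale.ndim != 0 or zero_point.ndim != 0:
        raise ValueError("this version accepts only scalar x_scale and x_zero_point")
    return (np.asarray(x).astype(np.float32) - zero_point.astype(np.float32)) * x_scale

y10 = dequantize_linear_v10(
    np.array([0, 3, 128, 255], dtype=np.uint8), np.float32(2), np.uint8(128)
)
print(y10.tolist())  # [-256.0, -250.0, 0.0, 254.0]
```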

Inputs

Between 2 and 3 inputs.

  • x (heterogeneous) - T: N-D quantized input tensor to be de-quantized.

  • x_scale (heterogeneous) - tensor(float): Scale for input ‘x’. It’s a scalar, which means per-tensor/per-layer quantization.

  • x_zero_point (optional, heterogeneous) - T: Zero point for input ‘x’. It’s a scalar, which means per-tensor/per-layer quantization. Optional; the default value is 0 when not specified.

Outputs

  • y (heterogeneous) - tensor(float): N-D full-precision output tensor. It has the same shape as input ‘x’.

Type Constraints

  • T in ( tensor(int32), tensor(int8), tensor(uint8) ): Constrain ‘x_zero_point’ and ‘x’ to 8-bit or 32-bit integer tensors.