com.ms.internal.nhwc - QLinearConvTranspose

QLinearConvTranspose - 1 (com.ms.internal.nhwc)

Version

This version of the operator has been available since version 1 of domain com.ms.internal.nhwc.

Summary

Similar to ConvTranspose in ONNX, but with quantization.

The convolution transpose operator consumes an input tensor and a filter, and computes the output.

If the pads attribute is provided, the shape of the output is calculated with the following equation:

output_shape[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - pads[start_i] - pads[end_i]
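As a quick check of the equation above, here is a minimal Python sketch. The function name `conv_transpose_output_shape` is hypothetical, and the layout of `pads` (all begin values followed by all end values) is an assumption for illustration; this page does not state it.

```python
def conv_transpose_output_shape(input_size, kernel_shape, strides, dilations,
                                pads, output_padding):
    """Spatial output shape computed from explicit pads.

    Assumption: pads is laid out as
    [begin_0, ..., begin_{n-1}, end_0, ..., end_{n-1}].
    """
    n = len(input_size)
    out = []
    for i in range(n):
        # Extent of the kernel along axis i once dilation is applied.
        effective_kernel = (kernel_shape[i] - 1) * dilations[i] + 1
        out.append(strides[i] * (input_size[i] - 1) + output_padding[i]
                   + effective_kernel - pads[i] - pads[n + i])
    return out


# 3x3 input, 3x3 kernel, stride 2, no dilation, padding of 1 on every side.
shape = conv_transpose_output_shape(
    input_size=[3, 3], kernel_shape=[3, 3], strides=[2, 2],
    dilations=[1, 1], pads=[1, 1, 1, 1], output_padding=[0, 0])
print(shape)  # [5, 5]
```

Setting output_padding to [1, 1] in the same call yields [6, 6], which is how the attribute adds elements on the high-index side.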

output_shape can also be specified explicitly, in which case the pads values are generated automatically using these equations:

total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - output_shape[i]

If (auto_pad == SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)

Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2).
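The pads equations above can be sketched in Python as follows. The function name `conv_transpose_pads` is hypothetical, the division by 2 is assumed to be integer (floor) division, total_padding is assumed to be non-negative, and the returned layout (all begin values, then all end values) is likewise an assumption.

```python
def conv_transpose_pads(input_size, kernel_shape, strides, dilations,
                        output_padding, output_shape, auto_pad="NOTSET"):
    """Derive pads from an explicitly requested output_shape.

    Assumptions: total_padding[i] >= 0 and "/2" means integer division.
    Returns [begin_0, ..., begin_{n-1}, end_0, ..., end_{n-1}].
    """
    begins, ends = [], []
    for i in range(len(input_size)):
        effective_kernel = (kernel_shape[i] - 1) * dilations[i] + 1
        total = (strides[i] * (input_size[i] - 1) + output_padding[i]
                 + effective_kernel - output_shape[i])
        half = total // 2
        if auto_pad == "SAME_UPPER":
            # Smaller half at the start, remainder at the end.
            begins.append(half)
            ends.append(total - half)
        else:
            # Remainder at the start, smaller half at the end.
            begins.append(total - half)
            ends.append(half)
    return begins + ends


# 3x3 input, 3x3 kernel, stride 2, requested 6x6 output: total_padding is 1 per axis.
print(conv_transpose_pads([3, 3], [3, 3], [2, 2], [1, 1], [0, 0], [6, 6], "SAME_UPPER"))
# [0, 0, 1, 1]
print(conv_transpose_pads([3, 3], [3, 3], [2, 2], [1, 1], [0, 0], [6, 6]))
# [1, 1, 0, 0]
```

When total_padding is odd, the two branches differ only in which side receives the extra element.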

Attributes

  • auto_pad: auto_pad must be one of NOTSET, SAME_UPPER, SAME_LOWER, or VALID. The default value is NOTSET.

  • dilations: Dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each spatial axis.

  • group: Number of groups that the input channels and output channels are divided into.

  • kernel_shape: The shape of the convolution kernel. If not present, it should be inferred from input w.

  • output_padding: Additional elements added to the side with higher coordinate indices in the output. Each padding value in “output_padding” must be less than the corresponding stride/dilation dimension. By default, this attribute is a zero vector. Note that this attribute doesn’t directly affect the computed output values. It only controls the selection of the computed values, so changing this attribute only adds or removes output elements. If “output_shape” is explicitly provided, “output_padding” does not contribute additional size to “output_shape” but participates in the computation of the needed padding amount. This is also called adjs or adjustment in some frameworks.

  • output_shape: The shape of the output can be set explicitly, which causes the pads values to be generated automatically. If output_shape is specified, any provided pads values are ignored. See the Summary for the equations used to generate pads.

  • pads: Padding for the beginning and ending along each spatial axis.

  • strides: Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.

Inputs

Between 8 and 9 inputs.

  • x (heterogeneous) - T1: Input data tensor from the previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image case. Otherwise, the size is (N x C x D1 x D2 … x Dn).

  • x_scale (heterogeneous) - tensor(float): Scale tensor for input ‘x’. It’s a scalar, which means a per-tensor/layer quantization.

  • x_zero_point (heterogeneous) - T1: Zero point tensor for input ‘x’. It’s a scalar, which means a per-tensor/layer quantization.

  • w (heterogeneous) - T2: The weight tensor that will be used in the convolutions; has size (C x M/group x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel, and M is the number of feature maps.

  • w_scale (heterogeneous) - tensor(float): Scale tensor for input ‘w’. It could be a scalar or a 1-D tensor, which means a per-tensor/layer or per output channel quantization. If it’s a 1-D tensor, its number of elements should be equal to the number of output channels (M).

  • w_zero_point (heterogeneous) - T2: Zero point tensor for input ‘w’. It could be a scalar or a 1-D tensor, which means a per-tensor/layer or per output channel quantization. If it’s a 1-D tensor, its number of elements should be equal to the number of output channels (M).

  • y_scale (heterogeneous) - tensor(float): Scale tensor for output ‘y’. It’s a scalar, which means a per-tensor/layer quantization.

  • y_zero_point (heterogeneous) - T3: Zero point tensor for output ‘y’. It’s a scalar, which means a per-tensor/layer quantization.

  • B (optional, heterogeneous) - T4: Optional 1D bias to be added to the convolution; has size M. The bias must be quantized using scale = x_scale * w_scale and zero_point = 0.
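The bias rule above (scale = x_scale * w_scale, zero_point = 0) can be illustrated with a small Python sketch. The helper name `quantize_bias` is hypothetical, and storing the result as int32 is an assumption, since the type constraints for T4 are not listed on this page.

```python
def quantize_bias(bias, x_scale, w_scale):
    """Quantize a float bias with scale = x_scale * w_scale and zero_point = 0.

    w_scale may be a single float (per-tensor) or a list holding one scale
    per output channel. Assumption: the result is clamped to the int32 range.
    """
    int32_min, int32_max = -2**31, 2**31 - 1
    if isinstance(w_scale, (int, float)):
        w_scale = [w_scale] * len(bias)
    quantized = []
    for b, ws in zip(bias, w_scale):
        q = round(b / (x_scale * ws))  # zero_point is 0, so nothing is added
        quantized.append(max(int32_min, min(int32_max, q)))
    return quantized


# Per-tensor weight scale: the bias scale is 0.5 * 0.25 = 0.125.
print(quantize_bias([0.5, -1.0], x_scale=0.5, w_scale=0.25))         # [4, -8]
# Per-channel weight scales: each output channel gets its own bias scale.
print(quantize_bias([0.5, -1.0], x_scale=0.5, w_scale=[0.25, 0.5]))  # [4, -4]
```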

Outputs

  • y (heterogeneous) - T3: Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.
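This page does not spell out how the scale and zero-point inputs relate quantized values to real values. The sketch below assumes the usual linear quantization convention, real = scale * (q - zero_point), and an 8-bit unsigned range for the output; both are assumptions, and the helper names are hypothetical.

```python
def dequantize(q, scale, zero_point):
    """Map a quantized integer back to a real value (assumed convention)."""
    return scale * (q - zero_point)


def quantize(real, scale, zero_point, qmin=0, qmax=255):
    """Map a real value to a quantized integer, saturating at the range limits.

    Assumption: uint8 range by default; pass qmin=-128, qmax=127 for int8.
    """
    q = round(real / scale) + zero_point
    return max(qmin, min(qmax, q))


print(dequantize(130, scale=0.5, zero_point=128))   # 1.0
print(quantize(1.0, scale=0.5, zero_point=128))     # 130
print(quantize(1000.0, scale=0.5, zero_point=128))  # 255 (saturated)
```

Under this convention, y_scale and y_zero_point play the role of `scale` and `zero_point` when the accumulated result is written to y.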

Examples