com.microsoft - NhwcConv#

NhwcConv - 1 (com.microsoft)#

Version

  • name: NhwcConv (GitHub)

  • domain: com.microsoft

  • since_version: 1

  • function:

  • support_level:

  • shape inference:

This version of the operator has been available since version 1 of domain com.microsoft.

Summary

Attributes

  • auto_pad:

Default value is ?.

  • dilations: dilation value along each spatial axis of the filter. If not present, the dilation defaults is 1 along each spatial axis. Default value is ?.

  • group: number of groups input channels and output channels are divided into. Default value is ?.

  • kernel_shape: The shape of the convolution kernel. If not present, should be inferred from input W. Default value is ?.

  • pads:

Default value is ?.

  • strides: Stride along each spatial axis. If not present, the stride defaults is 1 along each spatial axis. Default value is ?.

Inputs

Between 2 and 3 inputs.

  • X (heterogeneous) - T: Input data tensor from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image. Otherwise the size is (N x C x D1 x D2 … x Dn). Optionally, if dimension denotation is in effect, the operation expects input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE …].

  • W (heterogeneous) - T: The weight tensor that will be used in the convolutions; has size (M x C/group x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel, and M is the number of feature maps. For more than 2 dimensions, the kernel shape will be (M x C/group x k1 x k2 x … x kn), where (k1 x k2 x … kn) is the dimension of the kernel. Optionally, if dimension denotation is in effect, the operation expects the weight tensor to arrive with the dimension denotation of [FILTER_OUT_CHANNEL, FILTER_IN_CHANNEL, FILTER_SPATIAL, FILTER_SPATIAL …]. Assuming zero based indices for the shape array, X.shape[1] == (W.shape[1] * group) == C and W.shape[0] mod G == 0. Or in other words FILTER_IN_CHANNEL multiplied by the number of groups should be equal to DATA_CHANNEL and the number of feature maps M should be a multiple of the number of groups G.

  • B (optional, heterogeneous) - T: Optional 1D bias to be added to the convolution, has size of M.

Outputs

  • Y (heterogeneous) - T: Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.

Examples