MaxPool

MaxPool - 12

Version

  • name: MaxPool (GitHub)

  • domain: main

  • since_version: 12

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 12.

Summary

MaxPool consumes an input tensor X and applies max pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Max pooling computes the maximum over all values of a subset of the input tensor, selected according to the kernel size, and downsamples the data into the output tensor Y for further processing. The output spatial shape is:

output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)

or

output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)

if ceil_mode is enabled

* pad_shape[i] is the sum of the pads along axis i (begin plus end)
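
The two formulas can be checked with a few lines of Python. This is a minimal sketch, not part of the ONNX API: the function name maxpool_output_size and its parameter names are hypothetical, and it handles one spatial axis at a time. Integer division is used instead of floating point so that floor and ceil are exact.

```python
def maxpool_output_size(input_size, kernel_size, stride=1, pad_total=0,
                        dilation=1, ceil_mode=False):
    """Output length along one spatial axis (hypothetical helper).

    pad_total corresponds to pad_shape[i]: begin pad plus end pad on this axis.
    """
    # Extent covered by a dilated kernel: (k - 1) * d + 1.
    effective_kernel = (kernel_size - 1) * dilation + 1
    numerator = input_size + pad_total - effective_kernel
    if ceil_mode:
        # -(-a // b) is ceil(a / b) for a positive integer b.
        return -(-numerator // stride) + 1
    return numerator // stride + 1


# Shapes taken from the examples on this page.
print(maxpool_output_size(5, 5, pad_total=4))                # 5
print(maxpool_output_size(32, 5, stride=3))                  # 10
print(maxpool_output_size(4, 3, stride=2))                   # 1
print(maxpool_output_size(4, 3, stride=2, ceil_mode=True))   # 2
print(maxpool_output_size(4, 2, dilation=2))                 # 2
```

With a 4-wide input, a kernel of 3, and a stride of 2, the second window would run past the edge of the input; floor mode drops it and ceil_mode keeps it, which is why the two results differ.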

auto_pad is a DEPRECATED attribute. If you are currently using it, the output spatial shape is:

VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])

For SAME_UPPER or SAME_LOWER, the pad shape is:

pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
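
As a rough illustration of the auto_pad formulas, the sketch below computes the output size and the begin/end padding for one spatial axis. The function name auto_pad_axis is hypothetical. The begin/end split follows the rule given for the auto_pad attribute (extra padding at the end for SAME_UPPER, at the beginning for SAME_LOWER), and clamping a negative total to 0 is an assumption that the formulas themselves do not state.

```python
def auto_pad_axis(input_size, kernel_size, stride=1, dilation=1,
                  auto_pad="SAME_UPPER"):
    """Return (output_size, pad_begin, pad_end) for one axis (hypothetical helper)."""
    effective_kernel = (kernel_size - 1) * dilation + 1
    if auto_pad == "VALID":
        # ceil((input - effective_kernel + 1) / stride), no padding.
        return -(-(input_size - effective_kernel + 1) // stride), 0, 0
    if auto_pad not in ("SAME_UPPER", "SAME_LOWER"):
        raise ValueError("auto_pad must be VALID, SAME_UPPER or SAME_LOWER here")

    output_size = -(-input_size // stride)  # ceil(input / stride)
    pad_total = (output_size - 1) * stride + effective_kernel - input_size
    pad_total = max(pad_total, 0)  # assumption: never pad by a negative amount

    smaller = pad_total // 2
    larger = pad_total - smaller
    if auto_pad == "SAME_UPPER":
        return output_size, smaller, larger  # extra pad goes at the end
    return output_size, larger, smaller      # extra pad goes at the beginning


print(auto_pad_axis(5, 3, stride=2))                # (3, 1, 1)
print(auto_pad_axis(32, 2, auto_pad="SAME_UPPER"))  # (32, 0, 1)
print(auto_pad_axis(32, 2, auto_pad="SAME_LOWER"))  # (32, 1, 0)
print(auto_pad_axis(32, 2, auto_pad="VALID"))       # (31, 0, 0)
```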

The output of each pooling window is the maximum of its elements, excluding padding.
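
To make the window rule concrete, here is a small pure-Python sketch of max pooling over a single 2-D plane. It is not the ONNX reference implementation: the name maxpool_2d is hypothetical, the input is a plain list of rows, and pads is assumed to be ordered [top, left, bottom, right]. Padded positions are simply skipped, so they can never be selected as the maximum.

```python
def maxpool_2d(x, kernel, strides=(1, 1), pads=(0, 0, 0, 0),
               dilations=(1, 1), ceil_mode=False):
    """Max-pool one H x W plane given as a list of rows (hypothetical helper)."""
    height, width = len(x), len(x[0])
    pad_top, pad_left, pad_bottom, pad_right = pads

    def out_size(size, k, s, pad_total, d):
        numerator = size + pad_total - ((k - 1) * d + 1)
        # ceil or floor of numerator / s, then + 1, using integer arithmetic.
        return (-(-numerator // s) if ceil_mode else numerator // s) + 1

    out_h = out_size(height, kernel[0], strides[0], pad_top + pad_bottom, dilations[0])
    out_w = out_size(width, kernel[1], strides[1], pad_left + pad_right, dilations[1])

    y = []
    for oh in range(out_h):
        row = []
        for ow in range(out_w):
            best = None  # stays None only if the whole window lies in padding
            for kh in range(kernel[0]):
                for kw in range(kernel[1]):
                    h = oh * strides[0] - pad_top + kh * dilations[0]
                    w = ow * strides[1] - pad_left + kw * dilations[1]
                    # Skip padded positions and positions past the input edge.
                    if 0 <= h < height and 0 <= w < width:
                        if best is None or x[h][w] > best:
                            best = x[h][w]
            row.append(best)
        y.append(row)
    return y


x4 = [[4 * r + c + 1 for c in range(4)] for r in range(4)]  # values 1..16
print(maxpool_2d(x4, (3, 3), strides=(2, 2), ceil_mode=True))  # [[11, 12], [15, 16]]
print(maxpool_2d(x4, (2, 2), dilations=(2, 2)))                # [[11, 12], [15, 16]]
```

The two calls reproduce the expected outputs of the ceil_mode and dilations examples further down this page.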

Attributes

  • auto_pad: auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. NOTSET means explicit padding is used. SAME_UPPER or SAME_LOWER means the input is padded so that output_shape[i] = ceil(input_shape[i] / strides[i]) for each axis i. The padding is split between the two sides equally or almost equally (depending on whether the total is even or odd). If the total padding is odd, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. Default value is 'NOTSET'.

  • ceil_mode: Whether to use ceil or floor (default) to compute the output shape. Default value is 0.

  • dilations: Dilation value along each spatial axis of filter. If not present, the dilation defaults to 1 along each spatial axis.

  • kernel_shape (required): The size of the kernel along each axis.

  • pads: Padding for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and represents the number of pixels added to the beginning or end of the corresponding axis. The pads format is [x1_begin, x2_begin…x1_end, x2_end,…], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 at the start and end of each spatial axis.

  • storage_order: The storage order of the tensor. 0 is row major, and 1 is column major. This attribute is used only to convert an n-tuple index value into a single integer value for producing the second output. Default value is 0.

  • strides: Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.
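
The pads layout lists all begin values first and all end values second, which is easy to misread as per-axis pairs. The following sketch regroups a pads list by axis and derives the per-axis total used in the shape formulas; the function names split_pads and pad_totals are hypothetical.

```python
def split_pads(pads):
    """Regroup [x1_begin, x2_begin, ..., x1_end, x2_end, ...] as (begin, end) per axis."""
    if len(pads) % 2 != 0:
        raise ValueError("pads needs one begin and one end value per spatial axis")
    if any(p < 0 for p in pads):
        raise ValueError("pads values must be greater than or equal to 0")
    n_axes = len(pads) // 2
    return [(pads[i], pads[i + n_axes]) for i in range(n_axes)]


def pad_totals(pads):
    """Sum of begin and end padding for each axis, i.e. pad_shape."""
    return [begin + end for begin, end in split_pads(pads)]


print(split_pads([2, 2, 2, 2]))  # [(2, 2), (2, 2)]
print(split_pads([0, 0, 1, 1]))  # [(0, 1), (0, 1)]
print(pad_totals([0, 0, 1, 1]))  # [1, 1]
```

In the second call, each of the two axes receives no padding at the beginning and one pixel at the end.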

Inputs

  • X (heterogeneous) - T: Input data tensor from the previous operator; dimensions for the image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form of (N x C x D1 x D2 … Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE …].

Outputs

Between 1 and 2 outputs.

  • Y (heterogeneous) - T: Output data tensor from max pooling across the input tensor. Dimensions vary with the kernel, stride, and pad sizes. The floor of the computed dimension is used unless ceil_mode is enabled.

  • Indices (optional, heterogeneous) - I: Indices tensor from max pooling across the input tensor. The dimensions of indices are the same as those of the output tensor. The values in indices are the indices of the values selected during pooling. The indices are computed as if the input were a flattened 1-D tensor, and they do not account for padding, so the values in indices are in [0, N x C x D1 x … x Dn).
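
The effect of storage_order on the Indices output can be sketched for the 4-D image case as follows. The function name flat_index is hypothetical, and the sketch assumes that column-major ordering applies only to the spatial axes H and W within each (n, c) plane; this page does not spell out that detail.

```python
def flat_index(n, c, h, w, shape, storage_order=0):
    """Flattened index of element (n, c, h, w) in an unpadded N x C x H x W tensor."""
    _, channels, height, width = shape
    plane_offset = (n * channels + c) * height * width
    if storage_order == 0:
        return plane_offset + h * width + w   # row major
    # Assumption: column major swaps the roles of h and w inside one plane.
    return plane_offset + w * height + h


shape = (1, 1, 5, 5)
# Positions of the maxima 7, 9, 17, 19 in the 5 x 5 input used by the examples.
positions = [(1, 1), (1, 3), (3, 1), (3, 3)]
print([flat_index(0, 0, h, w, shape, storage_order=0) for h, w in positions])  # [6, 8, 16, 18]
print([flat_index(0, 0, h, w, shape, storage_order=1) for h, w in positions])  # [6, 16, 8, 18]
```

The column-major result matches the expected indices of the _maxpool_with_argmax_2d_precomputed_strides example, which sets storage_order=1.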

Type Constraints

  • T in ( tensor(double), tensor(float), tensor(float16), tensor(int8), tensor(uint8) ): Constrain input and output types to float and 8-bit integer tensors.

  • I in ( tensor(int64) ): Constrain index tensor to int64

Examples

The examples below are excerpts from the ONNX node test suite. They assume that numpy is imported as np and that the onnx package is imported; expect, get_output_shape, get_pad_shape, and pool are helper functions from that test suite and are not defined on this page.

_maxpool_2d_uint8

"""
input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[5, 5],
    pads=[2, 2, 2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.uint8)
y = np.array(
    [
        [
            [
                [13, 14, 15, 15, 15],
                [18, 19, 20, 20, 20],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
            ]
        ]
    ]
).astype(np.uint8)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_uint8")

_maxpool_2d_precomputed_pads

"""
input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[5, 5],
    pads=[2, 2, 2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array(
    [
        [
            [
                [13, 14, 15, 15, 15],
                [18, 19, 20, 20, 20],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
            ]
        ]
    ]
).astype(np.float32)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_precomputed_pads")

_maxpool_with_argmax_2d_precomputed_pads

"""
input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y", "z"],
    kernel_shape=[5, 5],
    pads=[2, 2, 2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array(
    [
        [
            [
                [13, 14, 15, 15, 15],
                [18, 19, 20, 20, 20],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
            ]
        ]
    ]
).astype(np.float32)
z = np.array(
    [
        [
            [
                [12, 13, 14, 14, 14],
                [17, 18, 19, 19, 19],
                [22, 23, 24, 24, 24],
                [22, 23, 24, 24, 24],
                [22, 23, 24, 24, 24],
            ]
        ]
    ]
).astype(np.int64)

expect(
    node,
    inputs=[x],
    outputs=[y, z],
    name="test_maxpool_with_argmax_2d_precomputed_pads",
)

_maxpool_2d_precomputed_strides

"""
input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "MaxPool", inputs=["x"], outputs=["y"], kernel_shape=[2, 2], strides=[2, 2]
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[7, 9], [17, 19]]]]).astype(np.float32)

expect(
    node, inputs=[x], outputs=[y], name="test_maxpool_2d_precomputed_strides"
)

_maxpool_with_argmax_2d_precomputed_strides

"""
input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y", "z"],
    kernel_shape=[2, 2],
    strides=[2, 2],
    storage_order=1,
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[7, 9], [17, 19]]]]).astype(np.float32)
z = np.array([[[[6, 16], [8, 18]]]]).astype(np.int64)

expect(
    node,
    inputs=[x],
    outputs=[y, z],
    name="test_maxpool_with_argmax_2d_precomputed_strides",
)

_maxpool_2d_precomputed_same_upper

"""
input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 3, 3]
pad_shape: [2, 2] -> [1, 1, 1, 1] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    strides=[2, 2],
    auto_pad="SAME_UPPER",
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[7, 9, 10], [17, 19, 20], [22, 24, 25]]]]).astype(np.float32)

expect(
    node, inputs=[x], outputs=[y], name="test_maxpool_2d_precomputed_same_upper"
)

_maxpool_1d_default

"""
input_shape: [1, 3, 32]
output_shape: [1, 3, 31]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2],
)
x = np.random.randn(1, 3, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = [2]
strides = [1]
out_shape = get_output_shape("VALID", x_shape[2:], kernel_shape, strides)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, [0], "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_1d_default")

_maxpool_2d_default

"""
input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 31, 31]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape("VALID", x_shape[2:], kernel_shape, strides)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, (0, 0), "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_default")

_maxpool_3d_default

"""
input_shape: [1, 3, 32, 32, 32]
output_shape: [1, 3, 31, 31, 31]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2, 2],
)
x = np.random.randn(1, 3, 32, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = [2, 2, 2]
strides = [1, 1, 1]
out_shape = get_output_shape("VALID", x_shape[2:], kernel_shape, strides)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, [0, 0, 0], "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_3d_default")

_maxpool_2d_same_upper

"""
input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [0, 1, 0, 1] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    auto_pad="SAME_UPPER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape("SAME_UPPER", x_shape[2:], kernel_shape, strides)
pad_shape = get_pad_shape(
    "SAME_UPPER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_top = pad_shape[0] // 2
pad_bottom = pad_shape[0] - pad_top
pad_left = pad_shape[1] // 2
pad_right = pad_shape[1] - pad_left
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=np.nan,
)
y = pool(padded, x_shape, kernel_shape, strides, out_shape, pad_shape, "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_same_upper")

_maxpool_2d_same_lower

"""
input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [1, 0, 1, 0] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    auto_pad="SAME_LOWER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape("SAME_LOWER", x_shape[2:], kernel_shape, strides)
pad_shape = get_pad_shape(
    "SAME_LOWER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_bottom = pad_shape[0] // 2
pad_top = pad_shape[0] - pad_bottom
pad_right = pad_shape[1] // 2
pad_left = pad_shape[1] - pad_right
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=np.nan,
)
y = pool(padded, x_shape, kernel_shape, strides, out_shape, pad_shape, "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_same_lower")

_maxpool_2d_pads

"""
input_shape: [1, 3, 28, 28]
output_shape: [1, 3, 30, 30]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    pads=[2, 2, 2, 2],
)
x = np.random.randn(1, 3, 28, 28).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (3, 3)
strides = (1, 1)
pad_bottom = pad_top = pad_right = pad_left = 2
pad_shape = [pad_top + pad_bottom, pad_left + pad_right]
out_shape = get_output_shape(
    "VALID", np.add(x_shape[2:], pad_shape), kernel_shape, strides
)
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=np.nan,
)
y = pool(padded, x_shape, kernel_shape, strides, out_shape, pad_shape, "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_pads")

_maxpool_2d_strides

"""
input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 10, 10]
"""
node = onnx.helper.make_node(
    "MaxPool", inputs=["x"], outputs=["y"], kernel_shape=[5, 5], strides=[3, 3]
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (5, 5)
strides = (3, 3)
out_shape = get_output_shape("VALID", x_shape[2:], kernel_shape, strides)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, (0, 0), "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_strides")

_maxpool_2d_ceil

"""
input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    strides=[2, 2],
    ceil_mode=True,
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[11, 12], [15, 16]]]]).astype(np.float32)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_ceil")

_maxpool_2d_dilations

"""
input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    strides=[1, 1],
    dilations=[2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[11, 12], [15, 16]]]]).astype(np.float32)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_dilations")

Differences

Changes from MaxPool - 11 to MaxPool - 12 (the summary, the inputs, the outputs, and all attributes not listed here are unchanged):

  • auto_pad: the description of SAME_UPPER and SAME_LOWER changed from padding the input so that the output spatial size matches the input to padding the input so that output_shape[i] = ceil(input_shape[i] / strides[i]) for each axis i. Version 12 adds that the padding is split between the two sides equally or almost equally (depending on whether it is even or odd), and drops the sentence stating that VALID means no padding.

  • storage_order: version 12 adds the sentence "This attribute is used only to convert an n-tuple index value into a single integer value for producing the second output."

  • T: version 12 adds tensor(int8) and tensor(uint8), and the description changes from "Constrain input and output types to float tensors." to "Constrain input and output types to float and 8 bit tensors."

MaxPool - 11

Version

  • name: MaxPool (GitHub)

  • domain: main

  • since_version: 11

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 11.

Summary

MaxPool consumes an input tensor X and applies max pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Max pooling computes the maximum over all values of a subset of the input tensor, selected according to the kernel size, and downsamples the data into the output tensor Y for further processing. The output spatial shape is:

output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)

or

output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)

if ceil_mode is enabled

* pad_shape[i] is the sum of the pads along axis i (begin plus end)

auto_pad is a DEPRECATED attribute. If you are currently using it, the output spatial shape is:

VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])

For SAME_UPPER or SAME_LOWER, the pad shape is:

pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]

The output of each pooling window is the maximum of its elements, excluding padding.

Attributes

  • auto_pad: auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. NOTSET means explicit padding is used. SAME_UPPER or SAME_LOWER means the input is padded so that the output spatial size matches the input. If the total padding is odd, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. Default value is 'NOTSET'.

  • ceil_mode: Whether to use ceil or floor (default) to compute the output shape. Default value is 0.

  • dilations: Dilation value along each spatial axis of filter. If not present, the dilation defaults to 1 along each spatial axis.

  • kernel_shape (required): The size of the kernel along each axis.

  • pads: Padding for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and represents the number of pixels added to the beginning or end of the corresponding axis. The pads format is [x1_begin, x2_begin…x1_end, x2_end,…], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 at the start and end of each spatial axis.

  • storage_order: The storage order of the tensor. 0 is row major, and 1 is column major. Default value is 0.

  • strides: Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.

Inputs

  • X (heterogeneous) - T: Input data tensor from the previous operator; dimensions for the image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form of (N x C x D1 x D2 … Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE …].

Outputs

Between 1 and 2 outputs.

  • Y (heterogeneous) - T: Output data tensor from max pooling across the input tensor. Dimensions vary with the kernel, stride, and pad sizes. The floor of the computed dimension is used unless ceil_mode is enabled.

  • Indices (optional, heterogeneous) - I: Indices tensor from max pooling across the input tensor. The dimensions of indices are the same as those of the output tensor. The values in indices are the indices of the values selected during pooling. The indices are computed as if the input were a flattened 1-D tensor, and they do not account for padding, so the values in indices are in [0, N x C x D1 x … x Dn).

Type Constraints

  • T in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.

  • I in ( tensor(int64) ): Constrain index tensor to int64
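
The type constraint on T is the most visible difference between the two versions documented on this page. The lookup below simply restates the two Type Constraints lists; the names MAXPOOL_T_TYPES and is_supported_input_type are hypothetical and do not correspond to anything in the onnx package.

```python
# Element types accepted for T, copied from the Type Constraints of each version.
_FLOAT_TYPES = {"tensor(double)", "tensor(float)", "tensor(float16)"}
MAXPOOL_T_TYPES = {
    11: frozenset(_FLOAT_TYPES),
    12: frozenset(_FLOAT_TYPES | {"tensor(int8)", "tensor(uint8)"}),
}


def is_supported_input_type(since_version, tensor_type):
    """True if tensor_type is listed for T in the given MaxPool version."""
    return tensor_type in MAXPOOL_T_TYPES[since_version]


print(is_supported_input_type(11, "tensor(uint8)"))  # False
print(is_supported_input_type(12, "tensor(uint8)"))  # True
```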

Differences

Compared with the previous version (MaxPool - 10), only two attribute descriptions change:

  • dilations: the description now adds "If not present, the dilation defaults to 1 along each spatial axis."

  • strides: the description now adds "If not present, the stride defaults to 1 along each spatial axis."

All other text is unchanged.

MaxPool - 10

Version

  • name: MaxPool (GitHub)

  • domain: main

  • since_version: 10

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 10.

Summary

MaxPool consumes an input tensor X and applies max pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Max pooling computes the maximum over all values of a subset of the input tensor, selected according to the kernel size, and downsamples the data into the output tensor Y for further processing. The output spatial shape is:

output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)

or

output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)

if ceil_mode is enabled

* pad_shape[i] is the sum of the pads along axis i.
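The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function name `maxpool_output_shape` is hypothetical, and it assumes `pads` uses the [x1_begin, x2_begin…x1_end, x2_end,…] layout described under Attributes and that every attribute is passed explicitly.

```python
def maxpool_output_shape(input_spatial_shape, kernel_shape, strides, pads,
                         dilations, ceil_mode=0):
    """Spatial output shape for explicit padding (hypothetical helper)."""
    n = len(input_spatial_shape)
    out = []
    for i in range(n):
        # pad_shape[i] is the sum of the begin and end pads along axis i.
        pad_shape = pads[i] + pads[i + n]
        # Extent covered by a dilated kernel along axis i.
        effective_kernel = (kernel_shape[i] - 1) * dilations[i] + 1
        numerator = input_spatial_shape[i] + pad_shape - effective_kernel
        if ceil_mode:
            quotient = -(-numerator // strides[i])  # integer ceil division
        else:
            quotient = numerator // strides[i]      # integer floor division
        # The "+ 1" is an integer, so it can be added after rounding.
        out.append(quotient + 1)
    return out


# 32 x 32 input, 3 x 3 kernel, stride 2, one pixel of padding on every side.
print(maxpool_output_shape([32, 32], [3, 3], [2, 2], [1, 1, 1, 1], [1, 1]))     # [16, 16]
print(maxpool_output_shape([32, 32], [3, 3], [2, 2], [1, 1, 1, 1], [1, 1], 1))  # [17, 17]
```

With ceil_mode enabled the last, partially covered window is kept, which is why the second call yields one extra output position per axis.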

auto_pad is a DEPRECATED attribute. If you are currently using it, the output spatial shape is:

VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])

For SAME_UPPER or SAME_LOWER, the pad shape is:

pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
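The auto_pad formulas can be traced with a short Python sketch. The helper names (`dilated_extent`, `valid_output_shape`, `same_output_and_pads`) are hypothetical. The way the total padding is split between the two sides follows the auto_pad attribute description (extra padding at the end for SAME_UPPER, at the beginning for SAME_LOWER). The sketch assumes the computed pad_shape is non-negative and does not handle the opposite case.

```python
import math


def dilated_extent(kernel, dilation):
    """Extent covered by a dilated kernel along one axis."""
    return (kernel - 1) * dilation + 1


def valid_output_shape(input_shape, kernel_shape, strides, dilations):
    """Output spatial shape for auto_pad = VALID."""
    return [
        math.ceil((n - dilated_extent(k, d) + 1) / s)
        for n, k, s, d in zip(input_shape, kernel_shape, strides, dilations)
    ]


def same_output_and_pads(input_shape, kernel_shape, strides, dilations, auto_pad):
    """Output spatial shape and pads for SAME_UPPER or SAME_LOWER."""
    if auto_pad not in ("SAME_UPPER", "SAME_LOWER"):
        raise ValueError("auto_pad must be SAME_UPPER or SAME_LOWER")
    out_shape, begins, ends = [], [], []
    for n, k, s, d in zip(input_shape, kernel_shape, strides, dilations):
        out = math.ceil(n / s)
        pad = (out - 1) * s + dilated_extent(k, d) - n  # assumed >= 0 here
        small, large = pad // 2, pad - pad // 2
        # An odd total puts the extra pixel at the end (UPPER) or start (LOWER).
        begin, end = (small, large) if auto_pad == "SAME_UPPER" else (large, small)
        out_shape.append(out)
        begins.append(begin)
        ends.append(end)
    # Pads are returned as all begins followed by all ends.
    return out_shape, begins + ends


print(valid_output_shape([7], [4], [2], [1]))                    # [2]
print(same_output_and_pads([7], [4], [2], [1], "SAME_UPPER"))    # ([4], [1, 2])
print(same_output_and_pads([7], [4], [2], [1], "SAME_LOWER"))    # ([4], [2, 1])
```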

The output of each pooling window is the maximum of its elements, excluding padding.

Attributes

  • auto_pad: auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. NOTSET means explicit padding is used. SAME_UPPER or SAME_LOWER mean pad the input so that the output spatial size matches the input (more precisely, ceil(input_spatial_shape[i] / strides_spatial_shape[i]), per the formula in the Summary). If the total padding is odd, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. Default value is 'NOTSET'.

  • ceil_mode: Whether to use ceil or floor (default) to compute the output shape. Default value is 0.

  • dilations: Dilation value along each spatial axis of the filter.

  • kernel_shape (required): The size of the kernel along each axis.

  • pads: Padding for the beginning and ending along each spatial axis; each value must be greater than or equal to 0. Each value is the number of pixels added to the beginning or end of the corresponding axis. The pads format is [x1_begin, x2_begin…x1_end, x2_end,…], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 at the start and end of each spatial axis.

  • storage_order: The storage order of the tensor. 0 is row major, and 1 is column major. Default value is 0.

  • strides: Stride along each spatial axis.

Inputs

  • X (heterogeneous) - T: Input data tensor from the previous operator. For the image case, the dimensions are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form (N x C x D1 x D2 … Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE …].

Outputs

Between 1 and 2 outputs.

  • Y (heterogeneous) - T: Output data tensor from max pooling across the input tensor. Dimensions vary with the kernel, stride, and pad sizes. The floor of the computed dimension is used unless ceil_mode is enabled.

  • Indices (optional, heterogeneous) - I: Indices tensor from max pooling across the input tensor. The dimensions of indices are the same as those of the output tensor. The values in indices are the indices of the values selected during pooling. The indices are computed as for a flattened 1-D tensor and do not consider padding, so the values in indices are in [0, N x C x D1 x … x Dn).
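How Y and Indices relate can be seen in a small pure-Python sketch for a 1-D spatial input of shape (N x C x L). The function `maxpool1d_with_indices` is hypothetical. It assumes row-major flattening (storage_order = 0), floor mode, that ties resolve to the first maximum, and that every window overlaps at least one real element. None of these choices is prescribed by the text above beyond the flattened, padding-free index range.

```python
def maxpool1d_with_indices(x, kernel, stride=1, pad_begin=0, pad_end=0, dilation=1):
    """Max pool a nested list shaped [N][C][L]; returns (Y, Indices)."""
    n_batch, n_chan, length = len(x), len(x[0]), len(x[0][0])
    effective = (kernel - 1) * dilation + 1
    out_len = (length + pad_begin + pad_end - effective) // stride + 1
    y, indices = [], []
    for n in range(n_batch):
        y_n, idx_n = [], []
        for c in range(n_chan):
            row = x[n][c]
            y_c, idx_c = [], []
            for o in range(out_len):
                start = o * stride - pad_begin
                # Keep only positions that fall on real data; padding is skipped.
                positions = [start + k * dilation for k in range(kernel)]
                positions = [p for p in positions if 0 <= p < length]
                best = max(positions, key=lambda p: row[p])
                y_c.append(row[best])
                # Flat offset into the unpadded N x C x L tensor.
                idx_c.append((n * n_chan + c) * length + best)
            y_n.append(y_c)
            idx_n.append(idx_c)
        y.append(y_n)
        indices.append(idx_n)
    return y, indices


x = [[[1, 3, 2, 5, 4], [9, 8, 7, 6, 5]]]  # shape 1 x 2 x 5
y, idx = maxpool1d_with_indices(x, kernel=2, stride=2)
print(y)    # [[[3, 5], [9, 7]]]
print(idx)  # [[[1, 3], [5, 7]]]
```

The second channel starts at flat offset 5, so its selected positions 0 and 2 are reported as 5 and 7.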

Type Constraints

  • T in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.

  • I in ( tensor(int64) ): Constrain index tensor to int64.

Differences

Compared with the previous version (MaxPool - 8), this version changes the following:

  • Output shape: the floor formula replaces kernel_spatial_shape[i] with the dilated kernel extent ((kernel_spatial_shape[i] - 1) * dilations[i] + 1), and a ceil variant of the same formula is added for when ceil_mode is enabled.

  • auto_pad formulas: the VALID output shape and the SAME_UPPER or SAME_LOWER pad shape likewise replace kernel_spatial_shape[i] with ((kernel_spatial_shape[i] - 1) * dilations[i] + 1).

  • Attributes: ceil_mode and dilations are added.

All other text is unchanged.

MaxPool - 8

Version

  • name: MaxPool (GitHub)

  • domain: main

  • since_version: 8

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 8.

Summary

MaxPool consumes an input tensor X and applies max pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Max pooling computes the maximum over all values of a subset of the input tensor, selected according to the kernel size, and downsamples the data into the output tensor Y for further processing. The output spatial shape is:

output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - kernel_spatial_shape[i]) / strides_spatial_shape[i] + 1)

* pad_shape[i] is the sum of the pads along axis i.

auto_pad is a DEPRECATED attribute. If you are currently using it, the output spatial shape is:

VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - kernel_spatial_shape[i] + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])

For SAME_UPPER or SAME_LOWER, the pad shape is:

pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + kernel_spatial_shape[i] - input_spatial_shape[i]

The output of each pooling window is the maximum of its elements, excluding padding.
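One way to see why padding must not take part in the maximum is a 1-D sketch that pads with negative infinity, so a padded slot can never win. The function `maxpool1d` is hypothetical, and padding with -inf is only one illustrative way to exclude padded positions; it uses the floor formula of this version (no dilations).

```python
def maxpool1d(row, kernel, stride, pad_begin=0, pad_end=0):
    """1-D max pooling over a list of floats, ignoring padded positions."""
    neg_inf = float("-inf")
    # Padding with -inf guarantees a padded slot is never selected.
    padded = [neg_inf] * pad_begin + list(row) + [neg_inf] * pad_end
    out_len = (len(padded) - kernel) // stride + 1
    return [max(padded[o * stride:o * stride + kernel]) for o in range(out_len)]


# All inputs are negative: zero padding would wrongly produce 0.0 at the edges.
print(maxpool1d([-3.0, -1.0, -2.0], kernel=2, stride=1, pad_begin=1, pad_end=1))
# [-3.0, -1.0, -1.0, -2.0]
```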

Attributes

  • auto_pad: auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. NOTSET means explicit padding is used. SAME_UPPER or SAME_LOWER mean pad the input so that the output spatial size matches the input (more precisely, ceil(input_spatial_shape[i] / strides_spatial_shape[i]), per the formula in the Summary). If the total padding is odd, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. Default value is 'NOTSET'.

  • kernel_shape (required): The size of the kernel along each axis.

  • pads: Padding for the beginning and ending along each spatial axis; each value must be greater than or equal to 0. Each value is the number of pixels added to the beginning or end of the corresponding axis. The pads format is [x1_begin, x2_begin…x1_end, x2_end,…], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 at the start and end of each spatial axis.

  • storage_order: The storage order of the tensor. 0 is row major, and 1 is column major. Default value is 0.

  • strides: Stride along each spatial axis.
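The difference between the two storage orders can be sketched for a single H x W plane. The helper `flat_index` is hypothetical, and the assumption that storage_order governs how positions are flattened into Indices values goes beyond what the attribute description above states, so treat this only as an illustration of row-major versus column-major flattening.

```python
def flat_index(h, w, height, width, storage_order=0):
    """Flatten position (h, w) of an H x W plane.

    storage_order 0: row major, consecutive w values are adjacent.
    storage_order 1: column major, consecutive h values are adjacent.
    """
    if storage_order == 0:
        return h * width + w
    if storage_order == 1:
        return h + w * height
    raise ValueError("storage_order must be 0 or 1")


# Position (1, 3) in a 5 x 5 plane.
print(flat_index(1, 3, 5, 5, storage_order=0))  # 8
print(flat_index(1, 3, 5, 5, storage_order=1))  # 16
```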

Inputs

  • X (heterogeneous) - T: Input data tensor from the previous operator. For the image case, the dimensions are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form (N x C x D1 x D2 … Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE …].

Outputs

Between 1 and 2 outputs.

  • Y (heterogeneous) - T: Output data tensor from max pooling across the input tensor. Dimensions vary with the kernel, stride, and pad sizes. The floor of the computed dimension is used.

  • Indices (optional, heterogeneous) - I: Indices tensor from max pooling across the input tensor. The dimensions of indices are the same as those of the output tensor. The values in indices are the indices of the values selected during pooling. The indices are computed as for a flattened 1-D tensor and do not consider padding, so the values in indices are in [0, N x C x D1 x … x Dn).

Type Constraints

  • T in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.

  • I in ( tensor(int64) ): Constrain index tensor to int64.

Differences

Compared with the previous version (MaxPool - 1), this version adds the following:

  • Attributes: storage_order.

  • Outputs: the statement "Between 1 and 2 outputs." and the optional Indices output of type I.

  • Type Constraints: I in ( tensor(int64) ).

All other text is unchanged.

MaxPool - 1

Version

  • name: MaxPool (GitHub)

  • domain: main

  • since_version: 1

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 1.

Summary

MaxPool consumes an input tensor X and applies max pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Max pooling computes the maximum over all values of a subset of the input tensor, selected according to the kernel size, and downsamples the data into the output tensor Y for further processing. The output spatial shape is:

output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - kernel_spatial_shape[i]) / strides_spatial_shape[i] + 1)

* pad_shape[i] is the sum of the pads along axis i
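
The explicit-padding formula above can be sketched in Python as follows. The helper name `maxpool_output_shape` is hypothetical, and `pads` is assumed to use the begin-values-then-end-values layout of the pads attribute.

```python
def maxpool_output_shape(input_shape, kernel_shape, strides, pads=None):
    """Spatial output shape for explicit padding (floor formula).

    input_shape, kernel_shape and strides hold one entry per spatial axis.
    pads holds all begin values followed by all end values; it defaults
    to zero padding when omitted.
    """
    n = len(input_shape)
    if pads is None:
        pads = [0] * (2 * n)
    out = []
    for i in range(n):
        pad_total = pads[i] + pads[i + n]  # pad_shape[i]
        # Integer floor division matches floor(... / stride + 1).
        out.append((input_shape[i] + pad_total - kernel_shape[i]) // strides[i] + 1)
    return out


print(maxpool_output_shape([32, 32], [3, 3], [2, 2], pads=[1, 1, 1, 1]))  # [16, 16]
```

In this example each axis evaluates to floor((32 + 2 - 3) / 2 + 1) = floor(16.5) = 16.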

auto_pad is a DEPRECATED attribute. If you are currently using it, the output spatial shape is computed as follows:

VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - kernel_spatial_shape[i] + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])

For SAME_UPPER or SAME_LOWER, the pad shape is computed as follows:

pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + kernel_spatial_shape[i] - input_spatial_shape[i]
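
As a minimal sketch, the auto_pad formulas above can be evaluated for a single axis like this. The function names are hypothetical. Clamping a negative total pad to zero is an assumption, not something the text above states. The begin/end split follows the auto_pad attribute description: the extra pixel goes at the end for SAME_UPPER and at the beginning for SAME_LOWER.

```python
def ceil_div(a, b):
    """Ceiling of a / b using integer arithmetic."""
    return -(-a // b)


def auto_pad_1d(input_size, kernel_size, stride, auto_pad):
    """Return (output_size, (pad_begin, pad_end)) for one spatial axis."""
    if auto_pad == "VALID":
        return ceil_div(input_size - kernel_size + 1, stride), (0, 0)
    if auto_pad not in ("SAME_UPPER", "SAME_LOWER"):
        raise ValueError("expected VALID, SAME_UPPER or SAME_LOWER")
    out = ceil_div(input_size, stride)
    total = (out - 1) * stride + kernel_size - input_size
    total = max(total, 0)  # assumption: never pad by a negative amount
    small, large = total // 2, total - total // 2
    # An odd total puts the extra pixel at the end (UPPER) or beginning (LOWER).
    pads = (small, large) if auto_pad == "SAME_UPPER" else (large, small)
    return out, pads


print(auto_pad_1d(7, 3, 2, "VALID"))       # (3, (0, 0))
print(auto_pad_1d(8, 3, 2, "SAME_UPPER"))  # (4, (0, 1))
print(auto_pad_1d(8, 3, 2, "SAME_LOWER"))  # (4, (1, 0))
```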

The output of each pooling window is the maximum of its elements, excluding padding.
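
To illustrate that padded positions do not take part in the maximum, here is a pure-Python sketch for a single 1-D row. The name `maxpool_1d` is hypothetical, and the sketch assumes each pad is smaller than the kernel so that every window contains at least one real element.

```python
def maxpool_1d(x, kernel, stride, pad_begin=0, pad_end=0):
    """Max pooling over one row; padded positions are excluded."""
    n = len(x)
    out_len = (n + pad_begin + pad_end - kernel) // stride + 1
    y = []
    for o in range(out_len):
        start = o * stride - pad_begin
        # Clip the window to the real data so padding never contributes.
        lo, hi = max(start, 0), min(start + kernel, n)
        y.append(max(x[lo:hi]))
    return y


print(maxpool_1d([-3, -1, -2, -5], kernel=3, stride=1, pad_begin=1, pad_end=1))
# [-1, -1, -1, -2]
```

If the padding were treated as zeros, every output in this example would be 0. The negative results show that only real input values are compared.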

Attributes

  • auto_pad: auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default value is NOTSET, which means explicit padding is used. SAME_UPPER or SAME_LOWER means the input is padded so that the output spatial size matches the input. If the total padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. Default value is 'NOTSET'.

  • kernel_shape (required): The size of the kernel along each axis.

  • pads: Padding for the beginning and end of each spatial axis; each value must be greater than or equal to 0. Each value represents the number of pixels added to the beginning or end of the corresponding axis. The pads format is [x1_begin, x2_begin…x1_end, x2_end,…], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 at the start and end of each spatial axis.

  • strides: Stride along each spatial axis.
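
The pads layout described above lists all begin values first and all end values second. The following sketch regroups such a list into per-axis pairs. The helper name `split_pads` is hypothetical.

```python
def split_pads(pads):
    """Turn [x1_begin, x2_begin, ..., x1_end, x2_end, ...] into
    [(x1_begin, x1_end), (x2_begin, x2_end), ...]."""
    if len(pads) % 2 != 0:
        raise ValueError("pads must hold a begin and an end value per axis")
    if any(p < 0 for p in pads):
        raise ValueError("pads must be greater than or equal to 0")
    n = len(pads) // 2
    return list(zip(pads[:n], pads[n:]))


# Two spatial axes: axis 1 is padded by 1 and 3, axis 2 by 2 and 4.
print(split_pads([1, 2, 3, 4]))  # [(1, 3), (2, 4)]
```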

Inputs

  • X (heterogeneous) - T: Input data tensor from the previous operator. For the image case, the dimensions are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form of (N x C x D1 x D2 … Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE …].

Outputs

  • Y (heterogeneous) - T: Output data tensor from average or max pooling across the input tensor. Its dimensions depend on the kernel, stride, and pad sizes. The floor of each computed dimension is used.

Type Constraints

  • T in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.