.. _l-onnx-doccom.microsoft-RemovePadding:

=============================
com.microsoft - RemovePadding
=============================

.. contents::
    :local:

.. _l-onnx-opcom-microsoft-removepadding-1:

RemovePadding - 1 (com.microsoft)
=================================

**Version**

* **name**: `RemovePadding (GitHub) `_
* **domain**: **com.microsoft**
* **since_version**: **1**
* **function**:
* **support_level**:
* **shape inference**:

This version of the operator has been available **since version 1 of domain com.microsoft**.

**Summary**

Compresses transformer input by removing padding. It assumes padding is on the
right side of each sequence. The padded input has shape
(batch_size, sequence_length, hidden_size). The operator produces *output*
with shape (total_tokens, hidden_size) and *token_offset* with shape
(batch_size, sequence_length). *token_offset* contains the offsets of all
non-padding tokens first, followed by the offsets of all padding tokens. It is
a list of batch_size * sequence_length elements, reshaped to 2D for
convenience of shape inference.

**Inputs**

* **input** (heterogeneous) - **T**:
  Input tensor with shape (batch_size, sequence_length, hidden_size)
* **sequence_token_count** (heterogeneous) - **M**:
  Number of non-padding tokens in each sequence, with shape (batch_size).

**Outputs**

* **output** (heterogeneous) - **T**:
  Output tensor with shape (total_tokens, hidden_size)
* **token_offset** (heterogeneous) - **M**:
  Offsets of non-padding tokens, followed by those of padding tokens. Its
  shape is (batch_size, sequence_length)
* **cumulated_seq_len** (heterogeneous) - **M**:
  Cumulated sequence lengths. Its shape is (batch_size + 1)
* **max_seq_len** (heterogeneous) - **M**:
  Max sequence length without padding. Its shape is (1)

**Examples**
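The Examples section above is empty in this page; as an illustration of the operator's semantics (not the ONNX Runtime kernel itself), the compaction described in the Summary can be sketched in NumPy. The function name ``remove_padding`` and its signature are assumptions chosen to mirror the operator's inputs and outputs.

```python
import numpy as np

def remove_padding(x, sequence_token_count):
    """Illustrative NumPy sketch of RemovePadding semantics.

    x: (batch_size, sequence_length, hidden_size), right-padded per sequence.
    sequence_token_count: (batch_size,) non-padding token count per sequence.
    """
    batch_size, seq_len, hidden = x.shape
    # Collect flat token offsets: non-padding tokens first, padding tokens after.
    non_pad, pad = [], []
    for b in range(batch_size):
        n = int(sequence_token_count[b])
        non_pad.extend(b * seq_len + i for i in range(n))
        pad.extend(b * seq_len + i for i in range(n, seq_len))
    # token_offset is reshaped to 2D only for convenience of shape inference.
    token_offset = np.array(non_pad + pad, dtype=np.int32).reshape(batch_size, seq_len)
    # Gather the non-padding rows: shape (total_tokens, hidden_size).
    output = x.reshape(-1, hidden)[non_pad]
    # Cumulated sequence lengths, shape (batch_size + 1).
    cumulated_seq_len = np.concatenate(
        [[0], np.cumsum(sequence_token_count)]).astype(np.int32)
    # Max sequence length without padding, shape (1).
    max_seq_len = np.array([int(sequence_token_count.max())], dtype=np.int32)
    return output, token_offset, cumulated_seq_len, max_seq_len
```

For example, with ``batch_size=2``, ``sequence_length=3`` and counts ``[2, 3]``, the first sequence contributes 2 tokens and the second 3, so ``output`` has 5 rows and ``token_offset`` lists the padding offset (position 2 of sequence 0) last.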