.. _l-onnx-doccom.microsoft-RemovePadding:

=============================
com.microsoft - RemovePadding
=============================

.. contents::
    :local:

.. _l-onnx-opcom-microsoft-removepadding-1:

RemovePadding - 1 (com.microsoft)
=================================

**Version**

* **name**: `RemovePadding (GitHub) `_
* **domain**: **com.microsoft**
* **since_version**: **1**
* **function**:
* **support_level**:
* **shape inference**:

This version of the operator has been available **since version 1 of domain com.microsoft**.

**Summary**

Compresses transformer input by removing padding. It assumes padding is on the
right side of each sequence. The padded input has shape
(batch_size, sequence_length, hidden_size). The operator produces *output*
with shape (total_tokens, hidden_size) and *token_offset* with shape
(batch_size, sequence_length). *token_offset* contains the offsets of all
non-padding tokens first, followed by the offsets of all padding tokens. It is
a list of batch_size * sequence_length elements, reshaped to 2D for
convenience of shape inference.

**Inputs**

* **input** (heterogeneous) - **T**:
  Input tensor with shape (batch_size, sequence_length, hidden_size)
* **sequence_token_count** (heterogeneous) - **M**:
  Number of non-padding tokens in each sequence, with shape (batch_size).

**Outputs**

* **output** (heterogeneous) - **T**:
  Output tensor with shape (total_tokens, hidden_size)
* **token_offset** (heterogeneous) - **M**:
  Offsets of non-padding tokens, followed by those of padding tokens. Its
  shape is (batch_size, sequence_length)
* **cumulated_seq_len** (heterogeneous) - **M**:
  Cumulated sequence lengths. Its shape is (batch_size + 1)
* **max_seq_len** (heterogeneous) - **M**:
  Max sequence length without padding. Its shape is (1)

**Examples**
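The Examples section above is empty in this page; as an illustration of the operator's semantics (not the ONNX Runtime kernel itself), the compaction described in the Summary can be sketched in NumPy. The function name ``remove_padding`` and its signature are assumptions chosen to mirror the operator's inputs and outputs.

```python
import numpy as np

def remove_padding(x, sequence_token_count):
    """Illustrative NumPy sketch of RemovePadding semantics.

    x: (batch_size, sequence_length, hidden_size), right-padded per sequence.
    sequence_token_count: (batch_size,) non-padding token count per sequence.
    """
    batch_size, seq_len, hidden = x.shape
    # Collect flat token offsets: non-padding tokens first, padding tokens after.
    non_pad, pad = [], []
    for b in range(batch_size):
        n = int(sequence_token_count[b])
        non_pad.extend(b * seq_len + i for i in range(n))
        pad.extend(b * seq_len + i for i in range(n, seq_len))
    # token_offset is reshaped to 2D only for convenience of shape inference.
    token_offset = np.array(non_pad + pad, dtype=np.int32).reshape(batch_size, seq_len)
    # Gather the non-padding rows: shape (total_tokens, hidden_size).
    output = x.reshape(-1, hidden)[non_pad]
    # Cumulated sequence lengths, shape (batch_size + 1).
    cumulated_seq_len = np.concatenate(
        [[0], np.cumsum(sequence_token_count)]).astype(np.int32)
    # Max sequence length without padding, shape (1).
    max_seq_len = np.array([int(sequence_token_count.max())], dtype=np.int32)
    return output, token_offset, cumulated_seq_len, max_seq_len
```

For example, with ``batch_size=2``, ``sequence_length=3`` and counts ``[2, 3]``, the first sequence contributes 2 tokens and the second 3, so ``output`` has 5 rows and ``token_offset`` lists the padding offset (position 2 of sequence 0) last.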