com.microsoft - BiasSoftmax
BiasSoftmax - 1 (com.microsoft)
Version
name: BiasSoftmax (GitHub)
domain: com.microsoft
since_version: 1
function:
support_level:
shape inference:
This version of the operator has been available since version 1 of domain com.microsoft.
Summary
Y = softmax(scores + bias), where scores is the data input and bias is broadcast to it with a simple broadcast. Intended to specialize softmax(scores + additive_mask), a pattern commonly found in transformer models.
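As a rough illustration of the formula above, the following NumPy sketch computes softmax(scores + additive_mask) over the last axis. It is not the operator's implementation: the function name `bias_softmax_reference`, the attention-style shapes, and the use of ordinary NumPy broadcasting in place of the operator's own broadcast rule are all assumptions made for the example.

```python
import numpy as np

def bias_softmax_reference(scores, bias):
    """NumPy sketch of softmax(scores + bias) over the last axis (hypothetical helper)."""
    x = scores + bias  # NumPy broadcasting stands in for the operator's bias broadcast
    x = x - x.max(axis=-1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

# Assumed attention-style shapes: batch=2, heads=3, query length=4, key length=4
rng = np.random.default_rng(0)
scores = rng.standard_normal((2, 3, 4, 4)).astype(np.float32)

# Additive mask: 0 keeps a key position, a large negative value suppresses it
mask = np.zeros((2, 1, 1, 4), dtype=np.float32)
mask[:, :, :, -1] = -1e4  # mask out the last key position

y = bias_softmax_reference(scores, mask)
```

Each row of `y` along the last axis sums to 1, and the masked key position receives a weight that is effectively zero.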
Attributes
axis: apply softmax to the elements of dimensions axis or higher. Default value is ?.
is_inner_broadcast (required): true if bias is broadcast across input for dimensions broadcast_axis to axis - 1; otherwise bias is broadcast across input for dimensions 0 to broadcast_axis - 1. Default value is ?.
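One plausible reading of "dimensions axis or higher" is the convention used by older versions of the ONNX Softmax operator, where all dimensions from axis onward are flattened into a single dimension and normalized together. The sketch below illustrates that reading only; the helper name `softmax_from_axis` is hypothetical, and the bias input and the is_inner_broadcast attribute are deliberately not modelled.

```python
import numpy as np

def softmax_from_axis(x, axis):
    """Softmax over all dimensions >= axis, treated as one flattened dimension."""
    axis = axis % x.ndim                    # allow negative axis values
    rows = int(np.prod(x.shape[:axis]))     # product of the leading dimensions (1 if none)
    flat = x.reshape(rows, -1)              # shape: (rows, product of trailing dimensions)
    flat = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(flat)
    return (e / e.sum(axis=1, keepdims=True)).reshape(x.shape)

x = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
y_axis1 = softmax_from_axis(x, axis=1)  # each of the 2 blocks of 3*4 elements sums to 1
y_axis2 = softmax_from_axis(x, axis=2)  # each of the 6 rows of 4 elements sums to 1
```

Under this reading, a smaller axis normalizes over a larger group of elements, so `y_axis1` and `y_axis2` differ even though the input is the same.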
Inputs
data (heterogeneous) - T: The input data as Tensor.
bias (heterogeneous) - T: The bias (or mask) as Tensor.
Outputs
output (heterogeneous) - T: The output.
Examples