com.microsoft - BiasSoftmax#
BiasSoftmax - 1 (com.microsoft)#
Version
name: BiasSoftmax (GitHub)
domain: com.microsoft
since_version: 1
function:
support_level:
shape inference:
This version of the operator has been available since version 1 of domain com.microsoft.
Summary
Y = softmax(scores + bias)) with simple broadcast on bias. Intended to specialize softmax(scores + additive_mask) commonly found in transformer models.
Attributes
broadcast_axis: broadcast bias across input for dimensions broadcast_axis to softmax_axis-1 Default value is
?
.softmax_axis: apply softmax to elements for dimensions softmax_axis or higher Default value is
?
.
Inputs
data (heterogeneous) - T: The input data as Tensor.
bias (heterogeneous) - T: The bias (or mask) as Tensor.
Outputs
output (heterogeneous) - T: The output.
Examples