I'm trying to apply a softmax function on some rows of a tensor, but the problem is that some of my rows have all -inf values. As such, softmax on these rows outputs NaN, which causes problems later in the model.
So I want to create a function that applies softmax to a row unless it is all -inf, in which case it outputs a zero vector. Is there an easy way to do this?
Would something like setting all rows that are all NaN after the softmax to 0 work for you? This way you make sure that you are not overwriting any unexpected NaNs.
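A minimal sketch of that idea, assuming PyTorch (the thread does not name the framework) and that the softmax runs over the last dimension. The function name `softmax_zero_nan_rows` and the example tensor are made up for illustration:

```python
import torch

def softmax_zero_nan_rows(scores: torch.Tensor) -> torch.Tensor:
    """Softmax over the last dim; rows that come out entirely NaN become zeros."""
    probs = torch.softmax(scores, dim=-1)
    # A row whose inputs are all -inf comes out as NaN in every position.
    all_nan_rows = torch.isnan(probs).all(dim=-1, keepdim=True)
    # Overwrite only the rows that are NaN everywhere; the mask broadcasts
    # across the last dimension.
    return probs.masked_fill(all_nan_rows, 0.0)

# Hypothetical input: one ordinary row and one row of all -inf.
scores = torch.tensor([[0.0, 1.0, 2.0],
                       [float("-inf")] * 3])
out = softmax_zero_nan_rows(scores)
```

One caveat: a NaN anywhere in an input row also turns that whole softmax row into NaN, so this check alone cannot tell the two cases apart. If that matters, the mask could probably be built from the input instead, for example `(scores == float("-inf")).all(dim=-1, keepdim=True)`.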