using LinearAlgebra   # diagm is used in gradient_separated below

function softmax(z)
    z = z .- maximum(z)   # shift by the max for numerical stability; the output is unchanged
    o = exp.(z)           # element-wise exponential of the logits
    return o / sum(o)     # normalise so the outputs sum to 1
end
# Gradient of the loss f = -log(o[y]) with respect to the logits z,
# computed with the softmax and the loss differentiated together.
function gradient_together(z, y)
    o = softmax(z)
    o[y] -= 1.0   # the combined gradient is softmax(z) with 1 subtracted at the true class y
    return o
end
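To see why subtracting 1 at index y gives the whole gradient, assume the loss is the cross-entropy f = -log(o[y]), which is exactly what the ∂f_∂o term in the separated version below encodes. Then the chain rule collapses:

\[
\frac{\partial f}{\partial z_i}
  = \sum_j \frac{\partial f}{\partial o_j}\,\frac{\partial o_j}{\partial z_i}
  = -\frac{1}{o_y}\, o_y\!\left(\mathbf{1}[i = y] - o_i\right)
  = o_i - \mathbf{1}[i = y],
\]

which is the vector returned above: the softmax output with 1 subtracted at the true class.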
# The same gradient computed the long way: build the full softmax Jacobian and the
# gradient of the loss with respect to o, then apply the chain rule.
function gradient_separated(z, y)
    o = softmax(z)
    ∂o_∂z = diagm(o) - o * o'   # softmax Jacobian: ∂o_i/∂z_j = o_i(1[i = j] - o_j)
    ∂f_∂o = zeros(size(o))
    ∂f_∂o[y] = -1.0 / o[y]      # gradient of f = -log(o[y]) with respect to o
    return ∂o_∂z * ∂f_∂o        # chain rule (the Jacobian is symmetric, so no transpose is needed)
end
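As a quick sanity check (a minimal sketch with arbitrary test values), the two implementations should agree up to floating-point round-off:

z = [1.0, 2.0, 3.0]
y = 2
@assert gradient_together(z, y) ≈ gradient_separated(z, y)   # both give softmax(z) - onehot(y)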
The reason for doing this is that an explicit softmax layer is only needed at inference time. During training, the loss can be computed directly from the logits, with the softmax folded into it, so no separate softmax pass is required. This reduces the number of computations.
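Concretely, the cross-entropy loss can be computed straight from the logits via the log-sum-exp identity. A minimal sketch (the name cross_entropy_from_logits is mine, not part of the snippet above):

# f = -log(softmax(z)[y]) = -z[y] + log(sum(exp(z)))
function cross_entropy_from_logits(z, y)
    m = maximum(z)   # shift by the max for numerical stability
    return -(z[y] - m) + log(sum(exp.(z .- m)))
end

Its gradient with respect to z is the same softmax(z) - onehot(y) vector that gradient_together computes.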