Normalizing a vector using logarithms to avoid overflow
Problem with arithmetic using logarithms to avoid numerical underflow (take 2)
Having seen the above and having seen softmax normalization, I was trying to normalize a vector while avoiding overflow.
That is, if I have an array x[1], x[2], x[3], x[4], ... , x[n],
the normalized form, for me, has the sum of squares of its elements equal to 1.0
and is obtained by dividing each element by sqrt(x[1]*x[1]+x[2]*x[2]+...+x[n]*x[n]).
Now, the sum of squares can overflow even if the square root is small enough to fit into a floating-point variable, so I imagined one could do something like s=(2*log(fabs(x[1]))+2*log(fabs(x[2]))+...+2*log(fabs(x[n])))/2
and calculate the elements as
exp(log(fabs(x[1]))-s), ..., exp(log(fabs(x[n]))-s)
BUT
the above is incorrect, as log(A+B) is not log(A)+log(B). Is there a way to do vector normalization that avoids overflow better?
Instead, you might want to divide the elements of the vector by the maximum possible value before squaring them.
The norm of the scaled vector, y, should then be less than or equal to one. The value of n * max_x could still overflow, so you need to be careful there, too, that the operations are executed in a non-overflowing order.
You seem to be assuming that the logarithm of a sum is the same as the sum of the logarithms.
However, this isn't correct: you can't simplify the logarithm of a sum like that.
KennyTM is correct: your ideas about logarithms are wrong.
You can't compute the L2 norm directly, because it requires calculating the magnitude of the vector, which is exactly where you're having overflow issues.
Perhaps the L-infinity norm, where you first divide each component of the vector by the maximum absolute value of any component, will be better. Be sure to hang on to that maximum absolute value so you can get the right magnitude back.
I understand completely that you need the L2 norm, but if overflow is indeed an issue, you'll need to take intermediate steps to get it.