Normalizing a vector using logarithms to avoid overflow

Posted 2024-08-24 01:41:14

Problem with arithmetic using logarithms to avoid numerical underflow (take 2)

Having seen the question above, and having seen softmax normalization, I was trying to normalize a vector while avoiding overflow.

That is, if I have an array
x[1], x[2], x[3], x[4], ..., x[n]

the normalized form, for me, has the sum of squares of its elements equal to 1.0,
and is obtained by dividing each element by
sqrt(x[1]*x[1] + x[2]*x[2] + ... + x[n]*x[n])

Now, the sum of squares can overflow even if its square root is small enough to fit into a floating-point variable, so I imagined one could do something like
s = (2*log(fabs(x[1])) + 2*log(fabs(x[2])) + ... + 2*log(fabs(x[n]))) / 2

and calculate the elements as

exp(log(fabs(x[1])) - s), ..., exp(log(fabs(x[n])) - s)

BUT

the above is incorrect, because log(A+B) is not log(A)+log(B). Is there a way to do vector normalization that avoids overflow better?
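
To make the problem concrete, here is a minimal Python sketch (the language and the sample values are assumptions chosen for illustration, not part of the original question). It shows the naive sum of squares overflowing to infinity even though the true norm, 5e200, fits in a double, and it shows that the sum of logs equals the log of the product of the elements rather than the log of the norm.

```python
import math

# Illustrative values: the true L2 norm is 5e200, which fits in a double.
x = [3e200, 4e200]

# Naive approach: each square exceeds the largest double (about 1.8e308),
# so the products become inf and the square root is inf as well.
naive_norm = math.sqrt(sum(v * v for v in x))

# The attempted log-domain shortcut from the question.
s = sum(2 * math.log(abs(v)) for v in x) / 2

# Summing logs gives the log of the PRODUCT of the elements...
log_of_product = math.log(abs(x[0])) + math.log(abs(x[1]))

# ...which is far from the log of the true norm.
log_of_true_norm = math.log(5e200)

print(naive_norm)                    # inf
print(s - log_of_product)            # essentially zero
print(s - log_of_true_norm)          # a large gap, several hundred
```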

Comments (3)

北城半夏 2024-08-31 01:41:14

Instead of

norm = sqrt(x[1] * x[1] + ... + x[n] * x[n])

you might want to divide the elements of the vector by the largest absolute value before squaring:

max_x = max(fabs(x[1]), ..., fabs(x[n]))
y[1] = x[1] / max_x / n
...
y[n] = x[n] / max_x / n
norm = n * sqrt(y[1] * y[1] + ... + y[n] * y[n]) * max_x

The norm of the y vector is then less than or equal to one, so its sum of squares cannot overflow (this assumes max_x is nonzero). The product n * max_x could still overflow, so you need to be careful there too: evaluate the last line left to right as written, multiplying by max_x last, so that the final multiplication can only overflow if the norm itself is too large to represent.
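
As a short check of the scaling above, here is the algebra written out. The symbol m stands for max_x and is assumed to be the largest absolute value and nonzero; that assumption is mine, not stated in the answer.

```latex
y_i = \frac{x_i}{m\,n}, \qquad m = \max_i |x_i| > 0

\|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}
        = \sqrt{\sum_{i=1}^{n} (m\,n\,y_i)^2}
        = n\,m\,\sqrt{\sum_{i=1}^{n} y_i^2}

|y_i| \le \frac{1}{n}
  \;\Longrightarrow\; \sum_{i=1}^{n} y_i^2 \le n \cdot \frac{1}{n^2} = \frac{1}{n}
  \;\Longrightarrow\; \|y\|_2 \le \frac{1}{\sqrt{n}} \le 1
```

So the rescaled sum of squares is bounded by 1/n, and the original norm is recovered exactly by multiplying back by n and m.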

盗梦空间 2024-08-31 01:41:14

You seem to be making the assumption that:

log(x^2 + y^2)

is the same as:

log(x^2) + log(y^2)

However, this isn't correct, as you can't simplify the logarithm of a sum like that. The sum of the two logarithms is the logarithm of the product, log(x^2 * y^2), not of the sum.
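
If you do want to stay in the log domain, the usual way to take the logarithm of a sum is the log-sum-exp trick that softmax implementations rely on: factor out the largest term so that every exponent is at most zero. The sketch below is not from the answer above; it is written in Python as an assumption, and the function names log_l2_norm and normalize_via_logs are hypothetical.

```python
import math

def log_l2_norm(x):
    """Return log(||x||_2), computed from log|x_i| via log-sum-exp."""
    # a_i = log(x_i^2) = 2*log|x_i|; zeros contribute nothing to the sum.
    logs_sq = [2 * math.log(abs(v)) for v in x if v != 0.0]
    if not logs_sq:
        return -math.inf  # zero vector: log(0)
    m = max(logs_sq)
    # log(sum(exp(a_i))) = m + log(sum(exp(a_i - m))); each exponent is <= 0,
    # so exp() can underflow harmlessly to 0 but can never overflow.
    log_sum = m + math.log(sum(math.exp(a - m) for a in logs_sq))
    return 0.5 * log_sum  # the square root becomes a factor of 1/2

def normalize_via_logs(x):
    """Return x scaled to unit L2 norm, working in the log domain."""
    log_norm = log_l2_norm(x)
    result = []
    for v in x:
        if v == 0.0:
            result.append(0.0)
        else:
            magnitude = math.exp(math.log(abs(v)) - log_norm)
            result.append(math.copysign(magnitude, v))  # restore the sign
    return result

print(normalize_via_logs([-3e200, 4e200]))  # approximately [-0.6, 0.8]
```

Note that the sign of each element has to be restored separately, since the logarithm only sees absolute values.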

孤凫 2024-08-31 01:41:14

KennyTM is correct - your ideas about logarithms are wrong.

You can't compute the L2 norm directly, because that requires calculating the magnitude of the vector from the raw sum of squares, which is exactly where you're having overflow issues.

Perhaps normalizing by the L-infinity norm, where you first divide each component of the vector by the largest absolute value among the components, will be better. Be sure to hang onto that max absolute value so you can get the right magnitude back.

I understand completely that you need the L2 norm, but if overflow is indeed an issue you'll need to take intermediate steps to get it:

  1. Find the max absolute value of the components of the vector.
  2. Divide each component by the max absolute value to normalize; the largest component is now +/- 1.
  3. Calculate the square root of the sum of squares of the normalized components. I'd recommend sorting the squares and adding them in ascending order to make sure that small components aren't lost.
  4. Multiply by the max absolute value to get the L2 norm of the original vector.
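
The four steps above can be sketched as follows. This is a minimal Python illustration under my own assumptions: the function name scaled_norm_and_unit is hypothetical, the input is a list of finite floats, and NaN or infinite entries are not handled.

```python
import math

def scaled_norm_and_unit(x):
    """Return (L2 norm of x, x scaled to unit L2 norm) without squaring raw values."""
    # Step 1: largest absolute value (0.0 for an empty or all-zero vector).
    max_abs = max((abs(v) for v in x), default=0.0)
    if max_abs == 0.0:
        return 0.0, list(x)  # a zero vector cannot be normalized

    # Step 2: scale so that the largest component is +/- 1.
    scaled = [v / max_abs for v in x]

    # Step 3: sum the squares in ascending order, then take the square root.
    # Every square is at most 1, so the sum is at most len(x).
    scaled_norm = math.sqrt(sum(sorted(s * s for s in scaled)))

    # The unit vector comes from the scaled values, so the large norm is
    # never needed to normalize.
    unit = [s / scaled_norm for s in scaled]

    # Step 4: undo the scaling to get the norm of the original vector.
    return max_abs * scaled_norm, unit

norm, unit = scaled_norm_and_unit([3e200, 4e200])
print(norm)  # approximately 5e200
print(unit)  # approximately [0.6, 0.8]
```

The same scaling also protects against underflow for very small components, for example [3e-200, 4e-200], where the raw squares would round to zero. In Python 3.8 and later, math.hypot(*x) accepts any number of coordinates and is a standard-library alternative for the norm itself.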