Dealing with very small numbers in R
I need to calculate a list of very small numbers such as
0.1^1000 and 0.2^1200,
and then normalize them so that they sum to one,
i.e.
a1 = 0.1^1000,
a2 = 0.2^1200
and I want to calculate
a1' = a1/(a1+a2),
a2' = a2/(a1+a2).
I'm running into underflow problems, as I get a1 = 0. How can I get around this?
Theoretically I could work with logs: log(a1) = 1000*log(0.1) would represent a1 without underflow problems. But in order to normalize I would need
log(a1+a2), which I can't compute since I can't represent a1 directly.
I'm programming in R. As far as I can tell, it has no data type like Decimal in C# that
gives better than double precision.
Any suggestions will be appreciated, thanks
Comments (4)
Mathematically speaking, one of those numbers will be approximately zero and the other approximately one. The difference between your numbers is huge, so I'm even wondering whether this makes sense.
But to do that in general, you can use the idea from the logspace_add C function that sits under the hood of R. With lx = log(x) and ly = log(y), one can define logxpy (= log(x+y)) as:
log(x+y) = max(lx, ly) + log1p(exp(-abs(lx - ly)))
This means the normalization can stay entirely in log space: log(a1') = log(a1) - logxpy(log(a1), log(a2)), and likewise for a2'.
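A minimal sketch of this idea in R. The body of logxpy mirrors what logspace_add does, and the variable names (la1, la2, lsum, a1n, a2n) are illustrative only:

```r
# log(x + y) computed from lx = log(x) and ly = log(y), never leaving log space.
logxpy <- function(lx, ly) max(lx, ly) + log1p(exp(-abs(lx - ly)))

la1 <- 1000 * log(0.1)    # log(a1), since a1 = 0.1^1000
la2 <- 1200 * log(0.2)    # log(a2), since a2 = 0.2^1200
lsum <- logxpy(la1, la2)  # log(a1 + a2)

a1n <- exp(la1 - lsum)    # a1 / (a1 + a2)
a2n <- exp(la2 - lsum)    # a2 / (a1 + a2)
print(c(a1n, a2n))
```

Here a2n comes out as exactly 1 and a1n as roughly 5.807e-162, because a2 is the larger of the two numbers by about 161 orders of magnitude.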
This function can also be called recursively if you have more numbers. Mind you, in double precision the larger normalized value is still exactly 1, and not 1 minus 5.807...e-162. If you really need more precision and your platform supports long double types, you could code everything in e.g. C or C++ and return the results later on. But if I'm right, R can, for the moment, only deal with normal doubles, so ultimately you'll lose the precision again when the result is shown.
EDIT: to do the math for you:
log(x+y) = log(exp(lx) + exp(ly)) = log(exp(lx) * (1 + exp(ly - lx))) = lx + log(1 + exp(ly - lx))
Now you just take the largest of the two as lx, and you arrive at the expression in logxpy().
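For more than two numbers, the recursion amounts to folding logxpy over the vector of logs. A sketch, where normalize_logs is a hypothetical helper name and the third value is added purely for illustration:

```r
logxpy <- function(lx, ly) max(lx, ly) + log1p(exp(-abs(lx - ly)))

# Hypothetical helper: given lv = log(values), return values / sum(values).
normalize_logs <- function(lv) {
  lsum <- Reduce(logxpy, lv)  # log(sum(exp(lv))), combined two terms at a time
  exp(lv - lsum)
}

lv <- c(1000 * log(0.1), 1200 * log(0.2), 1200 * log(0.2))
w <- normalize_logs(lv)
print(w)
```

With two equal dominant terms, each of them gets a weight of about 0.5 and the tiny first term stays a positive, representable number instead of collapsing to 0.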
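EDIT 2: Why take the maximum then? Easy: to ensure that the argument of exp() is negative. If the smaller log is factored out, the difference inside exp() is positive, and once it gets too big exp() returns Inf. That's not a correct result. With the larger log factored out, exp() at worst underflows to 0, which gives a far better result.
Say the two logs are 1 and 1000: factoring out 1 means computing exp(999), which is Inf, while factoring out 1000 means computing exp(-999), which is 0, so the result is 1000.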
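The two orderings can be checked directly in R. This is a small sketch; the names naive and safe are illustrative:

```r
lx <- 1
ly <- 1000

# Factoring out the smaller log: exp(999) overflows, so the sum is Inf.
naive <- lx + log1p(exp(ly - lx))

# Factoring out the larger log: exp(-999) underflows to 0, so the sum is 1000.
safe <- ly + log1p(exp(lx - ly))

print(c(naive, safe))
```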
The Brobdingnag package deals with very large or small numbers, essentially wrapping Joris's answer into a convenient form.
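A short sketch of how this might look with Brobdingnag, assuming the package is installed. as.brob() converts a double to the package's log-based representation; the variable names are illustrative:

```r
library(Brobdingnag)

a1 <- as.brob(0.1)^1000   # stored via its logarithm, so no underflow
a2 <- as.brob(0.2)^1200
total <- a1 + a2

a1n <- as.numeric(a1 / total)  # back to an ordinary double
a2n <- as.numeric(a2 / total)
print(c(a1n, a2n))
```

The normalized values are ordinary doubles again, so a2n is still indistinguishable from 1.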
Try the arbitrary precision packages:
Rmpfr, "R MPFR - Multiple Precision Floating-Point Reliable"
Ryacas, "R Interface to the 'Yacas' Computer Algebra System", which may also be able to do arbitrary precision.
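Maybe you can treat a1 and a2 as fractions. In your example, with
a1 = (1/10)^1000, a2 = (1/5)^1200
you would arrive at
a1' = 5^1200 / (5^1200 + 10^1000), a2' = 10^1000 / (5^1200 + 10^1000)
which can be computed exactly with the big integers and rationals of the gmp package, converting to double only at the end.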
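A sketch of the fraction approach, assuming the gmp package is installed. It builds the numerator and denominator as big integers and forms exact rationals with as.bigq(); the names p, q, a1n and a2n are illustrative:

```r
library(gmp)

p <- as.bigz(5)^1200    # 5^1200 as an exact big integer
q <- as.bigz(10)^1000   # 10^1000 as an exact big integer

a1n <- as.bigq(p, p + q)  # a1' = 5^1200 / (5^1200 + 10^1000), exact rational
a2n <- as.bigq(q, p + q)  # a2' = 10^1000 / (5^1200 + 10^1000), exact rational

# Only the final conversion to double loses precision.
print(c(as.double(a1n), as.double(a2n)))
```

The rationals themselves sum to exactly 1. After conversion, a1' is about 5.807e-162 and a2' is 1 to within double precision.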