Dealing with very small numbers in R

Posted on 2024-11-03 10:50:55

I need to calculate a list of very small numbers such as

(0.1)^1000, 0.2^(1200),

and then normalize them so they will sum up to one
i.e.

a1 = 0.1^1000,
a2 = 0.2^1200

And I want to calculate
a1' = a1/(a1+a2),
a2' = a2/(a1+a2).

I'm running into underflow problems, as I get a1 = 0. How can I get around this?
Theoretically I could work with logs, and then log(a1) = 1000*log(0.1) would be a way to represent a1 without underflow problems. But in order to normalize I would need
log(a1+a2), which I can't compute since I can't represent a1 directly.

I'm programming in R. As far as I can tell, it has no data type like Decimal in C# that
offers better than double precision.

Any suggestions will be appreciated, thanks


Comments (4)

鲜血染红嫁衣 2024-11-10 10:50:55

Mathematically speaking, one of those numbers will be approximately zero and the other approximately one. The difference between your numbers is huge, so I'm even wondering whether this makes sense.

But to do this in general, you can use the idea behind the logspace_add C function that sits under the hood of R. Given lx = log(x) and ly = log(y), one can define logxpy ( = log(x+y) ) as:

logxpy <- function(lx,ly) max(lx,ly) + log1p(exp(-abs(lx-ly)))

This means we can use:

> la1 <- 1000*log(0.1)
> la2 <- 1200*log(0.2)

> exp(la1 - logxpy(la1,la2))
[1] 5.807714e-162

> exp(la2 - logxpy(la1,la2))
[1] 1

This function can also be applied repeatedly if you have more than two numbers. Mind you, the second result is still exactly 1, and not 1 minus 5.807...e-162, because a double cannot represent that difference. If you really need more precision and your platform supports long double types, you could code everything in, e.g., C or C++ and return the results afterwards. But if I'm right, R can, for the moment, only deal with normal doubles, so ultimately you'll lose the precision again when the result is shown.
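
For more than two numbers, the same idea can be written in vectorized form instead of chaining logxpy() calls. The following is only a sketch in base R; the function name normalize_log_weights and the third weight are made up for illustration and are not part of the original question.

```r
# Normalize weights that are given on the log scale so they sum to 1.
# log_w: numeric vector of log-weights, e.g. 1000 * log(0.1).
# (normalize_log_weights is a hypothetical helper name.)
normalize_log_weights <- function(log_w) {
  m <- max(log_w)                             # largest log-weight
  log_total <- m + log(sum(exp(log_w - m)))   # log of the sum, computed stably
  exp(log_w - log_total)                      # normalized weights
}

# The two values from the question plus one invented value.
log_w <- c(1000 * log(0.1), 1200 * log(0.2), 1100 * log(0.15))
w <- normalize_log_weights(log_w)
print(w)
print(sum(w))
```

Subtracting the maximum first means every argument passed to exp() is at most 0, so nothing overflows, and the dominant term contributes exactly 1 to the sum.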


EDIT:

To spell out the math:

log(x+y) = log(exp(lx) + exp(ly))
         = log( exp(lx) * (1 + exp(ly-lx)) )
         = lx + log( 1 + exp(ly-lx) )

Now take the larger of the two as lx, and you arrive at the expression in logxpy().
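
As a quick sanity check, the identity can be compared against a direct computation for values that are small enough not to underflow. This is only an illustrative sketch; x = 3 and y = 4 are arbitrary choices.

```r
# logxpy() as defined in the answer: log(x + y) from lx = log(x), ly = log(y).
logxpy <- function(lx, ly) max(lx, ly) + log1p(exp(-abs(lx - ly)))

x <- 3
y <- 4
direct   <- log(x + y)             # computed on the ordinary scale
via_logs <- logxpy(log(x), log(y)) # computed from the logs only

# Both routes should agree up to floating-point rounding.
print(all.equal(direct, via_logs))
```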

EDIT 2: Why take the maximum? To make sure that the argument of exp() is never positive. If the difference inside exp() is large and positive, exp() overflows to Inf, which is not a correct result. If the difference is large and negative, exp() underflows to 0, which gives a far better result:

Say lx=1 and ly=1000, then:

> 1+log1p(exp(1000-1))
[1] Inf
> 1000+log1p(exp(1-1000))
[1] 1000
影子的影子 2024-11-10 10:50:55

The Brobdingnag package deals with very large or small numbers, essentially wrapping Joris's answer into a convenient form.

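library(Brobdingnag)  # provides as.brob()

a1 <- as.brob(0.1)^1000
a2 <- as.brob(0.2)^1200
# normalize so that a1_dash + a2_dash = 1
a1_dash <- a1 / (a1 + a2)
a2_dash <- a2 / (a1 + a2)
# convert back to ordinary doubles
as.numeric(a1_dash)
as.numeric(a2_dash)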
魂牵梦绕锁你心扉 2024-11-10 10:50:55

Try the arbitrary precision packages:

  • Rmpfr "R MPFR - Multiple Precision Floating-Point Reliable"
  • Ryacas "R Interface to the 'Yacas' Computer Algebra System" - may also be able to do arbitrary precision.
素手挽清风 2024-11-10 10:50:55

Maybe you can treat a1 and a2 as fractions. In your example, with

a1 = (a1num/a1denom)^1000  # 1/10
a2 = (a2num/a2denom)^1200  # 1/5

you would arrive at

a1' = (a1num^1000 * a2denom^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)
a2' = (a1denom^1000 * a2num^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)

which can be computed using the gmp package:

library(gmp)
a1 <- as.double(pow.bigz(5,1200) / (pow.bigz(5,1200)+ pow.bigz(10,1000)))
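
The same ratio can be cross-checked in base R on the log10 scale, without any big-integer arithmetic. This is a sketch rather than part of the original answer; the variable names are chosen here for illustration.

```r
# r = 5^1200 / 10^1000, kept as log10(r) so nothing overflows
log10_r <- 1200 * log10(5) - 1000

# a1' = r / (1 + r), hence log10(a1') = log10(r) - log10(1 + r)
log10_a1_dash <- log10_r - log1p(10^log10_r) / log(10)

a1_dash <- 10^log10_a1_dash
print(a1_dash)
```

The result should agree with the value obtained from the logxpy() approach in the first answer.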