Handling very small numbers in R

Posted 2024-11-03 10:50:55

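I need to compute a list of very small numbers, such as

(0.1)^1000, 0.2^1200,

and then normalize them so that they sum to 1, i.e. with

a1 = 0.1^1000, a2 = 0.2^1200

I want to compute a1' = a1/(a1+a2), a2' = a2/(a1+a2).

I run into underflow problems: I get a1 = 0. How can I get around this? In theory I could work with logs, and then log(a1) = 1000*log(0.1) would be a way to represent a1 without underflow. But to normalize I would need log(a1+a2), which I can't compute because I can't represent a1 directly.

I'm programming in R. As far as I can tell, it has no data type like Decimal in C# that gives better than double precision.

Any suggestions would be appreciated, thanks.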

鲜血染红嫁衣 2024-11-10 10:50:55

Mathematically speaking, one of those numbers will be approximately zero and the other approximately one. The difference between your numbers is huge, so I'm even wondering whether this makes sense.

But to do this in general, you can use the idea behind the logspace_add C function that R uses under the hood. With lx = log(x) and ly = log(y), one can define logxpy ( = log(x+y) ) as:

logxpy <- function(lx,ly) max(lx,ly) + log1p(exp(-abs(lx-ly)))

Which means that we can use :

> la1 <- 1000*log(0.1)
> la2 <- 1200*log(0.2)

> exp(la1 - logxpy(la1,la2))
[1] 5.807714e-162

> exp(la2 - logxpy(la1,la2))
[1] 1

This function can also be applied repeatedly if you have more numbers. Mind you, the second result is still exactly 1, and not 1 minus 5.807...e-162, because a double cannot represent that difference. If you really need more precision and your platform supports long double types, you could code everything in e.g. C or C++ and return the results later on. But if I'm right, R can, for the moment, only deal with normal doubles, so ultimately you'll lose the precision again when the result is shown.
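One way to keep the information that is lost when the second weight rounds to 1 is to stay in log space for that weight as well. The sketch below is an illustration, not part of the original answer: it reuses la1 and la2 from above, assumes la1 < la2 (true for this example), and uses only base R's log1p() and expm1(). The names log_a2_norm and one_minus_a2 are made up for the example.

```r
# Log-values from the example in this answer
la1 <- 1000 * log(0.1)
la2 <- 1200 * log(0.2)

# log(a2') = log(a2 / (a1 + a2)) = -log(1 + a1/a2) = -log1p(exp(la1 - la2))
# Written this way it is only valid when la1 <= la2, which holds here.
log_a2_norm <- -log1p(exp(la1 - la2))

# The shortfall 1 - a2' can be recovered without cancellation:
# 1 - exp(z) = -expm1(z)
one_minus_a2 <- -expm1(log_a2_norm)

print(log_a2_norm)   # a tiny negative number rather than exactly 0
print(one_minus_a2)  # equals a1', since a1' + a2' = 1
```

The point of the design is that log(a2') is a tiny number near zero, which a double represents well, whereas a2' itself sits next to 1, where doubles have no resolution at this scale.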

EDIT:

To do the math for you:

log(x+y) = log(exp(lx)+exp(ly))
         = log( exp(lx) * (1 + exp(ly-lx)) )
         = lx + log( 1 + exp(ly-lx) )

Now you just take the larger of the two as lx, and you arrive at the expression in logxpy().

EDIT 2: Why take the maximum? To make sure the argument passed to exp() is never positive. If lx-ly gets too big, exp(lx-ly) overflows to Inf, which is not a correct result. exp(ly-lx) underflows to 0 instead, which gives a far better result:

Say lx=1 and ly=1000, then:

> 1+log1p(exp(1000-1))
[1] Inf
> 1000+log1p(exp(1-1000))
[1] 1000
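For more than two numbers, the same "subtract the maximum first" idea can be applied to a whole vector at once instead of calling logxpy() pair by pair. The following is a minimal base-R sketch under that assumption; log_sum_exp and normalize_log_weights are hypothetical helper names, the third value in lw is invented purely for illustration, and the code assumes at least one element of the input is finite.

```r
# Hypothetical helper: log(sum(exp(lx))) without overflow or total underflow.
# Assumes lx contains at least one finite value.
log_sum_exp <- function(lx) {
  m <- max(lx)                 # largest log-value
  m + log(sum(exp(lx - m)))    # every exp() argument is <= 0
}

# Hypothetical helper: normalized weights a_i / sum(a) from log(a_i)
normalize_log_weights <- function(lx) exp(lx - log_sum_exp(lx))

# The two values from the question plus one made-up extra term
lw <- c(1000 * log(0.1), 1200 * log(0.2), 1190 * log(0.2))
w <- normalize_log_weights(lw)

print(w)
print(sum(w))
```

The ratio of the second to the third weight should match 0.2^10, since their logs differ by 10*log(0.2); the first weight stays tiny but nonzero.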

影子的影子 2024-11-10 10:50:55

The Brobdingnag package deals with very large or very small numbers, essentially wrapping Joris's answer (the log-space approach above) into a convenient form.

library(Brobdingnag)
a1 <- as.brob(0.1)^1000
a2 <- as.brob(0.2)^1200
a1_dash <- a1 / (a1 + a2)
a2_dash <- a2 / (a1 + a2)
# convert back to ordinary doubles only after normalizing
as.numeric(a1_dash)
as.numeric(a2_dash)

魂牵梦绕锁你心扉 2024-11-10 10:50:55

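Try the arbitrary-precision packages:

Rmpfr "R MPFR - Multiple Precision Floating-Point Reliable"
Ryacas "R Interface to the 'Yacas' Computer Algebra System" - may also be able to do arbitrary precision.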

素手挽清风 2024-11-10 10:50:55

Maybe you can treat a1 and a2 as fractions. In your example, with

a1 = (a1num/a1denom)^1000  # 1/10
a2 = (a2num/a2denom)^1200  # 1/5

you would arrive at

a1' = (a1num^1000 * a2denom^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)
a2' = (a1denom^1000 * a2num^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)

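which can be computed exactly with the gmp package, converting to a double only at the end:

library(gmp)
# a1' for a1num = a2num = 1, a1denom = 10, a2denom = 5
a1 <- as.double(pow.bigz(5,1200) / (pow.bigz(5,1200)+ pow.bigz(10,1000)))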
