Preserving numeric precision in R data frames?

Posted on 2024-10-09 15:55:35

When I create a data frame from numeric vectors, R seems to truncate the values to less precision than my analysis requires:

data.frame(x=0.99999996)

prints 1 (*but see Update 1)

I am stuck when fitting spline(x, y): two of the x values appear to be set to 1 due to rounding, while y still changes. I could hack around this, but I would prefer to use a standard solution if one is available.

Example

Here is an example data set

d <- data.frame(x = c(0.668732936336141, 0.95351462456867,
0.994620622127435, 0.999602102672081, 0.999987126195509, 0.999999955814133,
0.999999999999966), y = c(38.3026509783688, 11.5895099585560,
10.0443344234229, 9.86152339768516, 9.84461434575695, 9.81648333804257,
9.83306725758297))

The following solution works, but I would prefer something that is less subjective:

plot(d$x, d$y, ylim=c(0,50))
lines(spline(d$x, d$y),col='grey') #bad fit
lines(spline(d[-c(4:6),]$x, d[-c(4:6),]$y),col='red') #reasonable fit

Update 1

*Since posting this question, I have realized that this prints 1 even though the data frame still contains the original value, e.g.

> dput(data.frame(x=0.99999999996))

returns

structure(list(x = 0.99999999996), .Names = "x", row.names = c(NA, 
-1L), class = "data.frame")
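
The point of Update 1 can be checked directly. The following is a minimal sketch, assuming R's default print setting of 7 significant digits; the variable name d1 is only illustrative:

```r
# The data frame stores the full double; only the printed form is rounded.
d1 <- data.frame(x = 0.99999999996)

print(d1)                  # shows 1 under the default digits = 7
is_one <- d1$x == 1        # FALSE: the stored value is not 1
gap <- 1 - d1$x            # small but non-zero difference from 1

print(d1, digits = 12)     # ask print() for more significant digits
sprintf("%.17g", d1$x)     # format the stored double with 17 significant digits
```

Comparing against 1 or formatting with sprintf() inspects the stored value, whereas the default print method only controls how it is displayed.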

Update 2

After using dput to post this example data set, and with some pointers from Dirk, I can see that the problem is not truncation of the x values but the limits of numerical error in the model that I used to calculate y. This justifies dropping a few of the nearly equivalent data points (as in the example red line).
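
One way to make the row selection less ad hoc is to drop points whose x values lie closer together than a chosen tolerance. This is only a sketch and not the method used above: the helper thin_by_spacing and the tolerance 1e-3 are assumptions for illustration, and the choice of tolerance remains a judgment call.

```r
# Example data from the question.
d <- data.frame(
  x = c(0.668732936336141, 0.95351462456867, 0.994620622127435,
        0.999602102672081, 0.999987126195509, 0.999999955814133,
        0.999999999999966),
  y = c(38.3026509783688, 11.5895099585560, 10.0443344234229,
        9.86152339768516, 9.84461434575695, 9.81648333804257,
        9.83306725758297)
)

# Hypothetical helper: walk from the largest x downwards and keep a point
# only if it lies at least `tol` below the last point that was kept.
thin_by_spacing <- function(x, tol) {
  ord <- order(x, decreasing = TRUE)
  keep <- ord[1]
  for (i in ord[-1]) {
    if (x[keep[length(keep)]] - x[i] >= tol) keep <- c(keep, i)
  }
  sort(keep)
}

kept <- thin_by_spacing(d$x, tol = 1e-3)  # tol is an assumed value
d_thin <- d[kept, ]

plot(d$x, d$y, ylim = c(0, 50))
lines(spline(d_thin$x, d_thin$y), col = "red")
```

With this tolerance the helper keeps rows 1, 2, 3 and 7, which is the same subset as d[-c(4:6),] in the example.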

Comments (2)

给妤﹃绝世温柔 2024-10-16 15:55:35

If you really want to set up R to print its results with utterly unreasonable precision, then use: options(digits=16).

Note that this does nothing for the accuracy of functions using these results. It merely changes how values appear when they are printed to the console. Values are not rounded as they are stored or accessed unless you enter more significant digits than the mantissa can hold. The 'digits' option has no effect on the maximal precision of floating-point numbers.
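
A short sketch of that distinction, assuming nothing beyond base R (the variable names are illustrative): the digits option changes the formatted text, while the stored value and any arithmetic on it stay the same.

```r
x <- 0.99999996

old <- options(digits = 7)     # R's default; set explicitly for the demo
shown_default <- format(x)     # "1": rounded for display only

options(digits = 16)
shown_wide <- format(x)        # more digits shown, same stored value

options(old)                   # restore the caller's previous setting

unchanged <- identical(x, 0.99999996)  # TRUE: x itself was never altered
diff_from_one <- 1 - x                 # computed from the full stored value
```

Saving the return value of options() and passing it back restores the earlier setting, so the demonstration leaves the session as it found it.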

七婞 2024-10-16 15:55:35

Please re-read R FAQ 7.31 and the reference cited therein -- a really famous paper on what everybody should know about floating-point representation on computers.

The closing quote from Kernighan and Plauger is also wonderful:

10.0 times 0.1 is hardly ever 1.0.
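
As a small illustration in the spirit of R FAQ 7.31 (a sketch, assuming typical IEEE 754 double arithmetic), exact equality tests on computed doubles are fragile, and comparing within a tolerance is the usual alternative:

```r
a <- sqrt(2)^2

exact <- a == 2                       # typically FALSE on IEEE 754 hardware
approx <- isTRUE(all.equal(a, 2))     # TRUE: equal within all.equal's tolerance
manual <- abs(a - 2) < 1e-9           # explicit tolerance; 1e-9 is an assumed value

print(a - 2)                          # the tiny residual left by rounding
```

all.equal() returns a descriptive string rather than FALSE when values differ, which is why it is wrapped in isTRUE() here.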

And besides the numerical precision issue, there is of course also how R prints with fewer decimals than it uses internally:

> for (d in 4:8) print(0.99999996, digits=d)
[1] 1
[1] 1
[1] 1
[1] 1
[1] 0.99999996
> 