Controlling the number of decimal digits in print output in R

Posted on 2024-08-21 19:53:30

There is an option in R to get control over digit display. For example:

options(digits=10)

is supposed to display calculation results with 10 digits until the end of the R session. In R's help file, the digits option is defined as follows:

digits: controls the number of digits
to print when printing numeric values.
It is a suggestion only. Valid values
are 1...22 with default 7

So, it says this is a suggestion only. What if I want to always display exactly 10 digits, no more and no fewer?

My second question: what if I want to display more than 22 digits, i.e. for more precise calculations with, say, 100 digits? Is that possible in base R, or do I need an additional package or function?

Edit: Thanks to jmoy's suggestion, I tried sprintf("%.100f",pi) and it gave

[1] "3.1415926535897931159979634685441851615905761718750000000000000000000000000000000000000000000000000000"

which has 48 decimals. Is this the maximum limit R can handle?

Comments (4)

药祭#氼 2024-08-28 19:53:30

If you are producing the entire output yourself, you can use sprintf(), e.g.

> sprintf("%.10f",0.25)
[1] "0.2500000000"

specifies that you want to format a floating point number with ten decimal places (in %.10f, the f requests fixed-point notation and the .10 specifies ten digits after the decimal point).

I don't know of any way of forcing R's higher level functions to print an exact number of digits.

Displaying 100 digits does not make sense if you are printing R's usual numbers, since the best accuracy you can get using 64-bit doubles is around 16 decimal digits (look at .Machine$double.eps on your system). The remaining digits will just be junk.
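
As a minimal sketch of the point above, here are three base R calls that return a string with exactly ten decimals, followed by a look at what happens beyond double precision. The variable name x is only illustrative.

```r
x <- 0.25

# Each of these returns a character string with ten decimal places
s1 <- sprintf("%.10f", x)                     # "0.2500000000"
s2 <- formatC(x, format = "f", digits = 10)   # "0.2500000000"
s3 <- format(x, nsmall = 10)                  # "0.2500000000"

# Spacing of doubles near 1; roughly 16 significant decimal digits
print(.Machine$double.eps)

# Digits beyond that reflect the binary representation, not the value 0.1
print(sprintf("%.25f", 0.1))
```

Note that all three results are character strings, so they are meant for display rather than further arithmetic.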

∞琼窗梦回ˉ 2024-08-28 19:53:30

The reason it is only a suggestion is that you could quite easily write a print function that ignored the options value. The built-in printing and formatting functions do use the options value as a default.
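
A short sketch of how the option acts as a default rather than a guarantee, assuming a fresh session with standard settings. The names old, pi_default, pi_short and half are illustrative.

```r
old <- options(digits = 10)           # set the default; keep the previous value

pi_default <- format(pi)              # "3.141592654": uses the option
pi_short   <- format(pi, digits = 3)  # "3.14": a per-call argument overrides it
half       <- format(0.5)             # "0.5": digits is an upper bound on
                                      # significant digits, nothing is padded

options(old)                          # restore the previous setting
```

This is why options(digits = 10) alone cannot force exactly ten digits: values that need fewer significant digits are printed with fewer.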

As to the second question, since R uses finite precision arithmetic, your answers aren't accurate beyond 15 or 16 significant digits, so in general, more aren't required. The gmp and rcdd packages deal with multiple precision arithmetic (via an interface to the gmp library), but this is mostly related to big integers rather than more decimal places for your doubles.

Mathematica or Maple will allow you to give as many decimal places as your heart desires.

EDIT:
It might be useful to think about the difference between decimal places and significant figures. If you are doing statistical tests that rely on differences beyond the 15th significant figure, then your analysis is almost certainly junk.

On the other hand, if you are just dealing with very small numbers, that is less of a problem, since R can handle numbers as small as .Machine$double.xmin (usually about 2.2e-308).

Compare these two analyses.

x1 <- rnorm(50, 1, 1e-15)
y1 <- rnorm(50, 1 + 1e-15, 1e-15)
t.test(x1, y1)  #Should throw an error

x2 <- rnorm(50, 0, 1e-15)
y2 <- rnorm(50, 1e-15, 1e-15)
t.test(x2, y2)  #ok

In the first case, differences between numbers only occur after many significant figures, so the data are "nearly constant". In the second case, although the differences between numbers are the same size, they are large compared to the magnitude of the numbers themselves.
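
The distinction between decimal places and significant figures can be sketched in a few lines of base R. The variable names are illustrative.

```r
x <- 123.456789

dp <- round(x, 2)    # 123.46: two decimal places
sf <- signif(x, 2)   # 120: two significant figures

# A double carries about 16 significant digits regardless of magnitude
lost_near_one  <- (1 + 1e-16) == 1   # TRUE: the increment vanishes next to 1
kept_near_zero <- (0 + 1e-16) == 0   # FALSE: the same increment survives near 0
```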


As mentioned by e3bo, you can use multiple-precision floating point numbers using the Rmpfr package.

library(Rmpfr)
mpfr("3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825")

These are slower and more memory intensive to use than regular (double precision) numeric vectors, but can be useful if you have a poorly conditioned problem or unstable algorithm.
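
For the 100-digit case from the question, a hedged sketch with Rmpfr might look like the following. It assumes the Rmpfr package is installed; the 350-bit precision is an assumption chosen because one decimal digit needs about 3.32 bits, and pi_100 is an illustrative name.

```r
if (requireNamespace("Rmpfr", quietly = TRUE)) {
  library(Rmpfr)

  # pi computed at 350 bits of precision, enough for roughly 100 decimal digits
  pi_100 <- Const("pi", prec = 350)

  # format() on an mpfr number takes the number of significant digits to show
  print(format(pi_100, digits = 100))
}
```

Unlike sprintf("%.100f", pi), which only expands the 64-bit double, the digits printed here come from a value that was actually computed at the higher precision.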

全部不再 2024-08-28 19:53:30

Here is one more solution, which lets you control how many decimal digits are printed as needed (useful if you don't want redundant zeros).

For example, suppose you have a vector called elements and want its sum:

elements <- c(-1e-05, -2e-04, -3e-03, -4e-02, -5e-01, -6e+00, -7e+01, -8e+02)
sum(elements)
## -876.5432

The final digit 1 is missing because R prints 7 significant digits by default; the full result is -876.54321. A fixed format such as sprintf("%.10f", sum(elements)) goes too far the other way and pads the result with redundant zeros: -876.5432100000.

Following a tutorial on printing decimal numbers: if you know how many decimal digits the number needs (5 in the case of -876.54321), you can pass that count to formatC() as below:

decimal_length <- 5
formatC(sum(elements), format = "f", digits = decimal_length)
## -876.54321

You can change decimal_length for each query, so this approach can satisfy different decimal printing requirements.
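
If the number of decimals is not known in advance, one possible variation (not part of the answer above) is to format with a generous fixed number of decimals and let formatC() strip the trailing zeros through its drop0trailing argument. The names total and trimmed are illustrative.

```r
elements <- c(-1e-05, -2e-04, -3e-03, -4e-02, -5e-01, -6e+00, -7e+01, -8e+02)
total <- sum(elements)

# Format with 10 decimals, then drop the redundant trailing zeros
trimmed <- formatC(total, format = "f", digits = 10, drop0trailing = TRUE)
print(trimmed)   # "-876.54321"
```

The cap of 10 decimals is a judgment call: set it much higher and floating point noise in the last digits of the double starts to show up.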

羞稚 2024-08-28 19:53:30

If you work primarily with tibbles, there is a function that enforces digits: num().

Here is an example:

library(tidyverse)

data <- tribble(
  
~ weight, ~ weight_selfreport,
81.5,81.66969147005445,
72.6,72.59528130671505,
92.9,93.01270417422867,
79.4,79.4010889292196,
94.6,96.64246823956442,
80.2,79.4010889292196,
116.2,113.43012704174228,
95.4,95.73502722323049,
99.5,99.8185117967332
)

data <-
  data %>%
  mutate(across(where(is.numeric), ~ num(., digits = 3)))

data
#> # A tibble: 9 × 2
#>      weight weight_selfreport
#>   <num:.3!>         <num:.3!>
#> 1    81.500            81.670
#> 2    72.600            72.595
#> 3    92.900            93.013
#> 4    79.400            79.401
#> 5    94.600            96.642
#> 6    80.200            79.401
#> 7   116.200           113.430
#> 8    95.400            95.735
#> 9    99.500            99.819

Thus you can even choose different rounding options depending on your needs. I find it very helpful and a rather quick solution for printing data frames.
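
For plain data frames without tibble, a rough base R counterpart is to format the numeric columns of a display copy with formatC(). This is a sketch only: the small data frame and the names df, is_num and shown are made up for illustration.

```r
df <- data.frame(
  id = c("a", "b"),
  weight = c(81.5, 72.6),
  weight_selfreport = c(81.66969147005445, 72.59528130671505)
)

# Format only the numeric columns, on a copy, so df keeps its numeric values
is_num <- vapply(df, is.numeric, logical(1))
shown <- df
shown[is_num] <- lapply(df[is_num], formatC, format = "f", digits = 3)

print(shown)
```

The formatted columns are character vectors, so keep the original df for any further computation.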
