
r floating-point r-faq floating-accuracy

Why are these numbers not equal?

Posted 2025-02-10 08:48:26 · 185 words · 2 views · 0 comments

The following code is obviously wrong. What's the problem?

i <- 0.1
i <- i + 0.05
i
## [1] 0.15
if(i==0.15) cat("i equals 0.15") else cat("i does not equal 0.15")
## i does not equal 0.15


半衾梦 2025-02-17 08:48:26

General (language agnostic) reason

Since not all numbers can be represented exactly in IEEE floating point arithmetic (the standard that almost all computers use to represent decimal numbers and do math with them), you will not always get what you expected. This is especially true because some values which are simple, finite decimals (such as 0.1 and 0.05) are not represented exactly in the computer and so the results of arithmetic on them may not give a result that is identical to a direct representation of the "known" answer.
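
As a quick illustration of the paragraph above, the sketch below prints the values from the question with more digits than R shows by default. The choice of 20 decimal places and the variable name `gap` are arbitrary, made only for this example.

```r
# Print the stored values with more digits than R shows by default.
sprintf("%.20f", 0.1)
sprintf("%.20f", 0.05)
sprintf("%.20f", 0.1 + 0.05)
sprintf("%.20f", 0.15)

# The sum and the literal are different doubles, so == is FALSE,
# even though the gap between them is far below double.eps.
gap <- (0.1 + 0.05) - 0.15
gap != 0                         # TRUE
abs(gap) < .Machine$double.eps   # TRUE
```

None of the four printed values is exactly the decimal number that was typed, which is why the sum and the literal can disagree in their last bit.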

This is a well known limitation of computer arithmetic and is discussed in several places:

The R FAQ has a question devoted to it: R FAQ 7.31
The R Inferno by Patrick Burns devotes the first "Circle" to this problem (starting on page 9)
David Goldberg, "What Every Computer Scientist Should Know About Floating-point Arithmetic," ACM Computing Surveys 23, 1 (1991-03), 5-48, doi:10.1145/103162.103163 (revision also available)
The Floating-Point Guide - What Every Programmer Should Know About Floating-Point Arithmetic
0.30000000000000004.com compares floating point arithmetic across programming languages
Several Stack Overflow questions including
- Why are floating point numbers inaccurate?
- Why can't decimal numbers be represented exactly in binary?
- Is floating point math broken?
- Canonical duplicate for "floating point is inaccurate" (a meta discussion about a canonical answer for this issue)

Comparing scalars

The standard solution to this in R is not to use ==, but rather the all.equal function. Or rather, since all.equal gives lots of detail about the differences if there are any, isTRUE(all.equal(...)).

if(isTRUE(all.equal(i,0.15))) cat("i equals 0.15") else cat("i does not equal 0.15")

yields

i equals 0.15

Some more examples of using all.equal instead of == (the last example is supposed to show that this will correctly show differences).

0.1+0.05==0.15
#[1] FALSE
isTRUE(all.equal(0.1+0.05, 0.15))
#[1] TRUE
1-0.1-0.1-0.1==0.7
#[1] FALSE
isTRUE(all.equal(1-0.1-0.1-0.1, 0.7))
#[1] TRUE
0.3/0.1 == 3
#[1] FALSE
isTRUE(all.equal(0.3/0.1, 3))
#[1] TRUE
0.1+0.1==0.15
#[1] FALSE
isTRUE(all.equal(0.1+0.1, 0.15))
#[1] FALSE

Some more detail, directly copied from an answer to a similar question:

The problem you have encountered is that floating point cannot represent decimal fractions exactly in most cases, which means you will frequently find that exact matches fail.

R lies slightly when you say:

1.1-0.2
#[1] 0.9
0.9
#[1] 0.9

You can find out what it really thinks in decimal:

sprintf("%.54f",1.1-0.2)
#[1] "0.900000000000000133226762955018784850835800170898437500"
sprintf("%.54f",0.9)
#[1] "0.900000000000000022204460492503130808472633361816406250"

You can see these numbers are different, but the representation is a bit unwieldy. If we look at them in binary (well, hex, which is equivalent) we get a clearer picture:

sprintf("%a",0.9)
#[1] "0x1.ccccccccccccdp-1"
sprintf("%a",1.1-0.2)
#[1] "0x1.ccccccccccccep-1"
sprintf("%a",1.1-0.2-0.9)
#[1] "0x1p-53"

You can see that they differ by 2^-53, which is important because this number is the smallest representable difference between two numbers whose value is close to 1, as this is.

We can find out for any given computer what this smallest representable number is by looking in R's .Machine variable:

 ?.Machine
 #....
 #double.eps     the smallest positive floating-point number x 
 #such that 1 + x != 1. It equals base^ulp.digits if either 
 #base is 2 or rounding is 0; otherwise, it is 
 #(base^ulp.digits) / 2. Normally 2.220446e-16.
 #....
 .Machine$double.eps
 #[1] 2.220446e-16
 sprintf("%a",.Machine$double.eps)
 #[1] "0x1p-52"
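
The definition of double.eps quoted above can be checked directly. The following is a minimal sketch; the variable name `eps` is just a local alias introduced for this example.

```r
eps <- .Machine$double.eps   # 2^-52, the spacing of doubles just above 1

(1 + eps) != 1       # TRUE: 1 + eps is the next double after 1
(1 + eps / 2) == 1   # TRUE: the sum rounds back to 1

# Just below 1 the spacing is half as large, 2^-53, which is exactly
# the difference between 1.1 - 0.2 and 0.9 seen in the hex output.
(0.9 + 2^-53) == (1.1 - 0.2)   # TRUE
```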

You can use this fact to create a 'nearly equals' function which checks that the difference is close to the smallest representable number in floating point. In fact this already exists: all.equal.

?all.equal
#....
#all.equal(x,y) is a utility to compare R objects x and y testing ‘near equality’.
#....
#all.equal(target, current,
#      tolerance = .Machine$double.eps ^ 0.5,
#      scale = NULL, check.attributes = TRUE, ...)
#....

So the all.equal function is actually checking that the (relative) difference between the numbers is smaller than the square root of the smallest difference between two mantissas, which is about 1.5e-8.

This algorithm goes a bit funny near extremely small numbers called denormals, but you don't need to worry about that.
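
To make the role of the tolerance concrete, here is a small sketch using only the documented `tolerance` argument of all.equal. The test values 1e-9, 1e-7 and 1e-6 are arbitrary choices for illustration.

```r
tol <- sqrt(.Machine$double.eps)   # default tolerance, about 1.5e-8

isTRUE(all.equal(1, 1 + 1e-9))    # TRUE: difference is below the tolerance
isTRUE(all.equal(1, 1 + 1e-7))    # FALSE: difference exceeds the tolerance

# A looser tolerance accepts the same pair.
isTRUE(all.equal(1, 1 + 1e-7, tolerance = 1e-6))   # TRUE

# On a mismatch all.equal() returns a character description, not FALSE,
# which is why the isTRUE() wrapper is needed inside if().
is.character(all.equal(1, 1 + 1e-7))   # TRUE
```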

Comparing vectors

The above discussion assumed a comparison of two single values. In R, there are no scalars, just vectors, and implicit vectorization is a strength of the language. For comparing the values of vectors element-wise, the previous principles hold, but the implementation is slightly different. == is vectorized (does an element-wise comparison) while all.equal compares the whole vectors as a single entity.

Using the previous examples

a <- c(0.1+0.05, 1-0.1-0.1-0.1, 0.3/0.1, 0.1+0.1)
b <- c(0.15,     0.7,           3,       0.15)

== does not give the "expected" result, and all.equal does not compare element-wise:

a==b
#[1] FALSE FALSE FALSE FALSE
all.equal(a,b)
#[1] "Mean relative difference: 0.01234568"
isTRUE(all.equal(a,b))
#[1] FALSE

Rather, a version which loops over the two vectors must be used

mapply(function(x, y) {isTRUE(all.equal(x, y))}, a, b)
#[1]  TRUE  TRUE  TRUE FALSE

If a functional version of this is desired, it can be written

elementwise.all.equal <- Vectorize(function(x, y) {isTRUE(all.equal(x, y))})

which can be called as just

elementwise.all.equal(a, b)
#[1]  TRUE  TRUE  TRUE FALSE

Alternatively, instead of wrapping all.equal in even more function calls, you can just replicate the relevant internals of all.equal.numeric and use implicit vectorization:

tolerance = .Machine$double.eps^0.5
# this is the default tolerance used in all.equal,
# but you can pick a different tolerance to match your needs

abs(a - b) < tolerance
#[1]  TRUE  TRUE  TRUE FALSE

This is the approach taken by dplyr::near, which documents itself as

This is a safe way of comparing if two vectors of floating point numbers are (pairwise) equal. This is safer than using ==, because it has a built in tolerance

dplyr::near(a, b)
#[1]  TRUE  TRUE  TRUE FALSE
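
One caveat: `abs(a - b) < tolerance` is an absolute comparison, whereas all.equal reports a relative difference (as in the "Mean relative difference" message shown earlier). For very large or very small magnitudes the two can disagree. The sketch below shows one possible relative variant; `near_rel` is a hypothetical helper written for this example, not a function from base R or dplyr.

```r
# near_rel() is a hypothetical helper for illustration; it is not part of
# base R or dplyr.
near_rel <- function(x, y, tol = sqrt(.Machine$double.eps)) {
  # Scale the tolerance by the larger magnitude, but never below 1, so the
  # check is relative for large values and absolute near zero.
  scale <- pmax(1, abs(x), abs(y))
  abs(x - y) <= tol * scale
}

a <- c(0.1 + 0.05, 1 - 0.1 - 0.1 - 0.1, 0.3 / 0.1, 0.1 + 0.1)
b <- c(0.15,       0.7,                 3,         0.15)
near_rel(a, b)
# [1]  TRUE  TRUE  TRUE FALSE

# Very large values: a tiny relative error is a huge absolute one.
x1 <- 1.1 * 100 * 10^200
x2 <- 110 * 10^200
near_rel(x1, x2)
# [1] TRUE
```

Whether an absolute or a relative tolerance is appropriate depends on the scale of the data being compared.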

Testing for occurrence of a value within a vector

The standard R function %in% can also suffer from the same issue if applied to floating point values. For example:

x = seq(0.85, 0.95, 0.01)
# [1] 0.85 0.86 0.87 0.88 0.89 0.90 0.91 0.92 0.93 0.94 0.95
0.92 %in% x
# [1] FALSE

We can define a new infix operator to allow for a tolerance in the comparison as follows:

`%.in%` = function(a, b, eps = sqrt(.Machine$double.eps)) {
  any(abs(b-a) <= eps)
}

0.92 %.in% x
# [1] TRUE

dplyr::near wrapped in any can also be used for the vectorized check

any(dplyr::near(0.92, x))
# [1] TRUE
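
Note that `%.in%` as defined above returns a single value, so it is only meaningful when the left-hand side is one number. If several values need to be looked up at once, a vectorized variant could look like the sketch below; the operator name `%~in%` is made up for this example.

```r
# %~in% is a hypothetical operator for illustration: for each element of a,
# report whether some element of b lies within eps of it.
`%~in%` <- function(a, b, eps = sqrt(.Machine$double.eps)) {
  vapply(a, function(ai) any(abs(b - ai) <= eps), logical(1))
}

x <- seq(0.85, 0.95, 0.01)
c(0.92, 0.5, 0.85) %~in% x
# [1]  TRUE FALSE  TRUE
```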


墨落画卷 2025-02-17 08:48:26

Adding to Brian's comment (which is the reason), you can overcome this by using all.equal instead:

# i <- 0.1
# i <- i + 0.05
# i
#if(all.equal(i, .15)) cat("i equals 0.15\n") else cat("i does not equal 0.15\n")
#i equals 0.15

Per Joshua's warning, here is the updated code (thanks, Joshua):

i <- 0.1
i <- i + 0.05
i
if (isTRUE(all.equal(i, 0.15))) { # split across multiple lines for readability
    cat("i equals 0.15\n")
} else {
    cat("i does not equal 0.15\n")
}
#i equals 0.15


笑着哭最痛 2025-02-17 08:48:26

dplyr::near() is an option for testing if two vectors of floating point numbers are equal. This is the example from the docs:

sqrt(2) ^ 2 == 2
#> [1] FALSE
library(dplyr)
near(sqrt(2) ^ 2, 2)
#> [1] TRUE

The function has a built in tolerance parameter: tol = .Machine$double.eps^0.5 that can be adjusted. The default parameter is the same as the default for all.equal().


葬シ愛 2025-02-17 08:48:26

This is hackish, but quick:

if(round(i, 10)==0.15) cat("i equals 0.15") else cat("i does not equal 0.15")


-残月青衣踏尘吟 2025-02-17 08:48:26

Generalized comparisons ("<=", ">=", "=") in double precision arithmetic:

Comparing a <= b:

IsSmallerOrEqual <- function(a, b) {
  # all.equal() returns TRUE when the values are nearly equal and a
  # character string ("Mean relative difference...") otherwise, so
  # isTRUE() is used to turn its result into a plain logical.
  a < b || isTRUE(all.equal(a, b))
}

IsSmallerOrEqual(abs(-2-(-2.2)), 0.2) # TRUE
IsSmallerOrEqual(abs(-2-(-2.2)), 0.3) # TRUE
IsSmallerOrEqual(abs(-2-(-2.2)), 0.1) # FALSE
IsSmallerOrEqual(3,3); IsSmallerOrEqual(3,4); IsSmallerOrEqual(4,3) 
# TRUE; TRUE; FALSE

Comparing a >= b:

IsBiggerOrEqual <- function(a, b) {
  # all.equal() returns TRUE when the values are nearly equal and a
  # character string ("Mean relative difference...") otherwise, so
  # isTRUE() is used to turn its result into a plain logical.
  a > b || isTRUE(all.equal(a, b))
}
IsBiggerOrEqual(3,3); IsBiggerOrEqual(4,3); IsBiggerOrEqual(3,4) 
# TRUE; TRUE; FALSE

Comparing a = b:

IsEqual <- function(a, b) {
  # all.equal() returns TRUE when the values are nearly equal and a
  # character string ("Mean relative difference...") otherwise, so
  # isTRUE() is used to turn its result into a plain logical.
  isTRUE(all.equal(a, b))
}

IsEqual(0.1+0.05,0.15) # TRUE
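
The functions above compare one pair of numbers at a time. For whole vectors, the same idea can be written with an explicit absolute tolerance, in the spirit of the `abs(a - b) < tolerance` approach from the first answer. This is only a sketch; `leq_tol` and `geq_tol` are hypothetical names introduced for this example.

```r
# Hypothetical vectorized helpers: "a <= b" and "a >= b" up to a tolerance.
leq_tol <- function(a, b, tol = sqrt(.Machine$double.eps)) {
  a < b | abs(a - b) < tol
}
geq_tol <- function(a, b, tol = sqrt(.Machine$double.eps)) {
  a > b | abs(a - b) < tol
}

leq_tol(c(abs(-2 - (-2.2)), 3, 4), c(0.2, 4, 3))
# [1]  TRUE  TRUE FALSE
geq_tol(c(3, 4, 3), c(3, 3, 4))
# [1]  TRUE  TRUE FALSE
```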


梦毁影碎の 2025-02-17 08:48:26

I had a similar problem. I used the following solution.

I found this workaround for unequal cut intervals: I used the round function in R. Setting options(digits = 2) alone did not solve the problem.

options(digits = 2)
cbind(
  seq(      from = 1, to = 9, by = 1 ), 
  cut( seq( from = 1, to = 9, by = 1),          c( 0, 3, 6, 9 ) ),
  seq(      from = 0.1, to = 0.9, by = 0.1 ), 
  cut( seq( from = 0.1, to = 0.9, by = 0.1),    c( 0, 0.3, 0.6, 0.9 )),
  seq(      from = 0.01, to = 0.09, by = 0.01 ), 
  cut( seq( from = 0.01, to = 0.09, by = 0.01),    c( 0, 0.03, 0.06, 0.09 ))
)

Output of unequal cut intervals based on options(digits = 2):

      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1  0.1    1 0.01    1
 [2,]    2    1  0.2    1 0.02    1
 [3,]    3    1  0.3    2 0.03    1
 [4,]    4    2  0.4    2 0.04    2
 [5,]    5    2  0.5    2 0.05    2
 [6,]    6    2  0.6    2 0.06    3
 [7,]    7    3  0.7    3 0.07    3
 [8,]    8    3  0.8    3 0.08    3
 [9,]    9    3  0.9    3 0.09    3


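# The original call here was options(digits = 200); R only accepts values
# up to 22 for this option, so the digits = 2 setting above stays in effect.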
cbind(
  seq(      from = 1, to = 9, by = 1 ), 
  cut( round(seq( from = 1, to = 9, by = 1), 2),          c( 0, 3, 6, 9 ) ),
  seq(      from = 0.1, to = 0.9, by = 0.1 ), 
  cut( round(seq( from = 0.1, to = 0.9, by = 0.1), 2),    c( 0, 0.3, 0.6, 0.9 )),
  seq(      from = 0.01, to = 0.09, by = 0.01 ), 
  cut( round(seq( from = 0.01, to = 0.09, by = 0.01), 2),    c( 0, 0.03, 0.06, 0.09 ))
)

Output of equal cut intervals based on the round function:

      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1  0.1    1 0.01    1
 [2,]    2    1  0.2    1 0.02    1
 [3,]    3    1  0.3    1 0.03    1
 [4,]    4    2  0.4    2 0.04    2
 [5,]    5    2  0.5    2 0.05    2
 [6,]    6    2  0.6    2 0.06    2
 [7,]    7    3  0.7    3 0.07    3
 [8,]    8    3  0.8    3 0.08    3
 [9,]    9    3  0.9    3 0.09    3


在巴黎塔顶看东京樱花 2025-02-17 08:48:26

Just to add to the discussion, a package I recently released to CRAN, cppdoubles, compares double-precision floating point vectors using relative differencing, except when either number is close to zero, in which case absolute differences are used, similar to all.equal.

library(cppdoubles)

# Large floating-point error
x1 <- 1.1 * 100 * 10^200
x2 <- 110 * 10^200
dplyr::near(x1, x2)
#> [1] FALSE
x1 %~==% x2
#> [1] TRUE

# Alternatively we can use double_equal() which is the same as %~==%
double_equal(x1, x2)
#> [1] TRUE

# Other operators
sqrt(2)^2 %~>=% 2
#> [1] TRUE
sqrt(2)^2 %~<=% 2
#> [1] TRUE
sqrt(2)^2 %~>% 2
#> [1] FALSE
sqrt(2)^2 %~<% 2
#> [1] FALSE

# All cppdoubles functions are 'vectorised'

elementwise.all.equal <- Vectorize(function(x, y) {isTRUE(all.equal(x, y))})

x <- abs(rnorm(10^4))
y <- sqrt(x)^2

bench::mark(
  cppdoubles = double_equal(x, y),
  dplyr = dplyr::near(x, y),
  base = elementwise.all.equal(x, y), 
  min_iterations = 3
)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 3 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 cppdoubles  300.9µs  308.1µs   2896.      39.1KB     2.00
#> 2 dplyr        37.2µs   76.2µs   8116.     117.3KB    16.0 
#> 3 base        214.7ms    221ms      4.38   410.5KB    19.0

Created on 2023-12-29 with reprex v2.0.2
