Why are these numbers not equal?
The following code is obviously wrong. What's the problem?
i <- 0.1
i <- i + 0.05
i
## [1] 0.15
if(i==0.15) cat("i equals 0.15") else cat("i does not equal 0.15")
## i does not equal 0.15
dplyr::near() is an option for testing if two vectors of floating point numbers are equal. This is the example from the docs:
The function has a built-in tolerance parameter, tol = .Machine$double.eps^0.5, that can be adjusted. The default value is the same as the default for all.equal().
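The docs example is not reproduced on this page. As a stand-in, here is a minimal base R sketch of the same idea: an element-wise check that the absolute difference is below a tolerance. The helper name near_sketch is hypothetical and is not dplyr's function; only the default tolerance is taken from the text above.

```r
# Hypothetical helper: element-wise comparison with an absolute tolerance.
near_sketch <- function(x, y, tol = .Machine$double.eps^0.5) {
  abs(x - y) < tol
}

x <- c(0.1 + 0.05, sqrt(2)^2, 1)
y <- c(0.15, 2, 1.1)

exact <- x == y              # exact comparison, element by element
tolerant <- near_sketch(x, y) # tolerant comparison, element by element

print(exact)
print(tolerant)
```

Because the arithmetic is vectorized, the result has one logical value per pair, unlike isTRUE(all.equal(x, y)), which gives a single answer for the whole vector.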
This is hackish, but quick:
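The snippet that followed this line is missing from the page. One quick approach in this spirit, offered purely as an assumption about what was meant, is to round both sides before comparing. The choice of 10 digits is arbitrary.

```r
i <- 0.1
i <- i + 0.05

# Round both sides to a fixed number of decimals before the exact comparison.
# 10 digits is an arbitrary assumption for this sketch.
same <- round(i, 10) == round(0.15, 10)

if (same) cat("i equals 0.15\n") else cat("i does not equal 0.15\n")
```

This only works when the values are meant to agree to a known number of decimals, which is why it counts as a hack rather than a general solution.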
Generalized comparisons ("<=", ">=", "=") in double precision arithmetic:
Comparing a <= b:
Comparing a >= b:
Comparing a = b:
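The code for these three comparisons is not preserved here. A minimal sketch of how they could be written, assuming that "equal" means "equal according to all.equal()"; the helper names are hypothetical:

```r
# Hypothetical helpers: values that all.equal() considers equal count as ties.
approx_equal <- function(a, b) isTRUE(all.equal(a, b))
less_or_equal <- function(a, b) a < b || approx_equal(a, b)
greater_or_equal <- function(a, b) a > b || approx_equal(a, b)

a <- 0.1 + 0.05
b <- 0.15

print(a <= b)                  # exact comparison
print(less_or_equal(a, b))     # tolerant a <= b
print(greater_or_equal(a, b))  # tolerant a >= b
print(approx_equal(a, b))      # tolerant a = b
```

These helpers work on single values only, since || and isTRUE() each return one logical value.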
I had a similar problem. I used the following solution.
output of unequal cut intervals based on options(digits = 2):
output of equal cut intervals based on round function:
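The outputs referred to here are not preserved. A minimal sketch of the underlying issue, under the assumption that the break points come from seq(): a generated value may not match its decimal literal exactly, and rounding to the intended precision makes the two comparable.

```r
breaks <- seq(0, 1, by = 0.1)

# The fourth break is computed as 3 * 0.1, which is not the same double as 0.3.
raw_match <- breaks[4] == 0.3

# Rounding to one decimal place gives the double closest to 0.3.
rounded_match <- round(breaks, 1)[4] == 0.3

print(raw_match)
print(rounded_match)
```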
Just to add to the discussion, a package I recently released to CRAN, cppdoubles, compares double-precision floating point vectors using relative differencing, except when either number is close to zero, in which case absolute differences are used, similar to all.equal.
Created on 2023-12-29 with reprex v2.0.2
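The reprex output itself is missing from the page. The rule described above (relative difference, falling back to absolute difference near zero) can be sketched in base R as follows. The helper name and the exact "close to zero" test are assumptions for illustration, not the cppdoubles API.

```r
# Hypothetical helper, not the cppdoubles API.
rel_or_abs_equal <- function(x, y, tol = sqrt(.Machine$double.eps)) {
  abs_diff <- abs(x - y)
  scale <- pmax(abs(x), abs(y))
  # Assumption: "close to zero" means smaller in magnitude than the tolerance.
  near_zero <- abs(x) < tol | abs(y) < tol
  ifelse(near_zero, abs_diff < tol, abs_diff / scale < tol)
}

x <- c(0.1 + 0.05, 1e10 + 1e-5, 1e-20, 1)
y <- c(0.15, 1e10, 0, 1.001)

res <- rel_or_abs_equal(x, y)
print(res)
```

The relative test lets large values such as 1e10 differ by more than the tolerance in absolute terms, while the absolute test avoids dividing by a scale that is itself nearly zero.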
General (language agnostic) reason
Since not all numbers can be represented exactly in IEEE floating point arithmetic (the standard that almost all computers use to represent decimal numbers and do math with them), you will not always get what you expected. This is especially true because some values which are simple, finite decimals (such as 0.1 and 0.05) are not represented exactly in the computer and so the results of arithmetic on them may not give a result that is identical to a direct representation of the "known" answer.
This is a well known limitation of computer arithmetic and is discussed in several places:
Comparing scalars
The standard solution to this in R is not to use ==, but rather the all.equal function. Or rather, since all.equal gives lots of detail about the differences if there are any, isTRUE(all.equal(...)).
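Applied to the code from the question, this could look like the following sketch (the variable names exact and tolerant are only for illustration):

```r
i <- 0.1
i <- i + 0.05

exact <- i == 0.15                      # exact comparison of the two doubles
tolerant <- isTRUE(all.equal(i, 0.15))  # comparison within all.equal's tolerance

print(exact)
if (tolerant) cat("i equals 0.15\n") else cat("i does not equal 0.15\n")
```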
Some more examples of using all.equal instead of == (the last example is supposed to show that this will correctly show differences).
Some more detail, directly copied from an answer to a similar question:
The problem you have encountered is that floating point cannot represent decimal fractions exactly in most cases, which means you will frequently find that exact matches fail.
while R lies slightly when you say:
You can find out what it really thinks in decimal:
You can see these numbers are different, but the representation is a bit unwieldy. If we look at them in binary (well, hex, which is equivalent) we get a clearer picture:
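The snippets for these three steps are missing from the page. A sketch of what such an inspection might look like, using 1.1 - 0.2 versus 0.9 as an assumed example pair:

```r
x <- 1.1 - 0.2
y <- 0.9

# Default printing shows only 7 significant digits, so both look the same.
print(x)
print(y)
print(x == y)

# Many more decimal digits reveal the stored values.
print(sprintf("%.54f", x))
print(sprintf("%.54f", y))

# Hexadecimal floating point notation shows the bit patterns directly.
print(sprintf("%a", x))
print(sprintf("%a", y))

# The gap between the two stored values.
gap <- x - y
print(sprintf("%a", gap))
```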
You can see that they differ by 2^-53, which is important because this number is the smallest representable difference between two numbers whose value is close to 1, as this is.
We can find out for any given computer what this smallest representable number is by looking in R's machine field:
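The lookup itself is not shown on the page. A sketch, assuming the field meant is .Machine$double.eps: this is the spacing between 1 and the next larger double (2^-52 on IEEE hardware), and the spacing just below 1 is half of that.

```r
eps <- .Machine$double.eps
print(eps)
print(sprintf("%a", eps))

# Adding eps to 1 gives a different double; adding half of it does not.
one_plus_eps <- 1 + eps
one_plus_half <- 1 + eps / 2

print(one_plus_eps == 1)
print(one_plus_half == 1)
```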
You can use this fact to create a 'nearly equals' function which checks that the difference is close to the smallest representable number in floating point. In fact this already exists: all.equal.
So the all.equal function is actually checking that the difference between the numbers is smaller than the square root of the smallest difference between two mantissas.
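A small sketch of that tolerance in action. The test values 1e-9 and 1e-7 are chosen only because they fall on either side of the default tolerance of about 1.5e-8:

```r
tol <- .Machine$double.eps^0.5
print(tol)

# A difference below the default tolerance counts as equal.
small_gap <- isTRUE(all.equal(1, 1 + 1e-9))

# A difference above the default tolerance does not.
large_gap <- isTRUE(all.equal(1, 1 + 1e-7))

# The tolerance argument can be loosened explicitly.
custom <- isTRUE(all.equal(1, 1 + 1e-7, tolerance = 1e-6))

print(c(small_gap, large_gap, custom))
```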
This algorithm goes a bit funny near extremely small numbers called denormals, but you don't need to worry about that.
Comparing vectors
The above discussion assumed a comparison of two single values. In R, there are no scalars, just vectors and implicit vectorization is a strength of the language. For comparing the value of vectors element-wise, the previous principles hold, but the implementation is slightly different.
== is vectorized (does an element-wise comparison) while all.equal compares the whole vectors as a single entity.
Using the previous examples, == does not give the "expected" result and all.equal does not perform an element-wise comparison.
Rather, a version which loops over the two vectors must be used.
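The code for this comparison is not preserved on the page. A sketch with assumed example vectors a and b, where only the last pair is genuinely different:

```r
a <- c(0.1 + 0.05, 1 - 0.1 - 0.1 - 0.1, 0.3 / 0.1, 0.1 + 0.1)
b <- c(0.15, 0.7, 3, 0.15)

# Element-wise, but exact: every pair differs in the last bits or more.
exact <- a == b

# Tolerant, but one answer for the whole vector.
whole <- isTRUE(all.equal(a, b))

# Loop over the pairs and apply the tolerant comparison to each one.
pairwise <- mapply(function(x, y) isTRUE(all.equal(x, y)), a, b)

print(exact)
print(whole)
print(pairwise)
```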
If a functional version of this is desired, it can be written as a function and then called directly on the two vectors.
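One way to write such a function is with Vectorize(); this is a sketch, and the function name elementwise_all_equal as well as the example vectors are assumptions:

```r
# Wrap the scalar comparison so that it is applied pair by pair.
elementwise_all_equal <- Vectorize(function(x, y) isTRUE(all.equal(x, y)))

a <- c(0.1 + 0.05, 1 - 0.1 - 0.1 - 0.1, 0.3 / 0.1, 0.1 + 0.1)
b <- c(0.15, 0.7, 3, 0.15)

res <- elementwise_all_equal(a, b)
print(res)
```

Vectorize() builds on mapply(), so this is a convenience wrapper rather than a faster implementation.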
Alternatively, instead of wrapping all.equal in even more function calls, you can just replicate the relevant internals of all.equal.numeric and use implicit vectorization. This is the approach taken by dplyr::near.
Testing for occurrence of a value within a vector
The standard R function %in% can also suffer from the same issue if applied to floating point values. We can define a new infix operator to allow for a tolerance in the comparison.
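Neither snippet survives on this page. A sketch of both points, with an assumed example vector and a hypothetical operator name %near_in%:

```r
x <- c(0.1 + 0.05, 0.1 * 3)

# Exact membership fails because the stored values differ in the last bits.
exact_015 <- 0.15 %in% x
exact_03 <- 0.3 %in% x

# Hypothetical infix operator: TRUE where a value of b lies within eps of a.
`%near_in%` <- function(a, b) {
  eps <- sqrt(.Machine$double.eps)
  vapply(a, function(v) any(abs(b - v) <= eps), logical(1))
}

tolerant <- c(0.15, 0.3, 0.5) %near_in% x

print(c(exact_015, exact_03))
print(tolerant)
```

An infix operator takes exactly two arguments, so the tolerance is fixed inside the function body in this sketch.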
dplyr::near wrapped in any can also be used for the vectorized check.
Adding to Brian's comment (which is the reason), you can overcome this by using all.equal instead:
Per Joshua's warning, here is the updated code (thanks, Joshua):
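The updated code is missing from the page. Assuming the warning concerned using all.equal() directly as an if() condition, the sketch below shows why that is fragile: when the values differ, all.equal() returns a character description instead of FALSE.

```r
i <- 0.1 + 0.05

# When the values differ, the result is a character string, not FALSE.
res_diff <- all.equal(i, 0.2)
print(res_diff)

# Using that string as a condition raises an error, caught here for display.
outcome <- tryCatch(
  if (all.equal(i, 0.2)) "equal" else "not equal",
  error = function(e) "error: condition was not TRUE or FALSE"
)
print(outcome)

# Wrapping in isTRUE() always yields a single TRUE or FALSE.
safe <- if (isTRUE(all.equal(i, 0.2))) "equal" else "not equal"
print(safe)
```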