When comparing a double with an integer for equality, what gets converted to what?
OK, so I know you're generally not supposed to compare two floating-point numbers for equality. However, in William Kahan's "How Futile are Mindless Assessments of Roundoff in Floating-Point Computation?" he shows the following code (pseudocode, I believe):
Real Function T(Real z) :
T := exp(z) ; ... rounded, of course.
If (T = 1) Return( T ) ; ... when |z| is very tiny.
If (T = 0) Return( T := –1/z ) ; ... when exp(z) underflows.
Return( T := ( T – 1 )/log(T) ) ; ... in all other cases.
End T .
Now, I'm interested in implementing this in C or C++, and I have two related questions:
a) If I take T to be a double, then in the comparisons (T == 1) and (T == 0), would 0 and 1 get converted to double to preserve the precision of the values involved in a multi-type expression?
b) Does this still count as comparing two floating-point numbers for equality?
Yes and yes.
For 32-bit ints, double can represent every value precisely. When you compare a double to a 64-bit int, however, there is potential roundoff error if the int's magnitude is greater than 2^53. You can use long double, though, which has a 64-bit mantissa in the x87 extended format; the standard only guarantees that it is at least as precise as double, and on some platforms it is the same type as double.
Of course, the best way is just to use a floating-point literal: 1.0 or just 1. has type double, 1.0f is a float, and my_float_type(1) has whatever type it's supposed to :v) .
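To make the conversion rule and the 2^53 limit concrete, here is a minimal sketch. The helper names `equals_int` and `survives_double_round_trip` are made up for illustration, and it assumes double is an IEEE 754 binary64 type with a 53-bit significand.

```cpp
#include <cstdint>
#include <type_traits>

// In a mixed double/int expression the int operand is converted to double
// (usual arithmetic conversions). The same rule applies to ==, so `t == 1`
// compares two doubles.
static_assert(std::is_same<decltype(1.0 + 1), double>::value,
              "the int operand is converted to double");

// Hypothetical helper: the comparison from the question, with an int operand.
bool equals_int(double t, int n) {
    return t == n;  // n is converted to double before the comparison
}

// Hypothetical helper: true if v converts to double and back unchanged.
// Assumes double is IEEE 754 binary64.
bool survives_double_round_trip(std::int64_t v) {
    const double d = static_cast<double>(v);
    // 2^63 does not fit in int64_t, so casting it back would be undefined.
    if (d >= 9223372036854775808.0) return false;
    return static_cast<std::int64_t>(d) == v;
}
```

Under those assumptions, 2^53 survives the round trip but 2^53 + 1 does not, which is why comparing a double against a large 64-bit int can silently compare against a rounded value.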
The integer gets converted to a double.
See the usual arithmetic conversions at the beginning of section 5, "Expressions", in the C++ standard.
If you know that floating point numbers contain exact values, then you don't need to worry about inexact representations.
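Unsigned integers can be represented exactly as floating-point numbers as long as they fit into mantissa + 1 bits (the implicit leading bit adds one); for signed integers it is mantissa + 2 bits, counting the sign. The most negative integer (-2^31 for 32-bit ints) is a special case: as a power of two, it is exactly representable anyway.
Fractions with a power of 2 in the denominator can also be represented exactly, provided the numerator fits in the mantissa and the exponent is in range.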
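Applied to the pseudocode in the question, a direct C++ transliteration might look like the sketch below. The name `kahan_T` is made up here, and double is assumed for Kahan's Real. The two equality tests are safe because 1.0 and 0.0 are exactly representable: they only ask whether exp(z) rounded to exactly one of those values.

```cpp
#include <cmath>

// Sketch of the pseudocode from the question, with Real taken to be double.
// Mathematically the result approximates (exp(z) - 1) / z.
double kahan_T(double z) {
    const double t = std::exp(z);    // rounded, of course
    if (t == 1.0) return t;          // |z| is very tiny: exp(z) rounded to exactly 1
    if (t == 0.0) return -1.0 / z;   // exp(z) underflowed to exactly 0
    return (t - 1.0) / std::log(t);  // all other cases
}
```

Writing the constants as 1.0 and 0.0 rather than 1 and 0 does not change the result, since the ints would be converted to the same doubles, but it makes the types explicit.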