When comparing a double with an integer for equality, what gets converted to what?

Posted on 2024-09-24 13:37:52

OK, so I know you're generally not supposed to compare two floating-point numbers for equality. However, in William Kahan's "How Futile are Mindless Assessments of Roundoff in Floating-Point Computation?" he shows the following code (pseudo-code, I believe):

Real Function T(Real z) :
      T := exp(z) ;                       ... rounded, of course.
      If (T = 1) Return( T ) ;            ... when |z| is very tiny.
      If (T = 0) Return( T := –1/z ) ;    ... when exp(z) underflows.
      Return( T := ( T – 1 )/log(T) ) ;   ... in all other cases.
      End T .

Now, I'm interested in implementing this in C or C++, and I have two related questions:

a) If I take T to be a double, then in the comparison (T == 1) or (T == 0), would 0 and 1 get converted to double to preserve the precision of the values involved in a mixed-type expression?

b) Does this still count as comparing two floating-point numbers for equality?

Comments (2)

随风而去 2024-10-01 13:37:52

Yes and yes.

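For 32-bit ints, double can represent every value exactly. When you compare a double to a 64-bit int, however, the int is converted to double first, so there is potential roundoff error if its magnitude is greater than 2^53 (a double has a 53-bit significand, counting the implicit leading bit). You can use long double where it is wider than double, for example the x87 80-bit format with its 64-bit mantissa, but the standard only guarantees that long double is at least as precise as double, and on some compilers the two are the same type.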
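To make the roundoff concrete, here is a minimal sketch, assuming an IEEE 754 binary64 double and the default round-to-nearest mode. The helper name `equals_after_conversion` is made up for illustration; it only wraps the mixed comparison so the implicit conversion is easy to see.

```cpp
#include <cstdint>

// Hypothetical helper: compares a double with a 64-bit integer the way
// the built-in == does. The integer operand is converted to double
// first, so it may be rounded before the comparison happens.
bool equals_after_conversion(double d, std::int64_t i) {
    return d == i;  // same as: d == static_cast<double>(i)
}

// 2^53 is exactly representable in a binary64 double.
const double two_pow_53 = 9007199254740992.0;

// 2^53 + 1 is the smallest positive integer that is not; it needs 54
// significant bits, so converting it to double rounds it.
const std::int64_t two_pow_53_plus_1 = 9007199254740993LL;
```

Under those assumptions, `equals_after_conversion(two_pow_53, two_pow_53_plus_1)` reports equality even though the two mathematical values differ by one, while any 32-bit value compares exactly.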

Of course, the best way is just to use a floating-point literal: 1.0 or just 1. has type double, 1.0f is a float, and my_float_type(1) has whatever type it's supposed to :v) .
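Applied to the pseudo-code from the question, that advice might look like the sketch below. It is a direct transcription rather than a vetted numerical routine; the name `kahan_T` is chosen here for illustration, and double is assumed as the "Real" type.

```cpp
#include <cmath>

// Transcription of the question's pseudo-code, using double literals so
// that no implicit int-to-double conversion is involved.
double kahan_T(double z) {
    double t = std::exp(z);             // rounded, of course
    if (t == 1.0) return t;             // when |z| is very tiny
    if (t == 0.0) return -1.0 / z;      // when exp(z) underflows
    return (t - 1.0) / std::log(t);     // in all other cases
}
```

Both comparisons are exact tests against values that double represents exactly, which is what the pseudo-code relies on: they detect the two specific rounded results of `exp` that need special handling.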

乙白 2024-10-01 13:37:52

The integer gets converted to a double.

See the "usual arithmetic conversions" described at the beginning of the Expressions clause (section 5 in the older editions) of the C++ standard.
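The conversion can be checked at compile time. The following is a small sketch, assuming a C++11 or later compiler; `compares_equal_to_one` is a made-up name used only to show the mixed comparison from the question.

```cpp
#include <type_traits>

// The result of == is always bool, so the common operand type is made
// visible through arithmetic, which applies the same conversions.
static_assert(std::is_same<decltype(1.0 + 1), double>::value,
              "int operand is converted to double");
static_assert(std::is_same<decltype(1.0f + 1), float>::value,
              "int operand is converted to float");
static_assert(std::is_same<decltype(1.0 == 1), bool>::value,
              "comparison itself yields bool");

// Hypothetical helper mirroring (T == 1) from the question: the literal
// 1 is converted to 1.0 before the comparison.
constexpr bool compares_equal_to_one(double t) {
    return t == 1;
}
```

If the file compiles, the assertions hold, so `t == 1` and `t == 1.0` perform the same double-to-double comparison.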

If you know that floating point numbers contain exact values, then you don't need to worry about inexact representations.

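Unsigned integers can be represented exactly as floating-point numbers as long as they fit into the mantissa + 1 bit (the extra bit is the implicit leading one); for signed integers it is mantissa + 2 bits, because the sign is stored separately. The most negative integer (-2^31 for 32-bit ints) is an exception to the limit: it is a power of two, so it is exact anyway.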

Fractions with a power of 2 in the denominator can also be represented exactly, provided the numerator fits in the mantissa and the exponent is in range.
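A few of these cases can be tried out directly. This is a sketch assuming IEEE 754 binary64 doubles evaluated in double precision, as on typical x86-64 builds; the function names are invented for the example.

```cpp
#include <cstdint>

// Every 32-bit integer fits in a double's 53-bit significand, so the
// round trip through double recovers the original value.
bool int32_survives_double(std::int32_t i) {
    return static_cast<std::int32_t>(static_cast<double>(i)) == i;
}

// 1/2, 1/4 and 3/4 have power-of-two denominators, so each is stored
// exactly and the sum is computed without rounding.
bool dyadic_sum_is_exact() {
    return 0.5 + 0.25 == 0.75;
}

// 1/10, 2/10 and 3/10 are not exactly representable, so each literal
// is already rounded and the equality is not expected to hold.
bool decimal_sum_is_exact() {
    return 0.1 + 0.2 == 0.3;
}
```

The contrast between the last two functions is the practical point: equality tests are reliable when every value involved is known to be exact, and unreliable otherwise.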
