Error subtracting floating-point numbers when passing through 0.0

Posted 2024-11-02 23:07:59

The following program:

#include <stdio.h>

int main()
{
    double val = 1.0;
    int i;

    for (i = 0; i < 10; i++)
    {
        val -= 0.2;
        printf("%g %s\n", val, (val == 0.0 ? "zero" : "non-zero"));
    }

    return 0;
}

Produces this output:

0.8 non-zero
0.6 non-zero
0.4 non-zero
0.2 non-zero
5.55112e-17 non-zero
-0.2 non-zero
-0.4 non-zero
-0.6 non-zero
-0.8 non-zero
-1 non-zero

Can anyone tell me what is causing the error when subtracting 0.2 from 0.2? Is this a rounding error or something else? Most importantly, how do I avoid this error?

EDIT: It looks like the conclusion is to not worry about it, given 5.55112e-17 is extremely close to zero (thanks to @therefromhere for that information).

围归者 2024-11-09 23:07:59

It's because most floating-point numbers cannot be stored in memory exactly. So it is rarely safe to use == on floating-point values that come out of a computation. Using double increases the precision, but the result is still not exact. A common way to compare floating-point values is something like this:

val == target;   // not safe

// instead do this
// where EPS is some suitably small value like 1e-7 (fabs is declared in <math.h>)
fabs(val - target) < EPS;

EDIT: As pointed out in the comments, the main cause of the problem is that 0.2 can't be stored exactly. So each time you subtract it from some value, a small error is introduced. If you repeat this kind of floating-point calculation, at some point the error becomes noticeable. What I am trying to say is that not all floating-point values can be stored, as there are infinitely many of them. A slightly wrong value is not generally noticeable, but using it in successive computations leads to a higher cumulative error.

能否归途做我良人 2024-11-09 23:07:59

0.2 is not exactly representable as a double-precision floating-point number, so it is rounded to the nearest double-precision number, which is:

            0.200000000000000011102230246251565404236316680908203125

That's rather unwieldy, so let's look at it in hex instead:

          0x0.33333333333334

Now, let's follow what happens when this value is repeatedly subtracted from 1.0:

          0x1.00000000000000
        - 0x0.33333333333334
        --------------------
          0x0.cccccccccccccc

The exact result is not representable in double precision, so it is rounded, which gives:

          0x0.ccccccccccccd

In decimal, this is exactly:

            0.8000000000000000444089209850062616169452667236328125

Now we repeat the process:

          0x0.ccccccccccccd
        - 0x0.33333333333334
        --------------------
          0x0.9999999999999c
rounds to 0x0.999999999999a
           (0.600000000000000088817841970012523233890533447265625 in decimal)

          0x0.999999999999a
        - 0x0.33333333333334
        --------------------
          0x0.6666666666666c
rounds to 0x0.6666666666666c
           (0.400000000000000077715611723760957829654216766357421875 in decimal)

          0x0.6666666666666c
        - 0x0.33333333333334
        --------------------
          0x0.33333333333338
rounds to 0x0.33333333333338
           (0.20000000000000006661338147750939242541790008544921875 in decimal)

          0x0.33333333333338
        - 0x0.33333333333334
        --------------------
          0x0.00000000000004
rounds to 0x0.00000000000004
           (0.000000000000000055511151231257827021181583404541015625 in decimal)

Thus, we see that the accumulated rounding that is required by floating-point arithmetic produces the very small non-zero result that you are observing. Rounding is subtle, but it is deterministic, not magic, and not a bug. It's worth taking the time to learn about.

哎呦我呸! 2024-11-09 23:07:59

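Floating-point arithmetic cannot represent all numbers exactly, so rounding errors like the one you observe are unavoidable.

One possible strategy is to use a fixed-point format, such as a decimal or currency data type. Such types still cannot represent every number, but they behave as you would expect for this example.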

掌心的温暖 2024-11-09 23:07:59

To elaborate a bit: if the mantissa of the floating-point number is encoded in binary (as is the case in most contemporary FPUs), then only sums of the fractions 1/2, 1/4, 1/8, 1/16, ... can be represented exactly in the mantissa. The value 0.2 is approximated by 1/8 + 1/16 + ... plus some even smaller terms, yet the exact value of 0.2 cannot be reached with a finite mantissa.

You can try the following:

 printf("%.20f", 0.2);

and you'll (probably) see that what you think is 0.2 is not 0.2 but a number that is a tiny amount different (actually, on my computer it prints 0.20000000000000001110). Now you understand why you can never reach 0.

But if you let val = 12.5 and subtract 0.125 in your loop, you could reach zero exactly (after 100 iterations), because both values are finite sums of powers of two.
