Floating-point numbers: when does precision loss occur, and how does it work?

Posted on 2024-11-19 17:58:36

As far as I know, a float can represent about 14 decimal digits precisely.

So let's say we have

a = 564214623154
b = 54252

and we multiply them:
c = a * b should be 30609771735350808, but when I run it, it shows me 3.0609771735351E+16.
So as I understand it, some precision should be lost, but when I divide c by b,
c / b gives me exactly 564214623154, without any precision lost.

Another example: let's say we have

c = 30609771735350808 
d = 30609761111111111

e = c - d should be 10624239697, but when I run it, it shows me 10624239696, so precision is lost.

So is precision lost only when I subtract or add two numbers?

If it matters, I use PHP.

亣腦蒛氧 2024-11-26 17:58:36

It is possible to lose precision with multiplication and division as well. PHP floats and JavaScript numbers are stored in IEEE-754 double-precision format, with 52 stored bits of mantissa (53 significant bits once the implicit leading bit is counted) and 11 bits of exponent. Some integers can be represented exactly and some cannot.
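To make the 53-bit limit concrete, here is a minimal JavaScript sketch (JavaScript numbers are the same IEEE-754 doubles). The variable names are illustrative only and do not come from the original posts.

```javascript
// Every integer up to 2^53 is exact; beyond that, doubles are more than 1 apart.
const limit = 2 ** 53;           // 9007199254740992
const sameAsLimit = limit + 1;   // 2^53 + 1 is not representable and rounds back to 2^53
const nextDouble = limit + 2;    // 2^53 + 2 is representable

console.log(sameAsLimit === limit);                   // true
console.log(nextDouble - limit);                      // 2
console.log(Number.MAX_SAFE_INTEGER === limit - 1);   // true
```

Above 2^53 only every second integer can be stored, above 2^54 only every fourth, and so on.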

Let's try this:

In exact integer arithmetic (generated with Ruby):

45345657434523 * 9347287748322342 / 74387422372 = 5697991604786167788

In PHP and JavaScript:

45345657434523 * 9347287748322342 / 74387422372 = 5697991604786168000

So we lose precision with multiplication and division too.
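The comparison above can be sketched in JavaScript alone by using BigInt for the exact side. This is only an illustration: the names `exact`, `approx` and `storedValue` are made up here, and BigInt division truncates, just as Ruby's integer division does for positive operands.

```javascript
// The same expression, once in exact integer arithmetic and once in doubles.
const exact = 45345657434523n * 9347287748322342n / 74387422372n; // BigInt, truncating division
const approx = 45345657434523 * 9347287748322342 / 74387422372;   // IEEE-754 double

// BigInt(approx) reveals the exact integer held by the double.
const storedValue = BigInt(approx);

console.log(exact.toString());
console.log(storedValue.toString());

// Near 5.7e18 consecutive doubles are 1024 apart, so the stored value is a multiple of 1024.
console.log(storedValue % 1024n === 0n); // true
```

The two printed values agree in their leading 15 or so digits and differ after that.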

EDIT: On revisiting the OP's question, this was not a great answer, because the result above contains more than 15 decimal digits. If the question is whether precision is lost when multiplying and dividing a bunch of numbers, each of which is represented with 15 digits of precision or fewer, then the final result tends to keep a good deal of precision (provided you don't overflow or underflow). So you can multiply 1.25E35 * 2.5E7 and get 3.125e+42, correct to the 15 or so digits a double holds, because PHP and JavaScript essentially multiply the groups of significant figures and add up the exponents. However, if you ADD those two values you get 1.25E35 + 2.5E7 = 1.25E35. That's right, you add 25 million to a number and it does not change! That is because, as the OP says, you only get 14 or 15 decimal digits of precision. Try adding those two values by hand by writing out 125000000000000000000000000000000000 + 25000000. The 14 to 15 digits are counted from the left, and they run out long before you reach the digits that the 25 million would change.
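A short JavaScript sketch of that contrast, with illustrative variable names; the tolerance used for the product is an arbitrary choice for this example.

```javascript
const big = 1.25e35;
const small = 2.5e7;

// Doubles near 1.25e35 are 2^64 (about 1.8e19) apart, so adding 2.5e7 is lost entirely.
console.log(big + small === big); // true

// Multiplication only adds a tiny relative rounding error.
const product = big * small;
const relativeError = Math.abs(product / 3.125e42 - 1);
console.log(relativeError < 1e-14); // true
```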

The bottom line is that precision problems are more likely to arise with addition and subtraction. Good to be aware of.


纸短情长 2024-11-26 17:58:36

In your first case you lose no precision; PHP is just formatting the larger number as a float. (Internally the number is kept as a float.) Try this to get the "precise" output:

$a = 564214623154;
$b = 54252;
$c = $a * $b;
printf("%u, %u\n", $c, $c/$a);
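The same check can be sketched in JavaScript, whose numbers are always IEEE-754 doubles. BigInt(c) is used here only to display the exact stored value without any float formatting; it is not part of the original answer.

```javascript
const a = 564214623154;
const b = 54252;
const c = a * b;

// The exact product is a multiple of 4, the spacing of doubles in this range,
// so it is stored without rounding.
console.log(BigInt(c) === 30609771735350808n); // true
console.log(c / a === b);                      // true
console.log(c / b === a);                      // true
```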

Next up, in the case of c - d, your two numbers individually already exceed the capacity of a standard 64-bit IEEE float (which holds 53 significant bits, while you need at least 55), so precision can already be lost when you store those numbers. Here c happens to survive, but d cannot be stored exactly.
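A small JavaScript sketch of what happens at storage time, before any subtraction. The names `cText`, `dText`, `cStored` and `dStored` are illustrative; BigInt serves as the exact reference.

```javascript
// Parse the decimal strings both exactly (BigInt) and as doubles (Number).
const cText = "30609771735350808";
const dText = "30609761111111111";

const cStored = BigInt(Number(cText)); // exact value of the double nearest to c
const dStored = BigInt(Number(dText)); // exact value of the double nearest to d

console.log(BigInt(cText).toString(2).length); // 55 (bits needed to write c)
console.log(cStored === BigInt(cText));        // true: c is a multiple of 4 and survives
console.log(dStored - BigInt(dText));          // 1n: d was rounded up to ...112
console.log(Number(cText) - Number(dText));    // 10624239696
```

The subtraction itself is exact; the missing 1 was already lost when d was stored.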

The problem of losing precision during addition/subtraction is called "cancellation": all the most-significant bits, on which you spent all your storage, cancel out, and you end up with not enough accurate bits to fill up the mantissa. C'est la vie.
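One way to see cancellation at work is to compare relative errors before and after the subtraction. The following JavaScript sketch uses the numbers from the question; the variable names are made up for this illustration.

```javascript
const exactC = 30609771735350808n;
const exactD = 30609761111111111n;
const storedD = BigInt(Number(exactD)); // d as actually held in a double

// Relative error introduced by storing d: about 1 part in 3e16.
const storageError = Number(storedD - exactD) / Number(exactD);

// After the subtraction, the same absolute error of 1 sits on a result of only about 1e10.
const exactDiff = exactC - exactD;
const floatDiff = Number(exactC) - Number(exactD);
const resultError = Math.abs(floatDiff - Number(exactDiff)) / Number(exactDiff);

console.log(storageError); // roughly 3.3e-17
console.log(resultError);  // roughly 9.4e-11, millions of times larger
```

The absolute error did not grow; the leading digits cancelled, so the same error is now large relative to what is left.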

Imagine you're sitting on the moon and you take two measurements of your brother's beard hair length in Worcester, UK. Comparing the two measurements suffers from your requirement to store a very large amount of precision.
