Trusting floating point arithmetic
I know there are problems with numbers like 0.3 which cannot be expressed exactly as floating point numbers, so they introduce rounding errors.
What about the numbers that can be represented, like 0.5, 0.75, etc.? Can I trust floating point arithmetic to be error-free if I'm only dealing with numbers that are negative powers of two and numbers composed from them?
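For illustration, a quick Python check (Python floats are IEEE 754 doubles; the values are just examples) of the distinction the question is drawing:

```python
# Dyadic rationals (sums of negative powers of two) are represented exactly.
print(0.5 + 0.25 == 0.75)        # True: every value here is exact
print(0.5 * 0.75 == 0.375)       # True: the product also fits in the significand

# 0.3 and friends are not representable, so rounding error creeps in.
print(0.1 + 0.2 == 0.3)          # False
print(0.1 + 0.2)                 # 0.30000000000000004
```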
Comments (6)
Assuming you have an IEEE 754 architecture, and if you are only performing addition, subtraction and multiplication, and if the result fits, then it will be exact. Division can only be used if the resulting denominator is a power of two. Any other built-in maths function like exp or log cannot possibly be exact (due to Lindemann-Weierstrass); ditto for non-natural powers (though there isn't even a built-in power function in most CPUs anyway).
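A small Python sketch of this claim, using fractions.Fraction only to verify exactness (the specific values are just examples):

```python
from fractions import Fraction   # exact rational arithmetic, used only to verify

# Add/subtract/multiply dyadic values: exact as long as the result fits.
print(Fraction(0.75 + 0.125) == Fraction(7, 8))    # True
print(Fraction(0.75 * 0.25) == Fraction(3, 16))    # True

# Division is exact only when the result's denominator is a power of two.
print(Fraction(3.0 / 4.0) == Fraction(3, 4))       # True  (denominator 4)
print(Fraction(1.0 / 3.0) == Fraction(1, 3))       # False (1/3 is not dyadic)
```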
There is another obvious restriction: a normal floating point number has (for example) a 53-bit significand, so (after scaling) the binary representation of every number involved has to fit into 53 significant bits to avoid losing precision.
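A Python illustration of that 53-bit budget (sys.float_info.epsilon is 2**-52 for doubles):

```python
import sys

# A double has a 53-bit significand, so every integer up to 2**53 is exact...
print(float(2**53) == 2**53)          # True
print(float(2**53) + 1.0 == 2**53)    # True: the +1 no longer fits and is rounded away

# ...and the same 53-bit budget applies after scaling.
eps = sys.float_info.epsilon          # 2**-52, the spacing of doubles just above 1.0
print(1.0 + eps == 1.0)               # False: 1 + 2**-52 is still representable
print(1.0 + eps / 2 == 1.0)           # True: 1 + 2**-53 rounds back to 1.0 (ties to even)
```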
Some additions and subtractions can be exact in floating-point arithmetic, but in general multiplication cannot be free of rounding, because in the worst case you need double the number of bits to represent the product exactly.
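A hedged Python example of this point; the factor 1 + 2**-30 is just a convenient value whose exact square needs roughly 60 significant bits:

```python
from fractions import Fraction
import math

# Each factor needs about 30 significant bits, so the exact product needs
# about 60, which no longer fits into a 53-bit significand.
a = 1.0 + math.ldexp(1.0, -30)             # 1 + 2**-30, exactly representable
exact_square = Fraction(a) * Fraction(a)   # exact rational value of a*a
print(Fraction(a * a) == exact_square)     # False: the float product was rounded

# With short operands the exact product still fits, so no rounding occurs.
print(Fraction(0.75 * 0.375) == Fraction(3, 4) * Fraction(3, 8))   # True
```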
When you have an IEEE-754-conforming floating point implementation, you have the guarantee that the basic operations (addition, subtraction, multiplication, division, remainder, square root) are computed as precisely as possible: each returns the representable value closest to the exact mathematical result. So you can perform all of these operations safely, as long as the exact result is representable.
In contrast to the other basic operations, the remainder operation is exact for any two operands.
You just have to make sure that you never need more precision than the precision provided by the floating point type.
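A minimal Python sketch of this guarantee, assuming CPython where floats are IEEE 754 doubles and math.remainder implements the IEEE remainder:

```python
import math

# Correct rounding: each basic operation returns the double closest to the
# exact mathematical result, so when that result is representable it is exact.
print(math.sqrt(2.25) == 1.5)                       # True: 1.5 * 1.5 == 2.25 exactly
print(math.sqrt(2.0) * math.sqrt(2.0) == 2.0)       # False: sqrt(2) had to be rounded

# The remainder is exact for any two finite operands, even non-dyadic ones.
print(math.remainder(7.0, 3.0))                     # 1.0, exact
```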
You need to get a copy of the IEEE floating point spec and study it. Pretty much all compilers and CPUs follow it to the letter these days, so if you follow the spec to the letter you can get "exact" results.
The one thing you don't always have control over (depending on the language) is whether the result of a computation remains in a register or is stored back to "home". This can affect the precision that is carried forward into the next computation.
But just about every common computing language implements (or has available as an add-on) some sort of "long decimal" or "long integer" support that can be used to produce exact results to arbitrary length/precision, so long as you stick to values that are representable in those forms.
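In Python, for example, the standard library already ships this kind of support; a small sketch using decimal and fractions:

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Python ints are already arbitrary precision; decimal gives "long decimal"
# arithmetic and fractions gives exact rationals.
getcontext().prec = 50                      # 50 decimal digits of working precision
print(Decimal("0.1") + Decimal("0.2"))      # 0.3 exactly, no binary rounding

print(Fraction(1, 3) + Fraction(1, 6))      # 1/2, exact rational arithmetic
```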
To begin with, the implementation of the IEEE specification on the x86 architecture attempts to handle all out-of-the-ordinary situations with exceptions. Divide by zero, underflow and overflow are the obvious ones. Another, not so obvious, is "inexact", i.e. the result of an operation cannot be represented exactly. In any case - as I understand it - many development environments simply mask the exceptions, causing such situations to go unnoticed. Even in environments without training wheels, the inexact exception tends to be masked, but it can of course be enabled.
As to the question about negative powers of two, the answer is rather that you should make sure unsafe values never end up where they have no business being: 0 as a divisor, negative values fed into sqrt or log/ln, and so on. This means putting controls on the inputs so that your algorithms don't freak out when using them. Since your exceptions will probably be masked, your algorithm may have done quite a bit of work with bad values before you are faced with the result: +NAN, -NAN or "expletive deleted"-looking output from printf.
Floating point arithmetic brings issues with it which can open (and often do open) a can of worms. So my recommendation is that you peek some more under the hood and experiment with the values you plug into different fp operations.
These days, it doesn't take much to become a floating point guru.
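A minimal Python sketch of the "control the inputs" advice; checked_divide is a hypothetical helper, not anything from a standard library:

```python
import math

# With FP exceptions masked (the default almost everywhere), bad values just
# propagate: NaN survives any amount of further arithmetic without complaint.
x = float("inf") - float("inf")      # inf - inf has no sensible answer -> NaN
y = x * 2.0 + 1.0                    # still NaN after more "work"
print(y, math.isnan(y))              # nan True

# Hypothetical guard: check the inputs where the error is easy to diagnose,
# rather than discovering a NaN or inf in the final printout.
def checked_divide(numerator, divisor):
    if divisor == 0.0 or math.isnan(divisor):
        raise ValueError(f"refusing to divide by {divisor!r}")
    return numerator / divisor
```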