Can all single-precision numbers be represented in double-precision format?
Given an arbitrary number represented in the IEEE-754 single-precision format (commonly known as float in some languages/platforms), can I be certain that the number can also be represented exactly in the double-precision format?
If so, does the same property hold for half-precision to single-precision and for double-precision to quadruple-precision?
Comments (2)
Yes, a double can exactly represent any number that a float can. The same goes for half to single, double to quad-precision, etc.
A binary floating-point number is stored in a form like 1.01b x 2^-1 (0.625, in this case). The two components that matter are the significand, which is a binary number with the radix point right after the first digit (for normal numbers), and the exponent.
The only major difference between the binary floating-point formats is the number of bits given to each component. A 32-bit "float" stores 23 significand bits after the point and an 8-bit exponent, so its significand might be 1.01000000000000000000000; a 64-bit "double" stores 52 bits after the point and an 11-bit exponent. This means that any number that is exactly representable in a float is also exactly representable in a double, since you have both increased precision (read: more significant digits) and increased range. Even single-precision subnormals are covered, because the double's wider exponent range holds them as ordinary normal numbers. It's similar to how a 64-bit integer variable can hold any 32-bit integer; the extra bits are simply zero.
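To make the field widths concrete, here is a small Python sketch that pulls the sign, biased exponent, and stored fraction bits out of 0.625 in both formats. The helper names float32_fields and float64_fields are made up for this illustration; only the standard struct module is used.

```python
import struct

def float32_fields(x):
    """Return (sign, biased exponent, fraction bits) of x rounded to single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & ((1 << 23) - 1)

def float64_fields(x):
    """Return (sign, biased exponent, fraction bits) of x as a double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

s32, e32, f32 = float32_fields(0.625)
s64, e64, f64 = float64_fields(0.625)

# Remove the exponent bias (127 for single, 1023 for double).
print("single:", s32, e32 - 127, format(f32, "023b"))
print("double:", s64, e64 - 1023, format(f64, "052b"))

# Widening only appends zero bits to the fraction: 52 - 23 = 29 of them.
print("fraction padded with zeros:", f64 == f32 << 29)
```

Both formats report the same unbiased exponent and the same leading fraction bits; the double just has 29 more trailing zeros.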
Of course, any bits that were lost to rounding won't come back when you convert the number to a double. The 0.3 you have in your float is really an inexact value, 0.300000011920928955078125, and it isn't going to get any closer to 0.3 when you convert it to a double: it will still be 0.300000011920928955078125. If you want a closer approximation, you'll need to redo the calculations with doubles from the start.
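A quick way to see this, sketched in Python (whose float type is a double): round 0.3 to single precision with struct, widen it back, and print the exact stored values with Decimal. The helper to_float32 is a name invented for this example.

```python
import struct
from decimal import Decimal

def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value.

    The result comes back as a Python float, i.e. already widened to double.
    """
    return struct.unpack(">f", struct.pack(">f", x))[0]

single_then_double = to_float32(0.3)   # 0.3 rounded to single, then widened
double_direct = 0.3                    # 0.3 rounded straight to double

# Decimal(float) shows the exact binary value with no further rounding.
print(Decimal(single_then_double))
print(Decimal(double_direct))

# Widening was exact, but it did not recover the bits lost in the first rounding.
print(single_then_double == double_direct)
```

The first line printed is the single-precision value carried over unchanged; the second is the closer approximation you only get by starting from a double.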
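Yes. In fact, you can make an even stronger statement: the product of any two finite single-precision numbers is exactly representable in double precision. A single has a 24-bit significand, so the product needs at most 48 bits, which fits in a double's 53, and the product's exponent stays well inside the double's range. Ditto for half and single, or double and quad.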
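This can be spot-checked empirically. The sketch below, in Python, draws random single-precision bit patterns, multiplies them as doubles, and compares against exact rational arithmetic from the fractions module. The function names and the sample size are arbitrary choices for illustration, and a random sample is evidence, not a proof.

```python
import math
import random
import struct
from fractions import Fraction

def random_float32(rng):
    """Return a random finite single-precision value (as a Python double)."""
    while True:
        value = struct.unpack(">f", struct.pack(">I", rng.getrandbits(32)))[0]
        if math.isfinite(value):      # skip NaN and infinity bit patterns
            return value

def product_is_exact(a, b):
    """True if the double-precision product a * b equals the exact rational product."""
    return Fraction(a) * Fraction(b) == Fraction(a * b)

rng = random.Random(12345)
all_exact = all(
    product_is_exact(random_float32(rng), random_float32(rng))
    for _ in range(10_000)
)
print("every sampled single x single product exact in double:", all_exact)

# For contrast: two doubles multiplied in double precision can lose bits.
d = 1.0 + 2.0 ** -52
print("double x double exact in double:", product_is_exact(d, d))
```

The contrast case at the end shows why the claim needs the wider target format: squaring 1 + 2^-52 needs more significand bits than a double has.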