Representing integers with doubles

Posted on 2024-07-17 00:55:59

Can a double (of a given number of bytes, with a reasonable mantissa/exponent balance) always hold, with full precision, the range of an unsigned integer of half that number of bytes?

E.g. can an eight-byte double hold, with full precision, the range of numbers of a four-byte unsigned int?

What this boils down to is whether a two-byte float can hold the range of a one-byte unsigned int.

A one-byte unsigned int will of course be 0 -> 255.
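The two-byte case can be checked directly. This is a minimal sketch, assuming the two-byte float is IEEE 754 binary16 (half precision), which Python's `struct` module exposes as format `e`. The helper name `roundtrip_half` is made up for illustration.

```python
import struct

def roundtrip_half(value: int) -> float:
    """Store an integer in a 2-byte IEEE 754 binary16 float and read it back."""
    return struct.unpack("<e", struct.pack("<e", float(value)))[0]

# Every one-byte unsigned value (0..255) should survive the round trip unchanged.
all_bytes_exact = all(roundtrip_half(n) == n for n in range(256))

# binary16 has an 11-bit significand (10 stored bits plus the implicit 1),
# so the first integer that cannot be stored exactly is just above 2**11.
first_inexact = next(n for n in range(4096) if roundtrip_half(n) != n)

print(all_bytes_exact, first_inexact)
```

Under that assumption the whole 0 -> 255 range fits with room to spare, since the significand is wider than 8 bits.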


Comments (4)

南笙 2024-07-24 00:55:59

An IEEE754 64-bit double can represent any 32-bit integer, simply because it has 53-odd (a) bits available for precision and the 32-bit integer only needs, well, 32 :-)

It would be plausible for a (non-IEEE754 double-precision) 64-bit floating-point number to have fewer than 32 bits of precision. That would allow truly huge numbers (due to the exponent) but at the cost of precision.

The bottom line is that, provided there are at least as many bits of precision in the mantissa of the floating-point number as there are in the integer (and enough bits in the exponent to scale it), then it can be represented without loss of precision.

(a) Technically, the 53rd bit of precision is an implied 1 at the start of the sequence, so the amount of "variability" may only be 52 bits. Whether it's 52 or 53, it's still enough bits to represent every 32-bit integer.
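A quick way to see this in practice, as a sketch that assumes Python's `float` is an IEEE754 64-bit double (true on mainstream platforms, but worth confirming with `sys.float_info.mant_dig`). The helper `is_exact` is a hypothetical name, not a library function.

```python
import sys

# Precision of the platform's double in bits; 53 for IEEE754 binary64.
SIGNIFICAND_BITS = sys.float_info.mant_dig

def is_exact(n: int) -> bool:
    """True if converting n to a double and back returns the same integer."""
    return int(float(n)) == n

max_u32 = 2**32 - 1

# Sample the 32-bit range with a stride rather than testing all 2**32 values.
sample_ok = all(is_exact(n) for n in range(0, 2**32, 65537))

print(SIGNIFICAND_BITS)      # precision in bits
print(is_exact(max_u32))     # largest 32-bit unsigned value
print(is_exact(2**53))       # still exact
print(is_exact(2**53 + 1))   # needs 54 bits, so it gets rounded
```

The last check is the interesting one: 2**53 + 1 is the smallest positive integer that a 53-bit significand cannot hold.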

心在旅行 2024-07-24 00:55:59

Yes. A float (or double) is guaranteed to exactly represent any integer that does not need to be truncated, i.e. one whose significant bits fit within the precision of the format. For a double, there are 53 bits of precision, so that is more than enough to exactly represent any 32-bit integer, and a tiny (statistically speaking) proportion of 64-bit ones too.
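To put rough numbers on this, here is a sketch assuming IEEE 754 binary32 for "float" and binary64 for "double". `roundtrip_single` is an illustrative helper, and the count below is my own arithmetic rather than something from the answer.

```python
import struct

def roundtrip_single(n: int) -> float:
    """Store an integer in a 4-byte IEEE 754 binary32 float and read it back."""
    return struct.unpack("<f", struct.pack("<f", float(n)))[0]

# A single-precision float has 24 bits of precision.
single_ok = roundtrip_single(2**24) == 2**24
single_bad = roundtrip_single(2**24 + 1) == 2**24 + 1

# Unsigned 64-bit integers that a double holds exactly:
# all of [0, 2**53), plus 2**52 values in each binade [2**k, 2**(k+1))
# for k = 53..63 (11 binades).
exact_count = 2**53 + 11 * 2**52
share = exact_count / 2**64

print(single_ok, single_bad, share)
```

By that count the "tiny proportion" is 13/4096, or roughly 0.3% of all unsigned 64-bit values.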

滴情不沾 2024-07-24 00:55:59

The exact range you can represent without error depends on a lot of factors in your implementation, but you can lower-bound it: whatever range the exponent adds, you can exactly represent every integer no wider than your mantissa field (assuming a separate sign bit). For IEEE 754 double-precision, this means you can represent 52-bit numbers exactly (53 bits if you count the implicit leading 1). In general, your mantissa will be over half the width of the overall structure.
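The field layout is easy to inspect. This is a sketch that assumes the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 stored mantissa bits); `double_fields` is a hypothetical helper.

```python
import struct

def double_fields(x: float) -> tuple[int, int, int]:
    """Split a double into (sign, biased exponent, stored mantissa bits)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF       # 11-bit field, bias 1023
    fraction = bits & ((1 << 52) - 1)     # 52 stored bits, leading 1 is implicit
    return sign, exponent, fraction

# 2**52 + 1 is 1.000...001 (binary) times 2**52: the lowest stored bit is set
# and the biased exponent is 52 + 1023.
print(double_fields(float(2**52 + 1)))
print(double_fields(1.0))
```

Here the mantissa field takes 52 of the 64 bits, which matches the "over half the width" observation for this format.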

青瓷清茶倾城歌 2024-07-24 00:55:59

I wouldn't use the words "fully precisely" when talking about floating-point numbers. But yes, a double can represent a 32-bit integer.

I do not know which other combinations of floats and ints this is also true for.

Practically speaking, you don't want to bother using floating point above what your machine supports, so just switch to rational arithmetic with bignums. That way, you're guaranteed precision.
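As one illustration of that suggestion (my choice of tool, not something the answer names), Python's built-in arbitrary-precision `int` together with `fractions.Fraction` gives exact rational arithmetic:

```python
from fractions import Fraction

big = 2**53 + 1  # one more than a double's 53-bit significand can hold

# Rational arithmetic on bignums stays exact.
exact = Fraction(big) + Fraction(1, 3)

# The same integer converted to a double is rounded before any arithmetic happens.
float_is_exact = int(float(big)) == big

print(exact)           # an exact numerator/denominator pair
print(float_is_exact)
```

The cost is speed and memory, since numerators and denominators can grow without bound.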
