Floating point precision bits

Posted on 2024-11-14 16:38:34

This wiki article shows 23 bits for precision, 8 for the exponent, and 1 for the sign.

Where is the hidden 24th bit in the float type that gives (23+1) bits of precision, for about 7 significant decimal digits?

Comments (3)

晨光如昨 2024-11-21 16:38:34

Floating point numbers are usually normalized. Consider, for example, scientific notation as most of us learned it in school. You always scale the exponent so there's exactly one digit before the decimal point. For example, instead of 123.456, you write 1.23456 × 10^2.

Floating point on a computer is normally handled (almost [1]) the same way: numbers are normalized so there's exactly one digit before the binary point (binary point, since most hardware works in binary instead of decimal). There's one difference though: in binary, the digit before the binary point of a normalized nonzero number must be a 1. Since it's always a 1, there's no real need to store that bit. To save a bit of storage in each floating point number, that 1 bit is implicit instead of being stored.
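To make the implicit bit concrete, here is a minimal Python sketch that pulls a value apart into the three stored fields of the 32-bit format and then rebuilds it by putting the leading 1 back. The helper names `float32_fields` and `rebuild_normal` are made up for this illustration, and the rebuild only covers normal numbers.

```python
import struct

def float32_fields(x):
    """Round x to IEEE 754 binary32 and return (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF      # 23 stored bits
    return sign, exponent, fraction

def rebuild_normal(sign, exponent, fraction):
    """Rebuild a normal number; the leading 1 is added here, not read from storage."""
    significand = (1 << 23) | fraction  # 24 bits: implicit 1 followed by the 23 stored bits
    return (-1) ** sign * significand * 2.0 ** (exponent - 127 - 23)

s, e, f = float32_fields(0.15625)   # 0.15625 = 1.01 (binary) * 2^-3
print(s, e, f"{f:023b}")            # 0 124 01000000000000000000000
print(rebuild_normal(s, e, f))      # 0.15625
```

Only 23 bits come out of the stored word; the 24th bit of the significand is supplied by the `(1 << 23)` in the rebuild step.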

As usual, there's a bit more to the situation than that. The main exception is denormalized numbers. Consider, for example, if you were doing scientific notation but you could only use exponents from -99 to +99. If you wanted to store a number like, say, 1.234 × 10^-102, you wouldn't be able to do that directly, so it would probably just get rounded down to 0.

Denormalized numbers give you a way to deal with that. Using a denormalized number, you'd store that as 0.001234 × 10^-99. Assuming (as is normally the case on a computer) that the number of digits for the mantissa and exponent are each limited, this loses some precision, but still avoids throwing away all the precision and just calling it 0.
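The same trade-off can be seen in the binary 32-bit format. The sketch below assumes the usual IEEE 754 single-precision limits (smallest normal 2^-126, smallest denormal 2^-149) and uses a hypothetical helper, `to_float32`, that rounds a Python float to the nearest 32-bit value by round-tripping it through `struct`.

```python
import struct

def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# Below the smallest normal number (2^-126) the value is still stored, as a denormal.
print(to_float32(2.0 ** -128) == 2.0 ** -128)   # True

# Precision is lost gradually: the low-order bit of this value falls off the end.
x = (1 + 2.0 ** -23) * 2.0 ** -136
print(to_float32(x) == 2.0 ** -136)             # True

# Far enough below the smallest denormal (2^-149), the value does round to zero.
print(to_float32(2.0 ** -151))                  # 0.0
```

The middle case is the analogue of writing 0.001234 with a fixed number of digits: the number survives, but with fewer significant bits than a normal number would have.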


[1] Technically, there are differences, but they make no difference to the basic understanding involved.

蓝礼 2024-11-21 16:38:34

http://en.wikipedia.org/wiki/Single_precision_floating-point_format#IEEE_754_single_precision_binary_floating-point_format:_binary32

"The true significand includes 23 fraction bits to the right of the binary point and an implicit leading bit (to the left of the binary point) with value 1 unless the exponent is stored with all zeros."

That explains it pretty well: by convention/design, the leading bit is not stored explicitly. Instead, the specification states that it is 1 unless the exponent field is all zeros.
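One way to see the 24 bits at work, and where the "7 digits" figure in the question comes from, is to look at integers near 2^24. This is a small Python sketch; `to_float32` is a hypothetical helper that rounds to the 32-bit format through `struct`.

```python
import math
import struct

def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# 2^24 - 1 needs exactly 24 significant bits, so it is stored exactly.
print(to_float32(16777215.0))   # 16777215.0

# 2^24 + 1 needs 25 significant bits, so it cannot be stored and gets rounded.
print(to_float32(16777217.0))   # 16777216.0

# 24 binary digits correspond to roughly 7.2 decimal digits.
decimal_digits = 24 * math.log10(2)
print(round(decimal_digits, 2))  # 7.22
```

If only the 23 stored bits counted, integers would stop being exact at 2^23 rather than 2^24.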

热风软妹 2024-11-21 16:38:34

As you write, the single-precision floating-point format has a sign bit, eight exponent bits, and 23 significand bits. Let s be the sign bit, e be the exponent bits, and f be the significand bits. Here is what various combinations of bits stand for:

If e and f are zero, the object is +0 or -0, according to whether s is 0 or 1.

If e is zero and f is not, the object is (-1)^s * 2^(1-127) * 0.f. "0.f" means to write 0, period, and the 23 bits of f, then interpret that as a binary numeral. E.g., 0.011000... is 3/8. These are the "subnormal" numbers.

If 0 < e < 255, the object is (-1)^s * 2^(e-127) * 1.f. "1.f" is similar to "0.f" above, except you start with 1 instead of 0. This is the implicit bit. Most of the floating-point numbers are in this format; these are the "normal" numbers.

If e is 255 and f is zero, the object is +infinity or -infinity, according to whether s is 0 or 1.

If e is 255 and f is not zero, the object is a NaN (Not a Number). The meaning of the f field of a NaN is implementation dependent; it is not fully specified by the standard. Commonly, if the first bit is zero, it is a signaling NaN; otherwise it is a quiet NaN.
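The five cases above can be sketched as a small decoder. This is an illustration in Python, not a reference implementation: the function name `decode_binary32` is made up, it takes the raw 32 bits as an integer, and it returns a plain NaN without trying to distinguish signaling from quiet.

```python
import math
import struct

def decode_binary32(bits):
    """Decode a 32-bit pattern following the five cases described above."""
    s = bits >> 31
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if e == 0:
        # Zeros (f == 0) and subnormals: 0.f * 2^(1-127), no implicit 1.
        return sign * (f / 2 ** 23) * 2.0 ** (1 - 127)
    if e == 255:
        # Infinities (f == 0) and NaNs (f != 0).
        return sign * math.inf if f == 0 else math.nan
    # Normal numbers: 1.f * 2^(e-127), with the implicit leading 1.
    return sign * (1 + f / 2 ** 23) * 2.0 ** (e - 127)

def hardware_decode(bits):
    """Let struct reinterpret the same bits, for comparison."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

for pattern in (0x3F800000, 0x3E200000, 0x00300000, 0x00000001, 0xFF800000):
    print(f"{pattern:08X}", decode_binary32(pattern) == hardware_decode(pattern))
```

The pattern 0x00300000 is the subnormal from the example above: its f field reads 0.011000... in binary, so it decodes to 3/8 * 2^(1-127).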
