C: Floating point rounding
I'm trying to understand how floating point numbers work.
I think I'd like to test what I know / need to learn by evaluating the following: I would like to find the smallest x such that x + 1 = x, where x is a floating point number.
As I understand it, this would happen when x is large enough that x + 1 is closer to x than to the next floating point number above x. So intuitively it seems this would be the case where I don't have enough digits in the significand. Would this number x then be the number where the significand is all 1's? But then I can't seem to figure out what the exponent would have to be. Obviously it would have to be big (relative to 10^0, anyway).
Comments (3)
You just need an expression for the value of the LS bit in the mantissa in terms of the exponent. Once this value exceeds 1, x + 1 can no longer be represented exactly and your condition can be met. A single precision float has a 24-bit significand (23 stored bits plus an implicit leading 1), so for a normalized number 1.m * 2^exp the LS bit has a value of 2^-23 * 2^exp. That exceeds 1 when exp > 23, i.e. from exp = 24, where the spacing between adjacent floats is 2. At x = 1.0 * 2^24, x + 1 lies exactly halfway between x and the next float, and the default round-to-nearest-even mode rounds it back to x. The smallest (normalized) number where this condition is satisfied would therefore be 1.0 * 2^24 = 16777216.0f.
I haven't checked this on real hardware, and the result depends on how the FP unit rounds: a different rounding mode, or intermediate results kept in more than 24 bits, could change the answer, but you get the general idea...
Start with 1.0, and keep doubling it until the test succeeds:
I suggest that while trying to understand f-p numbers and f-p arithmetic you work in decimal with 5 digits in the significand and 2 in the exponent. (Or, if 5 and 2 don't suit you, 6 and 3 or any other small numbers you like.) The issues of:
are all much easier to figure out in decimal and the lessons you learn are entirely general. Once you've got this figured out, enhancing your knowledge with IEEE f-p arithmetic will be relatively straightforward. You'll also be able to figure out other f-p arithmetic systems with relative ease.