Why is converting an integer to float16 dangerous?
I recently ran into a surprising and annoying bug in which I converted an integer to float16 and the value changed:
>>> import numpy as np
>>> np.array([2049]).astype(np.float16)
array([2048.], dtype=float16)
>>> np.array([2049]).astype(np.float16).astype(np.int32)
array([2048], dtype=int32)
This is probably not a bug, because the same thing happens in PyTorch. I guess it is related to the half-precision float representation, but I couldn't figure out why 2049 is the first integer that is cast incorrectly.
The question is not specifically related to Python (I guess).
Comments (2)
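You are right, it is generally related to how floating-point numbers are defined (in IEEE 754, as others said). Let's look into it:
The float is represented by a sign s (here 1 bit), a mantissa m (here 10 bits) and an exponent e (here 5 bits, giving −14 ≤ e ≤ 15 for normal numbers). The float x is then calculated by
x = (-1)**s * [1].m * b**e
where the base b is 2 and [1] is a fixed (for-free) leading bit that is not stored.
Up to 2**11 our integer can be represented exactly by the mantissa, where
2**11 - 1 = 2047 is represented by m = 1111111111 and e = 10
2**11 = 2048 is represented by m = 0000000000 and e = 11
then things get interesting:
2**11 + 1 = 2049 cannot be represented exactly, because with e = 11 one mantissa step is worth 2. It lies exactly halfway between 2048 and 2050, and the default round-half-to-even rule picks 2048.
2**11 + 2 = 2050 is represented by m = 0000000001 and e = 11
and so on...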
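The bit patterns described above can be inspected directly. The following is a minimal sketch that uses only the standard library's struct module, whose "e" format packs a value as IEEE 754 binary16. The helper name float16_fields is made up for this illustration, and it only handles normal numbers.

```python
import struct

def float16_fields(value):
    """Return (sign, unbiased exponent, mantissa bits) of value rounded to float16.

    Hypothetical helper for illustration; valid for normal numbers only
    (subnormals, infinities and NaN are not handled).
    """
    # "e" is IEEE 754 binary16; read the two bytes back as an unsigned integer.
    (bits,) = struct.unpack("<H", struct.pack("<e", float(value)))
    sign = bits >> 15
    exponent = ((bits >> 10) & 0x1F) - 15  # the stored exponent has a bias of 15
    mantissa = bits & 0x3FF                # the 10 explicitly stored bits
    return sign, exponent, format(mantissa, "010b")

for n in (2047, 2048, 2049, 2050):
    print(n, float16_fields(n))
```

2048 and 2049 come out with identical fields, which is another way of seeing that 2049 has no float16 encoding of its own.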
Watch this video for detailed examples: https://www.youtube.com/watch?v=L8OYx1I8qNg
The IEEE 754 spec gives float16 11 bits of significand precision (10 stored explicitly plus 1 implicit leading bit) and 5 bits for the exponent. I imagine that in trying to represent 2049 you hit the limit of the significand bits, since 2 ** 11 == 2048. I am unsure why 2049 becomes 2048, however.
Source: wikipedia:IEEE_754
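On the open point of why 2049 becomes 2048: IEEE 754's default rounding mode is round-to-nearest with ties going to the value whose significand is even, and 2049 sits exactly halfway between the representable neighbours 2048 and 2050. The sketch below checks this empirically with NumPy, assuming its integer-to-float16 cast uses that default rounding mode (the output in the question is consistent with this). The variable names are chosen for this example only.

```python
import numpy as np

# Round-trip every integer in [0, 4096] through float16 and back.
ints = np.arange(0, 4097, dtype=np.int64)
round_trip = ints.astype(np.float16).astype(np.int64)

# Integers whose value changed during the round trip.
changed = ints[round_trip != ints]
first_changed = int(changed[0])
print(first_changed)  # 2049

# Ties go to the neighbour with an even significand:
# 2049 falls back to 2048, while 2051 goes up to 2052.
print(int(round_trip[2049]), int(round_trip[2051]))

# Gap to the next representable float16 at 1024 and at 2048.
gap_1024 = float(np.spacing(np.float16(1024)))
gap_2048 = float(np.spacing(np.float16(2048)))
print(gap_1024, gap_2048)
```

From 2048 upwards the gap between adjacent float16 values is 2, so odd integers in that range cannot survive the cast. For integer data that may exceed 2048, float32 (exact up to 2**24) is the safer target.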