Why is it dangerous to convert integers to float16?
I recently ran into a surprising and annoying bug: I converted an integer to float16 and the value changed:
>>> import numpy as np
>>> np.array([2049]).astype(np.float16)
array([2048.], dtype=float16)
>>> np.array([2049]).astype(np.float16).astype(np.int32)
array([2048], dtype=int32)
This is probably not a bug, because it also happens in PyTorch. I guess it is related to the half-precision float representation, but I couldn't figure out why 2049 is the first integer that is cast incorrectly.
The question is not specifically related to Python (I guess).
Comments (2)
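You are right, it is related to how floating-point numbers are defined (in IEEE 754, as others said). Let's look into it:
The float is represented by a sign s (here 1 bit), a mantissa m (here 10 bits) and an exponent e (here 5 bits, for −14 ≤ e ≤ 15). The float x is then calculated by
x = (-1)**s * [1].m * b**e
where the base b is 2 and [1] is a fixed (for-free) leading bit, so the effective precision is 11 bits.
Up to 2**11 our integer can be represented exactly by the mantissa, where
2**11 - 1 = 2047 is stored as m = 1111111111, e = 10
2**11 = 2048 is stored as m = 0000000000, e = 11
Then things get interesting:
2**11 + 1 = 2049 would need an 11th mantissa bit, so it is rounded to the nearest representable value (the tie between 2048 and 2050 goes to 2048, whose mantissa is even)
2**11 + 2 = 2050 is stored as m = 0000000001, e = 11
and so on...
Watch this video for detailed examples: https://www.youtube.com/watch?v=L8OYx1I8qNg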
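To make the sign / exponent / mantissa split concrete, here is a minimal sketch that pulls the three fields out of the raw 16 bits of a float16. The helper name float16_fields is made up for this illustration, and it assumes normal (non-subnormal) values and the standard exponent bias of 15.

```python
import numpy as np

def float16_fields(value):
    """Return (sign, unbiased exponent, 10-bit mantissa string) of value as float16.

    Hypothetical helper for illustration; only valid for normal numbers.
    """
    # Reinterpret the 16 bits of the half-precision value as an unsigned integer.
    bits = int(np.array([value], dtype=np.float16).view(np.uint16)[0])
    sign = bits >> 15
    exponent = ((bits >> 10) & 0x1F) - 15  # 5 exponent bits, bias of 15 removed
    mantissa = bits & 0x3FF                # 10 stored mantissa bits
    return sign, exponent, format(mantissa, "010b")

for n in (2047, 2048, 2049, 2050, 2051):
    # Show what the integer becomes after the cast and how it is encoded.
    print(n, "->", float(np.float16(n)), float16_fields(n))
```

With NumPy's default round-to-nearest-even, 2049 comes out as 2048.0 and 2051 as 2052.0: from 2**11 onwards only every second integer has its own bit pattern.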
The IEEE 754 spec gives float16 11 bits of significand precision (10 stored fraction bits plus one implicit leading bit) and 5 bits for the exponent. I imagine that when trying to represent 2049 you hit the limit of the significand bits, since 2 ** 11 == 2048. I am unsure why 2049 becomes 2048, however.
Source: wikipedia:IEEE_754
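As a quick check of the significand limit, here is a small sketch that asks NumPy for the float16 parameters and scans for the first integer that does not survive the round trip. The variable names and the scan range (0 to 4095) are arbitrary choices for this illustration.

```python
import numpy as np

info = np.finfo(np.float16)
precision_bits = info.nmant + 1  # stored fraction bits plus the implicit leading bit

# Cast a range of integers to float16 and back, then find the first mismatch.
ints = np.arange(0, 4096, dtype=np.int32)
round_trip = ints.astype(np.float16).astype(np.int32)
first_changed = int(ints[round_trip != ints][0])

# Gap between 2048 and the next representable float16 value.
gap_at_2048 = float(np.spacing(np.float16(2048)))

print("stored fraction bits:", info.nmant)
print("effective precision bits:", precision_bits)
print("first integer that changes:", first_changed)
print("spacing at 2048:", gap_at_2048)
```

The spacing at 2048 is 2, so 2049 sits exactly halfway between 2048 and 2050. That is consistent with the round-to-nearest, ties-to-even rule picking 2048.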