How are floating point numbers stored? When does it matter?

Posted 2024-07-04 07:20:33

In follow up to this question, it appears that some numbers cannot be represented by floating point at all, and instead are approximated.

How are floating point numbers stored?

Is there a common standard for the different sizes?

What kind of gotchas do I need to watch out for if I use floating point?

Are they cross-language compatible (ie, what conversions do I need to deal with to send a floating point number from a python program to a C program over TCP/IP)?

Comments (9)

栖迟 2024-07-11 07:20:33

In follow up to this question, it
appears that some numbers cannot be
represented by floating point at all,
and instead are approximated.

Correct.

How are floating point numbers stored?
Is there a common standard for the different sizes?

As the other posters already mentioned, almost exclusively IEEE 754 and its successor
IEEE 754r (published as IEEE 754-2008). Googling it gives you a thousand explanations, together with bit patterns and what they mean.
Beyond IEEE 754, two other FP formats are still encountered: IBM and DEC VAX. Some esoteric machines and compilers (BlitzBasic, TurboPascal) have their own
odd formats.

What kind of gotchas do I need to watch out for if I use floating point?
Are they cross-language compatible (ie, what conversions do I need to deal with to
send a floating point number from a python program to a C program over TCP/IP)?

Practically none, they are cross-language compatible.

Rarely occurring quirks:

  • IEEE 754 defines sNaNs (signalling NaNs) and qNaNs (quiet NaNs). The former raise an invalid-operation exception when used in an operation, which can trap into a handler routine; the latter propagate silently. Because language designers disliked the possibility of sNaNs interrupting their workflow, and supporting them requires support for handler routines, sNaNs are almost always silently converted into qNaNs.
    So don't rely on a 1:1 raw conversion. But again: this is very rare and occurs only if NaNs
    are present.

  • You can have problems with endianness (the bytes are in the wrong order) if files are shared between different computers. It is usually easy to detect because the numbers come out as nonsense values, sometimes NaNs.
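
For the Python-to-C case from the question, a minimal sketch of the binary route, assuming both ends agree on IEEE 754 binary64 in network (big-endian) byte order. The helper names `encode_double` and `decode_double` are made up for illustration; on the C side one would read the 8 bytes, fix the byte order if needed, and copy them into a `double`.

```python
import struct

def encode_double(value: float) -> bytes:
    # "!d" means network (big-endian) byte order, 8-byte IEEE 754 double
    return struct.pack("!d", value)

def decode_double(payload: bytes) -> float:
    # Inverse of encode_double; expects exactly 8 bytes
    return struct.unpack("!d", payload)[0]

wire = encode_double(4.58)
restored = decode_double(wire)

# Interpreting the same bytes with the wrong byte order yields a different value
misread = struct.unpack("<d", wire)[0]

print(len(wire), restored, misread)
```

Fixing the byte order explicitly on both sides is what avoids the endianness quirk described above.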

吻安 2024-07-11 07:20:33

Yes, there is: the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754).

When stored in binary, the number is split into three parts: sign, exponent and fraction.
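
As a rough illustration of those three parts, here is a sketch that pulls them out of a 64-bit double (1 sign bit, 11 exponent bits, 52 fraction bits). The function name `split_double` is hypothetical, not part of any library.

```python
import struct

def split_double(value: float):
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer
    bits = struct.unpack("!Q", struct.pack("!d", value))[0]
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)  # 52 bits, leading 1 is implicit
    return sign, exponent, fraction

# -0.75 is -1.5 * 2**-1, so the stored exponent is 1023 - 1
# and the fraction encodes the .5 after the implicit leading 1
print(split_double(-0.75))
```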

层林尽染 2024-07-11 07:20:33

This article, entitled "IEEE Standard 754 Floating Point Numbers", may be helpful. To be honest, I'm not completely sure I understand your question, so I'm not sure this will help, but I hope it does.

无戏配角 2024-07-11 07:20:33

If you're really worried about floating point rounding errors, most languages offer decimal data types that store decimal fractions exactly. SQL Server has the Decimal and Money data types. .NET has the Decimal data type. They aren't arbitrary precision like BigDecimal in Java, but they are exact down to the number of decimal places they are defined for. So you don't have to worry about a dollar value you type in as $4.58 getting saved as a floating point value such as 4.579999999999997.
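
Python has a comparable type in its standard `decimal` module. A small sketch of the difference, using a made-up price; the values are built from strings so they never pass through binary floating point:

```python
from decimal import Decimal

# Binary floats: the classic surprise
float_sum = 0.1 + 0.2
print(float_sum == 0.3)

# Decimal values built from strings are stored exactly
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))

price = Decimal("4.58")
total = price * 3
print(total)

# Constructing a Decimal from a float exposes the binary approximation
print(Decimal(4.58))
```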

厌味 2024-07-11 07:20:33

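What I remember is that a 32-bit float has 24 bits of precision for the actual number (23 stored fraction bits plus an implicit leading 1, with a separate sign bit), and the remaining 8 bits are an exponent, a power of 2 rather than 10, which determines where the binary point is.

I'm a bit rusty on the subject though...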

半世蒼涼 2024-07-11 07:20:33

As mentioned, the Wikipedia article on IEEE 754 does a good job of showing how floating point numbers are stored on most systems.

Now, here are some common gotchas:

  • The biggest is that you almost never want to compare two floating point numbers for equality (or inequality). Use greater than/less than comparisons instead, typically by checking whether the difference is within a small tolerance.
  • The more operations you do on a floating point number, the more significant rounding errors can become.
  • Precision is limited by the size of the fraction, so you may not be able to correctly add numbers that are separated by many orders of magnitude. (For example, adding 1E-30 to 1E30 just gives 1E30 again.)
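
The three gotchas above can be reproduced in a few lines. This is only a sketch; the tolerance passed to `math.isclose` is an arbitrary choice for illustration and should be picked per application.

```python
import math

# 1. Equality: compare within a tolerance instead of using ==
a = 0.1 + 0.2
exactly_equal = (a == 0.3)
close_enough = math.isclose(a, 0.3, rel_tol=1e-9)

# 2. Accumulated rounding: ten additions of 0.1 do not give exactly 1.0
naive_total = sum([0.1] * 10)
careful_total = math.fsum([0.1] * 10)  # fsum tracks the lost low-order bits

# 3. Absorption: the small term is far below the precision of the large one
big = 1e30
absorbed = (big + 1e-30 == big)

print(exactly_equal, close_enough, naive_total, careful_total, absorbed)
```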

过度放纵 2024-07-11 07:20:33

As to the second part of your question: unless performance and efficiency are important for your project, I suggest you transfer the floating point data as a string over TCP/IP. This lets you avoid issues such as byte order and alignment, and eases debugging.
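
A minimal sketch of the text route on the Python side, assuming a newline-terminated ASCII line per number (the framing and the names `to_wire` and `from_wire` are assumptions for illustration). In Python 3, `repr` of a float produces a string that parses back to the same value; a C receiver could parse the line with something like `strtod`.

```python
def to_wire(value: float) -> bytes:
    # repr() gives the shortest decimal string that round-trips the double
    return (repr(value) + "\n").encode("ascii")

def from_wire(line: bytes) -> float:
    # Inverse of to_wire; tolerates the trailing newline
    return float(line.decode("ascii").strip())

message = to_wire(0.1)
print(message)
print(from_wire(message))
```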

月寒剑心 2024-07-11 07:20:33

The standard is IEEE 754.

Of course, there are other means to store numbers when IEEE 754 isn't good enough. Libraries like Java's BigDecimal are available for most platforms and map well to SQL's numeric types. Symbols can be used for irrational numbers, and ratios that can't be accurately represented in binary or decimal floating point can be stored as a ratio.
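
Storing a value as a ratio can be sketched with Python's standard `fractions` module. This is just one possible library; the values are arbitrary examples.

```python
from fractions import Fraction

third = Fraction(1, 3)   # cannot be written exactly in binary or decimal
whole = third * 3        # arithmetic stays exact
half = Fraction(1, 3) + Fraction(1, 6)

# Decimal strings are parsed exactly as well
tenths = Fraction("0.1") + Fraction("0.2")

print(whole, half, tenths)
```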

陌路终见情 2024-07-11 07:20:33

Basically, what you need to worry about with floating point numbers is that there is a limited number of digits of precision. This can cause problems when testing for equality, or if your program actually needs more digits of precision than that data type gives you.

In C++, a good rule of thumb is that a float gives you about 7 significant decimal digits of precision, while a double gives you about 15. Also, if you are interested in knowing how to test for equality, you can look at this question thread.
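
The digit counts can be seen from Python too. In this sketch the helper `as_float32` (a made-up name) squeezes a value through a 32-bit float using `struct`, which is roughly what assigning to a C++ `float` does.

```python
import struct
import sys

def as_float32(value: float) -> float:
    # Round-trip through a 4-byte IEEE 754 single-precision float
    return struct.unpack("f", struct.pack("f", value))[0]

# Nine significant digits do not fit in a float: the last ones change
nine_digits = as_float32(123456789.0)

# Decimal digits a double is guaranteed to preserve
double_digits = sys.float_info.dig

print(nine_digits, double_digits)
```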
