Bad floating-point magic

Posted 2024-12-11 05:09:27

I have a strange floating-point problem.

Background:

I am implementing a double-precision (64-bit) IEEE 754 floating-point library for an 8-bit processor with a large integer arithmetic co-processor. To test this library, I am comparing the values returned by my code against the values returned by Intel's floating-point instructions. These don't always agree, because Intel's Floating-Point Unit stores values internally in an 80-bit format, with a 64-bit mantissa.

Example (all in hex):

X = 4C816EFD0D3EC47E:
biased exponent = 4C8 (true exponent = 4C8 - 3FF = C9), mantissa = 116EFD0D3EC47E

Y = 449F20CDC8A5D665:
biased exponent = 449 (true exponent = 449 - 3FF = 4A), mantissa = 1F20CDC8A5D665

Calculate X * Y

The product of the mantissas is 10F5643E3730A17FF62E39D6CDB0, which when rounded to 53 (decimal) bits is 10F5643E3730A1 (because the top bit of 7FF62E39D6CDB0 is zero). So the correct mantissa in the result is 10F5643E3730A1.

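But if the computation is carried out with a 64-bit mantissa, 10F5643E3730A17FF62E39D6CDB0 is first rounded up to 10F5643E3730A1800. When that is rounded again to 53 bits, the discarded bits (800) are exactly half a unit in the last place, so round-to-nearest-even rounds the odd A1 up and the result becomes 10F5643E3730A2. The least significant digit has changed from 1 to 2.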
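The two rounding paths can be checked with a few lines of Python. This is only a minimal sketch, not part of the library under test: the helper name round_to_bits is made up for illustration, and the product is copied from the figures above rather than recomputed.

```python
def round_to_bits(value, keep_bits):
    """Round a positive integer to its top keep_bits bits, ties to even.

    Hypothetical helper. It returns only the kept bits and does not
    renormalise if the increment carries into a new top bit (that case
    does not arise for the value used below).
    """
    drop = value.bit_length() - keep_bits
    if drop <= 0:
        return value
    kept = value >> drop
    remainder = value & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if remainder > half or (remainder == half and kept & 1):
        kept += 1
    return kept


# Normalised mantissa product as quoted in the question (109 bits).
PRODUCT = 0x10F5643E3730A17FF62E39D6CDB0

direct = round_to_bits(PRODUCT, 53)           # one rounding, straight to double
extended = round_to_bits(PRODUCT, 64)         # 64-bit mantissa first
double_rounded = round_to_bits(extended, 53)  # then the store to a double

print(f"direct:         {direct:X}")
print(f"double rounded: {double_rounded:X}")
```

Run as written, the first line should end in A1 and the second in A2, matching the two mantissas discussed here.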

To sum up: my library returns the correct mantissa 10F5643E3730A1, but the Intel hardware returns (correctly) 10F5643E3730A2, because of its internal 64-bit mantissa.

The problem:

Now, here's what I don't understand: sometimes the Intel hardware returns 10F5643E3730A1 in the mantissa! I have two programs, a Windows console program and a Windows GUI program, both built by Qt using g++ 4.5.2. The console program returns 10F5643E3730A2, as expected, but the GUI program returns 10F5643E3730A1. They are using the same library function, which has the three instructions:

fldl   -0x18(%ebp)
fmull  -0x10(%ebp)
fstpl  0x4(%esp)

And these three instructions compute a different result in the two programs. (I have stepped through them both in the debugger.) It seems to me that this might be something that Qt does to configure the FPU in its GUI startup code, but I can't find any documentation about this. Does anybody have any idea what's happening here?

Comments (2)

空气里的味道 2024-12-18 05:09:27

The instruction stream of a function and its inputs do not uniquely determine its execution. You must also consider the environment that is already established in the processor at the time of its execution.

If you inspect the x87 control word, you will find that it is set in two different states, corresponding to your two observed behaviors. In one, the precision control [bits 9:8] has been set to 10b (53 bits). In the other, it is set to 11b (64 bits).
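As a rough aid for reading the control word once you have its raw value (for example from your debugger's FPU register view), the sketch below decodes the precision-control and rounding-control fields. The function name and the two sample values are illustrative only: 0x037F is the value that FINIT loads, and 0x027F is the same word with precision control set to 53 bits.

```python
# Field encodings of the x87 control word.
PRECISION = {0b00: "24 bits (single)", 0b01: "reserved",
             0b10: "53 bits (double)", 0b11: "64 bits (extended)"}
ROUNDING = {0b00: "nearest even", 0b01: "down",
            0b10: "up", 0b11: "toward zero"}


def decode_control_word(cw):
    """Return (precision, rounding) descriptions for a raw control word."""
    pc = (cw >> 8) & 0b11    # precision control, bits 9:8
    rc = (cw >> 10) & 0b11   # rounding control, bits 11:10
    return PRECISION[pc], ROUNDING[rc]


for cw in (0x037F, 0x027F):  # sample values, not read from real hardware
    precision, rounding = decode_control_word(cw)
    print(f"{cw:#06x}: precision = {precision}, rounding = {rounding}")
```

Comparing the decoded precision field in the console program and in the GUI program should show which of the two states each one is running in.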

As to exactly what is establishing the non-default state, it could be anything that happens in that thread prior to execution of your code. Any libraries that are pulled in are likely suspects. If you want to do some archaeology, the smoking gun is typically the fldcw instruction (though the control word can also be written by fldenv, frstor, and finit).

送君千里 2024-12-18 05:09:27

Normally it's a compiler setting. See, for example, the following page for Visual C++:
http://msdn.microsoft.com/en-us/library/aa289157%28v=vs.71%29.aspx

or this document from Intel:
http://cache-www.intel.com/cd/00/00/34/76/347605_347605.pdf

The Intel document in particular mentions some flags inside the processor that determine the behavior of the FPU instructions. This explains why the same code behaves differently in the two programs (one sets the flags differently from the other).
