Why does FLD1 load NaN?
I have a one-liner C function that is just return value * pow(1.+rate, -delay); it discounts a future value to a present value. The interesting part of the disassembly is:
    0x080555b9:  neg    %eax
    0x080555bb:  push   %eax
    0x080555bc:  fildl  (%esp)
    0x080555bf:  lea    0x4(%esp),%esp
    0x080555c3:  fldl   0xfffffff0(%ebp)
    0x080555c6:  fld1
    0x080555c8:  faddp  %st,%st(1)
    0x080555ca:  fxch   %st(1)
    0x080555cc:  fstpl  0x8(%esp)
    0x080555d0:  fstpl  (%esp)
    0x080555d3:  call   0x8051ce0
    0x080555d8:  fmull  0xfffffff8(%ebp)
While single-stepping through this function, gdb says (rate is 0.02, delay is 2; you can see them on the stack):
    (gdb) si
    0x080555c6  30    return value * pow(1.+rate, -delay);
    (gdb) info float
      R7: Valid   0x4004a6c28f5c28f5c000 +41.68999999999999773
      R6: Valid   0x4004e15c28f5c28f6000 +56.34000000000000341
      R5: Valid   0x4004dceb851eb851e800 +55.22999999999999687
      R4: Valid   0xc0008000000000000000 -2
    =>R3: Valid   0x3ff9a3d70a3d70a3d800 +0.02000000000000000042
      R2: Valid   0x4004ff147ae147ae1800 +63.77000000000000313
      R1: Valid   0x4004e17ae147ae147800 +56.36999999999999744
      R0: Valid   0x4004efb851eb851eb800 +59.92999999999999972

    Status Word:         0x1861   IE PE SF
                           TOP: 3
    Control Word:        0x037f   IM DM ZM OM UM PM
                           PC: Extended Precision (64-bits)
                           RC: Round to nearest
    Tag Word:            0x0000
    Instruction Pointer: 0x73:0x080555c3
    Operand Pointer:     0x7b:0xbff41d78
    Opcode:              0xdd45
And after the fld1:
    (gdb) si
    0x080555c8  30    return value * pow(1.+rate, -delay);
    (gdb) info float
      R7: Valid   0x4004a6c28f5c28f5c000 +41.68999999999999773
      R6: Valid   0x4004e15c28f5c28f6000 +56.34000000000000341
      R5: Valid   0x4004dceb851eb851e800 +55.22999999999999687
      R4: Valid   0xc0008000000000000000 -2
      R3: Valid   0x3ff9a3d70a3d70a3d800 +0.02000000000000000042
    =>R2: Special 0xffffc000000000000000 Real Indefinite (QNaN)
      R1: Valid   0x4004e17ae147ae147800 +56.36999999999999744
      R0: Valid   0x4004efb851eb851eb800 +59.92999999999999972

    Status Word:         0x1261   IE PE SF C1
                           TOP: 2
    Control Word:        0x037f   IM DM ZM OM UM PM
                           PC: Extended Precision (64-bits)
                           RC: Round to nearest
    Tag Word:            0x0020
    Instruction Pointer: 0x73:0x080555c6
    Operand Pointer:     0x7b:0xbff41d78
    Opcode:              0xd9e8
After this, everything goes to hell. Things get grossly over or undervalued, so even if there were no other bugs in my freeciv AI attempt, it would choose all the wrong strategies. Like sending the whole army to the arctic. (Sigh, if only I were getting that far.)
I must be missing something obvious, or getting blinded by something, because I can't believe that fld1 should ever possibly fail. Even less that it should fail only after a handful of passes through this function. On earlier passes the FPU correctly loads 1 into ST(0). The bytes at 0x080555c6 definitely encode fld1 (checked with x/... on the running process).
What gives?
Comments (2)
Remarkably appropriate. What you have here is a stack overflow.

Specifically, you (or possibly your compiler) have overflowed the x87 stack. It can only hold 8 values, and at the time the fld1 is issued, it is already full (indicated by the tag word of 0000). Thus, the fld1 overflows the stack (indicated by IE, SF, C1), which causes the result that you're seeing.

As to why this is happening, you may have used MMX instructions without using an EMMS before using the x87 instructions, or your compiler has a bug, or you have assembly code somewhere that violates your platform's ABI (or a library that you are using violates the ABI).

It looks like you have an FPU stack overflow. The FPU tag word is 0, which means that all registers are used. You can also see all registers marked as "valid", when I would expect some to be empty.

I don't know why this would happen though. Maybe you have some MMX code which doesn't issue the EMMS instruction? Or maybe some inline assembly which doesn't clear the stack properly?
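
To make the reading of those gdb dumps concrete, here is a minimal sketch in C that decodes the two 16-bit values the answers rely on: the status word and the tag word. The function names (x87_top, x87_tag, x87_free_slots, x87_stack_overflowed) are made up for illustration and are not part of gdb or any library. The bit positions follow the usual x87 layout (IE in bit 0, SF in bit 6, C1 in bit 9, TOP in bits 11 to 13, two tag bits per physical register).

```c
#include <stdint.h>

/* Status-word bits relevant to a stack fault. */
#define X87_SW_IE (1u << 0) /* invalid operation */
#define X87_SW_SF (1u << 6) /* stack fault */
#define X87_SW_C1 (1u << 9) /* with IE and SF set: 1 = overflow, 0 = underflow */

/* TOP: index of the physical register currently acting as ST(0). */
unsigned x87_top(uint16_t status_word)
{
    return (status_word >> 11) & 7u;
}

/* Two-bit tag of physical register Rn:
 * 0 = valid, 1 = zero, 2 = special (NaN, infinity, denormal), 3 = empty. */
unsigned x87_tag(uint16_t tag_word, unsigned reg)
{
    return (tag_word >> (2u * reg)) & 3u;
}

/* Number of registers tagged empty, i.e. how many more loads fit. */
unsigned x87_free_slots(uint16_t tag_word)
{
    unsigned free_slots = 0;
    for (unsigned reg = 0; reg < 8; reg++) {
        if (x87_tag(tag_word, reg) == 3u)
            free_slots++;
    }
    return free_slots;
}

/* Nonzero when the status word reports a stack overflow (IE, SF and C1 all set). */
int x87_stack_overflowed(uint16_t status_word)
{
    const uint16_t mask = X87_SW_IE | X87_SW_SF | X87_SW_C1;
    return (status_word & mask) == mask;
}
```

Fed the values from the question, x87_free_slots(0x0000) reports no empty register before the fld1, x87_stack_overflowed(0x1261) is true afterwards, and x87_tag(0x0020, 2) yields 2 (special), matching the QNaN that gdb shows in R2. IE and SF are sticky flags, so seeing them already set in the first dump (0x1861) suggests an earlier stack fault had gone unnoticed.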