x86 assembler: floating point compare

Posted 2024-11-29 15:29:18

As part of a compiler project I have to write GNU assembler code for x86 to compare floating point values. I have tried to find resources on how to do this online and from what I understand it works like this:

Assuming the two values I want to compare are the only values on the floating point stack, then the fcomi instruction will compare the values and set the CPU-flags so that the je, jne, jl, ... instructions can be used.

I'm asking because this only works sometimes. For example:

.section    .data
msg:    .ascii "Hallo\n\0"
f1:     .float 10.0
f2:     .float 9.0

.globl main
    .type   main, @function
main:
    flds f1
    flds f2
    fcomi
    jg leb
    pushl $msg
    call printf
    addl $4, %esp
leb:
    pushl $0
    call exit

This will not print "Hallo" even though I think it should, and if you switch f1 and f2 it still won't, which is a logical contradiction. je and jne, however, seem to work fine.

What am I doing wrong?

PS: Does fcomip pop only one value, or does it pop both?

泅人 2024-12-06 15:29:18

TL;DR: Use the above / below conditions (as for unsigned integers) to test the result of FP compares.

For historical reasons, FP compares set CF rather than OF / SF: fcomi (new in PPro) produces the same flag layout that the older fcom / fstsw / sahf sequence gives when it copies the FP status word into FLAGS. See also http://www.ray.masmcode.com/tutorial/fpuchap7.htm

The SSE/SSE2 scalar compares that write FLAGS, [u]comiss / [u]comisd, follow the same convention. (SIMD compares are different: they take the predicate as an immediate operand of the instruction, because they produce an all-zeros / all-ones result for each element rather than a set of FLAGS.)

This is all coming from Volume 2 of Intel 64 and IA-32 Architectures Software Developer's Manuals.

FCOMI sets only some of the flags that CMP does. Your code has %st(0) == 9 and %st(1) == 10, because the values are loaded onto a stack and the last one loaded ends up on top. Referring to the table on page 3-348 in Volume 2A, this is the case "ST0 < ST(i)", so it clears ZF and PF and sets CF. Meanwhile, page 3-544 of Volume 2A says that JG means "Jump short if greater (ZF=0 and SF=OF)". In other words, JG tests the sign, overflow and zero flags, but FCOMI doesn't report its result in the sign or overflow flags!

Depending on the condition you want to jump on, look at the possible comparison results and decide which flags to test.

+--------------------+---+---+---+
| Comparison results | Z | P | C |
+--------------------+---+---+---+
| ST0 > ST(i)        | 0 | 0 | 0 |
| ST0 < ST(i)        | 0 | 0 | 1 |
| ST0 = ST(i)        | 1 | 0 | 0 |
| unordered          | 1 | 1 | 1 |  one or both operands were NaN.
+--------------------+---+---+---+
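The table can be turned into a small C model to see why jg misbehaves. This is only a sketch: the names fp_flags, fcomi_flags, jg_taken and ja_taken are made up for illustration, and it assumes SF and OF are equal after the compare, which is consistent with what the question observed. Under that assumption jg depends on ZF alone, so it is taken for both orderings of 9.0 and 10.0.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Illustrative model of the three flags fcomi writes (hypothetical names). */
struct fp_flags { bool zf, pf, cf; };

/* Flags after comparing ST0 with ST(i), following the table above. */
struct fp_flags fcomi_flags(double st0, double sti)
{
    struct fp_flags f = { false, false, false };   /* ST0 > ST(i) */
    if (isnan(st0) || isnan(sti)) {
        f.zf = f.pf = f.cf = true;                 /* unordered */
    } else if (st0 < sti) {
        f.cf = true;
    } else if (st0 == sti) {
        f.zf = true;
    }
    return f;
}

/* jg is taken when ZF=0 and SF=OF.
 * ASSUMPTION: SF == OF after the compare, so only ZF matters here. */
bool jg_taken(struct fp_flags f) { return !f.zf; }

/* ja is taken when CF=0 and ZF=0. */
bool ja_taken(struct fp_flags f) { return !f.cf && !f.zf; }

/* The two orderings from the question. */
void demo_question_case(void)
{
    /* jg jumps over the printf in both orderings */
    assert(jg_taken(fcomi_flags(9.0, 10.0)));
    assert(jg_taken(fcomi_flags(10.0, 9.0)));
    /* ja distinguishes them */
    assert(!ja_taken(fcomi_flags(9.0, 10.0)));
    assert(ja_taken(fcomi_flags(10.0, 9.0)));
}
```

Calling demo_question_case() from a test program checks both orderings; je and jne keep working in this model because they only look at ZF.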

I've made this small table to make it easier to figure out:

+--------------+---+---+-----+------------------------------------+
| Test         | Z | C | Jcc | Notes                              |
+--------------+---+---+-----+------------------------------------+
| ST0 < ST(i)  | X | 1 | JB  | ZF and CF both set only if NaN     |
| ST0 <= ST(i) | 1 | 1 | JBE | Either ZF or CF is ok              |
| ST0 == ST(i) | 1 | X | JE  | CF will never be set in this case  |
| ST0 != ST(i) | 0 | X | JNE |                                    |
| ST0 >= ST(i) | X | 0 | JAE | As long as CF is clear we are good |
| ST0 > ST(i)  | 0 | 0 | JA  | Both CF and ZF must be clear       |
+--------------+---+---+-----+------------------------------------+
Legend: X: don't care, 0: clear, 1: set

In other words, the condition codes match those used for unsigned comparisons. The same goes if you're using FCMOVcc.
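As a rough illustration of that correspondence, the sketch below writes the six conditions from the table as tests on CF and ZF and checks that they give the same answers for an unsigned integer compare and for an FP compare of ordered values. The struct and function names are hypothetical, and NaN operands are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Only CF and ZF matter for the above / below conditions. */
struct cz { bool cf, zf; };

/* CF/ZF after an unsigned integer compare that computes a - b. */
struct cz cmp_unsigned(unsigned a, unsigned b)
{
    struct cz f = { a < b, a == b };
    return f;
}

/* CF/ZF after fcomi with ST0 = a and ST(i) = b.
 * ASSUMPTION: both operands are ordered (not NaN). */
struct cz fcomi_ordered(double a, double b)
{
    struct cz f = { a < b, a == b };
    return f;
}

/* The six conditions from the table, written as flag tests. */
bool jb (struct cz f) { return f.cf; }
bool jbe(struct cz f) { return f.cf || f.zf; }
bool je (struct cz f) { return f.zf; }
bool jne(struct cz f) { return !f.zf; }
bool jae(struct cz f) { return !f.cf; }
bool ja (struct cz f) { return !f.cf && !f.zf; }

/* Every condition agrees between the unsigned and the FP compare. */
void demo_unsigned_match(void)
{
    for (unsigned a = 0; a < 4; a++) {
        for (unsigned b = 0; b < 4; b++) {
            struct cz u = cmp_unsigned(a, b);
            struct cz f = fcomi_ordered((double)a, (double)b);
            assert(jb(u) == jb(f) && jbe(u) == jbe(f) && je(u) == je(f));
            assert(jne(u) == jne(f) && jae(u) == jae(f) && ja(u) == ja(f));
        }
    }
    /* Negative FP values still use "below", not "less". */
    assert(jb(fcomi_ordered(-1.0, 2.0)));
}
```

For a compiler back end this means a signed-looking source comparison such as a < b on floats is emitted with jb / ja style conditions, never jl / jg.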

If either (or both) operand to fcomi is NaN, it sets ZF=1 PF=1 CF=1. (FP compares have 4 possible results: >, <, ==, or unordered.) If you care what your code does with NaNs, you may need an extra jp or jnp. But not always: for example, ja is only taken if CF=0 and ZF=0, so it is not taken in the unordered case. If you want the unordered case to take the same execution path as below-or-equal, then ja is all you need.
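The unordered case can be sketched the same way. The model below is illustrative only (all names are hypothetical): it shows that ja already rejects NaN on its own, while jb and je are taken on NaN and need a jp check first if NaN must not count as "below" or "equal".

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Illustrative model of the flags fcomi writes (hypothetical names). */
struct fp_flags { bool zf, pf, cf; };

struct fp_flags fcomi_flags(double st0, double sti)
{
    struct fp_flags f = { false, false, false };   /* ST0 > ST(i) */
    if (isnan(st0) || isnan(sti))
        f.zf = f.pf = f.cf = true;                 /* unordered */
    else if (st0 < sti)
        f.cf = true;
    else if (st0 == sti)
        f.zf = true;
    return f;
}

bool ja_taken(struct fp_flags f) { return !f.cf && !f.zf; }
bool jb_taken(struct fp_flags f) { return f.cf; }
bool je_taken(struct fp_flags f) { return f.zf; }
bool jp_taken(struct fp_flags f) { return f.pf; }

/* "ST0 < ST(i)" that is false for NaN: rule out jp first, then test jb. */
bool below_ordered(struct fp_flags f) { return !jp_taken(f) && jb_taken(f); }

void demo_unordered(void)
{
    struct fp_flags f = fcomi_flags(NAN, 1.0);
    assert(!ja_taken(f));                 /* ja alone already excludes NaN */
    assert(jb_taken(f) && je_taken(f));   /* jb and je are taken on NaN */
    assert(jp_taken(f));                  /* PF marks the unordered result */
    assert(!below_ordered(f));
    assert(below_ordered(fcomi_flags(1.0, 2.0)));
}
```

One consequence for code generation: when a "less than" test must be false for NaN, swapping the operands and using ja avoids the extra jp.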

Here you should use JA if you want it to print (i.e. if (!(f2 > f1)) { puts("hello"); }) and JBE if you don't (which corresponds to if (!(f2 <= f1)) { puts("hello"); }). (This might be a little confusing, because we only print if we don't jump.)

Regarding your second question: fcomi doesn't pop anything. You want its close cousin fcomip, which pops %st(0) only, so one value is still left on the stack afterwards. You should always clear the FPU register stack after use, so, assuming you want the message printed, your program ends up like this:

.section    .rodata
msg:    .ascii "Hallo\n\0"
f1:     .float 10.0
f2:     .float 9.0 

.text                  # code belongs in an executable section, not .rodata
.globl main
    .type   main, @function
main:
    flds   f1
    flds   f2
    fcomip
    fstp   %st(0) # to clear stack
    ja     leb # won't jump, jbe will
    pushl  $msg
    call   printf
    addl   $4, %esp
leb:
    pushl  $0
    call   exit
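To make the popping behaviour concrete, here is a toy C model of the register stack. It is a sketch under simplifying assumptions: the real x87 stack is a ring of eight registers with a top-of-stack pointer, NaN operands are ignored, and the struct and function names are invented for this example.

```c
#include <assert.h>

/* Toy model of the x87 register stack; slot[depth - 1] plays the role of %st(0). */
struct x87 { double slot[8]; int depth; };

/* flds: push a value. */
void fld(struct x87 *s, double v) { s->slot[s->depth++] = v; }

/* fstp %st(0): pop the top value and discard it. */
void fstp_st0(struct x87 *s) { s->depth--; }

/* Compare %st(0) with %st(1); returns -1, 0 or 1. Leaves the stack alone.
 * ASSUMPTION: at least two ordered values are on the stack. */
int fcomi(const struct x87 *s)
{
    double st0 = s->slot[s->depth - 1];
    double st1 = s->slot[s->depth - 2];
    return (st0 > st1) - (st0 < st1);
}

/* Same compare, then pop %st(0) only. */
int fcomip(struct x87 *s)
{
    int r = fcomi(s);
    s->depth--;
    return r;
}

void demo_stack_depth(void)
{
    struct x87 s = { {0}, 0 };
    fld(&s, 10.0);              /* flds f1 */
    fld(&s, 9.0);               /* flds f2: %st(0) = 9, %st(1) = 10 */

    int r = fcomi(&s);
    assert(r == -1);            /* ST0 < ST(1) */
    assert(s.depth == 2);       /* fcomi pops nothing */

    r = fcomip(&s);
    assert(r == -1);
    assert(s.depth == 1);       /* fcomip pops one value */

    fstp_st0(&s);
    assert(s.depth == 0);       /* the stack is empty again */
}
```

This mirrors the fcomip / fstp %st(0) pair in the listing above: one pop comes from the compare, the other from the explicit store-and-pop.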
