What do ordered / unordered comparisons mean?

Posted on 2024-12-22 22:09:50

Looking at the SSE operators

CMPORDPS - ordered compare packed singles
CMPUNORDPS - unordered compare packed singles

What do ordered and unordered mean? I looked for equivalent instructions in the x86 instruction set, and it only seems to have unordered (FUCOM).

Comments (5)

陪我终i 2024-12-29 22:09:50

An ordered comparison checks if neither operand is NaN. Conversely, an unordered comparison checks if either operand is a NaN.

This page gives some more information on this:

The idea here is that comparisons with NaN are indeterminate (the result can't be decided), so an ordered/unordered comparison checks whether this is (or isn't) the case.

#include <emmintrin.h>   // SSE2 intrinsics
#include <iostream>
using std::cout;
using std::endl;

int main() {
    double a = 0.;
    double b = 0.;

    __m128d x = _mm_set1_pd(a / b);     //  NaN (0/0)
    __m128d y = _mm_set1_pd(1.0);       //  1.0
    __m128d z = _mm_set1_pd(1.0);       //  1.0

    __m128d c0 = _mm_cmpord_pd(x,y);    //  NaN vs. 1.0
    __m128d c1 = _mm_cmpunord_pd(x,y);  //  NaN vs. 1.0
    __m128d c2 = _mm_cmpord_pd(y,z);    //  1.0 vs. 1.0
    __m128d c3 = _mm_cmpunord_pd(y,z);  //  1.0 vs. 1.0
    __m128d c4 = _mm_cmpord_pd(x,x);    //  NaN vs. NaN
    __m128d c5 = _mm_cmpunord_pd(x,x);  //  NaN vs. NaN

    // Print the low 64-bit lane of each mask: -1 (all bits set) is true, 0 is false.
    // _mm_cvtsi128_si64 (x86-64 only) replaces the MSVC-specific .m128i_i64[0] member.
    cout << _mm_cvtsi128_si64(_mm_castpd_si128(c0)) << endl;
    cout << _mm_cvtsi128_si64(_mm_castpd_si128(c1)) << endl;
    cout << _mm_cvtsi128_si64(_mm_castpd_si128(c2)) << endl;
    cout << _mm_cvtsi128_si64(_mm_castpd_si128(c3)) << endl;
    cout << _mm_cvtsi128_si64(_mm_castpd_si128(c4)) << endl;
    cout << _mm_cvtsi128_si64(_mm_castpd_si128(c5)) << endl;
}

Result:

0
-1
-1
0
0
-1

An ordered comparison returns true if the operands are comparable (neither number is NaN):

  • Ordered comparison of 1.0 and 1.0 gives true.
  • Ordered comparison of NaN and 1.0 gives false.
  • Ordered comparison of NaN and NaN gives false.

Unordered comparison is the exact opposite:

  • Unordered comparison of 1.0 and 1.0 gives false.
  • Unordered comparison of NaN and 1.0 gives true.
  • Unordered comparison of NaN and NaN gives true.
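
For a scalar view of the same two predicates, the C++ standard library's std::isunordered behaves like the unordered test, and its negation like the ordered test. The sketch below is only an illustration; the helper names is_unordered, is_ordered and check_scalar_predicates are made up here and are not intrinsics.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Illustrative helper names (not intrinsics): scalar versions of the two predicates.
inline bool is_unordered(double a, double b) { return std::isunordered(a, b); }
inline bool is_ordered(double a, double b)   { return !std::isunordered(a, b); }

// Re-checks the three operand pairs used in the SSE2 listing above.
inline void check_scalar_predicates() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(!is_ordered(nan, 1.0) &&  is_unordered(nan, 1.0));
    assert( is_ordered(1.0, 1.0) && !is_unordered(1.0, 1.0));
    assert(!is_ordered(nan, nan) &&  is_unordered(nan, nan));
}
```

Calling check_scalar_predicates() passes silently when the scalar helpers agree with the bullet lists above.
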
随波逐流 2024-12-29 22:09:50

This Intel guide: http://intel80386.com/simd/mmx2-doc.html contains examples of the two which are fairly straightforward:

CMPORDPS Compare Ordered Parallel Scalars

Opcode         Cycles   Instruction
0F C2 .. 07    2 (3)    CMPORDPS xmm reg, xmm reg/mem128

CMPORDPS op1, op2

op1 contains 4 single precision 32-bit floating point values
op2 contains 4 single precision 32-bit floating point values

op1[0] = (op1[0] != NaN) && (op2[0] != NaN)
op1[1] = (op1[1] != NaN) && (op2[1] != NaN)
op1[2] = (op1[2] != NaN) && (op2[2] != NaN)
op1[3] = (op1[3] != NaN) && (op2[3] != NaN)

TRUE  = 0xFFFFFFFF
FALSE = 0x00000000

CMPUNORDPS Compare Unordered Parallel Scalars

Opcode         Cycles   Instruction
0F C2 .. 03    2 (3)    CMPUNORDPS xmm reg, xmm reg/mem128

CMPUNORDPS op1, op2

op1 contains 4 single precision 32-bit floating point values
op2 contains 4 single precision 32-bit floating point values

op1[0] = (op1[0] == NaN) || (op2[0] == NaN)
op1[1] = (op1[1] == NaN) || (op2[1] == NaN)
op1[2] = (op1[2] == NaN) || (op2[2] == NaN)
op1[3] = (op1[3] == NaN) || (op2[3] == NaN)

TRUE  = 0xFFFFFFFF
FALSE = 0x00000000

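The difference is AND of the "not NaN" tests (ordered) vs. OR of the "is NaN" tests (unordered), so in every lane one result is the bitwise complement of the other.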
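
The pseudocode quoted from the guide writes the test as a comparison against NaN, which would not work literally in C++ because NaN never compares equal to anything. A minimal scalar model of the two instructions, assuming std::isnan as the NaN test, might look like the following; Lanes, Mask, cmpord_model, cmpunord_model and check_lane_models are hypothetical names used only for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

using Lanes = std::array<float, 4>;
using Mask  = std::array<std::uint32_t, 4>;

// Scalar model of CMPORDPS: a lane is all-ones when neither input is NaN.
inline Mask cmpord_model(const Lanes& a, const Lanes& b) {
    Mask r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = (!std::isnan(a[i]) && !std::isnan(b[i])) ? 0xFFFFFFFFu : 0u;
    return r;
}

// Scalar model of CMPUNORDPS: a lane is all-ones when either input is NaN.
inline Mask cmpunord_model(const Lanes& a, const Lanes& b) {
    Mask r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = (std::isnan(a[i]) || std::isnan(b[i])) ? 0xFFFFFFFFu : 0u;
    return r;
}

// Covers all four combinations: (num, num), (NaN, num), (num, NaN), (NaN, NaN).
inline void check_lane_models() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    const Lanes a{{1.0f, nan, 3.0f, nan}};
    const Lanes b{{1.0f, 2.0f, nan, nan}};
    const Mask ord = cmpord_model(a, b);
    const Mask unord = cmpunord_model(a, b);
    assert(ord[0] == 0xFFFFFFFFu && ord[1] == 0u && ord[2] == 0u && ord[3] == 0u);
    for (std::size_t i = 0; i < 4; ++i)
        assert(unord[i] == static_cast<std::uint32_t>(~ord[i]));
}
```

Only the lane where both inputs are numbers comes out as TRUE for the ordered model; the unordered model is its complement in every lane.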

晨敛清荷 2024-12-29 22:09:50

Short version: "unordered" is a relation two FP values can have. Scalar compares set FLAGS so you can check any condition you want (e.g. ucomisd xmm0, xmm1 / jp unordered), but SIMD compares need the condition (predicate) encoded into the instruction, so it can be checked in parallel to produce a vector with element values of 0 / 0xFF.... There is nowhere to put a separate FLAGS result for each element.

The "Unordered" in FUCOM means it doesn't raise an FP "invalid" exception when the comparison result is unordered, while FCOM does. This is the same as the distinction between OQ and OS cmpps predicates, not the "unordered" predicate. (See the "Signals #IA on QNAN" column in the cmppd docs in Intel's asm manuals. cmppd is alphabetically first and has the more complete docs, vs. cmpps / cmpss/sd.)

(FP exceptions are masked by default so they don't cause the CPU to trap to a hardware exception handler, just set sticky flags in MXCSR, or the legacy x87 status word for x87 instructions.)


ORD and UNORD are two choices of predicate for the cmppd / cmpps / cmpss / cmpsd insns (full tables in the cmppd entry which is alphabetically first). That html extract has readable table formatting, but Intel's official PDF original is somewhat better. (See the tag wiki for links).

Two floating point operands are ordered with respect to each other if neither is NaN. They're unordered if either is NaN. i.e. ordered = (x>y) | (x==y) | (x<y);. That's right, with floating point it's possible for none of those things to be true. For more Floating Point madness, see Bruce Dawson's excellent series of articles.
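
The expression above can be tried directly in C++. This is just a sketch of the idea; ordered_by_relations and check_trichotomy are names invented for the illustration, and the logical || is used in place of | with the same result.

```cpp
#include <cassert>
#include <limits>

// "Ordered" expressed with the three ordinary relational operators.
inline bool ordered_by_relations(double x, double y) {
    return (x > y) || (x == y) || (x < y);
}

inline void check_trichotomy() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(ordered_by_relations(1.0, 2.0));   // x < y holds
    assert(ordered_by_relations(2.0, 2.0));   // x == y holds
    assert(!ordered_by_relations(nan, 2.0));  // none of the three holds
    assert(!ordered_by_relations(nan, nan));  // NaN is not even equal to itself
}
```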

cmpps takes a predicate and produces a vector of results, instead of doing a comparison between two scalars and setting flags so you can check any predicate you want after the fact. So it needs specific predicates for everything you can check.


The scalar equivalent is comiss / ucomiss to set ZF/PF/CF from the FP comparison result (which works like the x87 compare instructions (see the last section of this answer), but on the low element of XMM regs).

To check for unordered, look at PF. If the comparison is ordered, you can look at the other flags to see whether the operands were greater, equal, or less (using the same conditions as for unsigned integers, like jae for Above or Equal).
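
As a rough illustration of the flag settings just described, here is a small C++ model of what comiss / ucomiss write to ZF/PF/CF (unordered: all three set; greater: all clear; less: CF only; equal: ZF only). The Flags struct and the functions ucomiss_model, jp_taken, jae_taken and check_flags_model are hypothetical names for this sketch, not real APIs.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model of the three EFLAGS bits written by comiss/ucomiss.
struct Flags { bool zf, pf, cf; };

inline Flags ucomiss_model(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) return {true, true, true};  // unordered
    if (a > b) return {false, false, false};                        // greater
    if (a < b) return {false, false, true};                         // less
    return {true, false, false};                                    // equal
}

// Branch conditions named in the text, evaluated on the model.
inline bool jp_taken(const Flags& f)  { return f.pf; }   // unordered
inline bool jae_taken(const Flags& f) { return !f.cf; }  // above or equal

inline void check_flags_model() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(jp_taken(ucomiss_model(nan, 1.0f)));
    assert(!jp_taken(ucomiss_model(1.0f, 2.0f)));
    assert(jae_taken(ucomiss_model(2.0f, 1.0f)));
    assert(jae_taken(ucomiss_model(1.0f, 1.0f)));
    assert(!jae_taken(ucomiss_model(1.0f, 2.0f)));
    assert(!jae_taken(ucomiss_model(nan, 1.0f)));  // CF is set when unordered
}
```

Because CF is set in the unordered case, the "above" / "above or equal" conditions come out false for NaN inputs in this model without a separate PF check.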


The COMISS instruction differs from the UCOMISS instruction in that it signals a SIMD floating-point invalid operation exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISS instruction signals an invalid numeric exception only if a source operand is an SNaN.

(SNaN is not naturally occurring; operations like sqrt(-1) or inf - inf will produce QNaN if exceptions are masked, else trap and not produce a result.)

Normally FP exceptions are masked, so this doesn't actually interrupt your program; it just sets the bit in the MXCSR which you can check later.

This is the same as O/UQ vs. O/US flavours of predicate for cmpps / vcmpps. The AVX version of the cmp[ps][sd] instructions have an expanded choice of predicate, so they needed a naming convention to keep track of them.

The O vs. U tells you whether the predicate is true when the operands are unordered.

The Q vs. S tells you whether #I will be raised if either operand is a Quiet NaN. #I will always be raised if either operand is a Signalling NaN, but those are not "naturally occurring". You don't get them as outputs from other operations, only by creating the bit pattern yourself (e.g. as an error-return value from a function, to ensure detection of problems later).


The x87 equivalent is using fcom or fucom to set the FPU status word -> fstsw ax -> sahf, or preferably fucomi to set EFLAGS directly like ucomiss.

The U / non-U distinction is the same for x87 instructions as for comiss / ucomiss.

枕花眠 2024-12-29 22:09:50

You may understand the meaning of 'ordered CC' and 'unordered CC' through llvm CC definition, where 'CC' means CondCode.
In 'llvm/include/llvm/CodeGen/ISDOpcodes.h' (my source code version is llvm-10.0.1), you could see the enum of CondCode as below:

enum CondCode {
// Opcode          N U L G E       Intuitive operation
SETFALSE,      //    0 0 0 0       Always false (always folded)
SETOEQ,        //    0 0 0 1       True if ordered and equal
SETOGT,        //    0 0 1 0       True if ordered and greater than
SETOGE,        //    0 0 1 1       True if ordered and greater than or equal
SETOLT,        //    0 1 0 0       True if ordered and less than
SETOLE,        //    0 1 0 1       True if ordered and less than or equal
SETONE,        //    0 1 1 0       True if ordered and operands are unequal
SETO,          //    0 1 1 1       True if ordered (no nans)
SETUO,         //    1 0 0 0       True if unordered: isnan(X) | isnan(Y)
SETUEQ,        //    1 0 0 1       True if unordered or equal
SETUGT,        //    1 0 1 0       True if unordered or greater than
SETUGE,        //    1 0 1 1       True if unordered, greater than, or equal
SETULT,        //    1 1 0 0       True if unordered or less than
SETULE,        //    1 1 0 1       True if unordered, less than, or equal
SETUNE,        //    1 1 1 0       True if unordered or not equal
SETTRUE,       //    1 1 1 1       Always true (always folded)
// Don't care operations: undefined if the input is a nan.
SETFALSE2,     //  1 X 0 0 0       Always false (always folded)
SETEQ,         //  1 X 0 0 1       True if equal
SETGT,         //  1 X 0 1 0       True if greater than
SETGE,         //  1 X 0 1 1       True if greater than or equal
SETLT,         //  1 X 1 0 0       True if less than
SETLE,         //  1 X 1 0 1       True if less than or equal
SETNE,         //  1 X 1 1 0       True if not equal
SETTRUE2,      //  1 X 1 1 1       Always true (always folded)
SETCC_INVALID       // Marker value.
};

That means: for floating-point condition comparison, 'ordered CC' means 'ordered & CC', while 'unordered CC' means 'unordered | CC'.

In other words, in floating-point comparison, where NaN is 'Not A Number',

  • 'ordered CC' returns true if: 'both operands are not NaN' AND 'CC is true'
  • 'unordered CC' returns true if: 'one or more operands are NaN' OR 'CC is true'

You can also see that 'ordered CC' is exactly the opposite of 'unordered !CC'.
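
The two rules can be written out as plain C++ to check that relationship for one condition. This is a sketch only: unord_rel, set_olt, set_ult, set_uge and check_cond_codes are illustrative names modelled on SETOLT / SETULT / SETUGE, not LLVM functions.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True if the operands are unordered: isnan(X) | isnan(Y).
inline bool unord_rel(double x, double y) { return std::isnan(x) || std::isnan(y); }

// 'ordered CC' = ordered AND CC; 'unordered CC' = unordered OR CC.
inline bool set_olt(double x, double y) { return !unord_rel(x, y) && (x < y); }
inline bool set_ult(double x, double y) { return  unord_rel(x, y) || (x < y); }
inline bool set_uge(double x, double y) { return  unord_rel(x, y) || (x >= y); }

inline void check_cond_codes() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double vals[] = {nan, 1.0, 2.0};
    // 'ordered <' is the opposite of 'unordered >=' for every operand pair.
    for (double x : vals)
        for (double y : vals)
            assert(set_olt(x, y) == !set_uge(x, y));
    // With a NaN operand the ordered form is false and the unordered form is true.
    assert(!set_olt(nan, 1.0) && set_ult(nan, 1.0));
}
```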

从﹋此江山别 2024-12-29 22:09:50

Perhaps this page on Visual C++ intrinsics can be of help? :)

CMPORDPS

r0 := (a0 ord? b0) ? 0xffffffff : 0x0
r1 := (a1 ord? b1) ? 0xffffffff : 0x0
r2 := (a2 ord? b2) ? 0xffffffff : 0x0
r3 := (a3 ord? b3) ? 0xffffffff : 0x0

CMPUNORDPS

r0 := (a0 unord? b0) ? 0xffffffff : 0x0
r1 := (a1 unord? b1) ? 0xffffffff : 0x0
r2 := (a2 unord? b2) ? 0xffffffff : 0x0
r3 := (a3 unord? b3) ? 0xffffffff : 0x0
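
One common way to use the packed form, sketched here under the assumption of an x86 target with SSE, is to compare a vector with itself: a lane is unordered with itself only when it holds NaN. The helper names nan_lanes and check_nan_lanes are made up for this example; _mm_cmpunord_ps, _mm_movemask_ps and _mm_setr_ps are the standard SSE intrinsics.

```cpp
#include <cassert>
#include <limits>
#include <xmmintrin.h>  // SSE intrinsics (x86 / x86-64 only)

// Returns a 4-bit mask with bit i set when lane i of v is NaN.
inline int nan_lanes(__m128 v) {
    return _mm_movemask_ps(_mm_cmpunord_ps(v, v));
}

inline void check_nan_lanes() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    // _mm_setr_ps takes lanes in order 0..3, so lanes 1 and 3 hold NaN here.
    assert(nan_lanes(_mm_setr_ps(1.0f, nan, 3.0f, nan)) == 0xA);
    assert(nan_lanes(_mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f)) == 0);
}
```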