So first I'll just describe the task. I need to:
- Compare two __m128i values.
- Somehow do a bitwise AND of the result with a certain uint16_t value (probably using _mm_movemask_epi8 first and then just &).
- Do a blend of the initial values based on the result of that.
So the problem, as you might have guessed, is that blend accepts an __m128i as a mask, and I will have a uint16_t. So either I need some sort of inverse instruction for _mm_movemask_epi8, or I need to do something else entirely.
Some points: I probably cannot change that uint16_t value to some other type (it's complicated); I'm doing this on SSE4.2, so no AVX; there's a similar question here, "How to perform the inverse of _mm256_movemask_epi8 (VPMOVMSKB)?", but it's about AVX, and I'm very inexperienced with this, so I cannot adapt the solution.
PS: I might need to do this for ARM as well, so I would appreciate any suggestions.
Comments (1)
When you do _mm_movemask_epi8 after a vector comparison, which produces -1 for true and 0 for false, you'll get a 16-bit integer (assuming SSE only) with the nth bit set when the nth byte in the vector equals -1. The following is the reverse (inverse?) operation.
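A minimal sketch of such an inverse, using only SSE2 intrinsics so it also builds on an SSE4.2 target. The function name inverse_movemask_epi8 is hypothetical, and this is one possible way to do it, not necessarily the code the answer had in mind. The idea is to broadcast the low mask byte into bytes 0..7 and the high mask byte into bytes 8..15, then test one distinct bit per byte.

```c
#include <emmintrin.h>  /* SSE2 */
#include <stdint.h>

/* Hypothetical helper: expands bit i of `mask` into byte i of the result,
   giving 0xFF where the bit is set and 0x00 where it is clear. */
static inline __m128i inverse_movemask_epi8(uint16_t mask)
{
    /* Bytes 0..7 hold the low 8 mask bits, bytes 8..15 the high 8 bits. */
    __m128i lo    = _mm_set1_epi8((char)(mask & 0xFF));
    __m128i hi    = _mm_set1_epi8((char)(mask >> 8));
    __m128i bytes = _mm_unpacklo_epi64(lo, hi);

    /* Byte i of each 64-bit half selects bit (i % 8): 0x01, 0x02, ..., 0x80. */
    const __m128i bit = _mm_set1_epi64x((long long)0x8040201008040201ULL);

    /* A byte becomes 0xFF exactly when its selected bit is set. */
    return _mm_cmpeq_epi8(_mm_and_si128(bytes, bit), bit);
}
```

Under these assumptions, _mm_movemask_epi8(inverse_movemask_epi8(m)) should give back m for any 16-bit m.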
Note that you might want to do a bitwise operation with the vector mask converted from an integer mask, without going back and forth between integer ops and vector ops.
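To illustrate that note, here is a hedged sketch of the whole task done in the vector domain: the uint16_t is expanded to a byte mask once, ANDed with the comparison result, and used for the select. The function name masked_select_gt is hypothetical, and the signed greater-than comparison is an assumption, since the question does not say which comparison is used.

```c
#include <emmintrin.h>  /* SSE2 */
#include <stdint.h>

/* Hypothetical helper: for each byte i, returns b[i] when a[i] > b[i]
   (signed compare, an assumption) AND bit i of `mask` is set;
   otherwise returns a[i]. */
static inline __m128i masked_select_gt(__m128i a, __m128i b, uint16_t mask)
{
    /* Expand the 16-bit integer mask into a 16-byte vector mask. */
    __m128i lo    = _mm_set1_epi8((char)(mask & 0xFF));
    __m128i hi    = _mm_set1_epi8((char)(mask >> 8));
    __m128i bytes = _mm_unpacklo_epi64(lo, hi);
    const __m128i bit = _mm_set1_epi64x((long long)0x8040201008040201ULL);
    __m128i vmask = _mm_cmpeq_epi8(_mm_and_si128(bytes, bit), bit);

    /* Combine with the comparison result without leaving vector registers. */
    __m128i sel = _mm_and_si128(_mm_cmpgt_epi8(a, b), vmask);

    /* SSE2 select. With SSE4.1, _mm_blendv_epi8(a, b, sel) gives the same
       result because every byte of `sel` is either 0x00 or 0xFF. */
    return _mm_or_si128(_mm_and_si128(sel, b), _mm_andnot_si128(sel, a));
}
```

The and/andnot/or select is used here only so the sketch needs nothing beyond SSE2; on SSE4.2 the blend intrinsic is the more direct choice. On ARM NEON, a comparable select is available as vbslq_u8, though the mask expansion would need to be written separately.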