Print a __m128i variable
I'm trying to learn to code with intrinsics. Below is a program that does an addition.
Compiler used: icc
#include <stdio.h>
#include <emmintrin.h>
int main()
{
    __m128i a = _mm_set_epi32(1,2,3,4);
    __m128i b = _mm_set_epi32(1,2,3,4);
    __m128i c;
    c = _mm_add_epi32(a,b);
    printf("%d\n",c[2]);
    return 0;
}
I get the following error:
test.c(9): error: expression must have pointer-to-object type
printf("%d\n",c[2]);
How do I print the values in the variable c, which is of type __m128i?
Use a function that splits the 128 bits into 16-bit (or 32-bit) elements before printing them.
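A minimal sketch of such a split, assuming the bytes are copied out with memcpy into a small array. The helper names format128_u16 and format128_i32 are made up for this example, and they write into a caller-supplied buffer so the result can be printed or checked.

```c
#include <emmintrin.h>  /* SSE2: __m128i */
#include <stdint.h>
#include <stdio.h>
#include <string.h>     /* memcpy */

/* Hypothetical helper: writes the eight 16-bit elements of v into buf,
   lowest element first. Returns what snprintf returns. */
static int format128_u16(char *buf, size_t size, __m128i v)
{
    uint16_t e[8];
    memcpy(e, &v, sizeof e);  /* copy the 16 bytes out instead of casting &v */
    return snprintf(buf, size, "%u %u %u %u %u %u %u %u",
                    (unsigned)e[0], (unsigned)e[1], (unsigned)e[2], (unsigned)e[3],
                    (unsigned)e[4], (unsigned)e[5], (unsigned)e[6], (unsigned)e[7]);
}

/* Hypothetical helper: the same idea with four signed 32-bit elements. */
static int format128_i32(char *buf, size_t size, __m128i v)
{
    int32_t e[4];
    memcpy(e, &v, sizeof e);
    return snprintf(buf, size, "%d %d %d %d",
                    (int)e[0], (int)e[1], (int)e[2], (int)e[3]);
}
```

With the question's c, calling format128_i32(buf, sizeof buf, c) and then puts(buf) prints 8 6 4 2, because _mm_set_epi32 takes its arguments highest element first.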
If you have 64-bit support available, you can also split the value into two 64-bit halves and print those.
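The 64-bit variant could look roughly like this. It is only a sketch: format128_hex64 is a hypothetical name, and printing the high half first is a choice made here so the two halves read like one 128-bit hex number.

```c
#include <emmintrin.h>  /* SSE2: __m128i */
#include <inttypes.h>   /* PRIx64 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>     /* memcpy */

/* Hypothetical helper: writes v into buf as two zero-padded 64-bit hex
   values, high half first. Returns what snprintf returns. */
static int format128_hex64(char *buf, size_t size, __m128i v)
{
    uint64_t half[2];
    memcpy(half, &v, sizeof half);  /* half[0] is the low 64 bits */
    return snprintf(buf, size, "%016" PRIx64 " %016" PRIx64, half[1], half[0]);
}
```

PRIx64 from inttypes.h avoids guessing whether a 64-bit integer is long or long long on the target.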
Note: casting &var directly to an int* or uint16_t* would also work with MSVC, but it violates strict aliasing and is undefined behaviour. Using memcpy is the standard-compliant way to do the same thing, and with minimal optimization the compiler generates exactly the same binary code.
Store the vector to a local array and print the elements with whatever format string you want.
The elements come out in memory order, least significant first (like _mm_setr_epiX). Reverse the array indices if you prefer printing in the same order Intel's manuals use, where the most significant element is on the left (like _mm_set_epiX). Related: Convention for displaying vector registers
Using a __m128i* to load from an array of int is safe because the __m128 types are defined to allow aliasing, just like ISO C unsigned char*. (E.g. in gcc's headers, the definition includes __attribute__((may_alias)).)
The reverse isn't safe (pointing an int* onto part of a __m128i object). MSVC guarantees that's safe, but GCC/clang don't. (-fstrict-aliasing is on by default.) It sometimes works with GCC/clang, but why risk it? It sometimes even interferes with optimization; see this Q&A. See also Is `reinterpret_cast`ing between hardware SIMD vector pointer and the corresponding type an undefined behavior?
See GCC AVX __m256i cast to int array leads to wrong values for a real-world example of GCC breaking code which points an int* at a __m256i.
I suspect it is safe in GNU C to point a float * at a __m256 (which is defined in GNU C with float elements), or a long long * at a __m256i the way GCC and Clang define it, for the same reason. But GNU and MSVC aren't (or weren't) the only compilers; e.g. I've heard that SunCC was more picky about some related things like union type-punning in C++, IIRC.
(uint32_t*) &my_vector violates the C and C++ aliasing rules, and is not guaranteed to work the way you'd expect. Storing to a local array and then accessing it is guaranteed to be safe. It even optimizes away with most compilers, so you get movq / pextrq directly from xmm to integer registers instead of an actual store/reload, for example.
Source + asm output on the Godbolt compiler explorer: proof it compiles with MSVC and so on.
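As a rough sketch of the store-to-a-local-array approach described above, assuming a C11 compiler for alignas: the function names here (load4_u32, fmt128_hex_u32, fmt128_hex_u32_msb_first) are invented for illustration and are not taken from the Godbolt link. They format into a buffer instead of calling printf directly, which is a choice made for this example only.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdalign.h>   /* alignas, C11 */
#include <stdint.h>
#include <stdio.h>

/* Loading through a __m128i* cast is the safe direction. (Hypothetical name.) */
static __m128i load4_u32(const uint32_t *p)
{
    return _mm_loadu_si128((const __m128i *)p);
}

/* Hypothetical helper: hex digits of the four 32-bit elements in memory
   order, element 0 first (the argument order of _mm_setr_epi32). */
static int fmt128_hex_u32(char *buf, size_t size, __m128i in)
{
    alignas(16) uint32_t v[4];
    _mm_store_si128((__m128i *)v, in);  /* aligned store to the local array */
    return snprintf(buf, size, "%x %x %x %x",
                    (unsigned)v[0], (unsigned)v[1], (unsigned)v[2], (unsigned)v[3]);
}

/* Same store, indices reversed: most significant element on the left
   (the argument order of _mm_set_epi32). */
static int fmt128_hex_u32_msb_first(char *buf, size_t size, __m128i in)
{
    alignas(16) uint32_t v[4];
    _mm_store_si128((__m128i *)v, in);
    return snprintf(buf, size, "%x %x %x %x",
                    (unsigned)v[3], (unsigned)v[2], (unsigned)v[1], (unsigned)v[0]);
}
```

For a vector loaded from {1, 2, 3, 255}, the first function produces 1 2 3 ff and the second ff 3 2 1. The same pattern works for 8-, 16- and 64-bit elements by changing the array type and the format string.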
If you need portability to C99 or C++03 or earlier (i.e. without C11 / C++11), remove the alignas() and use storeu instead of store. Or use __attribute__((aligned(16))) or __declspec( align(16) ) instead.
(If you're writing code with intrinsics, you should be using a recent compiler version. Newer compilers usually make better asm than older compilers, including for SSE/AVX intrinsics. But maybe you want to use gcc-6.3 with -std=gnu++03 C++03 mode for a codebase that isn't ready for C++11 or something.)
The sample output from calling all 4 functions has no padding. Adjust the format strings if you want to pad with leading zeros for consistent output width. See printf(3).
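To illustrate the last two points together, here is a sketch of a C99-friendly variant: no alignas, an unaligned store, and a zero-padded format string. The name fmt128_hex_u16_padded is hypothetical, and the %04x width is just one reasonable choice for 16-bit elements.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>     /* uint16_t, C99 */
#include <stdio.h>

/* Hypothetical helper for pre-C11 compilers: the array has no alignment
   guarantee, so use the unaligned store. %04x pads each 16-bit element
   to four hex digits with leading zeros. */
static int fmt128_hex_u16_padded(char *buf, size_t size, __m128i in)
{
    uint16_t v[8];
    _mm_storeu_si128((__m128i *)v, in);
    return snprintf(buf, size, "%04x %04x %04x %04x %04x %04x %04x %04x",
                    (unsigned)v[0], (unsigned)v[1], (unsigned)v[2], (unsigned)v[3],
                    (unsigned)v[4], (unsigned)v[5], (unsigned)v[6], (unsigned)v[7]);
}
```

Every element then takes the same width, so rows of output from several vectors line up in columns.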
I know this question is tagged C, but it was also the top search result when looking for a C++ solution to the same problem.
So, this could be a C++ implementation:
Usage:
Result:
Note: there is a simple way to avoid the if (sizeof(T)==1), see https://stackoverflow.com/a/28414758/2436175
Try this code.
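The code this answer points to is not included here. As one possibility only, assuming the goal is just the single element the question tried to read as c[2], an SSE2-only sketch could shift that element down to the low position and move it to an int. Both function names are invented for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: returns 32-bit element 2 of v (element 0 is the
   lowest). Shifts the vector right by 8 bytes so element 2 lands in the
   low position, then moves the low 32 bits to an int. */
static int get_element2_epi32(__m128i v)
{
    return _mm_cvtsi128_si32(_mm_srli_si128(v, 8));
}

/* The question's addition, reduced to the one value it tried to print. */
static int question_c2(void)
{
    __m128i a = _mm_set_epi32(1, 2, 3, 4);
    __m128i b = _mm_set_epi32(1, 2, 3, 4);
    __m128i c = _mm_add_epi32(a, b);
    return get_element2_epi32(c);
}
```

printf("%d\n", question_c2()); prints 4: the elements of c from lowest to highest are 8, 6, 4, 2, so element 2 is 4. The byte count passed to _mm_srli_si128 has to be a compile-time constant, which is why this helper is fixed to one element.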