_mm_loadu_si32 not recognized by GCC on Ubuntu
When I try to use _mm_loadu_si32, VS Code gives me the error message: a value of type "int" cannot be used to initialize an entity of type "__m128i"
When trying to compile, I get the error message: implicit declaration of function '_mm_loadu_si32'
The weird part is that a couple of lines before _mm_loadu_si32, I'm using _mm_loadu_si128 without any problems. _mm_loadu_si64 also works.
Also, on Windows, my program compiles.
I ran sudo apt-get update and sudo apt-get upgrade, so the problem isn't outdated software. Is this some kind of GCC bug restricted to Ubuntu?
OS: Ubuntu 20.04
gcc: 9.4.0
Comments (1)
Your GCC is too old: you need GCC11 for it to be defined by immintrin.h.
And you need GCC11.3 or GCC12 for a non-broken version that puts the loaded bytes in the correct place in the resulting vector, and that is alignment / strict-aliasing safe. See GCC bug 99754.
GCC and/or clang sometimes miss defining some "helper" intrinsics, only eventually getting around to them. This is one of those cases, and even worse, the first attempt at adding it was buggy. There are GCC versions out there (GCC11.0 through 11.2) which support it but mis-compile it (shuffling the dword or word into the top element after loading, instead of the bottom, because they used _mm_set instead of _mm_setr in the header implementation).
The FP equivalent 4-byte load, __m128 _mm_load_ss(float*), has been defined forever, but is still not alignment or strict-aliasing safe in GCC's implementation like it is in other compilers. GCC's header derefs the float*, instead of using memcpy or an __attribute__((aligned(1),may_alias)) pointer type. That's GCC bug PR84508.
So unfortunately, in GCC, it's not safe to use _mm_castps_si128( _mm_load_ss( (float*)ptr )) either.
Portable implementation for older compilers
Your best bet for an aliasing-safe unaligned 4-byte load is probably this portable implementation:
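A minimal sketch of such a load, assuming the memcpy approach that the answer names as the safe alternative to dereferencing a typed pointer. The helper name loadu_si32_portable is hypothetical (not from the original answer), chosen so it cannot clash with a real _mm_loadu_si32 on newer compilers:

```c
#include <string.h>     /* memcpy */
#include <emmintrin.h>  /* SSE2: __m128i, _mm_cvtsi32_si128 */

/* Hypothetical helper name, for illustration only. */
static inline __m128i loadu_si32_portable(const void *src)
{
    int tmp;
    /* memcpy makes no alignment assumption and does not violate strict aliasing */
    memcpy(&tmp, src, sizeof tmp);
    /* tmp goes into the low 32 bits; the upper 96 bits are zeroed */
    return _mm_cvtsi32_si128(tmp);
}
```

A call such as loadu_si32_portable(bytes + 1) is then fine even when the pointer is misaligned or points into a buffer of a different type.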
This compiles nicely on GCC/clang/MSVC (Godbolt showing all). Both old and new versions of GCC and clang: tested GCC4.7 and GCC12, just the expected movd xmm0, [rdi] / ret.
But it compiles stupidly on ICC, loading into EAX and then either store/reload or movd xmm0, eax, instead of a memory source operand for movd.
This is also useful as a building-block for pmovzx / pmovsx loads (one of the significant use-cases for narrow loads into __m128i, especially unaligned and aliasing-safe loads), such as
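One possible shape for such a load, as a sketch only: it zero-extends four bytes to four 32-bit lanes with _mm_cvtepu8_epi32 (pmovzxbd, SSE4.1). The names loadu_si32_portable, load_4xu8_to_epi32 and TARGET_SSE41 are hypothetical, and the target attribute is an assumption made so that GCC/clang accept the SSE4.1 intrinsic without -msse4.1 on the command line:

```c
#include <string.h>     /* memcpy */
#include <stdint.h>     /* uint8_t */
#include <immintrin.h>  /* SSE2 and SSE4.1 intrinsics */

#if defined(__GNUC__) || defined(__clang__)
#define TARGET_SSE41 __attribute__((target("sse4.1")))
#else
#define TARGET_SSE41    /* e.g. MSVC needs no per-function target */
#endif

/* Hypothetical helper: aliasing-safe unaligned 4-byte load into the low dword. */
static inline __m128i loadu_si32_portable(const void *src)
{
    int tmp;
    memcpy(&tmp, src, sizeof tmp);
    return _mm_cvtsi32_si128(tmp);
}

/* Hypothetical helper: load 4 bytes and zero-extend each to a 32-bit lane. */
TARGET_SSE41
static inline __m128i load_4xu8_to_epi32(const uint8_t *p)
{
    return _mm_cvtepu8_epi32(loadu_si32_portable(p));
}
```

Only the low 4 bytes of the source vector matter to pmovzxbd, so a 4-byte load reads no more memory than the operation needs.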