x86 CPU dispatching for SSE/AVX in C++
I have an algorithm which benefits from hand optimisation with SSE(2) intrinsics. Moreover, the algorithm will also be able to benefit from the 256-bit AVX registers in the future.
My question is: what is the best way to
- Register the available variants of my class at compile time? If my classes are, say, Foo, FooSSE2 and FooAVX, I need a means of determining at runtime which classes are compiled in.
- Determine the capabilities of the current CPU? At the lowest level this will result in a cpuid call.
- Decide at runtime what to use, based on what is compiled in and what is supported?
While I can hack most of the above, it seems to be a common enough problem that some best practices must have emerged. Ideally I am trying to avoid the #ifdef mess:
#ifdef COMPILE_SSE2
if (sse2_supported)
// Use the SSE2 class
#endif
Comments (1)
Just create a "factory" class or function to create appropriate instances of your class and hide all the logic in the file that implements the factory.
Keep some class-local or file-local boolean values such as "isSSE2Supported" and "isAVXSupported". On startup, call a function that initializes them. Your factory logic can then check these values to decide which class to instantiate.
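One way the factory described above might look is sketched below. Only the names Foo, FooSSE2, FooAVX and COMPILE_SSE2 come from the question; COMPILE_AVX, CpuFeatures, detectCpuFeatures, cpuFeatures, makeFoo and the name() method are hypothetical, added for illustration. Detection here assumes GCC or Clang on x86 and uses their __builtin_cpu_supports builtin rather than a raw cpuid call; on other compilers the sketch simply falls back to the plain class.

```cpp
#include <memory>
#include <string>

// Hypothetical hierarchy: the class names come from the question,
// the name() method is an assumption used only to show which variant was chosen.
class Foo {
public:
    virtual ~Foo() = default;
    virtual std::string name() const { return "Foo"; }
};

#if defined(COMPILE_SSE2)
class FooSSE2 : public Foo {
public:
    std::string name() const override { return "FooSSE2"; }
};
#endif

#if defined(COMPILE_AVX)  // hypothetical macro, mirroring COMPILE_SSE2
class FooAVX : public Foo {
public:
    std::string name() const override { return "FooAVX"; }
};
#endif

namespace {

// File-local flags, as suggested in the answer.
struct CpuFeatures {
    bool sse2 = false;
    bool avx = false;
};

CpuFeatures detectCpuFeatures() {
    CpuFeatures f;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    // GCC/Clang builtins; other toolchains would need their own query here.
    __builtin_cpu_init();
    f.sse2 = __builtin_cpu_supports("sse2") != 0;
    f.avx = __builtin_cpu_supports("avx") != 0;
#endif
    return f;
}

// Initialized once, on first use.
const CpuFeatures& cpuFeatures() {
    static const CpuFeatures features = detectCpuFeatures();
    return features;
}

}  // namespace

// The only place that needs to know which variants were compiled in.
std::unique_ptr<Foo> makeFoo() {
    const CpuFeatures& cpu = cpuFeatures();
    (void)cpu;  // unused when no optimized variant is compiled in
#if defined(COMPILE_AVX)
    if (cpu.avx) return std::unique_ptr<Foo>(new FooAVX());
#endif
#if defined(COMPILE_SSE2)
    if (cpu.sse2) return std::unique_ptr<Foo>(new FooSSE2());
#endif
    return std::unique_ptr<Foo>(new Foo());
}
```

Callers would only ever write something like `auto foo = makeFoo();`, so the #ifdef blocks stay confined to the one file that implements the factory.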
Since SSE2 is always available on x64 chips, you don't really want to avoid all of the ifdefs: they let you leave some of the classes out of x64 builds entirely.
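As a rough sketch of that last point, the compile-time knowledge can be folded into a single helper so the rest of the code does not repeat the check. The macro FOO_SSE2_IS_BASELINE and the functions sse2IsBaseline and canUseSse2 are hypothetical names; __x86_64__ and _M_X64 are the predefined macros that GCC/Clang and MSVC respectively set for x64 targets.

```cpp
// SSE2 is part of the x86-64 baseline, so on x64 targets the answer is
// known at compile time and no runtime check is needed for it.
#if defined(__x86_64__) || defined(_M_X64)
#define FOO_SSE2_IS_BASELINE 1  // hypothetical macro name
#else
#define FOO_SSE2_IS_BASELINE 0
#endif

constexpr bool sse2IsBaseline() { return FOO_SSE2_IS_BASELINE != 0; }

// Combines the compile-time guarantee with a runtime-detected flag
// (for example the "isSSE2Supported" value mentioned in the answer).
inline bool canUseSse2(bool detectedAtRuntime) {
    return sse2IsBaseline() || detectedAtRuntime;
}
```

On an x64 build the same macro could also guard the non-SSE2 fallback class, so that it is not compiled at all.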