Is it possible to access the inner values of a __m128 variable as attributes in a C++ class?
I would like to have a Vector class (representing a vector of 3 floats) implemented with SSE intrinsics, so I will not use the fourth element of the __m128 type. But I would like to access the components as easily as attributes: myVector.x would access bits 0-31 of vec, myVector.y would access bits 32-63 of vec, and so on, without having to call some getX() method. The 'x' attribute would be a sort of alias for bits 0-31 of 'vec'.
Is it possible?
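#include <xmmintrin.h>  // __m128

// Desired interface: x, y, z should alias the first three lanes of vec.
// As written they are separate floats with their own storage.
class Vector {
public:
    float x;
    float y;
    float z;
private:
    __m128 vec;
};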
Comments (3)
No, because this violates the strict aliasing rule.
Sure, you can use casts or unions to pretend the __m128 is an array of floats, but the optimizer will not maintain coherency for you, because you're breaking the language's rules. See "What is the strict aliasing rule?"
(According to the rule, access using a union is safe, but that only applies when you are naming the union. Taking a pointer or reference to a union member and then using that pointer or reference directly later is not safe.)
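For comparison, here is a minimal sketch of reading individual lanes without casting the __m128 object itself. It assumes an x86 target with SSE; the helper names lanes_of and lane0 are hypothetical and not part of any library. Both helpers go through intrinsics, so the code never reads the __m128 through a float pointer of its own making.

```cpp
#include <xmmintrin.h>  // SSE: __m128, _mm_storeu_ps, _mm_cvtss_f32

// Hypothetical helper: copies all four lanes into a caller-provided
// float array. The unaligned store has no alignment requirement on 'out'.
inline void lanes_of(__m128 v, float out[4]) {
    _mm_storeu_ps(out, v);
}

// Hypothetical helper: returns lane 0 (bits 0-31) of the register.
inline float lane0(__m128 v) {
    return _mm_cvtss_f32(v);
}
```

Note that _mm_set_ps takes its arguments from the highest lane to the lowest, so _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f) puts 1.0f in lane 0.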
You could perhaps use a union, something like a __m128 member named vec overlaid with a float array named xyz.
Then the floats would be aVec.xyz[0], aVec.xyz[1], and aVec.xyz[2], and the __m128 would be aVec.vec. The float array has four elements here, but nothing says you have to use the fourth one.
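A minimal sketch of such a union, using the member names xyz and vec from the text. The type name Vec3Union and the helper make_vec are hypothetical. Reading xyz after writing vec is type punning through a union: GCC documents this as supported, but ISO C++ does not guarantee it, so treat this as compiler-dependent.

```cpp
#include <xmmintrin.h>  // SSE: __m128, _mm_set_ps

// Hypothetical type name; member names follow the answer's text.
union Vec3Union {
    __m128 vec;     // all four lanes as one SSE value
    float  xyz[4];  // the same 16 bytes viewed as floats; xyz[3] is unused
};

// Hypothetical helper: builds the union from three components.
// _mm_set_ps lists lanes from highest to lowest, so x ends up in lane 0.
inline Vec3Union make_vec(float x, float y, float z) {
    Vec3Union u;
    u.vec = _mm_set_ps(0.0f, z, y, x);
    return u;
}
```

With this, a value created as Vec3Union aVec = make_vec(1.0f, 2.0f, 3.0f) is read back through aVec.xyz[0] to aVec.xyz[2], always by naming the union object rather than through a stored pointer to a member.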
You can write a struct, say Vec4f, which automatically converts to and from __m128.
This has the disadvantage that Vec4f would be passed via two SSE registers instead of one (when passed by value: https://godbolt.org/z/sutmuM).
Overall, I'd suggest instead making a struct which just contains an __m128 and provides x(), y(), etc. methods. Element-wise operations on SSE registers should be avoided anyway if possible (except using the zeroth element).
N.B.: alignas(16) requires C++11; there are compiler-specific alternatives for most compilers. Alternatively, you can use _mm_loadu_ps and _mm_storeu_ps instead.
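The two designs from this answer can be sketched roughly as follows. This is a reconstruction under assumptions, not the answer's original code: Vec4f is the name used in the text, while its member layout, the wrapper name Vector3, and its operator+ are hypothetical choices for illustration.

```cpp
#include <xmmintrin.h>  // SSE: __m128, load/store, shuffle, add

// Design 1: named float members with implicit conversion to and from __m128.
// alignas(16) makes the aligned load/store intrinsics valid on the object.
struct alignas(16) Vec4f {
    float x, y, z, w;

    // Aligned store of all four lanes into the object. The struct is
    // standard-layout, so its address is also the address of x.
    Vec4f(__m128 v) { _mm_store_ps(reinterpret_cast<float*>(this), v); }

    // Aligned load of x, y, z, w back into a register.
    operator __m128() const {
        return _mm_load_ps(reinterpret_cast<const float*>(this));
    }
};
static_assert(sizeof(Vec4f) == 16, "four floats, no padding expected");

// Design 2: keep only the __m128 and expose accessor methods.
struct Vector3 {
    __m128 v;

    Vector3(float x, float y, float z) : v(_mm_set_ps(0.0f, z, y, x)) {}
    explicit Vector3(__m128 m) : v(m) {}

    // Lane 0 is read directly; other lanes are first broadcast to lane 0.
    float x() const { return _mm_cvtss_f32(v); }
    float y() const {
        return _mm_cvtss_f32(_mm_shuffle_ps(v, v, _MM_SHUFFLE(1, 1, 1, 1)));
    }
    float z() const {
        return _mm_cvtss_f32(_mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 2, 2, 2)));
    }

    // Whole-register arithmetic stays in SSE with no per-element access.
    friend Vector3 operator+(Vector3 a, Vector3 b) {
        return Vector3(_mm_add_ps(a.v, b.v));
    }
};
```

Design 1 gives the myVector.x syntax from the question at the cost of a store and load on each conversion. Design 2 needs the parentheses in myVector.x() but keeps the value in a single register.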