Does C struct padding make this use unsafe?
Suppose I have a struct, be it union'd or otherwise:
    typedef struct {
        union {
            struct { float x, y, z; } xyz;
            struct { float r, g, b; } rgb;
            float xyz[3];
        } notAnonymous;
    } Vector3;
I've heard that some compilers automatically pad structs to enhance performance by creating word-aligned boundaries.
Presumably such padding means the size of a struct cannot be guaranteed to be the sum of its component field sizes, and therefore there is a chance of data corruption and/or overflow for the array xyzs in the following:
    inline Vector3 v3Make(float x, float y, float z) { Vector3 v = {x,y,z}; return v; }
    float xyzs[6];
    *(Vector3*)&xyzs[3] = v3Make(4.0f,5.0f,6.0f);
    *(Vector3*)&xyzs[0] = v3Make(1.0f,2.0f,3.0f);
Correct?
It's true the compiler can lay out your structure with whatever kind of padding it wants. You can use #pragma pack or __attribute__((packed)) to avoid padding on most compilers. In practice, you have three 32-bit fields in there, so it's probably not going to be a problem. You can check by using sizeof on your structure type or a variable of that type and seeing what comes out.
What is a problem is that you're trying to assign a Vector3 to a float variable in your last two lines. That's not going to be allowed. You could hack in what you're trying to do, but that's pretty ugly looking, not to mention confusing. It would be a lot better to change xyzs to being an array of Vector3 rather than of just float.
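The sizeof check suggested in the answer above can also be done at compile time. The following is only a sketch, assuming a C11 compiler. The union and array member names `u` and `v` are made up for this example (a union cannot hold two members both named `xyz`, so the struct would not compile exactly as posted), and `vector3_padding_bytes` and `fill_two` are hypothetical helpers. It also shows the suggested alternative of storing Vector3 objects directly instead of casting into a float array.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Same shape as the Vector3 in the question. The array member is called
   `v` here (an assumption for this sketch), because a union cannot have
   two members that are both named `xyz`. */
typedef struct {
    union {
        struct { float x, y, z; } xyz;
        struct { float r, g, b; } rgb;
        float v[3];
    } u;
} Vector3;

/* Fail the build if the compiler added any padding to Vector3. */
static_assert(sizeof(Vector3) == 3 * sizeof(float),
              "Vector3 has unexpected padding");

/* Runtime form of the same check: number of padding bytes, if any. */
static size_t vector3_padding_bytes(void)
{
    return sizeof(Vector3) - 3 * sizeof(float);
}

static Vector3 v3Make(float x, float y, float z)
{
    Vector3 r = { .u = { .xyz = { x, y, z } } };
    return r;
}

/* Fill an array of Vector3 directly, so no pointer cast is needed. */
static void fill_two(Vector3 out[2])
{
    out[0] = v3Make(1.0f, 2.0f, 3.0f);
    out[1] = v3Make(4.0f, 5.0f, 6.0f);
}
```

If the static_assert ever fires on some platform, the cast-based code in the question would write past the three floats it intends to fill, which is exactly the overflow being asked about.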
See answers to the questions in C Defect Report #074
http://www.open-std.org/jtc1/sc22/wg14/docs/rr/dr_074.html
It's not inherently unsafe; the compiler/linker takes care of all the offsets.
Unless... you pass the struct to another program written in another language or on another system, or to another program written in the same language on the same system but with different compiler settings. Then the offsets may not match.
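One common way to sidestep the layout mismatch described above is to never hand the raw struct to the other program, and instead copy each field to a fixed offset in a byte buffer. The sketch below is an illustration under stated assumptions, not a full interchange format: `Vec3`, `VEC3_WIRE_SIZE`, `vec3_pack` and `vec3_unpack` are hypothetical names, and both sides are assumed to agree on a 32-bit float representation and byte order.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>   /* memcpy */

/* Hypothetical stand-in for the vector type under discussion. */
typedef struct { float x, y, z; } Vec3;

/* The sketch assumes both programs use 4-byte floats. */
static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");

enum { VEC3_WIRE_SIZE = 3 * sizeof(float) };

/* Write the components at fixed offsets 0, 4 and 8, regardless of any
   padding the compiler may have placed inside Vec3. */
static void vec3_pack(const Vec3 *v, unsigned char out[VEC3_WIRE_SIZE])
{
    memcpy(out + 0 * sizeof(float), &v->x, sizeof(float));
    memcpy(out + 1 * sizeof(float), &v->y, sizeof(float));
    memcpy(out + 2 * sizeof(float), &v->z, sizeof(float));
}

/* Read the components back from the same fixed offsets. */
static Vec3 vec3_unpack(const unsigned char in[VEC3_WIRE_SIZE])
{
    Vec3 v;
    memcpy(&v.x, in + 0 * sizeof(float), sizeof(float));
    memcpy(&v.y, in + 1 * sizeof(float), sizeof(float));
    memcpy(&v.z, in + 2 * sizeof(float), sizeof(float));
    return v;
}
```

Because the offsets are written out explicitly, a change in packing pragmas or compiler flags on either side does not alter the bytes that are exchanged.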
Per the C standard this is at least implementation-defined (dependent on padding issues) and perhaps undefined behavior (due to aliasing rules?), but on all real-world compilers it will work as expected. The alignment of a type can never be larger than its size (it always divides the size of the type evenly), and only a pathologically bad compiler would insert additional padding in structs beyond what's needed to pad each member to the correct alignment for its type.
With that said, in my book at least, this kind of hack is a gratuitous invocation of undefined behavior for the sake of syntactic vinegar and is not acceptable. If there's ever a chance you want to access data in array style, simply always use the array form. It's much less confusing to remember that the vector components are always v[0], v[1], and v[2] than to remember that v[1] and rgb.g might refer to the same object in memory...
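The "always use the array form" advice could look roughly like the sketch below. It is an illustration rather than the answerer's own code: the index names `V3_X`, `V3_Y`, `V3_Z` and the helper `v3Store` are invented here. Copying with memcpy puts the components into the flat float buffer without ever treating that buffer as a Vector3, so neither padding nor aliasing rules come into play.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t */
#include <string.h>   /* memcpy */

/* Hypothetical index names so call sites stay readable. */
enum { V3_X = 0, V3_Y = 1, V3_Z = 2, V3_LEN = 3 };

/* Array form only: there is exactly one way to reach each component. */
typedef struct { float v[V3_LEN]; } Vector3;

static Vector3 v3Make(float x, float y, float z)
{
    Vector3 r = { { x, y, z } };
    return r;
}

/* Copy a vector into a flat buffer of `count` floats, starting at element
   `offset`. Only the three floats are copied, never the whole struct. */
static void v3Store(float *dst, size_t count, size_t offset, const Vector3 *src)
{
    assert(offset <= count && count - offset >= V3_LEN);
    memcpy(dst + offset, src->v, sizeof src->v);
}
```

With these helpers, the two assignments from the question become v3Store(xyzs, 6, 3, &b) and v3Store(xyzs, 6, 0, &a) for vectors a and b built with v3Make.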