Strange GCC behavior
Given the following C++ code:
struct vertex_type {
    float x, y, z;
    //vertex_type() {}
    //vertex_type(float x, float y, float z) : x(x), y(y), z(z) {}
};
typedef struct {
    vertex_type vertex[10000];
} obj_type;
obj_type cube = {
    {
        {-1, -1, -1},
        {1, -1, -1},
        {-1, 1, -1},
        {1, 1, -1},
        {-1, -1, 1},
        {1, -1, 1},
        {-1, 1, 1},
        {1, 1, 1}
    }
};
int main() {
    return 0;
}
When I added the (currently commented out) constructors to the vertex_type struct, compilation time abruptly rose by 10-15 seconds.
Stumped, I looked at the assembly generated by gcc (using -S), and noticed that the generated code was several hundred times bigger than before.
...
movl $0x3f800000, cube+84(%rip)
movl $0x3f800000, cube+88(%rip)
movl $0x3f800000, cube+92(%rip)
movl $0x00000000, cube+96(%rip)
...
movl $0x00000000, cube+119996(%rip)
...
With the constructor definitions left out, the generated assembly was completely different.
.globl cube
.data
.align 32
.type cube, @object
.size cube, 120
cube:
.long 3212836864
.long 3212836864
.long 3212836864
.long 1065353216
.long 3212836864
.long 3212836864
.long 3212836864
.long 1065353216
.long 3212836864
.long 1065353216
.long 1065353216
.long 3212836864
.long 3212836864
.long 3212836864
.long 1065353216
.long 1065353216
.long 3212836864
.long 1065353216
.long 3212836864
.long 1065353216
.long 1065353216
.long 1065353216
.long 1065353216
.long 1065353216
.zero 24
.text
Obviously there is a significant difference in the code generated by the compiler.
Why is that?
Also, why does gcc zero all the elements in one situation and not the other?
Edit: I am using g++ 4.5.2 with the following compiler flag: -std=c++0x
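For readers comparing the two listings: the decimal .long values and the hexadecimal movl immediates are the same kind of data, namely the raw bit patterns of the float members. The helper below is a small sketch for decoding them; the name bits_to_float is made up for illustration, and it assumes float is a 32-bit IEEE-754 type, as it is on the x86-64 target shown above.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the question): reinterpret a 32-bit pattern
// as a single-precision float. Assumes a 32-bit IEEE-754 float.
float bits_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    float value;
    // memcpy is the portable way to copy the object representation.
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

With it, bits_to_float(3212836864u) yields -1.0f and bits_to_float(0x3f800000u) yields 1.0f, so both listings store the same cube coordinates; one keeps them as a data table and the other writes them one store at a time.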
Comments (2)
This is a long-standing missing optimization in GCC. It should be able to generate the same code for both cases, but it can't.
Without the constructors, your vertex_type is a POD structure, and GCC can initialize static/global instances of it at compile time. With the constructors, the best it can do is generate code to initialize the global at program startup.
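One way to keep the constructors and still make compile-time initialization possible is to mark them constexpr. This is a sketch under assumptions, not something either answer proposes: it needs a compiler with C++11 constexpr support, which as far as I know is newer than the g++ 4.5.2 from the question. The default constructor zeroes the members (the original left them uninitialized) because a C++11 constexpr constructor must initialize every member, and cube is declared constexpr here only so that the static_assert lines can check the values.

```cpp
struct vertex_type {
    float x, y, z;
    // constexpr constructors can be evaluated by the compiler itself.
    constexpr vertex_type() : x(0), y(0), z(0) {}
    constexpr vertex_type(float x, float y, float z) : x(x), y(y), z(z) {}
};

struct obj_type {
    vertex_type vertex[10000];
};

// The eight listed vertices use the three-argument constructor; the
// remaining elements are value-initialized through the default constructor.
constexpr obj_type cube = {
    {
        {-1, -1, -1},
        {1, -1, -1},
        {-1, 1, -1},
        {1, 1, -1},
        {-1, -1, 1},
        {1, -1, 1},
        {-1, 1, 1},
        {1, 1, 1}
    }
};

// These only compile if the whole initializer is a constant expression.
static_assert(cube.vertex[0].x == -1.0f, "first vertex set at compile time");
static_assert(cube.vertex[7].z == 1.0f, "last listed vertex set at compile time");
static_assert(cube.vertex[8].x == 0.0f, "remaining vertices are zeroed");
```

Because every constructor call in the initializer is a constant expression, the object qualifies for constant initialization, so no startup code should be needed to fill it in.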
If you have a custom constructor, the compiler has to call it for every vertex it creates. If you don't write your own, it falls back to the implicitly generated constructor. But since none of the member types is complex, nothing actually needs to be called, and the array is stored as a constant table in the binary.
Try inlining your default constructor and leaving it empty. Of course, it may only work with optimization enabled.
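The distinction both answers rely on can be checked with the standard type traits. The sketch below uses three illustrative struct names that are not from the question, and it swaps the "empty inline constructor" suggestion for a C++11 defaulted constructor (= default), which is a different technique: an empty user-written body still counts as user-provided, whereas a defaulted one keeps the default constructor trivial.

```cpp
#include <type_traits>

// No constructors at all: the POD case from the question.
struct pod_vertex {
    float x, y, z;
};

// The constructors from the question, uncommented.
struct user_ctor_vertex {
    float x, y, z;
    user_ctor_vertex() {}  // empty, but still user-provided
    user_ctor_vertex(float x, float y, float z) : x(x), y(y), z(z) {}
};

// Assumption for illustration: default constructor defaulted instead.
struct defaulted_vertex {
    float x, y, z;
    defaulted_vertex() = default;  // not user-provided, so it stays trivial
    defaulted_vertex(float x, float y, float z) : x(x), y(y), z(z) {}
};

static_assert(std::is_trivial<pod_vertex>::value, "no constructors: trivial");
static_assert(!std::is_trivial<user_ctor_vertex>::value,
              "a user-provided default constructor makes the type non-trivial");
static_assert(std::is_trivial<defaulted_vertex>::value,
              "a defaulted default constructor keeps the type trivial");
```

Being trivial does not by itself guarantee that a compiler emits a data table for the initialized elements, since the three-argument constructor is still an ordinary function; it only shows where the language draws the line that the answers describe.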