C++ fixed-size arrays vs. multiple objects of the same type
I was wondering whether (apart from the obvious syntax differences) there would be any efficiency difference between a class containing multiple instances of an object (of the same type) and a class containing a fixed-size array of objects of that type.
In code:
struct A {
double x;
double y;
double z;
};
struct B {
double xvec[3];
};
In reality I would be using boost::arrays which are a better C++ alternative to C-style arrays.
I am mainly concerned with construction/destruction and reading/writing such doubles, because these classes will often be constructed just to invoke one of their member functions once.
Thank you for your help/suggestions.
Comments (6)
Typically the representation of those two structs would be exactly the same. It is, however, possible to have poor performance if you pick the wrong one for your use case.
For example, if you need to access each element in a loop, with an array you could do:
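A minimal sketch of the array version, using struct B from the question. The scale() function and its factor parameter are illustrative names, not something from the original post:

```cpp
// Struct B as given in the question.
struct B {
    double xvec[3];
};

// Hypothetical operation: multiply every component by the same factor.
// One loop body covers all three elements.
void scale(B& b, double factor) {
    for (int i = 0; i < 3; ++i) {
        b.xvec[i] *= factor;
    }
}
```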
However, without an array, you'd either need to duplicate code:
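A sketch of the duplicated form, using struct A from the question and the same hypothetical scale() operation (an assumed name, for illustration only):

```cpp
// Struct A as given in the question.
struct A {
    double x;
    double y;
    double z;
};

// Hypothetical operation: the same statement is written once per field,
// because named fields cannot be indexed at run time.
void scale(A& a, double factor) {
    a.x *= factor;
    a.y *= factor;
    a.z *= factor;
}
```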
This means code duplication - which can go either way. On the one hand there's less loop code; on the other hand very tight loops can be quite fast on modern processors, and code duplication can blow away the I-cache.
The other option is a switch:
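Roughly, the switch variant could look like the sketch below. The component() helper is a made-up name; the point is only that a runtime index gets mapped to a field through a branch on every access:

```cpp
#include <cassert>

// Struct A as given in the question.
struct A {
    double x;
    double y;
    double z;
};

// Hypothetical helper: pick a field by index. The switch is evaluated
// each time an element is accessed.
double& component(A& a, int i) {
    switch (i) {
        case 0: return a.x;
        case 1: return a.y;
        default:
            assert(i == 2);  // only indices 0..2 are valid
            return a.z;
    }
}

// The loop body is written once, but every iteration pays for the switch.
void scale(A& a, double factor) {
    for (int i = 0; i < 3; ++i) {
        component(a, i) *= factor;
    }
}
```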
This avoids the possibly-large i-cache footprint, but has a huge negative performance impact. Don't do this.
On the other hand, it is, in principle, possible for array accesses to be slower, if your compiler isn't very smart:
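Something along these lines illustrates the situation. Here consume() is a hypothetical stand-in for whatever uses the value, and update_and_use() is likewise an assumed name:

```cpp
// Struct B as given in the question.
struct B {
    double xvec[3];
};

// Hypothetical stand-in for any code that uses the value.
double consume(double v) {
    return v * 2.0;
}

double update_and_use(B& b) {
    b.xvec[0] = b.xvec[1] + 1.0;  // reads xvec[1], writes xvec[0]
    return consume(b.xvec[1]);    // xvec[1] was not modified by the line above
}
```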
Since xvec[0] and xvec[1] are distinct, in principle, the compiler ought to be able to keep the value of xvec[1] in a register, so it doesn't have to reload the value at the next line. However, it's possible some compilers might not be smart enough to notice that xvec[0] and xvec[1] don't alias. In this case, using separate fields might be a very tiny bit faster.
In short, it's not about one or the other being fast in all cases. It's about matching the representation to how you use it.
Personally, I would suggest going with whatever makes the code working on xvec most natural. It's not worth spending a lot of human time worrying about something that, at best, will probably only produce such a small performance difference that you'll only catch it in micro-benchmarks.
MSVC++ 2010 generated exactly the same code for reading/writing from two POD structs like in your example. Since the offsets to read/write to are computable at compile time, this is not surprising. Same goes for construction and destruction.
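One way to check the "offsets are known at compile time" part yourself is with static_assert. This is only a sketch: the asserted layout holds on mainstream ABIs, but the standard does not strictly promise that A has no padding, and the sum() overloads are illustrative names:

```cpp
#include <cstddef>  // offsetof

struct A {
    double x;
    double y;
    double z;
};

struct B {
    double xvec[3];
};

// Assumption: no padding between the doubles in A (true on common platforms).
static_assert(sizeof(A) == sizeof(B), "A and B have the same size");
static_assert(offsetof(A, y) == 1 * sizeof(double), "y sits where xvec[1] would");
static_assert(offsetof(A, z) == 2 * sizeof(double), "z sits where xvec[2] would");

// Both overloads read three doubles at fixed offsets from the object address.
double sum(const A& a) { return a.x + a.y + a.z; }
double sum(const B& b) { return b.xvec[0] + b.xvec[1] + b.xvec[2]; }
```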
As for the actual performance, the general rule applies: profile it if it matters; if it doesn't, why care?
Indexing into an array member is perhaps a bit more work for the user of your struct, but then again, he can more easily iterate over the elements.
In case you can't decide and want to keep your options open, you can use an anonymous union:
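A sketch of what that could look like. The type names XYZ and C are made up for illustration. Note that reading a union member other than the one last written is formally undefined behaviour in C++, although some compilers (GCC, for one) document it as supported, so treat the aliasing itself as a compiler-specific assumption:

```cpp
// Hypothetical helper type holding the named components.
struct XYZ {
    double x;
    double y;
    double z;
};

// Hypothetical wrapper: the anonymous union lets both views share storage.
struct C {
    union {
        XYZ xyz;         // named access: c.xyz.x, c.xyz.y, c.xyz.z
        double xvec[3];  // indexed access: c.xvec[i]
    };
};
```

With this, c.xvec[i] can be used in loops and c.xyz.x where a name reads better.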
Some compilers also support anonymous structs; in that case you can leave the xyz part out.
It depends. For instance, the example you gave is a classic one in favor of 'old-school' arrays: a math point/vector (or matrix)
private in an object
interface, you can properly initialize them in the constructor (otherwise, classic array initialization is something I don't really like, syntax-wise)
In such cases (going with the math vector/matrix examples), I always ended up using C-style arrays internally, as you can loop over them instead of writing copy/pasted code for each component.
But this is a special case -- for me, in C++ nowadays arrays == STL vector, it's fast and I don't have to worry about nuthin' :)
The difference can be in how the variables are stored in memory. In the first example the compiler can add padding to align the data. But in your particular case it doesn't matter.
Raw arrays offer better cache locality than C++ arrays. As presented, however, the array example's only advantage over the multiple objects is the ability to iterate over the elements.
The real answer is of course, create a test case and measure.
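A rough starting point for such a test case, assuming the construct-use-discard pattern from the question. All function names here are made up, and an optimizing compiler may fold much of this away, so inspect the generated code or use a proper benchmarking harness before trusting the numbers:

```cpp
#include <chrono>

struct A {
    double x;
    double y;
    double z;
};

struct B {
    double xvec[3];
};

// Construct an A, use it once, discard it; repeat n times.
double run_a(int n) {
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        A a = {i * 1.0, i * 2.0, i * 3.0};
        total += a.x + a.y + a.z;
    }
    return total;
}

// Same workload using the array-based struct.
double run_b(int n) {
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        B b = {{i * 1.0, i * 2.0, i * 3.0}};
        total += b.xvec[0] + b.xvec[1] + b.xvec[2];
    }
    return total;
}

// Time one call of f(n) in milliseconds; the computed value is written to
// result so the work has an observable effect.
template <typename F>
double time_ms(F f, int n, double& result) {
    auto start = std::chrono::steady_clock::now();
    result = f(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Call time_ms(run_a, n, result) and time_ms(run_b, n, result) with a large n, several times each, and compare the timings.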