Vector [] vs. copying
What is faster and/or generally better?
vector<myType> myVec;
int i;
myType current;
for( i = 0; i < 1000000; i ++ )
{
    current = myVec[ i ];
    doSomethingWith( current );
    doAlotMoreWith( current );
    messAroundWith( current );
    checkSomeValuesOf( current );
}
or
vector<myType> myVec;
int i;
for( i = 0; i < 1000000; i ++ )
{
    doSomethingWith( myVec[ i ] );
    doAlotMoreWith( myVec[ i ] );
    messAroundWith( myVec[ i ] );
    checkSomeValuesOf( myVec[ i ] );
}
I'm currently using the first solution. There are really millions of calls per second, and every single bit comparison or move matters for performance.
Comments (4)
The first version may be needlessly expensive because it relies on creating a copy of the object in the vector. Unless myType is some very small and simple object, like an int, storing a reference may be a better idea. It should also be declared when you need it, and no earlier, to limit aliasing issues that might otherwise cause the compiler to emit less efficient code.
One advantage of creating a copy, rather than using a reference, is that it might cause the compiler to load the object into a register, rather than read it from memory on every access. So both versions are worth trying.
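As a rough illustration of the copy-versus-reference trade-off described above, the sketch below counts how often the element type is copied in each variant. `MyType`, its 64-int payload, the global counter `g_copies`, and the bodies of the worker functions are all assumptions made for this example; the real `myType` is unknown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Global copy counter, used only to make copies visible in this sketch.
std::size_t g_copies = 0;

// Hypothetical stand-in for myType: big enough that copying is not free.
struct MyType {
    int payload[64];

    MyType() : payload() {}  // zero-initialize the array
    MyType(const MyType& other) { copyFrom(other); }
    MyType& operator=(const MyType& other) {
        copyFrom(other);
        return *this;
    }

private:
    void copyFrom(const MyType& other) {
        for (std::size_t k = 0; k < 64; ++k) payload[k] = other.payload[k];
        ++g_copies;
    }
};

// Hypothetical workers; both take a const reference, so the calls never copy.
int doSomethingWith(const MyType& value) { return value.payload[0]; }
int checkSomeValuesOf(const MyType& value) { return value.payload[63]; }

// Variant A: copy each element into a local object, as in the question.
int sumWithLocalCopy(const std::vector<MyType>& myVec) {
    int sum = 0;
    for (std::size_t i = 0; i < myVec.size(); ++i) {
        MyType current = myVec[i];       // copy constructor runs here
        assert(&current != &myVec[i]);   // a distinct object
        sum += doSomethingWith(current) + checkSomeValuesOf(current);
    }
    return sum;
}

// Variant B: bind a reference at the point of use; no copy is made.
int sumWithReference(const std::vector<MyType>& myVec) {
    int sum = 0;
    for (std::size_t i = 0; i < myVec.size(); ++i) {
        const MyType& current = myVec[i];  // just another name for the element
        assert(&current == &myVec[i]);
        sum += doSomethingWith(current) + checkSomeValuesOf(current);
    }
    return sum;
}
```

Resetting `g_copies` to zero before each call and reading it afterwards shows one copy per element for variant A and none for variant B. Whether that difference matters in practice depends on how expensive the real copy is.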
And of course, the copy vs. reference advice applies to each of your functions too. Do they take the argument by value or by reference? Depending on what they do with it, and how myType is defined, one might be faster than the other.
The second version is flawed because (unless the compiler is able to optimize it away) it requires the object to be looked up in memory every time. Depending on your STL implementation, there may also be a bit of overhead due to bounds checking on operator[].
Creating a temporary first, which is then passed to each of your functions, is the right way to go. The question is whether that temporary should be of value type (myType) or reference type (myType& / const myType&).
Another option that may be worth exploring is putting each function call in its own separate loop. That hurts data locality in some ways, but if some of the functions use a lot of local data, it might perform better. It might also play nicer with the instruction cache.
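The "one loop per function" idea mentioned above could look roughly like this. The element type and the two worker bodies are invented for the example, and the transformation is only valid when each function touches its own element independently of the others. `processBothWays` is a hypothetical helper that checks both layouts produce the same data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for myType.
struct MyType { int value; };

// Hypothetical workers that modify only the element they are given.
void doSomethingWith(MyType& item) { item.value += 1; }
void messAroundWith(MyType& item) { item.value *= 2; }

// Original layout: every function is called for an element before moving on.
void processFused(std::vector<MyType>& myVec) {
    for (std::size_t i = 0; i < myVec.size(); ++i) {
        MyType& current = myVec[i];
        doSomethingWith(current);
        messAroundWith(current);
    }
}

// Split layout: one pass over the vector per function.
void processSplit(std::vector<MyType>& myVec) {
    for (std::size_t i = 0; i < myVec.size(); ++i) doSomethingWith(myVec[i]);
    for (std::size_t i = 0; i < myVec.size(); ++i) messAroundWith(myVec[i]);
}

// Runs both layouts on copies of the input and checks that they agree.
std::vector<MyType> processBothWays(const std::vector<MyType>& input) {
    std::vector<MyType> fused = input;
    std::vector<MyType> split = input;
    processFused(fused);
    processSplit(split);
    assert(fused.size() == split.size());
    for (std::size_t i = 0; i < fused.size(); ++i) {
        assert(fused[i].value == split[i].value);
    }
    return split;
}
```

Which layout is faster is not something the code can tell you; it has to be measured on the real workload.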
But really, performance is extremely complicated. Caching, out-of-order execution, the exact semantics of myType (especially its copy constructor and size) and the amount of optimization performed by the compiler are all unknown to us. So we cannot give you a reliable answer.
Guess who can: your compiler. Write the test. Try both. Time the results. Pick the faster one.
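A minimal timing harness for the "try both and time them" advice might look like the following. It uses `std::chrono::steady_clock` from the standard library; `MyType`, the worker bodies, and the helper names `millisecondsFor`, `Timing` and `compareVariants` are assumptions for illustration only.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for myType.
struct MyType { long long payload[8]; };

// Hypothetical workers.
long long doSomethingWith(const MyType& value) { return value.payload[0]; }
long long checkSomeValuesOf(const MyType& value) { return value.payload[7]; }

// Variant from the question: copy the element first.
long long runWithCopy(const std::vector<MyType>& myVec) {
    long long sum = 0;
    for (std::size_t i = 0; i < myVec.size(); ++i) {
        MyType current = myVec[i];
        sum += doSomethingWith(current) + checkSomeValuesOf(current);
    }
    return sum;
}

// Variant from the question: index the vector on every call.
long long runWithIndexing(const std::vector<MyType>& myVec) {
    long long sum = 0;
    for (std::size_t i = 0; i < myVec.size(); ++i) {
        sum += doSomethingWith(myVec[i]) + checkSomeValuesOf(myVec[i]);
    }
    return sum;
}

// Times one call of fn and stores its result in checksum.
template <typename Fn>
double millisecondsFor(Fn fn, const std::vector<MyType>& myVec, long long& checksum) {
    const auto start = std::chrono::steady_clock::now();
    checksum = fn(myVec);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Timing {
    double copyMs;
    double indexMs;
    long long checksum;  // returned so the measured work is actually used
};

Timing compareVariants(const std::vector<MyType>& myVec) {
    long long a = 0;
    long long b = 0;
    Timing t;
    t.copyMs = millisecondsFor(runWithCopy, myVec, a);
    t.indexMs = millisecondsFor(runWithIndexing, myVec, b);
    assert(a == b);  // both variants must compute the same thing
    t.checksum = a;
    return t;
}
```

A single run like this is noisy. For numbers worth acting on, build with the same optimization flags as production, repeat each measurement several times, and use the real myType and the real functions.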
Besides avoiding multiple accesses to the same index and using a reference to avoid copying, you could use the fastcall calling convention in your functions. It instructs the compiler to pass parameters in registers, when possible, instead of pushing them to the stack.
However, fastcall isn't standardized, so it's vendor specific.
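Because the spelling differs between vendors, one way to try this is to hide it behind a macro. The sketch below assumes MSVC's `__fastcall` keyword and GCC/Clang's `__attribute__((fastcall))`, both of which only apply to 32-bit x86 targets; on other targets the macro expands to nothing, since the default 64-bit calling conventions already pass the first arguments in registers. `MY_FASTCALL`, `MyType` and `checkAll` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Pick the vendor-specific spelling, or nothing where it does not apply.
#if defined(_MSC_VER) && defined(_M_IX86)
#define MY_FASTCALL __fastcall
#elif defined(__GNUC__) && defined(__i386__)
#define MY_FASTCALL __attribute__((fastcall))
#else
#define MY_FASTCALL
#endif

// Hypothetical stand-in for myType.
struct MyType { int a; int b; };

// Hypothetical worker using the register-based convention where available.
int MY_FASTCALL checkSomeValuesOf(const MyType& value) {
    return value.a + value.b;
}

// Calls the worker for every element of a plain array.
int checkAll(const MyType* items, std::size_t count) {
    assert(items != nullptr || count == 0);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += checkSomeValuesOf(items[i]);
    }
    return total;
}
```

If the compiler inlines the worker, the calling convention makes no difference at all, so this is another change that is only worth keeping if measurement shows a gain.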
Depends what the assignment operator for your type gets up to. And you are passing by reference to those functions, I hope? But as for all performance questions, if it is important to you, test your specific case yourself.
In terms of formatting, I would suggest declaring current at the point of initialization and using const myType&.
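The formatting suggested here (declare at the point of initialization, use const myType&) might look like the sketch below. The definition of `MyType`, the worker bodies and the wrapper `processAll` are placeholders, not the asker's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for myType.
struct MyType { int id; int weight; };

// Hypothetical workers taking their argument by const reference.
int doSomethingWith(const MyType& value) { return value.id; }
int checkSomeValuesOf(const MyType& value) { return value.weight; }

int processAll(const std::vector<MyType>& myVec) {
    int total = 0;
    // The index lives only inside the loop statement.
    for (std::size_t i = 0; i < myVec.size(); ++i) {
        // Declared where it is initialized; const documents that the loop
        // body does not modify the element.
        const MyType& current = myVec[i];
        assert(&current == &myVec[i]);  // a reference, not a copy
        total += doSomethingWith(current);
        total += checkSomeValuesOf(current);
    }
    return total;
}
```

Keeping both `i` and `current` scoped to the loop makes it obvious to the reader, and to the compiler, that neither is used afterwards.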
However, in terms of your original question, using a temporary variable named current is going to be at least as fast as using myVec[i] each time. It is possible, though, that an optimizing compiler will remove the redundant lookups into myVec (effectively turning the second version into your solution of assigning to a temporary). In general, if you call a member function declared "const" repeatedly without calling any non-const member function in between, the compiler may be able to create a temporary and perform the call only once, saving the result to a local temporary.