std::vector optimization
Assuming a loop that reads a lot of values from a std::vector is a bottleneck in my program, it has been suggested that I change
void f(std::vector<int> v)
{
    ...
    while (...)
    {
        ...
        int x = v[i] + v[j];
        ...
    }
}
to
void f(std::vector<int> v)
{
    int* p_v = &v[0];
    ...
    while (...)
    {
        ...
        int x = p_v[i] + p_v[j];
        ...
    }
}
Will this actually improve performance by bypassing the [] operator?
It's more likely (on the face of it) that copying the entire vector every time you call this function is the bottleneck. Why not the following instead?
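The snippet this answer pointed to is not shown here. Going by the other answers, it presumably passed the vector by reference to const rather than by value. A minimal sketch under that assumption follows; the return type, the parameters `i` and `j`, and the one-line body are illustrative placeholders, not part of the original question.

```cpp
#include <cstddef>
#include <vector>

// Taking the vector by reference to const avoids copying every element
// on each call. The function can still read v[i] but cannot modify v.
// NOTE: the body and the parameters i and j are illustrative assumptions.
int f(const std::vector<int>& v, std::size_t i, std::size_t j)
{
    return v[i] + v[j];
}
```

With the original by-value signature, every call to `f` allocates and copies the whole vector before the loop even starts, which can easily outweigh anything happening inside the loop.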
In any case, never assume where the bottleneck is - measure first, and tune the code that's slow once you know for sure.
No, that should not affect performance.
Note that you would probably be better off using pass-by-reference-to-const instead of pass-by-value on the vector.
EDIT: For the syntax of that, see @Steve Townsend's answer.
No, not materially. You're making the code harder to read in exchange for (maybe) minuscule performance gains. Regardless, I would be surprised if the compiler doesn't inline the call to operator[] in optimized builds. If you're unsure, profile it. I imagine it will never show up.
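If you want a quick check before reaching for a full profiler, something along these lines could serve as a rough starting point. It is only a sketch: the function names `sum_indexed`, `sum_pointer` and `time_ns` are made up for illustration, the loop is a stand-in for the real workload, and a single timed run is noisy, so repeat it and build with optimizations on before drawing any conclusion.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Sums adjacent pairs of elements through std::vector::operator[].
long long sum_indexed(const std::vector<int>& v)
{
    long long total = 0;
    for (std::size_t i = 0; i + 1 < v.size(); ++i)
        total += v[i] + v[i + 1];
    return total;
}

// Same loop, but through a raw pointer to the vector's storage.
long long sum_pointer(const std::vector<int>& v)
{
    const int* p = v.data();
    long long total = 0;
    for (std::size_t i = 0; i + 1 < v.size(); ++i)
        total += p[i] + p[i + 1];
    return total;
}

// Runs fn(v) once and returns the elapsed time in nanoseconds.
// The computed sum is written to `out` so the caller can use and compare it.
template <typename Fn>
long long time_ns(Fn fn, const std::vector<int>& v, long long& out)
{
    const auto start = std::chrono::steady_clock::now();
    out = fn(v);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Call `time_ns(sum_indexed, v, a)` and `time_ns(sum_pointer, v, b)` on the same large vector, check that `a == b`, and compare the two timings over several runs.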
The standard answer to almost any question regarding performance is to use a profiler to see whether this is a bottleneck and whether the change helps. In this case, however, I don't think that's particularly good advice. I've looked at the output from enough compilers for code like this that I'd almost go so far as to state as a fact that the two will generate identical instruction streams. In theory that could be wrong (while I've played with quite a few compilers, there are certainly others I haven't played with), but in reality I'd be pretty surprised if it were. While there's likely to be an instruction or two up front (outside the loop) that's different, I'd expect what's in the loop to be identical.
If you only need sequential access to the contents of the vector (and sadly your example shows seemingly random access, so this wouldn't work, but maybe it's just an example), you may get a significant speed improvement by using iterators to traverse the vector. I've seen this optimization make a noticeable difference even on plain arrays with full compiler optimizations turned on.
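For reference, a sequential traversal with iterators might look like the sketch below. The function name `sum_with_iterators` and the summing body are illustrative assumptions; whether this beats indexing on your compiler is something to measure rather than take for granted.

```cpp
#include <vector>

// Walks the vector front to back with a const_iterator instead of an index.
// The summing body is a placeholder for whatever the real loop computes.
long long sum_with_iterators(const std::vector<int>& v)
{
    long long total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        total += *it;
    return total;
}
```

This only applies when elements are visited in order; the `v[i] + v[j]` access pattern from the question would still need indexing.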