在容器上(在循环期间)重复调用 size() 是否不好?
出于效率原因,我总是避免编写这样的循环:
for(std::size_t i = 0; i < vec.size(); ++i) { ... }
其中 vec 是 STL 容器。相反,我要么
const std::size_t vec_size = vec.size();
for(std::size_t i = 0; i < vec_size; ++i) { ... }
使用容器迭代器,要么使用容器迭代器。
但第一个解决方案到底有多糟糕?我记得在迈耶斯(Meyers)中读到它将是二次的而不是线性的,因为向量不知道它的大小并且必须重复计数。但是现代编译器不会检测到这一点并将其优化掉吗?
For efficiency reasons, I always avoid writing loops like this:
for(std::size_t i = 0; i < vec.size(); ++i) { ... }
where vec
is an STL container. Instead, I either do
const std::size_t vec_size = vec.size();
for(std::size_t i = 0; i < vec_size; ++i) { ... }
or use the container iterators.
But how bad is the first solution really? I remember reading in Meyers that it will be quadratic instead of linear because the vector doesn't know its size and repeatedly has to count. But won't modern compilers detect this and optimize it away?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
vector::size()
是恒定时间的,通常实现为经过优化的简单内联函数。不要费心手动优化它。vector::size()
is constant-time and usually implemented as a trivial inline function that is optimised away. Don't bother hand-optimising it.您将
vector
和list
混淆了。向量
的大小值保存在向量中;列表
需要遍历实际列表。You're getting
vector
andlist
confused.vector
's size value is held in the vector;list
's requires transversal of the actual list.判断编译器是否优化某些内容的最简单方法是比较汇编语言编译器输出。
也就是说,这两段代码实际上并不等同。如果在迭代向量时向量的大小发生变化怎么办?编译器必须非常非常聪明才能最终证明向量的大小无法改变。
现在,在现实世界中,这种微小的优化真的值得付出额外的努力吗?
vec.size()
仅返回一个存储的值。它不会重新计算长度。The easiest way to tell if something is being optimized out by the compiler is to compare the assembly-language compiler output.
That said, the two chunks of code are not actually equivalent. What if the size of the vector changes while you're iterating over it? The compiler would have to be very, very smart to prove conclusively that the vector's size could not change.
Now, in the real world, is this tiny optimization really worth the extra effort? The
vec.size()
just returns a stored value. It doesn't re-count the length.考虑以下愚蠢的函数:
生成的实际程序集将取决于编译器和向量的实现,但我认为在大多数情况下,编译器必须重新读取向量每次循环时都会从内存中获取 的大小。这是因为
sumOut
指针可能会与向量的大小内部存储重叠(别名)(假设vector
将其大小存储在 int 中),因此该大小可能是由循环改变。如果你经常调用这样的函数,它可能会增加很多周期,因为你接触的内存超出了你的需要。三种可能的解决方案是:
将大小存储在局部变量中。
理想情况下,这将得到的大小
存储在寄存器中并避免触摸
完全记忆。即使必须
被编译器放入堆栈
应该能够订购
更有效地加载/存储。
在输出上使用
__restrict
指针。这告诉编译器
指针不可能
重叠任何其他内容,因此写入
它不需要重新加载任何东西
else.
反转循环。终止
条件现在检查 0
相反,所以
vec.size()
永远不会再次调用。
其中,我认为#1 是最干净的,但有些人可能更喜欢#3。 #2 可能是对读者最不友好的,但可能比其他更快(因为这意味着可以更有效地读取向量的数据)。
有关别名的更多信息,请参阅 Christer Ericson 关于内存优化的 GDC 演示;那里有一个与此几乎相同的示例。
Consider the following stupid function:
The actual assembly generated will depend on the compiler and implementation of
vector
, but I think in most cases, the compiler has to re-read thevector
's size from memory each time through the loop. This is because thesumOut
pointer could potentially overlap (alias) the vector's internal storage of the size (assuming thevector
stores its size in an int), so the size could be changed by the loop. If you call a function like this a lot, it could add up to a lot of cycles because you're touching memory more than you need.Three possible solutions are:
Store the size in a local variable.
Ideally, the size this will get
stored in a register and avoid touching
memory altogether. Even if it has to
get put on the stack, the compiler
should be able to order the
loads/stores more efficiently.
Use
__restrict
on the outputpointer. This tells the compiler
that the pointer can't possibly
overlap anything else, so writes to
it don't require reloading anything
else.
Reverse the loop. The termination
condition now checks against 0
instead, so
vec.size()
is nevercalled again.
Of those, I think #1 is the cleanest, but some people might prefer #3. #2 is the probably least reader-friendly, but might be faster than the others (because it means the vector's data could be read more efficiently).
For more info on aliasing, see Christer Ericson's GDC presentation on memory optimization; there's an example almost identical to this in there.