STL 迭代器会优化后缀 ++/-- 运算符的低效问题吗？

发布于 2024-11-16 21:00:45 字数 387 浏览 1 评论 0原文

我知道增量/减量运算符的后缀版本通常会由编译器针对内置类型进行优化（即不会进行复制），但迭代器是否属于这种情况？

它们本质上只是重载运算符，可以通过多种方式实现，但由于它们的行为是严格定义的，因此它们可以被优化吗？如果可以，它们是否由任何/许多编译器优化？

#include <vector> 

void foo(std::vector<int>& v){
  for (std::vector<int>::iterator i = v.begin();
       i!=v.end();
       i++){  //will this get optimised by the compiler?
    *i += 20;
  }
}

原文

I know that the postfix versions of the increment/decrement operators will generally be optimised by the compiler for built-in types (i.e. no copy will be made), but is this the case for iterators?

They are essentially just overloaded operators, and could be implemented in any number of ways, but since their behaviour is strictly defined, can they be optimised, and if so, are they by any/many compilers?

#include <vector> 

void foo(std::vector<int>& v){
  for (std::vector<int>::iterator i = v.begin();
       i!=v.end();
       i++){  //will this get optimised by the compiler?
    *i += 20;
  }
}

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

执着的年纪 2024-11-23 21:00:45

在 GNU GCC 的 STL 实现（版本 4.6.1）上的 std::vector 的特定情况下，我认为在足够高的优化级别上不会有性能差异。

vector 上的前向迭代器的实现由 __gnu_cxx::__normal_iterator 提供。让我们看看它的构造函数和后缀++运算符：

  explicit
  __normal_iterator(const _Iterator& __i) : _M_current(__i) { }

  __normal_iterator
  operator++(int)
  { return __normal_iterator(_M_current++); }

以及它在vector中的实例化：

  typedef __gnu_cxx::__normal_iterator<pointer, vector> iterator;

正如你所看到的，它内部对普通指针执行后缀增量，然后传递通过其自己的构造函数获取原始值，该构造函数将其保存到本地成员。通过死值分析可以轻松消除此代码。

但它真的优化了吗？让我们来看看吧。测试代码：

#include <vector>

void test_prefix(std::vector<int>::iterator &it)
{
    ++it;
}

void test_postfix(std::vector<int>::iterator &it)
{
    it++;
}

输出程序集（在 -Os 上）：

    .file   "test.cpp"
    .text
    .globl  _Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .type   _Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, @function
_Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE:
.LFB442:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    movl    8(%ebp), %eax
    addl    $4, (%eax)
    popl    %ebp
    .cfi_def_cfa 4, 4
    .cfi_restore 5
    ret
    .cfi_endproc
.LFE442:
    .size   _Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, .-_Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .globl  _Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .type   _Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, @function
_Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE:
.LFB443:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    movl    8(%ebp), %eax
    addl    $4, (%eax)
    popl    %ebp
    .cfi_def_cfa 4, 4
    .cfi_restore 5
    ret
    .cfi_endproc
.LFE443:
    .size   _Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, .-_Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .ident  "GCC: (Debian 4.6.0-10) 4.6.1 20110526 (prerelease)"
    .section    .note.GNU-stack,"",@progbits

如您所见，两种情况下都会输出完全相同的程序集。

当然，对于自定义迭代器或更复杂的数据类型来说，情况可能不一定如此。但看起来，对于向量来说，前缀和后缀（不捕获后缀返回值）具有相同的性能。

In the specific case of std::vector on GNU GCC's STL implementation (version 4.6.1), I don't think there would be a performance difference on sufficiently high optimization levels.

The implementation for forward iterators on vector is provided by __gnu_cxx::__normal_iterator<typename _Iterator, typename _Container>. Let's look at its constructor and postfix ++ operator:

  explicit
  __normal_iterator(const _Iterator& __i) : _M_current(__i) { }

  __normal_iterator
  operator++(int)
  { return __normal_iterator(_M_current++); }

And its instantiation in vector:

  typedef __gnu_cxx::__normal_iterator<pointer, vector> iterator;

As you can see, it internally performs a postfix increment on an ordinary pointer, then passes the original value through its own constructor, which saves it to a local member. This code should be trivial to eliminate through dead value analysis.

But is it optimized really? Let's find out. Test code:

#include <vector>

void test_prefix(std::vector<int>::iterator &it)
{
    ++it;
}

void test_postfix(std::vector<int>::iterator &it)
{
    it++;
}

Output assembly (on -Os):

    .file   "test.cpp"
    .text
    .globl  _Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .type   _Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, @function
_Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE:
.LFB442:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    movl    8(%ebp), %eax
    addl    $4, (%eax)
    popl    %ebp
    .cfi_def_cfa 4, 4
    .cfi_restore 5
    ret
    .cfi_endproc
.LFE442:
    .size   _Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, .-_Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .globl  _Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .type   _Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, @function
_Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE:
.LFB443:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    movl    8(%ebp), %eax
    addl    $4, (%eax)
    popl    %ebp
    .cfi_def_cfa 4, 4
    .cfi_restore 5
    ret
    .cfi_endproc
.LFE443:
    .size   _Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, .-_Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .ident  "GCC: (Debian 4.6.0-10) 4.6.1 20110526 (prerelease)"
    .section    .note.GNU-stack,"",@progbits

As you can see, exactly the same assembly is output in both cases.

Of course, this may not necessarily be the case for custom iterators, or more complex data types. But it appears that, for vector specifically, prefix and postfix (without capturing the postfix return value) have identical performance.

回复收藏 0 原文

~没有更多了~