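Are compilers smart enough to optimize a functor with data members as well as a static method that takes the same data as arguments?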

Posted on 2025-01-08 18:47:44

I'm interested in writing performant code across several compilers (GCC, MSVC, Clang). I've seen two patterns for passing functions as compile-time arguments, and I'm curious whether compilers are usually smart enough to recognize that the two are equivalent, or whether I'm asking too much. Here's the STL style, which passes a functor object:

  template<class InputIterator, class Predicate>
  InputIterator find_if ( InputIterator first, InputIterator last, Predicate pred )
  {
    for ( ; first!=last ; first++ ) if ( pred(*first) ) break;
    return first;
  }

And here's the alternative style:

  template<class InputIterator, class Predicate, class PredData>
  InputIterator find_if ( InputIterator first, InputIterator last, PredData data )
  {
    for ( ; first!=last ; first++ ) if ( Predicate::eval(*first, data) ) break;
    return first;
  }

In the STL style, your Predicate class typically holds any data it needs as members, and you call operator() to evaluate the predicate. In the alternative style, no Predicate object ever exists. Instead, the Predicate type provides a static method that takes the item to be checked, and the data is passed as an argument rather than stored as a member of Predicate.

I have two concerns about the STL style:

  1. If Predicate is a word or smaller, will the compiler be smart enough to pass it in a register? In the alternative style the word is an ordinary argument, so the compiler doesn't have to infer anything.
  2. If Predicate is empty, will the compiler be smart enough to avoid instantiating and passing it? In the alternative style Predicate is never instantiated.

So my intuition is the alternative style should be faster, but perhaps I'm underestimating modern optimizers.
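For reference, the two styles can be put side by side in one small program. This is only a sketch: `find_if_stl`, `find_if_static`, `GreaterThan`, `GreaterThanStatic`, `IsEven` and `first_greater` are hypothetical names made up for illustration, and the `static_assert`s check only that the functors are one word or empty, not what code the compiler generates for them.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// STL style: the predicate object carries its own data.
template <class InputIterator, class Predicate>
InputIterator find_if_stl(InputIterator first, InputIterator last, Predicate pred)
{
    for (; first != last; ++first) if (pred(*first)) break;
    return first;
}

// Alternative style: no Predicate object exists; the data is a plain argument.
// Predicate is listed first so the iterator and data types can be deduced.
template <class Predicate, class InputIterator, class PredData>
InputIterator find_if_static(InputIterator first, InputIterator last, PredData data)
{
    for (; first != last; ++first) if (Predicate::eval(*first, data)) break;
    return first;
}

// Hypothetical one-word functor (concern 1).
struct GreaterThan
{
    int limit;
    bool operator()(int x) const { return x > limit; }
};

// Hypothetical static counterpart of GreaterThan.
struct GreaterThanStatic
{
    static bool eval(int x, int limit) { return x > limit; }
};

// Hypothetical empty functor (concern 2).
struct IsEven
{
    bool operator()(int x) const { return x % 2 == 0; }
};

// These only confirm the shapes the question asks about.
static_assert(sizeof(GreaterThan) == sizeof(int), "functor is one int wide");
static_assert(std::is_trivially_copyable<GreaterThan>::value, "cheap to pass by value");
static_assert(std::is_empty<IsEven>::value, "functor has no data members");

// Index of the first element greater than limit, computed in both styles.
std::size_t first_greater(const std::vector<int>& v, int limit)
{
    auto a = find_if_stl(v.begin(), v.end(), GreaterThan{limit});
    auto b = find_if_static<GreaterThanStatic>(v.begin(), v.end(), limit);
    assert(a == b);  // both styles must find the same element
    return static_cast<std::size_t>(a - v.begin());
}
```

Whether the two versions compile to the same machine code can only be settled by comparing the optimized assembly from each compiler of interest (for example at `-O2` or `/O2`).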

Comments (2)

︶ ̄淡然 2025-01-15 18:47:44

From my personal experience with GCC (g++), modern compilers are absolutely capable of optimizing functors. The one case where this might not hold is when the functor's call operator is defined in a different compilation unit, because the compiler then cannot see the body it would need to inline (unless link-time optimization is enabled).

That said, this should not deter you. The C++ library and the direction of the language reward the modern style, and the abstractions make the language much more manageable.

I ran an experiment comparing a plain for loop, std::for_each with a function, std::for_each with a functor, and std::for_each with a lambda. The compiler saw through the abstractions and inlined them all, and each variant had the same execution time and number of instructions (although the instructions were arranged slightly differently).
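The test code for that experiment is not shown here. A comparison along those lines might look like the following sketch, where the workload (summing a vector) and all names (`sum_loop`, `sum_function`, `sum_functor`, `sum_lambda`, `Adder`, `check_variants_agree`) are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Variant 1: plain for loop.
long sum_loop(const std::vector<int>& v)
{
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// Variant 2: std::for_each with a free function (state kept in a file-local variable).
namespace { long g_total = 0; }
void add_to_global(int x) { g_total += x; }

long sum_function(const std::vector<int>& v)
{
    g_total = 0;
    std::for_each(v.begin(), v.end(), add_to_global);
    return g_total;
}

// Variant 3: std::for_each with a functor that carries its own state.
struct Adder
{
    long total;
    Adder() : total(0) {}
    void operator()(int x) { total += x; }
};

long sum_functor(const std::vector<int>& v)
{
    // std::for_each returns the functor, so the accumulated state survives.
    return std::for_each(v.begin(), v.end(), Adder()).total;
}

// Variant 4: std::for_each with a lambda capturing the accumulator by reference.
long sum_lambda(const std::vector<int>& v)
{
    long total = 0;
    std::for_each(v.begin(), v.end(), [&total](int x) { total += x; });
    return total;
}

// Sanity check: all four variants must compute the same result.
void check_variants_agree(const std::vector<int>& v)
{
    const long expected = sum_loop(v);
    assert(sum_function(v) == expected);
    assert(sum_functor(v) == expected);
    assert(sum_lambda(v) == expected);
}
```

To repeat the comparison, compile with optimizations enabled and inspect the generated assembly or instruction counts for each function.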

Finally, Herb Sutter said in one of his presentations (at Build, I think) that C++ style adds only 3% overhead over C style, which is nothing compared with its greater safety.

总攻大人 2025-01-15 18:47:44

The compiler can perform a number of optimizations, but keep in mind that different compilers optimize differently, and what works best on one might have no effect on another.

In C++11, move semantics can replace a copy of the Predicate object with a cheaper move. Since this is part of the standard, every conforming compiler supports it, and the first style should perform close to the second.

Another point in favor of the STL style is that, because it is a common pattern, you probably have a better chance of benefiting from compiler optimizations, as compiler vendors will target those usage patterns.

Also, you should measure any performance gain with a profiler, since programmers are usually bad at guessing what the bottlenecks in their code are and where they lie.
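Short of a full profiler run, a rough first measurement can be taken with a timing loop. The sketch below is not a substitute for a profiler; `time_ms` and `CountAbove` are hypothetical names, and the `sink` parameter is an assumption added so the optimizer cannot discard the work being timed.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Runs work() the given number of times and returns the elapsed milliseconds.
// Each result is added to sink so the calls have an observable effect.
template <class Work>
double time_ms(Work work, int repetitions, long& sink)
{
    assert(repetitions > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repetitions; ++i) sink += work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical workload: counts the elements greater than limit.
struct CountAbove
{
    const std::vector<int>* data;
    int limit;
    long operator()() const
    {
        long n = 0;
        for (int x : *data) if (x > limit) ++n;
        return n;
    }
};
```

Timing each candidate style this way on every target compiler gives a first indication, but single runs are noisy, so the numbers deserve repetition and a look at the generated code before drawing conclusions.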
