Are compilers smart enough to optimize functors with members the same as static methods that take arguments?

Posted on 2025-01-08 18:47:44

I'm interested in writing performant code across several compilers (GCC, MSVC, Clang). I've seen two patterns for passing functions as compile-time arguments, and I'm curious whether compilers are usually smart enough to recognize that the two are equivalent, or whether I'm asking too much. Here's the STL style, which passes a functor object:

  // STL style: pred carries any data it needs as members.
  template<class InputIterator, class Predicate>
  InputIterator find_if ( InputIterator first, InputIterator last, Predicate pred )
  {
    for ( ; first!=last ; ++first ) if ( pred(*first) ) break;
    return first;
  }

And here's the alternative style:

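  // Alternative style: Predicate is never instantiated and cannot be deduced,
  // so the caller names it explicitly, e.g. find_if<Iter, Pred>(first, last, data).
  template<class InputIterator, class Predicate, class PredData>
  InputIterator find_if ( InputIterator first, InputIterator last, PredData data )
  {
    for ( ; first!=last ; ++first ) if ( Predicate::eval(*first, data) ) break;
    return first;
  }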

In the STL style, your Predicate class typically holds any data it needs as members, and you call operator() to evaluate the predicate. In the alternative style, you never have a Predicate object. Instead, Predicate provides a static method that takes the item to be checked, and the data is passed to that method as an argument rather than stored as a member of Predicate.
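To make the comparison concrete, here is a minimal sketch of both styles applied to the same "greater than a threshold" test. The names `find_if_functor`, `find_if_static`, `GreaterThan`, `GreaterThanStatic`, and `both_styles_agree` are hypothetical and not from the question. One deliberate change, which is an assumption on my part: in the alternative style, `Predicate` is moved to the front of the template parameter list so it can be named explicitly while the other parameters are deduced.

```cpp
#include <cassert>
#include <vector>

// STL style: the predicate object carries its data as a member.
template <class InputIterator, class Predicate>
InputIterator find_if_functor(InputIterator first, InputIterator last, Predicate pred)
{
    for (; first != last; ++first)
        if (pred(*first)) break;
    return first;
}

// Alternative style: Predicate is only a type; the data travels as an argument.
// Predicate comes first (an assumption, see above) so callers can write
// find_if_static<Pred>(...) and let the rest be deduced.
template <class Predicate, class InputIterator, class PredData>
InputIterator find_if_static(InputIterator first, InputIterator last, PredData data)
{
    for (; first != last; ++first)
        if (Predicate::eval(*first, data)) break;
    return first;
}

// Hypothetical predicate for the STL style: threshold is stored as a member.
struct GreaterThan {
    int threshold;
    bool operator()(int x) const { return x > threshold; }
};

// Hypothetical predicate for the alternative style: no state, static method only.
struct GreaterThanStatic {
    static bool eval(int x, int threshold) { return x > threshold; }
};

// Returns true when both styles stop at the same position in v.
inline bool both_styles_agree(const std::vector<int>& v, int threshold)
{
    auto a = find_if_functor(v.begin(), v.end(), GreaterThan{threshold});
    auto b = find_if_static<GreaterThanStatic>(v.begin(), v.end(), threshold);
    return a == b;
}
```

Both templates are fully visible at the call site, so comparing the generated assembly for `both_styles_agree` on each target compiler is one way to check whether the two forms end up as the same code.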

I have a few fears using the STL style:

If Predicate is a word or smaller, will the compiler be smart enough to pass it in a register? In the alternative style the word would be an ordinary argument, so the compiler doesn't have to infer anything.
If Predicate is empty, will the compiler be smart enough to avoid instantiating and passing it? In the alternative style Predicate is never instantiated.
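The two cases above (a word-sized predicate and an empty one) can be described with standard type traits. The sketch below is only a heuristic check, not an ABI guarantee: the helper `is_cheap_to_pass_by_value` and the types `IsNegative`, `AboveThreshold`, and `BigState` are hypothetical names introduced for illustration, and whether a given object actually travels in a register depends on the platform calling convention and on whether the call is inlined.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical empty predicate: no data members at all.
struct IsNegative {
    bool operator()(int x) const { return x < 0; }
};

// Hypothetical word-or-smaller predicate: a single int member.
struct AboveThreshold {
    int threshold;
    bool operator()(int x) const { return x > threshold; }
};

// Hypothetical predicate with a lot of state, for contrast.
struct BigState {
    char bytes[64];
    bool operator()(int x) const { return x == bytes[0]; }
};

// Heuristic only (an assumption, not a rule from any ABI document):
// trivially copyable and no larger than a pointer.
template <class T>
constexpr bool is_cheap_to_pass_by_value()
{
    return std::is_trivially_copyable<T>::value && sizeof(T) <= sizeof(void*);
}

// An empty class still has a nonzero size, so an object of it exists in the
// abstract machine even though it holds no data.
static_assert(std::is_empty<IsNegative>::value, "IsNegative has no data members");
static_assert(sizeof(IsNegative) >= 1, "empty classes still have nonzero size");
```

Here `is_cheap_to_pass_by_value<IsNegative>()` and `is_cheap_to_pass_by_value<AboveThreshold>()` evaluate to true on typical platforms, while `is_cheap_to_pass_by_value<BigState>()` is false because 64 bytes exceeds the size of a pointer.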

So my intuition is the alternative style should be faster, but perhaps I'm underestimating modern optimizers.
