当前位置：文江博客话题详情

C++ 中的排列和组合库函数

发布于 2024-08-20 15:10:42 字数 92 浏览 5 评论 0原文

C++ 中使用最广泛的现有库是什么，可以给出 n 个元素中 k 个元素的所有组合和排列？

我不是在问算法，而是在问现有的库或方法。

谢谢。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

泅人 2024-08-27 15:10:42

我决定在这里测试 dman 和 Charles Bailey 的解决方案。我将它们分别称为解决方案 A 和 B。我的测试是访问 vector size = 100 的每个组合，一次 5 个。以下是测试代码：

测试代码

struct F
{
    unsigned long long count_;

    F() : count_(0) {}

    bool operator()(std::vector<int>::iterator, std::vector<int>::iterator)
    {++count_; return false;}
};

int main()
{
    typedef std::chrono::high_resolution_clock Clock;
    typedef std::chrono::duration<double> sec;
    typedef std::chrono::duration<double, std::nano> ns;
    int n = 100;
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    std::vector<int>::iterator r = v.begin() + 5;
    F f;
    Clock::time_point t0 = Clock::now();
    do
    {
        f(v.begin(), r);
    } while (next_combination(v.begin(), r, v.end()));
    Clock::time_point t1 = Clock::now();
    sec s0 = t1 - t0;
    ns pvt0 = s0 / f.count_;
    std::cout << "N = " << v.size() << ", r = " << r-v.begin()
              << ", visits = " << f.count_ << '\n'
              << "\tnext_combination total = " << s0.count() << " seconds\n"
              << "\tnext_combination per visit = " << pvt0.count() << " ns";
}

所有代码都是在 2.8 GHz Intel Core i5 上使用 clang++ -O3 编译的。

解决方案 A

解决方案 A 会导致无限循环。即使我将 n 设得很小，这个程序也永远不会完成。随后因此而被否决。

解决方案 B

这是一个编辑。解决方案 B 在撰写此答案的过程中发生了变化。起初它给出了错误的答案，由于非常及时的更新，它现在给出了正确的答案。它打印出：

N = 100, r = 5, visits = 75287520
    next_combination total = 4519.84 seconds
    next_combination per visit = 60034.3 ns

解决方案 C

接下来我尝试了 N2639 看起来与解决方案 A 非常相似，但工作正常。我将这个解决方案称为C，它打印出：

N = 100, r = 5, visits = 75287520
    next_combination total = 6.42602 seconds
    next_combination per visit = 85.3531 ns

解决方案C比解决方案B快703倍。

解决方案D

最后找到了一个解决方案D此处。该解决方案具有不同的签名/风格，称为 for_each_combination，其使用方式与 std::for_each 非常相似。上面的驱动程序代码在计时器调用之间发生变化，如下所示：

Clock::time_point t0 = Clock::now();
f = for_each_combination(v.begin(), r, v.end(), f);
Clock::time_point t1 = Clock::now();

解决方案 D 打印出：

N = 100, r = 5, visits = 75287520
    for_each_combination = 0.498979 seconds
    for_each_combination per visit = 6.62765 ns

解决方案 D 比解决方案 C 快 12.9 倍，比解决方案 B 快 9000 多倍。

我认为这是一个相对较小的问题：只有 7500 万次访问。随着访问量增加到数十亿，这些算法之间的性能差异不断扩大。解决方案 B 已经很笨拙了。解决方案 C 最终变得难以处理。解决方案 D 是访问我所知道的所有组合的性能最高的算法。

显示解决方案 D 的链接还包含其他几种算法，用于枚举和访问具有各种属性的排列（循环、可逆等）。这些算法的设计都将性能作为目标之一。请注意，这些算法都不要求初始序列按排序顺序排列。这些元素甚至不需要是LessThanComparable。

I decided to test the solutions by dman and Charles Bailey here. I'll call them solutions A and B respectively. My test is visiting each combination of of a vector<int> size = 100, 5 at a time. Here's the test code:

Test Code

struct F
{
    unsigned long long count_;

    F() : count_(0) {}

    bool operator()(std::vector<int>::iterator, std::vector<int>::iterator)
    {++count_; return false;}
};

int main()
{
    typedef std::chrono::high_resolution_clock Clock;
    typedef std::chrono::duration<double> sec;
    typedef std::chrono::duration<double, std::nano> ns;
    int n = 100;
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    std::vector<int>::iterator r = v.begin() + 5;
    F f;
    Clock::time_point t0 = Clock::now();
    do
    {
        f(v.begin(), r);
    } while (next_combination(v.begin(), r, v.end()));
    Clock::time_point t1 = Clock::now();
    sec s0 = t1 - t0;
    ns pvt0 = s0 / f.count_;
    std::cout << "N = " << v.size() << ", r = " << r-v.begin()
              << ", visits = " << f.count_ << '\n'
              << "\tnext_combination total = " << s0.count() << " seconds\n"
              << "\tnext_combination per visit = " << pvt0.count() << " ns";
}

All code was compiled using clang++ -O3 on a 2.8 GHz Intel Core i5.

Solution A

Solution A results in an infinite loop. Even when I make n very small, this program never completes. Subsequently downvoted for this reason.

Solution B

This is an edit. Solution B changed in the course of writing this answer. At first it gave incorrect answers and due to very prompt updating it now gives correct answers. It prints out:

N = 100, r = 5, visits = 75287520
    next_combination total = 4519.84 seconds
    next_combination per visit = 60034.3 ns

Solution C

Next I tried the solution from N2639 which looks very similar to solution A, but works correctly. I'll call this solution C and it prints out:

N = 100, r = 5, visits = 75287520
    next_combination total = 6.42602 seconds
    next_combination per visit = 85.3531 ns

Solution C is 703 times faster than solution B.

Solution D

Finally there is a solution D found here. This solution has a different signature / style and is called for_each_combination, and is used much like std::for_each. The driver code above changes between the timer calls like so:

Clock::time_point t0 = Clock::now();
f = for_each_combination(v.begin(), r, v.end(), f);
Clock::time_point t1 = Clock::now();

Solution D prints out:

N = 100, r = 5, visits = 75287520
    for_each_combination = 0.498979 seconds
    for_each_combination per visit = 6.62765 ns

Solution D is 12.9 times faster than solution C, and over 9000 times faster than solution B.

I consider this a relatively small problem: only 75 million visits. As the number of visits increases into the billions, the discrepancy in the performance between these algorithms continues to grow. Solution B is already unwieldy. Solution C eventually becomes unwieldy. Solution D is the highest performing algorithm to visit all combinations I'm aware of.

The link showing solution D also contains several other algorithms for enumerating and visiting permutations with various properties (circular, reversible, etc.). Each of these algorithms was designed with performance as one of the goals. And note that none of these algorithms requires the initial sequence to be in sorted order. The elements need not even be LessThanComparable.

回复收藏 0 原文

朮生 2024-08-27 15:10:42

组合：来自 Mark Nelson 的文章，我们有相同的主题next_combination 排列：从 STL 中我们有 std::next_permutation

   template <typename Iterator>
   inline bool next_combination(const Iterator first, Iterator k, const Iterator last)
   {
      if ((first == last) || (first == k) || (last == k))
         return false;
      Iterator itr1 = first;
      Iterator itr2 = last;
      ++itr1;
      if (last == itr1)
         return false;
      itr1 = last;
      --itr1;
      itr1 = k;
      --itr2;
      while (first != itr1)
      {
         if (*--itr1 < *itr2)
         {
            Iterator j = k;
            while (!(*itr1 < *j)) ++j;
            std::iter_swap(itr1,j);
            ++itr1;
            ++j;
            itr2 = k;
            std::rotate(itr1,j,last);
            while (last != j)
            {
               ++j;
               ++itr2;
            }
            std::rotate(k,itr2,last);
            return true;
         }
      }
      std::rotate(first,k,last);
      return false;
   }

Combinations: from Mark Nelson's article on the same topic we have next_combination Permutations: From STL we have std::next_permutation

   template <typename Iterator>
   inline bool next_combination(const Iterator first, Iterator k, const Iterator last)
   {
      if ((first == last) || (first == k) || (last == k))
         return false;
      Iterator itr1 = first;
      Iterator itr2 = last;
      ++itr1;
      if (last == itr1)
         return false;
      itr1 = last;
      --itr1;
      itr1 = k;
      --itr2;
      while (first != itr1)
      {
         if (*--itr1 < *itr2)
         {
            Iterator j = k;
            while (!(*itr1 < *j)) ++j;
            std::iter_swap(itr1,j);
            ++itr1;
            ++j;
            itr2 = k;
            std::rotate(itr1,j,last);
            while (last != j)
            {
               ++j;
               ++itr2;
            }
            std::rotate(k,itr2,last);
            return true;
         }
      }
      std::rotate(first,k,last);
      return false;
   }

回复收藏 0 原文

寄人书 2024-08-27 15:10:42

这个答案提供了一个最小的实施工作的解决方案。如果您想检索大输入范围的组合，它可能不会具有可接受的性能。

标准库具有 std::next_permutation，您可以轻松地从中构建 next_k_permutation 和 next_combination 。

template<class RandIt, class Compare>
bool next_k_permutation(RandIt first, RandIt mid, RandIt last, Compare comp)
{
    std::sort(mid, last, std::tr1::bind(comp, std::tr1::placeholders::_2
                                            , std::tr1::placeholders::_1));
    return std::next_permutation(first, last, comp);
}

如果您没有 tr1::bind 或 boost::bind ，您需要构建一个函数对象，将参数交换为给定的比较。当然，如果您只对 next_combination 的 std::less 变体感兴趣，那么您可以直接使用 std::greater

template<class RandIt>
bool next_k_permutation(RandIt first, RandIt mid, RandIt last)
{
    typedef typename std::iterator_traits<RandIt>::value_type value_type;

    std::sort(mid, last, std::greater< value_type >());
    return std::next_permutation(first, last);
}

：是 next_combination 的相对安全版本。如果您可以保证范围 [mid, last) 与调用 next_combination 后的顺序一致，那么您可以使用更简单的方法：

template<class BiDiIt, class Compare>
bool next_k_permutation(BiDiIt first, BiDiIt mid, BiDiIt last, Compare comp)
{
    std::reverse(mid, last);
    return std::next_permutation(first, last, comp);
}

这也适用于 bi - 定向迭代器以及随机访问迭代器。

为了输出组合而不是 k 排列，我们必须确保每个组合仅输出一次，因此只有当它是按顺序排列的 k 排列时，我们才会返回组合。

template<class BiDiIt, class Compare>
bool next_combination(BiDiIt first, BiDiIt mid, BiDiIt last, Compare comp)
{
    bool result;
    do
    {
        result = next_k_permutation(first, mid, last, comp);
    } while (std::adjacent_find( first, mid,
                             std::tr1::bind(comp, std::tr1::placeholders::_2
                                                , std::tr1::placeholders::_1) )
                                                                        != mid );
    return result;
}

替代方案是使用反向迭代器而不是参数交换 bind 调用，或者如果 std::less 显式使用 std::greater正在使用的比较。

This answer provides a minimal implementation effort solution. It may not have acceptable performance if you want to retrieve combinations for large input ranges.

The standard library has std::next_permutation and you can trivially build a next_k_permutation from it and a next_combination from that.

template<class RandIt, class Compare>
bool next_k_permutation(RandIt first, RandIt mid, RandIt last, Compare comp)
{
    std::sort(mid, last, std::tr1::bind(comp, std::tr1::placeholders::_2
                                            , std::tr1::placeholders::_1));
    return std::next_permutation(first, last, comp);
}

If you don't have tr1::bind or boost::bind you would need to build a function object that swaps the arguments to a given comparison. Of course, if you're only interested in a std::less variant of next_combination then you can use std::greater directly:

template<class RandIt>
bool next_k_permutation(RandIt first, RandIt mid, RandIt last)
{
    typedef typename std::iterator_traits<RandIt>::value_type value_type;

    std::sort(mid, last, std::greater< value_type >());
    return std::next_permutation(first, last);
}

This is a relatively safe version of next_combination. If you can guarantee that the range [mid, last) is in order as they would be after a call to next_combination then you can use the simpler:

template<class BiDiIt, class Compare>
bool next_k_permutation(BiDiIt first, BiDiIt mid, BiDiIt last, Compare comp)
{
    std::reverse(mid, last);
    return std::next_permutation(first, last, comp);
}

This also works with bi-directional iterators as well as random access iterators.

To output combinations instead of k-permutations, we have to ensure that we output each combination only once, so we'll return a combination it only if it is a k-permutation in order.

template<class BiDiIt, class Compare>
bool next_combination(BiDiIt first, BiDiIt mid, BiDiIt last, Compare comp)
{
    bool result;
    do
    {
        result = next_k_permutation(first, mid, last, comp);
    } while (std::adjacent_find( first, mid,
                             std::tr1::bind(comp, std::tr1::placeholders::_2
                                                , std::tr1::placeholders::_1) )
                                                                        != mid );
    return result;
}

Alternatives would be to use a reverse iterator instead of the parameter swapping bind call or to use std::greater explicitly if std::less is the comparison being used.

回复收藏 0 原文

跨年 2024-08-27 15:10:42

@查尔斯·贝利上面：

我可能是错的，但我认为上面的前两个算法不会删除第一个和中间之间的重复项？也许我不确定如何使用它。

4选2示例：
12 34
12 43（排序后）
13 24（next_permutation 之后）
13 42（排序后）
14 23（next_permutation 之后）
14 32（排序后）
21 34（在next_permutation之后）

所以我在返回之前添加了一个检查以查看斜体值是否按顺序排列，但绝对不会想到你写的部分（非常优雅！谢谢！）。

没有完全测试，只是粗略测试。


template
bool next_combination(RandIt first, RandIt mid, RandIt last)
{
    typedef typename std::iterator_traits< RandIt >::value_type value_type;
    std::sort(mid, last, std::greater< value_type >() );
    while(std::next_permutation(first, last)){
        if(std::adjacent_find(first, mid, std::greater< value_type >() ) == mid){
            return true;
        }
        std::sort(mid, last, std::greater< value_type >() );
    return false;
}

@ Charles Bailey above:

I could be wrong, but I think the first two algorithms above does not remove duplicates between first and mid? Maybe I am not sure how to use it.

4 choose 2 example:
12 34
12 43 (after sort)
13 24 (after next_permutation)
13 42 (after sort)
14 23 (after next_permutation)
14 32 (after sort)
21 34 (after next_permutation)

So I added a check to see if the value in italics is in order before returning, but definitely wouldn't have thought of the part you wrote though (very elegant! thanks!).

Not fully tested, just cursory tests..


template
bool next_combination(RandIt first, RandIt mid, RandIt last)
{
    typedef typename std::iterator_traits< RandIt >::value_type value_type;
    std::sort(mid, last, std::greater< value_type >() );
    while(std::next_permutation(first, last)){
        if(std::adjacent_find(first, mid, std::greater< value_type >() ) == mid){
            return true;
        }
        std::sort(mid, last, std::greater< value_type >() );
    return false;
}

回复收藏 0 原文

合约呢 2024-08-27 15:10:42

也许它已经在前面的答案中说明过，但在这里我找不到关于参数类型的完整通用方法，而且除了 Boost 之外，我也没有在现有的库例程中找到它。这是我在针对各种参数变化广泛的场景构建测试用例期间所需的通用方法。也许它对你也有帮助，至少对于类似的场景。（可用于有疑问的微小变化的排列和组合）

#include <vector>
#include <memory>

class SingleParameterToVaryBase
{
public:

  virtual bool varyNext() = 0;
  virtual void reset() = 0;
};

template <typename _DataType, typename _ParamVariationContType>
class SingleParameterToVary : public SingleParameterToVaryBase
{
public:

  SingleParameterToVary(
    _DataType& param,
    const _ParamVariationContType& valuesToVary) :
      mParameter(param)
    , mVariations(valuesToVary)
  {
    if (mVariations.empty())
      throw std::logic_error("Empty variation container for parameter");
    reset();
  }

  // Step to next parameter value, return false if end of value vector is reached
  virtual bool varyNext() override
  {
    ++mCurrentIt;

    const bool finished = mCurrentIt == mVariations.cend();

    if (finished)
    {
      return false;
    }
    else
    {
      mParameter = *mCurrentIt;
      return true;
    }
  }

  virtual void reset() override
  {
    mCurrentIt = mVariations.cbegin();
    mParameter = *mCurrentIt;
  }

private:

  typedef typename _ParamVariationContType::const_iterator ConstIteratorType;

  // Iterator to the actual values this parameter can yield
  ConstIteratorType mCurrentIt;

  _ParamVariationContType mVariations;

  // Reference to the parameter itself
  _DataType& mParameter;
};

class GenericParameterVariator
{
public:

  GenericParameterVariator() : mFinished(false)
  {
    reset();
  }

  template <typename _ParameterType, typename _ParameterVariationsType>
  void registerParameterToVary(
    _ParameterType& param,
    const _ParameterVariationsType& paramVariations)
  {
    mParametersToVary.push_back(
      std::make_unique<SingleParameterToVary<_ParameterType, _ParameterVariationsType>>(
        param, paramVariations));
  }

  const bool isFinished() const { return mFinished; }

  void reset()
  {
    mFinished = false;
    mNumTotalCombinationsVisited = 0;

    for (const auto& upParameter : mParametersToVary)
      upParameter->reset();
  }

  // Step into next state if possible
  bool createNextParameterPermutation()
  {
    if (mFinished || mParametersToVary.empty())
      return false;

    auto itPToVary = mParametersToVary.begin();
    while (itPToVary != mParametersToVary.end())
    {
      const auto& upParameter = *itPToVary;

      // If we are the very first configuration at all, do not vary.
      const bool variedSomething = mNumTotalCombinationsVisited == 0 ? true : upParameter->varyNext();
      ++mNumTotalCombinationsVisited;

      if (!variedSomething)
      {
        // If we were not able to vary the last parameter in our list, we are finished.
        if (std::next(itPToVary) == mParametersToVary.end())
        {
          mFinished = true;
          return false;
        }

        ++itPToVary;
        continue;
      }
      else
      {
        if (itPToVary != mParametersToVary.begin())
        {
          // Reset all parameters before this one
          auto itBackwd = itPToVary;
          do
          {
            --itBackwd;
            (*itBackwd)->reset();
          } while (itBackwd != mParametersToVary.begin());
        }

        return true;
      }
    }

    return true;
  }

private:

  // Linearized parameter set
  std::vector<std::unique_ptr<SingleParameterToVaryBase>> mParametersToVary;

  bool mFinished;
  size_t mNumTotalCombinationsVisited;

};

可能的用法：

GenericParameterVariator paramVariator;

  size_t param1;
  int param2;
  char param3;

  paramVariator.registerParameterToVary(param1, std::vector<size_t>{ 1, 2 });
  paramVariator.registerParameterToVary(param2, std::vector<int>{ -1, -2 });
  paramVariator.registerParameterToVary(param3, std::vector<char>{ 'a', 'b' });

  std::vector<std::tuple<size_t, int, char>> visitedCombinations;
  while (paramVariator.createNextParameterPermutation())
    visitedCombinations.push_back(std::make_tuple(param1, param2, param3));

生成：

(1, -1, 'a')
(2, -1, 'a')
(1, -2, 'a')
(2, -2, 'a')
(1, -1, 'b')
(2, -1, 'b')
(1, -2, 'b')
(2, -2, 'b')

当然，这可以进一步优化/专门化。例如，如果您想避免有效重复，您可以简单地添加哈希方案和/或避免函子。此外，由于参数作为引用保存，因此可以考虑通过删除复制/赋值构造函数和运算符来保护生成器免受可能的错误使用。

时间复杂度在理论排列复杂度范围内。

Maybe it's already stated within the previous answers, but here I cannot find a full generic way for this with respect to the parameter types and I also didn't find it within existing library routines besides Boost. This is a generic way I needed during test case construction for scenarios with a wide spread of various parameter variations. Maybe it's helpful to you too, at least for similar scenarios. (Usable for permutation and combination with minor changes in doubt)

#include <vector>
#include <memory>

class SingleParameterToVaryBase
{
public:

  virtual bool varyNext() = 0;
  virtual void reset() = 0;
};

template <typename _DataType, typename _ParamVariationContType>
class SingleParameterToVary : public SingleParameterToVaryBase
{
public:

  SingleParameterToVary(
    _DataType& param,
    const _ParamVariationContType& valuesToVary) :
      mParameter(param)
    , mVariations(valuesToVary)
  {
    if (mVariations.empty())
      throw std::logic_error("Empty variation container for parameter");
    reset();
  }

  // Step to next parameter value, return false if end of value vector is reached
  virtual bool varyNext() override
  {
    ++mCurrentIt;

    const bool finished = mCurrentIt == mVariations.cend();

    if (finished)
    {
      return false;
    }
    else
    {
      mParameter = *mCurrentIt;
      return true;
    }
  }

  virtual void reset() override
  {
    mCurrentIt = mVariations.cbegin();
    mParameter = *mCurrentIt;
  }

private:

  typedef typename _ParamVariationContType::const_iterator ConstIteratorType;

  // Iterator to the actual values this parameter can yield
  ConstIteratorType mCurrentIt;

  _ParamVariationContType mVariations;

  // Reference to the parameter itself
  _DataType& mParameter;
};

class GenericParameterVariator
{
public:

  GenericParameterVariator() : mFinished(false)
  {
    reset();
  }

  template <typename _ParameterType, typename _ParameterVariationsType>
  void registerParameterToVary(
    _ParameterType& param,
    const _ParameterVariationsType& paramVariations)
  {
    mParametersToVary.push_back(
      std::make_unique<SingleParameterToVary<_ParameterType, _ParameterVariationsType>>(
        param, paramVariations));
  }

  const bool isFinished() const { return mFinished; }

  void reset()
  {
    mFinished = false;
    mNumTotalCombinationsVisited = 0;

    for (const auto& upParameter : mParametersToVary)
      upParameter->reset();
  }

  // Step into next state if possible
  bool createNextParameterPermutation()
  {
    if (mFinished || mParametersToVary.empty())
      return false;

    auto itPToVary = mParametersToVary.begin();
    while (itPToVary != mParametersToVary.end())
    {
      const auto& upParameter = *itPToVary;

      // If we are the very first configuration at all, do not vary.
      const bool variedSomething = mNumTotalCombinationsVisited == 0 ? true : upParameter->varyNext();
      ++mNumTotalCombinationsVisited;

      if (!variedSomething)
      {
        // If we were not able to vary the last parameter in our list, we are finished.
        if (std::next(itPToVary) == mParametersToVary.end())
        {
          mFinished = true;
          return false;
        }

        ++itPToVary;
        continue;
      }
      else
      {
        if (itPToVary != mParametersToVary.begin())
        {
          // Reset all parameters before this one
          auto itBackwd = itPToVary;
          do
          {
            --itBackwd;
            (*itBackwd)->reset();
          } while (itBackwd != mParametersToVary.begin());
        }

        return true;
      }
    }

    return true;
  }

private:

  // Linearized parameter set
  std::vector<std::unique_ptr<SingleParameterToVaryBase>> mParametersToVary;

  bool mFinished;
  size_t mNumTotalCombinationsVisited;

};

Possible usage:

GenericParameterVariator paramVariator;

  size_t param1;
  int param2;
  char param3;

  paramVariator.registerParameterToVary(param1, std::vector<size_t>{ 1, 2 });
  paramVariator.registerParameterToVary(param2, std::vector<int>{ -1, -2 });
  paramVariator.registerParameterToVary(param3, std::vector<char>{ 'a', 'b' });

  std::vector<std::tuple<size_t, int, char>> visitedCombinations;
  while (paramVariator.createNextParameterPermutation())
    visitedCombinations.push_back(std::make_tuple(param1, param2, param3));

Generates:

(1, -1, 'a')
(2, -1, 'a')
(1, -2, 'a')
(2, -2, 'a')
(1, -1, 'b')
(2, -1, 'b')
(1, -2, 'b')
(2, -2, 'b')

For sure, this can be further optimized/specialized. For instance you can simply add a hashing scheme and/or an avoid functor if you want to avoid effective repetitions. Also, since the parameters are held as references, one might consider to protect the generator from possible error-prone usage via deleting copy/assignement constructors and operators.

Time complexity is within the theoretical permutation complexity range.

回复收藏 0 原文

~没有更多了~