等效 C++ Python 生成器模式

发布于 2024-12-29 13:57:48 字数 753 浏览 3 评论 0 原文

我有一些示例 Python 代码,需要用 C++ 来模拟。我不需要任何特定的解决方案(例如基于协同例程的产量解决方案,尽管它们也是可接受的答案),我只需要以某种方式重现语义。

Python

这是一个基本的序列生成器,显然太大而无法存储具体化版本。

def pair_sequence():
    for i in range(2**32):
        for j in range(2**32):
            yield (i, j)

目标是维护上述序列的两个实例,并以半同步但分块的方式迭代它们。在下面的示例中,first_pass 使用对序列来初始化缓冲区,second_pass 重新生成相同的序列并再次处理缓冲区。

def run():
    seq1 = pair_sequence()
    seq2 = pair_sequence()

    buffer = [0] * 1000
    first_pass(seq1, buffer)
    second_pass(seq2, buffer)
    ... repeat ...

C++

我能在 C++ 中找到的唯一解决方案是用 C++ 协程模拟 yield,但我还没有找到任何关于如何做到这一点的好的参考。我也对这个问题的替代(非通用)解决方案感兴趣。我没有足够的内存预算来保存各遍之间的序列副本。

I've got some example Python code that I need to mimic in C++. I do not require any specific solution (such as co-routine based yield solutions, although they would be acceptable answers as well), I simply need to reproduce the semantics in some manner.

Python

This is a basic sequence generator, clearly too large to store a materialized version.

def pair_sequence():
    for i in range(2**32):
        for j in range(2**32):
            yield (i, j)

The goal is to maintain two instances of the sequence above, and iterate over them in semi-lockstep, but in chunks. In the example below the first_pass uses the sequence of pairs to initialize the buffer, and the second_pass regenerates the same exact sequence and processes the buffer again.

def run():
    seq1 = pair_sequence()
    seq2 = pair_sequence()

    buffer = [0] * 1000
    first_pass(seq1, buffer)
    second_pass(seq2, buffer)
    ... repeat ...

C++

The only thing I can find for a solution in C++ is to mimic yield with C++ coroutines, but I haven't found any good reference on how to do this. I'm also interested in alternative (non general) solutions for this problem. I do not have enough memory budget to keep a copy of the sequence between passes.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(14

只是偏爱你 2025-01-05 13:57:48

生成器存在于 C++ 中,只是有另一个名称:输入迭代器。例如,从 std::cin 读取类似于拥有 char 生成器。

您只需要了解生成器的作用:

  • 有一个数据块:局部变量定义一个状态
  • 一个 init 方法
  • 有一个“下一个”方法
  • 有一种发出终止信号的方法

有 你的例子很简单。从概念上讲:

struct State { unsigned i, j; };

State make();

void next(State&);

bool isDone(State const&);

当然,我们将其包装为一个适当的类:

class PairSequence:
    // (implicit aliases)
    public std::iterator<
        std::input_iterator_tag,
        std::pair<unsigned, unsigned>
    >
{
  // C++03
  typedef void (PairSequence::*BoolLike)();
  void non_comparable();
public:
  // C++11 (explicit aliases)
  using iterator_category = std::input_iterator_tag;
  using value_type = std::pair<unsigned, unsigned>;
  using reference = value_type const&;
  using pointer = value_type const*;
  using difference_type = ptrdiff_t;

  // C++03 (explicit aliases)
  typedef std::input_iterator_tag iterator_category;
  typedef std::pair<unsigned, unsigned> value_type;
  typedef value_type const& reference;
  typedef value_type const* pointer;
  typedef ptrdiff_t difference_type;

  PairSequence(): done(false) {}

  // C++11
  explicit operator bool() const { return !done; }

  // C++03
  // Safe Bool idiom
  operator BoolLike() const {
    return done ? 0 : &PairSequence::non_comparable;
  }

  reference operator*() const { return ij; }
  pointer operator->() const { return &ij; }

  PairSequence& operator++() {
    static unsigned const Max = std::numeric_limts<unsigned>::max();

    assert(!done);

    if (ij.second != Max) { ++ij.second; return *this; }
    if (ij.first != Max) { ij.second = 0; ++ij.first; return *this; }

    done = true;
    return *this;
  }

  PairSequence operator++(int) {
    PairSequence const tmp(*this);
    ++*this;
    return tmp;
  }

private:
  bool done;
  value_type ij;
};

所以嗯,是的...可能 C++ 有点冗长:)

Generators exist in C++, just under another name: Input Iterators. For example, reading from std::cin is similar to having a generator of char.

You simply need to understand what a generator does:

  • there is a blob of data: the local variables define a state
  • there is an init method
  • there is a "next" method
  • there is a way to signal termination

In your trivial example, it's easy enough. Conceptually:

struct State { unsigned i, j; };

State make();

void next(State&);

bool isDone(State const&);

Of course, we wrap this as a proper class:

class PairSequence:
    // (implicit aliases)
    public std::iterator<
        std::input_iterator_tag,
        std::pair<unsigned, unsigned>
    >
{
  // C++03
  typedef void (PairSequence::*BoolLike)();
  void non_comparable();
public:
  // C++11 (explicit aliases)
  using iterator_category = std::input_iterator_tag;
  using value_type = std::pair<unsigned, unsigned>;
  using reference = value_type const&;
  using pointer = value_type const*;
  using difference_type = ptrdiff_t;

  // C++03 (explicit aliases)
  typedef std::input_iterator_tag iterator_category;
  typedef std::pair<unsigned, unsigned> value_type;
  typedef value_type const& reference;
  typedef value_type const* pointer;
  typedef ptrdiff_t difference_type;

  PairSequence(): done(false) {}

  // C++11
  explicit operator bool() const { return !done; }

  // C++03
  // Safe Bool idiom
  operator BoolLike() const {
    return done ? 0 : &PairSequence::non_comparable;
  }

  reference operator*() const { return ij; }
  pointer operator->() const { return &ij; }

  PairSequence& operator++() {
    static unsigned const Max = std::numeric_limts<unsigned>::max();

    assert(!done);

    if (ij.second != Max) { ++ij.second; return *this; }
    if (ij.first != Max) { ij.second = 0; ++ij.first; return *this; }

    done = true;
    return *this;
  }

  PairSequence operator++(int) {
    PairSequence const tmp(*this);
    ++*this;
    return tmp;
  }

private:
  bool done;
  value_type ij;
};

So hum yeah... might be that C++ is a tad more verbose :)

久而酒知 2025-01-05 13:57:48

在 C++ 中,有迭代器,但实现迭代器并不简单:必须查阅 迭代器概念并仔细设计新的迭代器类来实现它们。值得庆幸的是,Boost 有一个 iterator_facade 模板应该有助于实现迭代器和迭代器兼容的生成器。

有时 无堆栈协程可用于实现迭代器

另请参阅这篇文章其中提到了 Christopher M. Kohlhoff 的 switch 黑客攻击和 Boost.Coroutine 作者:Oliver Kowalke。

PS我认为你也可以编写一种生成器使用lambdas

std::function<int()> generator = []{
  int i = 0;
  return [=]() mutable {
    return i < 10 ? i++ : -1;
  };
}();
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;

或者使用函子:

struct generator_t {
  int i = 0;
  int operator() () {
    return i < 10 ? i++ : -1;
  }
} generator;
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;

PS这是一个用以下实现的生成器Mordor 协程:

#include <iostream>
using std::cout; using std::endl;
#include <mordor/coroutine.h>
using Mordor::Coroutine; using Mordor::Fiber;

void testMordor() {
  Coroutine<int> coro ([](Coroutine<int>& self) {
    int i = 0; while (i < 9) self.yield (i++);
  });
  for (int i = coro.call(); coro.state() != Fiber::TERM; i = coro.call()) cout << i << endl;
}

In C++ there are iterators, but implementing an iterator isn't straightforward: one has to consult the iterator concepts and carefully design the new iterator class to implement them. Thankfully, Boost has an iterator_facade template which should help implementing the iterators and iterator-compatible generators.

Sometimes a stackless coroutine can be used to implement an iterator.

P.S. See also this article which mentions both a switch hack by Christopher M. Kohlhoff and Boost.Coroutine by Oliver Kowalke.

P.S. I think you can also write a kind of generator with lambdas:

std::function<int()> generator = []{
  int i = 0;
  return [=]() mutable {
    return i < 10 ? i++ : -1;
  };
}();
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;

Or with a functor:

struct generator_t {
  int i = 0;
  int operator() () {
    return i < 10 ? i++ : -1;
  }
} generator;
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;

P.S. Here's a generator implemented with the Mordor coroutines:

#include <iostream>
using std::cout; using std::endl;
#include <mordor/coroutine.h>
using Mordor::Coroutine; using Mordor::Fiber;

void testMordor() {
  Coroutine<int> coro ([](Coroutine<int>& self) {
    int i = 0; while (i < 9) self.yield (i++);
  });
  for (int i = coro.call(); coro.state() != Fiber::TERM; i = coro.call()) cout << i << endl;
}
智商已欠费 2025-01-05 13:57:48

由于 Boost.Coroutine2 现在很好地支持它(我找到它是因为我想准确地解决同样的 yield 问题),我发布了符合您初衷的 C++ 代码:

#include <stdint.h>
#include <iostream>
#include <memory>
#include <boost/coroutine2/all.hpp>

typedef boost::coroutines2::coroutine<std::pair<uint16_t, uint16_t>> coro_t;

void pair_sequence(coro_t::push_type& yield)
{
    uint16_t i = 0;
    uint16_t j = 0;
    for (;;) {
        for (;;) {
            yield(std::make_pair(i, j));
            if (++j == 0)
                break;
        }
        if (++i == 0)
            break;
    }
}

int main()
{
    coro_t::pull_type seq(boost::coroutines2::fixedsize_stack(),
                          pair_sequence);
    for (auto pair : seq) {
        print_pair(pair);
    }
    //while (seq) {
    //    print_pair(seq.get());
    //    seq();
    //}
}

在此示例中,pair_sequence 不接受额外的参数。如果需要,应使用 std::bind 或 lambda 来生成一个函数对象,该函数对象在传递给coro_t::pull_type 构造函数。

Since Boost.Coroutine2 now supports it very well (I found it because I wanted to solve exactly the same yield problem), I am posting the C++ code that matches your original intention:

#include <stdint.h>
#include <iostream>
#include <memory>
#include <boost/coroutine2/all.hpp>

typedef boost::coroutines2::coroutine<std::pair<uint16_t, uint16_t>> coro_t;

void pair_sequence(coro_t::push_type& yield)
{
    uint16_t i = 0;
    uint16_t j = 0;
    for (;;) {
        for (;;) {
            yield(std::make_pair(i, j));
            if (++j == 0)
                break;
        }
        if (++i == 0)
            break;
    }
}

int main()
{
    coro_t::pull_type seq(boost::coroutines2::fixedsize_stack(),
                          pair_sequence);
    for (auto pair : seq) {
        print_pair(pair);
    }
    //while (seq) {
    //    print_pair(seq.get());
    //    seq();
    //}
}

In this example, pair_sequence does not take additional arguments. If it needs to, std::bind or a lambda should be used to generate a function object that takes only one argument (of push_type), when it is passed to the coro_t::pull_type constructor.

梦亿 2025-01-05 13:57:48

所有涉及编写自己的迭代器的答案都是完全错误的。这样的答案完全忽略了 Python 生成器(该语言最伟大、最独特的功能之一)的意义。关于生成器最重要的一点是,执行会从中断处继续执行。这不会发生在迭代器上。相反,您必须手动存储状态信息,以便在重新调用operator++ 或operator* 时,正确的信息位于下一个函数调用的最开始处。这就是为什么编写自己的 C++ 迭代器是一件非常痛苦的事情;然而,生成器很优雅,并且易于读写。

我不认为原生 C++ 中的 Python 生成器有一个很好的模拟,至少目前还没有(有传言说 产量将登陆 C++17)。您可以通过求助于第三方(例如 Yongwei 的 Boost 建议)或自己推出类似的东西来获得类似的东西。

我想说原生 C++ 中最接近的东西是线程。线程可以维护一组挂起的局部变量,并且可以从中断处继续执行,非常类似于生成器,但是您需要滚动一些额外的基础结构来支持生成器对象与其调用者之间的通信。例如

// Infrastructure

template <typename Element>
class Channel { ... };

// Application

using IntPair = std::pair<int, int>;

void yield_pairs(int end_i, int end_j, Channel<IntPair>* out) {
  for (int i = 0; i < end_i; ++i) {
    for (int j = 0; j < end_j; ++j) {
      out->send(IntPair{i, j});  // "yield"
    }
  }
  out->close();
}

void MyApp() {
  Channel<IntPair> pairs;
  std::thread generator(yield_pairs, 32, 32, &pairs);
  for (IntPair pair : pairs) {
    UsePair(pair);
  }
  generator.join();
}

,这个解决方案有几个缺点:

  1. 线程“昂贵”。大多数人会认为这是对线程的“奢侈”使用,尤其是当您的生成器如此简单时。
  2. 您需要记住一些清理操作。这些可以自动化,但您需要更多的基础设施,这又可能被视为“过于奢侈”。无论如何,您需要的清理工作是:
    1. out->close()
    2. generator.join()
  3. 这不允许您停止生成器。您可以进行一些修改来添加该功能,但这会使代码变得混乱。它永远不会像Python 的yield 语句那样干净。
  4. 除了 2 之外,每次想要“实例化”生成器对象时还需要其他一些样板:
    1. 通道*输出参数
    2. main 中的附加变量:pairs、generator

All answers that involve writing your own iterator are completely wrong. Such answers entirely miss the point of Python generators (one of the language's greatest and unique features). The most important thing about generators is that execution picks up where it left off. This does not happen to iterators. Instead, you must manually store state information such that when operator++ or operator* is called anew, the right information is in place at the very beginning of the next function call. This is why writing your own C++ iterator is a gigantic pain; whereas, generators are elegant, and easy to read+write.

I don't think there is a good analog for Python generators in native C++, at least not yet (there is a rummor that yield will land in C++17). You can get something similarish by resorting to third-party (e.g. Yongwei's Boost suggestion), or rolling your own.

I would say the closest thing in native C++ is threads. A thread can maintain a suspended set of local variables, and can continue execution where it left off, very much like generators, but you need to roll a little bit of additional infrastructure to support communication between the generator object and its caller. E.g.

// Infrastructure

template <typename Element>
class Channel { ... };

// Application

using IntPair = std::pair<int, int>;

void yield_pairs(int end_i, int end_j, Channel<IntPair>* out) {
  for (int i = 0; i < end_i; ++i) {
    for (int j = 0; j < end_j; ++j) {
      out->send(IntPair{i, j});  // "yield"
    }
  }
  out->close();
}

void MyApp() {
  Channel<IntPair> pairs;
  std::thread generator(yield_pairs, 32, 32, &pairs);
  for (IntPair pair : pairs) {
    UsePair(pair);
  }
  generator.join();
}

This solution has several downsides though:

  1. Threads are "expensive". Most people would consider this to be an "extravagant" use of threads, especially when your generator is so simple.
  2. There are a couple of clean up actions that you need to remember. These could be automated, but you'd need even more infrastructure, which again, is likely to be seen as "too extravagant". Anyway, the clean ups that you need are:
    1. out->close()
    2. generator.join()
  3. This does not allow you to stop generator. You could make some modifications to add that ability, but it adds clutter to the code. It would never be as clean as Python's yield statement.
  4. In addition to 2, there are other bits of boilerplate that are needed each time you want to "instantiate" a generator object:
    1. Channel* out parameter
    2. Additional variables in main: pairs, generator
蔚蓝源自深海 2025-01-05 13:57:48

使用 range-v3

#include <iostream>
#include <tuple>
#include <range/v3/all.hpp>

using namespace std;
using namespace ranges;

auto generator = [x = view::iota(0) | view::take(3)] {
    return view::cartesian_product(x, x);
};

int main () {
    for (auto x : generator()) {
        cout << get<0>(x) << ", " << get<1>(x) << endl;
    }

    return 0;
}

Using range-v3:

#include <iostream>
#include <tuple>
#include <range/v3/all.hpp>

using namespace std;
using namespace ranges;

auto generator = [x = view::iota(0) | view::take(3)] {
    return view::cartesian_product(x, x);
};

int main () {
    for (auto x : generator()) {
        cout << get<0>(x) << ", " << get<1>(x) << endl;
    }

    return 0;
}
第七度阳光i 2025-01-05 13:57:48

您可能应该在 Visual Studio 2015 中检查 std::experimental 中的生成器,例如: https://blogs.msdn.microsoft.com/vcblog/2014/11/12/resumable-functions-in-c/

我认为这正是您正在寻找的。总体生成器应该在 C++17 中可用,因为这只是实验性的 Microsoft VC 功能。

You should probably check generators in std::experimental in Visual Studio 2015 e.g: https://blogs.msdn.microsoft.com/vcblog/2014/11/12/resumable-functions-in-c/

I think it's exactly what you are looking for. Overall generators should be available in C++17 as this is only experimental Microsoft VC feature.

瑶笙 2025-01-05 13:57:48

如果您只需要对相对少量的特定生成器执行此操作,则可以将每个生成器实现为一个类,其中成员数据相当于Python生成器函数的局部变量。然后你有一个 next 函数,它返回生成器将产生的下一个结果,并在执行过程中更新内部状态。

我相信这基本上与 Python 生成器的实现方式类似。主要区别在于它们可以记住生成器函数的字节码偏移量作为“内部状态”的一部分,这意味着生成器可以编写为包含收益的循环。您必须根据前一个值计算下一个值。对于您的pair_sequence来说,这是非常微不足道的。它可能不适用于复杂的发电机。

您还需要某种方式来指示终止。如果您返回的是“类似指针”,并且 NULL 不应该是有效的可生成值,您可以使用 NULL 指针作为终止指示符。否则您需要带外信号。

If you only need to do this for a relatively small number of specific generators, you can implement each as a class, where the member data is equivalent to the local variables of the Python generator function. Then you have a next function that returns the next thing the generator would yield, updating the internal state as it does so.

This is basically similar to how Python generators are implemented, I believe. The major difference being they can remember an offset into the bytecode for the generator function as part of the "internal state", which means the generators can be written as loops containing yields. You would have to instead calculate the next value from the previous. In the case of your pair_sequence, that's pretty trivial. It may not be for complex generators.

You also need some way of indicating termination. If what you're returning is "pointer-like", and NULL should not be a valid yieldable value you could use a NULL pointer as a termination indicator. Otherwise you need an out-of-band signal.

¢蛋碎的人ぎ生 2025-01-05 13:57:48

这样的事情非常相似:

struct pair_sequence
{
    typedef pair<unsigned int, unsigned int> result_type;
    static const unsigned int limit = numeric_limits<unsigned int>::max()

    pair_sequence() : i(0), j(0) {}

    result_type operator()()
    {
        result_type r(i, j);
        if(j < limit) j++;
        else if(i < limit)
        {
          j = 0;
          i++;
        }
        else throw out_of_range("end of iteration");
    }

    private:
        unsigned int i;
        unsigned int j;
}

使用operator()只是一个你想用这个生成器做什么的问题,你也可以将它构建为一个流并确保它适应istream_iterator,例如。

Something like this is very similar:

struct pair_sequence
{
    typedef pair<unsigned int, unsigned int> result_type;
    static const unsigned int limit = numeric_limits<unsigned int>::max()

    pair_sequence() : i(0), j(0) {}

    result_type operator()()
    {
        result_type r(i, j);
        if(j < limit) j++;
        else if(i < limit)
        {
          j = 0;
          i++;
        }
        else throw out_of_range("end of iteration");
    }

    private:
        unsigned int i;
        unsigned int j;
}

Using the operator() is only a question of what you want to do with this generator, you could also build it as a stream and make sure it adapts to an istream_iterator, for example.

画▽骨i 2025-01-05 13:57:48

嗯,今天我也在寻找 C++11 下的简单集合实现。事实上我很失望,因为我发现的一切都与 python 生成器或 C# 生成运算符之类的东西相去甚远……或者太复杂了。

目的是创建仅在需要时才会发出其项目的集合。

我希望它是这样的:

auto emitter = on_range<int>(a, b).yield(
    [](int i) {
         /* do something with i */
         return i * 2;
    });

我发现这篇文章,恕我直言,最佳答案是关于 boost.coroutine2,作者 Yongwei Wu。因为它最接近作者想要的。

值得学习 boost couroutines.. 我也许会在周末做。但到目前为止我正在使用我的非常小的实现。希望它对其他人有帮助。

下面是使用示例,然后是实现。

Example.cpp

#include <iostream>
#include "Generator.h"
int main() {
    typedef std::pair<int, int> res_t;

    auto emitter = Generator<res_t, int>::on_range(0, 3)
        .yield([](int i) {
            return std::make_pair(i, i * i);
        });

    for (auto kv : emitter) {
        std::cout << kv.first << "^2 = " << kv.second << std::endl;
    }

    return 0;
}

Generator.h

template<typename ResTy, typename IndexTy>
struct yield_function{
    typedef std::function<ResTy(IndexTy)> type;
};

template<typename ResTy, typename IndexTy>
class YieldConstIterator {
public:
    typedef IndexTy index_t;
    typedef ResTy res_t;
    typedef typename yield_function<res_t, index_t>::type yield_function_t;

    typedef YieldConstIterator<ResTy, IndexTy> mytype_t;
    typedef ResTy value_type;

    YieldConstIterator(index_t index, yield_function_t yieldFunction) :
            mIndex(index),
            mYieldFunction(yieldFunction) {}

    mytype_t &operator++() {
        ++mIndex;
        return *this;
    }

    const value_type operator*() const {
        return mYieldFunction(mIndex);
    }

    bool operator!=(const mytype_t &r) const {
        return mIndex != r.mIndex;
    }

protected:

    index_t mIndex;
    yield_function_t mYieldFunction;
};

template<typename ResTy, typename IndexTy>
class YieldIterator : public YieldConstIterator<ResTy, IndexTy> {
public:

    typedef YieldConstIterator<ResTy, IndexTy> parent_t;

    typedef IndexTy index_t;
    typedef ResTy res_t;
    typedef typename yield_function<res_t, index_t>::type yield_function_t;
    typedef ResTy value_type;

    YieldIterator(index_t index, yield_function_t yieldFunction) :
            parent_t(index, yieldFunction) {}

    value_type operator*() {
        return parent_t::mYieldFunction(parent_t::mIndex);
    }
};

template<typename IndexTy>
struct Range {
public:
    typedef IndexTy index_t;
    typedef Range<IndexTy> mytype_t;

    index_t begin;
    index_t end;
};

template<typename ResTy, typename IndexTy>
class GeneratorCollection {
public:

    typedef Range<IndexTy> range_t;

    typedef IndexTy index_t;
    typedef ResTy res_t;
    typedef typename yield_function<res_t, index_t>::type yield_function_t;
    typedef YieldIterator<ResTy, IndexTy> iterator;
    typedef YieldConstIterator<ResTy, IndexTy> const_iterator;

    GeneratorCollection(range_t range, const yield_function_t &yieldF) :
            mRange(range),
            mYieldFunction(yieldF) {}

    iterator begin() {
        return iterator(mRange.begin, mYieldFunction);
    }

    iterator end() {
        return iterator(mRange.end, mYieldFunction);
    }

    const_iterator begin() const {
        return const_iterator(mRange.begin, mYieldFunction);
    }

    const_iterator end() const {
        return const_iterator(mRange.end, mYieldFunction);
    }

private:
    range_t mRange;
    yield_function_t mYieldFunction;
};

template<typename ResTy, typename IndexTy>
class Generator {
public:
    typedef IndexTy index_t;
    typedef ResTy res_t;
    typedef typename yield_function<res_t, index_t>::type yield_function_t;

    typedef Generator<ResTy, IndexTy> mytype_t;
    typedef Range<IndexTy> parent_t;
    typedef GeneratorCollection<ResTy, IndexTy> finalized_emitter_t;
    typedef  Range<IndexTy> range_t;

protected:
    Generator(range_t range) : mRange(range) {}
public:
    static mytype_t on_range(index_t begin, index_t end) {
        return mytype_t({ begin, end });
    }

    finalized_emitter_t yield(yield_function_t f) {
        return finalized_emitter_t(mRange, f);
    }
protected:

    range_t mRange;
};      

Well, today I also was looking for easy collection implementation under C++11. Actually I was disappointed, because everything I found is too far from things like python generators, or C# yield operator... or too complicated.

The purpose is to make collection which will emit its items only when it is required.

I wanted it to be like this:

auto emitter = on_range<int>(a, b).yield(
    [](int i) {
         /* do something with i */
         return i * 2;
    });

I found this post, IMHO best answer was about boost.coroutine2, by Yongwei Wu. Since it is the nearest to what author wanted.

It is worth learning boost couroutines.. And I'll perhaps do on weekends. But so far I'm using my very small implementation. Hope it helps to someone else.

Below is example of use, and then implementation.

Example.cpp

#include <iostream>
#include "Generator.h"
int main() {
    typedef std::pair<int, int> res_t;

    auto emitter = Generator<res_t, int>::on_range(0, 3)
        .yield([](int i) {
            return std::make_pair(i, i * i);
        });

    for (auto kv : emitter) {
        std::cout << kv.first << "^2 = " << kv.second << std::endl;
    }

    return 0;
}

Generator.h

template<typename ResTy, typename IndexTy>
struct yield_function{
    typedef std::function<ResTy(IndexTy)> type;
};

template<typename ResTy, typename IndexTy>
class YieldConstIterator {
public:
    typedef IndexTy index_t;
    typedef ResTy res_t;
    typedef typename yield_function<res_t, index_t>::type yield_function_t;

    typedef YieldConstIterator<ResTy, IndexTy> mytype_t;
    typedef ResTy value_type;

    YieldConstIterator(index_t index, yield_function_t yieldFunction) :
            mIndex(index),
            mYieldFunction(yieldFunction) {}

    mytype_t &operator++() {
        ++mIndex;
        return *this;
    }

    const value_type operator*() const {
        return mYieldFunction(mIndex);
    }

    bool operator!=(const mytype_t &r) const {
        return mIndex != r.mIndex;
    }

protected:

    index_t mIndex;
    yield_function_t mYieldFunction;
};

template<typename ResTy, typename IndexTy>
class YieldIterator : public YieldConstIterator<ResTy, IndexTy> {
public:

    typedef YieldConstIterator<ResTy, IndexTy> parent_t;

    typedef IndexTy index_t;
    typedef ResTy res_t;
    typedef typename yield_function<res_t, index_t>::type yield_function_t;
    typedef ResTy value_type;

    YieldIterator(index_t index, yield_function_t yieldFunction) :
            parent_t(index, yieldFunction) {}

    value_type operator*() {
        return parent_t::mYieldFunction(parent_t::mIndex);
    }
};

template<typename IndexTy>
struct Range {
public:
    typedef IndexTy index_t;
    typedef Range<IndexTy> mytype_t;

    index_t begin;
    index_t end;
};

template<typename ResTy, typename IndexTy>
class GeneratorCollection {
public:

    typedef Range<IndexTy> range_t;

    typedef IndexTy index_t;
    typedef ResTy res_t;
    typedef typename yield_function<res_t, index_t>::type yield_function_t;
    typedef YieldIterator<ResTy, IndexTy> iterator;
    typedef YieldConstIterator<ResTy, IndexTy> const_iterator;

    GeneratorCollection(range_t range, const yield_function_t &yieldF) :
            mRange(range),
            mYieldFunction(yieldF) {}

    iterator begin() {
        return iterator(mRange.begin, mYieldFunction);
    }

    iterator end() {
        return iterator(mRange.end, mYieldFunction);
    }

    const_iterator begin() const {
        return const_iterator(mRange.begin, mYieldFunction);
    }

    const_iterator end() const {
        return const_iterator(mRange.end, mYieldFunction);
    }

private:
    range_t mRange;
    yield_function_t mYieldFunction;
};

template<typename ResTy, typename IndexTy>
class Generator {
public:
    typedef IndexTy index_t;
    typedef ResTy res_t;
    typedef typename yield_function<res_t, index_t>::type yield_function_t;

    typedef Generator<ResTy, IndexTy> mytype_t;
    typedef Range<IndexTy> parent_t;
    typedef GeneratorCollection<ResTy, IndexTy> finalized_emitter_t;
    typedef  Range<IndexTy> range_t;

protected:
    Generator(range_t range) : mRange(range) {}
public:
    static mytype_t on_range(index_t begin, index_t end) {
        return mytype_t({ begin, end });
    }

    finalized_emitter_t yield(yield_function_t f) {
        return finalized_emitter_t(mRange, f);
    }
protected:

    range_t mRange;
};      
旧伤还要旧人安 2025-01-05 13:57:48

可以通过简单的 goto 语句来实现yield 行为。因为很简单,我用 C 语言编写了它。

您在生成器函数中所需要做的就是:

  • 将所有变量声明为静态
  • 最后的产量退出用标签进行记忆
  • 变量在函数示例末尾重新初始化

#include <stdio.h>

typedef struct {
    int i, j;
} Pair;

// the function generate_pairs  can generate values in successive calls.
// - all variables are declared as static
// - last yield exit is memorized with a label
// - variables are reinitialized at the end of function
Pair* generate_pairs(int imax, int jmax)
{
    // all local variable are declared static. So they are declared at the beginning
    static int i = 0;
    static int j = 0;
    static Pair p;
    // the exit position is marked with a label
    static enum {EBEGIN, EYIELD1} tag_goto = EBEGIN;
    
    // I goto to the last exit position
    if (tag_goto == EYIELD1)
        goto TYIELD1;
    
    
    for (i=0; i<imax; i++)   {
        for (j=0; j<jmax; j++)   {
            p.i = i;   p.j = -j;
            
            // I manage the yield comportment
            tag_goto = EYIELD1;
            return &p;
            TYIELD1 : ;
        }
        j = 0;
    }
    
    // reinitialization of variables
    i = 0;   j = 0;   // in fact this reinitialization is not useful in this example
    tag_goto = EBEGIN;
    
    // NULL means ends of generator
    return NULL; 
}

int main()
{
    for (Pair *p = generate_pairs(2,4); p != NULL; p = generate_pairs(2,4))
    {
        printf("%d,%d\n",p->i,p->j);
    }
    printf("end\n");
    return 0;
}

It is possible to have yield comportment with simple goto statement. As it is simple, I wrote it in C.

All you have to do in your generator function is :

  • all variables are declared as static
  • last yield exit is memorized with a label
  • variables are reinitialized at the end of function

example :

#include <stdio.h>

typedef struct {
    int i, j;
} Pair;

// the function generate_pairs  can generate values in successive calls.
// - all variables are declared as static
// - last yield exit is memorized with a label
// - variables are reinitialized at the end of function
Pair* generate_pairs(int imax, int jmax)
{
    // all local variable are declared static. So they are declared at the beginning
    static int i = 0;
    static int j = 0;
    static Pair p;
    // the exit position is marked with a label
    static enum {EBEGIN, EYIELD1} tag_goto = EBEGIN;
    
    // I goto to the last exit position
    if (tag_goto == EYIELD1)
        goto TYIELD1;
    
    
    for (i=0; i<imax; i++)   {
        for (j=0; j<jmax; j++)   {
            p.i = i;   p.j = -j;
            
            // I manage the yield comportment
            tag_goto = EYIELD1;
            return &p;
            TYIELD1 : ;
        }
        j = 0;
    }
    
    // reinitialization of variables
    i = 0;   j = 0;   // in fact this reinitialization is not useful in this example
    tag_goto = EBEGIN;
    
    // NULL means ends of generator
    return NULL; 
}

int main()
{
    for (Pair *p = generate_pairs(2,4); p != NULL; p = generate_pairs(2,4))
    {
        printf("%d,%d\n",p->i,p->j);
    }
    printf("end\n");
    return 0;
}
从﹋此江山别 2025-01-05 13:57:48

类似 this 的内容:

示例使用:

using ull = unsigned long long;

auto main() -> int {
    for (ull val : range_t<ull>(100)) {
        std::cout << val << std::endl;
    }

    return 0;
}

将打印从 0 到 99 的数字

Something like this:

Example use:

using ull = unsigned long long;

auto main() -> int {
    for (ull val : range_t<ull>(100)) {
        std::cout << val << std::endl;
    }

    return 0;
}

Will print the numbers from 0 to 99

谷夏 2025-01-05 13:57:48

这个答案在 C 中有效(因此我认为在 C++ 中也有效)

#include<stdint.h>
//#include<stdio.h>

#define MAX (1ll << 32) //2^32

typedef struct {
    uint64_t i, j;
} Pair;

int generate_pairs(Pair* p)
{
    static uint64_t i = 0;
    static uint64_t j = 0;

    p->i = i;
    p->j = j;
    
    if(++j == MAX)
    {
        j = 0;
        if(++i == MAX)
        {
            return -1; // return -1 to indicate generator finished.
        }
    }
    
    return 1; // return non -1 to indicate generator not finished.
}

int main()
{
    while(1)
    {
        Pair p;
        int fin = generate_pairs(&p);
        
        //printf("%lld, %lld\n", p.i, p.j);
        
        if(fin == -1)
        {
            //printf("end");
            break;
        }
    }
    return 0;
}

这是模仿生成器的简单、非面向对象的方法。这对我来说正如预期的那样。

编辑:以前的代码是错误的,我已经更新了它。

注意:对于给定的问题,可以改进此代码,仅使用 uint32_t 而不是 uint64_t。

This answer works in C (and hence I think works in C++ too)

#include<stdint.h>
//#include<stdio.h>

#define MAX (1ll << 32) //2^32

typedef struct {
    uint64_t i, j;
} Pair;

int generate_pairs(Pair* p)
{
    static uint64_t i = 0;
    static uint64_t j = 0;

    p->i = i;
    p->j = j;
    
    if(++j == MAX)
    {
        j = 0;
        if(++i == MAX)
        {
            return -1; // return -1 to indicate generator finished.
        }
    }
    
    return 1; // return non -1 to indicate generator not finished.
}

int main()
{
    while(1)
    {
        Pair p;
        int fin = generate_pairs(&p);
        
        //printf("%lld, %lld\n", p.i, p.j);
        
        if(fin == -1)
        {
            //printf("end");
            break;
        }
    }
    return 0;
}

This is simple, non object-oriented way to mimic a generator. This worked as expected for me.

Edit: Previous code was erroneous and I have updated it.

Note: This code can be improved to use just uint32_t instead of uint64_t for the given question.

灰色世界里的红玫瑰 2025-01-05 13:57:48

我偶然发现了这篇文章,很旧的帖子,但它可能会对将来阅读它的其他人有所帮助。

我似乎不太明白这个问题,因为我没有看到这个问题。

基本上,我们需要保存 i & 的状态。 j。为什么不只使用一个简单的 C++ 类呢?

唯一的一点是知道在哪里停止,在我的示例中,我实际上返回 std::pair,第一个元素带有 bool 指示是否有更多元素,第二个元素带有实际值。

#include <iostream>
#include <functional>

using ReturnDataType = std::pair<long, long>;
using ReturnGeneratorType = std::pair<bool, ReturnDataType>;
size_t MaxIndex = 5;

class Generator
{
public:
  ReturnGeneratorType operator()()
  {
    while (m_i < MaxIndex)
    {
      while (m_j < MaxIndex)
      {
        return {true, {m_i, m_j++}};
      }
      ++m_i;
      m_j = 0;
    }

    return {false, {}};
  }

private:
  long m_i{0};
  long m_j{0};
};

所以我们可以像这样使用它:

int main(void)
{
  Generator sequence1;
  Generator sequence2;

  bool sequence1Finished = false;
  bool sequence2Finished = false;

  size_t blockIndex = 0;
  while (!sequence1Finished && !sequence2Finished)
  {
    std::cout << "Block number #" << blockIndex << "\n";
    for (int iteration = 0; iteration < 10; iteration++)
    {
      const auto [notFinished, result] = sequence1();
      if (notFinished)
      {
        std::cout << "Sequence1 returns (" << result.first << ", " << result.second << ")\n";
      }
      else
      {
        std::cout << "Sequence1 finished\n";
        sequence1Finished = true;
      }
    }
    for (int iteration = 0; iteration < 10; iteration++)
    {
      const auto [notFinished, result] = sequence2();
      if (notFinished)
      {
        std::cout << "Sequence2 returns (" << result.first << ", " << result.second << ")\n";
      }
      else
      {
        std::cout << "Sequence2 finished\n";
        sequence2Finished = true;
      }
    }
  }
}

输出:

Block number #0
Sequence1 returns (0, 0)
Sequence1 returns (0, 1)
Sequence1 returns (0, 2)
Sequence1 returns (0, 3)
Sequence1 returns (0, 4)
Sequence1 returns (1, 0)
Sequence1 returns (1, 1)
Sequence1 returns (1, 2)
Sequence1 returns (1, 3)
Sequence1 returns (1, 4)
Sequence2 returns (0, 0)
Sequence2 returns (0, 1)
Sequence2 returns (0, 2)
Sequence2 returns (0, 3)
Sequence2 returns (0, 4)
Sequence2 returns (1, 0)
Sequence2 returns (1, 1)
Sequence2 returns (1, 2)
Sequence2 returns (1, 3)
Sequence2 returns (1, 4)
Block number #1
Sequence1 returns (2, 0)
Sequence1 returns (2, 1)
Sequence1 returns (2, 2)
Sequence1 returns (2, 3)
Sequence1 returns (2, 4)
Sequence1 returns (3, 0)
Sequence1 returns (3, 1)
Sequence1 returns (3, 2)
Sequence1 returns (3, 3)
Sequence1 returns (3, 4)
Sequence2 returns (2, 0)
Sequence2 returns (2, 1)
Sequence2 returns (2, 2)
Sequence2 returns (2, 3)
Sequence2 returns (2, 4)
Sequence2 returns (3, 0)
Sequence2 returns (3, 1)
Sequence2 returns (3, 2)
Sequence2 returns (3, 3)
Sequence2 returns (3, 4)
Block number #2
Sequence1 returns (4, 0)
Sequence1 returns (4, 1)
Sequence1 returns (4, 2)
Sequence1 returns (4, 3)
Sequence1 returns (4, 4)
Sequence1 finished
Sequence1 finished
Sequence1 finished
Sequence1 finished
Sequence1 finished
Sequence2 returns (4, 0)
Sequence2 returns (4, 1)
Sequence2 returns (4, 2)
Sequence2 returns (4, 3)
Sequence2 returns (4, 4)
Sequence2 finished
Sequence2 finished
Sequence2 finished
Sequence2 finished
Sequence2 finished

这只是普通的 C++。现在,它必须是一个函数而不是一个 C++ 对象吗?
很简单,我们将对象封装在 lambda 中:

  auto sequence1 = []() {static Generator sequence; return sequence();};  
  auto sequence2 = []() {static Generator sequence; return sequence();};

有时我需要 C++ 中的生成器,如果代码不长,我只需将它们封装在 lambda 中。内部声明的静态成员仅属于该 lambda 实例,因此我可以拥有多个实例。这不适用于仅一个函数,其中静态存储只是一个函数。

I came about this by chance, very old post, but it may help some others who read it in the future.

I don't quite understand the problem, it seems, because I don't see the problem.

Basically, we need to save the states of i & j. Why not use just a simple C++ class?

The only point is to know where to stop, and in my example I am actually returning std::pair, first element with a bool indicating if there are more elements, and the second with the real value.

#include <iostream>
#include <functional>

using ReturnDataType = std::pair<long, long>;
using ReturnGeneratorType = std::pair<bool, ReturnDataType>;
size_t MaxIndex = 5;

class Generator
{
public:
  ReturnGeneratorType operator()()
  {
    while (m_i < MaxIndex)
    {
      while (m_j < MaxIndex)
      {
        return {true, {m_i, m_j++}};
      }
      ++m_i;
      m_j = 0;
    }

    return {false, {}};
  }

private:
  long m_i{0};
  long m_j{0};
};

And so we can use it like this :

int main(void)
{
  Generator sequence1;
  Generator sequence2;

  bool sequence1Finished = false;
  bool sequence2Finished = false;

  size_t blockIndex = 0;
  while (!sequence1Finished && !sequence2Finished)
  {
    std::cout << "Block number #" << blockIndex << "\n";
    for (int iteration = 0; iteration < 10; iteration++)
    {
      const auto [notFinished, result] = sequence1();
      if (notFinished)
      {
        std::cout << "Sequence1 returns (" << result.first << ", " << result.second << ")\n";
      }
      else
      {
        std::cout << "Sequence1 finished\n";
        sequence1Finished = true;
      }
    }
    for (int iteration = 0; iteration < 10; iteration++)
    {
      const auto [notFinished, result] = sequence2();
      if (notFinished)
      {
        std::cout << "Sequence2 returns (" << result.first << ", " << result.second << ")\n";
      }
      else
      {
        std::cout << "Sequence2 finished\n";
        sequence2Finished = true;
      }
    }
  }
}

Output :

Block number #0
Sequence1 returns (0, 0)
Sequence1 returns (0, 1)
Sequence1 returns (0, 2)
Sequence1 returns (0, 3)
Sequence1 returns (0, 4)
Sequence1 returns (1, 0)
Sequence1 returns (1, 1)
Sequence1 returns (1, 2)
Sequence1 returns (1, 3)
Sequence1 returns (1, 4)
Sequence2 returns (0, 0)
Sequence2 returns (0, 1)
Sequence2 returns (0, 2)
Sequence2 returns (0, 3)
Sequence2 returns (0, 4)
Sequence2 returns (1, 0)
Sequence2 returns (1, 1)
Sequence2 returns (1, 2)
Sequence2 returns (1, 3)
Sequence2 returns (1, 4)
Block number #1
Sequence1 returns (2, 0)
Sequence1 returns (2, 1)
Sequence1 returns (2, 2)
Sequence1 returns (2, 3)
Sequence1 returns (2, 4)
Sequence1 returns (3, 0)
Sequence1 returns (3, 1)
Sequence1 returns (3, 2)
Sequence1 returns (3, 3)
Sequence1 returns (3, 4)
Sequence2 returns (2, 0)
Sequence2 returns (2, 1)
Sequence2 returns (2, 2)
Sequence2 returns (2, 3)
Sequence2 returns (2, 4)
Sequence2 returns (3, 0)
Sequence2 returns (3, 1)
Sequence2 returns (3, 2)
Sequence2 returns (3, 3)
Sequence2 returns (3, 4)
Block number #2
Sequence1 returns (4, 0)
Sequence1 returns (4, 1)
Sequence1 returns (4, 2)
Sequence1 returns (4, 3)
Sequence1 returns (4, 4)
Sequence1 finished
Sequence1 finished
Sequence1 finished
Sequence1 finished
Sequence1 finished
Sequence2 returns (4, 0)
Sequence2 returns (4, 1)
Sequence2 returns (4, 2)
Sequence2 returns (4, 3)
Sequence2 returns (4, 4)
Sequence2 finished
Sequence2 finished
Sequence2 finished
Sequence2 finished
Sequence2 finished

This is just normal C++. Now, does it have to be a function and not a C++ object?
Simple, we encapsulate the objects in a lambda:

  auto sequence1 = []() {static Generator sequence; return sequence();};  
  auto sequence2 = []() {static Generator sequence; return sequence();};

Sometimes I need generators in C++, and if the code is not long I just encapsulate them in a lambda. The static members declared inside belong only to that instance of the lambda, so I can have several instances. This would not work with just a function, where static storage is only one.

青衫儰鉨ミ守葔 2025-01-05 13:57:48

正如函数模拟堆栈的概念一样,生成器模拟队列的概念。剩下的就是语义了。

附带说明一下,您始终可以通过使用操作堆栈而不是数据来模拟带有堆栈的队列。这实际上意味着您可以通过返回一对来实现类似队列的行为,其中第二个值要么具有要调用的下一个函数,要么表明我们没有值。但这比收益率与回报的关系更为普遍。它允许模拟任何值的队列,而不是您期望从生成器获得的同类值,但不保留完整的内部队列。

更具体地说,由于 C++ 没有对队列的自然抽象,因此您需要使用在内部实现队列的构造。因此,给出迭代器示例的答案是该概念的一个不错的实现。

这实际上意味着,如果您只想快速执行某些操作,然后像使用生成器生成的值一样使用队列的值,则可以使用基本的队列功能来实现某些功能。

Just as a function simulates the concept of a stack, generators simulate the concept of a queue. The rest is semantics.

As a side note, you can always simulate a queue with a stack by using a stack of operations instead of data. What that practically means is that you can implement a queue-like behavior by returning a pair, the second value of which either has the next function to be called or indicates that we are out of values. But this is more general than what yield vs return does. It allows to simulate a queue of any values rather than homogeneous values that you expect from a generator, but without keeping a full internal queue.

More specifically, since C++ does not have a natural abstraction for a queue, you need to use constructs which implement a queue internally. So the answer which gave the example with iterators is a decent implementation of the concept.

What this practically means is that you can implement something with bare-bones queue functionality if you just want something quick and then consume queue's values just as you would consume values yielded from a generator.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文