Creating lazy streams in Python using generators

Posted 2025-01-17 05:47:03


I've been messing around with streams in Python and have been trying to generate the Hamming numbers (all numbers whose only prime factors are 2, 3, or 5). The standard way of doing so, described by Dijkstra, is to observe that:

The sequence of Hamming numbers begins with 1.
The remaining values in the sequence are of the form 2h, 3h, and 5h, where h is any Hamming number.
The sequence is generated by outputting the value 1, and then merging together 2h, 3h, and 5h.
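
For comparison, the three observations above can be sketched without any recursion. This is only an illustrative reference version, not the implementation being asked about: the name hamming_reference and the three read positions i2, i3, i5 are assumptions introduced here, and the list seq keeps every value produced so far, so memory grows with the output.

```python
from itertools import islice

def hamming_reference():
    """Yield Hamming numbers in increasing order (illustrative sketch)."""
    seq = [1]          # every Hamming number produced so far
    i2 = i3 = i5 = 0   # index of the next h to be scaled by 2, 3 and 5
    yield 1
    while True:
        c2, c3, c5 = 2 * seq[i2], 3 * seq[i3], 5 * seq[i5]
        nxt = min(c2, c3, c5)
        seq.append(nxt)
        yield nxt
        # Advance every position that produced nxt, so that a value
        # reachable in several ways (6 = 2*3 = 3*2) is emitted only once.
        if nxt == c2:
            i2 += 1
        if nxt == c3:
            i3 += 1
        if nxt == c5:
            i5 += 1

first_ten = list(islice(hamming_reference(), 10))
print(first_ten)
```

Each new value costs a constant amount of work here, which is the linear behaviour the recursive stream version is expected to match.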

My implementation is this:

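def hamming():
    # The stream starts with 1 and continues with the merged scaled copies.
    # Note: only the 2h and 3h streams are merged here; the 5h stream from
    # the description above is not included.
    yield 1
    yield from merge(scale_stream(hamming(), 2), scale_stream(hamming(), 3))

def merge(s1, s2):
    # Merge two increasing streams into one increasing stream, dropping
    # values that appear in both.
    x1, x2 = next(s1), next(s2)
    while True:
        if x1 < x2:
            yield x1
            x1 = next(s1)
        elif x1 > x2:
            yield x2
            x2 = next(s2)
        else:
            yield x1
            x1, x2 = next(s1), next(s2)

def scale_stream(stream, scalar):
    # Lazily multiply every element of the stream by scalar.
    for e in stream:
        yield e * scalar

def stream_index(stream, n):
    # Return the n-th element of the stream (1-based).
    for i, e in enumerate(stream):
        if i + 1 == n:
            return e

print(stream_index(hamming(), 300))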

This does produce a correctly ordered stream; however, for whatever reason, each successive value takes longer to produce than the last, even though in theory the time complexity should be O(N).

I have played around with other streams before, but my intuition for them is pretty weak, so I have no idea what is going on here. I think the issue is in the recursive way I defined hamming(). I don't know whether the problem is that every call to hamming() spawns a new copy of the generator that has to run alongside the others, thereby slowing everything down.
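
One way to probe that suspicion is to count how many copies of the generator actually start running. The sketch below is an instrumented copy of the code above; the names counter, counted_hamming and instances_needed are hypothetical helpers added for the experiment, and the count is of generators whose body has begun executing (a generator object that is created but never advanced is not counted).

```python
from itertools import islice

counter = {"started": 0}  # hypothetical bookkeeping, not in the original code

def merge(s1, s2):
    # Same merge as in the question: two increasing streams, no duplicates.
    x1, x2 = next(s1), next(s2)
    while True:
        if x1 < x2:
            yield x1
            x1 = next(s1)
        elif x1 > x2:
            yield x2
            x2 = next(s2)
        else:
            yield x1
            x1, x2 = next(s1), next(s2)

def scale_stream(stream, scalar):
    for e in stream:
        yield e * scalar

def counted_hamming():
    # Identical to hamming() except that it records each time a copy starts.
    counter["started"] += 1
    yield 1
    yield from merge(scale_stream(counted_hamming(), 2),
                     scale_stream(counted_hamming(), 3))

def instances_needed(n):
    """Return how many generator copies start while producing n values."""
    counter["started"] = 0
    list(islice(counted_hamming(), n))
    return counter["started"]

for n in (1, 2, 5, 10, 20):
    print(n, instances_needed(n))
```

If the printed count keeps climbing as n grows, then each copy is recomputing the stream from the beginning on its own instead of sharing results, which would be consistent with the slowdown described above.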

Honestly, as I said, I have a very poor idea of what actually happens when I run it, and debugging has gotten me nowhere. If someone with more experience can enlighten me, I would really appreciate it.
