使用生成器在 python 中创建惰性流

发布于 2025-01-17 05:47:03 字数 1091 浏览 2 评论 0原文

我一直在弄乱 python 中的流,并尝试生成汉明数(所有素因数为 2、3 或 5 的数字)。 Dijkstra 描述的这样做的标准方法是观察:

  1. 汉明数序列从 1 开始。
  2. 序列中其余值的形式为 2h、3h 和 5h,其中 h 是任意值 汉明数。
  3. h 是通过输出值 1,然后将 2h、3h 和 5h 合并在一起生成的。

我的实现是这样的:

def hamming():
    yield 1
    yield from merge(scale_stream(hamming(), 2), scale_stream(hamming(), 3))

def merge(s1, s2):
  x1, x2 = next(s1), next(s2)
  while True:
    if x1 < x2:
        yield x1
        x1 = next(s1)
    elif x1 > x2:
        yield x2
        x2 = next(s2)
    else:
        yield x1
        x1, x2 = next(s1), next(s2)

def scale_stream(stream, scalar):
    for e in stream:
        yield e * scalar

def stream_index(stream, n):
    for i, e in enumerate(stream):
        if i+1 == n:
            return e

print(stream_index(hamming(), 300))

这确实正确地生成了汉明数流,但是无论出于何种原因,它生成的时间越长,花费的时间就越多,尽管理论上时间复杂度应该是 O(N)。

我之前玩过其他流,但我对它们的直觉非常弱,所以我不知道这里发生了什么。我认为问题出在我定义 hamming(); 的递归方式上。我不知道每次对汉明的调用都可能产生一个必须并行运行的新版本的进程,从而减慢它的速度,这是否是一个问题。

老实说,就像我说的那样,我的进程非常糟糕当我运行它和调试时实际发生的情况的想法对我毫无帮助,所以如果有更多经验的人可以启发我,我将非常感激。

I've been messing with streams in python and been trying to generate the Hamming numbers (all the numbers with prime factors of 2, 3, or 5 only). The standard way for doing so, described by Dijkstra, is to observe that:

  1. The sequence of Hamming numbers begins with 1.
  2. The remaining values in the sequence are of the form 2h, 3h, and 5h, where h is any
    Hamming number.
  3. h is be generated by outputting the value 1, and then merging together 2h, 3h, and 5h

My implementation is this:

def hamming():
    yield 1
    yield from merge(scale_stream(hamming(), 2), scale_stream(hamming(), 3))

def merge(s1, s2):
  x1, x2 = next(s1), next(s2)
  while True:
    if x1 < x2:
        yield x1
        x1 = next(s1)
    elif x1 > x2:
        yield x2
        x2 = next(s2)
    else:
        yield x1
        x1, x2 = next(s1), next(s2)

def scale_stream(stream, scalar):
    for e in stream:
        yield e * scalar

def stream_index(stream, n):
    for i, e in enumerate(stream):
        if i+1 == n:
            return e

print(stream_index(hamming(), 300))

This does correctly produce the stream of Hamming numbers, however for whatever reason it takes more and more time the longer it generates, even though in theory the time complexity should be O(N).

I have played around with other streams before but my intuition for them is pretty weak so I have no idea what is going on here. I think the issue is in the recursive way I defined hamming(); I don't know if it is an issue that every call to hamming might spawn a new version of the process that has to run in parallel thereby slowing it down.

Honestly though like I said I have a very poor idea of what actually happens when I run it and debugging has gotten me nowhere, so if someone with more experience can enlighten me I would really appreciate it.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

给妤﹃绝世温柔 2025-01-24 05:47:03

您进入流的距离越远,您需要合并的重复项就越多。数字 2**4 * 3 **4 = 1296 将在您的多个流中出现 70 次(8 选择 4),并且您的程序将花费更多的时间来合并重复项,而不是输出新项目。

你走得越远,你要处理的重复就越多。没有理由期望您的程序是线性的。

The further you get out into your stream, the more duplicates you're going to have to be merging. The number 2**4 * 3 **4 = 1296 is going to appear 70 times in your multiple stream (8 choose 4), and your program is going to be spending more time merging duplicates than it is outputting new items.

The further you go out, the more duplication you'r going to be dealing with. There is no reason to expect your program to be linear.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文