Python: Getting the number of items in a generator without storing the items

Posted 2024-09-07 22:49:57 · 191 words · 10 views · 0 comments

I have a generator for a large set of items. I want to iterate through them once, outputting them to a file. However, with the file format I currently have, I first have to output the number of items I have. I don't want to build a list of the items in memory, as there are too many of them and that would take a lot of time and memory. Is there a way to iterate through the generator, getting its length, but somehow be able to iterate through it again later, getting the same items?

If not, what other solution could I come up with for this problem?

Comments (3)

桃酥萝莉 2024-09-14 22:49:57

If you can figure out how to just write a formula to calculate the size based on the parameters that control the generator, do that. Otherwise, I don't think you would save much time.

Include the generator here, and we'll try to do it for you!
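Since the actual generator was not posted, here is only a sketch of what "a formula based on the parameters" can look like. The generator `pairs` and its counterpart `pairs_count` are hypothetical stand-ins: the generator yields every unordered pair from `range(n)`, and the count is simply n choose 2, so nothing has to be iterated or stored to know the size.

```python
import math
from itertools import combinations


def pairs(n):
    """Hypothetical generator: yields every unordered pair drawn from range(n)."""
    yield from combinations(range(n), 2)


def pairs_count(n):
    """Closed-form item count for pairs(n): n choose 2, computed without iterating."""
    return math.comb(n, 2)  # math.comb requires Python 3.8+


n = 1000
count = pairs_count(n)  # known before a single item is generated
```

Whether such a formula exists depends entirely on how the real generator is built; if it filters items on a condition that can only be checked at run time, this approach will not apply.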

简单气质女生网名 2024-09-14 22:49:57

This cannot be done. Once a generator is exhausted it needs to be reconstructed in order to be used again. It is possible to define the __len__() method on an iterator object if the number of items is known ahead of time, and then len() can be called against the iterator object.
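A plain generator object cannot carry a `__len__()` method, so in practice this means wrapping the generation logic in a small class. The following is a minimal sketch, assuming the item count is known up front; `SizedSquares` and its contents are made up for illustration.

```python
class SizedSquares:
    """Hypothetical iterable whose length is known before iterating."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # The count comes from the parameters, not from consuming items.
        return self.n

    def __iter__(self):
        # Each call builds a fresh generator, so the object can be iterated again.
        return (i * i for i in range(self.n))


items = SizedSquares(4)
count = len(items)    # available before any item is produced
values = list(items)  # items are still generated lazily, one at a time
```

Because `__iter__()` returns a new generator on every call, this also sidesteps the exhaustion problem: the object itself is never "used up".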

紫罗兰の梦幻 2024-09-14 22:49:57

I don't think that is possible for any generalized iterator. You will need to figure out how the generator was originally constructed and then regenerate it for the final pass.
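As a rough sketch of the regenerate-it approach: pass around a function that builds the generator rather than the generator itself, run it once to count and once more to write. The names `make_items` and `write_with_count` are hypothetical, and this assumes the generator is deterministic, i.e. it yields the same items on every run.

```python
import io


def make_items():
    """Hypothetical factory: every call returns a fresh, identical generator."""
    return (f"item-{i}" for i in range(5))


def write_with_count(out, factory):
    """Write the item count first, then the items, using two passes."""
    count = sum(1 for _ in factory())  # first pass: count only, nothing is stored
    out.write(f"{count}\n")
    for item in factory():             # second pass: regenerated from scratch
        out.write(f"{item}\n")
    return count


buf = io.StringIO()  # stands in for a real output file
write_with_count(buf, make_items)
```

The cost is that every item is generated twice, so this only pays off when producing an item is cheap compared with holding all of them in memory.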

Alternatively, you could write out a dummy size to your file, write the items, and then reopen the file for modification and correct the size in the header.
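A minimal sketch of the placeholder idea for a binary file follows. The layout is an assumption made up for the example (an 8-byte little-endian count followed by 8-byte floats), and instead of closing and reopening the file it seeks back on the same handle, which has the same effect on any seekable file.

```python
import io
import struct

HEADER = struct.Struct("<Q")  # assumed header: 8-byte little-endian unsigned count
ITEM = struct.Struct("<d")    # assumed item encoding: 8-byte float


def write_items_binary(f, items):
    """Write a dummy count, stream the items, then patch the real count in."""
    start = f.tell()
    f.write(HEADER.pack(0))  # placeholder, same size as the final value
    count = 0
    for value in items:
        f.write(ITEM.pack(value))
        count += 1
    end = f.tell()
    f.seek(start)
    f.write(HEADER.pack(count))  # overwrite the placeholder in place
    f.seek(end)                  # leave the position at the end of the data
    return count


buf = io.BytesIO()  # with a real file, open(path, "wb") would play this role
write_items_binary(buf, (i / 2 for i in range(3)))
```

The generator is consumed exactly once and nothing is kept in memory beyond the running count.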

If your file is a binary format, this could work quite well, since the number of bytes for the size is the same regardless of what the actual size is. If it is a text format, it is possible that you would have to add some extra length to the file if you weren't able to pad the dummy size to cover all cases. See this question for a discussion on inserting and rewriting in a text file using Python.
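For the text case, one way to avoid inserting bytes later is to reserve a fixed-width field for the count. The sketch below assumes the file format tolerates a zero-padded number on the first line and that `WIDTH` digits are enough for any count you will ever see; both are assumptions, and `write_items_text` is a hypothetical helper.

```python
import io

WIDTH = 12  # assumed upper bound on the number of digits in the count


def write_items_text(f, items):
    """Reserve a fixed-width count line, stream the items, then fill the count in."""
    start = f.tell()
    f.write("0" * WIDTH + "\n")  # dummy size, padded to the full field width
    count = 0
    for item in items:
        f.write(f"{item}\n")
        count += 1
    end = f.tell()
    if len(str(count)) > WIDTH:
        raise ValueError("count does not fit in the reserved header field")
    f.seek(start)
    f.write(f"{count:0{WIDTH}d}")  # same length as the placeholder, so nothing shifts
    f.seek(end)
    return count


buf = io.StringIO()  # stands in for a file opened in text mode
write_items_text(buf, iter(["a", "b", "c"]))
```

If the format cannot accept padding, the count line changes length and the rest of the file has to be rewritten, which is the situation the linked question discusses.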
