Python shelve module memory consumption
I have been assigned the task of reading a .txt file, a log of various events, and writing some of those events into a dictionary.
The problem is that the file can sometimes exceed 3 GB in size, which means the dictionary becomes too big to fit into main memory. The shelve module seems like a good way to solve this. However, since I will be constantly modifying the dictionary, I must open the shelf with the writeback
option enabled. This is where I am concerned: the documentation says writeback slows down the read/write process and uses more memory, but I am unable to find figures on how badly speed and memory are affected.
Can anyone clarify by how much the read/write speed and memory are affected, so that I can decide whether to use the writeback option or sacrifice some readability for code efficiency?
Thank you
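For reference, the writeback behavior being asked about can be shown with a minimal stdlib sketch (the file path and data here are placeholders, not from the original post):

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "events")

# Without writeback: db["errors"] returns a fresh unpickled copy,
# so mutating it in place is silently lost.
with shelve.open(path) as db:
    db["errors"] = []
    db["errors"].append("disk full")  # mutates a throwaway copy

with shelve.open(path) as db:
    without_wb = db["errors"]         # still []

# With writeback=True: every accessed entry is cached in memory and
# flushed back on sync()/close(), so the mutation persists -- at the
# cost of RAM proportional to everything touched since the last sync().
with shelve.open(path, writeback=True) as db:
    db["errors"].append("disk full")

with shelve.open(path) as db:
    with_wb = db["errors"]            # ["disk full"]
```

The memory cost is exactly that cache: for a multi-gigabyte log, any entry read or written stays in RAM until `sync()` is called, which is why writeback can reintroduce the memory problem shelve was meant to avoid.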
For a database of this size, shelve really is the wrong tool. If you do not need a highly available client/server architecture, and you just want to convert your .txt file into a locally accessible database, you really should be using ZODB.
If you need something highly available, you will of course need to switch to a formal "NoSQL" database, of which there are many to choose from.
Here's a simple example of how to convert your shelve database to a ZODB database, which should solve your memory usage and performance problems.