Python shelve module memory consumption

Posted on 2024-11-09 15:12:48

I have been assigned the task of reading a .txt file which is a log of various events and writing some of those events into a dictionary.

The problem is that the file is sometimes larger than 3 GB, which means the dictionary becomes too big to fit into main memory. shelve seems like a good way to solve this problem. However, since I will be constantly modifying the dictionary, I must have the writeback option enabled. This is where I am concerned: the tutorial says that writeback slows down reads and writes and uses more memory, but I cannot find any figures on how much speed and memory are affected.

Can anyone clarify by how much the read/write speed and memory are affected so that I can decide whether to use the writeback option or sacrifice some readability for code efficiency?

Thank you
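
For readers weighing the two options in the question, here is a minimal sketch of both patterns using only the standard library. The function names, the `(key, detail)` event shape, and the `sync_every` parameter are assumptions made for illustration, not part of the original post. Without writeback, each update fetches a copy, changes it, and assigns it back explicitly. With writeback, entries can be mutated in place, and calling `sync()` periodically writes the cached entries to disk and empties the cache, which limits how much the cache can grow between flushes.

```python
import os
import shelve
import tempfile


def tally_without_writeback(path, events):
    """Read-modify-write: fetch a copy, change it, assign it back."""
    with shelve.open(path) as shelf:  # writeback defaults to False
        for key, detail in events:
            entries = shelf.get(key, [])  # unpickles a private copy
            entries.append(detail)        # changes only that copy
            shelf[key] = entries          # explicit assignment persists it


def tally_with_writeback(path, events, sync_every=1000):
    """Mutate entries in place; flush and empty the cache every sync_every updates."""
    with shelve.open(path, writeback=True) as shelf:
        for count, (key, detail) in enumerate(events, start=1):
            shelf.setdefault(key, []).append(detail)
            if count % sync_every == 0:
                shelf.sync()  # writes cached entries and clears the cache


if __name__ == "__main__":
    # Hypothetical sample events standing in for parsed log lines.
    sample = [("login", "alice"), ("error", "disk full"), ("login", "bob")]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "events")
        tally_without_writeback(path, sample)
        with shelve.open(path) as shelf:
            print(sorted(shelf.items()))
```

The first pattern costs one extra line per update but never holds more than the entry being changed. The second reads more naturally, at the price of keeping every accessed entry in memory until the next `sync()` or `close()`.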

Comments (1)

新一帅帅 2024-11-16 15:12:48

For databases this size, shelve really is the wrong tool. If you do not need a highly available client/server architecture, and you just want to convert your TXT file into a local database that you can access like in-memory objects, you really should be using ZODB.

If you need something highly available, you will of course need to switch to a formal "NoSQL" database, of which there are many to choose from.

Here's a simple example of how to convert your shelve database to a ZODB database, which will solve your memory usage / performance problems:

#!/usr/bin/env python3
import shelve
import sys
from optparse import OptionParser

import transaction
import ZODB, ZODB.FileStorage

parser = OptionParser()

# -i is the existing shelve file to read, -o is the ZODB file to create.
parser.add_option("-i", "--input", dest="in_file", default=None, help="original shelve database filename")
parser.add_option("-o", "--output", dest="out_file", default=None, help="new ZODB database filename")

options, args = parser.parse_args()

if not options.in_file or not options.out_file:
    print("Need input and output database filenames")
    sys.exit(1)

# Open the shelf read-only and without writeback, so that the entries
# being copied are not also cached in memory.
db = shelve.open(options.in_file, flag="r")
zstorage = ZODB.FileStorage.FileStorage(options.out_file)
zdb = ZODB.DB(zstorage)
zconnection = zdb.open()
# Note: root() is a single persistent mapping. For a very large number of
# keys, a BTrees.OOBTree stored under the root is the usual choice.
newdb = zconnection.root()

for key, value in db.items():
    print("Copying key: " + str(key))
    newdb[key] = value

transaction.commit()

zconnection.close()
zdb.close()
db.close()
