在 Python 中加载字典最有效的方法是什么?
我有一个包含约 1000 个条目的 Python 字典。将重复调用一个脚本来解析字符串并查看字符串中是否有任何键匹配。如果这样做,它将根据键和值采取一些操作。
这些哪个更快?
1)将字典存储在MySQL数据库中,然后每次调用脚本时读取数据库?
2)将字典存储在Python脚本中并每次导入? (例如,创建一个只包含字典初始化的文件)
3)将字典存储在文本文件中,然后每次都导入它? (使用 cpickle 的平面文本文件或 pickle 序列化数据文件)
只是在这里寻找最佳实践。
I have what is a Python dictionary of ~1000 entries. A script is going to be called repeatedly that will want to parse a string and see if any keys in the string match. If they do, it will take some action based on the key and the value.
Which of these is faster?
1) Store the dictionary in a MySQL database, and then read the database each time the script is called?
2) Store the dictionary in a Python script and import it every time? (e.g. make a file that contains nothing but the dictionary initialization)
3) Store the dictionary in a text file, and import it every time? (either a flat text file or a pickle serialized data file, using cpickle)
Just looking for a best practice here.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您可以创建一个 .py Python 文件,仅将字典分配给一个名称。保存文件。将文件编译为 .pyc,然后在主 Python 脚本需要时将其作为模块加载。
您可以获得以下优势:保留字典的可读文本表示形式以进行维护/调试、加载 .pyc 文件的速度以及标准 Python 的简单性。
You might create a .py Python file that just assigns the dictionary to a name. Save the file. Compile the file to a .pyc then load it as a module when needed by your main Python script.
You get the advantage of keeping a readable textual representation of your dict for maintenance/debug, the speed of loading a .pyc file and the simplicity of it all being standard Python.
我认为将它作为字典存储在 python 文件中并导入到需要它的每个模块中将是可行的方法。你能以编程方式构建它吗?无论哪种方式,该文件实际上只会在每次程序执行时导入一次,因此这应该不是什么大问题,除非您知道由于某种原因在开始时加载一次是不可接受的。
shelve
可能是另一种方式。如果您想使用选项 (3),这可能是这样做的方法。它是基于anydbm
模块构建的。这可能会慢一些,但可以让您避免一次将整个事情都存入内存中。在我看来,1)和3)都是正确的。进行数据库查询的开销很可能会大大减慢访问速度。选项 2) 将使其成为一个简单的字典查找。
I would think that storing it as a dictionary in a python file and importing into every module that needs it would be the way to go. Can you construct it programmatically? Either way, the file will only be actually imported once per program execution so it shouldn't be a big deal unless you know that loading it once at the beginning is unacceptable for some reason.
shelve
could be another way to go here. This would probably be the way to do it if you wanted to go with option (3) It's built on theanydbm
module. This will probably be slower but will allow you to avoid having the whole thing in memory at once.In my opinion, both 1) and 3) are right out. The overhead of making the database queries would, in all probability, slow access down dramatically. Option 2) will make it all a simple dict lookup.
出于测试目的,您还可以加载包含任何内容(在本例中为整数)的字典,如下所示:
For testing purposes, you can also load a dictionary with whatever (integer numbers in this case) like this: