在Python中存储和加载大量链接对象

发布于 2024-11-09 16:58:17 字数 271 浏览 5 评论 0原文

我有很多对象通过保留对其他对象的引用形成一个网络。所有对象(节点)都有一个字典,它是它们的属性。

现在我正在寻找一种快速方法来存储这些对象(在文件中?)并稍后将它们全部重新加载到内存中(我不需要随机访问)。内存中的数据约为 300MB,从 SQL 格式加载需要 40 秒,但我现在想缓存它以便更快地访问。

您建议采用哪种方法?

(尽管我试图用 getstate 搞乱,但由于递归错误,我的 pickle 尝试失败了:(也许还有很快的东西?:))

I have a lot of objects which form a network by keeping references to other objects. All objects (nodes) have a dict which is their properties.

Now I'm looking for a fast way to store these objects (in a file?) and reload all of them into memory later (I don't need random access). The data is about 300MB in memory which takes 40s to load from my SQL format, but I now want to cache it to have faster access.

Which method would you suggest?

(my pickle attempt failed due to recursion errors despite trying to mess around with getstate :( maybe there is something fast anyway? :))

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

手心的海 2024-11-16 16:58:17

泡菜将是我的第一选择。但既然你说它不起作用,你可能想尝试 shelve,甚至认为这不是搁置的主要目的。

实际上,您应该为此使用 Pickle。也许您可以发布一些代码,以便我们查看并找出它不起作用的原因

Pickle would be my first choice. But since you say that it didn't work, you might want to try shelve, even thought it's not shelve's primary purpose.

Really, you should be using Pickle for this. Perhaps you could post some code so that we can take a look and figure out why it doesn't work

浊酒尽余欢 2024-11-16 16:58:17

“pickle 模块会跟踪它已经序列化的对象,以便以后对同一对象的引用不会再次序列化。”所以这是可能的。也许可以使用 sys.setrecursionlimit 来增加递归限制。

使用 Python 的 Pickle / cPickle 达到最大递归深度

"The pickle module keeps track of the objects it has already serialized, so that later references to the same object won’t be serialized again." So it IS possible. Perhaps increase the recursion limit with sys.setrecursionlimit.

Hitting Maximum Recursion Depth Using Python's Pickle / cPickle

内心旳酸楚 2024-11-16 16:58:17

也许您可以设置一些间接层,其中对象实际上保存在另一个字典中,引用另一个对象的对象将存储被引用对象的键,然后通过字典访问该对象。如果存储的键的对象不在字典中,它将从 SQL 数据库加载到字典中,当不再需要它时,可以从字典/内存中删除该对象(可能使用在删除内存中的版本之前更新其在数据库中的状态)。

这样,您不必一次从数据库加载所有数据,并且可以将许多对象缓存在内存中,以便更快地访问这些对象。缺点是每次访问主字典都需要额外的开销。

Perhaps you could set up some layer of indirection where the objects are actually held within, say, another dictionary, and an object referencing another object will store the key of the object being referenced and then access the object through the dictionary. If the object for the stored key is not in the dictionary, it will be loaded into the dictionary from your SQL database, and when it doesn't seem to be needed anymore, the object can be removed from the dictionary/memory (possibly with an update to its state in the database before the version in memory is removed).

This way you don't have to load all the data from your database at once, and can keep a number of the objects cached in memory for quicker access to those. The downside would be the additional overhead required for each access to the main dict.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文