Is there an established memoize-to-disk decorator for Python?

Posted 2024-12-20 04:42:45

I have been searching for a Python module that offers a memoize decorator with the following capabilities:

  • Stores cache on disk to be reused among subsequent program runs.
  • Works for any pickle-able arguments, most importantly numpy arrays.
  • (Bonus) checks whether arguments are mutated in function calls.

I found a few small code snippets for this task and could probably implement one myself, but I would prefer an established package. I also found incpy, but it does not seem to work with the standard Python interpreter.

Ideally, I would like to have something like functools.lru_cache plus cache storage on disk. Can someone point me to a suitable package for this?

Comments (2)

知足的幸福 2024-12-27 04:42:45

I don't know of any memoize decorator that takes care of all that, but you might want to have a look at ZODB. It's a persistence system built on top of pickle that provides some additional features, including the ability to move objects from memory to disk when they aren't being used and to save only objects that have been modified.

Edit: As a follow-up for the comment. A memoization decorator isn't supported out of the box by ZODB. However, I think you can:

  • Implement your own persistent class
  • Use a memoization decorator in the methods you need (any standard implementation should work, but it probably needs to be modified to make sure that the dirty bit is set)

After that, if you create an object of that class and add it to a ZODB database, when you execute one of the memoized methods, the object will be marked as dirty and changes will be saved to the database in the next transaction commit operation.

多情癖 2024-12-27 04:42:45

I realize this is a 2-year-old question, and that this wouldn't count as an "established" decorator, but…

This is simple enough that you really don't need to worry about only using established code. The module's docs link to the source because, in addition to being useful in its own right, it works as sample code.

So, what do you need to add? Add a filename parameter. At run time, pickle.load the file into the cache, using {} if that fails. Add a cache_save function that just uses pickle.dump to write the cache to the file under the lock. Attach that function to wrapper the same as the existing ones (cache_info, etc.).
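
The steps in the paragraph above could look roughly like the sketch below. It is not the modified functools source the answer refers to: for brevity it uses a plain unbounded dict instead of the LRU machinery, and the name disk_memoize is made up for illustration. Keying the cache on the pickled arguments is also my assumption, chosen so that unhashable but picklable arguments can be used; two equal arguments that happen to pickle differently would simply count as a miss.

```python
import functools
import os
import pickle
import tempfile
import threading


def disk_memoize(filename):
    """Hypothetical decorator: memoize a function, cache kept in a pickle file."""
    def decorator(func):
        lock = threading.RLock()
        try:
            # Load a cache left behind by an earlier run, if there is one.
            with open(filename, "rb") as f:
                cache = pickle.load(f)
        except (OSError, EOFError, pickle.UnpicklingError):
            cache = {}  # no usable cache file yet: start empty

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Assumption: the pickled arguments serve as the cache key.
            key = pickle.dumps((args, sorted(kwargs.items())))
            with lock:
                if key in cache:
                    return cache[key]
            result = func(*args, **kwargs)
            with lock:
                cache[key] = result
            return result

        def cache_save():
            # Write the whole cache to the file while holding the lock.
            with lock:
                with open(filename, "wb") as f:
                    pickle.dump(cache, f)

        # Attach the saver to the wrapper, like cache_info on lru_cache.
        wrapper.cache_save = cache_save
        return wrapper
    return decorator


# Usage: the cache file lives in a throwaway temporary directory here.
path = os.path.join(tempfile.mkdtemp(), "square.pickle")


@disk_memoize(path)
def square(x):
    return x * x


square(4)            # computed and cached in memory
square.cache_save()  # the caller decides when to write to disk
```

Only unpickle cache files you wrote yourself, since loading a pickle can execute arbitrary code.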

If you want to save the cache automatically, instead of leaving it up to the caller, that's easy; it's just a matter of when to do so. Any option you come up with—atexit.register, adding a save_every argument so it saves every save_every misses, …—is trivial to implement. In this answer I showed how little work it takes. Or you can get a complete working version (to customize, or to use as-is) on GitHub.
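
As a rough illustration of the two automatic options named above, here is a self-contained sketch that registers the save with atexit.register and also accepts a save_every argument. The decorator name autosave_memoize and the dict-based cache are assumptions for the example, not the code from the linked answer or the GitHub version.

```python
import atexit
import functools
import os
import pickle
import tempfile
import threading


def autosave_memoize(filename, save_every=None):
    """Hypothetical decorator: pickle-backed cache that saves itself."""
    def decorator(func):
        lock = threading.RLock()  # re-entrant: cache_save runs under it too
        try:
            with open(filename, "rb") as f:
                cache = pickle.load(f)
        except (OSError, EOFError, pickle.UnpicklingError):
            cache = {}
        unsaved = 0  # misses since the last save

        def cache_save():
            nonlocal unsaved
            with lock:
                with open(filename, "wb") as f:
                    pickle.dump(cache, f)
                unsaved = 0

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal unsaved
            key = pickle.dumps((args, sorted(kwargs.items())))
            with lock:
                if key in cache:
                    return cache[key]
            result = func(*args, **kwargs)
            with lock:
                cache[key] = result
                unsaved += 1
                # Option 1: save after every `save_every` misses.
                if save_every is not None and unsaved >= save_every:
                    cache_save()
            return result

        # Option 2: save once more when the interpreter exits normally.
        atexit.register(cache_save)
        wrapper.cache_save = cache_save
        return wrapper
    return decorator


path = os.path.join(tempfile.mkdtemp(), "cube.pickle")


@autosave_memoize(path, save_every=2)
def cube(x):
    return x ** 3


cube(2)  # first miss: nothing written yet
cube(3)  # second miss: the cache is written to `path`
```

Note that atexit handlers do not run if the process is killed by a signal or exits through os._exit, so save_every is the safer choice for long-running jobs.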

There are other ways you could extend it—put some save-related statistics (last save time, hits and misses since last save, …) in the cache_info, copy the cache and save it in a background thread instead of saving it inline, etc. But I can't think of anything that would be worth doing that wouldn't be easy.
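
The background-save idea can be sketched as follows. The helper name save_in_background is hypothetical, and writing to a temporary file before renaming is an extra precaution I added rather than something the answer mentions.

```python
import os
import pickle
import tempfile
import threading


def save_in_background(cache, lock, filename):
    """Hypothetical helper: snapshot `cache` under `lock`, write it off-thread."""
    with lock:
        snapshot = dict(cache)  # shallow copy; the cached values are shared

    def write():
        # Write to a temporary name, then rename, so that a reader never
        # sees a half-written cache file.
        tmp = filename + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump(snapshot, f)
        os.replace(tmp, filename)

    thread = threading.Thread(target=write)
    thread.start()
    return thread  # the caller may join() it to wait for the save


cache = {"a": 1}
lock = threading.Lock()
path = os.path.join(tempfile.mkdtemp(), "cache.pickle")

saver = save_in_background(cache, lock, path)
cache["b"] = 2  # a later change does not affect the snapshot being written
saver.join()
```

The lock is held only for the copy, so callers are not blocked while the file is written. Two overlapping saves would share the same temporary name, so a real version would need to serialize them.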
