Is there an established on-disk memoize decorator for Python?
I have been searching for a Python module that offers a memoize decorator with the following capabilities:
- Stores cache on disk to be reused among subsequent program runs.
- Works for any pickle-able arguments, most importantly numpy arrays.
- (Bonus) checks whether arguments are mutated in function calls.
I found a few small code snippets for this task and could probably implement one myself, but I would prefer an established package for it. I also found incpy, but that does not seem to work with the standard Python interpreter.
Ideally, I would like to have something like functools.lru_cache plus cache storage on disk. Can someone point me to a suitable package for this?
I don't know of any memoize decorator that takes care of all that, but you might want to have a look at ZODB. It's a persistence system built on top of pickle that provides some additional features, including being able to move objects from memory to disk when they aren't being used, and the ability to save only objects that have been modified.

Edit: as a follow-up to the comment: a memoization decorator isn't supported out of the box by ZODB. However, I think you can define your memoized functions as methods of a class that derives from persistent.Persistent, storing their results in an attribute of the instance.

After that, if you create an object of that class and add it to a ZODB database, then when you execute one of the memoized methods, the object will be marked as dirty and the changes will be saved to the database in the next transaction commit operation.
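A minimal sketch of that idea, with assumptions: ZODB has no memoize decorator, so the class and method names here (MemoizedComputations, slow_square) are hypothetical, and the import falls back to a plain base class so the sketch stays runnable even without ZODB installed.

```python
# Hypothetical sketch: cache results on a persistent.Persistent subclass so
# ZODB saves them at transaction commit. Not a documented ZODB API.
try:
    import persistent
    Base = persistent.Persistent
except ImportError:  # keep the sketch runnable when ZODB isn't installed
    Base = object


class MemoizedComputations(Base):
    def __init__(self):
        self.cache = {}  # stored as part of the persistent object

    def slow_square(self, n):
        if n not in self.cache:
            self.cache[n] = n * n      # stand-in for an expensive computation
            self._p_changed = True     # mutating a plain dict does not mark the
                                       # object dirty, so flag it explicitly
        return self.cache[n]
```

With ZODB installed, you would put an instance into the database root and call transaction.commit() after use; on the next run the cached results come back from disk.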
I realize this is a 2-year-old question, and that this wouldn't count as an "established" decorator, but…
This is simple enough that you really don't need to worry about only using established code. The module's docs link to the source because, in addition to being useful in its own right, it works as sample code.
So, what do you need to add? Add a filename parameter. At run time, pickle.load the filename into the cache, using {} if it fails. Add a cache_save function that just pickle.dumps the cache to the file under the lock. Attach that function to wrapper, the same as the existing ones (cache_info, etc.).

If you want to save the cache automatically, instead of leaving it up to the caller, that's easy; it's just a matter of when to do so. Any option you come up with (atexit.register, adding a save_every argument so it saves after every save_every misses, and so on) is trivial to implement. In this answer I showed how little work it takes, or you can get a complete working version (to customize, or to use as-is) on GitHub.

There are other ways you could extend it: put some save-related statistics (last save time, hits and misses since last save, etc.) in the cache_info, or copy the cache and save it in a background thread instead of saving it inline. But I can't think of anything that would be worth doing that wouldn't be easy.