Recursively invalidating paths in the Django cache

Posted 2024-08-16 15:04:14

I am deleting a single path from the Django cache like this:

from models                   import Graph
from django.http              import HttpRequest
from django.utils.cache       import get_cache_key
from django.db.models.signals import post_save
from django.core.cache        import cache

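def expire_page(path):
    # Build a bare request for the path so Django can derive its cache key.
    request      = HttpRequest()
    request.path = path
    key          = get_cache_key(request)
    # get_cache_key() returns None when no page is cached for this request.
    if key is not None:
        cache.delete(key)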

def invalidate_cache(sender, instance, **kwargs):
    expire_page(instance.get_absolute_url())

post_save.connect(invalidate_cache, sender = Graph)

This works - but is there a way to delete recursively? My paths look like this:

/graph/123
/graph/123/2009-08-01/2009-10-21

Whenever the graph with id "123" is saved, the cache for both paths needs to be invalidated. Can this be done?

Comments (3)

愚人国度 2024-08-23 15:04:14

You might want to consider a generational caching strategy; it seems to fit what you are trying to accomplish. With the code you have provided, you would store a "generation" number for each absolute URL. For example, you would initialize "/graph/123" with a generation of one, so its cache key becomes something like "/GENERATION/1/graph/123". When you want to expire the cache for that absolute URL, you increment its generation value (to two in this case). The next time someone looks up "/graph/123", the cache key becomes "/GENERATION/2/graph/123". This also solves the problem of expiring all the sub pages, since their keys should be built from the same generation value as "/graph/123".

It's a bit tricky to understand at first, but it is a really elegant caching strategy which, if done correctly, means you never have to actually delete anything from the cache. For more information, here is a presentation on generational caching; it's for Rails, but the concept is the same regardless of language.
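
The idea can be sketched roughly as follows. This is only an illustration, not Django's page-cache machinery: `DictCache` is a hypothetical in-memory stand-in that models just the `get`, `set`, `add` and `incr` methods (Django's cache API exposes methods with the same names), and `root_of`, `versioned_key` and `expire` are made-up helper names. It also assumes every path under a graph starts with `/graph/<id>`.

```python
class DictCache:
    """Hypothetical in-memory stand-in for a cache backend."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def add(self, key, value):
        # Store the value only if the key is not present yet.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def incr(self, key, delta=1):
        if key not in self._data:
            raise ValueError("Key %r not found" % key)
        self._data[key] += delta
        return self._data[key]


cache = DictCache()


def root_of(path):
    """Map '/graph/123/2009-08-01/2009-10-21' to '/graph/123' (assumed URL layout)."""
    parts = path.strip('/').split('/')
    return '/' + '/'.join(parts[:2])


def generation(root):
    """Return the current generation of a root path, starting at 1."""
    counter_key = 'generation:' + root
    cache.add(counter_key, 1)
    return cache.get(counter_key)


def versioned_key(path):
    """Build the cache key for a path from the generation of its root."""
    return '/GENERATION/%d%s' % (generation(root_of(path)), path)


def expire(root):
    """Invalidate the root and every sub page by bumping the generation."""
    generation(root)  # make sure the counter exists before incrementing
    cache.incr('generation:' + root)
```

Pages are stored and looked up under `versioned_key(path)`. After `expire('/graph/123')`, both `/graph/123` and `/graph/123/2009-08-01/2009-10-21` resolve to new keys, so the old entries are simply never read again and are left to age out of the cache.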

作死小能手 2024-08-23 15:04:14

Another option is to use a cache that supports tagging keys and evicting keys by tag. Django's built-in cache API does not have support for this approach. But at least one cache backend (not part of Django proper) does have support.

DiskCache* is an Apache2 licensed disk and file backed cache library, written in pure-Python, and compatible with Django. To use DiskCache in your project simply install it and configure your CACHES setting.

Installation is easy with pip:

$ pip install diskcache

Then configure your CACHES setting:

CACHES = {
    'default': {
        'BACKEND': 'diskcache.DjangoCache',
        'LOCATION': '/tmp/path/to/directory/',
    }
}

The cache set method is extended by an optional tag keyword argument like so:

from django.core.cache import cache

cache.set('/graph/123', value, tag='/graph/123')
cache.set('/graph/123/2009-08-01/2009-10-21', other_value, tag='/graph/123')

diskcache.DjangoCache uses a diskcache.FanoutCache internally. The corresponding FanoutCache is accessible through the _cache attribute and exposes an evict method. To evict all keys tagged with /graph/123 simply:

cache._cache.evict('/graph/123')

Though it may feel awkward to access an underscore-prefixed attribute, the DiskCache project is stable and unlikely to make significant changes to the DjangoCache implementation.

The Django cache benchmarks page has a discussion of alternative cache backends.

* Disclaimer: I am the original author of the DiskCache project.

沉默的熊 2024-08-23 15:04:14

Check out shutil.rmtree() or os.removedirs(). I think the first is probably what you want.

Update based on several comments: Actually, the Django caching mechanism is more general and finer-grained than just using the path for the key (although you can use it at that level). We have some pages that have 7 or 8 separately cached subcomponents that expire based on a range of criteria. Our component cache names reflect the key objects (or object classes) and are used to identify what needs to be invalidated on certain updates.

All of our pages have an overall cache-key based on member/non-member status, but that is only about 95% of the page. The other 5% can change on a per-member basis and so is not cached at all.

How you iterate through your cache to find invalid items is a function of how it's actually stored. If it's files, you can simply use globs and/or recursive directory deletes; if it's some other mechanism, you'll have to use something else.

What my answer, and some of the comments by others, are trying to say is that how you accomplish cache invalidation is intimately tied to how you are using/storing the cache.

Second Update: @andybak: So I guess your comment means that all of my commercial Django sites are going to explode in flames? Thanks for the heads up on that. I notice you did not attempt an answer to the problem.

Knipknap's problem is that he has a group of cache items that appear to be related and in a hierarchy because of their names, but the key-generation logic of the cache mechanism obliterates that name by creating an MD5 hash of the path + vary_on. Since there is no trace of the original path/params you will have to exhaustively guess all possible path/params combinations, hoping you can find the right group. I have other hobbies that are more interesting.
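
A small sketch shows why hashed keys lose the hierarchy. The `hashed_key` helper and its `views.cache` prefix are hypothetical and much simpler than Django's real key format; the only thing it borrows is the use of an MD5 digest of the path.

```python
import hashlib


def hashed_key(path, prefix="views.cache"):
    """Hypothetical, simplified key scheme: prefix plus the MD5 hex digest of the path."""
    digest = hashlib.md5(path.encode("utf-8")).hexdigest()
    return "%s.%s" % (prefix, digest)


parent = hashed_key("/graph/123")
child = hashed_key("/graph/123/2009-08-01/2009-10-21")

# One path is a prefix of the other, but the two digests share no structure,
# so the child key cannot be derived from or matched against the parent key.
print(parent)
print(child)
```

Nothing in the second key reveals that it belongs under "/graph/123", which is why a prefix search over the stored keys cannot find it.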

If you wish to be able to find groups of cached items based on some combination of path and/or parameter values you must either use cache keys that can be pattern matched directly or some system that retains this information for use at search time.
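
As a rough sketch of the second option (a system that retains grouping information), the class below keeps an index from a group name to the keys stored under it. `IndexedCache` and its methods are hypothetical and backed by a plain dict; this is not part of Django, and a real version would have to keep the index in the cache or a database shared by all processes.

```python
from collections import defaultdict


class IndexedCache:
    """Hypothetical dict-backed cache that remembers which keys belong to which group."""

    def __init__(self):
        self._data = {}
        self._groups = defaultdict(set)

    def set(self, key, value, group):
        # Store the value and record the key under its group.
        self._data[key] = value
        self._groups[group].add(key)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete_group(self, group):
        """Delete every key recorded under the group; return how many were removed."""
        keys = self._groups.pop(group, set())
        for key in keys:
            self._data.pop(key, None)
        return len(keys)


pages = IndexedCache()
pages.set("/graph/123", "<html>graph</html>", group="/graph/123")
pages.set("/graph/123/2009-08-01/2009-10-21", "<html>range</html>", group="/graph/123")
pages.set("/graph/456", "<html>other</html>", group="/graph/456")

removed = pages.delete_group("/graph/123")
```

In the question's setup, the `post_save` handler would call something like `delete_group(instance.get_absolute_url())`, so saving the graph removes the page and all of its date-range variants together.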

Because our needs were related to the OP's problem, we took control of template fragment caching, and specifically of key generation, over 2 years ago. That lets us use regexps in a number of ways to efficiently invalidate groups of related cached items. We also made a few other changes:

- added a default timeout and vary_on variable names (resolved at run time), both configurable in settings.py;
- changed the ordering of name & timeout, because it made no sense to always have to override the default timeout in order to name the fragment;
- made the fragment_name resolvable (i.e. it can be a variable) to work better with a multi-level template inheritance scheme;
- and a few other things.

The only reason for my initial answer, which was indeed wrong for current Django, is that I have been using saner cache keys for so long that I literally forgot the simple mechanism we walked away from.
