How to handle expired items (due to TTL) in memcached on a high-load website?
When you have peaks of 600 requests/second, memcached flushing an item because its TTL expired has some pretty negative effects. At almost the same time, 200 threads/processes find the cache empty and fire off a DB request to fill it up again.
What is the best practice to deal with these situations?
p.s. What is the term for this situation? (It would give me a chance to get better Google results on the topic.)
2 Answers
If you have memcached objects which will be needed on a large number of requests (which you imply is the case), then I would look into having a separate process or cron job that regularly recalculates and refreshes these objects. That way they never hit their TTL. It's a common trade-off: you add a little unnecessary load during low-traffic periods to help reduce the load during peaks (the time you probably care most about).
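A minimal sketch of that refresher, assuming Python with the pymemcache client; the key name and `load_expensive_report()` are placeholders for whatever query your 200 threads would otherwise all run on a miss:

```python
# refresh_cache.py -- run from cron (e.g. every minute) or as a small daemon loop.
# Assumes pymemcache (pip install pymemcache); load_expensive_report() stands in
# for the expensive DB query that a cache miss would otherwise trigger.
import json
from pymemcache.client.base import Client

CACHE_KEY = "hot:report"      # placeholder key name
REFRESH_PERIOD = 60           # how often cron runs this script, in seconds
TTL = REFRESH_PERIOD * 5      # generous TTL: only a safety net, never reached in normal operation

def load_expensive_report():
    # Placeholder for the real DB query.
    return {"rows": []}

def main():
    client = Client(("127.0.0.1", 11211))
    data = load_expensive_report()
    # Overwrite unconditionally, so request threads always find a warm cache.
    client.set(CACHE_KEY, json.dumps(data).encode("utf-8"), expire=TTL)

if __name__ == "__main__":
    main()
```

Readers then only ever `get` the key and never recompute it themselves; the TTL exists solely as a fallback if the refresher stops running.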
I found out this is referred to as "stampeding herd" by the memcached folks, and they discuss it here: http://code.google.com/p/memcached/wiki/NewProgrammingTricks#Avoiding_stampeding_herd
My next suggestion was actually going to be using soft cache limits as discussed in the link above.
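For reference, one common reading of that soft-limit idea, sketched under the same assumptions (Python/pymemcache, placeholder names); the exact recipe on the wiki may differ in detail. The value carries its own "soft" expiry inside a longer hard memcached TTL, and only the request that wins an `add()` lock recomputes while everyone else serves the slightly stale copy:

```python
# Soft-TTL read path: stale-but-present data is served while a single winner refreshes it.
import json
import time
from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))

SOFT_TTL = 60    # seconds after which we would like fresh data
HARD_TTL = 600   # memcached TTL: stale data stays available during the recompute
LOCK_TTL = 30    # lock lifetime; bounds how long a crashed recompute blocks others

def recompute():
    # Placeholder for the expensive DB query.
    return {"rows": []}

def get_report(key="hot:report"):
    raw = client.get(key)
    entry = json.loads(raw) if raw else None

    if entry is None or time.time() > entry["soft_expiry"]:
        # add() succeeds for exactly one caller while the lock key is absent.
        if client.add(key + ":lock", b"1", expire=LOCK_TTL, noreply=False):
            data = recompute()
            entry = {"data": data, "soft_expiry": time.time() + SOFT_TTL}
            client.set(key, json.dumps(entry).encode("utf-8"), expire=HARD_TTL)
            client.delete(key + ":lock")
        elif entry is None:
            # Cold cache and someone else holds the lock: fall back to the DB
            # (or wait and retry) rather than returning nothing.
            return recompute()
    return entry["data"]
```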
If your object is expiring because you've set an expiry and that date has passed, there is nothing you can do but increase the expiry time.
If you are worried about stale data, there are a few techniques you could consider:
Consider making the cache the authoritative source for whatever data you are looking at, and make a thread whose job is to keep it fresh. This will make the other threads block on refilling the cache, so it may only make sense if you can
Rather than setting a TTL on the data, change whatever process updates the data to also update the cache. One technique I use for frequently changing data is to do this probabilistically: 10% of the time the data is written, the cache is refreshed. You can tune this to whatever is sensible, depending on how expensive the DB query is and how severe the impact of stale data is.
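A sketch of that probabilistic write-path refresh, under the same assumptions as above (Python/pymemcache); `save_to_db()` and the 10% rate are placeholders you would tune:

```python
# Write path: every write goes to the DB, and roughly 10% of writes also push
# the fresh value into memcached, so the cached copy stays reasonably current
# without every writer paying the cache-update cost.
import json
import random
from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))
REFRESH_PROBABILITY = 0.10   # tune against DB cost and staleness tolerance

def save_to_db(record):
    # Placeholder for the real DB write.
    pass

def write_record(key, record):
    save_to_db(record)
    if random.random() < REFRESH_PROBABILITY:
        # No TTL race here: the cache is refreshed as a side effect of writes,
        # so it can be stored without a short expiry that all readers would
        # see lapse at once.
        client.set(key, json.dumps(record).encode("utf-8"), expire=0)
```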