Caching only frequently accessed data in PHP
I have a news site that receives around 58,000 hits a day across 36,000 articles. Of these 36,000 unique stories, 30,000 get only 1 hit (most of them from search engine crawlers), and only 250 stories get more than 20 impressions. Caching anything other than these 250 articles is a waste of memory.
Currently I am using MySQL Query Cache and xcache for data caching. The table is updated every 5-10 minutes, so Query Cache alone is not much use. How can I detect only the frequently visited pages and cache their data?
Comments (2)
I think you have two options to start with:
1. Don't cache anything by default. Using the Observer/Observable pattern, trigger an event when an article's view count reaches a threshold, and start caching the page from then on.
2. Cache every article at creation.
In both cases, you can use a cron job to purge articles that don't reach your defined threshold.
Either way, you'll probably need some heuristic to determine early enough that an article will need to be cached, and as with any heuristic, you'll get false positives and false negatives.
How well this works depends on how your content is read. If the articles are real-time news, it will probably be efficient, since they generate high traffic quickly.
The main problem with these methods is that you need to store extra information, such as the last access time and the current page view count, which could result in extra queries.
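As a rough illustration of the "don't cache by default, start caching at a threshold" idea, here is a minimal sketch. Everything in it is an assumption made for illustration: `ArrayStore` is a hypothetical in-memory stand-in for a real backend such as xcache, memcached, or Redis, and `fetchArticle`, the key names, the threshold of 20, the one-day counting window, and the 5-minute cache lifetime are all invented. Only small hit counters are kept for cold articles; the article body is stored once the counter reaches the threshold.

```php
<?php
// Hypothetical in-memory store with expiry; a stand-in for xcache,
// memcached, or Redis so that the sketch runs on its own.
final class ArrayStore
{
    private array $items = [];

    // Return the stored value, or null if it is missing or expired.
    public function get(string $key, int $now)
    {
        if (!isset($this->items[$key]) || $this->items[$key]['expires'] <= $now) {
            unset($this->items[$key]);
            return null;
        }
        return $this->items[$key]['value'];
    }

    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }

    // Increment a counter. The window starts at the first hit and is
    // not extended by later hits.
    public function increment(string $key, int $ttl, int $now): int
    {
        $count = $this->get($key, $now);
        if ($count === null) {
            $this->set($key, 1, $ttl, $now);
            return 1;
        }
        $this->items[$key]['value'] = $count + 1;
        return $count + 1;
    }
}

// Serve an article, caching it only after $threshold database loads
// within $window seconds. $loadFromDb is whatever fetches the row.
function fetchArticle(
    ArrayStore $store,
    int $id,
    callable $loadFromDb,
    int $now,
    int $threshold = 20,
    int $window = 86400,
    int $cacheTtl = 300
) {
    $cached = $store->get("article:$id", $now);
    if ($cached !== null) {
        return $cached; // hot article, served from cache
    }

    $article = $loadFromDb($id); // cold path: always hits the database
    $hits = $store->increment("hits:$id", $window, $now);
    if ($hits >= $threshold) {
        // Short lifetime, since the table changes every few minutes.
        $store->set("article:$id", $article, $cacheTtl, $now);
    }
    return $article;
}
```

With a real backend, expiry would normally be handled by the cache itself rather than by passing `$now` around; the explicit timestamp is only there to keep the sketch deterministic. Letting counters expire on their own may also reduce the need for a separate cron purge.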
You could cache only new articles (say, the ones added recently). I'd suggest having a look at memcached and Redis; both are simple yet powerful caching engines.
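A minimal sketch of the "cache only recent articles" check, assuming each article has a publication timestamp. The function name `isRecentEnoughToCache` and the 24-hour cutoff are hypothetical; the right cutoff depends on how quickly traffic to a story drops off.

```php
<?php
// Hypothetical helper: treat an article as cacheable only if it was
// published within the last $maxAgeSeconds (assumed default: 24 hours).
function isRecentEnoughToCache(
    DateTimeImmutable $publishedAt,
    DateTimeImmutable $now,
    int $maxAgeSeconds = 86400
): bool {
    $age = $now->getTimestamp() - $publishedAt->getTimestamp();
    // Articles dated in the future are not cached.
    return $age >= 0 && $age <= $maxAgeSeconds;
}
```

The caller would run this check before writing to memcached or Redis, and could set the cache entry's lifetime to the remaining time before the cutoff so that old stories fall out on their own.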