如何优化收藏夹中新帖子的数量
我想说,我很高兴收到任何回复。我将尝试构建我的文本以便更好地理解。
情况 我经营主题互联网论坛,您可以在菜单中将主题添加为您的最爱,菜单会显示这些主题中的新帖子数量。因此,每次您重新加载页面(转到整个网站的任何位置)时,都会检查您收藏夹中所有主题的新帖子。
问题 这在数据库上当然是相当昂贵的,因为通常有 20-50 个收藏夹,我必须检查数据库是否在这些主题中添加了任何帖子。平均主题有 1000-2000 个帖子。每个用户的每次综合浏览量都会发生这种情况,即每月大约 900,000 次综合浏览量。
可能的解决方案1 我存储每个主题中的帖子总数,并为每个用户存储每个主题的最后查看的帖子数。这可能是最快的,但它有很多缺点,这些缺点是功能性的(删除、过滤帖子等)。
可能的解决方案2 我为每个用户存储每个主题的上次查看帖子的 ID。这是非常好的解决方案,但比前一个方案慢大约十倍。
数据库 我将所有主题的所有帖子存储在一张巨大的表中=数十万个帖子。
问题 我想消除解决方案 1 带来的问题,但我需要保持速度。我想过为每个主题创建一个表并使用解决方案 2,但我不知道它是否有帮助。因此,如果您有任何经验,请告诉我最快的解决方案是什么。
非常感谢。
I would like to say, that I will be glad for any reply. I will try to structure my text for better understanding.
Situation
I run thematic internet forum, where you can add topics as your favorites in the menu and the menu shows the number of new posts in these topics. So everytime you reload the page (go anywhere on the whole site), new posts for all topics in your favorites are checked.
Problem
This is of course quite expensive on DB, because it is common to have 20-50 favorites and I have to check the DB if any post was added in any of these topics. The average topic has 1000-2000 posts. And this happens for every pageview for every user which is approximately 900 000 pageviews per month.
Possible solution 1
I store number of total posts in every topic and I store number of last viewed posts for every topic, for every user. This may be fastest, but it has a lot of disadvantages, which are functional (deleting, filtering of posts, etc.).
Possible solution 2
I store id of last viewed post for every topic, for every user. This is very good solution, but about ten times slower then previous one.
Database
I store all posts for all topics in one huge table = hunderds of thousands of posts.
Question
I would like to remove problems that brings solution 1, but I need to keep the speed. I thought of creating a table for each topic and use Solution 2, but I dont know if it will help. So if you have any experiences please just tell me what would be the fastest solution.
Thank you very much.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
首先:不知道您的架构或数据库系统,但这应该相对简单,假设您保留了用户上次看到的时间的记录(下面示例中的 $DATE_USER_WAS_LAST_SEEN ),并且您的每个帖子可能都与其主题相关联类型的 id,并且您有所有 $FAVOURITE id 的列表。
将为您提供如下输出:
对于这种规模来说,这应该是可接受的速度,您可以通过不使用
IN
并制作长(topic_id = 1 OR topic_id = 2或者 topic_id = etc)
字符串(如果您的数据库没有自动优化这些内容)。其次:不要太担心如何保持这些值最新。人们会将它们用作有新消息的指示器,而不是根据它们做出生活决策,因此缓存每个用户的这些请求(可以在用户自己的记录上,也可以使用某种内存缓存,例如 memcache,如果您熟悉这些) )并每隔 5 分钟左右使缓存过期,这将从根本上减少对数据库的命中
Firsty: No idea about your schema or database system, but this should be relativly simple assuming you keep a record of when your user was last seen ($DATE_USER_WAS_LAST_SEEN in the example below) and each of your posts is presumably associaed with it's topic by some kind of id and you have a list of all the $FAVOURITE ids.
will give you an output like:
This should be an acceptable speed for this kind of scale, you could improve the query by not using
IN
and making a long(topic_id = 1 OR topic_id = 2 OR topic_id = etc)
string if your database doesn't automatically optimise these things.Secondly: Don't worry so much about keeping these values bang up to date. People will use them as an indicator that there are new messages, not base life decisions on them, so cache these requests per user (either on the user's own record or using some kind of in-memory cache like memcache if you are familiar with those) and expire the cache every 5mins or so, this will radically reduce your hits to the database
我想你的帖子 ID 是连续的并且总是递增的。
为您最喜欢的表创建一个至少包含以下字段的表:user_id、topic_id、last_post_id
然后您可以使用这个简单的查询检查新帖子:
这应该运行得相当顺利。
您还必须在用户每次访问主题时更新last_post_id,但这应该非常简单。
I suppose your post ids are sequential and always incrementing.
Create a table for your favorite with at least these fields : user_id, topic_id, last_post_id
You can then check for new posts with this simple query :
This should run pretty smoothly.
You must also update the last_post_id each time a user visit a topic, but this should be pretty straightforward.
如果您在巨大的 all_posts 表上有一个索引(topic_id,post_id),那么执行此查询的成本应该不会太高:
I you have an index (topic_id, post_id) on the huge all_posts table it shouldn't be too costly to do this query: