Database design for post metadata
If you have a table called posts, which contains id, title, content, etc., how would you design a database table to hold metadata about each post, e.g. how many times it was visited today, this week, and so on, so that you can work out which post was most popular this week?
Also, whilst I'm here, what is the difference between indexes and inner joins? I know this is similar to the question "Most efficient database design for a blog (posts and comments)":
I have three tables, post, author_posts and author, and I use an inner join to find the authors of post x. posts contains a field called author_id. However, I have seen some sites that do not use a foreign key on the post itself and instead do something like this:
> post: id, title, name, content
> author_posts: id, post_id, author_id
> author: id, username
Which one would be quicker, and which one should you use?
Comments (2)
The number of visits is a very particular kind of metadata about a post, and the solution for it will probably differ from how you store other kinds of metadata.
Counting "visits" for a particular resource on your website involves making some decisions about just what a "visit" is. Will you have login information to decide who's doing the visiting, or will you just be using an IP address? What amount of elapsed time triggers a new visit from the same IP?
That said, you have two choices.
You can store one record per visit (with the date and time) and count those records when you want to know how many people visited. The advantage is that you can analyze any time period and you have a place to store extra information about the visit. However, if you get a lot of visitors and do the calculations frequently, this can cause a load on your application.
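As a rough illustration of this first option, here is a minimal sketch using Python's built-in sqlite3 module. The table name `post_visits`, its columns, and the helper functions are assumptions made up for the example; they are not part of the original question's schema.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema: one row per visit, timestamp stored as ISO-8601 text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post_visits ("
    " id INTEGER PRIMARY KEY,"
    " post_id INTEGER NOT NULL,"
    " visited_at TEXT NOT NULL)"
)
# Index on the timestamp so a time-range query need not scan every row.
conn.execute("CREATE INDEX idx_visits_time ON post_visits (visited_at, post_id)")


def record_visit(post_id: int, when: datetime) -> None:
    """Store one row for a single visit."""
    conn.execute(
        "INSERT INTO post_visits (post_id, visited_at) VALUES (?, ?)",
        (post_id, when.isoformat(sep=" ", timespec="seconds")),
    )


def most_visited_since(since: datetime, limit: int = 10):
    """Return (post_id, visit_count) pairs, most visited first."""
    return conn.execute(
        "SELECT post_id, COUNT(*) AS visits FROM post_visits"
        " WHERE visited_at >= ?"
        " GROUP BY post_id ORDER BY visits DESC, post_id LIMIT ?",
        (since.isoformat(sep=" ", timespec="seconds"), limit),
    ).fetchall()


now = datetime(2024, 1, 15, 12, 0, 0)
record_visit(1, now - timedelta(days=1))
record_visit(2, now - timedelta(days=2))
record_visit(2, now - timedelta(hours=3))
record_visit(1, now - timedelta(days=30))  # outside the one-week window

top_this_week = most_visited_since(now - timedelta(days=7))
print(top_this_week)
```

Because every visit is kept, the same table can answer "today", "this week", or any other window just by changing the `since` argument, at the cost of counting rows on each query.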
Alternatively, you can keep a counter of how many visits you had in each discrete time period. You should use the smallest time period you might want to analyze (perhaps one counter for each date, or one for each hour of each day, depending on your eventual use). Each counter is a row in a table with the post ID, a date or time descriptor, and an integer count. When you detect a new visit, you simply update the counter for the current time period. You may need to keep a cache of recent visit details to help you decide whether you have a "new" visit or a continuation of an existing one. If a visit is simply a request for the page, you don't need that cache; just update the appropriate counter.
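The counter approach could look something like the following sketch, again with sqlite3. It assumes one counter per post per calendar day; the table name `post_visit_counts` and the helper functions are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical schema: one counter row per (post, day).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post_visit_counts ("
    " post_id INTEGER NOT NULL,"
    " day TEXT NOT NULL,"
    " visits INTEGER NOT NULL DEFAULT 0,"
    " PRIMARY KEY (post_id, day))"
)


def count_visit(post_id: int, day: date) -> None:
    """Create the day's counter row if it is missing, then increment it."""
    key = (post_id, day.isoformat())
    conn.execute(
        "INSERT OR IGNORE INTO post_visit_counts (post_id, day, visits)"
        " VALUES (?, ?, 0)",
        key,
    )
    conn.execute(
        "UPDATE post_visit_counts SET visits = visits + 1"
        " WHERE post_id = ? AND day = ?",
        key,
    )


def most_popular(first_day: date, last_day: date, limit: int = 10):
    """Sum the daily counters over an inclusive date range."""
    return conn.execute(
        "SELECT post_id, SUM(visits) AS total FROM post_visit_counts"
        " WHERE day BETWEEN ? AND ?"
        " GROUP BY post_id ORDER BY total DESC, post_id LIMIT ?",
        (first_day.isoformat(), last_day.isoformat(), limit),
    ).fetchall()


today = date(2024, 1, 15)
for _ in range(3):
    count_visit(1, today)
count_visit(2, today)
count_visit(2, today - timedelta(days=1))
for _ in range(5):
    count_visit(2, today - timedelta(days=10))  # outside this week

popular_this_week = most_popular(today - timedelta(days=6), today)
print(popular_this_week)
```

The weekly query now reads at most seven rows per post instead of one row per visit. The trade-off is that nothing finer than a day can be analyzed later.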
For this sort of thing you're going to get killed on performance if you do the calculations in real time. Use cases like this are a very good candidate for denormalization.
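One possible reading of "denormalization" here is to keep a precomputed `visits_this_week` column on the posts table and refresh it periodically, so that page requests never aggregate anything. The sketch below assumes that interpretation; the column name, the `post_visit_counts` table, and the `refresh_weekly_totals` function are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# posts carries a redundant, precomputed weekly total (the denormalized column).
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " visits_this_week INTEGER NOT NULL DEFAULT 0)"
)
# Assumed source of truth: daily counters per post.
conn.execute(
    "CREATE TABLE post_visit_counts ("
    " post_id INTEGER NOT NULL,"
    " day TEXT NOT NULL,"
    " visits INTEGER NOT NULL,"
    " PRIMARY KEY (post_id, day))"
)
conn.executemany(
    "INSERT INTO posts (id, title) VALUES (?, ?)",
    [(1, "First"), (2, "Second"), (3, "Third")],
)
conn.executemany(
    "INSERT INTO post_visit_counts (post_id, day, visits) VALUES (?, ?, ?)",
    [(1, "2024-01-14", 4), (1, "2024-01-10", 2),
     (2, "2024-01-15", 9), (2, "2024-01-01", 50)],
)


def refresh_weekly_totals(week_start: str) -> None:
    """Recompute the stored weekly total for every post.

    Intended to run from a scheduled job, not on each page request.
    """
    conn.execute(
        "UPDATE posts SET visits_this_week = COALESCE("
        " (SELECT SUM(visits) FROM post_visit_counts"
        "  WHERE post_visit_counts.post_id = posts.id"
        "  AND post_visit_counts.day >= ?), 0)",
        (week_start,),
    )


refresh_weekly_totals("2024-01-09")
# Reading the ranking is now a plain sort on one column.
ranking = conn.execute(
    "SELECT id, title, visits_this_week FROM posts"
    " ORDER BY visits_this_week DESC, id"
).fetchall()
print(ranking)
```

The stored totals are only as fresh as the last refresh, which is usually acceptable for a "most popular this week" list.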