Which database design is better?

Posted on 2024-10-22 15:37:30

Given a site like StackOverflow, would it be better to create a num_comments column that stores how many comments a submission has and update it whenever a comment is made, or to just query the number of rows with the COUNT function? It seems like the latter would be more readable and elegant, but the former would be more efficient. What does SO think?
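To make the two options concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (posts, comments), the index, and the helper functions are assumptions chosen for illustration; only num_comments and COUNT come from the question itself.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        num_comments INTEGER NOT NULL DEFAULT 0  -- option 1: stored counter
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),
        body TEXT NOT NULL
    );
    -- An index on post_id lets COUNT avoid scanning the whole table.
    CREATE INDEX idx_comments_post_id ON comments(post_id);
""")


def count_stored(conn, post_id):
    """Option 1: read the denormalized counter from the post row."""
    row = conn.execute(
        "SELECT num_comments FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return row[0]


def count_computed(conn, post_id):
    """Option 2: count the comment rows at read time."""
    row = conn.execute(
        "SELECT COUNT(*) FROM comments WHERE post_id = ?", (post_id,)
    ).fetchone()
    return row[0]


# Sample data: one post with two comments, counter set by hand.
with conn:
    conn.execute("INSERT INTO posts (id, title) VALUES (1, 'Example post')")
    conn.executemany(
        "INSERT INTO comments (post_id, body) VALUES (?, ?)",
        [(1, "first"), (1, "second")],
    )
    conn.execute("UPDATE posts SET num_comments = 2 WHERE id = 1")
```

Both helpers return the same number as long as the counter is kept in sync; the design question is who pays for that guarantee, the reader or the writer.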


Comments (5)

与风相奔跑 2024-10-29 15:37:30

Definitely use COUNT. Storing the number of comments is a classic denormalization that produces headaches. It is slightly more efficient for retrieval but makes inserts much more expensive: each new comment requires not only an insert into the comments table but also a write lock on the row that holds the comment count.
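The extra write described above might look like the following sketch, again with sqlite3 and a hypothetical posts/comments schema. Note that SQLite locks the whole database for a write rather than a single row, so this shows only the shape of the write path, not the row-level locking of a server database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        num_comments INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL,
        body TEXT NOT NULL
    );
    INSERT INTO posts (id) VALUES (1);
""")


def add_comment_normalized(conn, post_id, body):
    """COUNT approach: a single insert, no other row is touched."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO comments (post_id, body) VALUES (?, ?)",
            (post_id, body),
        )


def add_comment_denormalized(conn, post_id, body):
    """Counter approach: insert plus an update of the post row.

    Both statements run in one transaction so the counter cannot drift,
    and concurrent commenters on the same post contend for that post row.
    """
    with conn:
        conn.execute(
            "INSERT INTO comments (post_id, body) VALUES (?, ?)",
            (post_id, body),
        )
        conn.execute(
            "UPDATE posts SET num_comments = num_comments + 1 WHERE id = ?",
            (post_id,),
        )


add_comment_denormalized(conn, 1, "first")
add_comment_denormalized(conn, 1, "second")
stored = conn.execute("SELECT num_comments FROM posts WHERE id = 1").fetchone()[0]
counted = conn.execute("SELECT COUNT(*) FROM comments WHERE post_id = 1").fetchone()[0]
```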

奈何桥上唱咆哮 2024-10-29 15:37:30

The former is not normalized but will give better read performance (assuming many more reads than writes).

The latter is more normalized, but computing the count on every read requires more resources and will therefore perform worse.

Which is better boils down to the application's requirements.

╰ゝ天使的微笑 2024-10-29 15:37:30

I would suggest counting the comment records. Although the other method would be faster, counting leads to a cleaner database. Adding a count column is a form of data duplication, not to mention that it requires an additional code step and an extra write on every insert.

If you expect millions of comments, then you may want to pick the count-column approach.

不乱于心 2024-10-29 15:37:30

I agree with @Oded. It depends on the app's requirements and on how active the site is. That said, here are my two cents:

  • I would try to avoid the writes that would have to be done by triggers, that is, UPDATEs to the post table whenever a new comment is added.
  • If you are concerned about reporting on the data, don't do that on a transactional system. Create a reporting DB and update it periodically.
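The second point can be sketched roughly as follows. Two in-memory SQLite connections stand in for the transactional system and the reporting DB; the table post_comment_counts and the function refresh_report are hypothetical names, and in practice the refresh would be run by a scheduler rather than called by hand.

```python
import sqlite3

# Stand-in for the transactional (OLTP) database.
oltp = sqlite3.connect(":memory:")
oltp.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, post_id INTEGER NOT NULL, body TEXT NOT NULL)"
)

# Stand-in for the separate reporting database.
reporting = sqlite3.connect(":memory:")
reporting.execute(
    "CREATE TABLE post_comment_counts ("
    "post_id INTEGER PRIMARY KEY, num_comments INTEGER NOT NULL)"
)


def refresh_report(oltp, reporting):
    """Rebuild the per-post counts in the reporting DB from the OLTP data."""
    rows = oltp.execute(
        "SELECT post_id, COUNT(*) FROM comments GROUP BY post_id"
    ).fetchall()
    with reporting:  # replace the old snapshot atomically
        reporting.execute("DELETE FROM post_comment_counts")
        reporting.executemany(
            "INSERT INTO post_comment_counts (post_id, num_comments) VALUES (?, ?)",
            rows,
        )
    return len(rows)


# Comment writes touch only the comments table: no trigger, no counter update.
with oltp:
    oltp.executemany(
        "INSERT INTO comments (post_id, body) VALUES (?, ?)",
        [(1, "a"), (1, "b"), (2, "c")],
    )

posts_refreshed = refresh_report(oltp, reporting)
report = dict(
    reporting.execute("SELECT post_id, num_comments FROM post_comment_counts")
)
```

The trade-off is staleness: the reporting counts are only as fresh as the last refresh.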
终止放荡 2024-10-29 15:37:30

The "correct" way to design this is to use another table, join against it, and COUNT. This is consistent with what database normalization teaches.

The problem with a fully normalized design is that it can struggle to scale. There are only so many ways to skin a cat, so if you have millions of queries per day and a lot of them involve table X, database performance will suffer badly, because the server also has to deal with concurrent writes, transactions, and so on.

A common practice for dealing with this problem is sharding. Sharding has the side effect that the rows of a table are no longer stored in the same physical location, and a primary consequence is that you can no longer simply JOIN: how can you JOIN against half a table and get meaningful results? And obviously, trying to JOIN against all partitions of a table and merge the results would be a cure worse than the disease.

So you can see that not only is the alternative you are considering used in practice to achieve high performance, but there are also even more radical steps that engineers can and do take.

Of course, unless you actually have performance issues, sharding or even denormalizing just makes your life harder for no tangible benefit.
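As a rough illustration of why sharding changes the picture, the sketch below splits comments across several in-memory SQLite databases. The shard count, the routing rule (comment id modulo the number of shards), and all function names are assumptions made for this example. With this routing, a single post's comments are spread over every shard, so counting them means querying each shard and merging the results.

```python
import sqlite3

NUM_SHARDS = 3  # hypothetical shard count


def make_shard():
    """Create one in-memory database standing in for a physical shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments ("
        "id INTEGER PRIMARY KEY, post_id INTEGER NOT NULL, body TEXT NOT NULL)"
    )
    return conn


shards = [make_shard() for _ in range(NUM_SHARDS)]


def shard_for(comment_id):
    """Route a comment to a shard by its id (an assumed routing rule)."""
    return shards[comment_id % NUM_SHARDS]


def add_comment(comment_id, post_id, body):
    shard = shard_for(comment_id)
    with shard:
        shard.execute(
            "INSERT INTO comments (id, post_id, body) VALUES (?, ?, ?)",
            (comment_id, post_id, body),
        )


def count_comments(post_id):
    """Scatter-gather: ask every shard for its partial count, then sum."""
    total = 0
    for shard in shards:
        (partial,) = shard.execute(
            "SELECT COUNT(*) FROM comments WHERE post_id = ?", (post_id,)
        ).fetchone()
        total += partial
    return total


# Odd comment ids go to post 1, even ids to post 2.
for comment_id in range(1, 8):
    add_comment(comment_id, 1 if comment_id % 2 else 2, f"comment {comment_id}")
```

Every call to count_comments costs one query per shard, which is the kind of overhead that makes a stored counter attractive at that scale.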
