Scalable MySQL database for mail-like messaging
Assume we have a popular site. We need to implement mail-like messaging between users.
A typical solution is to use two tables:
Users (user_id)
Messages (message_id, sender_id (references user_id), receiver_id (references user_id), subject, body)
This method has two significant limitations:
- All messages of all users are stored in one table, which puts a high load on that table and decreases overall database performance.
- When someone needs to send a message to several users simultaneously, the message gets copied (recipients_count) times.
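To make the second limitation concrete, here is a minimal sketch of the two-table design. It uses Python's built-in sqlite3 module as a stand-in for MySQL purely so the example runs on its own; the column types and the `send_single_table` helper are assumptions for illustration, not part of the original design.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn_one = sqlite3.connect(":memory:")
conn_one.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE messages (
        message_id  INTEGER PRIMARY KEY,
        sender_id   INTEGER NOT NULL REFERENCES users(user_id),
        receiver_id INTEGER NOT NULL REFERENCES users(user_id),
        subject     TEXT NOT NULL,
        body        TEXT NOT NULL
    );
""")
conn_one.executemany("INSERT INTO users (user_id) VALUES (?)",
                     [(1,), (2,), (3,), (4,)])


def send_single_table(conn, sender_id, receiver_ids, subject, body):
    """Hypothetical helper: one full row (subject and body) per recipient."""
    conn.executemany(
        "INSERT INTO messages (sender_id, receiver_id, subject, body) "
        "VALUES (?, ?, ?, ?)",
        [(sender_id, receiver_id, subject, body) for receiver_id in receiver_ids],
    )


# User 1 sends the same message to three recipients.
send_single_table(conn_one, 1, [2, 3, 4], "Hello", "Same body for everyone")

# The body is stored once per recipient, i.e. recipients_count times.
body_copies = conn_one.execute(
    "SELECT COUNT(*) FROM messages WHERE body = ?",
    ("Same body for everyone",),
).fetchone()[0]
```

With three recipients, `body_copies` counts three stored copies of the same body, which is the duplication described above.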
The other solution uses 3 tables:
Users(user_id)
Sent_messages(sent_id, sender_id (references user_id), subject, body)
Received_messages(sent_id, receiver_id (references user_id), subject, body)
The subject and body of received_messages are copied from the corresponding fields of sent_messages.
This method leads to:
- Denormalizing the database by copying information from one table to another.
- Users being able to delete their sent/received messages without removing them from the receivers/senders.
- Messages taking approximately twice as much space.
- Each table carrying approximately half the load.
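The independent-deletion property can be sketched as follows. As before, sqlite3 stands in for MySQL so the example is self-contained; the column types, the composite primary key, and the `send_three_table` helper are assumptions. Note that `received_messages.sent_id` is deliberately not declared as a foreign key here, so the sender's copy can be deleted without touching the receivers' copies.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn_two = sqlite3.connect(":memory:")
conn_two.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE sent_messages (
        sent_id   INTEGER PRIMARY KEY,
        sender_id INTEGER NOT NULL REFERENCES users(user_id),
        subject   TEXT NOT NULL,
        body      TEXT NOT NULL
    );
    CREATE TABLE received_messages (
        sent_id     INTEGER NOT NULL,  -- plain identifier, no FK (assumption)
        receiver_id INTEGER NOT NULL REFERENCES users(user_id),
        subject     TEXT NOT NULL,
        body        TEXT NOT NULL,
        PRIMARY KEY (sent_id, receiver_id)
    );
""")
conn_two.executemany("INSERT INTO users (user_id) VALUES (?)",
                     [(1,), (2,), (3,)])


def send_three_table(conn, sender_id, receiver_ids, subject, body):
    """Hypothetical helper: one sender copy plus one copy per receiver."""
    cur = conn.execute(
        "INSERT INTO sent_messages (sender_id, subject, body) VALUES (?, ?, ?)",
        (sender_id, subject, body),
    )
    sent_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO received_messages (sent_id, receiver_id, subject, body) "
        "VALUES (?, ?, ?, ?)",
        [(sent_id, receiver_id, subject, body) for receiver_id in receiver_ids],
    )
    return sent_id


sent_id = send_three_table(conn_two, 1, [2, 3], "Hi", "Copied body")

# The sender deletes their own copy from the "sent" folder.
conn_two.execute("DELETE FROM sent_messages WHERE sent_id = ?", (sent_id,))

sender_copies = conn_two.execute(
    "SELECT COUNT(*) FROM sent_messages").fetchone()[0]
receiver_copies = conn_two.execute(
    "SELECT COUNT(*) FROM received_messages WHERE sent_id = ?",
    (sent_id,),
).fetchone()[0]
```

After the delete, the sender's folder is empty while both receivers still hold their copies, at the cost of storing the subject and body in two tables.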
So here are my questions:
- Which of the considered designs is better for high load and scalability? (I think it's the second one.)
- Is there another database design that can handle high load? What is it? What are the limitations?
Thanks!
P.S. I understand that before getting to these scalability issues the site has to be very successful, but I want to know what to do if I need to.
UPDATE
Currently, for the first version, I'll be using the design proposed by Daniel Vassallo. But if everything goes well in the future, the design will be changed to the second one. Thanks to Evert for allaying my apprehension about it.
Comments (2)
You may want to avoid copying the message body multiple times in the case where a message is sent to multiple recipients. Here is another option which you may want to consider:
This model may be more Twitter-like than email-like, but it may come with some advantages.
The rules are that:
These are some of the advantages:
For most applications, if you use an optimistic isolation level with the above model, you should not have performance problems even if you expect messages to be exchanged at a rate of a few per second. If, on the other hand, you expect hundreds or thousands of messages per second, then it may well be worth considering other options.
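One common way to avoid copying the body for each recipient is to store the message once and link recipients through a separate table. The sketch below is only an illustration of that general idea and is not necessarily the schema this answer had in mind; the table names `messages` and `message_recipients`, the column types, and the `send_normalized` helper are all assumptions, and sqlite3 again stands in for MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn_norm = sqlite3.connect(":memory:")
conn_norm.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE messages (
        message_id INTEGER PRIMARY KEY,
        sender_id  INTEGER NOT NULL REFERENCES users(user_id),
        subject    TEXT NOT NULL,
        body       TEXT NOT NULL
    );
    -- Hypothetical link table: one small row per recipient, no body copy.
    CREATE TABLE message_recipients (
        message_id  INTEGER NOT NULL REFERENCES messages(message_id),
        receiver_id INTEGER NOT NULL REFERENCES users(user_id),
        PRIMARY KEY (message_id, receiver_id)
    );
""")
conn_norm.executemany("INSERT INTO users (user_id) VALUES (?)",
                      [(1,), (2,), (3,), (4,)])


def send_normalized(conn, sender_id, receiver_ids, subject, body):
    """Hypothetical helper: store the body once, then link each recipient."""
    cur = conn.execute(
        "INSERT INTO messages (sender_id, subject, body) VALUES (?, ?, ?)",
        (sender_id, subject, body),
    )
    message_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO message_recipients (message_id, receiver_id) VALUES (?, ?)",
        [(message_id, receiver_id) for receiver_id in receiver_ids],
    )
    return message_id


message_id = send_normalized(conn_norm, 1, [2, 3, 4], "Hello", "Stored once")

stored_bodies = conn_norm.execute(
    "SELECT COUNT(*) FROM messages").fetchone()[0]
linked_recipients = conn_norm.execute(
    "SELECT COUNT(*) FROM message_recipients WHERE message_id = ?",
    (message_id,),
).fetchone()[0]

# Inbox query for user 3: join the link table back to the single stored body.
inbox_for_user_3 = conn_norm.execute(
    "SELECT m.subject, m.body FROM message_recipients r "
    "JOIN messages m ON m.message_id = r.message_id "
    "WHERE r.receiver_id = ?",
    (3,),
).fetchall()
```

The trade-off is that reading an inbox needs a join, and per-user deletion has to be handled on the link rows rather than on the message itself.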
In general, database size will not be much of a concern. Speed is much more important.
Therefore, I would be tempted to go for option two. Just like you mentioned, it makes things like deleting messages a lot easier, and I'm pretty sure this is a very common way to do this.