MongoDB Schema Design - Real-time Chat
I'm starting a project which I think will be particularly suited to MongoDB due to the speed and scalability it affords.
The module I'm currently interested in is to do with real-time chat. If I were to do this in a traditional RDBMS, I'd split it out into:
- Channel (A channel has many users)
- User (A user has one channel but many messages)
- Message (A message has a user)
For the purposes of this use case, I'd like to assume that there will typically be 5 channels active at one time, each handling at most 5 messages per second.
Specific queries that need to be fast:
- Fetch new messages (based on a bookmark: a timestamp, maybe, or an incrementing counter?)
- Post a message to a channel
- Verify that a user can post in a channel
Bearing in mind that the document size limit in MongoDB is 4 MB, how would you go about designing the schema? What would yours look like? Are there any gotchas I should watch out for?
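The "fetch new messages" query above can be sketched as a MongoDB-style filter. This is a minimal sketch using plain Python dicts and a tiny evaluator in place of a real server; the field names (`channel_id`, `seq`, `user_id`, `text`) are illustrative assumptions, not anything from the question.

```python
# Sketch of a possible message document shape and the "fetch new
# messages" filter, using plain dicts so no MongoDB server is needed.
# All field names here are illustrative assumptions.

def new_message_filter(channel_id, last_seen_seq):
    """MongoDB-style query: messages in a channel newer than a bookmark."""
    return {"channel_id": channel_id, "seq": {"$gt": last_seen_seq}}

def matches(doc, query):
    """Minimal evaluator for the two operators the filter above uses."""
    for field, cond in query.items():
        if isinstance(cond, dict):  # e.g. {"$gt": 5}
            if not doc.get(field, float("-inf")) > cond["$gt"]:
                return False
        elif doc.get(field) != cond:
            return False
    return True

messages = [
    {"channel_id": "general", "seq": 1, "user_id": "u1", "text": "hi"},
    {"channel_id": "general", "seq": 2, "user_id": "u2", "text": "hello"},
    {"channel_id": "random",  "seq": 3, "user_id": "u1", "text": "off-topic"},
]

# A client whose bookmark is seq 1 only gets messages with seq > 1.
fresh = [m for m in messages if matches(m, new_message_filter("general", 1))]
print([m["text"] for m in fresh])  # -> ['hello']
```

An incrementing `seq` per channel, combined with an index on `(channel_id, seq)`, would make this bookmark-style fetch cheap on the real server.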
I used Redis, NGINX & PHP-FPM for my chat project. Not super elegant, but it does the trick. There are a few pieces to the puzzle.
There is a very simple PHP script that receives client commands and puts them in one massive LIST. It also checks all room LISTs and the user's private LIST to see if there are messages it must deliver. This is polled every few seconds by a client written in jQuery.
There is a command-line PHP script that operates server-side in an infinite loop, 20 times per second, which checks this list and then processes these commands. The script tracks who is in which room, and their permissions, in the script's own memory; this info is not stored in Redis.
Redis has a LIST for each room and a LIST for each user, which operates as a private queue. It also has a counter for each room the user is in. If the user's counter is less than the total number of messages in the room, it gets the difference and sends those messages to the user.
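The per-room counter scheme the answer describes can be sketched in a few lines. Here plain Python lists stand in for the Redis LISTs (`append` for RPUSH, slicing for LRANGE); the names are hypothetical, not the author's actual code.

```python
# Simulation of the per-room counter scheme: each room is a list of
# messages (standing in for a Redis LIST), and each user tracks how
# many messages in each room they have already been sent.

rooms = {"lobby": []}                 # room name -> messages (LIST stand-in)
user_seen = {"alice": {"lobby": 0}}   # user -> room -> delivered count

def post(room, message):
    rooms[room].append(message)       # RPUSH equivalent

def deliver(user, room):
    """Send the user every message they have not yet seen in this room."""
    seen = user_seen[user][room]
    total = len(rooms[room])
    if seen < total:
        fresh = rooms[room][seen:total]  # LRANGE equivalent
        user_seen[user][room] = total    # advance the user's counter
        return fresh
    return []

post("lobby", "first")
post("lobby", "second")
print(deliver("alice", "lobby"))  # -> ['first', 'second']
print(deliver("alice", "lobby"))  # -> []
```

The polling client would just call the `deliver` step every few seconds; since the counter only advances after a successful send, a missed poll loses nothing.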
I haven't been able to stress-test this solution, but at least from my basic benchmarking it could probably handle many thousands of messages per second. There is also the opportunity to port this over to something like Node.js to increase performance. Redis is also maturing and has some interesting features, like the Pub/Sub commands, which might be of interest and would possibly remove the polling on the server side.
I looked into Comet-based solutions, but many of them were complicated, poorly documented, or would require me to learn an entirely new language (e.g. Jetty -> Java, APE -> C), etc. Also, delivery and going through proxies can sometimes be an issue with Comet. So that is why I've stuck with polling.
I imagine you could do something similar with MongoDB: a collection per room, a collection per user, and then a collection that maintains counters. You'll still need to write a back-end daemon or script to handle managing where these messages go. You could also use MongoDB's capped collections, which keep the documents in insertion order and automatically clear old messages out, but maintaining proper counters on top of that could be complicated.
Why use mongo for a messaging system? No matter how fast the static store is (and mongo is very fast), mongo or any other DB, to mimic a message queue you're going to have to use some kind of polling, which is not very scalable or efficient. Granted, you're not doing anything terribly intense, but why not just use the right tool for the right job? Use a messaging system like RabbitMQ or ActiveMQ.
If you must use mongo (maybe you just want to play around with it and this project is a good chance to do that?) I imagine you'll have a collection for users (where each user object has a list of the queues that user listens to). For messages, you could have a collection for each queue, but then you'd have to poll each queue you're interested in for messages. Better would be to have a single collection as a queue, as it's easy in mongo to do "in" queries on a single collection, so it'd be easy to do things like "get all messages newer than X in any queues where queue.name in list [a,b,c]".
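The single-collection-as-queue query the answer describes would look roughly like the filter below. This is a sketch with an assumed document shape (`queue`, `seq` fields) and a tiny in-memory evaluator instead of a live server.

```python
# The "all messages newer than X in any of queues a, b, c" query from
# the answer, expressed as a MongoDB-style filter dict plus a tiny
# evaluator. Field names (queue, seq) are assumptions for illustration.

def build_filter(queue_names, newer_than):
    return {"queue": {"$in": queue_names}, "seq": {"$gt": newer_than}}

def doc_matches(doc, query):
    for field, cond in query.items():
        value = doc.get(field)
        if "$in" in cond and value not in cond["$in"]:
            return False
        if "$gt" in cond and not value > cond["$gt"]:
            return False
    return True

docs = [
    {"queue": "a", "seq": 5,  "text": "old"},
    {"queue": "b", "seq": 9,  "text": "fresh"},
    {"queue": "z", "seq": 10, "text": "other queue"},
]
result = [d["text"] for d in docs
          if doc_matches(d, build_filter(["a", "b", "c"], 7))]
print(result)  # -> ['fresh']
```

A single `$in` + `$gt` filter like this means one poll covers every queue a user listens to, instead of one poll per queue collection.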
You might also consider setting up your collection as a mongo capped collection, which just means that you tell mongo when you set up the collection that your collection should only hold X number of bytes, or X number of items. Adding additional items has First-In, First-Out behavior which is pretty much ideal for a message queue. But again, it's not really a messaging system.
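The FIFO eviction a capped collection gives you can be mimicked with the standard library's `collections.deque(maxlen=...)`; this is only a stand-in to show the behavior, not MongoDB itself.

```python
from collections import deque

# A deque with maxlen mimics a capped collection's key property: once
# the size limit is hit, inserting a new message silently drops the
# oldest one, preserving insertion order for the rest.
capped = deque(maxlen=3)

for seq in range(5):
    capped.append({"seq": seq, "text": f"message {seq}"})

# Only the 3 newest messages survive; 0 and 1 were evicted FIFO-style.
print([m["seq"] for m in capped])  # -> [2, 3, 4]
```

This silent eviction is exactly why the counter bookkeeping gets tricky: a slow consumer's bookmark can point at a message that has already been dropped.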
1) ape-project.org
2) http://code.google.com/p/redis/
3) after you're through all this - you can dump data into MongoDB for logging and store consistent data (users, channels) as well