Designing backend (cloud) servers to avoid a "hotspot" scenario
I'm trying to design a real-time group chat application specifically targeted towards large groups (>50 users) in each chatroom. Not all users will be actively chatting at once, but one can expect many users to simply idle/listen and receive updates as chats come into the chatrooms.
I've worked out a prototype that is not cloud-oriented and am in the process of redesigning for a cloud-based system.
I expect to have one 'redirecting/load balancing' server (LBServer) that redirects to a series of backend 'chat' servers (CServers). When the user requests to join a particular chatroom from the client, the client will connect to the LBServer, and the LBServer will reply with the connection information for the particular CServer that maintains an instance of the chatroom in memory. The client will then disconnect from the LBServer and connect to the CServer. This connection to the CServer persists for as long as the user remains in the chatroom. The CServer is responsible for updating a backend database that logs chatroom state, as well as for notifying the other clients connected to it of updates in the chatroom.
As you can imagine, if too many users are in one chatroom (so that one CServer must maintain persistent connections to all of them), a 'hotspot' scenario will unfold once activity in the room exceeds the rate at which the CServer can process and fan out updates.
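To make the redirect step concrete, here is a minimal sketch of the lookup the LBServer performs, assuming it keeps an in-memory map from chatroom to CServer. The class names, addresses, and the "fewest rooms" placement policy are illustrative assumptions, not part of the original design.

```python
from dataclasses import dataclass, field


@dataclass
class CServer:
    host: str
    port: int
    rooms: set = field(default_factory=set)  # chatrooms held in memory


class LBServer:
    """Maps each chatroom to the CServer that hosts it."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.room_to_server = {}

    def join(self, room):
        """Return (host, port) of the CServer hosting `room`."""
        server = self.room_to_server.get(room)
        if server is None:
            # Hypothetical placement policy: the CServer hosting the fewest rooms.
            server = min(self.servers, key=lambda s: len(s.rooms))
            server.rooms.add(room)
            self.room_to_server[room] = server
        return server.host, server.port


# Example addresses are made up for illustration.
lb = LBServer([CServer("10.0.0.1", 9000), CServer("10.0.0.2", 9000)])
first = lb.join("lobby")   # room is placed on a CServer
again = lb.join("lobby")   # later joiners are sent to the same CServer
other = lb.join("dev")     # a new room goes to the less loaded CServer
```

Because every user of a room is sent to the same CServer, this mapping is exactly where the hotspot comes from: the load of one busy room cannot be spread.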
At this point, I've come up with one naive solution so that my system is still scalable. I could load up a larger CServer instance, copy over the state of the chatroom, and request all users in the 'hot' CServer to reconnect to the new larger instance. I don't believe this is the correct way to handle the scalability of such a system.
I have a few questions:
Given that I wish to preserve the real-time nature of the chat, is there a more appropriate way to design my backend system that avoids having to persist connections to one server instance?
Do I even need to bother isolating each chatroom's processing so that it all occurs on one CServer when I'm already keeping track of state in a database? I want to leave room for users to be able to participate in multiple chatrooms simultaneously. With my current model, the client would have to maintain multiple connections to my cloud (one for each chatroom the user is in). This is bad for the client. As a revision, I'm envisioning clients maintaining connections to 'universal' CServers that listen for changes in the chatrooms their users are currently in and update the clients accordingly.
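The 'universal' CServer idea can be sketched as follows, assuming some shared publish/subscribe layer sits between the CServers. Here `RoomBus` is a hypothetical in-process stand-in for that layer, and all names are illustrative. Each client holds one connection to one CServer, and that CServer subscribes to every room its clients are in.

```python
from collections import defaultdict


class RoomBus:
    """In-process stand-in for a shared pub/sub layer between CServers."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # room -> CServers interested in it

    def subscribe(self, room, server):
        self.subscribers[room].add(server)

    def publish(self, room, sender, text):
        for server in list(self.subscribers[room]):
            server.deliver(room, sender, text)


class UniversalCServer:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.members = defaultdict(set)   # room -> locally connected clients
        self.inboxes = defaultdict(list)  # client -> messages pushed to it

    def join(self, client, room):
        if not self.members[room]:
            # First local member of this room: start listening for it.
            self.bus.subscribe(room, self)
        self.members[room].add(client)

    def send(self, client, room, text):
        self.bus.publish(room, client, text)

    def deliver(self, room, sender, text):
        # Fan out only to clients connected to this CServer.
        for client in self.members[room]:
            if client != sender:
                self.inboxes[client].append((room, sender, text))


bus = RoomBus()
a = UniversalCServer("A", bus)
b = UniversalCServer("B", bus)
a.join("alice", "lobby")
a.join("alice", "dev")      # same connection, second room
b.join("bob", "lobby")
b.join("carol", "dev")
b.send("bob", "lobby", "hi")
b.send("carol", "dev", "build is green")
```

In this arrangement a large room is spread across as many CServers as its members happen to be connected to, so no single instance has to hold every connection for a busy room.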
All feedback and input would be extremely appreciated, and I would be glad to elaborate on anything that is unclear. Thanks.
Comments (3)
You may want to look at how IRC http://en.wikipedia.org/wiki/Internet_Relay_Chat does simple multicasting. IRC apparently has some scalability problems and design issues, but it still usually works amazingly well. Two of the IRC protocol's problems are that 1) the network places a lot of trust in the tree of servers and 2) changes to the network state require splits/joins of clients. For RFCs and other technicalities, see: http://www.irc.org/techie.html
This comparison http://en.wikipedia.org/wiki/Comparison_of_instant_messaging_protocols also includes PSYC (Protocol for SYnchronous Conferencing, which I had never heard of), which supposedly fixes some of the problems with the IRC protocol: http://about.psyc.eu/Introduction
There is also XMPP http://fi.wikipedia.org/wiki/Extensible_Messaging_and_Presence_Protocol, but that does not do multicasting and may be more suitable for MSN/Google Talk style one-to-one chatting, although FB chat (written in Erlang) uses it, as does Google Talk.
And speaking of Erlang http://en.wikipedia.org/wiki/Erlang_(programming_language) – it follows the Actor model http://en.wikipedia.org/wiki/Actor_model for concurrency, which aids with distributability and scalability. Other languages like Scala, Common Lisp, Python and Haskell also support the Actor model, either natively or through libraries.
PS. I don't claim to be an expert on designing a chat protocol; I just happen to know a thing or two about network protocols and have recently done some hobby research on concurrent programming techniques...
I think there are several design considerations here:
Consider having each chatroom appear as a subentry in DNS. For example, chatroom1.chatservice.com. That way, you can load balance across servers and keep things sticky.
Chat servers can communicate with each other over multicast and can decouple message senders from receivers in order to provide scale.
You don't need to maintain a persistent connection, but rather the stream of communication needs to contain a token that can handle routing duties.
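One way to read the routing-token suggestion is sketched below, under stated assumptions: each message carries a signed token naming the user and room, so any front-end server can verify it and forward the message without a persistent connection. The token format, the `SECRET` key, the server host names, and the hash-based room placement are all hypothetical choices for illustration, not something the answer specifies.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical key shared by all servers
SERVERS = [              # hypothetical backend host names
    "cs1.chatservice.com",
    "cs2.chatservice.com",
    "cs3.chatservice.com",
]


def issue_token(user, room):
    """Create a token that names the user and room and is signed with SECRET."""
    payload = json.dumps({"user": user, "room": room}, sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig


def route(token):
    """Verify the token, then pick the backend for its room."""
    encoded, sig = token.rsplit(".", 1)
    payload = base64.urlsafe_b64decode(encoded)
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("invalid routing token")
    claims = json.loads(payload)
    # Stable hash of the room name, so the same room always maps to the same server.
    digest = hashlib.sha256(claims["room"].encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(SERVERS)
    return claims, SERVERS[index]


token = issue_token("alice", "chatroom1")
claims, server = route(token)
```

A plain modulo over the server list reshuffles most rooms when the list changes; consistent hashing is a common substitute if servers are added or removed often.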
James McGovern
HP
I think you can use a MOM (message-oriented middleware) topic and subscriber model.
You can create topic-based queues, where the topics are nothing but chat rooms.
Users are nothing but subscribers. RabbitMQ/ActiveMQ will be helpful.
You can also use reverse AJAX or server push with the help of DWR.
You can use a cache or an in-memory database like CSQL/SQLite to increase performance.
You can put the users and their chat room mapping in a database.
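As a rough illustration of the last two suggestions, the user-to-chatroom mapping could be kept in an in-memory SQLite database. This is a minimal sketch using Python's standard `sqlite3` module; the table and column names are assumptions made for the example.

```python
import sqlite3

# In-memory database: fast, but contents are lost when the process exits.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE membership ("
    " user_id TEXT NOT NULL,"
    " room_id TEXT NOT NULL,"
    " PRIMARY KEY (user_id, room_id))"
)


def join(user_id, room_id):
    """Record that a user is in a room; joining twice is a no-op."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT OR IGNORE INTO membership (user_id, room_id) VALUES (?, ?)",
            (user_id, room_id),
        )


def rooms_of(user_id):
    rows = db.execute(
        "SELECT room_id FROM membership WHERE user_id = ? ORDER BY room_id",
        (user_id,),
    )
    return [row[0] for row in rows]


def users_in(room_id):
    rows = db.execute(
        "SELECT user_id FROM membership WHERE room_id = ? ORDER BY user_id",
        (room_id,),
    )
    return [row[0] for row in rows]


join("alice", "lobby")
join("alice", "dev")
join("bob", "lobby")
join("alice", "lobby")  # duplicate, ignored
```

A server handling a message for a room would call `users_in` to find the subscribers to notify, and `rooms_of` when a client reconnects and needs its room list restored.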