.NET library for active/passive failover clustering
I want to develop an application that connects to some input sources and processes the messages it reads (think BizTalk in principle, but not as heavy). For performance and reliability I would like to enable horizontal scaling of the service, using shared storage (such as a database) as the message queuing mechanism.
However, threads that access resources such as an email inbox or a disk folder cannot be scaled horizontally. Only one instance may be reading from such an input source at any one time. (Further message-processing business logic can of course reside on multiple nodes.)
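To make the "shared storage as a message queue" idea concrete, here is a minimal sketch of several workers claiming messages from one table. It is written in Python with SQLite purely for illustration; the table name `messages`, the `claimed_by` column, and the function names are all hypothetical, and the same pattern should port to .NET with any ADO.NET provider.

```python
import sqlite3
from typing import Optional, Tuple


def create_queue(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) queue table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " body TEXT NOT NULL,"
        " claimed_by TEXT)"
    )
    conn.commit()


def enqueue(conn: sqlite3.Connection, body: str) -> None:
    """Add one message to the queue."""
    conn.execute("INSERT INTO messages (body) VALUES (?)", (body,))
    conn.commit()


def claim_next(conn: sqlite3.Connection, worker_id: str) -> Optional[Tuple[int, str]]:
    """Claim the oldest unclaimed message for worker_id, or return None."""
    while True:
        row = conn.execute(
            "SELECT id, body FROM messages "
            "WHERE claimed_by IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # queue is empty
        # The conditional UPDATE only succeeds if nobody claimed the row
        # between our SELECT and now, so two workers never get the same message.
        cur = conn.execute(
            "UPDATE messages SET claimed_by = ? "
            "WHERE id = ? AND claimed_by IS NULL",
            (worker_id, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0], row[1]
        # Another worker won the race; loop and try the next message.


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_queue(conn)
    enqueue(conn, "first")
    enqueue(conn, "second")
    print(claim_next(conn, "worker-1"))
    print(claim_next(conn, "worker-2"))
```

The design choice here is the conditional `UPDATE ... WHERE claimed_by IS NULL`: the claim is decided by the database, not by the workers, so processing nodes can be added without coordinating with each other.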
This is a perfect candidate for Active/Passive clustering. One node is considered "Active" and actively connects to the "single-instance" resources (such as email inbox), while others are "Passive". If the "Active" node dies, then the other "Passive" nodes elect a new "Active" node among themselves.
Now the question: is there a .NET library out there that helps implement the usual failover clustering logic (i.e. the necessary heartbeat sending/detection and the "active" node election process)? I don't want to reinvent the wheel.
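For readers unsure what "heartbeat plus election" boils down to, one common simplification is a lease: whichever node holds an unexpired lease in shared storage is "Active", and renewing the lease doubles as the heartbeat. The sketch below is only an illustration of that idea, not an existing library. `LeaseStore` is a hypothetical in-memory stand-in for a single row in the shared database, and the time is passed in explicitly so the behaviour is easy to follow.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str        # id of the node currently considered "Active"
    expires_at: float  # time from which another node may take over


class LeaseStore:
    """Hypothetical stand-in for one row in the shared database."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # plays the role of a DB transaction
        self._lease: Optional[Lease] = None

    def try_acquire(self, node_id: str, now: float, ttl: float) -> bool:
        """Acquire or renew the lease; True means node_id is now Active."""
        with self._lock:
            lease = self._lease
            if lease is None or lease.holder == node_id or lease.expires_at <= now:
                self._lease = Lease(holder=node_id, expires_at=now + ttl)
                return True
            return False


class Node:
    def __init__(self, node_id: str, store: LeaseStore, ttl: float = 10.0) -> None:
        self.node_id = node_id
        self.store = store
        self.ttl = ttl
        self.active = False

    def heartbeat(self, now: float) -> bool:
        """Call periodically, well inside ttl: renews or attempts a takeover."""
        self.active = self.store.try_acquire(self.node_id, now, self.ttl)
        return self.active


if __name__ == "__main__":
    store = LeaseStore()
    a, b = Node("A", store), Node("B", store)
    a.heartbeat(now=0.0)   # A acquires the lease and becomes Active
    b.heartbeat(now=1.0)   # lease is still valid, so B stays Passive
    # ... A dies and stops sending heartbeats ...
    b.heartbeat(now=11.0)  # A's lease has expired, so B takes over
    print(a.active, b.active)
```

In a real deployment the timestamp would need to come from one authority (for example the database server's clock), and a node that fails to renew must stop reading from the single-instance resource immediately; those parts are deliberately left out of this sketch.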
What I can see from the research done already:
- BizTalk Server supports this functionality natively, but I am not using BizTalk as it's too heavy and expensive (but I want to emulate this functionality of it)
- Windows Server supports Failover Clustering (in certain high-end versions like Windows Server 2008 Enterprise or Datacenter), but again this is an expensive solution (as each node would need the expensive license)
- There is a lot of information on how a failover algorithm should work, but I cannot see an open-source implementation anywhere ... (only in commercial products sold at a premium)
I understand that this might be considered advanced and desirable functionality, which is why commercial solutions for it are expensive. This is fine - if there is no open-source implementation or library out there, I will develop one on my own. I just don't want to spend the effort if it already exists.
UPDATE 12/02/2011: Found SAForum (http://www.saforum.org/link/linkshow.asp?link_id=214720), a website that publishes open specifications for service availability. There is also OpenSAF (http://www.opensaf.org/Welcome-to-OpenSAF%E2%84%A2~151213~14944.htm), an open-source C++ implementation of the SAForum specifications. It looks comprehensive, but is very heavy. It will take me a lot of time to wade through the specifications and documentation. It also covers a lot more than just failover, offering specifications for a fully scalable distributed system (notifications, distributed events, locks, cluster management, etc.) ... Still no sign of a .NET implementation anywhere.
Comments (1)
Surely developing this sort of advanced functionality on your own would be more expensive than buying it commercially. Unless your time is being donated to the project, and you have no deadline, I'd rule out writing this yourself.
To get high availability and horizontal scaling you need to write a lot of code. Testing that it works to the level required in a high-availability production environment will also take considerable effort. And even if you did all that, would you trust your own code over Microsoft's, which has accumulated run hours in the gazillions and has been through the multiple versions that all software needs to go through to become mature and stable?
I know you were really asking about open-source libraries, but the same argument applies - would you trust it, is it well tested, is it field proven, and whose butt can you kick when it falls dead?
Update: Well this was a few years ago and I guess I've softened my stance towards the viability of using open source for this sort of mission critical infrastructure, although I still believe having commercial support is essential, and I'd still avoid writing it yourself.
I would put in a plug here for RabbitMQ as a high-availability, highly scalable message bus, for the benefit of others reading this. Commercial support is available, and it's based on open standards (AMQP). Client libraries are available for just about any major platform.