Windows Service - High availability scenarios and design approaches

Posted 2024-08-27 19:28:48

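Suppose I have a standalone Windows service running on a Windows Server machine. How can I make sure it is highly available?

1) What design-level guidelines can you propose?

2) How can it be made highly available in a primary/secondary fashion, e.g. like the clustering solutions currently available on the market?

3) How should cross-cutting concerns be handled in any failover scenario?

If you can think of any other questions, please add them here.

Note: this question is only about Windows and Windows services, so please try to stick to that :)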

沫雨熙 2024-09-03 19:28:48

To at least keep the service running, you can have the Windows Service Control Manager restart it automatically if it crashes (see the Recovery tab in the service's properties). More details, including a batch script that sets these properties, are available here: Restart a Windows service if it crashes (https://serverfault.com/questions/48600/how-can-i-automatically-restart-a-windows-service-if-it-crashes)
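As a rough illustration of setting those recovery options from a script, here is a minimal Python sketch that builds the `sc.exe failure` command line. The service name `MyService`, the 60-second restart delay, the one-day reset period, and the `--apply` switch are all assumptions made for the example, not values from the linked answer.

```python
import subprocess
import sys


def build_recovery_command(service_name, restart_delay_ms=60_000, reset_after_s=86_400):
    """Build an `sc.exe failure` argument list that restarts the service after a crash."""
    # Three entries cover the first, second, and subsequent failures.
    actions = "/".join(f"restart/{restart_delay_ms}" for _ in range(3))
    # sc.exe expects each option name (ending in '=') and its value as separate tokens.
    return [
        "sc.exe", "failure", service_name,
        "reset=", str(reset_after_s),   # seconds without failure before the count resets
        "actions=", actions,
    ]


if __name__ == "__main__":
    command = build_recovery_command("MyService")  # placeholder service name
    print(" ".join(command))
    # Only touch the real service configuration when explicitly asked to,
    # and only on Windows; this needs an elevated (administrator) prompt.
    if sys.platform == "win32" and "--apply" in sys.argv:
        subprocess.run(command, check=True)
```

Run without arguments, the script only prints the command so you can review it before applying it.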

High availability is more than just keeping the service up from the outside. The service itself needs to be built with high availability in mind (i.e. use good programming practices throughout, choose appropriate data structures, and pair every resource acquisition with a release), and the whole system should be stress-tested to ensure that it stays up under expected loads.
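To make "pair every acquisition with a release" concrete, here is a small Python sketch. The `ConnectionPool` class is a hypothetical stand-in for whatever scarce resource your service holds (handles, connections, locks); the point is only that the release happens even when the work in between throws, so a long-running service does not slowly leak.

```python
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool, used only to make leaks visible."""

    def __init__(self, size):
        self._free = [f"conn-{i}" for i in range(size)]
        self.in_use = 0

    def acquire(self):
        if not self._free:
            raise RuntimeError("pool exhausted")
        self.in_use += 1
        return self._free.pop()

    def release(self, conn):
        self._free.append(conn)
        self.in_use -= 1


@contextmanager
def borrowed(pool):
    """Acquire a connection and guarantee it is released, even on error."""
    conn = pool.acquire()
    try:
        yield conn
    finally:
        pool.release(conn)
```

A caller would write `with borrowed(pool) as conn: ...`, and the connection goes back to the pool whether the body returns normally or raises.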

For idempotent commands, intermittent failures (such as locked resources) can be tolerated by re-invoking the command a limited number of times. This allows the service to shield the client from the failure, up to a point. The client should also be coded to anticipate failure. It can handle a service failure in several ways: logging, prompting the user, retrying X times, or logging a fatal error and exiting are all possible handlers, and which one is right for you depends on your requirements. If the service has "conversation state" and it fails hard (i.e. the process is restarted), the client should detect and handle this situation, as it usually means the current conversation state has been lost.
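A bounded retry for idempotent commands might look like the following Python sketch. `TransientError`, the attempt count, and the backoff delays are assumptions chosen for illustration; in a real service you would map them to whatever "try again later" failures your resources actually report.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a locked resource)."""


def call_with_retry(command, attempts=3, base_delay_s=0.1, sleep=time.sleep):
    """Invoke an idempotent command, retrying transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return command()
        except TransientError:
            if attempt == attempts:
                # Out of retries: surface the failure so the caller can log,
                # prompt the user, or give up, depending on its requirements.
                raise
            # Wait a little longer after each failed attempt.
            sleep(base_delay_s * 2 ** (attempt - 1))
```

Only wrap commands that are safe to repeat. Anything else is re-raised immediately because only `TransientError` is caught.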

A single machine is vulnerable to hardware failure, so if you are going to use one, make sure it has redundant components. HDDs are particularly prone to failure, so have at least mirrored drives or a RAID array. PSUs are the next weak point, so redundant PSUs are also worthwhile, as is a UPS.

As to clustering, Windows supports service clustering, and manages clustered services using a Network Name rather than individual computer names. This allows your client to connect to whichever machine is currently running the service instead of a hard-coded name. But unless you take additional measures, this is resource failover: requests are simply directed from one instance of the service to another, and conversation state is usually lost. If your services write to a database, then that should also be clustered, both for reliability and so that changes are available to the entire cluster and not just the local node.
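On the client side, coping with a resource failover could be sketched roughly as below. Everything here is hypothetical: `connect` is a caller-supplied function that opens a connection to the cluster's network name, and the connection object is assumed to expose `open_session()` and `send(session, request)`. The sketch only shows the shape of the idea, which is to reconnect through the same network name and rebuild the conversation state rather than assume it survived.

```python
class SessionLost(Exception):
    """Hypothetical error raised when a service instance does not recognise a session."""


class FailoverAwareClient:
    """Talks to a clustered service by network name and rebuilds its session after failover."""

    def __init__(self, connect, cluster_name):
        self._connect = connect            # callable: cluster_name -> connection (assumed)
        self._cluster_name = cluster_name  # the cluster's network name, not a machine name
        self._conn = None
        self._session = None

    def _ensure_session(self):
        if self._conn is None:
            self._conn = self._connect(self._cluster_name)
            self._session = self._conn.open_session()

    def call(self, request):
        self._ensure_session()
        try:
            return self._conn.send(self._session, request)
        except (ConnectionError, SessionLost):
            # The node we were talking to is gone, or the new node has no memory
            # of the old conversation. Reconnect and start a fresh session once.
            self._conn = None
            self._ensure_session()
            # Re-sending is only safe when the request is idempotent.
            return self._conn.send(self._session, request)
```

If the second attempt also fails, the exception propagates to the caller, which can then apply whatever failure handling suits the application.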

This is really just the tip of the iceberg, but I hope it gives you ideas to get started on further research.

Microsoft Clustering Service (MSCS)
