Distributed network monitoring: how to tell whether a monitored resource has fallen over or the monitor itself is at fault

Posted 2024-09-15 18:31:46

I'm building a system for monitoring several large web sites (resources), using distributed web services controlled by a central controller.

I'm coming to a specific part of the design - the actual reporting of resources that are thought to have fallen over.

My problem is that there is always a chance that the monitor itself is at fault, or has lost its network connection to a resource, while the resource is actually fine. I don't want to report issues that are not really there.

My plan at the moment is that when a monitor encounters a problem with a resource, it asks all the other monitors to check that resource too, and the decision as to whether the resource has really fallen over is then based on the collective results.
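A minimal sketch of that plan in Python, to make the decision step concrete. Everything here is an assumption for illustration: the `Verdict` names, the `decide` function, the 50% `quorum` default, and the convention that a monitor which does not answer at all is recorded as `None`. It is not taken from any particular monitoring product.

```python
from enum import Enum
from typing import Mapping, Optional


class Verdict(Enum):
    UP = "up"
    DOWN = "down"
    INCONCLUSIVE = "inconclusive"


def decide(reports: Mapping[str, Optional[bool]], quorum: float = 0.5) -> Verdict:
    """Combine per-monitor check results for one resource into a verdict.

    reports maps a (hypothetical) monitor name to:
      True  - the monitor reached the resource
      False - the monitor's check failed
      None  - the monitor itself did not answer the controller
    """
    answered = [ok for ok in reports.values() if ok is not None]

    # If half or more of the monitors are silent, the problem may lie with
    # the monitors or the network between them, so do not raise an alarm.
    if len(answered) * 2 <= len(reports):
        return Verdict.INCONCLUSIVE

    failures = answered.count(False)
    # Report the resource as down only if more than `quorum` of the
    # monitors that answered saw the failure.
    if failures / len(answered) > quorum:
        return Verdict.DOWN
    return Verdict.UP
```

With three monitors, a single failing report is outvoted (`decide({"eu": False, "us": True, "asia": True})` gives `Verdict.UP`), while two failing reports out of three give `Verdict.DOWN`. Keeping a separate inconclusive outcome avoids forcing a guess when most monitors are unreachable.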

I'm sure someone out there has more experience with this kind of programming than I do.

Is there a common solution to this type of problem? Is my solution a decent way of looking at this?


Comments (2)

榕城若虚 2024-09-22 18:31:46

Your solution is one of the few pragmatic ones.

There is nothing new under the sun. The IETF Routing Information Protocol wasn't the first attempt at addressing this problem, but it is well documented and works.

Note that there is no optimal (or perfect) solution to the class of problems you are facing. The best you can do with in-band monitoring is to make good guesses about where the fault is. In systems that need very accurate fault information (e.g. the public switched telephone network), a parallel out-of-band monitoring network is established, which itself must necessarily be monitored by humans.
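As an illustration of what a "good guess" might look like in practice, here is a rough Python sketch that flags monitors whose failures are probably their own. The heuristic is an assumption made for this example, not an established algorithm: a monitor is suspect if every check it ran failed while each of those resources was seen up by at least one other monitor. The function name and the shape of `results` are hypothetical.

```python
from typing import Dict, Set


def suspect_monitors(results: Dict[str, Dict[str, bool]]) -> Set[str]:
    """Guess which monitors are themselves faulty or cut off.

    results[monitor][resource] is True if that monitor's check succeeded.
    """
    suspects = set()
    for monitor, checks in results.items():
        # A monitor that reached at least one resource (or checked nothing)
        # is not treated as isolated.
        if not checks or any(checks.values()):
            continue

        # Every check failed. Blame the monitor only if each of those
        # resources was reachable from at least one other monitor.
        seen_up_elsewhere = all(
            any(
                other_checks.get(resource, False)
                for other, other_checks in results.items()
                if other != monitor
            )
            for resource in checks
        )
        if seen_up_elsewhere:
            suspects.add(monitor)
    return suspects
```

This remains a guess: a partial network partition can make a healthy monitor look suspect, which is the limitation of in-band monitoring described above.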

无远思近则忧 2024-09-22 18:31:46

Quis custodiet ipsos custodes? (Who will watch the watchers?) -- Juvenal, "Satires"
