Error monitoring/handling on a web server
We have a web server that we're about to launch a number of applications on. They will all share the database and memcached servers, but each application has its own MySQL database, and all memcached keys are prefixed per application.
Possible scenario:
If a memcached server in our cluster goes down, we want someone (the system admin) to be contacted automatically by email, iPhone push notification, or any other appropriate means.
Suppose we install 150 identical applications for our customers on our servers and a memcached server dies. All 150 applications will discover this individually and contact our system admin, who will most certainly start thinking about a new job where he or she isn't woken up by 150 messages at 4:15 in the morning.
Possible solution:
One idea is to set up an external server for error handling that receives a $_POST or cURL request and stores the error message according to its severity. On receiving an error report it would of course check whether the same memcached server has already been reported as offline; if so, there is no need to spam the system admin with additional reminders...
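To make the idea concrete, here is a minimal sketch of the deduplication step such an error server would need, written in Python for brevity. Everything in it is an assumption for illustration: the `AlertDeduplicator` class, the `quiet_seconds` window, and the `notify` callback (which would wrap whatever email or push mechanism is actually used) are hypothetical names, not part of any existing tool.

```python
import time
from typing import Callable, Dict, Tuple


class AlertDeduplicator:
    """Forwards the first report of each (resource, error) pair and suppresses repeats."""

    def __init__(
        self,
        notify: Callable[[str], None],           # hypothetical hook: email, push, ...
        quiet_seconds: float = 3600.0,           # assumed suppression window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._notify = notify
        self._quiet_seconds = quiet_seconds
        self._clock = clock
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def report(self, app: str, resource: str, error: str) -> bool:
        """Record an error from one application; return True if the admin was notified."""
        key = (resource, error)
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self._quiet_seconds:
            return False  # same incident already reported recently: stay quiet
        self._last_sent[key] = now
        self._notify(f"{resource}: {error} (first reported by {app})")
        return True

    def resolve(self, resource: str, error: str) -> None:
        """Forget an incident so that the next failure alerts again."""
        self._last_sent.pop((resource, error), None)


if __name__ == "__main__":
    sent = []
    dedup = AlertDeduplicator(notify=sent.append)
    # 150 applications all notice the same dead memcached server.
    for i in range(150):
        dedup.report(f"app{i}", "memcached-1", "offline")
    print(len(sent))  # prints 1
```

The key is the pair (resource, error) rather than the reporting application, which is what collapses 150 reports into one message. A real collector would also need persistent storage and an HTTP endpoint in front of it; both are left out here.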
The questions:
- What's a good approach to handling errors like this?
- How do the big players in the industry handle this?
Thanks!
Comments (2)
You might consider using an open source monitoring framework such as Hyperic so you don't need to reinvent the wheel.
Hyperic can monitor many aspects of your system out of the box, and it's pretty easy to plug in new monitoring points. It provides rule-based alerting, and you can configure which types of alerts fire once only until reset and which fire each time the condition occurs.
I have not used it to monitor a PHP app (though I presume it can), but I have used it very successfully to monitor a Java app and its associated MySQL DB.
Well, I think your problem is best solved outside of the application.
You want to monitor physical and software servers/services. I'd recommend something like http://www.nagios.org/ or http://www.opennms.org/. Set it up to watch each memcached server, MySQL server, Apache, etc., and send notifications on state change (down, low resources, recovery, etc.).
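For readers who want to see what "notify on state change" means in practice, here is a rough Python sketch of the principle. It is not how Nagios or OpenNMS are implemented or configured; `tcp_check`, `StateChangeNotifier`, `poll_once`, and the example host names are all made up for illustration, and a plain TCP connect is only a crude stand-in for a real service check.

```python
import socket
from typing import Callable, Dict, Optional, Tuple


def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class StateChangeNotifier:
    """Sends a message only when a service flips between up and down."""

    def __init__(self, notify: Callable[[str], None]) -> None:
        self._notify = notify
        self._state: Dict[str, bool] = {}

    def observe(self, service: str, is_up: bool) -> None:
        previous: Optional[bool] = self._state.get(service)
        self._state[service] = is_up
        if previous is None and is_up:
            return  # first observation and healthy: nothing to report
        if previous == is_up:
            return  # no change since the last poll
        self._notify(f"{service} is {'UP (recovered)' if is_up else 'DOWN'}")


def poll_once(
    services: Dict[str, Tuple[str, int]],
    notifier: StateChangeNotifier,
    check: Callable[[str, int], bool] = tcp_check,
) -> None:
    """Check every service once and feed the results to the notifier."""
    for name, (host, port) in services.items():
        notifier.observe(name, check(host, port))


if __name__ == "__main__":
    # Hypothetical inventory; replace with real hosts and run poll_once on a schedule.
    services = {"memcached-1": ("127.0.0.1", 11211), "mysql-1": ("127.0.0.1", 3306)}
    poll_once(services, StateChangeNotifier(print))
```

Because the check runs in one place instead of inside each of the 150 applications, a dead memcached server produces a single DOWN message and later a single recovery message.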