Sometimes severe bugs (new or reintroduced) persist in production for days or weeks, and customers do not always notify us. The only tools I have now are grep, awk and perl, so I am purely reactive: I only investigate once someone complains.
I want to be proactive and be notified when a given error has occurred a certain number of times within a given time period. But I don't want to be spammed with a notification for every single error.
Are there any lightweight, open-source solutions for a cluster of servers? Email, SMS or RSS is fine. It would also be nice to view reports and trends in a graph, but that is not necessary.
Currently I use Apache Log4J, and I know I can send email alerts with it. But as I said, I don't want to be emailed for every single error. I want the system to have some intelligence about when it should notify me, and I want that intelligence outside of my application code.
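To make the requirement concrete, here is a minimal sketch of the rule described above: alert once when the same error has been seen a set number of times inside a sliding time window, then stay quiet for a cooldown period. The class name `ThresholdAlerter`, its parameters and the error key are hypothetical, chosen only for illustration; this is not an existing tool or a Log4J feature.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class ThresholdAlerter:
    """Signal an alert when one error key occurs `threshold` times within `window`.

    Hypothetical helper for illustration; all names and parameters are assumptions.
    """

    def __init__(self, threshold, window, cooldown):
        self.threshold = threshold
        self.window = window
        self.cooldown = cooldown
        self.events = defaultdict(deque)  # error key -> recent timestamps
        self.last_alert = {}              # error key -> time of the last alert

    def record(self, key, when):
        """Record one occurrence; return True if a notification is due now."""
        recent = self.events[key]
        recent.append(when)
        # Drop occurrences that have fallen out of the sliding window.
        while recent and when - recent[0] > self.window:
            recent.popleft()
        if len(recent) < self.threshold:
            return False
        last = self.last_alert.get(key)
        if last is not None and when - last < self.cooldown:
            return False  # already notified recently, so stay quiet
        self.last_alert[key] = when
        return True


# Usage: the same error once a minute for five minutes.
alerter = ThresholdAlerter(
    threshold=3,
    window=timedelta(minutes=10),
    cooldown=timedelta(hours=1),
)
start = datetime(2024, 1, 1, 12, 0)
fired = [
    alerter.record("NullPointerException", start + timedelta(minutes=m))
    for m in range(5)
]
print(fired)
```

Only the third occurrence returns `True`; the later ones are suppressed by the cooldown, which is the "no spam" part of the requirement. A script like this could read log lines from each server and call whatever mail or SMS mechanism is available when `record` returns `True`.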
Comments (1)
Can you add a job that runs once per day, does all the greps you currently do by hand, and emails you the results? Alternatively, you could send the results to the customer's admin so they can escalate to you.
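A rough sketch of what such a daily job might compute, assuming log lines of the form `<timestamp> <LEVEL> <message>` (the layout, the function names and the sample lines are all assumptions for illustration, not taken from the question). It counts ERROR and FATAL lines and groups messages that differ only in numbers, so the report stays short:

```python
import re
from collections import Counter

# Assumed line layout (hypothetical): "<timestamp> <LEVEL> <message>".
ERROR_LINE = re.compile(r"\b(?:ERROR|FATAL)\b\s+(.*)")


def summarize_errors(lines):
    """Count ERROR/FATAL lines, grouping messages that differ only in numbers."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            # Replace digit runs so "order 17" and "order 42" share one bucket.
            counts[re.sub(r"\d+", "N", match.group(1).strip())] += 1
    return counts


def format_report(counts, min_count=1):
    """Render one 'count  message' line per error seen at least min_count times."""
    rows = [(n, msg) for msg, n in counts.most_common() if n >= min_count]
    return "\n".join(f"{n:>6}  {msg}" for n, msg in rows)


# Made-up sample lines standing in for one day's log file.
sample = [
    "2024-01-01 12:00:00,123 ERROR Failed to load order 17",
    "2024-01-01 12:00:05,456 INFO Request served",
    "2024-01-01 12:01:00,789 ERROR Failed to load order 42",
    "2024-01-01 12:02:00,000 FATAL Database unreachable",
]
report = format_report(summarize_errors(sample))
print(report)
```

In practice the lines would come from the real log files, the script would be scheduled with something like cron, and the report text would be handed to whatever mail mechanism is already in place. Raising `min_count` keeps one-off errors out of the email.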