AWS CloudWatch alarm for fluctuating data points
I am monitoring a Fargate service on ECS and want to know when containers are bouncing a lot (they come up, fail the health check, get killed by ECS, and a new one gets scheduled and does the same).
I replicated the scenario I'm interested in, and using the "Sample count" aggregation for CPUUtilization from ECS I can see this graph:
In an ideal world the value would be 1, but as we can see here, ECS schedules a new container to replace the unhealthy one, that one eventually gets killed too, and we see this bouncing behavior.
I would like to set up a CloudWatch alarm for this, one that fires when the value fluctuates a lot from the ideal value in a short period of time, but I can't quite figure out whether this is possible. Maybe it can be done with some metric math, but I can't quite get it. I also looked into Anomaly Detection, and I think that would work, but it incurs extra cost that I don't think is warranted.
I'm just looking to set off an alarm if the value bounces around multiple y-axis points within, let's say, a 5-minute time frame.
Comments (1)
You can do something like this with metric math:
RUNNING_SUM(ABS(DIFF(myMetric)))
This will return a rolling sum of all the absolute changes in the metric for your period. You can then create an alarm based on this, adjusting the desired period and threshold.
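To get a feel for what that expression measures, here is a minimal local sketch in Python that mimics DIFF, ABS and RUNNING_SUM on a plain list of sample counts. The helper names `diff` and `bounce_score` and the two example series are made up for illustration; this does not call CloudWatch, and it assumes DIFF yields no value for the first datapoint since it has no predecessor.

```python
from itertools import accumulate


def diff(values):
    # Change between each datapoint and the one before it
    # (the first datapoint has no predecessor, so it yields nothing).
    return [b - a for a, b in zip(values, values[1:])]


def bounce_score(values):
    # Running total of the absolute changes, analogous to
    # RUNNING_SUM(ABS(DIFF(myMetric))) over the same datapoints.
    return list(accumulate(abs(d) for d in diff(values)))


# Hypothetical sample counts: a steady service vs. a bouncing one.
steady = [1, 1, 1, 1, 1]
bouncing = [1, 2, 1, 0, 1, 2, 1]

print(bounce_score(steady))    # [0, 0, 0, 0]
print(bounce_score(bouncing))  # [1, 2, 3, 4, 5, 6]
```

A steady series stays at 0 while a bouncing one keeps climbing, so a threshold on the result separates the two. Because the total accumulates over whatever datapoints are in the window, the threshold has to be tuned together with the period.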