AWS CloudWatch alarm for fluctuating data points

Posted on 2025-01-27 00:10:44

I am monitoring a Fargate service on ECS and want to know when containers are bouncing a lot: a task comes up, fails its health check, gets killed by ECS, and a new one is scheduled that does the same.

I reproduced the scenario I'm interested in. Using the "Sample count" statistic for the ECS CPUUtilization metric, I see this graph:
[image: graph of the CPUUtilization sample count over time]

In an ideal world the value would stay at 1. As the graph shows, though, ECS schedules a new container to replace the unhealthy one, that one eventually gets killed too, and we see this bouncing behavior.

I would like to set up a CloudWatch alarm for this, one that fires when the value fluctuates a lot from the ideal value in a short period of time, but I can't quite figure out if this is possible. Maybe it can be done with some metric math, but I can't quite get it to work. I also looked into Anomaly Detection, and I think that would work, but it incurs extra cost that I don't think is warranted.

I'm just looking to set off an alarm if the value bounces around multiple y-axis points within, let's say, a 5-minute time frame.


Comments (1)

请远离我 2025-02-03 00:10:44

You can do something like this with metric math:
RUNNING_SUM(ABS(DIFF(myMetric)))

This will return a rolling sum of all the absolute changes in the metric for your period. You can then create an alarm based on this, adjusting the desired period and threshold.
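
To make the expression above a bit more concrete, here is a rough Python sketch. The first function is only a local stand-in for what RUNNING_SUM(ABS(DIFF(...))) computes on a list of datapoints; it does not call CloudWatch. The second builds keyword arguments in the shape that boto3's `put_metric_alarm` accepts. The alarm name, the 60-second period, the threshold, and the `ClusterName`/`ServiceName` values are all assumptions for illustration, and how CloudWatch windows the running sum inside an alarm should be checked against a real alarm before relying on it.

```python
from typing import Any


def bounce_score(values: list[float]) -> list[float]:
    """Local stand-in for RUNNING_SUM(ABS(DIFF(m))) on one series."""
    # DIFF: change between each datapoint and the one before it.
    diffs = [b - a for a, b in zip(values, values[1:])]
    # ABS then RUNNING_SUM: accumulate the size of every change.
    total = 0.0
    running = []
    for d in diffs:
        total += abs(d)
        running.append(total)
    return running


def build_alarm_kwargs(cluster: str, service: str, threshold: float) -> dict[str, Any]:
    """Keyword arguments shaped for boto3's cloudwatch.put_metric_alarm."""
    return {
        "AlarmName": f"{service}-task-bouncing",  # hypothetical name
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold,  # assumption: tune to your service
        "EvaluationPeriods": 1,
        "TreatMissingData": "notBreaching",
        "Metrics": [
            {
                # Raw metric from the question: sample count of CPUUtilization.
                "Id": "m1",
                "ReturnData": False,
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/ECS",
                        "MetricName": "CPUUtilization",
                        "Dimensions": [
                            {"Name": "ClusterName", "Value": cluster},
                            {"Name": "ServiceName", "Value": service},
                        ],
                    },
                    "Period": 60,  # assumption: 1-minute datapoints
                    "Stat": "SampleCount",
                },
            },
            {
                # The expression from the answer; the alarm watches this one.
                "Id": "e1",
                "Expression": "RUNNING_SUM(ABS(DIFF(m1)))",
                "Label": "bounce score",
                "ReturnData": True,
            },
        ],
    }


if __name__ == "__main__":
    stable = [1, 1, 1, 1, 1]
    bouncing = [1, 2, 1, 3, 1]
    print(bounce_score(stable)[-1])    # 0.0
    print(bounce_score(bouncing)[-1])  # 6.0
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**build_alarm_kwargs("my-cluster", "my-service", 4))`, where the cluster and service names are placeholders. A stable series scores 0, so any threshold above 0 separates it from a bouncing one; the right value depends on how much churn you consider normal.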
