Datadog metric for calculating the percentage of failed requests together with the request count
We have a Datadog monitor that measures the percentage of successful web requests and alerts us when it drops below a threshold. The problem is that it becomes noisy on weekend nights, when we have very few requests and even a single error can push the percentage below the threshold.
Right now the query looks like this:
"query": "sum(last_30m):sum:q.inquiry{success:true}.as_count() / sum:q.inquiry.as_count() * 100 < 80"
It divides the number of successful requests by the total number of requests over the last 30 minutes and alerts if the result is below 80%.
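To make the noise problem concrete, here is a minimal sketch of the same arithmetic in Python. The function names and the sample request counts are hypothetical, chosen only for illustration; the 80% threshold comes from the query above, and the sketch assumes at least one request in the window.

```python
def success_percentage(succeeded: int, total: int) -> float:
    """Mirror the monitor formula: successful / total * 100 (assumes total > 0)."""
    return succeeded / total * 100


def would_alert(succeeded: int, total: int, threshold: float = 80.0) -> bool:
    """The monitor fires when the success percentage is below the threshold."""
    return success_percentage(succeeded, total) < threshold


# Quiet weekend night (hypothetical numbers): 4 requests, 1 failure.
print(would_alert(succeeded=3, total=4))       # one error is enough to fire

# Busy weekday (hypothetical numbers): 1000 requests, 1 failure.
print(would_alert(succeeded=999, total=1000))  # one error is negligible
```

With only four requests in the window, one failure already drops the ratio to 75%, which is why a minimum failure count is needed alongside the percentage.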
Is there a way to use a boolean operation to do something like
above_query && sum(last_30m):sum:q.inquiry{success:false}.as_count() > 3
so that the monitor alerts only when the number of failed requests is also greater than 3?
Comments (1)
Create two separate monitors, one with the percentage threshold and one with the count threshold, then create a composite monitor that triggers only when both of them are alerting.
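As a rough sketch of what that could look like, the Python below only builds the three monitor definitions as dictionaries; it does not call the Datadog API. The two queries are taken from the question. The monitor names, the message text, the helper functions, and the monitor IDs 1111 and 2222 are hypothetical placeholders. In practice the IDs would be whatever Datadog assigns when the two underlying monitors are created.

```python
import json

# Queries taken from the question.
PERCENT_QUERY = (
    "sum(last_30m):sum:q.inquiry{success:true}.as_count() "
    "/ sum:q.inquiry.as_count() * 100 < 80"
)
COUNT_QUERY = "sum(last_30m):sum:q.inquiry{success:false}.as_count() > 3"


def metric_monitor(name: str, query: str) -> dict:
    """Definition for one underlying metric monitor (hypothetical helper)."""
    return {"name": name, "type": "query alert", "query": query}


def composite_monitor(name: str, first_id: int, second_id: int) -> dict:
    """Composite definition: fires only when both referenced monitors alert."""
    return {
        "name": name,
        "type": "composite",
        # A composite query refers to existing monitors by ID.
        "query": f"{first_id} && {second_id}",
        "message": "Low success rate with more than 3 failed requests.",
    }


percent_monitor = metric_monitor("Inquiry success rate below 80%", PERCENT_QUERY)
count_monitor = metric_monitor("More than 3 failed inquiries", COUNT_QUERY)

# 1111 and 2222 are placeholder IDs, assumed for illustration only.
composite = composite_monitor("Inquiry failures (rate and count)", 1111, 2222)

print(json.dumps(composite, indent=2))
```

Keeping the notification message on the composite only, and leaving the two underlying monitors without notification targets, is one way to avoid the noisy percentage alert paging anyone on its own.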