Prometheus: `kube_node_status_condition` join (`*`) with `kube_node_labels` returns many nodes
Using the following Prometheus images inside EKS 1.22:
alertmanager:v0.23.0
kube-state-metrics:v2.4.1
prometheus/node-exporter:v1.3.0
prometheus:v2.34.0
I'm trying to get the labels of the node that is in the NotReady condition. However, when this alert fires I get a Slack notification listing ALL nodes in the cluster, even though only a single node is marked NotReady:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-110-0-102.ec2.internal NotReady <none> 17h v1.22.9-eks-810597c
ip-10-110-1-79.ec2.internal Ready <none> 2d16h v1.22.9-eks-810597c
ip-10-110-3-243.ec2.internal Ready <none> 2d16h v1.22.9-eks-810597c
ip-10-110-4-137.ec2.internal Ready <none> 2d14h v1.22.9-eks-810597c
...
The alert rule:
(kube_node_status_condition{condition="Ready",status=~"unknown|false"} == 1)
* on(node) group_right() kube_node_labels
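One way to narrow down whether the problem is in the PromQL join itself (as opposed to Alertmanager grouping or the Slack notification template) is to evaluate the rule's expression directly against the Prometheus HTTP API and count the series it returns. This is a diagnostic sketch; the Prometheus URL is a placeholder, and it assumes `curl` and `jq` are available.

```shell
# Evaluate the alert expression directly against Prometheus.
# http://localhost:9090 is an assumption -- substitute your Prometheus address
# (e.g. via a `kubectl port-forward` to the Prometheus service).
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=(kube_node_status_condition{condition="Ready",status=~"unknown|false"} == 1) * on(node) group_right() kube_node_labels' \
  | jq '.data.result | length'
```

If this prints `1` while the Slack message still lists every node, the expression is fine and the issue lies in Alertmanager's `group_by` configuration or in a notification template that iterates over all alerts in a group.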
Slack notification shows many entries: