How to measure CPU usage percentage using Prometheus?
I'm trying to use Prometheus measurements to get the percent CPU usage of each microservice running in Kubernetes, in order to optimize CPU resources and limits.
I have a setup where, for each customer, 4 microservices run on the server. Each microservice has its own memory resource and limit and its own CPU resource and limit. To get the average from Prometheus I am using the following query:
avg_over_time(sum(rate(container_cpu_usage_seconds_total{name=~"^k8s_.*", namespace=~"$namespace", container_name!="POD", pod=~"^$Deployment.*$"}[5m]))[24h:5m]) /
avg_over_time(sum(container_spec_cpu_quota{name=~"^k8s_.*", namespace=~"$namespace", container_name!="POD", pod=~"^$Deployment.*$"} / container_spec_cpu_period{name=~"^k8s_.*", namespace=~"$namespace", container_name!="POD", pod=~"^$Deployment.*$"})[24h:5m]) * 100
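As a rough illustration of what this query computes, the sketch below reproduces the arithmetic in Python on made-up samples: `container_spec_cpu_quota / container_spec_cpu_period` yields the CPU limit in cores, and the final value is the ratio of the averaged usage to the averaged limit. The sample values and the helper names `limit_in_cores` and `average_percent_of_limit` are hypothetical and are not taken from the cluster described here.

```python
from statistics import mean


def limit_in_cores(quota_us: float, period_us: float) -> float:
    """CFS quota divided by CFS period gives the CPU limit in cores."""
    return quota_us / period_us


def average_percent_of_limit(usage_cores, limits_cores) -> float:
    """Ratio of average usage to average limit, as a percentage.

    Mirrors avg_over_time(usage) / avg_over_time(limit) * 100.
    """
    return mean(usage_cores) / mean(limits_cores) * 100


# Hypothetical samples: summed rate() values (in cores) at successive steps.
usage_samples = [0.05, 0.10, 0.30, 0.15]
# Hypothetical limit: a quota of 50000 us per 100000 us period = 0.5 cores.
limit_samples = [limit_in_cores(50_000, 100_000)] * len(usage_samples)

percent = average_percent_of_limit(usage_samples, limit_samples)
print(f"{percent:.1f}% of the CPU limit on average")
```

Because the result is an average over the whole window, it need not match a single point-in-time reading taken during that window.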
To check that the value above is correct, I go into each Kubernetes pod and check the CPU usage using the command:
kubectl -n {namespace} top pod {Deployment}
To check the CPU limit I use the command:
kubectl -n {namespace} describe pod {Deployment}
From its output I read the CPU limit.
Then I do the calculation:
CPU usage divided by CPU limit times 100 equals current percent of CPU usage.
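The manual calculation can be sketched as follows. This is a minimal example that assumes both values are read as Kubernetes CPU quantities, either in millicores (for example `12m`) or in whole or fractional cores (for example `0.5`); the readings used here are hypothetical, and `parse_cpu` and `percent_of_limit` are illustrative helper names.

```python
def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as "250m" or "0.5" to cores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return float(quantity[:-1]) / 1000  # millicores to cores
    return float(quantity)


def percent_of_limit(usage: str, limit: str) -> float:
    """CPU usage divided by CPU limit, times 100."""
    return parse_cpu(usage) / parse_cpu(limit) * 100


# Hypothetical readings: usage from `kubectl top pod`, limit from `kubectl describe pod`.
print(round(percent_of_limit("12m", "500m"), 2))
```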
The values I get from the CPU usage and limit in Kubernetes differ from the values I get using the Prometheus query (some of the values are close and some are quite far off).
Here is an example of CPU usage in percent from Prometheus and from Kubernetes:
Customer | Service | Prometheus | Kubernetes |
---|---|---|---|
Customer A | Service 1 | 0.216 | 0.2 |
Customer A | Service 2 | 0.137 | 0.2 |
Customer A | Service 3 | 0.445 | 0.45 |
Customer A | Service 4 | 0.165 | 0.2 |
Customer B | Service 1 | 0.139 | 0.2 |
Customer B | Service 2 | 0.0917 | 0.2 |
Customer B | Service 3 | 0.5739 | 0.5 |
Customer B | Service 4 | 0.0972 | 0.2 |
Does anyone have any comments on whether I am doing the measurements correctly? Is there a mistake in my Prometheus query or in how I get the values from Kubernetes? I want to make sure that I am measuring the percent CPU usage correctly using Prometheus.
Comments (1)
Can you try the following query for one service and modify it according to your requirements:
sum (rate (container_cpu_usage_seconds_total{id="/"}[1m])) / sum (machine_cpu_cores) * 100
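To make the arithmetic behind this expression concrete, here is a simplified sketch: the per-second increase of the CPU-seconds counter is the number of cores in use, and dividing by the number of machine cores gives a percentage. It ignores the counter resets and extrapolation that PromQL's `rate()` handles, and the sample values and function names are hypothetical.

```python
def counter_rate(first_value: float, first_ts: float,
                 last_value: float, last_ts: float) -> float:
    """Per-second increase of a counter between two samples (no reset handling)."""
    return (last_value - first_value) / (last_ts - first_ts)


def node_cpu_percent(cores_in_use: float, machine_cores: int) -> float:
    """Cores in use divided by total cores, times 100."""
    return cores_in_use / machine_cores * 100


# Hypothetical samples of container_cpu_usage_seconds_total{id="/"}, 60 s apart.
cores_in_use = counter_rate(1200.0, 0.0, 1230.0, 60.0)
# Hypothetical node with 4 cores (machine_cpu_cores).
node_percent = node_cpu_percent(cores_in_use, 4)
print(f"{cores_in_use} cores in use, {node_percent}% of the node")
```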
I also track the CPU usage for each pod:
sum (rate (container_cpu_usage_seconds_total{image!=""}[1m])) by (pod_name)
I have a complete kubernetes-prometheus solution on GitHub that may help you with more metrics: https://github.com/camilb/prometheus-kubernetes.
I hope this helps!
The result is pretty much the same as the Windows performance manager. So, for CPU % for running services (tasks, processes):
sum by (process,hostname)(irate(wmi_process_cpu_time_total{scaleset="name", process=~