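Calculating Pod memory utilization
Metrics collected by a standalone docker-cadvisor deployment use container_memory_rss.
Metrics collected by Prometheus running inside k8s use container_memory_usage_bytes or container_memory_working_set_bytes.
Pod memory utilization is simple to compute: divide the memory actually in use by the configured memory limit.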
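cadvisor containers: memory utilization above 80%
The != +Inf comparison drops containers that have no memory limit, because dividing by a limit of 0 yields +Inf.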
sum by(name, id, job, node) (container_memory_rss{image!="",job="ali-prod-executor-cadvisor"}) / sum by(name, id, job, node) (container_spec_memory_limit_bytes{image!="",job="ali-prod-executor-cadvisor"}) * 100 != +Inf > 80
k8s deployment, method 1
avg by(pod_name) (container_memory_usage_bytes{pod_name!="",image!~".*pause-amd64:1031|.*pause-amd64:3.0"} / container_spec_memory_limit_bytes{pod_name!="",image!~".*pause-amd64:1031|.*pause-amd64:3.0"}) * 100 > 90
Method 2 (use the same image and job matchers on both sides of the division)
sum by(pod_name, namespace, job) (container_memory_working_set_bytes{image!="",image!~"xxxxx.com/xxs/pause.+",job!="xxxd-executor-cadvisor"}) / sum by(pod_name, namespace, job) (container_spec_memory_limit_bytes{image!="",image!~"reg.linkdoc-inc.com/ops/pause.+",job!="ali-prod-executor-cadvisor"}) * 100 != +Inf > 90
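The queries above group by pod_name. Kubernetes 1.16 and later expose the labels pod and container instead, so on a newer cluster the same idea might look like the sketch below. It is an assumption-laden variant, not the author's query: it assumes the pause container is labelled container="POD", and it drops containers without a limit by filtering the denominator with > 0 rather than comparing the result against +Inf.

```promql
# Working-set memory as a percentage of the memory limit, per pod.
# Assumes the newer label names (pod, container) and that the pause
# container carries container="POD"; adjust the matchers to your cluster.
sum by (pod, namespace) (
  container_memory_working_set_bytes{container!="", container!="POD", image!=""}
)
  /
sum by (pod, namespace) (
  container_spec_memory_limit_bytes{container!="", container!="POD", image!=""} > 0
)
  * 100 > 90
```

If only some containers in a pod have a limit, the numerator still sums all of them while the denominator sums only the limited ones, so the percentage can be overstated for such pods.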
Container CPU utilization above 90% (the divisor 100000 is the default CFS period in microseconds):
sum by(pod_name, namespace, job) (rate(container_cpu_usage_seconds_total{image!=""}[1m])) / (sum by(pod_name, namespace, job) (container_spec_cpu_quota{image!=""} / 100000)) * 100 > 90
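The query above hard-codes 100000 as the CFS period. cAdvisor also exports container_spec_cpu_period, so a sketch that does not depend on the default period could divide the quota by the exported period instead. This assumes both series are present with identical label sets for every container that has a CPU limit.

```promql
# CPU usage as a percentage of the CPU limit, per pod.
# quota / period gives the number of cores the container may use.
sum by (pod_name, namespace, job) (
  rate(container_cpu_usage_seconds_total{image!=""}[1m])
)
  /
sum by (pod_name, namespace, job) (
  container_spec_cpu_quota{image!=""} / container_spec_cpu_period{image!=""}
)
  * 100 > 90
```

With the default period, this should return the same values as the original query. It only differs on nodes or containers where the period has been changed.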
Container inbound bandwidth above 50 MiB/s
sum by (namespace,job,pod_name) (irate(container_network_receive_bytes_total{image!=""}[3m])) / 1024 /1024 > 50
Container outbound bandwidth above 50 MiB/s
sum by (namespace,job,pod_name) (irate(container_network_transmit_bytes_total{image!=""}[1m])) / 1024 /1024 > 50
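Both bandwidth queries divide bytes per second by 1024 twice, so the threshold of 50 is in MiB/s. If the intended unit is megabits per second instead, a sketch of the inbound query could multiply by 8 and divide by 10^6. It also uses rate rather than irate, which the Prometheus documentation recommends for alerting. Whether Mbit/s is the unit the author meant is an assumption.

```promql
# Inbound traffic per pod in Mbit/s (bytes/s * 8 / 10^6), alerting above 50.
sum by (namespace, job, pod_name) (
  rate(container_network_receive_bytes_total{image!=""}[3m])
) * 8 / 1000 / 1000 > 50
```

For outbound traffic, the same expression could be used with container_network_transmit_bytes_total in place of the receive counter.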