Troubleshoot a Cluster Agent Not Reporting Metrics
If the Cluster Agent does not report metrics for certain containers, pods, or nodes, the cause may be a problem with the Kubernetes Metrics Server. The Cluster Agent cannot report metrics that the Metrics Server does not provide.
To verify that the Metrics Server is sending metrics, enter this command from your cluster's primary node:
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods
If the output of the command does not show metrics for the container, there may be a problem with the Metrics Server. This example shows output from the Metrics Server:
{
  "kind":"PodMetricsList",
  "apiVersion":"metrics.k8s.io/v1beta1",
  "metadata":{
    "selfLink":"/apis/metrics.k8s.io/v1beta1/pods"
  },
  "items":[
    {
      "metadata":{
        "name":"replicaset-test-cjnsc",
        "namespace":"test-qe",
        "selfLink":"/apis/metrics.k8s.io/v1beta1/namespaces/test-qe/pods/replicaset-test-cjnsc",
        "creationTimestamp":"2019-09-23T10:24:46Z"
      },
      "timestamp":"2019-09-23T10:23:38Z",
      "window":"30s",
      "containers":[
        {
          "name":"appagent",
          "usage":{
            "cpu":"1667384n",
            "memory":"258672Ki"
          }
        }
      ]
    }
  ]
}
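To narrow the check to a single pod, the same API can be queried by namespace and pod name, following the path pattern visible in the selfLink field of the example output. The following is a minimal sketch, not an official procedure: the function names pod_metrics_path and check_pod_metrics are hypothetical, and it assumes that kubectl exits with a non-zero status when the Metrics Server has no entry for the pod.

```shell
# Build the Metrics API path for one pod.
# $1 = namespace, $2 = pod name
pod_metrics_path() {
  printf '/apis/metrics.k8s.io/v1beta1/namespaces/%s/pods/%s\n' "$1" "$2"
}

# Hypothetical helper: ask the Metrics Server for a single pod's metrics.
# Assumes kubectl returns a non-zero status if the pod has no metrics.
check_pod_metrics() {
  if kubectl get --raw "$(pod_metrics_path "$1" "$2")"; then
    echo
    echo "Metrics Server returned metrics for $1/$2"
  else
    echo "Metrics Server returned no metrics for $1/$2" >&2
    return 1
  fi
}
```

For example, after loading the functions into your shell, run: check_pod_metrics test-qe replicaset-test-cjnsc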
As the Metrics Server collects metrics from nodes, pods, and containers, it logs any issues that it encounters. To retrieve and view the Metrics Server logs, enter the following command. The default namespace for the Metrics Server is kube-system:
$ kubectl logs <metrics-server-pod-name> -n <metrics-server-namespace> --tail <number-of-log-lines>
For example:
$ kubectl logs metrics-server-6764b987d-mtn7g -n kube-system --tail 20
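If the Metrics Server pod name is not known, it can be looked up by label before fetching the logs. The following is a minimal sketch under stated assumptions: it assumes the Metrics Server pods carry the label k8s-app=metrics-server, which is the label used by the upstream deployment manifest and may differ in your installation. The function name metrics_server_logs is hypothetical.

```shell
# Hypothetical helper: find the first Metrics Server pod and print its recent logs.
# $1 = namespace (default: kube-system), $2 = number of log lines (default: 20)
# Assumption: the pods are labeled k8s-app=metrics-server.
metrics_server_logs() {
  ms_namespace="${1:-kube-system}"
  ms_lines="${2:-20}"

  # "-o name" prints entries such as pod/metrics-server-6764b987d-mtn7g
  ms_pod="$(kubectl get pods -n "$ms_namespace" -l k8s-app=metrics-server -o name | head -n 1)"

  if [ -z "$ms_pod" ]; then
    echo "No Metrics Server pod found in namespace $ms_namespace" >&2
    return 1
  fi

  kubectl logs "$ms_pod" -n "$ms_namespace" --tail "$ms_lines"
}
```

Calling metrics_server_logs with no arguments reads the last 20 lines from the kube-system namespace. Pass a namespace and a line count to override the defaults.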
The Metrics Server logs may reveal why it could not collect metrics. For example:
E0920 11:44:54.204075 1 reststorage.go:147] unable to fetch pod metrics for pod test-qe/replicaset-test-9k7rl: no metrics known for pod
E0920 11:44:54.204080 1 reststorage.go:147] unable to fetch pod metrics for pod test/replicaset1-458-g9n2d: no metrics known for pod
E0920 11:44:54.204089 1 reststorage.go:147] unable to fetch pod metrics for pod kube-system/kube-proxy-t54rc: no metrics known for pod
E0920 11:45:19.188033 1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:ip-111.111.111.111: unable to fetch metrics from Kubelet ip-111.111.111.111 (111.111.111.111): Get https://111.111.111.111:2222/stats/summary/: dial tcp 111.111.111.111:2222: i/o timeout
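When the log is long, it can help to reduce it to the list of pods for which the Metrics Server has no metrics. The following is a minimal sketch, assuming the log lines follow the "no metrics known for pod" format shown in the example. The function name pods_without_metrics is hypothetical.

```shell
# Hypothetical helper: read Metrics Server log lines on standard input and
# print each <namespace>/<pod> reported as "no metrics known for pod", once.
# Lines in any other format are ignored.
pods_without_metrics() {
  sed -n -E 's/.*for pod ([^:]+): no metrics known for pod.*/\1/p' | LC_ALL=C sort -u
}
```

For example: kubectl logs metrics-server-6764b987d-mtn7g -n kube-system | pods_without_metrics

Lines that report other problems, such as the Kubelet i/o timeout in the last example line, are not matched by this filter and still need to be read directly in the log.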