Cluster Metrics

The metrics on the Cluster Agent Dashboard are derived from the Kubernetes API and report information about clusters and pods. For any defined set of namespaces, the Cluster Agent reports events on the following Kubernetes and hardware resources.

Splunk AppDynamics monitors cluster health and Kubernetes objects for:

Cluster Agent

Metric Name: Availability
Description: Availability of the Cluster Agent. This metric helps identify whether the Cluster Agent is down. A value of 100 indicates that the Cluster Agent is active and therefore available.
UI Location: Server > Metric Browser
Metric Path: Cluster Agent|Availability

Cluster Summary Metrics

Metric Name: Error events count
Description: Number of error events
UI Location: Dashboard > Errors
Metric Path: Hardware Resources|Cluster|Error events count

Metric Name: Evicted pods count
Description: Number of evicted pods
UI Location: Pods > Evicted
Metric Path: Hardware Resources|Cluster|Evicted pods count

Metric Name: Eviction threats count
Description: Number of events that represent pod evictions
UI Location: Dashboard > Errors
Metric Path: Hardware Resources|Cluster|Eviction threats count

Metric Name: Image pull errors
Description: Number of image pull errors
UI Location: Dashboard > Issues > Image Issues
Metric Path: Hardware Resources|Cluster|Image pull errors

Metric Name: Image pulls
Description: Number of image pulls
UI Location: Dashboard > Issues > Image Issues
Metric Path: Hardware Resources|Cluster|Image pulls

Metric Name: Info events count
Description: Number of informational events
UI Location: Dashboard > Errors
Metric Path: Hardware Resources|Cluster|Info events count

Metric Name: Pod errors
Description: Number of errors related to pods
UI Location: Dashboard > Issues > Pod Issues
Metric Path: Hardware Resources|Cluster|Pod errors

Metric Name: Pod Kills
Description: Number of pods that were killed
UI Location: Inventory > Pods > Pod Kills
Metric Path: Hardware Resources|Cluster|Pod Kills

Metric Name: Pod restarts
Description: Number of times the pods restarted
UI Location: Dashboard > Issues > Pod Issues
Metric Path: Hardware Resources|Cluster|Pod restarts

Metric Name: Pods Scaledowns
Description: Count of scaledowns. You can scale down your deployments and replica sets.
UI Location: Inventory > Pods > Scaledowns
Metric Path: Hardware Resources|Cluster|Pods Scaledowns

Metric Name: Pods count
Description: Total count of pods
UI Location: Inventory > Pods > Phases > Normal
Metric Path: Hardware Resources|Cluster|Pods count

Metric Name: Pods failed
Description: Number of failed pods
UI Location: Pods > Failed
Metric Path: Hardware Resources|Cluster|Pods failed

Metric Name: Pods pending
Description: Number of pods in a pending state. Pending status normally indicates an issue. See the Kubernetes documentation.
UI Location: Pods > Pending
Metric Path: Hardware Resources|Cluster|Pods pending

Metric Name: Pods running
Description: Number of pods in a running state
UI Location: Pods > running
Metric Path: Hardware Resources|Cluster|Pods running

Metric Name: Pods succeeded
Description: Number of pods in Succeeded phase
UI Location: Dashboard > Pods By Phase
Metric Path: Hardware Resources|Cluster|Pods succeeded

Metric Name: Pods unknown
Description: Number of pods in Unknown state
UI Location: Dashboard > Pods By Phase
Metric Path: Hardware Resources|Cluster|Pods unknown
Metric Name: Pods with Missing Dependencies - Config Maps and Secrets
Description: Number of pods that depend on Config Maps or Secrets that are missing.
UI Location: Inventory > Pods > Missing Dependencies - Config Maps and Secrets
Metric Path: Hardware Resources|Cluster|Pods With Missing Dependencies - Config Maps And Secrets (Pod Metrics for Inventory tab)

Metric Name: Pods with Missing Dependencies - Services
Description: Number of pods that depend on Services that are missing.
UI Location: Inventory > Pods > Missing Dependencies - Services
Metric Path: Hardware Resources|Cluster|Pods With Missing Dependencies (Pod Metrics for Inventory Tab)

Metric Name: Pods with No Limits
Description: Number of pods with no CPU or memory limits set. If you specify limits on the pods that you start, this metric indicates how many pods still have no limit defined. Displays in the Inventory tab, under Pod Metrics.
UI Location: Inventory > Pods > No Limits
Metric Path: Hardware Resources|Cluster|Pods With No Limits

Metric Name: Pods With No Liveness Probe
Description: Number of pods with no liveness probe. If you configured a probe in Kubernetes to monitor liveness, the values display in the Inventory tab, under Pod Metrics.
UI Location: Inventory > Pods > No Probes -Liveness
Metric Path: Hardware Resources|Cluster|Pods With No Liveness Probe

Metric Name: Pods With No Readiness Probe
Description: Number of pods with no readiness probe. If you configured a probe in Kubernetes to monitor readiness, the values display in the Inventory tab, under Pod Metrics.
UI Location: Inventory > Pods > No Probes -Readiness
Metric Path: Hardware Resources|Cluster|Pods With No Readiness Probe

Metric Name: Privileged Pods
Description: Number of privileged pods that run with root access. Displays in the Inventory tab, under Pod Metrics.
UI Location: Inventory > Pods > Privileged
Metric Path: Hardware Resources|Cluster|Privileged Pods

Metric Name: Storage errors
Description: Overall number of errors related to storage for the cluster.
UI Location: Inventory > Pod Metrics
Metric Path: Hardware Resources|Cluster|Storage errors

Metric Name: Storage quota violations
Description: Number of storage quota violations, which occur when the storage quota is exceeded.
UI Location: Inventory > Pod Metrics
Metric Path: Hardware Resources|Cluster|Storage quota violations

CPU

CPU Capacity

Metric Name: Total (MilliCores)
Description: Total CPU capacity for the cluster in MilliCores
UI Location: Cluster Capacity > CPU
Metric Path: Hardware Resources|Cluster|CPU|Capacity|Total (MilliCores)

Metric Name: Used (MilliCores)
Description: CPU capacity already used by the cluster in MilliCores
UI Location: Cluster Capacity > CPU
Metric Path: Hardware Resources|Cluster|CPU|Capacity|Used (MilliCores)

CPU Quota

Metric Name: Limit Used (%)
Description: Percentage of CPU limit quota used
UI Location: Dashboard > Quotas > CPU Limit
Metric Path: Hardware Resources|Cluster|CPU|Quota|Limit Used (%)

Metric Name: Limit Used (MilliCores)
Description: MilliCores value for CPU limit quota used
UI Location: Dashboard > Quotas > CPU Limit
Metric Path: Hardware Resources|Cluster|CPU|Quota|Limit Used (MilliCores)

Metric Name: Request Used (%)
Description: Percentage of CPU request quota used
UI Location: Dashboard > Quotas > CPU Request
Metric Path: Hardware Resources|Cluster|CPU|Quota|Request Used (%)

Metric Name: Request Used (MilliCores)
Description: MilliCores value for CPU request quota used
UI Location: Dashboard > Quotas > CPU Request
Metric Path: Hardware Resources|Cluster|CPU|Quota|Request Used (Millicores)

CPU Utilization

Metric Name: Limit (MilliCores)
Description: Limit of CPU that the pods can use. Only pods belonging to monitored namespaces are used to calculate this metric. If this value is not specified for any pod, the value is calculated as the CPU limit of the node. For example:
  • If the node limit is 24m with five pods without any CPU limit, this value is displayed as 24m.
  • If the node limit is 24m with five pods and each pod has a limit of 5m, this value is displayed as 25m.
UI Location: Dashboard > Utilization > CPU
Metric Path: Hardware Resources|Cluster|CPU|Utilization|Limit (MilliCores)

Metric Name: Request (MilliCores)
Description: CPU, in MilliCores, requested by all the pods in monitored namespaces.
UI Location: Dashboard > Utilization > CPU
Metric Path: Hardware Resources|Cluster|CPU|Utilization|Request (MilliCores)

Metric Name: Used (MilliCores)
Description: Actual CPU that the pods in monitored namespaces are currently using.
UI Location: Dashboard > Utilization > CPU
Metric Path: Hardware Resources|Cluster|CPU|Utilization|Used (MilliCores)
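
The two cases in the Limit (MilliCores) example can be sketched in Python. This is only an illustration, not the Cluster Agent's implementation: the function name `aggregate_cpu_limit` and its parameters are hypothetical, it assumes a single node, and it treats the case where only some pods have a limit as undefined because the description does not cover it.

```python
from typing import Optional, Sequence


def aggregate_cpu_limit(pod_limits_m: Sequence[Optional[int]], node_limit_m: int) -> int:
    """Return the CPU limit (in millicores) for a set of monitored pods.

    pod_limits_m holds one entry per pod; None means the pod has no CPU limit.
    node_limit_m is the CPU limit of the (single, assumed) node.
    """
    specified = [limit for limit in pod_limits_m if limit is not None]
    if not specified:
        # No pod declares a limit: fall back to the node limit.
        return node_limit_m
    if len(specified) != len(pod_limits_m):
        # Mixed case: not described in the documentation, so refuse to guess.
        raise ValueError("only some pods have a CPU limit; behavior is not documented")
    # Every pod declares a limit: the metric is the sum of the pod limits.
    return sum(specified)


# The two documented cases: a 24m node with five pods.
print(aggregate_cpu_limit([None] * 5, node_limit_m=24))
print(aggregate_cpu_limit([5] * 5, node_limit_m=24))
```

The same shape applies to the Memory Utilization Limit (MB) metric, with MB in place of millicores.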

DaemonSets

Metric Name: Count
Description: Number of daemon sets that exist
UI Location: Inventory > Objects > DaemonSets > (Count)
Metric Path: HardwareResources|Cluster|DaemonSets|Count

Metric Name: Nodes Available
Description: Number of nodes that are running and available on the cluster
UI Location: Inventory > Objects > DaemonSets > Available
Metric Path: HardwareResources|Cluster|DaemonSets|Nodes Available

Metric Name: Nodes MissScheduled
Description: Number of nodes that are running, but should not be running
UI Location: Inventory > Objects > DaemonSets > MissScheduled
Metric Path: HardwareResources|Cluster|DaemonSets|Nodes MissScheduled

Metric Name: Nodes Unavailable
Description: Number of nodes that should be running, but are not running
UI Location: Inventory > Objects > DaemonSets > Unavailable
Metric Path: HardwareResources|Cluster|DaemonSets|Nodes Unavailable

Deployments

Metric Name: Count
Description: Number of deployments that exist in the cluster
UI Location: Inventory > Objects > Deployments > (Count)
Metric Path: HardwareResources|Cluster|Deployments|Count

Metric Name: Replicas
Description: Number of pod replicas in the cluster that are not in a terminated state
UI Location: Inventory > Objects > Deployments > Available
Metric Path: HardwareResources|Cluster|Deployments|Replicas

Metric Name: Replicas Unavailable
Description: Total number of unavailable pod replicas across all deployments in the cluster
UI Location: Inventory > Objects > Deployments > Unavailable
Metric Path: HardwareResources|Cluster|Deployments|ReplicasUnavailable

Endpoints

Metric Name: Count
Description: Number of endpoints in the cluster
UI Location: Inventory > Services > Endpoints > Count
Metric Path: HardwareResources|Cluster|Endpoints|Count

Metric Name: Not Ready Address
Description: Total number of not-ready addresses for all the endpoints in the cluster
UI Location: Inventory > Services > Endpoints without ready IP
Metric Path: HardwareResources|Cluster|Endpoints|Not Ready Address

Metric Name: Orphans
Description: Total number of endpoints in the cluster that have neither ready nor not-ready addresses
UI Location: Inventory > Services > Orphan Endpoints with no IP
Metric Path: HardwareResources|Cluster|Endpoints|Orphans

Metric Name: Ready Address
Description: Total number of ready addresses for all the endpoints in the cluster
UI Location: Inventory > Services > Endpoints
Metric Path: HardwareResources|Cluster|Endpoints|Ready Address

Jobs

Metric Name: Count
Description: Total number of jobs in the cluster.
UI Location: Inventory > Objects > Jobs > (Count)
Metric Path: Hardware Resources|Cluster|Jobs|Count

Metric Name: Pods Active
Description: Total number of active pods for all the jobs in the cluster.
UI Location: Inventory > Objects > Jobs > Active
Metric Path: Hardware Resources|Cluster|Jobs|Pods Active

Metric Name: Pods Failed
Description: Total number of pods that reached the Failed phase for all the jobs in the cluster.
UI Location: Inventory > Objects > Jobs > Failed
Metric Path: Hardware Resources|Cluster|Jobs|Pods Failed

Metric Name: Pods Succeeded
Description: Total number of pods that reached the Succeeded phase for all the jobs in the cluster.
UI Location: Inventory > Objects > Jobs > Succeeded
Metric Path: Hardware Resources|Cluster|Jobs|Pods Succeeded

Memory

Memory Capacity

Metric Name: Total (MB)
Description: Total memory capacity for the cluster in MB
UI Location: Dashboard > Cluster > Capacity > Memory
Metric Path: Hardware Resources|Cluster|Memory|Capacity|Total (MB)

Metric Name: Used (MB)
Description: Memory capacity already used by the cluster in MB
UI Location: Dashboard > Cluster > Capacity > Memory
Metric Path: Hardware Resources|Cluster|Memory|Capacity|Used (MB)

Memory Quota

Metric Name: Limit Used (%)
Description: Percentage of memory limit quota used
UI Location: Dashboard > Quotas > Memory Limit
Metric Path: Hardware Resources|Cluster|Memory|Quota|Limit Used (%)

Metric Name: Limit Used (MB)
Description: MB value for memory limit quota used
UI Location: Dashboard > Quotas > Memory Limit
Metric Path: Hardware Resources|Cluster|Memory|Quota|Limit Used (MB)

Metric Name: Request Used (%)
Description: Percentage of memory request quota used
UI Location: Dashboard > Quotas > Memory Request
Metric Path: Hardware Resources|Cluster|Memory|Quota|Request Used (%)

Metric Name: Request Used (MB)
Description: MB value for memory request quota used
UI Location: Dashboard > Quotas > Memory Request
Metric Path: Hardware Resources|Cluster|Memory|Quota|Request Used (MB)

Memory Utilization

Metric Name: Limit (MB)
Description: Limit of memory that the pods can use. Only pods belonging to monitored namespaces are used to calculate this metric. If this value is not specified for any pod, the value is calculated as the memory limit of the node. For example:
  • If the node limit is 24MB with five pods without any memory limit, this value is displayed as 24MB.
  • If the node limit is 24MB with five pods and each pod has a limit of 5MB, this value is displayed as 25MB.
UI Location: Dashboard > Utilization > Memory
Metric Path: Hardware Resources|Cluster|Memory|Utilization|Limit (MB)

Metric Name: Request (MB)
Description: Memory, in MB, requested by all the pods in monitored namespaces.
UI Location: Dashboard > Utilization > Memory
Metric Path: Hardware Resources|Cluster|Memory|Utilization|Request (MB)

Metric Name: Used (MB)
Description: Actual memory that the pods in monitored namespaces are currently using.
UI Location: Dashboard > Utilization > Memory
Metric Path: Hardware Resources|Cluster|Memory|Utilization|Used (MB)

Nodes

Metric Name: Master Count
Description: Number of master nodes in the cluster
UI Location: Inventory > Masters
Metric Path: Hardware Resources|Cluster|Nodes|Master Count

Metric Name: Worker Count
Description: Number of worker nodes in the cluster
UI Location: Inventory > Workers
Metric Path: Hardware Resources|Cluster|Nodes|Worker Count

Metric Name: Memory Pressure Count
Description: Number of nodes that are under memory pressure in the cluster
UI Location: Inventory > Memory Pressure
Metric Path: Hardware Resources|Cluster|Nodes|Memory Pressure Count

Metric Name: Disk Pressure Count
Description: Number of nodes that are under disk pressure in the cluster
UI Location: Inventory > Disk Pressure
Metric Path: Hardware Resources|Cluster|Nodes|Disk Pressure Count

Pods

Pods Capacity

Metric Name: Total Count
Description: Total number of pods that a cluster can support
UI Location: Pods > Total Count
Metric Path: Hardware Resources|Cluster|Pods|Capacity|Total Count

Metric Name: Used Count
Description: Number of pods already created in the cluster
UI Location: Pods > Count
Metric Path: Hardware Resources|Cluster|Pods|Capacity|Used Count

Pods CPU Usage

Metric Name: %Busy Scaled
Description: CPU usage percentage normalized relative to the CPU limit and scaled to a finer-grained unit. This metric shows how much of the allocated CPU (measured in millicores) is being used, giving a precise view of CPU utilization relative to the CPU limit of the resource.
UI Location: Server > Metric Browser
Metric Path: Root|Individual Nodes|<namespace>/<pod-name>|Hardware Resources|CPU|%Busy Scaled

Metric Name: %Busy
Description: The percentage of CPU used by a pod. If a CPU limit is provided for the pod, %Busy is calculated as the percentage of CPU used relative to the CPU limit of the pod. If the CPU limit of the pod is not specified, it is calculated as the percentage of CPU used relative to the CPU limit of the node or cluster, whichever is available.
UI Location: Server > Metric Browser
Metric Path: Root|Individual Nodes|<namespace>/<pod-name>|Hardware Resources|CPU|%Busy
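
The %Busy rule (pod limit first, otherwise the node or cluster limit) can be illustrated with a short Python sketch. The function name `percent_busy` and its parameters are hypothetical and the input values are made up; the sketch only mirrors the description and does not claim to reproduce how the Cluster Agent computes the metric.

```python
from typing import Optional


def percent_busy(used_m: float,
                 pod_limit_m: Optional[float] = None,
                 fallback_limit_m: Optional[float] = None) -> float:
    """Return CPU usage as a percentage of the applicable CPU limit.

    used_m:           CPU currently used by the pod, in millicores.
    pod_limit_m:      CPU limit of the pod, if one is set.
    fallback_limit_m: CPU limit of the node or cluster, used only when
                      the pod has no limit of its own.
    """
    limit = pod_limit_m if pod_limit_m is not None else fallback_limit_m
    if limit is None or limit <= 0:
        raise ValueError("no usable CPU limit available")
    return 100.0 * used_m / limit


# Pod with its own limit: usage is measured against the pod limit.
print(percent_busy(250, pod_limit_m=500))
# Pod without a limit: usage is measured against the node or cluster limit.
print(percent_busy(250, fallback_limit_m=1000))
```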

Pods Memory Usage

Metric Name: Used (MB)
Description: The amount of memory used by a pod.
UI Location: Server > Metric Browser
Metric Path: Root|Individual Nodes|<namespace>/<pod-name>|Hardware Resources|Memory|Used (MB)
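
Per-pod metric paths follow a fixed pattern in which the segments are joined by a pipe character and the pod is identified as <namespace>/<pod-name>. As a minimal sketch, a path could be assembled as follows in Python. The helper name `pod_metric_path` is hypothetical, and the namespace and pod name used in the example are made-up values.

```python
def pod_metric_path(namespace: str, pod_name: str, *segments: str) -> str:
    """Build a pipe-delimited metric path for a single pod.

    segments are the trailing parts of the path after "Hardware Resources",
    for example ("Memory", "Used (MB)") or ("CPU", "%Busy").
    """
    parts = [
        "Root",
        "Individual Nodes",
        f"{namespace}/{pod_name}",  # pod identifier: <namespace>/<pod-name>
        "Hardware Resources",
        *segments,
    ]
    return "|".join(parts)


# Example with made-up namespace and pod names.
print(pod_metric_path("default", "web-0", "Memory", "Used (MB)"))
```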

PVC

PVC Quota

Metric Name: Used
Description: PVC quota already being used in the cluster (count)
UI Location: Dashboard > Quotas > PVC
Metric Path: Hardware Resources|Cluster|PVC|Quota|Used

Metric Name: Used %
Description: Percentage of PVC quota already being used in the cluster
UI Location: Dashboard > Quotas > PVC
Metric Path: Hardware Resources|Cluster|PVC|Quota|Used (%)

PVC Utilization

Metric Name: Capacity (MB)
Description: Total PVC available for the pods in the monitored namespaces
UI Location: Dashboard > Utilization > PVCs
Metric Path: Hardware Resources|Cluster|PVC|Utilization|Capacity (MB)

Metric Name: Request (MB)
Description: Value for PVC requested by pods in monitored namespaces
UI Location: Dashboard > Utilization > PVCs
Metric Path: Hardware Resources|Cluster|PVC|Utilization|Request (MB)

ReplicaSets

Metric Name: Count
Description: Number of replica set resources in the cluster
UI Location: Inventory > Objects > ReplicaSets > Count
Metric Path: Hardware Resources|Cluster|Count

Metric Name: Replicas
Description: Total number of replicas for all the replica sets in the cluster
UI Location: Inventory > Objects > ReplicaSets > Count
Metric Path: Hardware Resources|Cluster|ReplicaSets|Replicas

Metric Name: Replicas Available
Description: Total number of available replicas for all the replica sets in the cluster
UI Location: Inventory > Objects > ReplicaSets > Available
Metric Path: Hardware Resources|Cluster|ReplicaSets|Replicas Available

Metric Name: Replicas Unavailable
Description: Total number of unavailable replicas for all the replica sets in the cluster
UI Location: Inventory > Objects > ReplicaSets > Unavailable
Metric Path: Hardware Resources|Cluster|ReplicaSets|Replicas Unavailable

Services

Metric Name: Count
Description: Total number of Kubernetes Services running in the cluster
UI Location: Inventory > Services > Services
Metric Path: Hardware Resources|Cluster|Services|Count

StatefulSets

Metric Name: Count
Description: Number of StatefulSets in monitored namespaces
UI Location: Inventory > Objects > StatefulSets > (Count)
Metric Path: Hardware Resources|Cluster|StatefulSets|Count

Metric Name: Replicas Ready
Description: Number of replicas in a ready state across all StatefulSets in monitored namespaces
UI Location: Inventory > Objects > StatefulSets > Replicas Not Ready
Metric Path: Hardware Resources|Cluster|StatefulSets|Replicas Ready

Metric Name: Replicas Desired
Description: Number of replicas across all StatefulSets in monitored namespaces that are specified as desired in the StatefulSet spec
UI Location: N/A
Metric Path: Hardware Resources|Cluster|StatefulSets|Replicas Desired

Metric Name: Replicas Not Ready
Description: Number of replicas across all StatefulSets in monitored namespaces that are not ready and are yet to be created or started
UI Location: Inventory > Objects > StatefulSets > Replicas Not Ready
Metric Path: Hardware Resources|Cluster|StatefulSets|Replicas Not Ready

Metric Name: Collisions
Description: Number of hash collisions for StatefulSets across all monitored namespaces
UI Location: N/A
Metric Path: Hardware Resources|Cluster|StatefulSets|Collisions

Storage Quota

Metric Name: Used (MB)
Description: Storage quota used by the cluster in MB
UI Location: Dashboard > Quotas > Storage
Metric Path: Hardware Resources|Cluster|Storage|Quota|Used (MB)

Metric Name: Used (%)
Description: Percentage of storage quota used by the cluster
UI Location: Dashboard > Quotas > Storage
Metric Path: Hardware Resources|Cluster|Storage|Quota|Used (%)