Collection Methods
- NVIDIA-SMI Collector: A lightweight, zero-dependency approach that uses the native NVIDIA-SMI CLI. See Configure the NVIDIA-SMI Collector.
- DCGM Exporter: NVIDIA's Data Center GPU Manager (DCGM) Prometheus exporter for high-frequency sampling. Available as both a standalone node agent and a Kubernetes DaemonSet. Used together with Cluster Agent, it provides end-to-end GPU monitoring in Kubernetes environments, capturing granular metrics at the pod and container levels. See Configure the DCGM Exporter.
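
To give a feel for what the CLI-based method involves, here is a minimal Python sketch that reads per-GPU values through `nvidia-smi --query-gpu`. It is not the NVIDIA-SMI Collector's actual implementation. The field list and the helper names `parse_smi_csv` and `query_gpus` are illustrative assumptions, and values are kept as strings because the CLI may report placeholders such as `[N/A]` for unsupported fields.

```python
import csv
import io
import subprocess

# Illustrative field selection for nvidia-smi's --query-gpu option
# (an assumption for this sketch, not the collector's real field list).
FIELDS = [
    "index",
    "name",
    "utilization.gpu",
    "memory.used",
    "memory.total",
    "temperature.gpu",
]


def parse_smi_csv(text):
    """Turn `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    # nvidia-smi separates columns with ", ", so skip the space after each comma.
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # ignore blank lines
        values = [value.strip() for value in record]
        rows.append(dict(zip(FIELDS, values)))
    return rows


def query_gpus():
    """Run nvidia-smi once and return the parsed rows (hypothetical helper)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_smi_csv(result.stdout)
```

On a host with the NVIDIA driver installed, calling `query_gpus()` returns one dictionary per GPU. Keeping the parsing in a separate function means it can be exercised against captured output on a machine without a GPU.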