NVIDIA NIM Metrics
NVIDIA NIM metrics in AppDynamics are documented for the llm service. They cover running requests, token metrics, cache utilization, finished-request counts, and latency approximations.
Prerequisites
- NIM llm service is deployed in the nim namespace
- Prometheus-compatible metrics are exposed at /v1/metrics
Enable Prometheus Scraping for NVIDIA NIM
- service: llm
- namespace: nim
- port: 8000
- path: /v1/metrics
Replace these values with the NIM LLM service name and namespace used in the target environment.
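Before wiring up the scrape, it can help to confirm that the endpoint answers with the expected series. The following is a minimal sketch, not part of the product: it assumes plain HTTP, the default Kubernetes cluster domain cluster.local, and a caller running inside the cluster. The function names are hypothetical.

```python
from urllib.request import urlopen


def nim_metrics_url(service="llm", namespace="nim", port=8000,
                    path="/v1/metrics", cluster_domain="cluster.local"):
    """Build the in-cluster URL of the NIM metrics endpoint.

    Assumes plain HTTP and the standard Kubernetes Service DNS form
    <service>.<namespace>.svc.<cluster-domain>.
    """
    return f"http://{service}.{namespace}.svc.{cluster_domain}:{port}{path}"


def metric_names(text):
    """Return the sorted, de-duplicated metric names in a Prometheus text payload."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first "{" (labels) or whitespace (value).
        names.add(line.split("{", 1)[0].split()[0])
    return sorted(names)


def fetch_metric_names(url, timeout=5):
    """Fetch the endpoint once and list the metric names it exposes."""
    with urlopen(url, timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))
```

Calling fetch_metric_names(nim_metrics_url()) from a pod in the cluster should list names such as num_requests_running if the defaults above match the deployment. Pass the service name and namespace of the target environment otherwise.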
Configure Machine Agent Ingestion
Infrastructure Visibility Prometheus monitoring loads the NIM exporter definition through prometheus-config-template.yaml.
Before enabling the scrape, update the exporter YAML service discovery fields to the service name and namespace used by your NIM deployment.
Exporter YAML Contract
- exporter-yamls/nim-for-llms-exporter.yaml
- key direct metrics:
  - num_requests_running
  - prompt_tokens_total
  - generation_tokens_total
  - request_finish_total
  - gpu_cache_usage_perc
- computed-metric source series (drive the latency and per-request metrics):
  - e2e_request_latency_seconds_sum / e2e_request_latency_seconds_count
  - time_to_first_token_seconds_sum / time_to_first_token_seconds_count
  - time_per_output_token_seconds_sum / time_per_output_token_seconds_count
  - request_prompt_tokens_sum / request_prompt_tokens_count
  - request_generation_tokens_sum / request_generation_tokens_count
- key computed metrics:
  - Avg E2E Latency (ms)
  - Avg TTFT (ms)
  - Avg TPOT (ms)
  - Total Tokens
  - Prompt Tokens per Request
  - Generation Tokens per Request
Expected AppDynamics Custom Metric Paths
- Custom Metrics|NIM|LLMs|{model_name}|Requests Running
- Custom Metrics|NIM|LLMs|{model_name}|KV Cache Utilization (%)
- Custom Metrics|NIM|LLMs|{model_name}|Prompt Tokens
- Custom Metrics|NIM|LLMs|{model_name}|Generation Tokens
- Custom Metrics|NIM|LLMs|{model_name}|{finished_reason}|Finished Requests
- Custom Metrics|NIM|LLMs|{model_name}|Avg E2E Latency (ms)
- Custom Metrics|NIM|LLMs|{model_name}|Avg TTFT (ms)
- Custom Metrics|NIM|LLMs|{model_name}|Avg TPOT (ms)
- Custom Metrics|NIM|LLMs|All Models|Prompt Tokens
- Custom Metrics|NIM|LLMs|All Models|Generation Tokens
- Custom Metrics|NIM|LLMs|{model_name}|Prompt Tokens per Request
- Custom Metrics|NIM|LLMs|{model_name}|Generation Tokens per Request
Prompt Tokens and Generation Tokens are interval delta metrics. The All Models|... leaves are aggregate mirrors across models. The per-request leaves are derived metrics built from histogram _sum and _count sources.
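To make the path layout concrete, the sketch below assembles these paths from label values. It is illustrative only: the helper names are hypothetical, the model names in the usage note are placeholders, and treating the All Models mirror as a plain sum of the per-model interval deltas is an assumption rather than documented behavior.

```python
PREFIX = "Custom Metrics|NIM|LLMs"


def metric_path(leaf, model_name="All Models", finished_reason=None):
    """Join the path segments with "|", adding {finished_reason} only when given."""
    parts = [PREFIX, model_name]
    if finished_reason is not None:
        parts.append(finished_reason)
    parts.append(leaf)
    return "|".join(parts)


def with_all_models(leaf, per_model_deltas):
    """Map per-model interval deltas to paths and add an aggregate leaf.

    Assumption: the "All Models" mirror is the sum across models.
    """
    out = {metric_path(leaf, model): value
           for model, value in per_model_deltas.items()}
    out[metric_path(leaf)] = sum(per_model_deltas.values())
    return out
```

For example, with_all_models("Prompt Tokens", {"model-a": 120, "model-b": 80}) yields one path per model plus Custom Metrics|NIM|LLMs|All Models|Prompt Tokens with the value 200.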
Create Custom Dashboard
The custom dashboard script generates ready-to-import AppDynamics dashboard JSON files from a set of templates. You supply your environment's node names and, optionally, the custom metric path prefixes. The script substitutes them into the templates and writes the JSON files. See Create Custom Dashboards for AI Pods.
Troubleshooting
AppDynamics approximates the Splunk histogram percentile widgets as interval mean values from _sum and _count.
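When a dashboard value differs from a percentile widget, it can be useful to reproduce the interval mean by hand from two consecutive scrapes. The sketch below shows one way to do that. The function name is hypothetical, and the seconds-to-milliseconds scale and the handling of counter resets are assumptions, not the exporter's confirmed behavior.

```python
def interval_mean(prev, curr, scale=1.0):
    """Mean of the observations recorded between two scrapes.

    prev and curr are (sum, count) pairs read from a histogram's _sum and
    _count series. Returns None when no new observations arrived or when a
    counter went backwards (for example after a restart).
    """
    d_sum = curr[0] - prev[0]
    d_count = curr[1] - prev[1]
    if d_count <= 0 or d_sum < 0:
        return None
    return scale * d_sum / d_count


# Latency series are in seconds, so scale by 1000 for an "(ms)" metric.
avg_e2e_ms = interval_mean((10.0, 4), (13.0, 10), scale=1000.0)

# Per-request token metrics need no unit conversion.
prompt_tokens_per_request = interval_mean((120, 3), (300, 6))
```

A mean is pulled upward by a few slow requests in a way a median is not, so some gap between these values and a histogram percentile is expected.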