Monitor your overall AI application and agent environment with Splunk APM
Monitor the overall performance, quality, estimated cost, and security risk of your AI applications and agents with the AI overview page.
Alpha features described in this document are provided by Splunk to you "as is" without any warranties, maintenance and support, or service-level commitments. Splunk makes this alpha feature available in its sole discretion and may discontinue it at any time. These documents are not yet publicly available and we ask that you keep such information confidential. Use of alpha features is subject to the Splunk Pre-Release Agreement for Hosted Services.
- AI Agent Monitoring
- AI Agent Security Monitoring
The AI overview page can help you answer questions such as:
- How is my overall application environment performing, in terms of total errors, latency, and quality issues?
- What's driving estimated costs and token usage among my applications?
- Which models and providers are driving errors, latency, and quality issues?
Prerequisites
To monitor AI applications and agents, you must meet the following requirements.
- You have completed Set up AI Agent Monitoring.
- (Optional) To enable security risk metrics, set up the Cisco AI Defense integration. For instructions, see Set up an integration with Cisco AI Defense.
Monitor all AI applications and agents
To monitor all AI applications and agents, use the Splunk Observability Cloud main menu to select . The following screenshot displays an example of the page.
On the AI overview page, the Requests, Errors, Tokens, and Estimated cost sections of the header display the aggregate metrics across all of your AI applications and agents.
To monitor your AI agents in greater detail with the AI agents page, select View all AI agents. For more information on using this page, see Monitor AI agents with Splunk APM.
Analyze AI applications and agents using overview charts
On the AI overview page, the charts display all metric values in the selected time period for the AI applications and agents in your environment. Use the filters above each chart to update the chart view by model, provider, or other attributes associated with the metric.
Select any chart in this view to show example traces that match the parameters of the chart.
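The chart filters select on attributes of the underlying spans. As an illustration only, the following sketch shows filtering and aggregating request counts per model. The span records and attribute names here are simplified stand-ins, loosely modeled on the OpenTelemetry GenAI semantic conventions rather than Splunk's actual data model:

```python
from collections import Counter

# Hypothetical span records; attribute names loosely follow the
# OpenTelemetry GenAI semantic conventions and are illustrative only.
spans = [
    {"gen_ai.request.model": "model-a", "gen_ai.provider.name": "provider-x", "error": False},
    {"gen_ai.request.model": "model-a", "gen_ai.provider.name": "provider-x", "error": True},
    {"gen_ai.request.model": "model-b", "gen_ai.provider.name": "provider-y", "error": False},
]

def requests_by_model(spans, provider=None):
    """Count chat-operation spans per model, optionally filtered by provider,
    mirroring how a chart view narrows when you apply a provider filter."""
    selected = [s for s in spans if provider is None or s["gen_ai.provider.name"] == provider]
    return Counter(s["gen_ai.request.model"] for s in selected)

print(requests_by_model(spans))                         # counts across all providers
print(requests_by_model(spans, provider="provider-x"))  # narrowed view, like a chart filter
```

Applying a filter simply restricts which spans contribute to the aggregate, which is why a filtered chart and an unfiltered chart can tell very different stories about the same time period.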
| Chart name | Metric name | Use this chart to |
|---|---|---|
| Requests | CODE | Determine the total number of requests (calls) based on spans with chat operations. This metric indicates the total traffic that your AI applications and agents handle. |
| Errors | CODE | Determine the total number of errors based on spans with chat operations. This metric is a leading indicator of technical issues in your system. |
| Error rates | CODE | Determine how many errors occurred among your total requests. A high error rate indicates that many of your users are encountering issues. |
| Latency per LLM generation | CODE | Determine the latency of GenAI spans. High latency indicates that your users are facing long wait times for responses. |
| Latency per provider | CODE | Determine the latency of GenAI spans by model provider. Use this metric to determine whether any model provider is currently producing slow responses. |
| Latency per operation | CODE | Determine the latency of GenAI spans by operation type. This metric indicates which operations are currently performing slowly and helps guide troubleshooting. |
| Token usage | CODE | Track token usage by model or request. A model or request using a high number of tokens could be experiencing increased traffic or could be wasting resources. |
| Estimated cost | CODE | Track estimated costs by model or request. High estimated costs for a model or request can indicate high traffic, or that redistributing traffic across models could reduce costs. |
| Quality issues | CODE | Track negative-scoring evaluations by model and correlate semantic issues with specific models. A high number of issues correlated with a specific model might require action to mitigate the issues, or shifting traffic to other models. For more information about quality scores, see About AI agent quality scores. |
| Risks | CODE | Track security risks by model or provider and correlate security issues with specific models. A high number of issues correlated with a specific model might require action to mitigate the issues, or shifting traffic to other models. For more information about monitoring security risks, see Monitor security risks on the AI overview page. |
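To make the Error rates and Estimated cost charts concrete: an error rate is errors divided by total requests, and an estimated cost is token usage multiplied by a per-token price. The following sketch illustrates both calculations; the model names and per-1K-token prices are made-up placeholders, not actual provider pricing or Splunk's internal formula:

```python
# Hypothetical per-1K-token prices (placeholders, not real provider pricing).
PRICE_PER_1K_TOKENS = {"model-a": 0.0100, "model-b": 0.0025}

def error_rate(errors: int, requests: int) -> float:
    """Fraction of requests that errored, as in the Error rates chart."""
    return errors / requests if requests else 0.0

def estimated_cost(tokens_by_model: dict) -> float:
    """Token usage multiplied by a per-token price, summed across models."""
    return sum(tokens / 1000 * PRICE_PER_1K_TOKENS[model]
               for model, tokens in tokens_by_model.items())

print(error_rate(errors=12, requests=400))                       # 0.03, or 3%
print(estimated_cost({"model-a": 50_000, "model-b": 200_000}))   # 1.0
```

Note that in this example model-b consumes four times the tokens of model-a yet contributes the same estimated cost, which is why the Estimated cost and Token usage charts can rank the same models differently.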
Create a detector to generate alerts from a chart
To create a detector to generate alerts for a chart, select the actions (…) menu in the chart and select New detector from chart. For more information on detectors and alerts, see Create detectors to trigger alerts.