Monitor AI agents with Splunk APM

Monitor and troubleshoot the performance, quality, estimated cost, and security risk of your AI agents with the AI agents page.

Attention:

Alpha features described in this document are provided by Splunk to you "as is" without any warranties, maintenance and support, or service-level commitments. Splunk makes this alpha feature available in its sole discretion and may discontinue it at any time. These documents are not yet publicly available and we ask that you keep such information confidential. Use of alpha features is subject to the Splunk Pre-Release Agreement for Hosted Services.

The AI agents page can help you answer questions such as:

  • Which of my AI agents are currently degraded in performance?

  • What AI agents are using the most tokens?

  • What quality issues are currently affecting my AI agents?

  • What types of quality issues are most prevalent?

Prerequisites

To monitor AI agents, you must meet the following requirements:

Monitor all AI agents

To monitor all AI agents, select APM > AI agents from the Splunk Observability Cloud main menu. The following screenshot displays an example of the AI agents page.

The AI agents list view in Splunk APM.

On the AI agents page, the panels above the table display the aggregate metrics across all your agents. The table displays a list of the instrumented agents in your environment and their individual metrics. For more information on monitoring security risks, see Monitor security risks on the AI agents page.

Drill down into the detail view of an AI agent

In the table of agents on the AI agents page, select an agent name to navigate to the detail view. The detail view for an agent displays charts for the metrics shown in the table of agents.

Select any chart in this view to show example traces that match the parameters of the chart.

The following screenshot displays an example of the detail view for an agent.

The detail view for an AI agent.

Use the agent detail view to answer questions such as:

  • When did my agent start experiencing errors or issues?

  • Is my agent consuming a high number of tokens?

  • What quality issues is my agent facing?

On the AI agents page, you can use the following methods to view related traces:

  • Select View related traces to navigate to the AI trace data page. This page displays a table of the traces associated with your AI agents and the performance, estimated cost, quality, and risk metrics for each trace.

  • In the table of agents, select the actions () menu in the row for an agent. Select View related AI trace data to navigate to the AI trace data page with a filter set for the agent.

  • In the detail view for an agent, select View related traces to navigate to the AI trace data page with a filter set for the agent.

For more information on using the AI trace data page, see Monitor AI traces and spans with Splunk APM.
Note:

To view AI agent conversation details, you must have a role with the read_apm_ai_conversation capability. This capability is included in the admin and ai_monitoring roles. If you're an admin and want to grant the ai_monitoring role to a user, see Assign roles to users in Splunk Observability Cloud.

Create a detector to generate alerts for an AI agent

To create a detector to trigger alerts for an AI agent from the AI agents page, select the actions () menu in the row for an agent and select Create Detector. For more information about detectors and alerts, see Create detectors to trigger alerts.

About AI agent quality scores

A quality score is the percentage of evaluations that passed for a metric. Splunk Observability Cloud reports quality scores for the following metrics:

  • Bias: Whether responses are fair toward certain groups, ideas, or outcomes.

  • Hallucination: Whether responses are factually correct.

  • Relevance: Whether responses are on topic, helpful, and match the user's question or task.

  • Sentiment: Whether the tone of responses is positive, negative, or neutral.

  • Toxicity: How harmful or offensive responses are.

For example, a quality score of 60% means that 60% of evaluations passed for that metric. When fewer than 80% of evaluations pass for a metric, the agent is flagged as having a quality issue.
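The score and threshold described above can be sketched as follows. This is an illustrative example, not the Splunk implementation; the function names and the sample pass/fail results are hypothetical.

```python
# Illustrative sketch of a quality score: the percentage of
# evaluations that passed for a metric, with an 80% threshold
# below which an agent is flagged as having a quality issue.

QUALITY_ISSUE_THRESHOLD = 80.0  # percent, as described above


def quality_score(evaluations: list[bool]) -> float:
    """Return the percentage of evaluations that passed."""
    if not evaluations:
        return 100.0  # no sampled evaluations; nothing to flag
    return 100.0 * sum(evaluations) / len(evaluations)


def has_quality_issue(evaluations: list[bool]) -> bool:
    """Flag the agent when fewer than 80% of evaluations pass."""
    return quality_score(evaluations) < QUALITY_ISSUE_THRESHOLD


# Hypothetical pass/fail results for one metric: 3 of 5 passed.
results = [True, True, True, False, False]
print(quality_score(results))      # 60.0
print(has_quality_issue(results))  # True
```

Note that exactly 80% is not flagged: the documentation says "fewer than 80%", so a score of 80.0 passes.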

Splunk Observability Cloud samples spans to calculate quality scores. The sampling rate is defined in the LLM Providers data integration, which is set up when you enable platform-side evaluations. To update the sampling rate, use the Splunk Observability Cloud main menu to select Data Management > Deployed integrations and edit the LLM Providers integration.