Monitor AI traces and spans with Splunk APM

Monitor and troubleshoot the performance, quality, estimated cost, and security risk of the traces and spans associated with your AI agents using the AI trace data page.

Attention:

Alpha features described in this document are provided by Splunk to you "as is" without any warranties, maintenance and support, or service-level commitments. Splunk makes this alpha feature available in its sole discretion and may discontinue it at any time. These documents are not yet publicly available and we ask that you keep such information confidential. Use of alpha features is subject to the Splunk Pre-Release Agreement for Hosted Services.

The AI trace data page can help you answer questions such as:

  • What agents were involved in this AI workflow, and what was the hierarchy of calls?

  • Which agents and corresponding interactions resulted in quality issues?

  • How are my AI interactions correlated with my underlying APM attributes and errors?

  • What was the timing of my AI calls, and which ones were responsible for overall latency?

Prerequisites

To monitor AI traces, you must meet the following requirements:

Access the AI trace data page

Use any of the following methods to access the AI trace data page:

  • In the Splunk Observability Cloud main menu, select APM > AI trace data.

  • On the AI agents page, select View related traces.

  • On the AI agents page, select the actions menu in the row for an agent, then select View related AI trace data to navigate to the AI trace data page with a filter set for the agent.

  • In the detail view for an AI agent, select View related traces to navigate to the AI trace data page with a filter set for the agent.

Monitor all AI spans

The AI trace data page includes the following views that you can use to monitor all AI spans:

  • A chart that displays the number of interactions with quality or risk issues over the time period selected in the time filter. Use the Quality and Risk tabs to switch between interactions with quality issues and interactions with risk issues.

  • A filterable table that lists all of the spans associated with your AI agents and the performance, estimated cost, quality, and security risk metrics for each span.

The filters at the top of the page, above the chart, affect both the chart and the table. The filters below the chart affect only the table.
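The filter scoping described above can be pictured with a small Python sketch. This is an illustration only, not how the page is implemented: the Span fields, the helper names, and the sample data are all hypothetical. It shows page-level filters narrowing both views, while table-level filters narrow only the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    # Hypothetical, simplified stand-in for one row on the AI trace data page.
    agent: str
    has_quality_issue: bool
    duration_ms: float


def apply_filters(spans, predicates):
    """Keep only the spans that satisfy every predicate."""
    return [s for s in spans if all(p(s) for p in predicates)]


def render(spans, page_filters, table_filters):
    """Return (chart_count, table_rows) for the two views.

    Page-level filters are applied first and feed both views.
    Table-level filters are applied afterwards and feed only the table.
    """
    shared = apply_filters(spans, page_filters)
    chart_count = sum(1 for s in shared if s.has_quality_issue)
    table_rows = apply_filters(shared, table_filters)
    return chart_count, table_rows


# Made-up sample data for illustration.
spans = [
    Span("agent-a", True, 100.0),
    Span("agent-a", False, 900.0),
    Span("agent-b", True, 50.0),
]

chart_count, table_rows = render(
    spans,
    page_filters=[lambda s: s.agent == "agent-a"],   # above the chart
    table_filters=[lambda s: s.duration_ms > 500],   # below the chart
)
```

In this sketch the chart still counts the quality issue on the short agent-a span, even though the table filter hides that span from the table.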

For more information about monitoring security risks, see Monitor security risks on the AI trace data page.

Monitor a specific AI trace

On the AI trace data page, select a Trace ID to navigate to the AI trace view. The following screenshot displays an example of the AI trace view.

The trace view for an AI agent in Splunk APM.

By default, the trace Waterfall tab is selected. You can use this view to:

  • See the calls made between agents and agent quality issues in the Agent flow graph.

  • View all of your APM trace details and correlate AI-related data with service and other APM-level data.

  • Monitor the token usage, estimated cost, number of tool and LLM calls, model names, and relationships associated with the trace.

  • Monitor the spans within the trace. Select a span from the waterfall to display the Span properties panel, where you can:

    • Use the Span details tab to correlate the trace and span details with APM tag and attribute data.

    • Select the AI details tab to view the metadata, quality scores, and agent input and output messages for the span.

      To view AI agent conversation details, you must have a role with the read_apm_ai_conversation capability. This capability is included in the admin and ai_monitoring roles. If you're an admin and want to grant the ai_monitoring role to a user, see Assign roles to users in Splunk Observability Cloud.
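As a rough mental model of the capability requirement above, the following Python sketch checks whether a set of roles grants access to conversation details. It is a hypothetical illustration, not a Splunk API: the mapping lists only the one capability named in this document (the real roles include others), and the function name and the "some_other_role" value are made up.

```python
# Capability named in this document for viewing AI agent conversation details.
REQUIRED_CAPABILITY = "read_apm_ai_conversation"

# Illustrative mapping only: the two roles this document says include the
# capability. Real roles carry additional capabilities not shown here.
ROLE_CAPABILITIES = {
    "admin": {REQUIRED_CAPABILITY},
    "ai_monitoring": {REQUIRED_CAPABILITY},
}


def can_view_ai_conversations(user_roles):
    """Return True if any of the user's roles includes the capability."""
    return any(
        REQUIRED_CAPABILITY in ROLE_CAPABILITIES.get(role, set())
        for role in user_roles
    )


# "some_other_role" is a placeholder for a role absent from the mapping.
print(can_view_ai_conversations(["ai_monitoring"]))
print(can_view_ai_conversations(["some_other_role"]))
```

A user needs only one qualifying role, so assigning ai_monitoring is enough without granting admin.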

Select the Span Performance tab to monitor the workload usage percentage, duration, and number of child spans for each span.