Set up AI Agent Monitoring
Get data in, enable platform-side evaluations for AI agent responses, and start monitoring your AI agents and applications.
Complete the following high-level steps to set up AI Agent Monitoring.
Collect traces and metrics from AI agents and applications
Complete the following steps to collect traces and metrics from your AI agents and applications.
-
Deploy the Splunk Distribution of the OpenTelemetry Collector on the hosts where your applications run. You can either use the guided setup wizards in Splunk Observability Cloud or install the Collector programmatically.
- Guided setup wizards
-
Complete the following steps to use the guided setup wizards to deploy the Collector on a host.
-
In the Splunk Observability Cloud main menu, select .
-
Select .
-
Follow the on-screen instructions to deploy the Collector on your host.
-
- Advanced installation
-
To programmatically install the Collector, see:
-
Histogram metrics are required to display data on AI Agent Monitoring pages.
To send histogram data to Splunk Observability Cloud with the SignalFx exporter, set send_otlp_histograms: true in your Collector values.yaml file. For example:

    exporters:
      signalfx:
        access_token: "${SPLUNK_ACCESS_TOKEN}"
        api_url: "${SPLUNK_API_URL}"
        ingest_url: "${SPLUNK_INGEST_URL}"
        sync_host_metadata: true
        correlation:
        send_otlp_histograms: true

-
If the Python agent is installed in your Kubernetes cluster, configure the Kubernetes Downward API to expose environment variables to Kubernetes resources.
The following example shows how to update a deployment to expose environment variables by adding the agent configuration under the .spec.template.spec.containers.env section:

    apiVersion: apps/v1
    kind: Deployment
    spec:
      selector:
        matchLabels:
          app: your-application
      template:
        spec:
          containers:
            - name: myapp
              env:
                - name: SPLUNK_OTEL_AGENT
                  valueFrom:
                    fieldRef:
                      fieldPath: status.hostIP
                - name: OTEL_EXPORTER_OTLP_ENDPOINT
                  value: "http://$(SPLUNK_OTEL_AGENT):4317"
                - name: OTEL_SERVICE_NAME
                  value: "<serviceName>"
                - name: OTEL_RESOURCE_ATTRIBUTES
                  value: "deployment.environment=<environmentName>"

To optionally configure the Python agent to send telemetry to Splunk Observability Cloud using other methods, see Instrument your Python application for Splunk Observability Cloud.
-
Set the following environment variables in your .env file. For more information on these environment variables and their supported values, see Configure the Python agent for AI applications (0.1.14 and higher).

    # Emitters (span_metric for full telemetry)
    OTEL_INSTRUMENTATION_GENAI_EMITTERS=span_metric
    # Content Capture
    OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_ONLY
    # Metrics
    OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE=delta

Note: Enabling GenAI content capture with one of the following settings may cause performance issues:
- OpenTelemetry GenAI utility version 0.1.14 and higher:
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT set to SPAN_AND_EVENT, SPAN_ONLY, or EVENT_ONLY
- OpenTelemetry GenAI utility version 0.1.13 and lower:
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true
This issue may arise when content captured in input and output gen_ai span attributes is larger than Splunk Observability Cloud and AI Agent Monitoring backend limits. For more information, see Recording content on attributes in the OpenTelemetry documentation.
-
Instrument or translate data from AI applications using one or more of the following options:
| Option | Description |
| --- | --- |
| Zero-code instrumentation | Exports telemetry data without changes to your application's source code. |
| Code-based instrumentation | Exports telemetry data and requires modifying your application's source code. |
| Translate data from third-party instrumentation libraries | Converts data from applications already instrumented with supported third-party libraries and sends the data to Splunk Observability Cloud. |

To troubleshoot the Splunk Distribution of the OpenTelemetry Collector, see Troubleshoot the Collector.
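Before moving on, it can help to sanity-check the settings from the steps above. The following Python sketch is illustrative only and is not part of any Splunk tooling: the function names check_genai_env and histograms_enabled are hypothetical, the expected values are copied from the .env example in this topic, and the Collector configuration is assumed to have already been parsed into a dictionary (for example, with a YAML library of your choice).

```python
from typing import Any, Mapping

# Example values copied from the .env step in this topic.
EXPECTED_ENV = {
    "OTEL_INSTRUMENTATION_GENAI_EMITTERS": "span_metric",
    "OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE": "delta",
}
CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"
# Settings that the note in this topic flags as a possible performance concern.
CAPTURE_ENABLED = {"SPAN_AND_EVENT", "SPAN_ONLY", "EVENT_ONLY", "true"}


def check_genai_env(env: Mapping[str, str]) -> list[str]:
    """Hypothetical helper: return findings; an empty list means nothing to report."""
    findings = []
    for name, expected in EXPECTED_ENV.items():
        actual = env.get(name)
        if actual is None:
            findings.append(f"{name} is not set (example value: {expected})")
        elif actual != expected:
            findings.append(f"{name}={actual}, but the example uses {expected}")
    if env.get(CAPTURE_VAR) in CAPTURE_ENABLED:
        findings.append(
            f"{CAPTURE_VAR} enables content capture; large payloads may affect performance"
        )
    return findings


def histograms_enabled(collector_config: Mapping[str, Any]) -> bool:
    """Hypothetical helper: check exporters.signalfx.send_otlp_histograms in a parsed config."""
    signalfx = (collector_config.get("exporters") or {}).get("signalfx") or {}
    return signalfx.get("send_otlp_histograms") is True
```

To check a running process, you could pass os.environ to check_genai_env. The comparison is an exact string match, so a differently cased value such as True is not flagged as enabling content capture in this sketch.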
(Optional) Enable platform-side evaluations for AI agent responses
Evaluation is the process of testing a large language model (LLM) to assess its outputs for bias, hallucination, relevance, sentiment, or toxicity. Splunk Observability Cloud displays evaluation results so you can monitor the quality of your AI agent responses.
If you want to store your AI conversation data in Splunk Enterprise or Splunk Cloud Platform and monitor evaluation results, you can enable instrumentation-side evaluations.
This alternate data storage method requires a license for Splunk Enterprise or Splunk Cloud Platform, may incur additional resource costs, and limits feature availability. For instructions, see Store AI agent conversation data in Splunk Enterprise/Splunk Cloud Platform and enable instrumentation-side evaluations.
- Set up the LLM data integration:
-
In the Splunk Observability Cloud main menu, select .
-
Search for and select LLM Providers.
-
Follow the on-screen instructions to set up the data integration. After you set up the data integration, evaluations may take a few minutes to begin running.
If you receive an error when you try to save your configuration, see the troubleshooting topic LLM Providers data integration can't be saved.
-
-
Verify that the LLM Providers data integration was correctly configured and trace data is being ingested:
-
In the Splunk Observability Cloud main menu, select .
-
Add a filter to display data beginning from after you configured the LLM Providers data integration.
-
If you don't see data on this page, see the troubleshooting topic Troubleshoot data ingestion for AI Agent Monitoring.
(Optional) Set up the Cisco AI Defense integration
Set up the Cisco AI Defense integration to enable security risk metrics for AI agents.
For more information about the integration, see Introduction to Splunk AI Security Monitoring. For setup instructions, see Set up an integration with Cisco AI Defense.
Verify that your data is being ingested
Verify that your data is being ingested by using the Splunk Observability Cloud main menu to navigate to .
If you don't see data on this page:
-
Ensure that you have a role with the read_apm_ai_conversation capability, which is included in the admin and ai_monitoring roles. This capability is required to view AI agent conversation details, such as AI outputs, inputs, and system prompts. If you're an admin and want to grant the ai_monitoring role to a user, see Assign roles to users in Splunk Observability Cloud.
Next steps
After you set up AI Agent Monitoring, you can: