Set up AI Observability
Learn about the high-level steps to set up Splunk AI Observability.
Monitor and troubleshoot your AI components by sending their data to Splunk Observability Cloud.
Complete the following high-level steps to set up and use AI Observability.
Collect metrics and metadata from AI components
Learn how to collect metrics and metadata for Splunk AI Observability.
Splunk Observability Cloud supports multiple data ingestion and connection methods to collect your Amazon Web Services (AWS), Azure, and Google Cloud Platform metrics and metadata. To collect metrics and metadata from all other AI components, you must install the Splunk Distribution of the OpenTelemetry Collector and configure an OpenTelemetry receiver.
To collect metrics and metadata, refer to the following documentation for your AI component:
- Configure your Splunk Observability Cloud account to collect AWS Bedrock metrics
- Configure your Splunk Observability Cloud account to collect Azure OpenAI metrics
- Configure your Splunk Observability Cloud account to collect GCP VertexAI metrics
- Configure the Prometheus receiver to collect Nvidia GPU metrics
- Configure the Prometheus receiver to collect Nvidia NIM metrics
- Configure the Prometheus receiver to collect Ray cluster metrics
- Configure the Prometheus receiver to collect Weaviate metrics
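For the components collected through the Prometheus receiver, the Collector configuration generally follows the same shape: a named receiver that scrapes the component's metrics endpoint, and a metrics pipeline that exports to Splunk Observability Cloud. The following is a minimal sketch, not the documented configuration for any specific component. The receiver name, job name, scrape interval, and the target `gpu-node.example.com:9400` are hypothetical placeholders, and the environment variable names are assumptions about how your credentials are supplied. Refer to the linked page for your component for the actual endpoint and recommended settings.

```yaml
receivers:
  # Hypothetical receiver name; use one named receiver per AI component.
  prometheus/ai_component:
    config:
      scrape_configs:
        - job_name: ai-component          # placeholder job name
          scrape_interval: 10s            # assumed interval; tune as needed
          metrics_path: /metrics          # Prometheus default path
          static_configs:
            # Placeholder host and port of the component's metrics endpoint.
            - targets: ["gpu-node.example.com:9400"]

exporters:
  # Sends metrics to Splunk Observability Cloud.
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"  # assumed environment variable
    realm: "${SPLUNK_REALM}"                # assumed environment variable

service:
  pipelines:
    metrics:
      receivers: [prometheus/ai_component]
      exporters: [signalfx]
```

Giving each receiver a distinct suffix after `prometheus/` lets you scrape several AI components from one Collector and list them together in the same metrics pipeline.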
Collect traces and logs from AI components
Learn how to collect traces and logs from AI components for Splunk AI Observability.
Splunk Observability Cloud uses the Splunk HTTP Event Collector (HEC) exporter to enable the Splunk Distribution of the OpenTelemetry Collector to collect traces and logs. Splunk Log Observer Connect correlates the logs with metrics and traces for advanced troubleshooting.
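As an illustration of the HEC exporter's role, the following sketch shows a Collector logs pipeline that receives OTLP data from an instrumented application and forwards it through the `splunk_hec` exporter. It is an assumption-laden example rather than the documented setup: the endpoint URL, index, source, and sourcetype values are hypothetical placeholders, the environment variable name is an assumption, and only the logs pipeline is shown.

```yaml
receivers:
  # Accepts OTLP data from instrumented AI applications.
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  splunk_hec:
    token: "${SPLUNK_HEC_TOKEN}"   # assumed environment variable
    # Placeholder HEC endpoint; replace with your own HEC URL.
    endpoint: "https://splunk.example.com:8088/services/collector"
    source: "otel"                 # placeholder source
    sourcetype: "otel"             # placeholder sourcetype
    index: "main"                  # placeholder index

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [splunk_hec]
```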
To collect traces and logs for your AI components, complete the following high-level steps:
Monitor and troubleshoot your AI components
Learn about the tools you can use to monitor and troubleshoot your AI components in Splunk AI Observability.
After you set up data collection from supported AI components to Splunk Observability Cloud, the data populates built-in experiences that you can use to monitor and troubleshoot your AI components.
Monitoring tool | Use this tool to | Link to documentation |
---|---|---|
Built-in navigators | Orient and explore different layers of your AI tech stack. | |
Built-in dashboards | Assess service, endpoint, and system health at a glance. | |
Splunk Application Performance Monitoring (APM) service map and trace view | View all of your LLM service dependency graphs and user interactions in the service map or trace view. | |