Configure the Prometheus receiver to collect agentgateway LLM proxy metrics
Configure the Splunk Distribution of the OpenTelemetry Collector to send agentgateway LLM proxy telemetry data to Splunk Observability Cloud.
You can monitor the performance of agentgateway LLM proxy by configuring the Splunk Distribution of the OpenTelemetry Collector to send agentgateway LLM proxy metrics to Splunk Observability Cloud.
This solution uses the Prometheus receiver to collect metrics from agentgateway LLM proxy, which exposes a Prometheus-compatible /metrics endpoint.
To configure the Prometheus receiver to collect agentgateway LLM proxy metrics, you must have access to the agentgateway Prometheus-compatible /metrics endpoint. This endpoint is typically available on port 15020 unless overridden by configuration.
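Before configuring the receiver, you can confirm that the endpoint is reachable. The following is a sketch that assumes the default port 15020 and a locally reachable gateway; replace `localhost` with the host or pod address of your agentgateway deployment:

```shell
# Query the agentgateway metrics endpoint (default port 15020 unless
# overridden by configuration) and show a few agentgateway_* metric lines.
curl -s http://localhost:15020/metrics | grep '^agentgateway_' | head
```

If the command returns metric lines, the endpoint is ready for the Prometheus receiver to scrape.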
- Deploy the Splunk Distribution of the OpenTelemetry Collector to your host or container platform.
- To manually activate the Prometheus receiver for agentgateway LLM proxy, make the following changes to your Collector `values.yaml` configuration file.
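The following `values.yaml` fragment is a minimal sketch of such a change, assuming the Helm chart's `agent.config` override, the default port 15020, and a scrape target reachable from the Collector; adjust the job name, target address, and scrape interval for your deployment:

```yaml
agent:
  config:
    receivers:
      prometheus/agentgateway:
        config:
          scrape_configs:
            # Scrape the agentgateway Prometheus-compatible /metrics endpoint.
            # Replace the target with your gateway's address if it differs.
            - job_name: agentgateway
              scrape_interval: 10s
              metrics_path: /metrics
              static_configs:
                - targets: ["localhost:15020"]
    service:
      pipelines:
        metrics:
          receivers: [prometheus/agentgateway]
```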
Configuration settings
To view the configuration options for the Prometheus receiver, see Settings.
Metrics
The following metrics are available for agentgateway LLM proxy. For more information, see View LLM metrics in the agentgateway documentation.
These metrics are considered custom metrics in Splunk Observability Cloud.
| Metric name | Metric type | Description |
|---|---|---|
| `agentgateway_downstream_connections` | counter | Total number of established downstream connections. |
| `agentgateway_downstream_received_bytes` | counter | Total TCP bytes received, per connection label set. |
| `agentgateway_downstream_sent_bytes` | counter | Total TCP bytes transmitted, per connection label set. |
| `agentgateway_gen_ai_client_token_usage` | histogram | Number of tokens used per request. |
| `agentgateway_gen_ai_server_request_duration` | histogram | Duration of generative AI requests. |
| `agentgateway_gen_ai_server_time_per_output_token` | histogram | Time to generate each output token for a given request. |
| `agentgateway_gen_ai_server_time_to_first_token` | histogram | Time to generate the first token for a given request. |
| `agentgateway_mcp_requests` | counter | Total number of MCP tool calls. |
| `agentgateway_request_duration_seconds` | histogram | Duration of HTTP requests, in seconds. |
| `agentgateway_requests` | counter | Total number of HTTP requests sent. |
| `agentgateway_response_bytes` | counter | Total HTTP response bytes received. |
| `agentgateway_tokio_global_queue_depth` | gauge | Number of tasks currently scheduled in the runtime's global queue. |
| `agentgateway_tokio_num_alive_tasks` | gauge | Number of currently alive tasks in the runtime. |
| `agentgateway_upstream_connect_duration_seconds` | histogram | Duration to establish an upstream connection, in seconds. |
Attributes
The following resource attributes are available for agentgateway LLM proxy.
| Attribute name | Description |
|---|---|
| `backend` | Upstream backend identifier. |
| `bind` | Bind address and port the gateway is listening on. |
| `gateway` | Gateway namespace/name. Value is unknown before the connection is fully routed. |
| `gen_ai_operation_name` | Generative AI operation type. |
| `gen_ai_request_model` | Model name as specified in the client request. |
| `gen_ai_response_model` | Model name as returned in the upstream response. |
| `gen_ai_system` | Backend AI provider or system. |
| `gen_ai_token_type` | Token category consumed: `input`, `output`, or `input_cache_read`. |
| `listener` | Listener name on the gateway. Value is unknown before the connection is routed. |
| `method` | HTTP method for proxy metrics (for example, POST, GET, OPTIONS); MCP method for MCP metrics (for example, tools/call, tools/list, initialize, notifications/initialized). |
| `protocol` | Protocol used for the request or connection. |
| `reason` | Source or classification of the response. |
| `resource` | Specific MCP resource or tool name invoked. |
| `resource_type` | MCP resource category. |
| `route` | Route namespace/name matched for the request. |
| `route_rule` | Specific route rule matched within the route. Value is unknown if no rule matched. |
| `server` | MCP server name handling the request. |
| `status` | HTTP response status code. |
| `transport` | Transport security mode for upstream connections. |
Next steps
After you set up data collection, the data populates built-in dashboards that you can use to monitor and troubleshoot your instances.
For more information on using built-in dashboards in Splunk Observability Cloud, see: