MCP Telemetry Dashboard

Version 1.2 introduces a Dashboard tab to the Splunk MCP Server app. It gives Splunk administrators a single pane to govern, audit, and troubleshoot how MCP tools are using Splunk resources.

You can use the MCP telemetry dashboard to track how MCP tools consume Splunk Virtual Compute (SVC). To learn more about SVC, see https://www.splunk.com/en_us/blog/platform/what-is-splunk-virtual-compute-svc.html

The following image shows the Dashboard tab in the Splunk MCP Server app:

This image shows the MCP Server app. The new tab called Dashboard is selected and highlighted. The Overview page is displayed. Views for other metrics are listed on the left side of the page including Users and Auth, and Health and Risk.

The Dashboard lets you address the following types of questions about the Splunk MCP Server app:

  • Who is using MCP and how are they authenticating?

  • What are users doing in the app and how much SVC are they consuming?

  • Is anything broken or suspicious?

  • Is MCP operating correctly?

  • Is MCP secure?

Who can benefit from the Dashboard

| Persona | Beneficial tabs | Use case |
|---|---|---|
| Security-minded administrator | Users & Auth, Security & Operations, Access & Governance | Confirm nothing is urgent and run a weekly access review. |
| Capacity or cost administrator | Overview, Usage & Consumption, Health & Risk | Find the heaviest user and set thresholds to protect SVCs. |
| Platform owner enabling AI | Overview, Usage & Consumption, Access & Governance | Track adoption and report results to the management chain. |

Common workflows

See the following table for common workflows that leverage the MCP Dashboard:

| Workflow | Path through the Dashboard |
|---|---|
| Investigate a runaway agent | Overview > high Total Tool Calls or Rate limit hits > Usage & Consumption > User Activity > expand the row > contact the team or apply a rate limit. |
| Quarterly access review | Access & Governance > Users with MCP Capabilities > Active MCP Tokens > Users & Auth > Inactive Users > revoke or remove capability. |
| Is MCP responsible for the SVC spike? | Security & Operations > MCP versus non-MCP Search Traffic > Usage & Consumption > Most-Run SPL Queries. |

Before you begin

See the following guidelines before you begin using the Dashboard:

| Function | Details |
|---|---|
| Dashboard access | Users with the admin or sc_admin role can see the Dashboard and all of its metrics. Users in any other role cannot see the Dashboard tab or any metrics. |
| Lookback | Lookback is limited to 60 days. The time picker is capped at that maximum. |
| Freshness | Most of the Dashboard panels lag real time by approximately 5 to 20 minutes. |
| Time zone | The time picker honors the viewing user's Splunk time-zone preference. |

Key Dashboard terms

See the following table for terms you might see when using the Dashboard and their meaning:

| Term | Meaning |
|---|---|
| MCP session | A unique client connection initialized against the MCP server. Counted through initialize_complete events, so the count is inflated by reconnects. |
| Tool call | One invocation of an MCP tool. Emits tool_call_complete or tool_call_error. |
| MCP capability | A Splunk RBAC capability, such as mcp_tool_execute or mcp_tool_admin, that authorizes a user to call MCP tools. Distinct from the MCP server 'capabilities' advertised to clients. |
| Capability policy denial | A user with a valid token (audience = mcp) tries to call a tool but lacks the required Splunk capability. |
| Token validation failure | The token itself is invalid. For example, it is the wrong token, it has expired, or its audience claim is not 'mcp'. |
| Guardrail rejection | An MCP-submitted SPL query was blocked by the SPL safety check before it ran. |
| Tool family | A grouping of tools such as Search, AI Assistant, Splunk platform, and Observability. |
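
The event names in the preceding table can be checked against raw telemetry. The following search is a minimal sketch, not the Dashboard's own SPL. It assumes that initialize_complete, tool_call_complete, and tool_call_error appear as values of the event_type field in the admin_console_dashboard sourcetype, the same sourcetype and field that the troubleshooting search on this page uses:

```spl
index=_internal sourcetype=admin_console_dashboard
    event_type IN (initialize_complete, tool_call_complete, tool_call_error)
| stats count by event_type
```

If the assumption holds, the initialize_complete row approximates the number of sessions. Because reconnects inflate that count, treat it as an upper bound.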

Dashboard filters

For each of the Dashboard panels including the Overview, there are 5 global filters available at the top of the page as shown in the following image:

This image shows the Dashboard tab of the MCP Server app. The 5 global filters near the top of the page are highlighted. Filters include time range and tool family.

See the following table for global filter information:

Note: All filters chain together.
| Global filter name | Default | Notes |
|---|---|---|
| Time Range | Last 30 days | Maximum 60 days. |
| Auth method | All | Options are Token, OAuth, or All. |
| User | All | List populated from telemetry. Only shows users seen within the selected Time Range. |
| Tool Family | All | For example, Search, AI Assistant, Splunk platform, and Observability. |
| Tool | All | List populated from telemetry. |

Dashboard panels

See the following tables for details on what panels are available on the Dashboard tab.

Overview panels

Purpose: The state of MCP at a glance including adoption, performance, and guardrail health on 1 screen.

When to use: At the start of every session, for a daily glance, or first stop during incident triage.

You can use the panels on the Overview view to answer the following:

  • Is MCP adoption growing or stalling?

  • Are users hitting errors or being rate-limited right now?

  • Are auth failures spiking?

  • Is the guardrail catching unsafe SPL more often than usual?

| Panel | What the panel shows | Notes |
|---|---|---|
| New Sessions | Number of MCP client sessions started in the selected window. | Inflated by client reconnects. Treat as an upper bound, not as unique users. Tool and Tool Family filters are ignored. |
| Unique Users | Distinct users who invoked at least one tool call. | |
| Distinct Tools Used | Number of unique MCP tools invoked. | |
| Total Tool Calls | All tool invocations in the window. | |
| Success Rate | Percentage of tool calls returning 2xx. | |
| Error Rate | Percentage of non-2xx tool calls. | |
| P95 latency | 95th-percentile tool execution time. | Server-side execution time only. Does not include client network time. |
| Rate limit hits | Tool calls rejected by rate limit. | Tool and Tool Family filters are ignored. |
| Blocked or Denied | Tool calls denied by access control or policy. Also includes SPL safety rejections. | Includes capability policy denial, where a valid token belongs to a user who lacks the required MCP capability. Does not include 'bad token' failures. Tool and Tool Family filters are ignored. |
| Search Success Rate | Percentage of Splunk search tool calls that completed successfully. | |
| Saved Search Share | Share of search tool calls that came through run_saved_search. | |
| Guardrail Rejection Rate | SPL safety rejections divided by executed search calls plus rejections. | Should be near zero. A rising rate means agents are repeatedly trying unsafe SPL. |
| Auth Failure Rate | Percentage of authentication attempts that failed. | Counts token validation failures (wrong token or wrong audience). Different from the capability policy denial counted in Blocked or Denied. Tool and Tool Family filters are ignored. |
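
The Guardrail Rejection Rate definition can be worked through with made-up numbers. The following sketch uses makeresults, so it does not read any telemetry. The counts and field names (rejections, executed_search_calls) are hypothetical and chosen only to show the arithmetic:

```spl
| makeresults
| eval rejections=5, executed_search_calls=495
| eval guardrail_rejection_rate_pct=round(100 * rejections / (executed_search_calls + rejections), 2)
```

With 5 rejections and 495 executed search calls, the denominator is 500 and the rate is 1 percent. Rejections are added to the denominator because a rejected query never runs and is therefore not counted among the executed search calls.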

Users and Authentication panels

Purpose: Who is on MCP and how they're authenticating.

When to use: OAuth migration milestones, quarterly access review, and after enabling MCP for a new team.

You can use the panels on the Users & Auth view to answer the following:

  • How is the OAuth migration progressing?

  • Who is still on token auth?

  • How many OAuth client apps are active?

  • Who has stopped using MCP and could be revoked?

| Panel | What the panel shows | Notes |
|---|---|---|
| Traffic Split by Auth Method | Tool-call volume over time, split by token versus OAuth. | |
| Users using Token Auth | Distinct users authenticated through a token. | Should drift down as OAuth migration progresses. |
| Users using OAuth | Users authenticated through OAuth. | |
| OAuth Client IDs | Distinct OAuth client_ids seen. | Sudden growth can mean a new app onboarded or unexpected registration. |
| Inactive Users | Users whose last activity is before the dashboard's earliest filter. | Strong revoke candidates. Scans the entire history and is the slowest panel on the dashboard. Use narrow time windows during peak hours. Global filters do not apply. |

Usage and Consumption panels

Purpose: What users are doing with MCP and who is driving load.

When to use: Capacity planning, weekly hygiene review, investigating SVC pressure attributed to MCP.

You can use the panels on the Usage & Consumption view to answer the following:

  • Which tools are valuable vs ignored?

  • Who is the heaviest user, and what tools are they hitting?

  • What indexes is MCP touching, and by whom?

  • What's a reasonable per-user quota?

  • What SPL queries do MCP clients run most often?

| Panel | What the panel shows | Notes |
|---|---|---|
| Tool Activity | Per-tool call volume, failure rate, p95 latency, and the heaviest caller. | |
| User Activity | Per-user call volume, recency, and outcome. | The Client IDs column shows the user's 5 most recently used OAuth client IDs. Expand a row for the user's top 5 tools. |
| Most-Run SPL Queries | Actual SPL text MCP submits, with run count and unique callers. | The Source column identifies the tool through which each SPL query was triggered. Runs as a separate _audit search, so it is slightly out of sync with the _internal-based panels. |

Health and Risk panels

Purpose: Is MCP fast, successful, and on its expected error budget.

When to use: Incident triage, after a tool deployment, or when users report slowness or errors.

You can use the panels on the Health & Risk view to answer the following:

  • Is the slowdown global, on a specific tool, or on a specific user?

  • Is the token auth path or the OAuth path degraded?

  • Which user or tool contributes most to the error budget?

  • What are the worst recent failures and slow calls?

| Panel | What the panel shows | Notes |
|---|---|---|
| Auth Events Over Time (by Method & Failure type) | Tool call outcomes by auth method, alongside auth failure events, broken down by failure reason. | Lets you isolate 'token path is fine, OAuth path is broken' or vice versa. |
| Failure Rate by User | Per-user error rate. | |
| Failure Rate by Tool | Per-tool error rate. | |
| Recent Errors | Last 10 non-2xx tool calls. | Also shows spl_safety_rejections, auth_access_denied, and others. |
| Slowest Tool Calls | Top 10 slow requests with timestamp, tool, user, execution time, and status. | |

Security and Operations panels

Purpose: Is MCP being used securely, and how does its load compare to the rest of Splunk's search traffic.

When to use: Suspicious-activity investigations, SVC post-mortems, and periodic security reviews.

You can use the panels on the Security & Ops view to answer the following:

  • Where is MCP traffic coming from?

  • Who keeps getting denied, and is the cause a misconfiguration or probing?

  • Is MCP responsible for the SVC spike?

| Panel | What the panel shows | Notes |
|---|---|---|
| Tool Calls by Source IP | Top 20 source IPs by combined activity (tool calls + failed auths). | Behind a load balancer, this might show the load balancer IP, not the original client. |
| Access Denied Events by User | 403 and auth_access_denied events per user. | This panel covers MCP capability policy denials. Repeated denials typically mean the user is missing the required MCP capability or is probing. |
| MCP versus non-MCP Search Traffic | Search counts over time, split by MCP-originated versus other. | Only the User global filter applies to this panel. Auth Method filter is ignored. |

Access and Governance panels

Purpose: Who has MCP access, what tokens are live, and what config changes admins have made.

When to use: Quarterly access review, right after onboarding or offboarding, or when investigating 'who turned this tool off?'

You can use the panels on the Access & Governance view to answer the following:

  • Who currently has MCP execute or admin capability?

  • Which long-lived MCP-audience tokens exist, and who owns them?

  • What's the recent history of tool enables and disables, and who did them?

| Panel | What the panel shows | Notes |
|---|---|---|
| Users with MCP Capabilities | Users with mcp_tool_execute or mcp_tool_admin, joined from Splunk REST. | Hits Splunk REST directly. Global filters do not apply. Always shows current state. |
| Active MCP Tokens | Tokens whose audience claim is 'mcp'. | Hits Splunk REST directly. Global filters do not apply. Always shows current state. |
| Tool Catalog | Registered MCP tools and their current enabled state. | Reads from the KV store and the _internal index, not from REST. Built-in tools that have never been toggled might not yet appear. Global filters do not apply. |
| Tool Enable or Disable Actions | The tool_enabled and tool_disabled events with timestamp, action, tool, and the admin who performed the action. | Flows through _internal and does respect global filters, unlike the other panels in this view. |
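
To look at tool enable and disable activity outside the Dashboard, a search along the following lines might help. This is a sketch under an assumption, not the panel's actual SPL: it assumes tool_enabled and tool_disabled are values of the event_type field in the admin_console_dashboard sourcetype. The 7-day window and the 20-event limit are arbitrary examples:

```spl
index=_internal sourcetype=admin_console_dashboard
    event_type IN (tool_enabled, tool_disabled) earliest=-7d
| sort - _time
| head 20
```

The search returns the most recent matching events first. Inspect the raw events to find the field names that carry the tool and the admin, because those names are not documented on this page.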

Dashboard troubleshooting

See the following list for some issues you might encounter with the MCP Dashboard and how to resolve them.

Issue: A panel is empty
How to address: Confirm telemetry is flowing. You can run:
CODE
index=_internal sourcetype=admin_console_dashboard
| stats count by event_type

If the search returns no results, telemetry is not being emitted. Check the MCP server version and configuration.

Issue: Panel shows stale data
How to address: Check Settings > Searches > the MCP summary saved searches > History for failed runs.

Issue: Inactive Users times out
How to address: Narrow the time window. Note that this panel is expensive to run.

Issue: Does the Dashboard count against my Splunk license?
How to address: The events the Dashboard reads come from internal indexes and do not count against your ingest license.

Issue: Can I set alerts when auth failures spike?
How to address: Built-in thresholds and alerts are not included in version 1.2. As a workaround, you can clone the underlying panel SPL into a saved search and add an alert action.

Issue: Can I edit panels?
How to address: Clone the dashboard before editing. App upgrades overwrite the shipped XML.

Issue: Why are my numbers different between the Dashboard and an ad-hoc Splunk search?
How to address: The Dashboard runs SPL against raw indexes, which makes it more current but a bit slower for long lookbacks.
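
The alert workaround described in this section can be sketched as a threshold search. The following example is illustrative only and is not SPL shipped with the app. It alerts on tool call errors rather than auth failures, because tool_call_complete and tool_call_error are the only outcome events this page names. It assumes both are values of the event_type field, and the 15-minute window and 10 percent threshold are hypothetical. For an auth failure alert, start from the SPL of the relevant panel instead:

```spl
index=_internal sourcetype=admin_console_dashboard
    event_type IN (tool_call_complete, tool_call_error) earliest=-15m
| stats count AS total, count(eval(event_type="tool_call_error")) AS errors
| eval error_rate_pct=round(100 * errors / total, 2)
| where total > 0 AND error_rate_pct > 10
```

Saved as an alert that triggers when the number of results is greater than zero, this search produces a row only when the error share in the window exceeds the threshold. Because most panels lag real time by 5 to 20 minutes, a very short window might miss recent events.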

Known limitations

See the following table for the known limitations with the MCP Dashboard in version 1.2:

| Limitation | Details |
|---|---|
| Built-in alerts | Thresholds and alerts are not supported. |
| Inactive Users panel | Scans the full event history and can be slow. |
| Filter scope: MCP vs non-MCP Search Traffic | Honors all filters except Auth Method, because events in the _audit index do not have an auth_method field. |
| Filter scope: REST-driven Governance panels | Users with MCP Capabilities and Active MCP Tokens ignore all global filters and always show current state. |
| Lookback | Maximum 60 days. |