Use the service view for a complete view of your service health
Learn how to use service views in Splunk APM for a complete view of your service health.
As a service owner, you can use the service view in Splunk APM to get a complete view of your service health in a single pane of glass. The service view includes a service-level indicator (SLI) for availability, dependencies, request, error, and duration (RED) metrics, runtime metrics, infrastructure metrics, Tag Spotlight, endpoints, and logs for a selected service. You can also quickly navigate to code profiling and memory profiling for your service from the service view.
The service view is available for instrumented services, pub/sub queues, databases, and inferred services. See Service view support for various service types for details about the information available for each service type.
Access the service view for your service
You can access the service view for a specific service in several places.
You can search for the service using the search field in the top toolbar.
You can also access the service view for a specific service within the service map. Start by selecting Service Map on the APM landing page. Select a service in the service map, then select the link to the service in the panel.
Finally, you can also access the service view for a specific service by selecting the service from the APM landing page.
Use the service overview to monitor the health of your service
When you open the service view, an environment is selected based on your recently viewed environments. Adjust the environment and time range filters if necessary. Use the following sections to monitor the health of your service.
Service metrics
Use the following metrics in the Service metrics section to monitor the health of your service. Collapse sub-sections that are not relevant to you to customize your service view.
- Success rate SLI - The success service-level indicator (SLI) shows the percentage of time requests for your service were successful in the last 30 days. The chart shows successful and unsuccessful requests. If you configured a success rate service-level objective (SLO), an additional chart displays success rate over the compliance window you specified in your objective. See Measure and track your service health metrics with service level objectives (SLOs).
- Service map - displays the immediate upstream and downstream dependencies for the service you are viewing. The service map in the service view is limited to 20 services, sorted by highest request count. Hover over the chart and select View full service map to go to the service map.
- Service requests - displays streaming request data for the service. If you have detectors configured for service requests, triggered alerts display below the chart. Select the chart to view example traces. Select the alert icon to view alert details.
- Service latency - displays p50, p90, and p99 latency data for the service. If you have detectors configured for service latency, triggered alerts display below the chart. Select the chart to view example traces. Select the alert icon to view alert details.
- Service errors - displays streaming error data for the service. If you have detectors configured for the service error rate, triggered alerts display below the chart. Select the chart to view example traces. Select the alert icon to view alert details.
- Dependency latency by type - displays the latency for each of the downstream systems. Select the chart to see details about each system category. Systems are categorized as follows:
- Services - instrumented services
- Databases
- Inferred services - un-instrumented third-party services
- Pub/sub queues - Publisher and subscriber queues
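The charts above reduce to a handful of aggregations over request data. The following Python sketch illustrates the idea behind the success-rate SLI and the p50/p90/p99 latency percentiles; the sample data and helper functions are hypothetical illustrations, not a Splunk APM API:

```python
# Illustrative calculations behind the service metrics charts: success-rate
# SLI and latency percentiles. The sample data and helpers are hypothetical;
# Splunk APM computes these from indexed request metrics.

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Each request: (duration_ms, succeeded)
requests = [(12, True), (15, True), (11, True), (250, False), (14, True)]

durations = [d for d, _ in requests]
total = len(requests)
errors = sum(1 for _, ok in requests if not ok)

# Success-rate SLI: percentage of requests that succeeded in the window.
success_rate_sli = 100.0 * (total - errors) / total

# Latency percentiles, as shown in the Service latency chart.
p50, p90, p99 = (percentile(durations, p) for p in (50, 90, 99))
```

In practice the SLI is evaluated over the compliance window configured in your SLO, and the percentiles are computed per time bucket rather than over a flat list.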
Runtime metrics
Instrument your back-end applications to send spans to Splunk APM to view runtime metrics. See Instrument back-end applications to send spans to Splunk APM.
The available runtime metrics vary based on language. See Metric reference for more information.
Infrastructure metrics
If you are using the Splunk Distribution of the OpenTelemetry Collector and the SignalFx exporter, the service view displays infrastructure metrics for the environment and service you are viewing. See Get started with the Splunk Distribution of the OpenTelemetry Collector and SignalFx exporter.
The following infrastructure metrics are available:
- Host CPU usage
- Host memory usage
- Host disk usage
- Host network usage
- Pod CPU usage
- Pod memory usage
- Pod disk usage
- Pod network utilization
View Tag Spotlight for your service
Select Tag Spotlight to view Tag Spotlight filtered for your service. See Analyze service performance with Tag Spotlight to learn more about Tag Spotlight.
View errors for your service
Navigate to the Errors tab to visualize and troubleshoot errors for your service. On this tab, you can view:
- Errors by exception type: Displays a breakdown of errors based on the span attribute exception.type.
- Errors by HTTP status code: Displays a breakdown of errors based on the HTTP error status code. For more information about error status codes, see Semantic Conventions for HTTP Spans.
- Error type: Displays the error trend over time for each value of the span attribute exception.type, along with the total count of the error type in the selected time range.
Select a data point on any of the charts, or any row in the error table, to view related traces for that time period and error. This populates a side panel with example traces. Each example trace contains the following information:
- The service, operation, and exception.message span attribute associated with the error
- Stack trace: The exception.stacktrace for the error span
- Trace context: The service, operation, and exception.message for each of the error spans in the span tree within the trace
You can also view the Error breakdown section on the Overview tab to see errors categorized by type and by HTTP and gRPC status code.
To display errors, Splunk APM indexes the span attributes exception.type, http.response.status_code, and rpc.grpc.status_code as Troubleshooting MetricSets (TMS). To deactivate these metrics, navigate to the APM MetricSets page under the Settings menu, go to the Sources of Errors MetricSets section, and select Pause Indexing. You must be an admin to deactivate these metrics.
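As a rough illustration of where these attributes come from, the following Python sketch derives exception.type, exception.message, and exception.stacktrace from a raised exception using only the standard library. Real instrumentation records these on the error span (for example, OpenTelemetry tracers do this when recording exceptions), so treat this as a model of the attribute contents rather than the actual mechanism:

```python
# Model of OpenTelemetry-style exception span attributes, derived from a
# Python exception with the standard library only. Illustrative sketch;
# not how Splunk APM or the OpenTelemetry SDK is implemented.
import traceback

def exception_attributes(exc: BaseException) -> dict:
    return {
        "exception.type": type(exc).__qualname__,
        "exception.message": str(exc),
        "exception.stacktrace": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }

try:
    int("not-a-number")  # raises ValueError
except ValueError as exc:
    attrs = exception_attributes(exc)
# attrs now holds the three attribute values an error span would carry.
```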
View endpoints for your service
Select the Endpoints tab to view endpoints for the service. Use the search field to search for specific endpoints. Use the sort drop-down list to change how endpoints are sorted. Select an endpoint to view endpoint details or go to Tag Spotlight, traces, code profiling, or the dashboard for the endpoint.
View logs for your service
Select Logs to view logs for the environment and service you are viewing. By default, logs display for all indices that correspond to the first listed Log Observer Connect connection. Logs are filtered by the service you are viewing using the service.name value. If your logs do not have a service.name value, you can create an alias in Splunk Web. See Create field aliases in Splunk Web.
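To illustrate why the alias matters, this hypothetical Python sketch models the service.name filtering: a log record that carries the service under a different field name (here svc, an invented example) only matches after an alias maps it to service.name. Field names and functions are illustrative; real aliases are configured in Splunk Web:

```python
# Hypothetical model of service.name filtering for embedded logs. A record
# whose service is stored under another field (here "svc") needs an alias
# before the service filter can match it. Not a Splunk API.

def apply_alias(record, source="svc", target="service.name"):
    """Copy a source field to the target field when the target is missing."""
    if target not in record and source in record:
        record = {**record, target: record[source]}
    return record

def filter_by_service(records, service):
    return [r for r in records if r.get("service.name") == service]

logs = [
    {"message": "order placed", "service.name": "checkout"},
    {"message": "payment ok", "svc": "checkout"},  # matches only via the alias
]

matched = filter_by_service([apply_alias(r) for r in logs], "checkout")
```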
To select a different connection or refine which indices logs are pulled from, follow these steps:
1. Select Configure service view.
2. In the Log Observer Connect Index drop-down list, select the Log Observer Connect connection, then select the corresponding indices you want to pull logs from.
3. Select Apply.
4. Select Save changes.
The connection and indices you select are saved for all users in your organization for each unique service and environment combination.
Search embedded logs
You can search for specific keywords within logs embedded in the service centric view in APM, Kubernetes Navigator, and in charts and dashboards. Your search filters only the log table and does not affect the Log Chart Summary, so the summary continues to reflect all logs.
To search embedded logs in the APM service centric view, follow these steps:
1. In Splunk Observability Cloud, go to APM, then navigate to the service centric view for a service of your choice.
2. On the Logs tab, at the top of the logs table, enter the keyword that you want to search for in the search bar. Note: Searches are case-insensitive and treat the keywords you enter as a single string, aligning with Log Observer Connect behavior. When you view the logs in Log Observer Connect, the search persists to maintain context.
3. Press Enter on your keyboard. (There is no Search button.)
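The documented search semantics (case-insensitive, whole query matched as a single string rather than as separate keywords) can be modeled with a one-line check. This sketch is illustrative only, not Splunk's implementation:

```python
# Model of the embedded log search behavior: case-insensitive, and the
# entire query is matched as one contiguous string. Illustrative only.

def matches(log_line: str, query: str) -> bool:
    return query.lower() in log_line.lower()

# Matches as a phrase, regardless of case:
hit = matches("ERROR: Connection Timeout after 30s", "connection timeout")
# Does not match when the words appear separately:
miss = matches("connection lost; read timeout", "connection timeout")
```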
View traces for your service
Select Traces to view traces for the environment and service you are viewing. The Traces tab includes charts for Service requests and errors and Service latency. Select within the charts to see example traces.
Under the charts are lists of Traces with errors and Long traces. Select the trace ID link to open the trace in trace waterfall view. Select View more in Trace Analyzer to search additional traces. See Explore your traces using Trace Analyzer in Splunk APM for more information about using Trace Analyzer to search traces.
View top commands or queries for your databases
If you select a Redis or SQL database from the service dropdown menu, you can select Database Query Performance to view top commands or queries for your database. See Monitor Database Query Performance to learn more.
Go to the code profiling view for your service
Select Code profiling to go to the code profiling view of AlwaysOn Profiling filtered for your service. See Introduction to AlwaysOn Profiling for Splunk APM to learn more about AlwaysOn Profiling.
Go to the memory profiling view for your service
Select Memory profiling to go to the memory profiling view of AlwaysOn Profiling filtered for your service. See Introduction to AlwaysOn Profiling for Splunk APM to learn more about AlwaysOn Profiling.
Configure the service view
Select Configure service view to modify the Log Observer Connect connection and indices for the logs you want to display for your service:
1. In the Log Observer Connect Index drop-down list, select the Log Observer Connect connection, then select the corresponding indices you want to pull logs from.
2. Select Apply.
3. Select Save changes.
The connection and indices you select are saved for all users in your organization for each unique service and environment combination.
Service view support for various service types
The information available in your service view varies based on the type of service you select. The following table shows which sections are available for each service type.
Service view section | Instrumented services | Databases | Pub/sub queues | Inferred services |
---|---|---|---|---|
Overview | Yes, includes service metrics, runtime metrics, and infrastructure metrics | Yes, includes only service metrics | Yes, includes only service metrics | Yes, includes only service metrics |
Tag Spotlight | Yes | Yes | Yes | Yes |
Endpoints | Yes | No | No | Yes |
Logs | Yes | Yes | Yes | Yes |
Traces | Yes | Yes | Yes | Yes |
Database Query Performance | No | Yes, only displays for Redis and SQL databases. | No | No |
Code profiling | Yes | No | No | No |
Memory profiling | Yes | No | No | No |
Metric reference
The following metrics are used in the service view.
Service metrics
Chart | Metrics |
---|---|
Service requests | |
Service latency | |
Service errors | |
SLI/SLO | |
.NET runtime metrics
Chart | Metrics |
---|---|
Heap usage | |
GC collections | |
Application activity | |
GC heap size | |
GC pause time | |
Monitor lock contention | |
Threadpool thread | |
Exceptions | |
Java runtime metrics
Charts | Metrics |
---|---|
Memory usage | |
Class loading | |
GC activity | jvm.gc.duration |
Thread count | jvm.thread.count |
For more information on these metrics, see Migration guide for OpenTelemetry Java 2.x metrics.
Node.js runtime metrics
Charts | Metrics |
---|---|
Heap usage | |
Resident set size | |
GC activity | |
Event loop lag | |
Infrastructure metrics
Chart | Metrics |
---|---|
Host CPU usage | |
Host memory usage | |
Host disk usage | |
Host network usage | |
Pod CPU usage | |
Pod memory usage | |
Pod disk usage | |
Pod network utilization | |