Key concepts in Splunk APM

Learn about key concepts in Splunk APM.

Splunk APM is an Application Performance Monitoring solution that collects every span from every instrumented application to give you a complete picture of the interactions among the microservices that make up your distributed system.

Concept

Description

business transaction

A set of correlated traces that track a transaction or user flow of particular interest.

cardinality

The number of distinct values in a data set.

endpoint

An access point for a resource or action. Endpoints provide information about how a service is called in a trace.

environment

A distinct deployment of your application that doesn’t interact directly with other deployments of the same application.

indexed span tag

A span tag for which Splunk generates Troubleshooting MetricSets.

inferred service

A remote service that is not instrumented in Splunk APM, but which Splunk APM can identify based on information in spans that make calls to the remote service.

MetricSet

A set of metric time series capturing the values of key indicators over time, such as request rate, error rate, and durations, calculated based on your traces and spans in Splunk APM.

Monitoring MetricSets (MMS)

Metric time series that power the real-time monitoring capabilities in Splunk APM, including charts, dashboards, and detectors.

operation

The actions that a service performs to respond to a request.

service

A unit of software that connects to other services to make up a complete application.

service map

A user interface page that visualizes your instrumented and inferred services and their relationships.

span

A single operation within a system of applications and services.

span tag

A piece of metadata attached to a span that provides more information about the operation the span represents.

Tag Spotlight

A top-down view of your services based on indexed span tags.

trace

A collection of operations, known as spans, that represents a unique transaction an application handles.

Trace Analyzer

A user interface page that displays the traces generated by your applications. You can use Trace Analyzer to search traces and explore trace data.

trace view

A user interface page that displays a span waterfall chart for a specific trace. You can use this view to search for spans within a trace.

Troubleshooting MetricSets (TMS)

Metric time series used for high-cardinality troubleshooting of identities in APM and for historical comparison among spans and business transactions.

Services

Services are the key components of the systems you can monitor with Splunk APM. The following are service-related terms and concepts.

endpoint

In a service API, an endpoint is an access point for a resource or action. For example, an e-commerce service could use the endpoint /ecommerce/users to access user profiles and the endpoint /ecommerce/checkout to perform a checkout action.

Endpoint names are often URLs, but can also be other types of network addresses or communication interfaces. The endpoint name is derived from the name of the first span for each service invoked as part of a trace. In other words, Splunk APM generates an endpoint when the first span on a service has a span.kind of SERVER or CONSUMER.
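The rule above can be sketched in a few lines of Python. This is an illustrative model only, not Splunk APM's implementation: the Span class, its field names, and the sample trace are hypothetical, and "first span on a service" is assumed to mean a span whose parent is missing or belongs to a different service.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Span:
    # Hypothetical, simplified span record used only for this sketch.
    span_id: str
    parent_id: Optional[str]
    service: str
    name: str
    kind: str  # for example "SERVER", "CLIENT", "CONSUMER", "INTERNAL"


def endpoints_by_service(spans: List[Span]) -> Dict[str, Set[str]]:
    """Collect endpoint names per service for the spans of one trace.

    Assumption: a span is the first span on its service when it has no
    parent in the trace or its parent belongs to a different service.
    """
    by_id = {s.span_id: s for s in spans}
    endpoints: Dict[str, Set[str]] = {}
    for s in spans:
        parent = by_id.get(s.parent_id) if s.parent_id else None
        is_entry = parent is None or parent.service != s.service
        if is_entry and s.kind in ("SERVER", "CONSUMER"):
            endpoints.setdefault(s.service, set()).add(s.name)
    return endpoints


# Hypothetical trace for an e-commerce checkout request.
trace = [
    Span("1", None, "frontend", "GET /checkout", "SERVER"),
    Span("2", "1", "frontend", "convertPrice", "INTERNAL"),
    Span("3", "1", "frontend", "POST /orders", "CLIENT"),
    Span("4", "3", "orders", "POST /orders", "SERVER"),
    Span("5", "4", "orders", "SQL Select", "CLIENT"),
]

print(endpoints_by_service(trace))
```

In this sample, only the entry spans of the two services yield endpoints. The internal function call, the outgoing client call, and the database query are operations but not endpoints.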

Endpoints provide information about how a service is called in a trace. A service typically has one or more endpoints associated with it.

You can monitor endpoints to analyze the performance of a particular part of a service, such as a specific API route. To monitor and troubleshoot the performance of an endpoint in a service, you can use the trace view, Tag Spotlight, the breakdown feature in the service map, and the Endpoints tab in the service view.

environment

The term "environment" refers to the deployment environment, which is a distinct deployment of your application that doesn’t interact directly with other deployments of the same application. Separate deployment environments are often used for different stages of the development process, such as development, staging, and production. For more information, see Set up deployment environments in Splunk APM.

inferred service

A remote service that is not instrumented in Splunk APM, but which Splunk APM can identify based on information in spans that make calls to the remote service. Inferred services often include external service providers, pub/subs, Remote Procedure Calls (RPCs), and databases. To learn more, see Inferred services in Splunk APM.

instrumented service

A service that sends its spans to Splunk APM. Use the OpenTelemetry Collector to instrument a service so that it sends its spans to Splunk APM. The SignalFx Smart Agent is now deprecated and will reach end of support on June 30, 2023. To migrate from the Smart Agent to the Collector, see the migration guide.

See Instrument back-end applications to send spans to Splunk APM to learn more about instrumenting services.

operation

The actions that a service performs to respond to a request. Each operation in an instrumented service is represented by a span. Operations are derived from span names and describe what the service is doing at any point during a request.

In the context of an e-commerce application, examples of operations that are not endpoints include:

  • Database query: SQL Select

  • Cache operation: cache.get

  • Internal function: convertPrice

  • Batch or background task: process_order

Examples of operations that are also endpoints include:

  • GET /checkout

  • POST /orders

To directly monitor and troubleshoot the performance of an operation in a service, you can investigate the span that represents the operation. To monitor operations over time and their performance as part of a trace, you can use the trace view, Tag Spotlight, and the Endpoints tab in the service view.

service

A service is a small, flexible, and autonomous unit of software that connects to other services to make up a complete application. A service typically represents a collection of API endpoints and operations that work together with other services’ endpoints in a distributed and dynamic architecture to deliver the full functionality of an application.

"Service" is an umbrella term that encompasses container services (such as Docker and Kubernetes), microservices, and even calls to serverless functions. By instrumenting each of the services that make up your application, you can collect spans that represent operations within services and traces that represent collections of operations across services, and then analyze and monitor this activity in Splunk APM.

service map

A user interface page that visualizes your instrumented and inferred services and their relationships. The service map is dynamically generated based on your selections in the time range, environment, business transaction, service, and tag filters. See View dependencies in the service map in Splunk APM to learn more about using the service map in APM, or see Scenario: Kai investigates the root cause of an error with the Splunk APM service map for a dedicated scenario.

Traces and spans

Spans and traces form the backbone of application monitoring in Splunk APM. The following image illustrates the relationship between traces and spans:

This image shows a trace represented by a series of multicolored bars labeled with the letters A, B, C, D, and E. Each lettered bar represents a single span. The spans are organized to visually represent a hierarchical relationship in which span A is the parent span and the subsequent spans are its children.
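The parent-child structure described for the image can be modeled as a small tree. The following Python sketch is illustrative: the image description only states that span A is the parent, so the exact nesting of spans B through E below is an assumption, and the PARENTS mapping and render function are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Assumed hierarchy: A is the root span. The nesting of B, C, D, and E
# is hypothetical and chosen only to show more than one level.
PARENTS: Dict[str, Optional[str]] = {
    "A": None,
    "B": "A",
    "C": "A",
    "D": "B",
    "E": "C",
}


def render(parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the spans of one trace as an indented parent-child outline."""
    children = defaultdict(list)
    root = None
    for span, parent in parents.items():
        if parent is None:
            root = span
        else:
            children[parent].append(span)

    lines: List[str] = []

    def walk(span: str, depth: int) -> None:
        lines.append("  " * depth + span)
        for child in children[span]:
            walk(child, depth + 1)

    if root is not None:
        walk(root, 0)
    return lines


print("\n".join(render(PARENTS)))
```

Each span records only its own parent. The trace as a whole is recovered by following those links from the root span downward.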

The following are terms and concepts related to spans and traces.

business transaction

A set of related traces that track a transaction or user flow of particular interest.

indexed span tag

When you index a span tag, you indicate to Splunk APM that you are particularly interested in this tag and would like to generate additional analytics for it. Indexing a span tag generates Troubleshooting MetricSets for that tag. When you index a service-level span tag, you also have the option to generate custom dimensionalized Monitoring MetricSets using that span tag as a dimension.

To learn how to index a span tag, see Index span tags to create Troubleshooting MetricSets.
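To give a rough sense of the kind of per-tag analytics that indexing makes possible, the following Python sketch groups spans by the value of one tag and summarizes each group. It is a conceptual illustration under stated assumptions, not how Splunk APM computes Troubleshooting MetricSets: the span tuples, the version tag, and the summarize_by_tag function are all hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical finished spans: (tags, duration in milliseconds, error flag).
spans = [
    ({"service.name": "checkout", "version": "1.2"}, 120, False),
    ({"service.name": "checkout", "version": "1.2"}, 80, False),
    ({"service.name": "checkout", "version": "1.3"}, 300, True),
    ({"service.name": "checkout", "version": "1.3"}, 100, False),
]


def summarize_by_tag(spans, tag):
    """Group spans by the value of one tag and summarize each group."""
    groups = defaultdict(list)
    for tags, duration_ms, is_error in spans:
        if tag in tags:
            groups[tags[tag]].append((duration_ms, is_error))

    summary = {}
    for value, rows in groups.items():
        durations = [d for d, _ in rows]
        errors = sum(1 for _, e in rows if e)
        summary[value] = {
            "requests": len(rows),
            "error_rate": errors / len(rows),
            "median_ms": median(durations),
        }
    return summary


print(summarize_by_tag(spans, "version"))
```

Grouping by the hypothetical version tag separates the requests, errors, and durations of each release, which is the sort of comparison a top-down view by tag value supports.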

span

A single operation within a system of applications and services. Spans include span tags, which provide metadata such as the location and duration of the operations they represent. A group of related spans makes up a trace. For more information, see Manage services, spans, and traces in Splunk APM.

span tag

A piece of metadata attached to a span that provides more information about the operation the span represents. Examples of span tags include service.name and http.operation. You can add span tags to spans during instrumentation or in the Splunk Distribution of the OpenTelemetry Collector. Span tags are also known as "attributes" in the OpenTelemetry context.

For more information, see Analyze services with span tags and MetricSets in Splunk APM.

Tag Spotlight

The Tag Spotlight view in Splunk APM offers a top-down view of your services based on indexed span tags.

trace

A collection of related operations, known as spans, that represents a unique transaction an application handles. For more information, see Manage services, spans, and traces in Splunk APM.

Trace Analyzer

A user interface page that displays the traces generated by your applications. You can use Trace Analyzer to explore trace data and search traces to find the precise source of a particular issue.

trace view

A user interface page that displays a span waterfall chart for a specific trace. You can use this view to search for spans within a trace.

MetricSets

MetricSets are the central type of metric data that power Splunk APM.

A MetricSet is a set of metric time series capturing the values of key indicators over time, such as request rate, error rate, and durations, calculated based on your traces and spans in Splunk APM. Generate MetricSets by indexing span tags of interest. The following are terms and concepts related to MetricSets.

cardinality

The number of distinct values in a data set. Low cardinality data has a small number of distinct values. High cardinality data has a large number of distinct values, and requires more computation to analyze and more storage to retain.
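As a minimal illustration of the definition, the following Python sketch counts the distinct values of a tag across a handful of spans. The tag names and values are hypothetical examples, and the cardinality function is defined here only for this sketch.

```python
def cardinality(spans, tag):
    """Count the distinct values of one tag across a list of span tag maps."""
    return len({tags[tag] for tags in spans if tag in tags})


# Hypothetical span tags: the HTTP method takes few values,
# while a per-user identifier takes a new value for almost every span.
spans = [
    {"http.method": "GET", "user.id": "u1"},
    {"http.method": "POST", "user.id": "u2"},
    {"http.method": "GET", "user.id": "u3"},
    {"http.method": "GET", "user.id": "u4"},
]

print(cardinality(spans, "http.method"))  # low cardinality
print(cardinality(spans, "user.id"))      # high cardinality
```

A tag like the per-user identifier grows with traffic, which is why high cardinality data costs more to analyze and store than a tag with a fixed, small set of values.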

See Troubleshoot cardinality in Monitoring MetricSets to learn more about working with high cardinality data.

Monitoring MetricSets (MMS)

Metric time series used to monitor and alert on the performance of your services in real time. MMS power the real-time APM landing page and the dashboard view, and are the metrics that detectors monitor and use to generate alerts. MMS use the same functionality as metric time series in Infrastructure Monitoring to monitor and alert on the performance of applications and services.

For more information about MMS, see Learn about Monitoring MetricSets in Splunk APM.

Troubleshooting MetricSets (TMS)

Metric time series used for high-cardinality troubleshooting of identities in APM and for historical comparison among spans and business transactions. Splunk APM generates TMS based on indexed span tags.

To learn more, see Learn about Troubleshooting MetricSets in Splunk APM.