Learn about Monitoring MetricSets in Splunk APM

Learn about MetricSets in Splunk Observability Cloud. MetricSets are metrics for traces and spans in Splunk APM.

MetricSets are key performance indicators, like request rate, error rate, and request duration, that are calculated from traces and spans in Splunk APM. MetricSets are metric time series (MTS) that are specific to Splunk APM. See Metric time series to learn more.

There are 2 categories of MetricSets: Troubleshooting MetricSets (TMS), used for high-cardinality troubleshooting, and Monitoring MetricSets (MMS), used for real-time monitoring.

Monitoring MetricSets overview

Monitoring MetricSets (MMS) are MTS that power monitoring capabilities in Splunk APM, including:

  • Charts and dashboards.

  • The APM Overview page and the dashboard view.

  • Alerts. Detectors monitor MMS to generate alerts.

Splunk APM provides 5 default types of MMS, each with a set of dimensions such as deployment.environment and service.name. You can configure additional custom dimensions by indexing other span attributes.

MMS are created for spans where the span.kind has a value of SERVER or CONSUMER. Spans might lack a kind value, or have a different kind value, in the following situations:

  • The span originates in self-initiating operations or inferred services.

  • An error in instrumentation occurs.
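The span kind rule above can be sketched in a few lines of Python. This is an illustration only: the span dictionaries and the generates_mms helper are hypothetical and not part of any Splunk API. The only fact taken from the text is that MMS are created for spans whose kind is SERVER or CONSUMER.

```python
# Span kinds for which the text says MMS are created.
MMS_SPAN_KINDS = {"SERVER", "CONSUMER"}


def generates_mms(span: dict) -> bool:
    """Return True if the span's kind is one that produces MMS.

    `span` is a hypothetical dict; a missing or empty "kind" yields False.
    """
    kind = span.get("kind") or ""
    return kind.upper() in MMS_SPAN_KINDS


spans = [
    {"name": "GET /checkout", "kind": "SERVER"},
    {"name": "process-order", "kind": "CONSUMER"},
    {"name": "db.query", "kind": "CLIENT"},
    {"name": "background-job"},  # no kind recorded
]
eligible = [s["name"] for s in spans if generates_mms(s)]
print(eligible)
```

Spans such as the last two in the sample list, which have a different kind or no kind at all, do not contribute to MMS under this rule.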

Splunk Observability Cloud stores MMS for 13 months by default.

Available default MMS and dimensions

The following table describes the available default MMS. Each MMS is a metric time series that corresponds to the measurements of an APM component and has a set of dimensions you can use to monitor and alert on service performance.

MMS name: service.request
Description: The requests to endpoints in a service.
Dimensions:
  • sf_environment

  • deployment.environment (only available for histogram MMS)

  • sf_service

  • service.name (only available for histogram MMS)

  • sf_error

Custom dimensions available? Yes

MMS name: inferred.services
Description: The requests to a service that has not yet been instrumented.
Dimensions:
  • sf_service

  • service.name (only available for histogram MMS)

  • sf_environment

  • deployment.environment (only available for histogram MMS)

  • sf_error

  • sf_kind

  • sf_operation

  • sf_httpMethod

Custom dimensions available? No

MMS name: spans
Description: The count of spans (a single operation).
Dimensions:
  • sf_environment

  • deployment.environment (only available for histogram MMS)

  • sf_service

  • service.name (only available for histogram MMS)

  • sf_operation

  • sf_kind

  • sf_error

  • sf_httpMethod, where relevant

Custom dimensions available? Yes

MMS name: traces
Description: The count of traces (collection of spans that represents a transaction).
Dimensions:
  • sf_environment

  • deployment.environment (only available for histogram MMS)

  • sf_service

  • service.name (only available for histogram MMS)

  • sf_operation

  • sf_httpMethod

  • sf_error

Custom dimensions available? No

MMS name: workflows
Description: Created by default when you create a business transaction.
Dimensions:
  • sf_environment

  • deployment.environment (only available for histogram MMS)

  • sf_workflow

  • sf_error

Custom dimensions available? No

Monitoring MetricSets are stored as histograms, a type of metric that provides accurate representations of large amounts of data. For more information on histograms, see Histogram metrics in Splunk Observability Cloud.

To generate metrics from these histograms, you must apply a function (such as count, max, min, or percentile) to summarize the data. You can do this using the Chart Builder or SignalFlow.

The following table describes:

  • Examples of metrics that can be generated by applying a function to an MMS. The example metrics are based on an MMS that measures requests.

  • The previous non-histogram MMS, which were classified as either counter or gauge metric types.

To calculate this example metric | Use the MMS with this function | Previous non-histogram MMS
Request count | <MMS_name> with a count function | <MMS_name>.count
Minimum request duration | <MMS_name> with a min function | <MMS_name>.duration.ns.min
Maximum request duration | <MMS_name> with a max function | <MMS_name>.duration.ns.max
Median request duration | <MMS_name> with a median function | <MMS_name>.duration.ns.median
Percentile request duration | <MMS_name> with a percentile function and a percentile value | <MMS_name>.duration.ns.p90
Percentile request duration | <MMS_name> with a percentile function and a percentile value | <MMS_name>.duration.ns.p99
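If you are migrating charts that reference the previous non-histogram names, the mapping in the table can be expressed as a small lookup. The following Python sketch is a hypothetical helper, not a Splunk tool. It assumes that previous names follow exactly the suffix patterns listed in the table and returns the histogram MMS name, the function to apply, and the percentile value where one applies.

```python
import re

# Suffixes of the previous non-histogram MMS, mapped to the function
# the table pairs them with. Percentile suffixes (p90, p99, ...) are
# handled separately because they carry a percentile value.
_SUFFIX_TO_FUNCTION = {
    ".count": "count",
    ".duration.ns.min": "min",
    ".duration.ns.max": "max",
    ".duration.ns.median": "median",
}
_PERCENTILE = re.compile(r"^(?P<mms>.+)\.duration\.ns\.p(?P<pct>\d+)$")


def to_histogram(legacy_name: str):
    """Map a previous MMS name to (MMS name, function, percentile or None)."""
    match = _PERCENTILE.match(legacy_name)
    if match:
        return match.group("mms"), "percentile", int(match.group("pct"))
    for suffix, function in _SUFFIX_TO_FUNCTION.items():
        if legacy_name.endswith(suffix):
            return legacy_name[: -len(suffix)], function, None
    raise ValueError(f"not a recognized previous MMS name: {legacy_name}")


for name in ("spans.count", "service.request.duration.ns.p99"):
    print(name, "->", to_histogram(name))
```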

Example histogram metrics in APM

In SignalFlow, a histogram MTS uses the following syntax: histogram(metric=<metric_name>[,filter=<filter>][,resolution=<resolution>]).

The following table displays example SignalFlow functions.

Example SignalFlow functions

Description: Aggregate count of all MTS
Previous MMS function:
    A = data('spans.count').sum().publish(label='A')
Histogram MMS function (recommended):
    A = histogram('spans').count().publish(label='A')

Description: P90 for a single MTS
Previous MMS function:
    filter_ = filter('sf_environment', 'environment1') and filter('sf_service', 'service 1') and filter('sf_operation', 'operation1') and filter('sf_httpMethod', 'POST') and filter('sf_error', 'false')
    A = data('spans.duration.ns.p90', filter=filter_, rollup='sum').publish(label='A')
Histogram MMS function (recommended):
    filter_ = filter('sf_environment', 'environment1') and filter('sf_service', 'service 1') and filter('sf_operation', 'operation1') and filter('sf_httpMethod', 'POST') and filter('sf_error', 'false')
    A = histogram('spans', filter=filter_).percentile(pct=90).publish(label='A')

Description: Combined p90 for multiple services
Previous MMS function:
    A = data('service.request.duration.ns.p90', filter=filter('sf_service', 'service 2', 'service 1'), rollup='average').mean().publish(label='A')
Histogram MMS function (recommended):
    A = histogram('service.request', filter=filter('sf_service', 'service 2', 'service 1')).percentile(pct=90).publish(label='A')

Note: An aggregation is applied on histogram(). To display all of the metric sets separately, each dimension must be applied as a group-by.
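To illustrate the note about group-by, the following Python sketch composes the text of a histogram() program that keeps one series per service. The helper names are hypothetical. The sketch also assumes that aggregation methods on histogram() accept a by= argument in the same way that aggregations on data() do; confirm this against the SignalFlow reference before relying on it.

```python
def sf_filter(dimension: str, *values: str) -> str:
    """Render one SignalFlow filter() call; several values match any of them."""
    quoted = ", ".join(f"'{v}'" for v in (dimension, *values))
    return f"filter({quoted})"


def histogram_percentile(mms, filters, pct, group_by=(), label="A"):
    """Render a histogram() percentile program as SignalFlow text."""
    args = [f"'{mms}'"]
    if filters:
        args.append("filter=" + " and ".join(filters))
    by = ""
    if group_by:
        # Assumption: percentile() on a histogram stream accepts by=[...].
        by = ", by=[" + ", ".join(f"'{d}'" for d in group_by) + "]"
    return (
        f"{label} = histogram({', '.join(args)})"
        f".percentile(pct={pct}{by}).publish(label='{label}')"
    )


program = histogram_percentile(
    "service.request",
    [sf_filter("sf_service", "service 1", "service 2")],
    pct=90,
    group_by=["sf_service"],
)
print(program)
```

Without the group_by argument, the helper produces the same single combined p90 program as the last example in the table.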

About custom Monitoring MetricSets

You can add up to 5 custom dimensions to an MMS by indexing span attributes. Use custom MMS to filter and aggregate the MMS by specific indexed span attributes or processes such as cloud.provider and service.version.

You can create custom MMS at the service level and the endpoint or span level. When you create a custom dimension for a service-level MMS, APM creates an MMS that includes the service-level metrics with your chosen indexed span attribute or process as a custom dimension. If you add endpoint-level metrics, APM creates MMS that include span-level metrics, with your chosen indexed span attribute or process as a custom dimension.

Each custom MMS with multiple dimensions has a unique multi-dimension MMS ID associated with it. To ensure that you receive accurate metrics for your custom MMS, you must filter on the MMS ID when you create charts and detectors with custom MMS. The APM & RUM MetricSets page displays MMS IDs in the MMS (MMS ID) column, as displayed in the following image.

After you create an MMS with custom dimensions, you can use the custom dimensions to create charts, dashboards, and alerts. You can only use multiple custom dimensions at the same time if they were created as part of the same custom MMS.

To learn more about a specific scenario for custom MMS, see Monitor detector service latency for a group of customers.

To learn how to create a custom MMS, see Create a Monitoring MetricSet with custom dimensions.

Scope of custom Monitoring MetricSets

You can create custom MMS for endpoints (spans), services (service.request), and inferred services (inferred.services), but not for business transactions (workflows) or traces (traces) at this time. Custom MMS aren't supported for global tags. See Available default MMS and dimensions.

Metrics and dimensions of custom Monitoring MetricSets

Each MMS has a set of metrics and dimensions for spans and traces that you can use to monitor and alert on service performance. To prevent overcounting metrics in aggregations, the built-in dashboards and charts in Splunk APM automatically exclude custom MMS. Custom MMS carry a marker dimension, sf_dimensionalized: true, which you can filter on to include them.

When you create your dashboards and charts, you can exclude custom MMS by adding a filter on !sf_dimensionalized: true. If you want to look at the time series of a custom MMS in your charts, filter on sf_dimensionalized: true and then aggregate by the custom dimension or multi-dimension MMS ID you want to look at.
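In SignalFlow terms, the include and exclude cases can be sketched as filter expressions. The Python helper below is hypothetical and only renders filter text. It assumes that the marker dimension and the sf_mms_id dimension can be matched with ordinary filter() calls, and it negates a filter with SignalFlow's not operator.

```python
# Filter text that matches only custom MMS, per the marker dimension.
CUSTOM_MARKER = "filter('sf_dimensionalized', 'true')"


def custom_mms_filter(include_custom: bool, mms_id=None) -> str:
    """Render a filter that includes or excludes custom MMS.

    When including, an optional multi-dimension MMS ID narrows the match
    to a single custom MMS.
    """
    if not include_custom:
        return f"not {CUSTOM_MARKER}"
    if mms_id is None:
        return CUSTOM_MARKER
    return f"{CUSTOM_MARKER} and filter('sf_mms_id', '{mms_id}')"


print(custom_mms_filter(False))
print(custom_mms_filter(True, "version_featureFlag_gprc"))
```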

Use cases for Monitoring MetricSets with multiple custom dimensions

Custom dimensions enable you to filter or group your MMS for more detailed monitoring and analysis. The following examples describe use cases for a custom MMS with multiple dimensions.

Use case: Create a chart with a custom multi-dimension MMS

Consider a scenario where you want to monitor the number of requests that result in client errors for version 350.9 of a service named paymentservice.

You know that Splunk APM provides the default MMS metric service.request to track requests to endpoints in a service, and the default dimension service.name to filter data by service.

To create this chart, you:

  1. Create a custom MMS with dimensions for the following indexed span attributes: version, featureFlag, and grpc.status_code. You assign the MMS ID version_featureFlag_gprc to the custom MMS.

  2. Create a chart with the default MMS metric service.request, filtered by:

    1. The default dimension key and value service.name:paymentservice.

    2. The multi-dimension MMS ID key and value sf_mms_id:version_featureFlag_gprc.

    3. The indexed span attribute key and value version:v350.9.

The following screenshot displays the example chart created for this use case.

A screenshot of an example chart that uses a custom multi-dimension MMS to display the client errors for a specific version of a service.
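For reference, the three filters from step 2 could be combined into a single SignalFlow program along the following lines. This Python sketch only assembles the program text from the keys and values named in the steps. It applies a count function, on the assumption that the chart counts requests, and it adds no further error filter because the steps don't list one.

```python
# Dimension filters from step 2, as (key, value) pairs.
chart_filters = [
    ("service.name", "paymentservice"),
    ("sf_mms_id", "version_featureFlag_gprc"),
    ("version", "v350.9"),
]

# Join the individual filter() calls with "and" so all three must match.
filter_expr = " and ".join(f"filter('{k}', '{v}')" for k, v in chart_filters)
program = (
    f"A = histogram('service.request', filter={filter_expr})"
    ".count().publish(label='A')"
)
print(program)
```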

Cardinality contribution of indexed span attributes and processes

When you index a new span attribute or process to create custom MetricSets, Splunk APM runs a cardinality contribution analysis to calculate the potential total cardinality contribution after indexing the span attribute or process. This gives you control of what you index and helps you to account for organization subscription limits.

If you try to index a span attribute or process that might increase the total cardinality contribution beyond your limit, you can change the existing cardinality contribution of indexed tags or processes for instrumented services by modifying or removing indexed span attributes or processes.
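To get a feel for why indexing attributes can raise cardinality quickly, consider a rough worst-case estimate. The following Python sketch is not the analysis that Splunk APM runs. It is a generic upper bound that assumes every combination of attribute values occurs on every existing MTS, which real data rarely does. The function name and the sample numbers are hypothetical.

```python
from math import prod


def worst_case_mts(existing_mts: int, distinct_values: dict) -> int:
    """Upper bound on MTS after adding dimensions.

    `distinct_values` maps each newly indexed attribute to the number of
    distinct values it takes. Assumes all combinations occur.
    """
    return existing_mts * prod(distinct_values.values())


# Hypothetical numbers: 40 existing MTS, 3 versions, 2 cloud providers.
estimate = worst_case_mts(40, {"service.version": 3, "cloud.provider": 2})
print(estimate)
```

Removing an attribute with many distinct values lowers this bound the most, which is why modifying or removing indexed span attributes is the suggested way to get back under a limit.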

To see your TMS or MMS subscription limit, go to Settings, then Subscription usage. Depending on your organization subscription, you might need to go to Settings, then Billing and usage instead. Select the APM tab, and then select the Troubleshooting MetricSets or Monitoring MetricSets panel to view the corresponding subscription limit. You must have an admin or usage role to view subscription limits. To learn more about APM usage and billing, see Monitor Splunk APM billing and subscription usage.

Use MMS within Splunk APM

Use MMS for alerting and real-time monitoring in Splunk APM. You can create charts, dashboards, and alerts based on default and custom Monitoring MetricSets.

To use a default MMS, use one of the following workflows and filter by the default MMS name.

To use a custom MMS, see the Next steps section of Create a Monitoring MetricSet with custom dimensions.