Learn about MetricSets

Learn about MetricSets and their default contents and dimensions.

A MetricSet is a time series of related metrics. MetricSets typically contain only your most important key performance indicators (KPIs). In Splunk RUM, there are two types of MetricSets:

Troubleshooting MetricSets

Troubleshooting MetricSets (TMS) contain high-cardinality metrics that use a span tag as a dimension. Dimensions let you group and analyze metrics using the distinct values of the span tag. For example, if a Troubleshooting MetricSet uses a span tag named user.id as a dimension, you can group its metrics by each distinct value of user.id. This allows you to analyze performance or issues specific to those user.id values, and to make historical comparisons across spans and workflows.

Splunk RUM indexes several span tags by default, and automatically creates Troubleshooting MetricSets for them. You can't modify or stop this behavior. In addition, you can create custom Troubleshooting MetricSets by indexing additional span tags. All indexed span tags are dimensions of the MetricSet that contains them.

Contents

In Splunk RUM, every Troubleshooting MetricSet contains the following metrics:

  • RED metrics (request count, error count, duration), which appear when you select a service in the service map

  • Request rate

  • Error rate

  • Root cause error rate

  • p50, p90, and p99 latency

Precision

The measurement precision of Troubleshooting MetricSets is 10 seconds. Splunk RUM reports quantities from a distribution of metrics for each 10-second reporting window.

Visibility

Troubleshooting MetricSets appear on the service map and in Tag Spotlight. Use Troubleshooting MetricSets to filter the service map and to create breakdowns across the values of a given indexed span tag or process. See View dependencies among your services in the service map.

Retention

Splunk Observability Cloud retains Troubleshooting MetricSets for the same amount of time as raw traces. By default, the retention period is 8 days. Learn more about best practices for span tags and Troubleshooting MetricSets.

Monitoring MetricSets

Monitoring MetricSets (MMS) contain endpoint-level and service-level metrics. You can use Monitoring MetricSets to create dashboards, charts, detectors, and alerts, and to monitor your environment in real time.

Contents

In Splunk RUM, every Monitoring MetricSet tracks either a specific endpoint or an aggregate of all endpoints in a service.

Endpoint-level Monitoring MetricSets reflect the activity of a single endpoint in a service, while service-level Monitoring MetricSets aggregate the activity of all of the endpoints in the service. Monitoring MetricSets are created for spans where the span.kind has a value of SERVER or CONSUMER.

Spans might lack a kind value, or have a different kind value, in the following situations:

  • The span originates in self-initiated operations or inferred services.

  • An error in instrumentation occurs.

Available default Monitoring MetricSet metrics and dimensions

Monitoring MetricSets are available for the Splunk RUM components listed below. Each Monitoring MetricSet also has a set of dimensions you can use to monitor and alert on service performance. In addition to the following default Monitoring MetricSets, you can create custom Monitoring MetricSets to analyze your data in greater depth. See Create a Monitoring MetricSet with a custom dimension.

service.request - the requests to endpoints in a service
  Dimensions:
    • sf_environment
    • deployment.environment - This dimension is only available for histogram MMS.
    • sf_service
    • service.name - This dimension is only available for histogram MMS.
    • sf_error
  Custom dimension available: Yes

inferred.services - the requests to a service that has not yet been instrumented
  Dimensions:
    • sf_service
    • service.name - This dimension is only available for histogram MMS.
    • sf_environment
    • deployment.environment - This dimension is only available for histogram MMS.
    • sf_error
    • sf_kind
    • sf_operation
    • sf_httpMethod
  Custom dimension available: No

spans - the count of spans (a single operation)
  Dimensions:
    • sf_environment
    • deployment.environment - This dimension is only available for histogram MMS.
    • sf_service
    • service.name - This dimension is only available for histogram MMS.
    • sf_operation
    • sf_kind
    • sf_error
    • sf_httpMethod, where relevant
  Custom dimension available: Yes

traces - the count of traces (a collection of spans that represents a transaction)
  Dimensions:
    • sf_environment
    • deployment.environment - This dimension is only available for histogram MMS.
    • sf_service
    • service.name - This dimension is only available for histogram MMS.
    • sf_operation
    • sf_httpMethod
    • sf_error
  Custom dimension available: No

workflows - created by default when you create a business workflow
  Dimensions:
    • sf_environment
    • deployment.environment - This dimension is only available for histogram MMS.
    • sf_workflow
    • sf_error
  Custom dimension available: No
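For example, you can use these dimensions as SignalFlow filters when building charts or detectors. The following is a minimal sketch that counts requests for a single service in a single environment; the values 'us1' and 'checkoutservice' are placeholders for your own dimension values:

    # Count requests for one service in one environment.
    # 'us1' and 'checkoutservice' are placeholder dimension values.
    filter_ = filter('sf_environment', 'us1') and filter('sf_service', 'checkoutservice')
    A = histogram('service.request', filter=filter_).count().publish(label='A')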
Metric types

Monitoring MetricSets in Splunk RUM are generated as histogram metrics. Histogram metrics represent a distribution of measurements or metrics, with complete percentile data available. Data is distributed into equally sized intervals, allowing you to compute percentiles across multiple services, and aggregate datapoints from multiple metric time series. Histogram metrics provide an advantage over other metric types when calculating percentiles, such as the p90 percentile for a single MTS. See more in Metric types. For histogram MMS, there is a single metric for each component.

Previously, Monitoring MetricSets were classified as either a counter or gauge metric type. The previous Monitoring MetricSets included six metrics for each component.

Request count
  Previous MMS metric: <component>.count
  Histogram MMS: <component> with a count function

Minimum request duration
  Previous MMS metric: <component>.duration.ns.min
  Histogram MMS: <component> with a min function

Maximum request duration
  Previous MMS metric: <component>.duration.ns.max
  Histogram MMS: <component> with a median function

Median request duration
  Previous MMS metric: <component>.duration.ns.median
  Histogram MMS: <component> with a median function

Percentile request duration (p90)
  Previous MMS metric: <component>.duration.ns.p90
  Histogram MMS: <component> with a percentile function and a percentile value of 90

Percentile request duration (p99)
  Previous MMS metric: <component>.duration.ns.p99
  Histogram MMS: <component> with a percentile function and a percentile value of 99

Each Monitoring MetricSet has a set of dimensions you can use to monitor and alert on service performance.
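For example, you can build a detector on a Monitoring MetricSet's dimensions. The following is a minimal SignalFlow sketch that alerts when p90 request duration for one service stays high; 'checkoutservice' is a placeholder service name, the threshold is arbitrary, and the sketch assumes durations are reported in nanoseconds, as in the previous metric names:

    # Alert when p90 request duration exceeds 5 seconds (5000000000 ns)
    # for 5 minutes. 'checkoutservice' is a placeholder value.
    A = histogram('service.request', filter=filter('sf_service', 'checkoutservice')).percentile(pct=90)
    detect(when(A > 5000000000, lasting='5m')).publish('p90 latency high')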

Example histogram metrics in Splunk RUM

A histogram MTS uses the following SignalFlow syntax:

histogram(metric=<metric_name>[, filter=<filter_dict>][, resolution=<resolution>])

The following examples compare previous MMS functions with their histogram MMS equivalents:

Aggregate count of all MTS
  Previous MMS function:
    A = data('spans.count').sum().publish(label='A')
  Histogram MMS function:
    A = histogram('spans').count().publish(label='A')

p90 percentile for a single MTS
  Previous MMS function:
    filter_ = filter('sf_environment', 'us1') and filter('sf_service', 'apm-api-peanuts') and filter('sf_operation', 'POST /api/autosuggest/tagvalues') and filter('sf_httpMethod', 'POST') and filter('sf_error', 'false')
    A = data('spans.duration.ns.p90', filter=filter_, rollup='sum').publish(label='A')
  Histogram MMS function:
    filter_ = filter('sf_environment', 'us1') and filter('sf_service', 'apm-api-peanuts') and filter('sf_operation', 'POST /api/autosuggest/tagvalues') and filter('sf_httpMethod', 'POST') and filter('sf_error', 'false')
    A = histogram('spans', filter=filter_).percentile(pct=90).publish(label='A')

Combined p90 for multiple services
  Previous MMS function:
    A = data('service.request.duration.ns.p90', filter=filter('sf_service', 'apm-graphql', 'apm-api-peanuts'), rollup='average').mean().publish(label='A')
  Histogram MMS function:
    A = histogram('service.request', filter=filter('sf_service', 'apm-graphql', 'apm-api-peanuts')).percentile(pct=90).publish(label='A')
Note: Because an aggregation is applied on histogram(), to display all of the metric time series separately, you need to apply each dimension as a groupby.
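For example, the following is a minimal sketch of grouping by dimensions; it assumes histogram aggregations accept the same by= grouping argument as standard SignalFlow aggregations:

    # Count spans per service and operation so that each combination of
    # dimension values publishes as its own metric time series.
    A = histogram('spans').count(by=['sf_service', 'sf_operation']).publish(label='A')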
Visibility

Monitoring MetricSets appear in dashboards, charts, and alerts.

  • To create charts, see Create charts in Splunk Observability Cloud.

  • To create dashboards, see Create and customize dashboards.

  • To create an alert, see Alert on Splunk RUM data.

  • To monitor services in Splunk RUM dashboards, see Splunk RUM dashboards.

Retention

Splunk Observability Cloud stores Monitoring MetricSets for 13 months by default.

Comparing Monitoring MetricSets and Troubleshooting MetricSets

Because endpoint-level and service-level Monitoring MetricSets include a subset of the Troubleshooting MetricSet metrics, you might notice that metric values for a service differ depending on the context in Splunk RUM. This is because Monitoring MetricSets are the basis of the dashboard view and can only have a kind of SERVER or CONSUMER. In contrast, Troubleshooting MetricSets are the basis of the troubleshooting and Tag Spotlight views, and they aren’t restricted to specific metrics.

For example, metric values for the checkout service displayed in the host dashboard might differ from the values displayed in the service map, because the service has spans with kind values that the Monitoring MetricSets powering the dashboard don’t monitor.

To compare Monitoring MetricSets and Troubleshooting MetricSets directly, restrict your Troubleshooting MetricSets to endpoint-only data by filtering to a specific endpoint. You can also break down the service map by endpoint.