Learn about MetricSets
Learn about MetricSets and their default contents and dimensions.
A MetricSet is a time series of related metrics. MetricSets typically contain only your most important key performance indicators (KPIs). In Splunk RUM, there are two types of MetricSets:
Troubleshooting MetricSets
Troubleshooting MetricSets (TMS) contain high-cardinality metrics that use a span tag as a dimension. Dimensions let you group and analyze metrics using the distinct values of the span tag. For example, if a Troubleshooting MetricSet uses a span tag named user.id as a dimension, you can group its metrics by each distinct value of user.id. This allows you to analyze performance or issues specific to those user.id values, and to make historical comparisons across spans and workflows.
Splunk RUM indexes several span tags by default, and automatically creates Troubleshooting MetricSets for them. You can't modify or stop this behavior. In addition, you can create custom Troubleshooting MetricSets by indexing additional span tags. All indexed span tags are dimensions of the MetricSet that contains them.
- Contents
In Splunk RUM, every Troubleshooting MetricSet contains the following RED metrics (request count, error count, duration). RED metrics appear when you select a service in the service map:
- Request rate
- Error rate
- Root cause error rate
- p50, p90, and p99 latency
- Precision
The measurement precision of Troubleshooting MetricSets is 10 seconds. Splunk RUM reports quantities from a distribution of metrics for each 10-second reporting window.
- Visibility
Troubleshooting MetricSets appear on the service map and in Tag Spotlight. Use Troubleshooting MetricSets to filter the service map and to create breakdowns across the values of a given indexed span tag or process. See View dependencies among your services in the service map.
- Retention
Splunk Observability Cloud retains Troubleshooting MetricSets for the same amount of time as raw traces. By default, the retention period is 8 days. Learn more about best practices for span tags and Troubleshooting MetricSets.
Monitoring MetricSets
Monitoring MetricSets (MMS) contain endpoint-level and service-level metrics. You can use Monitoring MetricSets to create dashboards, charts, detectors, and alerts, and to monitor your environment in real time.
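For example, you can drive a detector from a Monitoring MetricSet. The following SignalFlow sketch is illustrative only: the prod environment value, the 100-request threshold, and the labels are assumptions rather than defaults.

# Count errored requests for an assumed 'prod' environment
errors = histogram('service.request', filter=filter('sf_error', 'true') and filter('sf_environment', 'prod')).count()
# Alert when the errored-request count stays above the example threshold for 5 minutes
detect(when(errors > 100, lasting='5m')).publish('High error count')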
- Contents
In Splunk RUM, every Monitoring MetricSet reflects either a specific endpoint or an aggregate of all endpoints in a service. Endpoint-level Monitoring MetricSets reflect the activity of a single endpoint in a service, while service-level Monitoring MetricSets aggregate the activity of all of the endpoints in the service. Monitoring MetricSets are created for spans where span.kind has a value of SERVER or CONSUMER. Spans might lack a kind value, or have a different kind value, in the following situations:
- The span originates in self-initiating operations or inferred services.
- An error in instrumentation occurs.
- Available default Monitoring MetricSet metrics and dimensions
Monitoring MetricSets are available for the Splunk RUM components listed in the following table. Each Monitoring MetricSet also has a set of dimensions you can use to monitor and alert on service performance. In addition to the following default Monitoring MetricSets, you can create custom Monitoring MetricSets to analyze your services in greater depth. See Create a Monitoring MetricSet with a custom dimension.
| Metric name | Description | Dimensions | Custom dimension available? |
| --- | --- | --- | --- |
| service.request | The requests to endpoints in a service | sf_environment, deployment.environment (histogram MMS only), sf_service, service.name (histogram MMS only), sf_error | Yes |
| inferred.services | The requests to a service that has not yet been instrumented | sf_service, service.name (histogram MMS only), sf_environment, deployment.environment (histogram MMS only), sf_error, sf_kind, sf_operation, sf_httpMethod | No |
| spans | The count of spans (a single operation) | sf_environment, deployment.environment (histogram MMS only), sf_service, service.name (histogram MMS only), sf_operation, sf_kind, sf_error, sf_httpMethod (where relevant) | Yes |
| traces | The count of traces (a collection of spans that represents a transaction) | sf_environment, deployment.environment (histogram MMS only), sf_service, service.name (histogram MMS only), sf_operation, sf_httpMethod, sf_error | No |
| workflows | Created by default when you create a business workflow | sf_environment, deployment.environment (histogram MMS only), sf_workflow, sf_error | No |
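For illustration, the following SignalFlow sketch combines dimensions from the table above to compute an error-rate percentage from the spans metric. The checkout service name and prod environment are assumed example values.

# Scope to an assumed service and environment
filter_ = filter('sf_service', 'checkout') and filter('sf_environment', 'prod')
# Errored spans as a percentage of all spans
errors = histogram('spans', filter=filter_ and filter('sf_error', 'true')).count()
total = histogram('spans', filter=filter_).count()
(errors / total * 100).publish(label='Error rate (%)')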
- Metric types
Monitoring MetricSets in Splunk RUM are generated as histogram metrics. Histogram metrics represent a distribution of measurements or metrics, with complete percentile data available. Data is distributed into equally sized intervals, allowing you to compute percentiles across multiple services, and aggregate datapoints from multiple metric time series. Histogram metrics provide an advantage over other metric types when calculating percentiles, such as the p90 percentile for a single MTS. See more in Metric types. For histogram MMS, there is a single metric for each component.
Previously, Monitoring MetricSets were classified as either a counter or gauge metric type. The previous Monitoring MetricSets included 6 metrics for each component, as shown in the following table.
| Metric | Previous MMS metric name | Histogram MMS |
| --- | --- | --- |
| Request count | <component>.count | <component> with a count function |
| Minimum request duration | <component>.duration.ns.min | <component> with a min function |
| Maximum request duration | <component>.duration.ns.max | <component> with a max function |
| Median request duration | <component>.duration.ns.median | <component> with a median function |
| Percentile request duration (p90) | <component>.duration.ns.p90 | <component> with a percentile function and a percentile value |
| Percentile request duration (p99) | <component>.duration.ns.p99 | <component> with a percentile function and a percentile value |
Each Monitoring MetricSet has a set of dimensions you can use to monitor and alert on service performance.
- Example histogram metrics in Splunk RUM
In SignalFlow, a histogram MTS uses the following syntax:
histogram(metric=<metric_name>[, filter=<filter_dict>][, resolution=<resolution>])
The following examples show equivalent SignalFlow functions for previous MMS and histogram MMS.

Aggregate count of all MTS
Previous MMS function:
A = data('spans.count').sum().publish(label='A')
Histogram MMS function:
A = histogram('spans').count().publish(label='A')

P90 percentile for a single MTS
Previous MMS function:
filter_ = filter('sf_environment', 'us1') and filter('sf_service', 'apm-api-peanuts') and filter('sf_operation', 'POST /api/autosuggest/tagvalues') and filter('sf_httpMethod', 'POST') and filter('sf_error', 'false')
A = data('spans.duration.ns.p90', filter=filter_, rollup='sum').publish(label='A')
Histogram MMS function:
filter_ = filter('sf_environment', 'us1') and filter('sf_service', 'apm-api-peanuts') and filter('sf_operation', 'POST /api/autosuggest/tagvalues') and filter('sf_httpMethod', 'POST') and filter('sf_error', 'false')
A = histogram('spans', filter=filter_).percentile(pct=90).publish(label='A')

Combined p90 for multiple services
Previous MMS function:
A = data('service.request.duration.ns.p90', filter=filter('sf_service', 'apm-graphql', 'apm-api-peanuts'), rollup='average').mean().publish(label='A')
Histogram MMS function:
A = histogram('service.request', filter=filter('sf_service', 'apm-graphql', 'apm-api-peanuts')).percentile(pct=90).publish(label='A')
Note: Because an aggregation is applied on histogram(), to display all of the metric time series separately, you need to apply each dimension as a groupby, as shown in the sketch after this note.
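For instance, building on the combined p90 example above, a by argument keeps one output MTS per service rather than a single combined value. This sketch assumes histogram aggregations accept by in the same way as other SignalFlow aggregations:

# One p90 output per service instead of a single combined p90
A = histogram('service.request', filter=filter('sf_service', 'apm-graphql', 'apm-api-peanuts')).percentile(pct=90, by=['sf_service']).publish(label='A')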
- Visibility
Monitoring MetricSets appear in dashboards, charts, and alerts. See the following documentation for each task:
- Create charts
- Create dashboards
- Create an alert
- Monitor services in Splunk RUM dashboards
- Retention
Splunk Observability Cloud stores Monitoring MetricSets for 13 months by default.
Comparing Monitoring MetricSets and Troubleshooting MetricSets
Because endpoint-level and service-level Monitoring MetricSets include a subset of the Troubleshooting MetricSet metrics, you might notice that metric values for a service differ depending on the context in Splunk RUM. This is because Monitoring MetricSets are the basis of the dashboard view, and Monitoring MetricSets can only have a kind of SERVER or CONSUMER. In contrast, Troubleshooting MetricSets are the basis of the troubleshooting and Tag Spotlight views, and Troubleshooting MetricSets aren't restricted to specific span kinds.
For example, values for checkout service metrics displayed in the host dashboard might differ from the metrics displayed in the service map, because the service has spans with multiple kind values that the Monitoring MetricSets powering the dashboard don't monitor.
To compare Monitoring MetricSets and Troubleshooting MetricSets directly, restrict your Troubleshooting MetricSets to endpoint-only data by filtering to a specific endpoint. You can also break down the service map by endpoint.