Metrics in Splunk Observability Cloud
Introduction to metrics, data points, and metric time series in Splunk Observability Cloud.
In Splunk Observability Cloud, metric data consists of a numerical measurement called a metric, the metric type, and one or more dimensions. Each piece of data in this form is a data point. For example, a data point can be the CPU utilization of host server1 with metric type gauge, metric value 0.7, dimensions "hostname":"server1" and "host_location":"Tokyo", and the timestamp 1557225030000.
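The example data point above can be written out as a payload in the JSON shape accepted by the ingest API's `/v2/datapoint` endpoint. This is a sketch of the wire format; check the developer portal for the authoritative reference.

```python
import json

# The data point from the example above, shaped as an ingest payload.
# The top-level key selects the metric type (here, a gauge).
data_point = {
    "gauge": [
        {
            "metric": "cpu.utilization",      # metric name
            "value": 0.7,                     # metric value
            "dimensions": {                   # key-value pairs describing the source
                "hostname": "server1",
                "host_location": "Tokyo",
            },
            "timestamp": 1557225030000,       # Unix time in milliseconds (optional)
        }
    ]
}

payload = json.dumps(data_point)
print(payload)
```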
A metric time series (MTS) contains all the data points that have the same metric name, metric type, and set of dimensions. Splunk Observability Cloud automatically creates MTS from incoming data points. For example, the following data points for the cpu.utilization metric with the same "hostname":"server1" and "location":"Tokyo" dimensions, but with different values and timestamps, make up a single MTS.
MTS are used in Splunk Infrastructure Monitoring to populate charts and generate alerts.
Metrics
A metric is a measurable number that varies over time. Multiple sources of the same general type, such as host machines, usually report the metric values for a single set of metric names. For example, a server cluster that has 100 host machines might report a single set of metrics named cpu.utilization, api.calls, and dropped.packets, although metric values might be different for each machine.
Don't use metric names that start with the reserved prefixes sf. or sf_metric.
Metric type
There are four types of metrics: gauge, counter, cumulative counter, and histogram. See more in Metric types.
| Metric type | Description | Example |
|---|---|---|
| Gauge | Value of a measurement at a specific point in time | CPU utilization percentage of a server |
| Cumulative counter | Total number of occurrences or items since the measurement began | Total number of Splunk Infrastructure Monitoring API calls served since starting the web server |
| Counter | Number of new occurrences or items since the last measurement | The number of packets that fail to reach their destinations over each 24-hour period |
| Histogram | Distribution of measurements across time. Splunk Observability Cloud supports explicit bucket histograms. | Response time (performance) or successful screen loads (availability) |
Metric category
There are about 20 metric categories in Splunk Observability Cloud. A metric's category, especially the custom category, can impact billing.
To learn about all metric categories and how to identify them, see Metric categories.
Metric resolution
By default, Splunk Observability Cloud processes metrics at a 10-second resolution. If metrics have a native resolution coarser than 10 seconds, then Splunk Observability Cloud processes the metrics at their native resolution.
Optionally, metrics can be ingested at a higher resolution of 1 second. High-resolution metrics enable exceptionally fine-grained and low-latency visibility and alerting for your infrastructure, applications, and business performance.
To ingest metrics at 1-second resolution, set the sf_hires dimension to 1 in any MTS.
Metric metadata
Metrics can have associated metadata such as dimensions, custom properties, or tags. Learn more in Metadata: Dimensions, custom properties, tags, and attributes.
To add or edit dimensions:

- Use the API. See how in our developer portal.
Data points
A data point contains a metric name and value, the type of the metric, and the dimensions of the metric. Dimensions are the key-value pairs that identify the source of the reported value. Infrastructure Monitoring assumes that incoming data points contain a metric as well as a dimension, or a unique key-value pair that describes some aspect of the metric source.
A data point consists of the following components:
| Component | Description | Examples |
|---|---|---|
| Metric type | The specified metric type determines the way that Splunk Observability Cloud works with the metric. To learn more about metric types, see Metric types. | One of four metric types: gauge, counter, cumulative counter, or histogram |
| Metric name | A metric name identifies the values that you send into Infrastructure Monitoring. To learn more about metric naming constraints, see Naming conventions for metrics and dimensions. | cpu.utilization |
| Metric value | The measurement from your system, represented as a number. Metric values must be a signed integer, float, or numeric string in decimal or fixed-point notation. The system stores them as 64-bit integers. See more in the Send Traces, Metrics and Events API documentation. | 0.7 |
| Dimensions | Key-value pairs that describe some aspect of the source of the metric. A data point can have one or more dimensions. The most common dimension is a source. For example, a dimension can be a host or instance for infrastructure metrics, or it can be an application component or service tier for application metrics. Dimensions are considered metric metadata. To learn more about dimensions, see Metadata: Dimensions, custom properties, tags, and attributes. | "hostname": "server1" |
| Timestamp (optional) | Either the time that the data is sent by the software, or the time at which the data arrives in Splunk Observability Cloud, as Unix time in milliseconds. | 1557225030000 |
Metric time series
A metric time series (MTS) is a collection of data points that have the same metric and the same set of dimensions.
For example, the following sets of data points are in three separate MTS:

- MTS1: Gauge metric cpu.utilization, dimension "hostname": "host1"
- MTS2: Gauge metric cpu.utilization, dimension "source_host": "host1"
- MTS3: Gauge metric cpu.utilization, dimension "hostname": "host2"

MTS 2 has the same host value as MTS 1, but not the same dimension key. MTS 3 has the same dimension key as MTS 1, but not the same host name value.
Splunk Observability Cloud retains inactive MTS for 13 months.
Use unique dimensions to create independent MTS
It’s important to configure the Collector or ingest to provide at least one dimension that identifies a unique entity.
For example, when you report on the CPU utilization of 10 hosts in a cluster, the metric is the CPU utilization.
If each host in the cluster shares the exact same dimensions with all the other hosts, the cluster generates only one MTS. As a result, you might have difficulty differentiating and monitoring the CPU utilization of each individual host in the cluster.
However, if each host in the cluster has at least one unique dimension (typically a unique hostname), the cluster generates 10 MTS, or one for each host. Each MTS represents the CPU utilization over time for a single host.
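How data points group into MTS can be sketched with a hypothetical helper (not part of any Splunk SDK) that builds an MTS identity from the metric name, metric type, and dimension set, using the three-MTS example above:

```python
def mts_key(metric, metric_type, dimensions):
    # An MTS is identified by metric name, metric type, and the full
    # set of dimension key-value pairs.
    return (metric, metric_type, frozenset(dimensions.items()))

points = [
    ("cpu.utilization", "gauge", {"hostname": "host1"}),    # MTS 1
    ("cpu.utilization", "gauge", {"source_host": "host1"}), # MTS 2: same value, different key
    ("cpu.utilization", "gauge", {"hostname": "host2"}),    # MTS 3: same key, different value
    ("cpu.utilization", "gauge", {"hostname": "host1"}),    # same MTS as the first point
]

distinct_mts = {mts_key(m, t, d) for m, t, d in points}
print(len(distinct_mts))  # 3
```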
Metric types
Learn about the metric types in Splunk Observability Cloud: gauges, cumulative counters, histograms, and counters.
In Splunk Observability Cloud, there are four types of metrics: gauges, counters, cumulative counters, and histograms.
The following table lists the types of supported metrics and their default rollups in Splunk Observability Cloud:
| Metric | Description | Rollup |
|---|---|---|
| Gauge | Represent data that has a specific value at each point in time. Gauge metrics can increase or decrease. | Average |
| Counter | Represent a count of occurrences in a time interval. Counter metrics can only increase during the time interval. | Sum |
| Cumulative counter | Represent a running count of occurrences, and measure the change in the value of the metric from the previous data point. | Delta |
| Histogram | Represent a distribution of measurements or metrics, with complete percentile data available. Data is distributed into equally sized intervals or "buckets". | Histogram |
The type of the metric determines which default rollup function Splunk Observability Cloud applies to summarize individual incoming data points to match a specified data resolution. A rollup is a statistical function that takes all the data points in a metric time series (MTS) over a time period and outputs a single data point. Splunk Observability Cloud applies rollups after it retrieves the data points from storage but before it applies analytics functions. To learn more about rollups and data resolution, see Rollups in Data resolution and rollups in charts.
Gauges
Fan speed, CPU utilization, memory usage, and time spent processing a request are examples of gauge metric data.
Splunk Observability Cloud applies the SignalFlow average() function to data points for gauge metrics. When you specify a ten second resolution for a line graph plot, and Splunk Observability Cloud is receiving data for the metric every second, each point on the line represents the average of 10 data points.
Counters
Number of requests handled, emails sent, and errors encountered are examples of counter metric data. The machine or app that generates the counter increments its value every time something happens and resets the value at the end of each reporting interval.
Splunk Observability Cloud applies the SignalFlow sum() function to data points for counter metrics. When you specify a ten second resolution for a line graph plot, and Splunk Observability Cloud is receiving data for the metric every second, each point on the line represents the sum of 10 data points.
Cumulative counters
Number of successful jobs, number of logged-in users, and number of warnings are examples of cumulative counter metric data. Cumulative counter metrics differ from counter metrics in the following ways:
- Cumulative counters only reset to 0 when the monitored machine or application restarts, or when the counter value reaches the maximum representable value (2^32 or 2^64).
- In most cases, you're interested in how much the metric value changed between measurements.
Splunk Observability Cloud applies the SignalFlow delta() function to data points for cumulative counter metrics. When you specify a ten second resolution for a line graph plot, and Splunk Observability Cloud is receiving data for the metric every second, each point on the line represents the change between the first data point received and the 10th data point received. As a result, you don't have to write custom SignalFlow to apply the delta() function yourself, and the plot line shows the variation directly.
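Assuming one data point per second and a 10-second resolution, the three default rollups described above can be sketched as plain aggregations over each 10-point window (an illustration, not the SignalFlow implementation):

```python
# Ten data points received during one 10-second window (one per second).
window = [4, 6, 5, 7, 3, 5, 6, 4, 8, 2]

# Gauge: average rollup
gauge_rollup = sum(window) / len(window)

# Counter: sum rollup
counter_rollup = sum(window)

# Cumulative counter: delta rollup, the change across the window.
# Cumulative values only ever grow until the counter resets.
cumulative = [100, 104, 110, 115, 122, 125, 130, 136, 140, 148]
delta_rollup = cumulative[-1] - cumulative[0]

print(gauge_rollup, counter_rollup, delta_rollup)  # 5.0 50 48
```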
Histograms
Histograms can summarize data in ways that are difficult to reproduce with other metrics. Thanks to the buckets, the distribution of your continuous data over time is easier to explore, as you don't have to analyze the entire dataset to see where all the data points are. At the same time, histograms help reduce your subscription usage.
Splunk Observability Cloud applies the SignalFlow histogram() function to data points for histogram metrics, with a default percentile value of 90. You can apply several other functions to histograms, like min, max, count, sum, percentile, and cumulative_distribution_function.
For more information, see Histogram metrics in Splunk Observability Cloud.
Metric categories
Learn about metric categories in Splunk Observability Cloud.
Metric categories for realms us0 and us1
The following metric categories are used in the realms us0 and us1:
| Billing class | Metrics included |
|---|---|
| Custom metrics | Metrics reported to Splunk Observability Cloud outside of those reported by default, such as host, container, or bundled metrics. Custom metrics might result in increased data ingest costs. |
| APM Monitoring MetricSets | Includes metrics from APM Monitoring MetricSets. See Learn about Monitoring MetricSets in Splunk APM for more information. |
| RUM Monitoring MetricSets | Includes metrics from RUM Monitoring MetricSets. See Filter and troubleshoot with custom tags for more information. |
| Default/bundled metrics (Infrastructure) | |
| Default/bundled metrics (APM) | |
| Other metrics | Internal metrics |
Metric categories for other realms
The following metric categories are used for any realms that aren’t us0 or us1:
| Category type | Description |
|---|---|
| 0 | No information about the category type of the metric. Note: Category type information for metrics is only available after 03/16/2023. Any metric created before that date has category type 0. |
| 1 | Host |
| 2 | Container |
| 3 | Custom. Metrics reported to Splunk Observability Cloud outside of those reported by default, such as host, container, or bundled metrics. Custom metrics might result in increased data ingest costs. |
| 4 | Hi-resolution |
| 5 | Internal |
| 6 | Tracing metrics |
| 7 | Bundled. In host-based subscription plans, additional metrics sent through Infrastructure Monitoring public cloud integrations that are not attributed to specific hosts or containers. |
| 8 | APM hosts |
| 9 | APM container |
| 10 | APM identity |
| 11 | APM bundled metrics |
| 12 | APM Troubleshooting MetricSets. This category is not part of the report. |
| 13 | APM Monitoring MetricSets |
| 14 | Infrastructure Monitoring function |
| 15 | APM function |
| 16 | RUM Troubleshooting MetricSets. This category is not part of the report. |
| 17 | RUM Monitoring MetricSets |
| 18 | Network Explorer metrics |
| 19 | Runtime metrics |
| 20 | Synthetics metrics |
Identify and track the category of a metric
In host-based plans, the category of a metric might impact billing.
To keep track of the types of metrics you're ingesting, Splunk Observability Cloud provides different tools and reports:

- Custom metric report: shows information on MTS associated with data points sent from hosts or containers, as well as information related to custom, high-resolution, and bundled MTS, for a specified date.
- Metric Pipeline Management usage report: gives a detailed breakdown of your MTS creation and usage.
- Custom org metrics: track specific organization metrics. See more in View organization metrics for Splunk Observability Cloud.
Use SignalFlow to look up a metric's category
You can use SignalFlow to query for the sf_mtsCategoryType dimension, which indicates the metric category.
For example, to look for the top 10 custom metrics you're ingesting, use the following query with the * wildcard character:

```
A = data('*', filter=filter('sf_mtsCategoryType', '3')).count(by="sf_metric").top(10).publish(label='A')
```
To only look at specific metrics, use their specific metric name.
Learn more in SignalFlow and analytics.
Histogram metrics in Splunk Observability Cloud
Splunk Observability Cloud natively supports histograms. All histogram metric data you send to Splunk Observability Cloud through OpenTelemetry feeds charts, alerts, and other features.
Splunk Observability Cloud supports histogram data. You can use the histogram metric data you send from instrumented applications and services to Splunk Observability Cloud to create charts, detectors, and more.
Understanding histograms
A histogram represents the distribution of observations. Histograms require numerical, continuous values. Examples of continuous values include time, size, or temperature. The following chart is a visual representation of a histogram for response times in milliseconds:
Histograms store data in buckets, which are adjacent intervals with numeric boundaries. The buckets or bars in the previous histogram span 100 milliseconds. The size of each bar is determined by the number of observations inside each interval. The higher the bar, the more data points fall within the interval.
You can calculate the total number of observations, the minimum and maximum value, the sum of all values, the average value, and discrete percentile values in every histogram. Splunk Observability Cloud provides a SignalFlow function for histograms, which you can use to customize histograms or perform calculations on the data.
Histograms are useful to compare different datasets at a glance, and to identify trends in your data that might be otherwise hard to detect. For example, histograms can answer questions like "What was the 90th percentile of response time for the database yesterday?"
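The kind of percentile lookup described above can be approximated from bucket counts alone. The following sketch uses linear interpolation inside a bucket; it is an illustration of the idea, not the exact SignalFlow histogram() implementation:

```python
def bucket_percentile(boundaries, counts, p):
    """Approximate the p-th percentile from explicit buckets.

    boundaries: sorted bucket boundaries, e.g. [100, 200, 300]
    counts: observations per bucket; len(counts) == len(boundaries) + 1
    """
    total = sum(counts)
    target = p / 100 * total
    seen = 0
    for i, c in enumerate(counts):
        if seen + c >= target and c > 0:
            lo = boundaries[i - 1] if i > 0 else 0
            hi = boundaries[i] if i < len(boundaries) else lo
            # Interpolate linearly within the bucket.
            return lo + (hi - lo) * (target - seen) / c
        seen += c
    return boundaries[-1]

# Response times in ms, buckets (0,100], (100,200], (200,300], (300, inf).
print(bucket_percentile([100, 200, 300], [10, 60, 25, 5], 90))  # 280.0
```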
When to use histogram metrics
Histograms can summarize data in ways that are difficult to reproduce using other metrics. With histogram buckets, you can explore the distribution of your continuous data over time without needing to analyze the entire dataset to see all of the data points. Histograms can combine multiple statistics into a single datapoint, such as sum, min, max, and count, along with the buckets.
Service level objectives (SLO)
Histograms are particularly suited for representing performance and availability service level objectives (SLO). Examples of such SLOs are checking that the nth percentile of all requests is processed in less than a certain duration, or that the nth percentile of screens in your app loads successfully.
Unlike metrics covering a single percentile or quantile, histograms contain the percentiles or quantiles you need to track in a single metric. This facilitates exploring data in depth after initial detections. For example, if you get an alert for the 99th percentile for response time, using histograms you can explore other percentiles.
See Introduction to service level objective (SLO) management in Splunk Observability Cloud for more information.
Histogram instead of calculated metrics
Histograms contain data that you can use to calculate percentiles and other statistics in Splunk Observability Cloud instead of calculating them using your infrastructure. Sending histograms also results in fewer MTS sent, which reduces your subscription usage.
For example, if you're sending the service.response_time.upper_90 and service.response_time.upper_95 metrics to track the response time of a key service in your infrastructure at the 90th and 95th percentiles, you can send histogram data for the entire distribution of response times instead, eliminating the need to send 2 separate MTS.
Explicit bucket histograms
Explicit bucket histograms are histograms with predefined bucket boundaries. The advantage of defining bucket boundaries yourself is that you can use limits that make sense in your situation.
For example, the following Java code creates an OpenTelemetry histogram with explicit bucket boundaries:
```java
// Assumed imports from the OpenTelemetry Java SDK; the incubator package
// path can vary between SDK versions.
import io.opentelemetry.api.incubator.metrics.ExtendedLongHistogramBuilder;
import io.opentelemetry.api.metrics.DoubleHistogramBuilder;
import io.opentelemetry.api.metrics.LongHistogram;
import io.opentelemetry.api.metrics.Meter;
import java.util.Arrays;
import java.util.List;

void exampleWithCustomBuckets(Meter meter) {
    DoubleHistogramBuilder originalBuilder = meter.histogramBuilder("people.ages");
    ExtendedLongHistogramBuilder builder = (ExtendedLongHistogramBuilder) originalBuilder.ofLongs();
    List<Long> bucketBoundaries = Arrays.asList(0L, 5L, 12L, 18L, 24L, 40L, 50L, 80L, 115L);
    LongHistogram histogram =
        builder
            .setAdvice(advice -> advice.setExplicitBucketBoundaries(bucketBoundaries))
            .setDescription("A distribution of people's ages")
            .setUnit("years")
            .build();
    addDataToHistogram(histogram); // elided helper that records values
}
```
Get histogram data into Splunk Observability Cloud
For instructions on how to get histogram data into Splunk Observability Cloud and how to migrate existing reporting elements, see Get histogram data into Splunk Observability Cloud.
Get histogram data into Splunk Observability Cloud
You can collect histogram data using a variety of receivers, including the Prometheus receiver, and send them to Splunk Observability Cloud using the OpenTelemetry Collector. See Prometheus receiver.
The version of the SignalFx exporter in the upstream OpenTelemetry Collector Contrib project doesn't support send_otlp_histograms and, therefore, can't be used to send histogram data.
Export histogram data with the SignalFx exporter
The version of the SignalFx exporter in the Splunk Distribution of the OpenTelemetry Collector supports the parameter send_otlp_histograms and is the recommended method to send histogram data.
The SignalFx exporter can preserve histogram bucket data, which you can use to extract various statistics from the metric at charting time, such as the 90th percentile or the mean.
To send histogram data to Splunk Observability Cloud with the SignalFx exporter, set send_otlp_histograms: true in your Collector values.yaml file. For example:
```yaml
exporters:
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    api_url: "${SPLUNK_API_URL}"
    ingest_url: "${SPLUNK_INGEST_URL}"
    sync_host_metadata: true
    correlation:
    send_otlp_histograms: true
```
Export histogram data with the OTLP/HTTP exporter
To send histogram data to Splunk Observability Cloud with the OTLP/HTTP exporter, configure the metrics_endpoint and the traces_endpoint fields:
```yaml
exporters:
  otlphttp:
    metrics_endpoint: https://ingest.<realm>.observability.splunkcloud.com/v2/datapoint/otlp
    traces_endpoint: https://ingest.<realm>.observability.splunkcloud.com/v2/trace/otlp
    headers:
      "X-SF-Token": "mytoken"
    tls:
      insecure: true
    timeout: 10s
```
Best practices when sending bucket histogram data
When sending bucket histogram data to Splunk Observability Cloud, follow these best practices:
- Send minimum and maximum values, unless you're sending cumulative data. The minimum value must be lower than the maximum value, otherwise the data point is dropped.
- Use no more than 31 bucket boundaries when sending histograms. Histograms with more than 31 bucket boundaries (32 buckets) are dropped.
- Make sure that bucket boundaries don't overlap or repeat, and send them in order.
- Send values as a signed integer, float, or numeric string in decimal or fixed-point notation. Splunk Observability Cloud stores them as 64-bit integers.
- Check that the sum of all histogram bucket counts is equal to the count field, and that the number of bucket boundaries is equal to the bucket count minus 1. Histograms that don't comply with these criteria are dropped.
- When sending cumulative data, for example from Prometheus, use delta aggregation temporality. See Considerations on delta aggregation temporality for instructions on how to configure delta temporality in your system.
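The constraints above can be sketched as a client-side validation helper. This is a hypothetical function for illustration; the exact server-side checks may differ.

```python
def valid_bucket_histogram(boundaries, counts, total_count,
                           minimum=None, maximum=None, cumulative=False):
    # Boundaries must be ordered, without repeats or overlaps.
    if sorted(set(boundaries)) != list(boundaries):
        return False
    # No more than 31 bucket boundaries (32 buckets).
    if len(boundaries) > 31:
        return False
    # The number of boundaries equals the bucket count minus 1.
    if len(boundaries) != len(counts) - 1:
        return False
    # The bucket counts must add up to the count field.
    if sum(counts) != total_count:
        return False
    # Min and max are required unless the data is cumulative; min < max.
    if not cumulative:
        if minimum is None or maximum is None or minimum >= maximum:
            return False
    return True

print(valid_bucket_histogram([100, 200, 300], [10, 60, 25, 5], 100,
                             minimum=12, maximum=480))  # True
print(valid_bucket_histogram([100, 100, 300], [10, 60, 25, 5], 100,
                             minimum=12, maximum=480))  # False: repeated boundary
```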
Considerations on delta aggregation temporality
When handling cumulative histograms, you must set the delta aggregation temporality flag. If you do not, the cumulative histograms will lack minimum and maximum values. This might cause a percentile calculation to give an incorrect value.
To activate delta aggregation temporality in your instrumentation, set the OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE environment variable to delta. See the compliance matrix in the OpenTelemetry Specification repository to check SDK support for your language.
Send histogram data using the API
If you need to bypass the OpenTelemetry Collector, send histogram data directly to Splunk Observability Cloud using the /v2/datapoint/otlp endpoint of the ingest API. The endpoint accepts data in OTLP, serialized as Protobuf, over HTTP. The gRPC scheme is not supported.
To learn how to send histogram metric data using the API, see /datapoint/otlp in the Splunk Developer Portal.
Migrate your dashboards, functions, charts, and detectors
To migrate your existing dashboards, functions, charts, and detectors to histograms, follow these steps:
1. Make sure that you're sending histogram data using the Splunk Distribution of the OpenTelemetry Collector version 0.98 or higher. Lower versions can't send histogram data in OTLP format using the SignalFx exporter.
2. Edit your charts to use the new histogram() function. See histogram() in the SignalFlow reference documentation.
Troubleshooting
If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.
Available to Splunk Observability Cloud customers
- Submit a case in the Splunk Support Portal.
- Contact Splunk Support.
Available to prospective customers and free trial users
- Ask a question and get answers through community support at Splunk Answers.
- Join the Splunk community #observability Slack channel to communicate with customers, partners, and Splunk employees worldwide.
Naming conventions for metrics and dimensions
Naming conventions for metrics and dimensions in Splunk Observability Cloud.
Read this document to learn about naming conventions and recommendations for custom metrics and dimensions in Splunk Observability Cloud.
Don't use metric names that start with the reserved prefixes sf. or sf_metric.
Types of data in Splunk Observability Cloud
Splunk Observability Cloud works with imported or existing data as well as custom data.
Imported data
When you use an existing data collection integration such as the collectd agent or the AWS CloudWatch integration, the integration defines metric, dimension, and event names for you. To learn more, see Metric name standards.
To make it easier for you to find and work with metrics coming in from different sources, Splunk Infrastructure Monitoring pulls, transforms, and returns the data in a unified format called virtual metrics. See Virtual metrics in Splunk Infrastructure Monitoring for more information.
Custom data
When you send custom metrics, dimensions, or events (key-value pairs you send to mark specific events such as a release) to Splunk Infrastructure Monitoring, you choose your own names.
Send custom data to Splunk Observability Cloud
To learn how to send custom metrics in Splunk Observability Cloud using our API, see the developer portal.
If you’re using the OpenTelemetry Collector, you can create a receiver to Send custom metrics to Splunk Observability Cloud.
Modify naming schemes you sent to other metric systems
If you’re working with metrics that you had previously sent to other metric systems, such as Graphite, modify the naming scheme to leverage the full feature set of Splunk Observability Cloud.
Metric name standards
Metrics are distinct numeric measurements generated by system infrastructure, application instrumentation, or other hardware or software, which change over time. For example:
- Count of GET requests received
- Percent of total memory in use
- Network response time in milliseconds
Read more on metrics in Metrics, data points, and metric time series in Splunk Observability Cloud.
Use descriptive names
Metric names can have up to 256 characters. If the value is longer, the metric might be dropped.
Use names that help you identify what the metric is related to.
| Metric information | Example |
|---|---|
| Measurement description | |
| Measurement units | |
| Metric category | |
If you apply a calculation to the metric before you send it, use the calculation as part of the description. For example, if you calculate the ninety-fifth percentile of measurements and send the result in a metric, use p95 as part of the metric name.
On the other hand, some information is better suited for dimensions instead of metric names, such as the description of the hardware or software being measured. For example, don't use production1 in a metric name to indicate that the measurement is for a particular host. To learn more, see Type of information suitable for dimensions.
Use metric names to indicate metric types
Follow these best practices to use names to indicate different metric types:
- Give each metric its own name.
- When you define your own metric, give each metric a name that includes a reference to the metric type.
- Avoid assigning custom metric names that include dimensions. For example, if you have 100 server instances and you want to create a custom metric that tracks the number of disk writes for each one, differentiate between the instances with a dimension.
Create metric names using a hierarchical structure
Start at the highest level, then add more specific values as you proceed.
- Start with a domain or namespace that the metric belongs to, such as analytics or web.
- Next, add the entity that the metric measures, such as jobs or http.
- At your discretion, add intermediate names, such as errors.
- Finish with a unit of measurement. For example, the SignalFlow analytics service reports the following metrics:
  - analytics.jobs.total: Gauge metric that periodically measures the current number of executing jobs
  - analytics.thrift.execute.count: Counter metric that's incremented each time a new job starts
  - analytics.thrift.execute.time: Gauge metric that measures the time needed to process a job execution request
  - analytics.jobs_by_state: Counter metric with a dimension key called state, incremented each time a job reaches a particular state

In this example, all of these metrics have a dimension key called hostname with values such as analytics-1, analytics-2, and so forth. These metrics also have a customer dimension key with values org-x, org-y, and so on. The dimensions provide an infrastructure-focused or a customer-focused view of the analytics service usage. For more information on gauge metrics, see Identify metric types.
Dimension names and value standards
Dimensions are arbitrary key-value pairs you associate with metrics. While metrics identify a measurement, dimensions identify a specific aspect of the system that’s generating the measurement or characterizes the measurement. Use dimensions to:
- Classify different streams of data points for a metric.
- Simplify filtering and aggregation. For example, SignalFlow lets you filter and aggregate data streams by one or more dimensions.
Dimensions can be numeric or nonnumeric. Some dimensions, such as host name and value, come from a system you’re monitoring. You can also create your own dimensions.
Dimension key and value requirements
Dimension key names are UTF-8 strings with a maximum length of 128 characters (512 bytes). For example, if a dimension's key:value pair is ("mydim", "myvalue"), "mydim" is limited to 128 characters. Key names:

- Must start with an uppercase or lowercase letter. The rest of the name can contain letters, numbers, underscores (_), hyphens (-), and periods (.), but can't contain blank spaces.
- Must not start with the underscore character (_).
- Must not start with the prefix sf_, except for dimensions defined by Splunk Observability Cloud, such as sf_hires.
Dimension values are UTF-8 strings with a maximum length of 256 characters (1024 bytes). For example, if a dimension's key:value pair is ("mydim", "myvalue"), "myvalue" is limited to 256 characters. If the value is longer, the data point might be dropped. Numbers are represented as numeric strings.
You can have up to 36 dimensions per MTS. If this limit is exceeded, the data point is dropped, and a message is logged.
To ensure readability, keep names and values to 40 characters or less.
For example:

- "hostname": "production1"
- "region": "emea"
Considerations for metric and dimension names in your organization
Create consistent names for your organization:
- Use a single consistent delimiter in metric names. A consistent delimiter helps you search with wildcards. Use periods or underscores as delimiters; don't use colons or slashes.
- Avoid changing metric and dimension names. If you change a name, you have to update the charts and detectors that use the old name. Infrastructure Monitoring doesn't do this automatically.
- Since you're not the only person using the metric or dimension, use names that are easy to identify and understand. Follow established conventions. To find out the conventions in your organization, browse your metrics using the Metric Finder.
Guidelines for working with low and high cardinality data
Send low-cardinality data only in metric names or dimension key names. Low-cardinality data has a small number of distinct values. For example, the metric name web.http.error.count for a gauge metric that reports the number of HTTP request errors has a single value. This name is also readable and self-explanatory. For more information on gauge metrics, see Identify metric types.
High-cardinality data has a large number of distinct values. For example, timestamps are high-cardinality data. Only send this kind of high-cardinality data in dimension values. If you send high-cardinality data in metric names, Infrastructure Monitoring might not ingest the data. Infrastructure Monitoring rejects metrics with names that contain timestamps. High-cardinality data does have legitimate uses. For example, in containerized environments, container_id is usually a high-cardinality field. If you include container_id in a metric name such as system.cpu.utilization.<container_id>, instead of having one MTS, you have as many MTS as you have containers.
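The container_id example can be sketched by counting the distinct metric names each naming choice produces (hypothetical data):

```python
containers = [f"c{i}" for i in range(1000)]

# High-cardinality value embedded in the metric name: one metric
# name per container.
in_name = {f"system.cpu.utilization.{cid}" for cid in containers}

# The same value carried as a dimension: a single metric name.
# The MTS still split per container, but the name space stays
# small and searchable.
in_dimension = {("system.cpu.utilization", ("container_id", cid))
                for cid in containers}

print(len(in_name))                          # 1000 metric names
print(len({m for m, _ in in_dimension}))     # 1 metric name
```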
When to use metrics or dimensions
Use metrics when tracking different metric types
In Infrastructure Monitoring, all metrics belong to a specific metric type, with a specific default rollup. To learn more about metric types, see Metric types.
To track a measurable value using two different metric types, use two metrics instead of one metric with two dimensions.
For example, suppose you have a network_latency measurement that you want to send as two different metric types: a gauge metric (the average network latency in milliseconds) and a counter metric (the total number of latency values sent in an interval). In this case, send the measurement using two different metric names, such as network_latency.average and network_latency.count, instead of one metric name with two dimensions type:average and type:count.
Type of information suitable for dimensions
See some examples of the types of information you can add as dimensions:

- Categories rather than measurements: if doing an arithmetic operation on dimension values results in something meaningful, you don't have a dimension.
- Metadata for filtering, grouping, or aggregating.
- Name of the entity being measured: for example, hostname, production1.
- Metadata with a large number of possible values: use one dimension key for many different dimension values.
- Nonnumeric values: numeric dimension values are usually labels rather than measurements.
Example: Custom metrics and dimensions to measure HTTP errors
Let’s imagine you want to track the following data to oversee HTTP errors:
- Number of errors
- HTTP response code for each error
- Host that reported the error
- Service (app) that returned the error
Suppose you identify your data with a long metric name instead of a metric name and a dimension. For example, web.http.myhost.checkout.error.500.count might be a long metric name that represents the number of HTTP response code 500 errors reported by the host named myhost for the service checkout.
If you use web.http.myhost.checkout.error.500.count, you might encounter the following issues:
- To visualize this data in a Splunk Infrastructure Monitoring chart, you have to run a wildcard query with the syntax web.http.*.*.error.*.count.
- To sum up the errors by host, service, or error type, you have to change the query.
- You can't use filters or dashboard variables in your chart.
- You have to define a separate metric name to track HTTP 400 errors, or errors reported by other hosts, or errors reported by other services.
Instead, use dimensions to track the same data:
- Define a metric name that describes the measurement you want, which is the number of HTTP errors: web.http.error.count. The metric name includes the following:
  - web: your name for a family of metrics for web measurements
  - http.error: your name for the protocol you're measuring (http) and an aspect of the protocol (error)
  - count: the unit of measure
- Define dimensions that categorize the errors. The dimensions include the following:
  - host: the host that reported the error
  - service: the service that returned the error
  - error_type: the HTTP response code for the error

This way, to visualize the error data using a chart, you can search for "error count" to locate the metric by name. When you create the chart, you can filter and aggregate incoming metric time series by host, service, error_type, or all three. You can add a dashboard filter so that when you view the chart in a specific dashboard, you don't have to edit the chart itself.
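The filtering and aggregation that this dimension design enables can be sketched over a few hypothetical data points:

```python
from collections import Counter

# web.http.error.count data points, with the categories as dimensions.
points = [
    {"host": "myhost", "service": "checkout", "error_type": "500", "value": 3},
    {"host": "myhost", "service": "cart",     "error_type": "400", "value": 1},
    {"host": "web-2",  "service": "checkout", "error_type": "500", "value": 2},
]

# Sum errors grouped by any dimension, without changing the metric name.
by_service = Counter()
for p in points:
    by_service[p["service"]] += p["value"]
print(by_service["checkout"])  # 5

# Filter by error_type, the way a chart filter would.
errors_500 = sum(p["value"] for p in points if p["error_type"] == "500")
print(errors_500)  # 5
```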