Group by attributes processor
The Group by Attributes processor is an OpenTelemetry Collector component that reassociates spans, log records, and metric data points with a resource that matches the specified attributes. As a result, all spans, log records, or metric data points with the same values for the specified attributes are grouped under the same resource.
The supported pipeline types are traces, metrics, and logs. See Process your data with pipelines for more information.
Get started
Follow these steps to configure and activate the component:
- Deploy the Splunk Distribution of OpenTelemetry Collector to your host or container platform.
- Configure the groupbyattrs processor as described in the next section.
- Restart the Collector.
Sample configurations
To activate the Group by Attributes processor, add groupbyattrs to the processors section of your configuration file. Specify an array of attribute keys to use to “group” spans, log records, or metric data points together, as in the following example:
processors:
  groupbyattrs:
    keys:
      - foo
      - bar
The keys property describes which attribute keys are considered for grouping:
- If the processed span, log record, or metric data point has at least one of the specified attribute keys, it is moved to a resource with the same values for these attributes. The resource is created if none exists with the same attributes.
- If none of the specified attribute keys is present in the processed span, log record, or metric data point, it remains associated with the same resource, without any change.
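The two rules above can be sketched in a few lines of Python. This is an illustration of the described behavior, not the processor's implementation: place_record is a hypothetical helper, and attributes are modeled as plain dictionaries. The sketch also assumes, as the metrics example later on this page states, that the new resource inherits the original resource attributes and that the grouping attributes are removed from the record.

```python
def place_record(resource_attrs, record_attrs, keys):
    """Return (target_resource_attrs, remaining_record_attrs) for one record.

    Hypothetical helper: models a span, log record, or data point as a dict.
    """
    # Collect the grouping attributes that are actually present on the record.
    matched = {k: record_attrs[k] for k in keys if k in record_attrs}
    if not matched:
        # Rule 2: no key present, so the record stays under its resource unchanged.
        return dict(resource_attrs), dict(record_attrs)
    # Rule 1: the target resource carries the original resource attributes
    # plus the matched values (matched values win on conflict).
    target = {**resource_attrs, **matched}
    # The grouping attributes now live on the resource, not on the record.
    remaining = {k: v for k, v in record_attrs.items() if k not in matched}
    return target, remaining
```

With keys set to foo and bar, a record that carries only foo still moves, because a single matching key is enough.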
To complete the configuration, include the processor in any pipeline of the service section of your configuration file. For example:
service:
  pipelines:
    metrics:
      processors: [groupbyattrs]
    logs:
      processors: [groupbyattrs]
    traces:
      processors: [groupbyattrs]
See config.go for the config spec.
Typical use cases
Use the processor to perform the following actions:
- Extract resources from “flat” data formats, such as Fluentbit logs or Prometheus metrics.
- Associate Prometheus metrics to a resource that describes the relevant host, based on a label present on all metrics.
- Optimize data packaging by extracting common attributes.
- Compact multiple records that share the same resource and InstrumentationLibrary attributes but are under multiple ResourceSpans, ResourceMetrics, or ResourceLogs into a single ResourceSpans, ResourceMetrics, or ResourceLogs, when an empty list of keys is provided.
  - This happens, for example, when you use the groupbytrace processor, or when data comes in multiple requests.
  - Compacted data takes less memory, is processed and serialized more efficiently, and reduces the number of export requests.
Tip
Use the groupbyattrs processor together with the batch processor, as a consecutive step. Grouping records together under a matching resource and InstrumentationLibrary reduces the fragmentation of data.
Advanced configuration examples
Group metrics by host
Consider the following metrics, all originally associated with the same resource:
Resource {host.name="localhost",source="prom"}
  Metric "gauge-1" (GAUGE)
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-B",id="eth0"}
  Metric "gauge-1" (GAUGE) // Identical to previous Metric
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-B",id="eth0"}
  Metric "mixed-type" (GAUGE)
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-B",id="eth0"}
  Metric "mixed-type" (SUM)
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
  Metric "dont-move" (Gauge)
    DataPoint {id="eth0"}
Use the following configuration to re-associate the metrics with either host-A or host-B, based on the value of the host.name attribute.
processors:
  groupbyattrs:
    keys:
      - host.name
The output of the processor is:
Resource {host.name="localhost",source="prom"}
  Metric "dont-move" (Gauge)
    DataPoint {id="eth0"}
Resource {host.name="host-A",source="prom"}
  Metric "gauge-1"
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
  Metric "mixed-type" (GAUGE)
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
  Metric "mixed-type" (SUM)
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
Resource {host.name="host-B",source="prom"}
  Metric "gauge-1"
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
  Metric "mixed-type" (GAUGE)
    DataPoint {id="eth0"}
The groupbyattrs processor has accomplished the following:
- The DataPoints for the gauge-1 metric were originally split under 2 metric instances, and have been merged in the output.
- The DataPoints of the mixed-type (GAUGE) and mixed-type (SUM) metrics have not been merged under the same metric, because their DataType is different.
- The dont-move metric DataPoints don’t have a host.name attribute, and therefore have remained under the original resource.
- The new resources inherited the attributes from the original resource (source="prom"), and the specified attributes from the processed metrics (host.name="host-A" or host.name="host-B").
- The specified grouping attributes that are set on the new resources are also removed from the metric DataPoints.
- While not shown in this example, the processor also merges collections of records under matching InstrumentationLibrary.
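To make the merge behavior easier to check, the following Python sketch replays the example with plain dictionaries and lists. It is a simplified model under stated assumptions, not the processor's code: regroup, ORIGINAL_RESOURCE, and METRICS are hypothetical names, and a metric is identified here only by its name and data type.

```python
from collections import defaultdict

# Input from the example: (metric name, data type, data point attributes).
ORIGINAL_RESOURCE = {"host.name": "localhost", "source": "prom"}
A = {"host.name": "host-A", "id": "eth0"}
B = {"host.name": "host-B", "id": "eth0"}
METRICS = [
    ("gauge-1", "GAUGE", [A, A, B]),
    ("gauge-1", "GAUGE", [A, A, B]),  # identical to the previous metric
    ("mixed-type", "GAUGE", [A, A, B]),
    ("mixed-type", "SUM", [A, A]),
    ("dont-move", "GAUGE", [{"id": "eth0"}]),
]


def regroup(resource_attrs, metrics, keys):
    """Group data points by target resource, then by (metric name, data type)."""
    grouped = defaultdict(lambda: defaultdict(list))
    for name, data_type, points in metrics:
        for point in points:
            matched = {k: point[k] for k in keys if k in point}
            # New resource: original attributes overridden by the matched values.
            # With no match, this is simply the original resource.
            target = {**resource_attrs, **matched}
            # Grouping attributes are dropped from the data point itself.
            rest = {k: v for k, v in point.items() if k not in matched}
            resource_key = tuple(sorted(target.items()))
            # Same name and same type share one entry; different types do not.
            grouped[resource_key][(name, data_type)].append(rest)
    return {res: dict(by_metric) for res, by_metric in grouped.items()}
```

Calling regroup(ORIGINAL_RESOURCE, METRICS, ["host.name"]) yields three resource keys. Keying each metric by both name and data type is what keeps the two mixed-type metrics apart while the two gauge-1 instances collapse into one.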
Compact data
In some cases, data might come in single requests to the Collector, or become fragmented due to use of the groupbytrace processor. Even after batching, there might be multiple duplicated ResourceSpans, ResourceMetrics, or ResourceLogs objects, which leads to additional memory consumption, increased processing costs, inefficient serialization, and a higher number of export requests.
To remedy this, use the groupbyattrs processor to compact the data by matching Resource and InstrumentationLibrary properties.
For example, consider the following input:
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=1, ...}
  InstrumentationLibrary {name="OtherLibrary"}
    Spans
      Span {span_id=2, ...}
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=3, ...}
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=4, ...}
Resource {host.name="otherhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=5, ...}
Use the following configuration to re-associate the spans with matching Resource and InstrumentationLibrary.
processors:
  batch:
  groupbyattrs:

service:
  pipelines:
    traces:
      processors: [batch, groupbyattrs]
      ...
The output of the processor is:
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=1, ...}
      Span {span_id=3, ...}
      Span {span_id=4, ...}
  InstrumentationLibrary {name="OtherLibrary"}
    Spans
      Span {span_id=2, ...}
Resource {host.name="otherhost"}
  InstrumentationLibrary {name="MyLibrary"}
    Spans
      Span {span_id=5, ...}
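The compaction step amounts to merging entries that share the same resource attributes and the same InstrumentationLibrary name. The Python sketch below shows that idea on the input above. It is only an illustration: compact and FRAGMENTED are hypothetical names, spans are reduced to their span_id, and a library is identified by its name alone.

```python
# Input from the example: (resource attributes, [(library name, [span ids])]).
FRAGMENTED = [
    ({"host.name": "localhost"}, [("MyLibrary", [1]), ("OtherLibrary", [2])]),
    ({"host.name": "localhost"}, [("MyLibrary", [3])]),
    ({"host.name": "localhost"}, [("MyLibrary", [4])]),
    ({"host.name": "otherhost"}, [("MyLibrary", [5])]),
]


def compact(resource_spans):
    """Merge entries with equal resource attributes and equal library names."""
    merged = {}
    for resource_attrs, libraries in resource_spans:
        # A sorted tuple of items makes the attribute set usable as a dict key.
        resource_key = tuple(sorted(resource_attrs.items()))
        by_library = merged.setdefault(resource_key, {})
        for library_name, span_ids in libraries:
            # Spans keep their arrival order within each library.
            by_library.setdefault(library_name, []).extend(span_ids)
    return merged
```

Running compact(FRAGMENTED) reduces the four input resource entries to two, one per distinct host.name, which mirrors the output listed above.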
Settings
The following table shows the configuration options for the groupbyattrs processor:
Internal metrics
The groupbyattrs processor records the following internal metrics:
| Metric | Description |
|---|---|
| num_grouped_spans | Number of spans that had attributes grouped |
| num_non_grouped_spans | Number of spans that did not have attributes grouped |
| span_groups | Distribution of groups extracted for spans |
| num_grouped_logs | Number of logs that had attributes grouped |
| num_non_grouped_logs | Number of logs that did not have attributes grouped |
| log_groups | Distribution of groups extracted for logs |
| num_grouped_metrics | Number of metrics that had attributes grouped |
| num_non_grouped_metrics | Number of metrics that did not have attributes grouped |
| metric_groups | Distribution of groups extracted for metrics |
Troubleshooting
If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.
Available to Splunk Observability Cloud customers
Submit a case in the Splunk Support Portal.
Contact Splunk Support.
Available to prospective customers and free trial users
Ask a question and get answers through community support at Splunk Answers.
Join the Splunk community #observability Slack channel to communicate with customers, partners, and Splunk employees worldwide.