Resource detection processor

Use the resource detection processor to detect resources and manipulate information about them in OpenTelemetry format. Read on to learn how to configure the component.

The resource detection processor is an OpenTelemetry Collector component that can detect resources in the incoming telemetry and collect additional metadata about them. The supported pipeline types are traces, metrics, and logs. See Process your data with pipelines for more information.

The resource detection processor uses detectors to collect system metadata from a variety of sources. The detection targets supported by the resource detection processor are the following:

On-host environment variables
On-host system information
Amazon Web Services EC2, ECS, EKS, Elastic Beanstalk, and Lambda
Azure instances and AKS
Google Cloud Platform GCE, GKE, Cloud Run, Cloud Functions, and App Engine
Consul agents
OpenShift and Kubernetes
Docker containers
Heroku

You can use metadata collected by the resource detection processor to expand or overwrite resource values in the collected telemetry. By default, the processor overrides existing resource metadata. To keep existing values and add only the attributes that are missing, set override to false.

Note: For information about the Resource processor, see Resource processor.

Get started

Note: This component is included in the default configuration of the Splunk Distribution of the OpenTelemetry Collector. For details about the default configuration, see Configure the Collector for Kubernetes with Helm, Collector for Linux default configuration, or Collector for Windows default configuration. You can customize your configuration any time as explained in this document.

By default, the Splunk Distribution of OpenTelemetry Collector includes the resource detection processor in all the predefined pipelines when deploying in host monitoring (agent) mode. When deploying the Collector in data forwarding (gateway) mode, the resource detection processor collects internal metrics. See Collector deployment modes for more information.

To detect more types of resources, you can configure additional processors and add them to existing or new pipelines, as shown in the following sample configurations.

CAUTION: Don’t remove the resourcedetection or the resourcedetection/internal processors from the configuration. Removing the processor might prevent Splunk Observability Cloud from collecting infrastructure metadata.

Follow these steps to configure and activate the component:

Deploy the Splunk Distribution of OpenTelemetry Collector to your host or container platform.
Configure the processor as described in this document.
Restart the Collector.

Main configuration

The resource detection processor accepts a list of detectors in the detectors setting. For each detector, you can specify which resource attributes are collected or ignored. You can also specify whether existing attributes are overridden. See Available detectors and metadata for a list of detectors.

Note: Starting from version 0.81 of the Collector, the attributes setting is deprecated. To migrate from attributes to resource_attributes, see Migrate from attributes to resource_attributes.

The following example shows the main configuration settings of the resource detection processor:

resourcedetection:
  # List of detectors
  detectors: [ec2, system]
  # Whether to override existing attributes. Default is true
  override: true
  system:
    resource_attributes:
      host.name:
        enabled: true
      host.id:
        enabled: false
  ec2:
    resource_attributes:
      host.name:
        enabled: false
      host.id:
        enabled: true

Next, include the processor in the required pipelines of the service section of your configuration file:

service:
  pipelines:
    metrics:
      processors: [resourcedetection]
    logs:
      processors: [resourcedetection]
    traces:
      processors: [resourcedetection]

Ordering considerations

If multiple detectors insert the same attribute name, the value from the first detector in the list takes precedence. For example, if you list the eks detector before the ec2 detector, the value of the cloud.platform attribute is aws_eks instead of aws_ec2.

When using multiple AWS detectors, follow this order: lambda, elastic_beanstalk, eks, ecs, ec2.
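
As a minimal sketch of the ordering above, a processor instance for AWS workloads might list the detectors as follows. The instance name resourcedetection/aws is an assumption chosen for illustration. Keep only the detectors that apply to your environment.

```yaml
processors:
  # Hypothetical instance name, not part of the default configuration
  resourcedetection/aws:
    # More specific platforms come first, so their value for a shared
    # attribute such as cloud.platform is the one that is kept
    detectors: [lambda, elastic_beanstalk, eks, ecs, ec2]
```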

Detect resources and collect data

The following sample configurations show how to detect resources from different targets.

Collect EC2 resources and tags

The following example shows how to detect resources, environment variables, and selected tags from EC2 instances without overwriting existing metadata:

processors:
  resourcedetection/ec2:
    detectors: [env, ec2]
    timeout: 2s
    override: false
    ec2:
      # List of attributes to collect or ignore
      resource_attributes:
        host.name:
          enabled: false
        host.id:
          enabled: true
      # Regex patterns for tag keys you want to add as resource attributes
      tags:
        - ^tag1$
        - ^tag2$
        - ^label.*$
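
A processor defined under a custom name only takes effect once a pipeline references it. The following sketch assumes the resourcedetection/ec2 instance from the previous example and adds it to the metrics pipeline next to the default processor. The choice of pipeline is an assumption for illustration.

```yaml
service:
  pipelines:
    metrics:
      # Reference the named instance exactly as it is declared under processors
      processors: [resourcedetection, resourcedetection/ec2]
```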

Collect OpenShift resources over a TLS connection

The following example shows how to collect resource attributes from OpenShift and the Kubernetes API by specifying an IP address and port, as well as a TLS certificate and service token:

processors:
  resourcedetection/openshift:
    detectors: [openshift]
    timeout: 2s
    override: false
    openshift:
      address: "127.0.0.1:4444"
      token: "<token>"
      tls:
        insecure: false
        ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"

Collect system metadata using all available sources

The following example shows how to use all sources available to the system detector to determine the host name. The resource_attributes field tells the processor to only include the selected attributes.

processors:
  resourcedetection/system:
    detectors: ["system"]
    system:
      # Default is "dns" and "os"
      hostname_sources: ["lookup", "cname", "dns", "os"]
      # Attributes to collect or ignore. Invalid names are ignored
      resource_attributes:
        host.name:
          enabled: true
        host.id:
          enabled: true

Available detectors and metadata

The resource detection processor uses detectors to collect resource metadata. By default, the following detectors are active in the Splunk Distribution of OpenTelemetry Collector: gcp, ecs, ec2, azure, and system.
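
Expressed as configuration, that default set of detectors would look roughly like the following. This is a sketch derived from the list above, not a copy of the shipped configuration file. The override value shown is the processor default.

```yaml
processors:
  resourcedetection:
    # Detectors that are active by default, per the list above
    detectors: [gcp, ecs, ec2, azure, system]
    # Processor default, shown for clarity
    override: true
```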

Amazon Elastic Beanstalk metadata

The elastic_beanstalk detector collects the following resource attributes by reading the AWS X-Ray configuration on all Beanstalk instances that have X-Ray activated:

cloud.provider (Value: aws)
cloud.platform (Value: aws_elastic_beanstalk)
deployment.environment
service.instance.id
service.version

Amazon EKS metadata

The eks detector collects the following resource attributes:

cloud.provider (Value: aws)
cloud.platform (Value: aws_eks)

AWS EC2 metadata

The ec2 detector collects the following resource attributes:

cloud.provider (Value: aws)
cloud.platform (Value: aws_ec2)
cloud.account.id
cloud.region
cloud.availability_zone
host.id
host.image.id
host.name
host.type

The ec2 detector can also collect tags. To collect tags, add the ec2:DescribeTags permission to the EC2 instance’s policy. If you’re using a proxy on the EC2 instance, allow requests for metadata.

AWS ECS metadata

The ecs detector collects the following resource attributes. Only Task Metadata Endpoint (TMDE) versions 3 and 4 are supported.

cloud.provider (Value: aws)
cloud.platform (Value: aws_ecs)
cloud.account.id
cloud.region
cloud.availability_zone
aws.ecs.cluster.arn
aws.ecs.task.arn
aws.ecs.task.family
aws.ecs.task.revision
aws.ecs.launchtype (TMDE version 4 only)
aws.log.group.names (TMDE version 4 only)
aws.log.group.arns (TMDE version 4 only)
aws.log.stream.names (TMDE version 4 only)
aws.log.stream.arns (TMDE version 4 only)

AWS Lambda metadata

The lambda detector collects the following resource attributes using runtime environment variables:

aws.log.group.names (Value: $AWS_LAMBDA_LOG_GROUP_NAME)
aws.log.stream.names (Value: $AWS_LAMBDA_LOG_STREAM_NAME)
cloud.provider (Value: aws)
cloud.platform (Value: aws_lambda)
cloud.region (Value: $AWS_REGION)
faas.name (Value: $AWS_LAMBDA_FUNCTION_NAME)
faas.version (Value: $AWS_LAMBDA_FUNCTION_VERSION)
faas.instance (Value: $AWS_LAMBDA_LOG_STREAM_NAME)
faas.max_memory (Value: $AWS_LAMBDA_FUNCTION_MEMORY_SIZE)

Azure metadata

The azure detector collects the following resource attributes through the Azure Instance Metadata Service:

cloud.provider (Value: azure)
cloud.platform (Value: azure_vm)
cloud.region
cloud.account.id (Value: Subscription ID)
host.id (Value: Virtual machine ID)
host.name
azure.vm.name (Same as host.name)
azure.vm.size (Value: Virtual machine size)
azure.vm.scaleset.name (Value: Name of the scale set, if any)
azure.resourcegroup.name (Value: Resource group name)

Azure AKS metadata

The aks detector collects the following resource attributes:

cloud.provider (Value: azure)
cloud.platform (Value: azure_aks)

Consul metadata

The consul detector collects the following resource attributes by querying a Consul agent and reading its configuration endpoint:

cloud.region (Value: Consul data center)
host.id (Value: Consul node id)
host.name (Value: Consul node name)

The detector also collects all key-value pairs in Consul metadata and converts them into label-value pairs.

Docker metadata

The docker detector collects the following resource attributes from the host machine by querying the Docker daemon:

host.name
os.type

For Heroku applications, the dyno ID identifies the virtualized environment.

Note: To contact the Docker daemon, mount the Docker socket. On Linux, the socket is /var/run/docker.sock.
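
If you run the Collector with Docker Compose, mounting the socket might look like the following sketch. The service name otel-collector and the image placeholder are assumptions for illustration. Replace them with the values from your own deployment.

```yaml
# Hypothetical Compose file fragment
services:
  otel-collector:
    # Placeholder: use the Collector image and tag you actually deploy
    image: "<collector-image>"
    volumes:
      # Mount the host's Docker socket at the same path inside the container
      - /var/run/docker.sock:/var/run/docker.sock
```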

Environment variables

The env detector collects resource information from the OTEL_RESOURCE_ATTRIBUTES environment variable, which contains a comma-separated list of key-value pairs. Each key is separated from its value by the = character.
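
The following sketch shows one way to use the env detector. The attribute values in the comment and the instance name resourcedetection/env are assumptions for illustration. The variable must be set in the environment of the Collector process before it starts.

```yaml
# Assumed to be set in the Collector's environment, for example:
#   OTEL_RESOURCE_ATTRIBUTES="deployment.environment=staging,service.version=1.2.3"
processors:
  # Hypothetical instance name
  resourcedetection/env:
    detectors: [env]
    # Keep attributes that are already present on the telemetry
    override: false
```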

GCP metadata

The gcp detector uses the Google Cloud client libraries to read resource information from the metadata server, as well as environment variables. The detector uses the metadata to determine which GCP application is running and extracts relevant attributes.

GCE metadata

The gcp detector collects the following resource attributes from GCE:

cloud.provider (Value: gcp)
cloud.platform (Value: gcp_compute_engine)
cloud.account.id (Value: Project ID)
cloud.region (For example, us-central1)
cloud.availability_zone (For example, us-central1-c)
host.id (Value: Instance ID)
host.name (Value: Instance name)
host.type (Value: Machine type)

GKE metadata

The gcp detector collects the following resource attributes from GKE:

cloud.provider (Value: gcp)
cloud.platform (Value: gcp_kubernetes_engine)
cloud.account.id (Value: Project ID)
cloud.region (Only for regional GKE clusters. For example, us-central1)
cloud.availability_zone (Only for zonal GKE clusters. For example, us-central1-c)
k8s.cluster.name
host.id (Value: Instance ID)
host.name (Value: Instance name, only when workload identity is deactivated)

Google App Engine metadata

The gcp detector collects the following resource attributes from Google App Engine:

cloud.provider (Value: gcp)
cloud.platform (Value: gcp_app_engine)
cloud.account.id (Value: Project ID)
cloud.region (For example, us-central1)
cloud.availability_zone (For example, us-central1-c)
faas.id (Value: Instance ID)
faas.name (Value: Service name)
faas.version (Value: Service version)

Google Cloud Run metadata

The gcp detector collects the following resource attributes from Google Cloud Run:

cloud.provider (Value: gcp)
cloud.platform (Value: gcp_cloud_run)
cloud.account.id (Value: Project ID)
cloud.region (For example, us-central1)
faas.id (Value: Instance ID)
faas.name (Value: Service name)
faas.version (Value: Service version)

Google Cloud Functions metadata

The gcp detector collects the following resource attributes from Google Cloud Functions:

cloud.provider (Value: gcp)
cloud.platform (Value: gcp_cloud_functions)
cloud.account.id (Value: Project ID)
cloud.region (For example, us-central1)
faas.id (Value: Instance ID)
faas.name (Value: Function name)
faas.version (Value: Function version)

Heroku metadata

The heroku detector collects the following resource attributes through the Heroku metadata feature:

heroku.release.version (Value: Identifier for the current release)
heroku.release.creation_timestamp (Value: Creation time and date of the release)
heroku.release.commit (Value: Commit hash for the current release)
heroku.app.name (Value: Application name)
heroku.app.id (Value: Unique identifier for the application)
heroku.dyno.id (Value: Dyno identifier. Used as host name)

Note: Activate the Heroku metadata feature for your application before using the heroku detector.

OpenShift metadata

The openshift detector collects the following resource attributes by querying the OpenShift and Kubernetes API:

cloud.provider
cloud.platform
cloud.region
k8s.cluster.name

By default, the detector determines the API address using the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables, and reads the service token from /var/run/secrets/kubernetes.io/serviceaccount/token. If TLS is active and you don’t define a CA file, the detector uses the certificate in /var/run/secrets/kubernetes.io/serviceaccount/ca.crt. You can override all of these settings in the configuration.
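
When the Collector runs inside the cluster, a configuration that relies entirely on these defaults can be as short as the following sketch. The instance name resourcedetection/openshift-defaults is an assumption for illustration.

```yaml
processors:
  # Hypothetical instance name
  resourcedetection/openshift-defaults:
    detectors: [openshift]
    # No openshift block: the address, service token, and CA file
    # fall back to the in-cluster defaults described above
```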

System metadata

The system detector collects the following resource attributes:

host.name
host.id
os.type

By default, the detector sets the host.name attribute to the fully qualified domain name (FQDN) when available, and falls back to the host name provided by the operating system.

The default configuration of the detector is hostname_sources: ["dns", "os"], which can be overridden using the following supported values:

cname: Canonical name.
dns: Either the host name from the hosts file, the CNAME, or the result of a reverse DNS query, in that order.
lookup: Reverse DNS lookup of the current host’s IP address.
os: Host name provided by the local machine’s kernel.

To avoid using the FQDN, set the value of the hostname_sources field to os.
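
For example, the following sketch limits the system detector to the operating system host name. The instance name resourcedetection/system is reused from the earlier sample and is not required.

```yaml
processors:
  resourcedetection/system:
    detectors: [system]
    system:
      # Use only the host name reported by the OS, skipping DNS lookups
      hostname_sources: ["os"]
```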

Note: Use the docker detector if you’re running the Collector as a Docker container.

Migrate from attributes to resource_attributes

Starting from version 0.81 of the Collector, the resource detection processor deprecates the attributes option and replaces it with resource_attributes, which is specific to each detector.

To migrate, move each attribute listed in attributes to the resource_attributes setting of the relevant detector. For example, consider the following configuration:

resourcedetection:
  detectors: [system]
  # Deprecated in version 0.81
  attributes: ['host.name', 'host.id']

You can replace the previous configuration with the following:

resourcedetection:
  detectors: [system]
  system:
    resource_attributes:
      host.name:
        enabled: true
      host.id:
        enabled: true
      os.type:
        enabled: false

Settings

The configuration options for the resource detection processor are listed in the following metadata file:

https://raw.githubusercontent.com/splunk/collector-config-tools/main/cfg-metadata/processor/resourcedetection.yaml

Troubleshooting

If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.

Available to Splunk Observability Cloud customers

Submit a case in the Splunk Support Portal.
Contact Splunk Support.

Available to prospective customers and free trial users

Ask a question and get answers through community support at Splunk Answers.
Join the Splunk community #observability Slack channel to communicate with customers, partners, and Splunk employees worldwide.