Kubernetes API server
Use this Splunk Observability Cloud integration for the kubernetes-apiserver monitor. See the benefits, installation steps, configuration options, and available metrics.
The Splunk Distribution of OpenTelemetry Collector uses the Smart Agent receiver with the Kubernetes API server monitor type to retrieve metrics from the API server’s Prometheus metric endpoint.
This integration is available on Kubernetes, Linux, and Windows.
This integration requires access to the kube-apiserver pods in the control plane. Because several Kubernetes-as-a-service distributions don't expose the control plane pods to end users, metric collection might not be possible in those cases.
Benefits
After you configure the integration, you can access these features:
- View metrics. You can create your own custom dashboards, and most monitors provide built-in dashboards as well. For information about dashboards, see View dashboards in Splunk Observability Cloud.
- View a data-driven visualization of the physical servers, virtual machines, AWS instances, and other resources in your environment that are visible to Infrastructure Monitoring. For information about navigators, see Use navigators in Splunk Infrastructure Monitoring.
- Access the Metric Finder and search for metrics sent by the monitor. For information, see Search the Metric Finder and Metadata Catalog.
Installation
Follow these steps to deploy this integration:
- Deploy the Splunk Distribution of the OpenTelemetry Collector to your host or container platform.
- Configure the integration, as described in the Configuration section.
- Restart the Splunk Distribution of the OpenTelemetry Collector.
Configuration
To use this integration of a Smart Agent monitor with the Collector:
- Include the Smart Agent receiver in your configuration file.
- Add the monitor type to the Collector configuration, both in the receivers section and in the pipelines section.
- See how to Use Smart Agent monitors with the Collector.
- See how to set up the Smart Agent receiver.
- For a list of common configuration options, refer to Common configuration settings for monitors.
- Learn more about the Collector at Get started: Understand and use the Collector.
Example
To activate this integration, add the following to your Collector configuration:
receivers:
  smartagent/kubernetes-apiserver:
    type: kubernetes-apiserver
    ... # Additional config
Next, add the monitor to the service.pipelines.metrics.receivers section of your configuration file:
service:
  pipelines:
    metrics:
      receivers: [smartagent/kubernetes-apiserver]
See the kubernetes-yaml examples in GitHub for the Agent and Gateway YAML files.
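To see how the receiver and pipeline fragments fit together, the following is a minimal end-to-end sketch. The signalfx exporter and the access_token and realm values shown are illustrative assumptions, not part of this integration's documentation; substitute the exporter and credentials used in your deployment.

receivers:
  smartagent/kubernetes-apiserver:
    type: kubernetes-apiserver
    host: localhost
    port: 443

exporters:
  # Sends metrics to Splunk Observability Cloud. Both values below are placeholders.
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    realm: us0

service:
  pipelines:
    metrics:
      receivers: [smartagent/kubernetes-apiserver]
      exporters: [signalfx]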
Example: Kubernetes observer
The following is an example YAML configuration:
receivers:
  smartagent/kubernetes-apiserver:
    type: kubernetes-apiserver
    host: localhost
    port: 443
    extraDimensions:
      metric_source: kubernetes-apiserver
The OpenTelemetry Collector has a Kubernetes observer (k8sobserver) that can be implemented as an extension to discover networked endpoints, such as a Kubernetes pod. Using this observer assumes that the OpenTelemetry Collector is deployed in host monitoring (agent) mode, where it runs on each individual node or host instance.
To use the observer, create a receiver creator instance with an associated rule. For example:
extensions:
  # Configures the Kubernetes observer to watch for pod start and stop events.
  k8s_observer:

receivers:
  receiver_creator/1:
    # Name of the extensions to watch for endpoints to start and stop.
    watch_observers: [k8s_observer]
    receivers:
      smartagent/kubernetes-apiserver:
        rule: type == "pod" && labels["k8s-app"] == "kube-apiserver"
        config:
          type: kubernetes-apiserver
          port: 443
          extraDimensions:
            metric_source: kubernetes-apiserver

processors:
  exampleprocessor:

exporters:
  exampleexporter:

service:
  pipelines:
    metrics:
      receivers: [receiver_creator/1]
      processors: [exampleprocessor]
      exporters: [exampleexporter]
  extensions: [k8s_observer]
See Receiver creator for more information.
Configuration settings
The following table shows the configuration options for this monitor:
Option | Required | Type | Description |
---|---|---|---|
httpTimeout | no | int64 | HTTP timeout duration for both reads and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration. (default: 10s) |
username | no | string | Basic Auth username to use on each request, if any. |
password | no | string | Basic Auth password to use on each request, if any. |
useHTTPS | no | bool | If true, the agent connects to the server using HTTPS instead of plain HTTP. (default: false) |
httpHeaders | no | map of strings | A map of HTTP header names to values. Comma-separated multiple values for the same message-header are supported. |
skipVerify | no | bool | If useHTTPS is true and this option is also true, the exporter's TLS certificate is not verified. (default: false) |
caCertPath | no | string | Path to the CA certificate that has signed the TLS certificate. Unnecessary if skipVerify is set to false. |
clientCertPath | no | string | Path to the client TLS certificate to use for TLS required connections. |
clientKeyPath | no | string | Path to the client TLS key to use for TLS required connections. |
host | yes | string | Host of the exporter. |
port | yes | integer | Port of the exporter. |
useServiceAccount | no | bool | Use the pod service account to authenticate. (default: false) |
metricPath | no | string | Path to the metrics endpoint on the exporter server, usually /metrics. (default: /metrics) |
sendAllMetrics | no | bool | Send all the metrics that come out of the Prometheus exporter without any filtering. This option has no effect when using the prometheus-exporter monitor directly, since there is no built-in filtering; it only applies when embedding it in other monitors. (default: false) |
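As an illustration of how several of these options combine, the following is a hedged sketch; the host, port, and option values shown are placeholders for demonstration only and are not a recommended configuration:

receivers:
  smartagent/kubernetes-apiserver:
    type: kubernetes-apiserver
    host: localhost
    port: 443
    useHTTPS: true
    # Authenticate with the pod's service account token instead of basic auth.
    useServiceAccount: true
    # Skip TLS verification only in test environments; prefer caCertPath in production.
    skipVerify: true
    extraDimensions:
      metric_source: kubernetes-apiserver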
Metrics
The following metrics are available for this integration:
The complete list of metrics for this integration is defined in the monitor's metadata.yaml file: https://raw.githubusercontent.com/signalfx/splunk-otel-collector/main/internal/signalfx-agent/pkg/monitors/kubernetes/apiserver/metadata.yaml
Notes
- To learn more about the metric types available in Splunk Observability Cloud, see Metric types.
- In host-based subscription plans, default metrics are those metrics included in host-based subscriptions in Splunk Observability Cloud, such as host, container, or bundled metrics. Custom metrics are not provided by default and might be subject to charges. See Metric categories for more information.
- In MTS-based subscription plans, all metrics are custom.
- To add additional metrics, see how to configure extraMetrics in Add additional metrics. A configuration sketch follows this list.
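As an illustration only, the following is a hedged sketch of turning on extra metrics for this receiver; the metric names listed are placeholders chosen for the example, not a recommendation:

receivers:
  smartagent/kubernetes-apiserver:
    type: kubernetes-apiserver
    host: localhost
    port: 443
    # Emit additional metrics beyond the default set. Glob patterns are supported.
    # Both entries below are placeholder examples.
    extraMetrics:
      - apiserver_request_total
      - "apiserver_*_duration_seconds"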
Troubleshooting
You’re getting a "bind: address already in use" error message
If you see an error message such as "bind: address already in use", another resource is already using the port that the current configuration requires. This resource could be another application, or a tracing tool such as Jaeger or Zipkin.
You can modify the configuration to use another port; see the sketch later in this section for an example. You can modify any of these endpoints or ports:
- Receiver endpoint
- Extensions endpoint
- Metrics address (if port 8888)
If you see this error message on Kubernetes and you’re using Helm charts, modify the configuration by updating the chart values for both configuration and exposed ports.
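For example, the following is a hedged sketch of moving the Collector's internal metrics address off the default port 8888, assuming your Collector version supports the service.telemetry.metrics.address setting; the alternative port shown is arbitrary:

service:
  telemetry:
    metrics:
      # Move the Collector's own metrics endpoint off the conflicting port.
      # Port 8889 is an arbitrary example; choose any free port.
      address: 0.0.0.0:8889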
You’re getting an "Ignoring receiver as it is not used by any pipeline" error message
This error happens when a component (receiver, processor, or exporter) has been configured but is not used in any pipeline. For example, the following error message tells you that the smartagent/http receiver is configured, but that it is not used by any pipeline:
"2021-10-19T20:18:40.556Z info builder/receivers_builder.go:112 Ignoring receiver as it is not used by any pipeline {"kind": "receiver", "name": "smartagent/http"}"
Once configured, all components must be turned on by using pipelines in the service section. The service section is used to configure what components are activated based on the configuration found in the components sections of your configuration file. If a component is configured, but not defined within the service section, then it is not activated.
Here is a sample configuration:
service:
  pipelines:
    # Pipelines can contain multiple subsections, one per pipeline.
    traces:
      # Traces is the pipeline type.
      receivers: [otlp, jaeger, zipkin]
      processors: [memory_limiter, batch]
      exporters: [otlp, jaeger, zipkin]
See Process your data with pipelines for more information.
The Splunk Distribution of OpenTelemetry Collector is out of memory
If you receive high memory usage or out of memory warnings, do the following before opening a support case:
- Verify that you have installed the latest version of the Splunk Distribution of OpenTelemetry Collector for Kubernetes.
- Add or change the memory_limiter processor in your configuration file. For example:
  processors:
    memory_limiter:
      ballast_size_mib: 2000
      # ballast_size_mib is the size, in MiB, of the ballast used by the process.
      # This must match the value of the mem-ballast-size-mib command line option (if used).
      # Otherwise, the memory limiter does not work correctly.
      check_interval: 5s
      # check_interval is the time between measurements of memory usage for the purposes of avoiding going over the limits.
      # The default is 0. Values below 1s are not recommended, as this can result in unnecessary CPU consumption.
      limit_mib: 4000
      # Maximum amount of memory, in MiB, targeted to be allocated by the process heap.
      # The total memory usage of the process is typically about 50 MiB higher than this value.
      spike_limit_mib: 500
      # The maximum, in MiB, spike expected between the measurements of memory usage.
- Try to reproduce the error and collect a heap dump close to the point where the memory kill occurs:
- Add the pprof extension to the component configuration that is failing. Make sure you turn on this extension in the service section (a configuration sketch follows these steps).
- Capture the output of the following commands against the problematic pod:
  curl http://127.0.0.1:1777/debug/pprof/goroutine?debug=2
  curl http://127.0.0.1:1777/debug/pprof/heap > heap.out
- For example, if you discover that the pod lasts 5 minutes before it gets killed:
  - Bounce the pod and collect the first set of data after the startup.
  - Wait 3 minutes and collect another set of data. Make sure to label the data accordingly.
  - Collect another set of data before the crash, if possible.
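The following is a hedged sketch of turning on the pprof extension for this kind of capture; the endpoint shown is the extension's conventional default and is included only for clarity:

extensions:
  # Exposes Go profiling endpoints (goroutine and heap dumps) for the Collector process.
  pprof:
    endpoint: 127.0.0.1:1777

service:
  # The extension must be listed here to be activated.
  extensions: [pprof]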
How long does it take for the pod to be killed due to the memory limit? Check the logs at the time of the issue to see if there are any obvious repeating errors.
Gather additional support information, including your end-to-end architecture information. See Gather information to open a support request.
If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.
Available to Splunk Observability Cloud customers
- Submit a case in the Splunk Support Portal.
- Contact Splunk Support.
Available to prospective customers and free trial users
- Ask a question and get answers through community support at Splunk Answers.
- Join the Splunk #observability user group Slack channel to communicate with customers, partners, and Splunk employees worldwide. To join, see Chat groups.