Collect logs and events with the Collector for Kubernetes
Configure logs and events for the Splunk Distribution of OpenTelemetry Collector for Kubernetes.
Starting with version 0.86.0, the Splunk Distribution of the Collector for Kubernetes collects native OpenTelemetry logs by default.
The following applies:
Use version 0.80.0 or higher of the Splunk OpenTelemetry Collector to correlate logs and traces in Istio environments.
If you can't upgrade the Collector to the required version, use Fluentd for log collection and deploy the Helm chart with autodetect.istio=true, as shown in the sketch after this list. See Splunk OpenTelemetry Collector version 0.80.0 for more information.
The Collector cannot collect Journald logs natively.
Log collection is not supported in GKE Autopilot.
See also the other rules and limitations for metrics and dimensions. For instance, an MTS can have up to 36 dimensions; if a data point exceeds that limit, it is dropped.
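If you need the Fluentd fallback for Istio environments described above, the following values.yaml sketch shows the two relevant settings. This is a minimal example; merge it into your existing Helm chart values:
# Use Fluentd instead of native OpenTelemetry log collection
logsEngine: fluentd
# Detect Istio so logs and traces can be correlated
autodetect:
  istio: true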
Add log files from Kubernetes host machines or volumes
To add additional log files to be ingested from Kubernetes host machines and Kubernetes volumes, use agent.extraVolumes, agent.extraVolumeMounts, and logsCollection.extraFileLogs in the values.yaml file used to deploy the Collector for Kubernetes.
The following example shows how to add logs from Kubernetes host machines:
logsCollection:
  extraFileLogs:
    filelog/audit-log:
      include: [/var/log/kubernetes/apiserver/audit.log]
      start_at: beginning
      include_file_path: true
      include_file_name: false
      resource:
        com.splunk.source: /var/log/kubernetes/apiserver/audit.log
        host.name: 'EXPR(env("K8S_NODE_NAME"))'
        com.splunk.sourcetype: kube:apiserver-audit
agent:
  extraVolumeMounts:
    - name: audit-log
      mountPath: /var/log/kubernetes/apiserver
  extraVolumes:
    - name: audit-log
      hostPath:
        path: /var/log/kubernetes/apiserver
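To ingest log files from a Kubernetes volume instead of a host path, the same three settings apply. The following is a minimal sketch that assumes a hypothetical PersistentVolumeClaim named app-logs-pvc containing an app.log file; adjust the volume definition, paths, and sourcetype to match your environment:
logsCollection:
  extraFileLogs:
    filelog/app-log:
      # Path where the shared volume is mounted inside the agent container
      include: [/var/log/app/app.log]
      start_at: beginning
      include_file_path: true
      include_file_name: false
      resource:
        com.splunk.source: /var/log/app/app.log
        host.name: 'EXPR(env("K8S_NODE_NAME"))'
        com.splunk.sourcetype: kube:app-log
agent:
  extraVolumeMounts:
    - name: app-log
      mountPath: /var/log/app
  extraVolumes:
    - name: app-log
      # Hypothetical PersistentVolumeClaim shared with the application pods
      persistentVolumeClaim:
        claimName: app-logs-pvc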
Process multi-line logs
The Splunk Distribution of the OpenTelemetry Collector for Kubernetes supports parsing multi-line logs, making them easier to read, understand, and troubleshoot.
To process multi-line logs, add the following section to your values.yaml configuration:
logsCollection:
  containers:
    multilineConfigs:
      - namespaceName:
          value: default
        podName:
          value: buttercup-app-.*
          useRegexp: true
        containerName:
          value: server
        firstEntryRegex: ^[^\s].*
Use regex101 to find a Golang regex that works for your format, and specify it in the firstEntryRegex option of the configuration file.
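For example, if each log record in your application starts with an ISO 8601 date (a hypothetical format; verify against your own logs), a first-entry regex such as the following is a reasonable starting point:
logsCollection:
  containers:
    multilineConfigs:
      - namespaceName:
          value: default
        containerName:
          value: server
        # Each new log record starts with a date such as 2024-01-31
        firstEntryRegex: ^\d{4}-\d{2}-\d{2}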
Manage log ingestion using annotations
Send logs to different indexes
The Collector for Kubernetes uses main as the default Splunk platform index. Use the splunk.com/index annotation on pods or namespaces to indicate which Splunk platform indexes you want to send logs to.
For example, to send logs from the kube-system namespace to the k8s_events index, use the command:
kubectl annotate namespace kube-system splunk.com/index=k8s_events
Filter logs using pod or namespace annotations
If logsCollection.containers.useSplunkIncludeAnnotation is false (default value), set the splunk.com/exclude annotation to true on pods or namespaces to exclude their logs from being ingested. For example:
# annotates a namespace
kubectl annotate namespace <my-namespace> splunk.com/exclude=true
# annotates a pod
kubectl annotate pod -n <my-namespace> <my-pod> splunk.com/exclude=true
If logsCollection.containers.useSplunkIncludeAnnotation is true, set the splunk.com/include annotation to true on pods or namespaces to ingest only their logs. All other logs are ignored. For example:
# annotates a namespace
kubectl annotate namespace <my-namespace> splunk.com/include=true
# annotates a pod
kubectl annotate pod -n <my-namespace> <my-pod> splunk.com/include=true
Filter source types
Use the splunk.com/sourcetype annotation on a pod to overwrite the sourcetype field. If not set, the sourcetype defaults to kube:container:CONTAINER_NAME.
kubectl annotate pod -n <my-namespace> <my-pod> splunk.com/sourcetype=kube:apiserver-audit
Review performance benchmarks
Configurations set using the Collector for Kubernetes Helm chart might have an impact on overall performance of log ingestion. The more receivers, processors, exporters, and extensions you add to any of the pipelines, the greater the performance impact.
The Collector for Kubernetes can exceed the default throughput of the HTTP Event Collector (HEC). To address capacity needs, monitor the HEC throughput and back pressure on the Collector for Kubernetes deployments and, if necessary, add additional nodes.
The following table provides a summary of performance benchmarks run internally:
| Log generator count | Event size (byte) | Agent CPU usage | Agent EPS |
|---|---|---|---|
| 1 | 256 | 1.8 | 30,000 |
| 1 | 516 | 1.8 | 28,000 |
| 1 | 1024 | 1.8 | 24,000 |
| 5 | 256 | 3.2 | 54,000 |
| 7 | 256 | 3 | 52,000 |
| 10 | 256 | 3.2 | 53,000 |
The data pipelines for these test runs involved reading container logs as they were being written, parsing the file name for metadata, enriching the logs with Kubernetes metadata, reformatting the data structure, and sending the logs (without compression) to the Splunk HEC endpoint.
Use Fluentd to collect Kubernetes logs
Alternatively, you can use Fluentd to collect Kubernetes logs and send them through the Collector, which does all of the necessary metadata enrichment.
Add the following line to your configuration to use Fluentd to collect logs:
logsEngine: fluentd
Display Fluentd logs in Kubernetes navigators
Kubernetes navigators display Kubernetes logs that contain OpenTelemetry metadata fields. To display logs collected from Fluentd, you must create field aliases to map Fluentd fields to OpenTelemetry fields. For more information on field aliases, see Create field aliases.
Prerequisites
To enable Kubernetes navigators to display logs collected from Fluentd, you must meet the following requirements.
You have the administrator or power user role in Splunk Observability Cloud.
You are not using the Splunk Distribution of the OpenTelemetry Collector to collect Kubernetes logs.
You have set up Kubernetes log collection from Fluentd. For instructions, see Use Fluentd to collect Kubernetes logs.
Your Fluentd Kubernetes logs contain the following metadata fields:
cluster_name, host, pod_name, namespace, pod_namespace, container_name, pod_workload
Procedure
Create the following field aliases to map each original Fluentd field to its OpenTelemetry equivalent:
| Original Fluentd field name | New OpenTelemetry field alias |
|---|---|
| cluster_name | k8s.cluster.name |
| host | k8s.node.name |
| pod_name | k8s.pod.name |
| namespace | k8s.namespace.name |
| pod_namespace | k8s.namespace.name |
| container_name | k8s.container.name |
| pod_workload | k8s.workload.name |
Collect events
Collect Kubernetes events
To see Kubernetes events as part of the Events Feed section in charts, set splunkPlatform.logsEnabled and clusterReceiver.eventsEnabled to true. When enabled, the clusterReceiver.eventsEnabled setting adds the k8seventsreceiver to the logs pipeline. The k8seventsreceiver collects events from the Kubernetes API server and reports the events by serializing them to JSON format in the body of a log.
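For example, a minimal values.yaml sketch for these two settings (other required chart values, such as the Splunk platform endpoint and token, are assumed to be configured separately):
clusterReceiver:
  # Adds the k8seventsreceiver to the logs pipeline
  eventsEnabled: true
splunkPlatform:
  # Send the collected events as logs to the Splunk platform
  logsEnabled: true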
To collect Kubernetes events as logs for Log Observer Connect using the Collector, you need to add clusterReceiver.k8sObjects to your configuration file, and set logsEnabled to true in either splunkObservability or splunkPlatform. Events are processed in the logs pipeline.
clusterReceiver.k8sObjects has the following fields:
name: Required. The name of the object, for example pods or namespaces.
mode: Defines how this type of object is collected: either pull or watch. pull is the default. pull mode reads all objects of this type using the list API at an interval, while watch mode sets up a long connection using the watch API to get updates only.
namespaces: If specified, the Collector only collects objects from the specified namespaces. By default, matching objects from all namespaces are included.
labelSelector: Selects objects by labels.
fieldSelector: Selects objects by fields.
interval: Only applies to pull mode. The interval at which objects are pulled. 60 seconds by default.
For example:
clusterReceiver:
  k8sObjects:
    - name: pods
      mode: pull
      label_selector: environment in (production),tier in (frontend)
      field_selector: status.phase=Running
      interval: 15m
    - name: events
      mode: watch
      group: events.k8s.io
      namespaces: [default]
For more information, see Kubernetes objects receiver and the GitHub documentation for the cluster receiver Helm chart deployment at Kubernetes objects collection using OpenTelemetry Kubernetes Object Receiver.
Collect journald events
The Splunk Distribution of OpenTelemetry Collector for Kubernetes can collect journald events from Kubernetes environments. To process journald events, add the following section to your values.yaml configuration:
logsCollection:
  journald:
    enabled: true
    directory: /run/log/journal
    # List of service units to collect and configuration for each. Update the list as needed.
    units:
      - name: kubelet
        priority: info
      - name: docker
        priority: info
      - name: containerd
        priority: info
    # Optional: Route journald logs to a separate Splunk Index by specifying the index
    # value. Make sure the index exists in Splunk and is configured to receive HEC
    # traffic (not applicable to Splunk Observability Cloud).
    index: ""