Filelog receiver
The Filelog receiver tails and parses logs from files. The supported pipeline type is logs. See Process your data with pipelines for more information.
Get started
Follow these steps to configure and activate the component:
1. Deploy the Splunk Distribution of the OpenTelemetry Collector to your host or container platform.
2. Configure the Filelog receiver as described in the next section.
3. Restart the Collector.
Sample configuration
1. To activate the Filelog receiver, add filelog to the receivers section of your configuration file:

   receivers:
     filelog:

2. To complete the configuration, include the receiver in the logs pipeline of the service section of your configuration file:

   service:
     pipelines:
       logs:
         receivers: [filelog]
Configuration example
This example shows how to tail a simple JSON file:
receivers:
filelog:
include: [ /var/log/myservice/*.json ]
operators:
- type: json_parser
timestamp:
parse_from: attributes.time
layout: '%Y-%m-%d %H:%M:%S'
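For reference, a log line that this configuration could parse might look like the following. The time field matches the parse_from and layout settings above; the other fields are hypothetical placeholders:

   {"time": "2023-06-19 05:20:50", "severity": "INFO", "message": "service started"}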
This example shows how to tail a plaintext file:
receivers:
filelog:
include: [ /simple.log ]
operators:
- type: regex_parser
regex: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<sev>[A-Z]*) (?P<msg>.*)$'
timestamp:
parse_from: attributes.time
layout: '%Y-%m-%d %H:%M:%S'
severity:
parse_from: attributes.sev
The receiver reads logs from the simple.log file, such as:
2023-06-19 05:20:50 ERROR This is a test error message
2023-06-20 12:50:00 DEBUG This is a test debug message
Use operators to format logs
The Filelog receiver uses operators to process logs into a desired format. Each operator fulfills a single responsibility, such as reading lines from a file, or parsing JSON from a field. You need to chain operators together in a pipeline to achieve your desired result.
For instance, you can read lines from a file using the file_input
operator. From there, you can send the results of this operation to a regex_parser
operator that creates fields based on a regex pattern. Next, you can send the results to a file_output
operator to write each line to a file on disk.
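As an illustrative sketch (the file path and patterns are hypothetical), the following pipeline chains a regex_parser operator with a move operator that promotes the parsed message to the log body:

   receivers:
     filelog:
       include: [ /var/log/app.log ]
       operators:
         - type: regex_parser
           regex: '^(?P<sev>[A-Z]+) (?P<msg>.*)$'
         - type: move
           from: attributes.msg
           to: body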
All operators either create, modify, or consume entries:
- An entry is the base representation of log data as it moves through a pipeline.
- A field is used to reference values in an entry.
- A common expression syntax is used in several operators. For example, expressions can be used to filter or route entries.
Available operators
For a complete list of available operators, see What operators are available? in GitHub.
The following applies to operators:
- Each operator has a type.
- You can give a unique ID to each operator.
  - If you use the same type of operator more than once in a pipeline, you must specify an ID.
  - Otherwise, the ID defaults to the value of type.
- An operator outputs to the next operator in the pipeline.
  - The last operator in the pipeline emits from the receiver.
  - (Optional) You can use the output parameter to specify the ID of another operator to pass logs there directly.
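The following sketch (operator IDs, patterns, and field names are hypothetical) uses two operators of the same type, which therefore need explicit IDs, and shows the output parameter making the routing explicit:

   operators:
     - id: parse_time
       type: regex_parser
       regex: '^(?P<time>\S+) (?P<rest>.*)$'
       output: parse_rest
     - id: parse_rest
       type: regex_parser
       parse_from: attributes.rest
       regex: '^(?P<sev>[A-Z]+) (?P<msg>.*)$'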
Parser operators
Use parser operators to isolate values from a string. There are two classes of parsers: simple and complex.
Parse header metadata
To turn on header metadata parsing, set the filelog.allowHeaderMetadataParsing feature gate, and set start_at to beginning. If set, the file input operator attempts to read a header from the start of the file.
The following applies:
- Each header line must match the header.pattern pattern.
- Each line is emitted into a pipeline defined by header.metadata_operators.
- Any attributes on the resultant entry from the embedded pipeline are merged with the attributes from previous lines. If attribute collisions happen, they are resolved with an upsert strategy.
- After all header lines are read, the final merged header attributes are present on every log line that is emitted for the file.
The receiver does not emit header lines.
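A minimal sketch, assuming a file whose header lines start with # and carry metadata (the file path, pattern, and attribute name are hypothetical):

   receivers:
     filelog:
       include: [ /var/log/with-header.log ]
       start_at: beginning
       header:
         pattern: '^#'
         metadata_operators:
           - type: regex_parser
             regex: '^# environment: (?P<environment>.*)$'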
Parsers with embedded operations
You can configure many parsing operators to embed certain follow-up operations such as timestamp and severity parsing.
For more information on complex parsers, see Parsers on GitHub.
Multiline configuration
If set, the multiline configuration block instructs the file_input operator to split log entries on a pattern other than new lines.
The multiline configuration block must contain line_start_pattern or line_end_pattern. These are regex patterns that match either the beginning of a new log entry or the end of a log entry.
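For example, the following sketch splits entries on lines that begin with a timestamp, so that multiline content such as stack traces stays attached to the preceding entry (the path and pattern are illustrative):

   receivers:
     filelog:
       include: [ /var/log/app.log ]
       multiline:
         line_start_pattern: '^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'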
Supported encodings
The Filelog receiver supports the following encodings:
Key | Description
---|---
nop | No encoding validation. Treats the file as a stream of raw bytes.
utf-8 | UTF-8 encoding.
utf-16le | UTF-16 encoding with little-endian byte order.
utf-16be | UTF-16 encoding with big-endian byte order.
ascii | ASCII encoding.
big5 | The Big5 Chinese character encoding.
Other less common encodings are supported on a best-effort basis. See the list of available encodings in https://www.iana.org/assignments/character-sets/character-sets.xhtml.
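To read a file in one of these encodings, set the encoding option on the receiver. For example (the file path is illustrative):

   receivers:
     filelog:
       include: [ /var/log/legacy.log ]
       encoding: utf-16le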
Advanced use cases
See a few use cases for the Filelog receiver in the following sections.
You can find more examples in the splunk-otel-collector GitHub repository under examples.
Send logs to Splunk Cloud Platform
Use the following configuration to send logs to Splunk Cloud Platform:
receivers:
filelog:
include: [ /output/file.log ]
operators:
- type: regex_parser
regex: '(?P<before>.*)\d\d\d-\d\d\d-\d\d\d\d(?P<after>.*)'
parse_to: body.parsed
output: before_and_after
- id: before_and_after
type: add
field: body
value: EXPR(body.parsed.before + "XXX-XXX-XXXX" + body.parsed.after)
exporters:
# Logs
splunk_hec:
token: "${SPLUNK_HEC_TOKEN}"
endpoint: "${SPLUNK_HEC_URL}"
source: "otel"
sourcetype: "otel"
service:
pipelines:
logs:
receivers: [filelog, otlp]
processors:
- memory_limiter
- batch
- resourcedetection
#- resource/add_environment
exporters: [splunk_hec]
Send truncated logs to Splunk Enterprise
Use the following configuration to truncate logs and send them to Splunk Enterprise:
https://raw.githubusercontent.com/signalfx/splunk-otel-collector/main/examples/otel-logs-truncate-splunk/otel-collector-config.yml
Send sanitized logs to Splunk Enterprise
Use the following configuration to sanitize logs and send them to Splunk Enterprise:
https://raw.githubusercontent.com/signalfx/splunk-otel-collector/main/examples/otel-logs-sanitization-splunk/otel-collector-config.yml
Route logs to different indexes
Use the following configuration to route logs to different Splunk platform indexes:
https://raw.githubusercontent.com/signalfx/splunk-otel-collector/main/examples/otel-logs-processor-splunk/otel-collector-config.yml
Associate log sources with source types
This example showcases how the Collector collects data from files and sends it to Splunk Enterprise, associating each source with a different source type. The source type is a default field that identifies the structure of an event, and determines how Splunk Enterprise formats the data during the indexing process.
https://raw.githubusercontent.com/signalfx/splunk-otel-collector/main/examples/otel-logs-with-sourcetypes-splunk/otel-collector-config.yml
Settings
The start_at setting defaults to end.

The following table shows the configuration options for the Filelog receiver:

https://raw.githubusercontent.com/splunk/collector-config-tools/main/cfg-metadata/receiver/filelog.yaml
Troubleshooting
If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.
Available to Splunk Observability Cloud customers
- Submit a case in the Splunk Support Portal.
- Contact Splunk Support.
Available to prospective customers and free trial users
- Ask a question and get answers through community support at Splunk Answers.
- Join the Splunk #observability user group Slack channel to communicate with customers, partners, and Splunk employees worldwide. To join, see Chat groups.