Filelog receiver
The Filelog receiver tails and parses logs from files. The supported pipeline type is logs. See Process your data with pipelines for more information.
Get started
Follow these steps to configure and activate the receiver:
Deploy the collector
Deploy the Splunk Distribution of the OpenTelemetry Collector onto your host or container platform:
Configure the receiver
Configure the receiver in the Splunk Distribution of the OpenTelemetry Collector that you deployed on your host or container platform:
- Add `filelog` to the `receivers` section of your configuration file:

```yaml
receivers:
  filelog:
```

- Sample configuration to tail a simple JSON file:
```yaml
receivers:
  filelog:
    include: [ /var/log/myservice/*.json ]
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%d %H:%M:%S'
```

- Sample configuration to tail a plaintext file:
```yaml
receivers:
  filelog:
    include: [ /simple.log ]
    operators:
      - type: regex_parser
        regex: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<sev>[A-Z]*) (?P<msg>.*)$'
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%d %H:%M:%S'
        severity:
          parse_from: attributes.sev
```

The receiver reads logs from the simple.log file, such as:
```
2023-06-19 05:20:50 ERROR This is a test error message
2023-06-20 12:50:00 DEBUG This is a test debug message
```
(Optional) Use operators to format logs.
(Optional) Send logs to other Splunk platforms.
(Optional) Configure other settings.
- Add `filelog` to the `logs` pipeline within the `service` section of the collector configuration file:

```yaml
service:
  pipelines:
    logs:
      receivers: [filelog]
```
Use operators to format logs
The Filelog receiver uses operators to process logs into the desired format. Each operator fulfills a single responsibility, such as reading lines from a file or parsing JSON from a field. Chain operators together in a pipeline to achieve your desired result.
For instance, you can read lines from a file using the file_input operator. From there, you can send the results of this operation to a regex_parser operator that creates fields based on a regex pattern. Next, you can send the results to a file_output operator to write each line to a file on disk.
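As a minimal sketch, two chained operators inside the Filelog receiver might look like this (the file path, regex, and field names are hypothetical):

```yaml
receivers:
  filelog:
    include: [ /var/log/myservice/app.log ]  # hypothetical path
    operators:
      # First operator: parse each line into attributes with a regex.
      - type: regex_parser
        regex: '^(?P<time>\S+) (?P<msg>.*)$'
      # Second operator: receives the parsed entry and moves a field.
      - type: move
        from: attributes.msg
        to: body
```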
All operators either create, modify, or consume entries:
An entry is the base representation of log data as it moves through a pipeline.
A field is used to reference values in an entry.
A common expression syntax is used in several operators. For example, expressions can be used to filter or route entries.
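For example, a minimal sketch of a `filter` operator that uses an expression to drop matching entries (the pattern is hypothetical):

```yaml
operators:
  # Entries whose body matches the expression are dropped.
  - type: filter
    expr: 'body matches "^DEBUG"'
```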
- Available operators
For a complete list of available operators, see What operators are available? in GitHub.
The following applies to operators:
Each operator has a `type`.

You can give a unique `id` to each operator. If you use the same type of operator more than once in a pipeline, you must specify an `id`. Otherwise, the `id` defaults to the value of `type`.

An operator outputs to the next operator in the pipeline. The last operator in the pipeline emits from the receiver.

(Optional) You can use the `output` parameter to specify the `id` of another operator and pass logs to it directly, as shown in the sketch after this list.
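A minimal sketch of explicit `id` values with `output` routing (the ids and regexes are hypothetical):

```yaml
operators:
  - id: parse_severity   # explicit id, required because regex_parser repeats
    type: regex_parser
    regex: '^(?P<sev>[A-Z]+) (?P<rest>.*)$'
    output: cleanup      # skip parse_rest and send entries straight to cleanup
  - id: parse_rest       # never receives entries from parse_severity
    type: regex_parser
    regex: '^(?P<rest>.*)$'
  - id: cleanup
    type: remove
    field: attributes.rest
```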
- Parser operators
Use parser operators to isolate values from a string. There are two classes of parsers: simple and complex.
- Parse header metadata
To activate header metadata parsing, set the `filelog.allowHeaderMetadataParsing` parameter and set `start_at` to `beginning`. When these parameters are set, the file input operator attempts to read a header from the start of the file. The following applies:

Each header line must match the `header.pattern` pattern.

Each header line is emitted into a pipeline defined by `header.metadata_operators`.

Any attributes on the resultant entry from the embedded pipeline are merged with the attributes from previous lines. If attribute collisions happen, they are resolved with an upsert strategy.
After all header lines are read, the final merged header attributes are present on every log line that is emitted for the file.
The receiver does not emit header lines.
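A minimal sketch of a header configuration, assuming `filelog.allowHeaderMetadataParsing` is active (the file path, pattern, and regex are hypothetical):

```yaml
receivers:
  filelog:
    include: [ /var/log/myservice/*.log ]  # hypothetical files
    start_at: beginning
    header:
      # Lines matching this pattern are treated as header lines.
      pattern: '^#'
      # Each header line runs through this embedded pipeline.
      metadata_operators:
        - type: regex_parser
          regex: '^#(?P<key>[^=]+)=(?P<value>.*)'
```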
- Parsers with embedded operations
You can configure many parsing operators to embed certain follow-up operations such as timestamp and severity parsing. For more information on complex parsers, see Parsers on GitHub.
- Multiline configuration
If set, the multiline configuration block instructs the `file_input` operator to split log entries on a pattern other than new lines. The multiline configuration block must contain `line_start_pattern` or `line_end_pattern`. These are regex patterns that match either the beginning of a new log entry or the end of a log entry.
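For example, a minimal sketch that starts a new entry at each line beginning with a date, so continuation lines such as stack traces stay attached to their entry (the path and pattern are hypothetical):

```yaml
receivers:
  filelog:
    include: [ /var/log/myservice/app.log ]  # hypothetical path
    multiline:
      # Each new log entry starts with a date such as 2023-06-19.
      line_start_pattern: '^\d{4}-\d{2}-\d{2}'
```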
- Supported encodings

The Filelog receiver supports the following encodings:
| Key | Description |
| --- | --- |
| `nop` | No encoding validation. Treats the file as a stream of raw bytes. |
| `utf-8` | UTF-8 encoding. |
| `utf-16le` | UTF-16 encoding with little-endian byte order. |
| `utf-16be` | UTF-16 encoding with big-endian byte order. |
| `ascii` | ASCII encoding. |
| `big5` | The Big5 Chinese character encoding. |
Other, less common encodings are supported on a best-effort basis. See the list of available encodings at https://www.iana.org/assignments/character-sets/character-sets.xhtml.
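To select an encoding, set the receiver's `encoding` option. A minimal sketch (the path is hypothetical):

```yaml
receivers:
  filelog:
    include: [ /var/log/legacy/*.log ]  # hypothetical files
    encoding: utf-16le
```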
Send logs to other Splunk platforms
See a few use cases for the Filelog receiver in the following sections. You can find more examples in the GitHub repository splunk-otel-collector/examples.
- Send logs to Splunk Cloud Platform
Use the following configuration to send logs to Splunk Cloud Platform:
```yaml
receivers:
  filelog:
    include: [ /output/file.log ]
    operators:
      - type: regex_parser
        regex: '(?P<before>.*)\d\d\d\d-\d\d\d-\d\d\d\d(?P<after>.*)'
        parse_to: body.parsed
        output: before_and_after
      - id: before_and_after
        type: add
        field: body
        value: EXPR(body.parsed.before + "XXX-XXX-XXXX" + body.parsed.after)

exporters:
  # Logs
  splunk_hec:
    token: "${SPLUNK_HEC_TOKEN}"
    endpoint: "${SPLUNK_HEC_URL}"
    source: "otel"
    sourcetype: "otel"

service:
  pipelines:
    logs:
      receivers: [filelog, otlp]
      processors:
        - memory_limiter
        - batch
        - resourcedetection
        #- resource/add_environment
      exporters: [splunk_hec]
```

- Send truncated logs to Splunk Enterprise
Use the following configuration to truncate logs and send them to Splunk Enterprise:
```yaml
receivers:
  filelog:
    include: [ /output/file.log ]
    operators:
      - type: regex_parser
        regex: '(?P<before>.*)\d\d\d\d-\d\d\d-\d\d\d\d(?P<after>.*)'
        parse_to: body.parsed
        output: before_and_after
      - id: before_and_after
        type: add
        field: body
        value: EXPR(body.parsed.before + "XXX-XXX-XXXX" + body.parsed.after)

exporters:
  splunk_hec/logs:
    # Splunk HTTP Event Collector token.
    token: "00000000-0000-0000-0000-0000000000000"
    # URL to a Splunk instance to send data to.
    endpoint: "https://splunk:8088/services/collector"
    # Optional Splunk source: https://docs.splunk.com/Splexicon:Source
    source: "output"
    # Splunk index, optional name of the Splunk index targeted.
    index: "logs"
    # Maximum HTTP connections to use simultaneously when sending data. Defaults to 100.
    max_connections: 20
    # Whether to disable gzip compression over HTTP. Defaults to false.
    disable_compression: false
    # HTTP timeout when sending data. Defaults to 10s.
    timeout: 10s
    tls:
      # Whether to skip checking the certificate of the HEC endpoint when sending data over HTTPS. Defaults to false.
      # For this demo, we use a self-signed certificate on the Splunk docker instance, so this flag is set to true.
      insecure_skip_verify: true

processors:
  batch:
  transform:
    log_statements:
      - context: log
        statements:
          - set(body, Substring(body, 0, 10))

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: :1888
  zpages:
    endpoint: :55679
  expvar:
    enabled: true

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch, transform]
      exporters: [splunk_hec/logs]
```

- Send sanitized logs to Splunk Enterprise
Use the following configuration to sanitize logs and send them to Splunk Enterprise:
```yaml
receivers:
  filelog:
    include: [ /output/file.log ]
    operators:
      - type: regex_parser
        regex: '(?P<before>.*)\d\d\d\d-\d\d\d-\d\d\d\d(?P<after>.*)'
        parse_to: body.parsed
        output: before_and_after
      - id: before_and_after
        type: add
        field: body
        value: EXPR(body.parsed.before + "XXX-XXX-XXXX" + body.parsed.after)

exporters:
  splunk_hec/logs:
    # Splunk HTTP Event Collector token.
    token: "00000000-0000-0000-0000-0000000000000"
    # URL to a Splunk instance to send data to.
    endpoint: "https://splunk:8088/services/collector"
    # Optional Splunk source: https://docs.splunk.com/Splexicon:Source
    source: "output"
    # Splunk index, optional name of the Splunk index targeted.
    index: "logs"
    # Maximum HTTP connections to use simultaneously when sending data. Defaults to 100.
    max_connections: 20
    # Whether to disable gzip compression over HTTP. Defaults to false.
    disable_compression: false
    # HTTP timeout when sending data. Defaults to 10s.
    timeout: 10s
    tls:
      # Whether to skip checking the certificate of the HEC endpoint when sending data over HTTPS. Defaults to false.
      # For this demo, we use a self-signed certificate on the Splunk docker instance, so this flag is set to true.
      insecure_skip_verify: true

processors:
  batch:

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: :1888
  zpages:
    endpoint: :55679
  expvar:
    enabled: true

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [splunk_hec/logs]
```

- Route logs to different indexes
Use the following configuration to route logs to different Splunk platform indexes:
```yaml
receivers:
  filelog:
    include: [ /output/file*.log ]
    start_at: beginning
    operators:
      - type: regex_parser
        regex: '(?P<logindex>log\d?)'

exporters:
  splunk_hec/logs:
    # Splunk HTTP Event Collector token.
    token: "00000000-0000-0000-0000-0000000000000"
    # URL to a Splunk instance to send data to.
    endpoint: "https://splunk:8088/services/collector"
    # Optional Splunk source: https://docs.splunk.com/Splexicon:Source
    source: "output"
    # Maximum HTTP connections to use simultaneously when sending data. Defaults to 100.
    max_connections: 20
    # Whether to disable gzip compression over HTTP. Defaults to false.
    disable_compression: false
    # HTTP timeout when sending data. Defaults to 10s.
    timeout: 10s
    tls:
      # Whether to skip checking the certificate of the HEC endpoint when sending data over HTTPS. Defaults to false.
      # For this demo, we use a self-signed certificate on the Splunk docker instance, so this flag is set to true.
      insecure_skip_verify: true

processors:
  batch:
  attributes/log:
    include:
      match_type: strict
      attributes:
        - { key: logindex, value: 'log' }
    actions:
      - key: com.splunk.index
        action: upsert
        value: "logs"
      - key: logindex
        action: delete
  attributes/log2:
    include:
      match_type: strict
      attributes:
        - { key: logindex, value: 'log2' }
    actions:
      - key: com.splunk.index
        action: upsert
        value: "logs2"
      - key: logindex
        action: delete
  attributes/log3:
    include:
      match_type: strict
      attributes:
        - { key: logindex, value: 'log3' }
    actions:
      - key: com.splunk.index
        action: upsert
        value: "logs3"
      - key: logindex
        action: delete

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: :1888
  zpages:
    endpoint: :55679
  expvar:
    enabled: true

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch, attributes/log, attributes/log2, attributes/log3]
      exporters: [splunk_hec/logs]
```

- Associate log sources with source types
This example shows how the collector collects data from files and sends it to Splunk Enterprise, associating each source with a different source type. The source type is a default field that identifies the structure of an event and determines how Splunk Enterprise formats the data during the indexing process.
```yaml
processors:
  batch:
  resource/one:
    attributes:
      # Set the com.splunk.sourcetype log attribute key to sourcetype1.
      # com.splunk.sourcetype is the default key the HEC exporter will use to extract the source type of the record.
      # See https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/splunkhecexporter
      # under the configuration key `hec_metadata_to_otel_attrs/sourcetype`
      - key: com.splunk.sourcetype
        value: "sourcetype1"
        action: upsert
  resource/two:
    attributes:
      - key: com.splunk.sourcetype
        value: "sourcetype2"
        action: upsert
  resource/three:
    attributes:
      - key: com.splunk.sourcetype
        value: "sourcetype3"
        action: upsert

receivers:
  filelog/onefile:
    include: [ /output/file.log ]
  filelog/twofile:
    include: [ /output/file2.log ]
  filelog/threefolder:
    include: [ /output3/*.log ]

exporters:
  splunk_hec/logs:
    # Splunk HTTP Event Collector token.
    token: "00000000-0000-0000-0000-0000000000000"
    # URL to a Splunk instance to send data to.
    endpoint: "https://splunk:8088/services/collector"
    # Optional Splunk source: https://docs.splunk.com/Splexicon:Source
    source: "output"
    # Splunk index, optional name of the Splunk index targeted.
    index: "logs"
    # Maximum HTTP connections to use simultaneously when sending data. Defaults to 100.
    max_connections: 20
    # Whether to disable gzip compression over HTTP. Defaults to false.
    disable_compression: false
    # HTTP timeout when sending data. Defaults to 10s.
    timeout: 10s
    tls:
      # Whether to skip checking the certificate of the HEC endpoint when sending data over HTTPS. Defaults to false.
      # For this demo, we use a self-signed certificate on the Splunk docker instance, so this flag is set to true.
      insecure_skip_verify: true

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: :1888
  zpages:
    endpoint: :55679
  expvar:
    enabled: true

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    logs/one:
      receivers: [ filelog/onefile ]
      processors: [ batch, resource/one ]
      exporters: [ splunk_hec/logs ]
    logs/two:
      receivers: [ filelog/twofile ]
      processors: [ batch, resource/two ]
      exporters: [ splunk_hec/logs ]
    logs/three:
      receivers: [ filelog/threefolder ]
      processors: [ batch, resource/three ]
      exporters: [ splunk_hec/logs ]
```
Restart the collector
The command to restart the Splunk Distribution of the OpenTelemetry Collector depends on the platform and the tool you used to deploy it. The following are general examples of the restart command:
Linux with installer script:

```
sudo systemctl restart splunk-otel-collector
```

Windows with installer script:

```
stop-service splunk-otel-collector
start-service splunk-otel-collector
```
Settings
Note: `start_at` defaults to `end`.

For the configuration options for the Filelog receiver, see the configuration metadata file at https://raw.githubusercontent.com/splunk/collector-config-tools/main/cfg-metadata/receiver/filelog.yaml.
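For example, a minimal sketch that overrides the `start_at` default and reads matching files from the beginning (the path is hypothetical):

```yaml
receivers:
  filelog:
    include: [ /var/log/myservice/*.log ]  # hypothetical files
    start_at: beginning
```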
Troubleshooting
If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.
Available to Splunk Observability Cloud customers
Submit a case in the Splunk Support Portal.
Contact Splunk Support.
Available to prospective customers and free trial users
Ask a question and get answers through community support at Splunk Answers.
Join the Splunk community #observability Slack channel to communicate with customers, partners, and Splunk employees worldwide.