Create an S3 destination
To write events to a remote storage volume, select a preconfigured S3 destination when you configure the "Route to Destination" rule. You can route to multiple S3 destinations. The "Immediately send to" field has a typeahead capability that displays all preconfigured S3 destinations.
You configure and validate S3 destinations through the Destinations tab on the Ingest Actions page. Select S3 under the New Destination button and fill out the fields, following the examples provided there. You can create multiple S3 destinations.
You can create a maximum of eight S3 destinations. When rulesets route to a destination that is invalid or does not exist, the Splunk Platform instance blocks all queues and pipelines and does not drop data.
Partition events
When creating an S3 destination, you can define a partitioning schema for events based on timestamp and optionally source type. The events then flow into a directory structure based on the schema.
Go to the "Partitioning" section of the New Destination configuration. You can choose a partitioning schema through the drop-down menu. The choices are:
- Day (YYYY/MM/DD)
- Month (YYYY/MM)
- Year (YYYY)
- Legacy
The legacy setting is for use with pre-9.1 destinations only. With legacy partitioning, the latest event timestamp in each batch (2MB by default) determines the folder, using the format "YYYY/MM/DD". However, unlike the true partitioning options such as "day", the folder can also contain events with other timestamps, because a single batch can span multiple timestamps.
Destinations created before 9.1 default to "legacy". Destinations created in 9.1 or higher default to "day".
You can also set source type as a secondary key. However, if you are using federated search for Amazon S3 with the AWS Glue Data Catalog integration, you need to make sure that your Glue Data Catalog tables do not include a duplicate entry for the sourcetype column.
For details on the partitioning methods and examples of the resulting paths, see the partitionBy setting in outputs.conf.
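As an illustration, a destination stanza edited directly in outputs.conf might declare day-based partitioning with source type as a secondary key. The stanza name and bucket path below are placeholders, and the exact partitionBy syntax should be confirmed against the outputs.conf specification:

```
# Placeholder stanza name and bucket; partitionBy syntax shown as a sketch.
[rfs:my_s3_dest]
path = s3://my-bucket/ingest-actions
# Partition by day, with source type as the secondary key within each day.
partitionBy = day,sourcetype
```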
Use KMS encryption (Splunk Cloud Platform only)
You can employ SSE-KMS encryption when using ingest actions to write data to customer-owned S3 buckets. This capability is enabled through the configuration of AWS cross-account IAM roles.
- You are assuming ownership and full responsibility for the integrity and ongoing availability of your AWS KMS key.
- The KMS key is required for encrypting Splunk data in real-time.
- Loss of access to the KMS key can result in service interruption and/or permanent loss of data access by all parties (AWS, Splunk, and you).
- Unauthorized access to the KMS key can result in accidental or explicit key operations (such as key deactivation or deletion) that could lead to service disruption or permanent loss of data access by all parties (AWS, Splunk and you).
- You must maintain Splunk privileged access to the KMS key via Splunk-mandated key policy definitions.
- Keys must be in the same region as their Splunk Cloud stack. Multi-region keys are not supported.
- Key aliases are not supported.
To enable KMS encryption, create the SplunkIngestActions IAM role in your AWS account:
- Go to the IAM roles section in the AWS configuration UI.
- Create the exact role "SplunkIngestActions".
- Edit the permissions section for that role by adding an inline policy and overwriting the existing JSON with JSON created through the Generate Permission Policy button in the Splunk ingest actions UI. You can edit that JSON text as needed for your organization.
- Edit the trust relationship section by overwriting the existing JSON with JSON created through the Generate Trust Policy button in the Splunk ingest actions UI. You can edit this JSON text as needed for your organization.
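For orientation, the generated permission policy typically grants the role S3 write access plus the KMS actions needed for SSE-KMS. The JSON below is a generic sketch with placeholder ARNs, not Splunk's actual output; always start from the JSON that the Generate Permission Policy button produces:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowIngestActionsS3Writes",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::my-bucket/*"
    },
    {
      "Sid": "AllowSseKmsUse",
      "Effect": "Allow",
      "Action": ["kms:GenerateDataKey", "kms:Decrypt"],
      "Resource": "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
    }
  ]
}
```

The trust policy follows the standard cross-account sts:AssumeRole shape, restricted to the principal and conditions emitted by the Generate Trust Policy button.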
Perform advanced configurations with outputs.conf
While Destinations on the Ingest Actions page can handle most common S3 configuration needs, for some advanced configurations, you might need to directly edit outputs.conf, using the rfs stanza.
For a complete list of rfs settings, see Remote File System (RFS) Output. The remote filesystem settings and options for S3 are similar to the SmartStore S3 configuration.
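For example, a direct outputs.conf edit might combine SmartStore-style S3 options with ingest actions batching. The stanza name, endpoint, and values below are placeholders; verify each setting name against the Remote File System (RFS) Output reference:

```
# Placeholder stanza; values are illustrative, not recommendations.
[rfs:my_s3_dest]
path = s3://my-bucket/ingest-actions
# SmartStore-style S3 option (assumed to apply here per the note above):
remote.s3.endpoint = https://s3.us-east-1.amazonaws.com
# Batch flush threshold; keep well under the AWS 5 GB single-object limit.
batchSizeThresholdKB = 2048
```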
Troubleshoot
To troubleshoot the S3 remote file system, search the _internal index for events from the RfsOutputProcessor and S3Client components. For example:
index="_internal" sourcetype="splunkd" (ERROR OR WARN) RfsOutputProcessor OR S3Client
Key provisos
Note the following:
- You can configure and use multiple S3 remote storage locations, up to a maximum of eight destinations.
- In the case of a Splunk Cloud Platform deployment, buckets must be in the same region as the deployment.
- In the case of an indexer cluster, each remote storage configuration must be identical across the indexer cluster peers.
- AWS has an upload limit of 5 GB for single objects. An attempt to upload an object greater than 5 GB results in data loss. You encounter this limit only if you set batchSizeThresholdKB in outputs.conf to a value greater than 5 GB.
- The remote file system creates buckets similar to index buckets on the remote storage location. The bucket names include the peer GUID and date.
- Remember to set the correct lifecycle policies for your S3 buckets and their paths. By default, this data persists indefinitely unless you remove it.
- For information on S3 authentication requirements, see SmartStore on S3 security strategies. Ingest actions requirements are similar.
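The lifecycle-policy proviso above can be satisfied with a standard S3 lifecycle rule. The following JSON is a generic example; the bucket prefix and 90-day retention are placeholders, not Splunk recommendations:

```json
{
  "Rules": [
    {
      "ID": "expire-ingest-actions-data",
      "Filter": { "Prefix": "ingest-actions/" },
      "Status": "Enabled",
      "Expiration": { "Days": 90 }
    }
  ]
}
```

You can apply it with aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --lifecycle-configuration file://lifecycle.json.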
Configure S3 destinations using agent management
If you are using agent management to manage heavy forwarders, you can configure S3 destinations on the agent management instance. The configuration then propagates to connected agents. This capability requires an agent management instance running version 10.2 or higher. There are no version restrictions for agents, as long as the agent is compatible with the agent management instance.
Note the following:
- Configure all destinations from agent management to avoid inconsistency. If you configure some destinations on the agent management instance and others directly on individual agents, the configuration might not work as expected.
- This feature is enabled by default. To disable it, set the enableS3ConfigOnDs flag to false in limits.conf on the agent management instance. No configuration is required on agents to use this feature.