Send data from Ingest Processor to Amazon S3

Send data from Ingest Processor to Amazon S3 buckets by using either an Amazon S3 dataset or an Amazon S3 destination in a pipeline.

How you configure the Ingest Processor to send data to an Amazon S3 bucket varies depending on which Splunk Cloud Platform version the Ingest Processor is associated with.
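
As a rough illustration of the version split described on this page, the following sketch picks a workflow from a Splunk Cloud Platform version string. The thresholds (10.4.2604 or higher, 10.3.2512 or lower) come from the sections below. The function names, the return labels, and the assumption that version strings are purely numeric and dot-separated are hypothetical and not part of any Splunk API.

```python
from typing import Tuple

# Thresholds taken from this page; everything else here is illustrative.
DATASET_MIN_VERSION = (10, 4, 2604)   # Data Management app is available
LEGACY_MAX_VERSION = (10, 3, 2512)    # legacy S3 destinations are required


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '10.4.2604' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def s3_workflow_for(version: str) -> str:
    """Return which S3 workflow this page describes for the given version.

    The page does not say what applies to versions that fall between the
    two thresholds, so those are reported as 'unknown' instead of guessed.
    """
    parsed = parse_version(version)
    if parsed >= DATASET_MIN_VERSION:
        return "dataset"
    if parsed <= LEGACY_MAX_VERSION:
        return "legacy-destination"
    return "unknown"
```

For example, `s3_workflow_for("10.4.2604")` selects the connection and dataset workflow, and `s3_workflow_for("10.3.2512")` selects the legacy destination workflow.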

For information about migrating from a legacy destination to a dataset, see Migrate legacy Amazon S3 destinations to datasets for Ingest Processor pipelines.

Using an Amazon S3 connection and dataset

Splunk Cloud Platform deployments that are on version 10.4.2604 or higher include the Data Management app, which allows you to configure connections and datasets that can be used as both pipeline destinations and data sources for federated searches.

To send data from Ingest Processor to an S3 bucket, do the following:

  1. In the Data Management app on Splunk Cloud Platform, create a connection that provides access to Amazon Web Services (AWS).

    1. Start creating the connection by providing information about the AWS account that the bucket is associated with. For more information, see Create an Amazon S3 connection for Ingest Processor pipelines.

    2. Then, configure authentication and finish creating the connection. You can choose to authenticate the connection using an access key pair or an Identity and Access Management (IAM) role. For more information, see the following pages:

  2. In the Data Management app on Splunk Cloud Platform, create a dataset that represents the location in Amazon S3 where you want to send data. For more information, see Create an Amazon S3 dataset for Ingest Processor pipelines.

  3. In the Ingest Processor service, create a pipeline that uses the Amazon S3 dataset as a destination. For more information, see Create pipelines for Ingest Processor.

    Note: To keep the events that you send to the dataset compatible with federated searches, follow the recommended best practices when configuring your pipeline. For more information, see Best practices for sending data from Ingest Processor to a dataset.

  4. In the Ingest Processor service, apply the pipeline to the Ingest Processor. For more information, see Apply a pipeline.

When you apply that pipeline to the Ingest Processor, it starts sending the data that it receives to your S3 bucket. In S3, this data is identified by a file path and name that is constructed using auto-generated values from the system as well as some of the values that you specify in the connection and dataset configuration.

Using a legacy Amazon S3 destination

When working with a Splunk Cloud Platform deployment that is on version 10.3.2512 or lower, you must use the Ingest Processor service to create Amazon S3 destinations. Unlike the datasets that are available in later versions of Splunk Cloud Platform, these legacy destinations cannot be used in federated searches.

To send data from Ingest Processor to an S3 bucket, do the following in the Ingest Processor service:

  1. Create an Amazon S3 destination that provides access to the location in Amazon S3 where you want to send data. For more information, see Create a legacy Amazon S3 destination for Ingest Processor.

  2. Create a pipeline that uses the Amazon S3 destination. For more information, see Create pipelines for Ingest Processor.

  3. Apply the pipeline to the Ingest Processor. For more information, see Apply a pipeline.

When you apply that pipeline to the Ingest Processor, it starts sending the data that it receives to your S3 bucket. In S3, this data is identified by a file path and name that is constructed using auto-generated values from the system as well as some of the values that you specify in the destination configuration.
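
Whichever method you use, one way to confirm that data is arriving is to list the objects under the location you configured. The following is a minimal sketch, not part of the Ingest Processor product: it assumes a boto3-style S3 client is passed in, and the bucket name and prefix are placeholders that you supply. It does not try to reproduce the auto-generated portion of the file path, because this page does not specify that layout.

```python
from typing import Any, List


def list_object_keys(s3_client: Any, bucket: str, prefix: str = "") -> List[str]:
    """Return the keys of all objects in `bucket` whose key starts with `prefix`.

    `s3_client` is expected to behave like a boto3 S3 client. The
    `list_objects_v2` paginator is used so that buckets holding more than
    one page of results are listed completely.
    """
    keys: List[str] = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty result set carry no "Contents" entry.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys
```

With boto3 installed and AWS credentials that can list the bucket, you could pass `boto3.client("s3")` as `s3_client`, along with your own bucket name and the folder prefix you configured, and check that new keys appear after the pipeline is applied.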