Send data from Edge Processors to Amazon S3

Send data from Edge Processors to Amazon S3 buckets by using either an Amazon S3 dataset or an Amazon S3 destination in a pipeline.

How you configure an Edge Processor to send data to an Amazon S3 bucket varies depending on which Splunk Cloud Platform version the Edge Processor is associated with.

For information about migrating from a legacy destination to a dataset, see Migrate legacy Amazon S3 destinations to datasets for Edge Processor pipelines.
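As a rough illustration of the version-dependent choice, the following Python sketch maps a Splunk Cloud Platform version string to the method described on this page. The thresholds (10.4.2604 or higher for datasets, 10.3.2512 or lower for legacy destinations) come from the sections that follow. The function names, the return labels, and the assumption that only the first three dot-separated components of the version matter are hypothetical and are not part of any Splunk API.

```python
from typing import Tuple

# Thresholds stated on this page. The constant and function names are hypothetical.
DATASET_MIN_VERSION = (10, 4, 2604)   # "10.4.2604 or higher": connection and dataset
LEGACY_MAX_VERSION = (10, 3, 2512)    # "10.3.2512 or lower": legacy destination


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse the first three dot-separated numeric components of a version string.

    Assumption: any further components (for example a patch suffix) are ignored.
    """
    parts = version.strip().split(".")
    if len(parts) < 3:
        raise ValueError(f"expected at least three components, got {version!r}")
    major, minor, release = (int(p) for p in parts[:3])
    return (major, minor, release)


def s3_send_method(version: str) -> str:
    """Return which configuration method this page describes for the given version."""
    parsed = parse_version(version)
    if parsed >= DATASET_MIN_VERSION:
        return "dataset"
    if parsed <= LEGACY_MAX_VERSION:
        return "legacy destination"
    # This page does not say what applies to versions between the two thresholds.
    return "unspecified"


if __name__ == "__main__":
    for v in ("10.4.2604", "10.3.2512"):
        print(v, "->", s3_send_method(v))
```

Versions that fall between the two thresholds are reported as "unspecified" because this page does not describe them.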

Using an Amazon S3 connection and dataset

Splunk Cloud Platform deployments that are on version 10.4.2604 or higher include the Data Management app, which allows you to configure connections and datasets that can be used as both pipeline destinations and data sources for federated searches.

To send data from an Edge Processor to an S3 bucket, do the following:

  1. In the Data Management app on Splunk Cloud Platform, create a connection that provides access to Amazon Web Services (AWS).

    1. Start creating the connection by providing information about the AWS account that the bucket is associated with. For more information, see Create an Amazon S3 connection for Edge Processor pipelines.

    2. Then, configure authentication and finish creating the connection. If any of the instances in your Edge Processor are not installed on Amazon EC2 instances that are associated with the same AWS account as the S3 bucket, then you must authenticate the connection using an access key pair. Otherwise, you can choose to use either an access key pair or an Identity and Access Management (IAM) role. For more information, see the following pages:

  2. In the Data Management app on Splunk Cloud Platform, create a dataset that represents the location in Amazon S3 where you want to send data. For more information, see Create an Amazon S3 dataset for Edge Processor pipelines.

  3. In the Edge Processor service, create a pipeline that uses the Amazon S3 dataset as a destination. For more information, see Create pipelines for Edge Processors.

    Note: To ensure that the events you send to the dataset are compatible with federated searches, there are several best practices that you need to follow when configuring your pipeline. For more information, see Best practices for sending data from an Edge Processor to a dataset.

  4. In the Edge Processor service, apply the pipeline to an Edge Processor. For more information, see Apply pipelines to Edge Processors.

When you apply that pipeline to your Edge Processor, it starts sending the data that it receives to your S3 bucket. In S3, this data is identified by a file path and name that is constructed using auto-generated values from the system as well as some of the values that you specify in the connection and dataset configuration.
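The authentication rule in step 1 can be sketched as a small decision function. This is only an illustration of the rule as stated above: the `Instance` class, its fields, the method labels, and the placeholder account IDs are hypothetical and do not correspond to any Splunk or AWS API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Instance:
    """Hypothetical description of one Edge Processor instance."""
    on_ec2: bool                          # installed on an Amazon EC2 instance?
    aws_account_id: Optional[str] = None  # AWS account of that EC2 instance, if any


def allowed_auth_methods(instances: Iterable[Instance], bucket_account_id: str) -> Set[str]:
    """Return the authentication methods permitted for the connection.

    An IAM role is an option only when every instance runs on EC2 in the
    same AWS account as the S3 bucket; otherwise an access key pair is required.
    """
    all_on_matching_ec2 = all(
        inst.on_ec2 and inst.aws_account_id == bucket_account_id
        for inst in instances
    )
    if all_on_matching_ec2:
        return {"access_key_pair", "iam_role"}
    return {"access_key_pair"}


if __name__ == "__main__":
    bucket_account = "111111111111"  # placeholder account ID
    fleet = [
        Instance(on_ec2=True, aws_account_id="111111111111"),
        Instance(on_ec2=False),  # for example, an on-premises host
    ]
    print(sorted(allowed_auth_methods(fleet, bucket_account)))
```

In this example, one instance is not on EC2, so the function returns only the access key pair option.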

Using a legacy Amazon S3 destination

When working with a Splunk Cloud Platform deployment that is on version 10.3.2512 or lower, you must use the Edge Processor service to create Amazon S3 destinations. Unlike the datasets that are available in later versions of Splunk Cloud Platform, these legacy destinations cannot be used in federated searches.

To send data from an Edge Processor to an S3 bucket, do the following in the Edge Processor service:

  1. Create an Amazon S3 destination that provides access to the location in Amazon S3 where you want to send data. For more information, see Create a legacy Amazon S3 destination for Edge Processor pipelines.

  2. Create a pipeline that uses the Amazon S3 destination. For more information, see Create pipelines for Edge Processors.

  3. Apply the pipeline to an Edge Processor. For more information, see Apply pipelines to Edge Processors.

When you apply that pipeline to your Edge Processor, it starts sending the data that it receives to your S3 bucket. In S3, this data is identified by a file path and name that is constructed using auto-generated values from the system as well as some of the values that you specify in the destination configuration.