Create an SQS queue and set up event notification for the Amazon S3 bucket that contains your dataset to keep your Splunk-managed data catalog in sync with dataset changes.
If you use a Splunk-native data catalog to refer to your dataset, you can optionally have the catalog update automatically whenever data is added to or removed from the dataset. To enable automatic updates, set up an Amazon Simple Queue Service (SQS) queue and event notification for the Amazon S3 bucket that contains the dataset. As part of this setup, you copy the SQS queue ARN and add it to your dataset definition in the Data Management app.
- Ensure you have access to your Amazon Web Services (AWS) account, with permissions to manage S3 buckets and to create Amazon Simple Queue Service (SQS) queues.
- Identify the Amazon S3 bucket that contains the dataset you want to keep synchronized with your Splunk-native data catalog.
Step 1: Create an SQS queue for the S3 bucket that contains your dataset
- In a new browser tab, log in to your AWS account and navigate to the Simple Queue Service (SQS) console.
- On the SQS dashboard, select Create queue.
- In the Details section, select the Standard queue type.
The standard queue type provides high throughput for catalog updates.
- In the Name field, enter a descriptive name for your queue.
- Scroll to the bottom of the page and select Create queue.
- In the Details section, find the ARN field. This is the SQS queue ARN. Select the copy icon to copy it.
- Go back to the browser tab that has your federated search dataset definition and paste the ARN value into the SQS queue ARN field.
- In the browser tab that has your SQS queue information, scroll down to the Access policy section and select Edit.
- In the Access policy editor, replace the default access policy with the following template:
{
  "Version": "2012-10-17",
  "Id": "example-ID",
  "Statement": [
    {
      "Sid": "example-statement-ID",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": "sqs:SendMessage",
      "Resource": "YOUR_SQS_QUEUE_ARN",
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:::YOUR_S3_BUCKET_NAME"
        }
      }
    }
  ]
}
- Replace YOUR_SQS_QUEUE_ARN with the ARN for the SQS queue.
- Replace YOUR_S3_BUCKET_NAME with the name of the Amazon S3 bucket that contains the Amazon S3 location for your dataset.
- Select Save.
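The template substitution in Step 1 can also be sketched in code, which helps avoid copy-and-paste mistakes in the two placeholder fields. The following is a minimal Python sketch, not an official Splunk or AWS tool: the function name build_access_policy, the queue ARN, and the bucket name are hypothetical values chosen for illustration.

```python
import json


def build_access_policy(queue_arn: str, bucket_name: str) -> dict:
    """Fill in the access policy template for one SQS queue and one S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Id": "example-ID",
        "Statement": [
            {
                "Sid": "example-statement-ID",
                "Effect": "Allow",
                # Only the Amazon S3 service is allowed to send messages.
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sqs:SendMessage",
                # Replaces YOUR_SQS_QUEUE_ARN in the template.
                "Resource": queue_arn,
                "Condition": {
                    # Replaces YOUR_S3_BUCKET_NAME in the template.
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{bucket_name}"}
                },
            }
        ],
    }


# Hypothetical queue ARN and bucket name, for illustration only.
policy = build_access_policy(
    "arn:aws:sqs:us-east-1:123456789012:splunk-catalog-sync", "bucket1"
)
print(json.dumps(policy, indent=2))
```

You can paste the printed JSON into the Access policy editor. If you prefer to script the change, one option (an assumption about your tooling, not part of this procedure) is to pass the JSON string as the Policy attribute to the boto3 SQS client method set_queue_attributes.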
Step 2: Set up event notification for the S3 bucket that contains your dataset
- In the browser tab you opened for your AWS account, navigate to the Amazon S3 console.
- From the General purpose buckets list, select the name of the S3 bucket that contains the Amazon S3 location for your dataset.
- Select the Properties tab.
- Scroll down to the Event notifications section and select Create event notification.
- In the General configuration section, supply an Event name, such as SplunkS3DiscoveryEvent.
- Ensure that the Prefix matches the file path for your dataset's Amazon S3 location.
For example, if the Amazon S3 location for your dataset is s3://bucket1/path1/my_csv_data/ and you are setting up event notification for your bucket1 bucket, set the Prefix field to path1/my_csv_data/.
- In the Event types section, select the following checkboxes:
- Select All object create events to notify your Splunk-managed data catalog when new data is uploaded to your dataset.
- Select All object removal events to notify your Splunk-managed data catalog when data is deleted or a delete marker is created.
- Select All restore object events to notify your Splunk-managed data catalog when objects are restored from Archive storage.
- Select Lifecycle events and All lifecycle expiration events to capture the automatic transition and expiration of dataset objects under lifecycle rules.
- In the Destination section, select SQS queue as the destination type.
- Under Specify SQS queue, choose Enter SQS queue ARN.
- Paste the SQS queue ARN you copied in Step 1 into the SQS queue text field.
- Select Save changes.
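The console settings in Step 2 correspond roughly to an S3 bucket notification configuration. The following Python sketch derives the Prefix from a dataset location and assembles a configuration dictionary in the shape that the boto3 S3 client method put_bucket_notification_configuration accepts. The helper names and the queue ARN are hypothetical, and the mapping of the Lifecycle events checkbox to s3:LifecycleTransition is an assumption.

```python
from typing import Tuple
from urllib.parse import urlparse

# S3 event type strings assumed to match the checkboxes in Step 2.
EVENT_TYPES = [
    "s3:ObjectCreated:*",        # All object create events
    "s3:ObjectRemoved:*",        # All object removal events
    "s3:ObjectRestore:*",        # All restore object events
    "s3:LifecycleTransition",    # Lifecycle events (assumed mapping)
    "s3:LifecycleExpiration:*",  # All lifecycle expiration events
]


def split_s3_location(location: str) -> Tuple[str, str]:
    """Split an s3:// location into (bucket name, prefix)."""
    parsed = urlparse(location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an s3:// location: {location!r}")
    # The prefix is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


def build_notification_config(queue_arn: str, location: str,
                              event_name: str = "SplunkS3DiscoveryEvent") -> Tuple[str, dict]:
    """Return (bucket name, notification configuration) for a dataset location."""
    bucket, prefix = split_s3_location(location)
    config = {
        "QueueConfigurations": [
            {
                "Id": event_name,
                "QueueArn": queue_arn,
                "Events": list(EVENT_TYPES),
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }
    return bucket, config


# Hypothetical queue ARN; the location is the example from the Prefix step.
bucket, config = build_notification_config(
    "arn:aws:sqs:us-east-1:123456789012:splunk-catalog-sync",
    "s3://bucket1/path1/my_csv_data/",
)
print(bucket, config["QueueConfigurations"][0]["Filter"])
```

With the example location from the Prefix step, the sketch yields the bucket bucket1 and the prefix path1/my_csv_data/. Keep in mind that applying a notification configuration through the API replaces the bucket's existing configuration, so the console procedure above remains the safer route for buckets that already have notifications.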
After you save your event notification, AWS attempts to send a test s3:TestEvent message to your SQS queue. If the configuration is successful, a green success banner appears at the top of the S3 console. If you receive a permissions error, return to the SQS access policy and ensure that the Resource field matches your SQS queue ARN and that the aws:SourceArn field matches your S3 bucket ARN.
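If you hit a permissions error, the comparison of the two policy fields can be done mechanically. The following Python sketch checks a policy document against the queue ARN and bucket name you expect. The function name find_policy_mismatches and the sample values are hypothetical, and the check assumes the policy follows the template from Step 1.

```python
import json
from typing import List


def find_policy_mismatches(policy_json: str, queue_arn: str, bucket_name: str) -> List[str]:
    """Return a list of problems found in the Resource and aws:SourceArn fields."""
    policy = json.loads(policy_json)
    expected_source = f"arn:aws:s3:::{bucket_name}"
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may contain a single statement object instead of a list.
        statements = [statements]
    problems = []
    for statement in statements:
        sid = statement.get("Sid", "<no Sid>")
        if statement.get("Resource") != queue_arn:
            problems.append(f"{sid}: Resource does not match the SQS queue ARN")
        source = (
            statement.get("Condition", {}).get("ArnLike", {}).get("aws:SourceArn")
        )
        if source != expected_source:
            problems.append(f"{sid}: aws:SourceArn does not match {expected_source}")
    return problems


# A policy that still contains the template placeholders fails both checks.
unedited = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "example-statement-ID",
        "Effect": "Allow",
        "Principal": {"Service": "s3.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": "YOUR_SQS_QUEUE_ARN",
        "Condition": {"ArnLike": {"aws:SourceArn": "arn:aws:s3:::YOUR_S3_BUCKET_NAME"}},
    }],
})
for problem in find_policy_mismatches(
    unedited, "arn:aws:sqs:us-east-1:123456789012:splunk-catalog-sync", "bucket1"
):
    print(problem)
```

An empty list means both fields match the expected values. The sketch does not validate anything else in the policy, such as the Principal or Action fields.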
When your event notification is validated, return to Create an Amazon S3 dataset for federated search that is backed by a Splunk-native data catalog and continue where you left off in the setup of your dataset.