Create an SQS queue and set up event notification for the Amazon S3 bucket that contains your DDSS dataset, to keep your Splunk-managed data catalog in sync with dataset changes.
When you create a DDSS dataset, Splunk software generates a data catalog to support it. This Splunk-native data catalog enables you to run efficient and cost-effective searches of the dataset.
You must arrange for this data catalog to remain consistent with your DDSS dataset as objects are added to the dataset, removed from the dataset, or restored to the dataset from the Glacier Deep Archive or Glacier Flexible Retrieval storage classes. You do this by setting up a Simple Queue Service (SQS) queue and event notification for the Amazon S3 bucket that contains the dataset. After you create the queue, retrieve the SQS queue ARN and add it to your DDSS dataset definition in the Data Management app.
- Ensure you have access to your Amazon Web Services (AWS) account, with permissions to manage S3 buckets and create Amazon Simple Queue Service (SQS) queues.
- Identify the Amazon S3 bucket that serves as the location for the DDSS dataset that you want to synchronize with your Splunk-native data catalog.
- Copy the DDSS resource prefix that appears on the Configure dataset page of your DDSS dataset definition. You'll paste this prefix into the Name field for your SQS queue. See Define a DDSS dataset.
Step 1: Create an SQS queue for the S3 bucket that contains your dataset
- In a new browser tab, log in to your AWS account and navigate to the Simple Queue Service (SQS) console.
- On the SQS dashboard, select Create queue.
- In the Details section, select the Standard queue type.
The standard queue type provides high throughput for catalog updates.
- In the Name field, provide a name for your SQS queue. This name must begin with the DDSS resource prefix that you can copy from the Configure dataset page in the Splunk Data Management app. Paste in that prefix, and then add to it an SQS queue name of your choice.
For example, if the DDSS resource prefix provided on the Configure dataset page is "fa2-nnn-aname-ddss-2-abcdefghijkl-", you might enter fa2-nnn-aname-ddss-2-abcdefghijkl-my-SQS-queue into the Name field.
- Scroll to the bottom of the page and select Create queue.
- In the Details section, find the ARN field. This is the SQS queue ARN. Select the copy icon to copy it.
- Go back to the browser tab that has your federated search dataset definition and paste the ARN value into the SQS queue ARN field.
- In the browser tab that has your SQS queue information, scroll down to the Access policy section and select Edit.
- In the Access policy editor, replace the default access policy with the following template:
{
  "Version": "2012-10-17",
  "Id": "example-ID",
  "Statement": [
    {
      "Sid": "example-statement-ID",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": "sqs:SendMessage",
      "Resource": "YOUR_SQS_QUEUE_ARN",
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:::YOUR_S3_BUCKET_NAME"
        }
      }
    }
  ]
}
- Replace YOUR_SQS_QUEUE_ARN with the ARN for the SQS queue.
- Replace YOUR_S3_BUCKET_NAME with the name of the Amazon S3 bucket that contains the DDSS location for your dataset.
- Select Save to save your SQS queue access policy changes.
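If you prefer to prepare the queue name and access policy outside the console, the following is a minimal Python sketch of the substitutions described in this step. The function names make_queue_name and build_access_policy are hypothetical, as are the Region and account ID in the sample ARN. The sketch only builds the name and the policy document; it does not call AWS.

```python
import json
import re

# SQS standard queue names allow up to 80 characters drawn from
# letters, digits, hyphens, and underscores.
_QUEUE_NAME = re.compile(r"[A-Za-z0-9_-]{1,80}")


def make_queue_name(resource_prefix: str, suffix: str) -> str:
    """Append a suffix of your choice to the DDSS resource prefix."""
    name = resource_prefix + suffix
    if not _QUEUE_NAME.fullmatch(name):
        raise ValueError(f"not a valid standard queue name: {name!r}")
    return name


def build_access_policy(queue_arn: str, bucket_name: str) -> dict:
    """Fill in the access policy template with your queue ARN and bucket name."""
    return {
        "Version": "2012-10-17",
        "Id": "example-ID",
        "Statement": [
            {
                "Sid": "example-statement-ID",
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                "Condition": {
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{bucket_name}"}
                },
            }
        ],
    }


if __name__ == "__main__":
    name = make_queue_name("fa2-nnn-aname-ddss-2-abcdefghijkl-", "my-sqs-queue")
    # Hypothetical Region and account ID, for illustration only.
    arn = f"arn:aws:sqs:us-east-1:123456789012:{name}"
    print(json.dumps(build_access_policy(arn, "bucket1"), indent=2))
```

The printed JSON can be pasted into the Access policy editor in place of the hand-edited template.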
Step 2: Set up event notification for the S3 bucket that contains your dataset
- In the browser tab you opened for your AWS account, navigate to the Amazon S3 console.
- From the General purpose buckets list, select the name of the S3 bucket that contains the Amazon S3 location for your dataset.
- Select the Properties tab.
- Scroll down to the Event notifications section and select Create event notification.
- In the General configuration section, supply an Event name, such as SplunkS3DiscoveryEvent.
- Ensure that the Prefix corresponds to the Index path that appears on the Define dataset page. See Define a DDSS dataset.
For the Prefix field, enter everything in the Index path after the bucket name. For example, if the Index path for your dataset is s3://bucket1/path1/archived_data/ddss_index1, set the Prefix field to path1/archived_data/ddss_index1.
- In the Event types section, select the following checkboxes:
- Select All object create events to notify your Splunk-managed data catalog when new data is uploaded to your dataset.
- Select All object removal events to notify your Splunk-managed data catalog when data is deleted or a delete marker is created.
- Select All restore object events to notify your Splunk-managed data catalog when objects are restored from the Glacier Deep Archive or Glacier Flexible Retrieval storage classes.
- Select Lifecycle transition events and All lifecycle expiration events to capture automatic storage class transitions and expiration of dataset objects based on lifecycle rules.
- In the Destination section, select SQS queue as the destination type.
- Under Specify SQS queue, choose Enter SQS queue ARN.
- Paste the SQS queue ARN you copied in the previous section into the SQS queue text field.
- Select Save changes.
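The settings chosen in this step can also be expressed as data. The following Python sketch derives the Prefix value from an Index path and assembles a notification configuration with the five event selections above. The function names are hypothetical. The dictionary layout is assumed to match what the boto3 S3 client accepts as the NotificationConfiguration argument of put_bucket_notification_configuration; verify it against the AWS documentation before relying on it. The sketch does not call AWS.

```python
from urllib.parse import urlparse

# S3 event types corresponding to the checkboxes selected in this step.
DDSS_EVENTS = [
    "s3:ObjectCreated:*",       # All object create events
    "s3:ObjectRemoved:*",       # All object removal events
    "s3:ObjectRestore:*",       # All restore object events
    "s3:LifecycleTransition",   # Lifecycle transition events
    "s3:LifecycleExpiration:*", # All lifecycle expiration events
]


def split_index_path(index_path: str) -> tuple:
    """Return (bucket name, prefix) for an Index path such as s3://bucket/a/b."""
    parsed = urlparse(index_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// path: {index_path!r}")
    # The prefix is everything after the bucket name, without the leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


def build_notification_configuration(
    queue_arn: str, prefix: str, event_name: str = "SplunkS3DiscoveryEvent"
) -> dict:
    """Assemble one SQS queue configuration filtered on the dataset prefix."""
    return {
        "QueueConfigurations": [
            {
                "Id": event_name,
                "QueueArn": queue_arn,
                "Events": list(DDSS_EVENTS),
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


if __name__ == "__main__":
    bucket, prefix = split_index_path("s3://bucket1/path1/archived_data/ddss_index1")
    # Hypothetical queue ARN, for illustration only.
    arn = "arn:aws:sqs:us-east-1:123456789012:example-queue"
    print(bucket, prefix)
    print(build_notification_configuration(arn, prefix))
```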
After you save your event notification, AWS attempts to send an s3:TestEvent test message to your SQS queue. If the configuration is successful, a green success banner appears at the top of the S3 console. If you receive a permissions error, return to the SQS queue access policy and ensure that the Resource field matches your SQS queue ARN and that the aws:SourceArn field matches your S3 bucket ARN.
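When troubleshooting a permissions error, it can help to check the two policy fields mechanically. The following Python sketch compares a policy document against the expected queue ARN and bucket name. The function name find_policy_mismatches is hypothetical, and the comparison is a simplification: it uses exact string equality and does not evaluate ArnLike wildcards the way AWS does.

```python
def find_policy_mismatches(policy: dict, queue_arn: str, bucket_name: str) -> list:
    """Return a list of problems found in the sqs:SendMessage statement."""
    expected_source = f"arn:aws:s3:::{bucket_name}"
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for stmt in statements:
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if stmt.get("Effect") != "Allow" or "sqs:SendMessage" not in actions:
            continue
        # Check only the first matching statement (simplifying assumption).
        problems = []
        if stmt.get("Resource") != queue_arn:
            problems.append(f"Resource is {stmt.get('Resource')!r}, expected {queue_arn!r}")
        source = stmt.get("Condition", {}).get("ArnLike", {}).get("aws:SourceArn")
        if source != expected_source:
            problems.append(f"aws:SourceArn is {source!r}, expected {expected_source!r}")
        return problems
    return ["no Allow statement for sqs:SendMessage found"]
```

An empty list means both fields match. A policy that still contains the YOUR_SQS_QUEUE_ARN or YOUR_S3_BUCKET_NAME placeholders is reported as a mismatch.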
When your event notification is validated, return to Define a DDSS dataset and continue where you left off in the setup of your dataset.