Define a DDSS dataset

Define a Dynamic Data Self Storage (DDSS) dataset in the Data Management app to facilitate federated search of data stored in a specific DDSS location in AWS.

Define a Dynamic Data Self Storage (DDSS) dataset in the Data Management app for use in federated searches. Each DDSS dataset you define lets you run federated searches over data stored in a specific DDSS location in an Amazon S3 bucket without having to reindex it first.

Note: Federated Search for DDSS currently does not support creation of datasets for DDSS locations in Azure or GCP environments.

Before you define a DDSS dataset, confirm that you meet the following prerequisites:
  • You must have a Splunk Cloud Platform (SCP) deployment with dynamic data self storage (DDSS) locations configured in Amazon S3 buckets. See Store expired Splunk Cloud Platform data in your private archive in the Splunk Cloud Platform Admin Manual.

  • Your user account on the SCP deployment must have a role with the edit_connections and edit_datasets capabilities. See Define roles on the Splunk platform with capabilities in the Splunk Cloud Platform Manage Users and Security manual.

  • You must have an AWS account with sufficient permissions to manage the Amazon S3 buckets that serve as locations for your DDSS datasets and apply policies or permissions to them. You also must have permissions that allow you to create and manage SQS queues for those Amazon S3 buckets.

Note: Federated Search for DDSS does not support searches of DDSS data that is archived in the S3 Glacier Deep Archive or S3 Glacier Flexible Retrieval storage classes. If the DDSS location you want to search contains data that is stored in either of these storage classes, you must restore it to a searchable storage class before you can search it. For more information, see Restoring an archived object in the Amazon Simple Storage Service User Guide.
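
The storage-class restriction in the preceding note can be checked ahead of time. The following is a minimal Python sketch, not part of the Splunk product, that flags objects in the two unsupported storage classes. It assumes you already have an object listing shaped like the "Contents" entries of an S3 ListObjectsV2 response. The function name and the sample keys are hypothetical. Objects that were restored only temporarily may still report an archive class in listings, so treat the result as a first-pass check.

```python
# StorageClass values as they appear in S3 object listings:
# GLACIER = S3 Glacier Flexible Retrieval, DEEP_ARCHIVE = S3 Glacier Deep Archive.
ARCHIVED_STORAGE_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}


def find_archived_objects(objects):
    """Return the keys of objects stored in an unsupported archive class.

    `objects` is a list of dicts shaped like the "Contents" entries of an
    S3 ListObjectsV2 response (each has "Key" and usually "StorageClass").
    """
    return [
        obj["Key"]
        for obj in objects
        # Assumption: a missing StorageClass is treated as STANDARD.
        if obj.get("StorageClass", "STANDARD") in ARCHIVED_STORAGE_CLASSES
    ]


# Hypothetical listing, for illustration only.
sample_listing = [
    {"Key": "example-index/bucket-a/journal.zst", "StorageClass": "STANDARD"},
    {"Key": "example-index/bucket-b/journal.zst", "StorageClass": "GLACIER"},
    {"Key": "example-index/bucket-c/journal.zst", "StorageClass": "DEEP_ARCHIVE"},
]

archived = find_archived_objects(sample_listing)
print(archived)
```

Any key that this check returns points to data you would need to restore before a federated search can read it.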
  1. In Splunk Cloud Platform, select Data Management from the Apps panel.
  2. Navigate to the Datasets page, and then select Create dataset.
  3. On the Select data store page, select Dynamic Data Self Storage (DDSS), then select Next.
  4. On the Define dataset page, provide values for the following fields and select Next:
    Setting: Description

    S3 bucket name: Select the name of the Amazon S3 bucket that contains the DDSS dataset.

    DDSS index: Select the index in your Splunk Cloud Platform deployment that your DDSS dataset is associated with. This will be an index that has been configured for Dynamic Data Self-Storage and is set up to move expired data to the location at the S3 bucket path you have provided.

      Note: This field is unavailable until you enter an S3 bucket path that is associated with one or more DDSS indexes in your Splunk Cloud Platform deployment.

      If no indexes appear in the DDSS index dropdown, select Refresh indexes to update the index list.

      Each DDSS index can be associated with only one DDSS dataset. DDSS indexes that already have a DDSS dataset association won't appear in the DDSS index dropdown.

      After you supply the S3 bucket name and DDSS index, Splunk software displays an Index path for your DDSS dataset. Save this path. You will need it when you set up event notifications for the Amazon S3 bucket in Step 6.

      For more information about configuring DDSS indexes, see Store expired Splunk Cloud Platform data in your private archive in the Splunk Cloud Platform Admin Manual.

    Dataset name: Supply a unique name for your dataset. The dataset name can contain only alphanumeric characters, underscores, and hyphens.

    Dataset description: (Optional) Provide a description for your dataset.
  5. On the Configure dataset page, select the copy icon to copy the provided DDSS resource prefix to your clipboard. You will need this DDSS resource prefix in the following step.

    Here is an example of a DDSS resource prefix: fa2-nnn-aname-ddss-2-abcdefghijkl-.

  6. Set up an SQS queue and event notification for the Amazon S3 bucket that contains your DDSS dataset. This keeps the Splunk-native data catalog that backs your DDSS dataset automatically updated, so that it stays consistent with the dataset it represents. Follow the instructions in Set up automated updates for a Splunk-native DDSS data catalog in AWS.
    Note: When you create the SQS queue, you must give it a Name that begins with the DDSS resource prefix you copied in Step 5.

    After you set up the SQS queue, you'll obtain an Amazon Resource Name (ARN) that you can paste into the SQS queue ARN field.

  7. Select Next.
  8. On the Update policies page, you'll apply two generated policies to your AWS account.
    1. AWS S3 Bucket Policy
      • Copy the AWS S3 Bucket policy from your dataset definition.

      • In a new browser tab, open the AWS Management Console and navigate to the Amazon S3 console.

      • Select your bucket, go to the Permissions tab, and append the copied bucket policy to the existing Bucket policy.

        Note: Do not overwrite existing policies. Resolve security warnings, errors, and suggestions before saving your changes.

        For additional guidance, see Adding a bucket policy by using the Amazon S3 console in the Amazon Simple Storage Service User Guide.

    2. AWS SQS Queue Policy
      • Copy the AWS SQS Queue Policy from your dataset definition.

      • In the AWS console, open the Simple Queue Service console.

      • Select the SQS queue you defined for the Amazon S3 bucket that holds your DDSS data. Select Edit and append the copied SQS queue policy to the existing Access policy.

        Note: Do not overwrite existing policies. Resolve security warnings, errors, and suggestions before saving your changes.

        For additional guidance, see Configuring an access policy in Amazon SQS in the Amazon Simple Queue Service Developer Guide.

  9. After you paste and save both policies in their appropriate locations in your AWS account, select Next.
  10. On the Review page, review your dataset definition. If the details appear correct, select Create Dataset to create your dataset.
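
Step 8 asks you to append each generated policy to an existing policy without overwriting it. The following is a minimal Python sketch of what "append" means for an AWS policy document: the existing statements stay in place and the generated statements are added to the same Statement list. The function name, the ARNs, the account IDs, and the statement contents are all hypothetical placeholders. The real statements are the ones shown on the Update policies page, and this sketch does not check for conflicting Sid values.

```python
import copy
import json


def _as_list(statement):
    """A policy "Statement" may be one object or a list; always return a list."""
    return list(statement) if isinstance(statement, list) else [statement]


def append_policy_statements(existing_policy, generated_policy):
    """Return a new policy that keeps every existing statement and appends
    the generated ones, skipping exact duplicates."""
    merged = copy.deepcopy(existing_policy) if existing_policy else {}
    merged.setdefault("Version", generated_policy.get("Version", "2012-10-17"))
    statements = _as_list(merged.get("Statement", []))
    for statement in _as_list(generated_policy.get("Statement", [])):
        if statement not in statements:
            statements.append(copy.deepcopy(statement))
    merged["Statement"] = statements
    return merged


# Placeholder policy that already exists on the bucket (illustration only).
existing_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ExistingTeamAccess",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:role/example-team-role"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-ddss-bucket/*",
        }
    ],
}

# Placeholder standing in for the policy copied from the dataset definition.
generated_policy = {
    "Version": "2012-10-17",
    "Statement": {
        "Sid": "PlaceholderGeneratedStatement",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::444455556666:role/example-generated-role"},
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::example-ddss-bucket",
            "arn:aws:s3:::example-ddss-bucket/*",
        ],
    },
}

merged_policy = append_policy_statements(existing_policy, generated_policy)
print(json.dumps(merged_policy, indent=2))
```

The same approach applies to the SQS queue access policy, because it uses the same Version and Statement structure. Review the merged result in the AWS console and resolve any warnings it reports before you save.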
After you create your DDSS dataset, do these things: