Restore indexed data from a self-storage location

You might need to restore indexed data from a self-storage location. To restore this data, move the exported data into a thawed directory on a Splunk Enterprise instance, such as $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb. After the data is restored, you can search it.

You can restore one bucket at a time. Make sure that you are restoring an entire bucket that contains the rawdata journal, and not a directory within a bucket.

Note: An entire bucket initially contains a rawdata journal and associated tsidx and metadata files. During the DDSS archival process, only the rawdata journal is retained.

For more information on buckets, see How the indexer stores indexes in the Splunk Enterprise manual Managing Indexers and Clusters of Indexers.

Data in the thaweddb directory is not subject to the instance's index aging scheme, so restored data does not expire immediately. You can keep archived data in the thawed directory for as long as you need it. When you no longer need the data, delete it or move it out of the thawed directory.

Note: As a best practice, restore your data using a *nix machine. Using a Windows machine to restore indexed data to a Splunk Enterprise instance might result in a benign error message. See Troubleshoot Dynamic Data Self Storage.
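Archived bucket directories encode their event time range in the directory name (db_<newest_time>_<oldest_time>_<bucketID>), which can help you decide which buckets are worth thawing. The following sketch parses that name; the helper name and sample values are illustrative, not part of any Splunk tool:

```shell
#!/bin/sh
# Parse a Splunk bucket directory name of the form
# db_<newest_epoch>_<oldest_epoch>_<bucketID> and report its event time range.
# The function name and sample bucket ID below are hypothetical.
bucket_time_range() {
    name=$(basename "$1")
    case "$name" in
        db_[0-9]*_[0-9]*_[0-9]*)
            newest=$(echo "$name" | cut -d_ -f2)   # latest event time (epoch)
            oldest=$(echo "$name" | cut -d_ -f3)   # earliest event time (epoch)
            echo "oldest=$oldest newest=$newest"
            ;;
        *)
            echo "not a bucket directory: $name" >&2
            return 1
            ;;
    esac
}

bucket_time_range "db_1605657600_1605571200_42"
```

You can feed the epoch values to `date` to see human-readable times and restore only the buckets that cover the period you need.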

Restore indexed data from an AWS S3 bucket

  1. Set up a Splunk Enterprise instance. The Splunk Enterprise instance can be either local or remote. If you have an existing Splunk Enterprise instance, you can use it.
    Note: You can restore self storage data only to a Splunk Enterprise instance. You can't restore self storage data to a Splunk Cloud Platform instance.
  2. Install the AWS Command Line Interface (CLI) tool on your local machine. The AWS CLI tool must be installed on the same machine as the Splunk Enterprise instance that rebuilds the data.
  3. Configure the AWS CLI tool with the credentials of your AWS self storage location. For instructions, see the AWS Command Line Interface documentation.
  4. Use the recursive copy command to download data from the self storage location to the thaweddb directory for your index. You can restore only one bucket at a time. If you have a large number of buckets to restore, consider using a script to do so. Use syntax similar to the following:
    aws s3 cp s3://<self_storage_bucket>/<self_storage_folder(s)>/<index_name> $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb/ --recursive
    
    Note: Make sure you copy the entire contents of the archived Splunk bucket, because all of it is needed to restore the data. For example, copy starting at the bucket directory level: db_<newest_time>_<oldest_time>_<bucketID>. Do not copy only the raw data (.gz files). The buckets appear in the thaweddb directory of your Splunk Enterprise instance.
  5. Restore the indexes by running the following command:
    ./splunk rebuild $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb/<bucket_folder> <index_name>
    
    This command rebuilds the bucket's tsidx and metadata files, including source types, from the rawdata journal.
  6. After the data is restored, go to the Search & Reporting app, and search on the restored index as you would any other Splunk index.
    Note: When you restore data to the thawed directory on Splunk Enterprise, it does not count against the indexing license volume for the Splunk Enterprise or Splunk Cloud Platform deployment.
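If you have many buckets to restore, the per-bucket copy and rebuild in steps 4 and 5 can be scripted. The sketch below assumes the AWS CLI is configured as in step 3 and that SPLUNK_HOME is set; the function name and example paths are placeholders, not part of any Splunk tool:

```shell
#!/bin/sh
# Copy and rebuild every archived bucket under an S3 prefix, one at a time.
# Assumes: AWS CLI configured with your self storage credentials, SPLUNK_HOME
# set, and Splunk Enterprise installed on this machine. Names are illustrative.
restore_all_buckets() {
    s3_prefix=$1    # e.g. s3://<self_storage_bucket>/<folder(s)>/<index_name>
    thawed=$2       # e.g. $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb
    index=$3
    # `aws s3 ls` marks folders with "PRE"; keep only bucket directories.
    aws s3 ls "$s3_prefix/" | awk '/PRE db_/ {print $2}' | while read -r bucket; do
        bucket=${bucket%/}
        aws s3 cp "$s3_prefix/$bucket" "$thawed/$bucket" --recursive
        "$SPLUNK_HOME/bin/splunk" rebuild "$thawed/$bucket" "$index"
    done
}

# Example invocation (placeholder names):
# restore_all_buckets "s3://my-selfstorage/archive/my_index" \
#     "$SPLUNK_HOME/var/lib/splunk/my_index/thaweddb" my_index
```

Running the copy and rebuild inside one loop keeps the one-bucket-at-a-time rule while still automating a large restore.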

Restore indexed data from a GCP bucket

  1. Set up a Splunk Enterprise instance. The Splunk Enterprise instance can be either local or remote. If you have an existing Splunk Enterprise instance, you can use it.
    Note: You can restore self storage data only to a Splunk Enterprise instance. You can't restore self storage data to a Splunk Cloud Platform instance.
  2. Install the GCP command line interface tool, gsutil, on your local machine. The gsutil tool must be installed on the same machine as the Splunk Enterprise instance that rebuilds the data.
  3. Configure the gsutil tool with the credentials of your GCP self storage location. For instructions, see the gsutil tool documentation.
  4. Use the recursive copy command to download data from the self storage location to the thaweddb directory for your index. You can restore only one bucket at a time. If you have a large number of buckets to restore, consider using a script to do so. Use syntax similar to the following:
    gsutil cp -r gs://<self_storage_bucket>/<self_storage_folder(s)>/<index_name> $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb/
    
    Note: Make sure you copy the entire contents of the archived Splunk bucket, because all of it is needed to restore the data. For example, copy starting at the bucket directory level: db_<newest_time>_<oldest_time>_<bucketID>. Do not copy only the raw data (.gz files). The buckets appear in the thaweddb directory of your Splunk Enterprise instance.
  5. Restore the indexes by running the following command:
    ./splunk rebuild $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb/<bucket_folder> <index_name>
    
    This command rebuilds the bucket's tsidx and metadata files, including source types, from the rawdata journal.
  6. After the data is restored, go to the Search & Reporting app, and search on the restored index as you would any other Splunk index.
    Note: When you restore data to the thawed directory on Splunk Enterprise, it does not count against the indexing license volume for the Splunk Enterprise or Splunk Cloud Platform deployment.
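For either cloud provider, you can sanity-check downloaded buckets before rebuilding them: a restorable bucket must contain the rawdata journal (journal.gz, or journal.zst when zstd compression is in use). The sketch below assumes that on-disk layout; the helper names are illustrative:

```shell
#!/bin/sh
# Report which thawed bucket directories contain a rawdata journal and are
# therefore candidates for `splunk rebuild`. Helper names are hypothetical.
has_rawdata_journal() {
    [ -f "$1/rawdata/journal.gz" ] || [ -f "$1/rawdata/journal.zst" ]
}

check_thawed_buckets() {
    thawed=$1   # e.g. $SPLUNK_HOME/var/lib/splunk/<index_name>/thaweddb
    for bucket in "$thawed"/db_*; do
        [ -d "$bucket" ] || continue
        if has_rawdata_journal "$bucket"; then
            echo "ok: $bucket"
        else
            echo "missing journal: $bucket"
        fi
    done
}
```

Running this check before step 5 catches the common mistake of copying a directory within a bucket instead of the entire bucket.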