Back up and restore the Data Management control plane
Back up the Splunk Enterprise node that hosts the Data Management control plane and restore it from backup. Taking regular backups from a healthy environment lets you restore from a backup in the event of a disaster. To create a backup, do the following:
-
Back up the $SPLUNK_HOME/etc directory, which includes the Data Management control plane and Pipeline Builders app with the pipeline specifications that have been authored. For more information, see Back up configuration information in the Splunk Enterprise Admin Manual.
-
Back up the existing KV store, which stores the Splunk tokens that were used to onboard the Edge Processors. For more information, see Back up and restore the KV store in the Splunk Enterprise Admin Manual.
-
Create a backup of the following databases for the Data Management services by using the Storage sidecar:
-
search_metadata
-
kvstore
-
acies_config_service
-
opamp_service
-
Back up the Data Management services using the Storage sidecar
The Storage sidecar is included in the installation of Splunk Enterprise and helps manage the backing storage for other sidecars. For more information, see About Splunk sidecars in the Splunk Enterprise Admin Manual. To create a backup through the Storage sidecar, do the following:
-
Find the port of the API server assigned by the IPC broker.
lsof -i -P | awk '/postgres/ && /LISTEN/ && $9 ~ /localhost/ && $9 !~ /5432/ {split($9, a, ":"); ports[++count] = a[2]} END {if (count >= 2) print ports[2]}'
This command returns the second matching listening port, which corresponds to the API server.
-
Retrieve the password that replaces <admin_password> in the next step by entering the following command:
curl "https://localhost:8089/servicesNS/nobody/system/storage/passwords/postgres:postgres_admin?output_mode=json" -k -u "<dmx username>:<dmx login password>"
The password is in the "clear_password" field of the output. Use it to replace <admin_password> in the next step.
-
Set the environment variables. Replace <admin_password> with the password that you retrieved in step 2, and replace <actual_api_port> with the port that you found in step 1.
export PG_ADMIN_USER=postgres_admin
export PG_ADMIN_PASS=<admin_password>
export PG_API_PORT=<actual_api_port> # e.g., 37173, retrieved via lsof
export PG_AUTH_BASIC=$(echo -n "$PG_ADMIN_USER:$PG_ADMIN_PASS" | base64)
-
After finding the port and setting the environment variables, create the backup by entering the following command. Replace <DATABASE NAME> with the name of the database that you want to back up, and replace <DB BACKUP FILE> with the name of the backup file.
curl -X POST "https://localhost:$PG_API_PORT/v1/postgres/recovery/backup" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $PG_AUTH_BASIC" \
  -d '{ "database": "<DATABASE NAME>", "backupFile": "<DB BACKUP FILE>" }' -k
The response includes an ID. Use this ID to set the PG_BACKUP_ID variable in the next step.
-
Verify the status of the backup.
export PG_BACKUP_ID=<id_from_backup_response>
curl -X GET "https://localhost:$PG_API_PORT/v1/postgres/recovery/status/$PG_BACKUP_ID" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $PG_AUTH_BASIC" -k
You now have a PostgreSQL database that contains the backup of your Splunk Enterprise node that hosts the Data Management control plane.
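The port lookup and the Basic authentication value from the steps above can be sketched as two small shell functions. This is an illustration under stated assumptions, not part of the product: the function names pick_api_port and basic_auth_header are hypothetical, and the sample lsof lines are made up to show which line the filter selects.

```shell
# Sketch only: pick_api_port and basic_auth_header are hypothetical helper names.

# Read `lsof -i -P` output on stdin and print the second localhost LISTEN port
# held by a postgres process, skipping 5432 (same filter as the documented command).
pick_api_port() {
  awk '/postgres/ && /LISTEN/ && $9 ~ /localhost/ && $9 !~ /5432/ {
         split($9, a, ":"); ports[++count] = a[2]
       }
       END { if (count >= 2) print ports[2] }'
}

# Build the value that follows "Authorization: Basic " from a user name and password.
basic_auth_header() {
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Made-up sample lines in the lsof column layout, for illustration only.
sample='postgres 4242 splunk 6u IPv4 0x1 0t0 TCP localhost:5432 (LISTEN)
postgres 4243 splunk 7u IPv4 0x2 0t0 TCP localhost:40001 (LISTEN)
postgres 4244 splunk 8u IPv4 0x3 0t0 TCP localhost:37173 (LISTEN)
splunkd  4245 splunk 9u IPv4 0x4 0t0 TCP localhost:8089 (LISTEN)'

# The 5432 line is skipped, 40001 is the first match, 37173 is the second.
printf '%s\n' "$sample" | pick_api_port
```

On a real node, `lsof -i -P | pick_api_port` would take the place of the sample input. The sketch uses printf instead of echo -n because printf behaves the same way across shells.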
Restore the Data Management control plane using the Storage sidecar backup
To restore the Storage sidecar backup, do the following for each database:
- Enter the restore command.
curl -X POST "https://localhost:$PG_API_PORT/v1/postgres/recovery/restore" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $PG_AUTH_BASIC" \
  -d '{ "database": "<DATABASE NAME>", "backupFile": "<DB BACKUP FILE>" }' -k
The response includes an ID. Use this ID to set the PG_BACKUP_ID variable in the next step.
- Verify the status of the restored backup.
export PG_BACKUP_ID=<id_from_restore_response>
curl -X GET "https://localhost:$PG_API_PORT/v1/postgres/recovery/status/$PG_BACKUP_ID" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $PG_AUTH_BASIC" -k
You have restored the backup for your Data Management control plane.
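Because the backup and restore requests are repeated for each of the four databases, a loop can reduce typing errors. The following is a minimal sketch, not a supported script: the helper names recovery_payload and run_recovery are hypothetical, and naming each backup file <database>.backup is an assumption made only for illustration.

```shell
# Sketch only: recovery_payload and run_recovery are hypothetical helper names,
# and the "<database>.backup" file naming is an assumption.

DATABASES="search_metadata kvstore acies_config_service opamp_service"

# Print the JSON request body for one database and backup file.
recovery_payload() {
  printf '{ "database": "%s", "backupFile": "%s" }' "$1" "$2"
}

# Send a backup or restore request for every database.
# $1 is the action: "backup" or "restore".
# Requires PG_API_PORT and PG_AUTH_BASIC to be set as in the backup procedure.
run_recovery() {
  for db in $DATABASES; do
    curl -X POST "https://localhost:$PG_API_PORT/v1/postgres/recovery/$1" \
      -H "Content-Type: application/json" \
      -H "Authorization: Basic $PG_AUTH_BASIC" \
      -d "$(recovery_payload "$db" "$db.backup")" -k
    echo
  done
}

# Preview the request bodies without contacting the API.
for db in $DATABASES; do
  recovery_payload "$db" "$db.backup"
  echo
done
```

Calling `run_recovery backup` or `run_recovery restore` would print one response per database. Each response ID still needs to be checked with the status command.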
Troubleshooting the Storage sidecar and PostgreSQL
Review this topic if you are having difficulties with using the Storage sidecar. If the problem that you're experiencing is not described on this page, see Sidecar troubleshooting for more information.
The Storage sidecar is not running
If you see that there are no metrics being collected in your database, the Storage sidecar might not be running.
Cause
The sidecars have not been enabled.
Solution
sudo su - splunk
cd /opt/splunk/var/run/supervisor/pkg-run/
ls -la
# should show sidecar binaries: pkg-postgresxxxxxxx
Logs:
cd /opt/splunk/var/log/splunk
grep '"msg":"Postgres health check"' /opt/splunk/var/log/splunk/sup-pkg-postgres-stdout.log
If you have verified that the sidecars are not starting, try the following:
-
Restart Splunk Enterprise.
-
Check if the feature is enabled.
-
Check for configuration issues or messages.
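The log check above can be wrapped in a small function that reports how many health check messages the sidecar has logged. This is a sketch under assumptions: count_health_checks is a hypothetical helper name, and a count of zero only suggests, rather than proves, that the sidecar has not been running.

```shell
# Sketch only: count_health_checks is a hypothetical helper name.

# Print how many "Postgres health check" messages a log file contains.
count_health_checks() {
  grep -c '"msg":"Postgres health check"' "$1"
}

LOG=/opt/splunk/var/log/splunk/sup-pkg-postgres-stdout.log
if [ -r "$LOG" ]; then
  echo "health check messages: $(count_health_checks "$LOG")"
else
  echo "log not readable: $LOG"
fi
```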
The PostgreSQL service is not running in a healthy state
If you run the following command:
curl -X GET "https://localhost:$PG_API_PORT/v1/postgres/health"
and it returns "healthStatus":"unhealthy", the service is not healthy and errors need to be resolved.
Cause
This can happen for multiple reasons. Inspect the logs to identify what could have caused the unhealthy state.
Solution
Troubleshoot by inspecting the logs, and then restart Splunk Enterprise.
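For scripted monitoring, the health response can be tested for the unhealthy status. The following is a minimal sketch: is_unhealthy is a hypothetical helper name, and the sample response body is made up. Only the "healthStatus":"unhealthy" pair comes from this topic, so the rest of the real response might look different.

```shell
# Sketch only: is_unhealthy is a hypothetical helper name.

# Read a health response on stdin; succeed if it reports an unhealthy state.
is_unhealthy() {
  grep -q '"healthStatus"[[:space:]]*:[[:space:]]*"unhealthy"'
}

# Made-up response body for illustration only.
sample='{"healthStatus":"unhealthy"}'
if printf '%s' "$sample" | is_unhealthy; then
  echo "PostgreSQL service is unhealthy; inspect the logs"
fi
```

On a real node, the output of the health command would be piped into is_unhealthy in place of the sample.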
The PostgreSQL service is running but is unreachable on port 5432
Cause
This can happen when the port has been changed due to CLI or functional test overrides.
Solution
Search the PostgreSQL logs to find the port that the service is listening on:
grep 'listening on IPv4 address' /opt/splunk/var/log/splunk/postgres-*.log
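To pull only the port number out of the matching log lines, a sed expression can be used. This is a sketch that assumes the usual PostgreSQL message layout (listening on IPv4 address "<address>", port <number>). The listening_port name is hypothetical, and the sample line is made up.

```shell
# Sketch only: listening_port is a hypothetical helper name.

# Read PostgreSQL log lines on stdin and print the port from the last
# "listening on IPv4 address" message.
listening_port() {
  sed -n 's/.*listening on IPv4 address "[^"]*", port \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Made-up log line in the assumed format, for illustration only.
sample='2024-01-01 00:00:00.000 UTC [4242] LOG:  listening on IPv4 address "127.0.0.1", port 40001'
printf '%s\n' "$sample" | listening_port
```

On a real node, `cat /opt/splunk/var/log/splunk/postgres-*.log | listening_port` would print the most recently logged port.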