Perform a rolling upgrade of a search head cluster
Splunk Enterprise version 7.1.0 and higher supports rolling upgrades of search head clusters. A rolling upgrade upgrades cluster members in phases, one member at a time, so that ongoing searches experience minimal interruption.
Requirements and considerations
Review the following requirements and considerations before you initiate a rolling upgrade:
- Rolling upgrade only applies to upgrades from version 7.1.x to higher versions of Splunk Enterprise.
- All search head cluster members, the indexer cluster manager node, and all indexer cluster peer nodes must be running version 7.1.0 or higher.
- When performing a rolling upgrade to Splunk Enterprise version 9.0 or higher, you must manually migrate the KV store to the WiredTiger storage engine and server version 4.2, if you have not already done so. For detailed instructions, see Migrate the KV store in a clustered deployment.
- Do not attempt any clustering maintenance operations, such as rolling restart, bundle pushes, or node additions, during a rolling upgrade.
How a rolling upgrade works
When you initiate a rolling upgrade, you select a cluster member and put that member into manual detention. While in manual detention, the member cannot accept new search jobs, and all in-progress searches try to complete within a configurable timeout. When all searches are complete, you perform the software upgrade and bring the member back online. You repeat this process for each cluster member until the rolling upgrade is complete.
A rolling upgrade behaves in the following ways:
- Cluster members are upgraded one at a time.
- While in manual detention, the following applies to a cluster member:
  - The cluster member cannot receive new searches, execute ad hoc searches, or receive new search artifacts from other members.
  - The cluster member continues to participate in most cluster operations, such as captain election and automatic configuration replication.
- New scheduled searches are executed on other members.
- The cluster member waits for in-progress searches to complete, up to a maximum time set by the user. The default of 180 seconds is enough time for the majority of searches to complete in most cases.
- Rolling upgrades apply to both historical and real-time searches.
Perform a rolling upgrade
To upgrade a search head cluster with minimal search interruption, perform the following steps:
1. Run preliminary health checks
On any cluster member, run the splunk show shcluster-status command with the --verbose option to confirm that the cluster is in a healthy state before you begin the upgrade:
splunk show shcluster-status --verbose
Here is an example of the output from the command:
Captain:
    decommission_search_jobs_wait_secs : 180
    dynamic_captain : 1
    elected_captain : Tue Mar 6 23:35:52 2018
    id : FEC6F789-8C30-4174-BF28-674CE4E4FAE2
    initialized_flag : 1
    kvstore_maintenance_status : enabled
    label : sh3
    max_failures_to_keep_majority : 1
    mgmt_uri : https://sroback180306192122accme_sh3_1:8089
    min_peers_joined_flag : 1
    rolling_restart : restart
    rolling_restart_flag : 0
    rolling_upgrade_flag : 0
    service_ready_flag : 1
    stable_captain : 1
Cluster Manager(s):
    https://sroback180306192122accme_manager1_1:8089 splunk_version: 7.1.0
Members:
    sh3
        kvstore_status : maintenance
        label : sh3
        manual_detention : off
        mgmt_uri : https://sroback180306192122accme_sh3_1:8089
        mgmt_uri_alias : https://10.0.181.9:8089
        out_of_sync_node : 0
        preferred_captain : 1
        restart_required : 0
        splunk_version : 7.1.0
        status : Up
    sh2
        kvstore_status : maintenance
        label : sh2
        last_conf_replication : Wed Mar 7 05:30:09 2018
        manual_detention : off
        mgmt_uri : https://sroback180306192122accme_sh2_1:8089
        mgmt_uri_alias : https://10.0.181.4:8089
        out_of_sync_node : 0
        preferred_captain : 1
        restart_required : 0
        splunk_version : 7.1.0
        status : Up
    sh1
        kvstore_status : maintenance
        label : sh1
        last_conf_replication : Wed Mar 7 05:30:09 2018
        manual_detention : off
        mgmt_uri : https://sroback180306192122accme_sh1_1:8089
        mgmt_uri_alias : https://10.0.181.2:8089
        out_of_sync_node : 0
        preferred_captain : 1
        restart_required : 0
        splunk_version : 7.1.0
        status : Up
The output shows a stable, dynamically elected captain, enough members to support the replication factor, no out-of-sync nodes, and all members running a compatible Splunk Enterprise version (7.1.0 or higher). This indicates that the cluster is in a healthy state to perform a rolling upgrade.
For information on health check criteria, see Health check output details.
Or, send a GET request to the following endpoint to monitor cluster health:
/services/shcluster/status?advanced=1
For endpoint details, see shcluster/status in the REST API Reference Manual.
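The health criteria described in this step can also be checked in a script. The following is a minimal sketch, assuming that you have already collected the status fields into Python dictionaries with string values, as they appear in the example output. The function names and the dictionary layout are illustrative assumptions and are not part of Splunk Enterprise. Treating service_ready_flag as the indicator that enough members are present is also an assumption; confirm it against Health check output details.

```python
def parse_version(version):
    """Turn a version string such as '7.1.0' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def cluster_is_healthy(captain, members, minimum_version="7.1.0"):
    """Return (ok, problems) for a status report held in dictionaries.

    `captain` holds the captain fields and `members` is a list with one
    dictionary per member. This layout is an assumption of the sketch.
    """
    problems = []
    if captain.get("stable_captain") != "1":
        problems.append("captain is not stable")
    if captain.get("dynamic_captain") != "1":
        problems.append("captain was not dynamically elected")
    # Assumption: service_ready_flag reflects "enough members are present".
    if captain.get("service_ready_flag") != "1":
        problems.append("cluster service is not ready")

    required = parse_version(minimum_version)
    for member in members:
        label = member.get("label", "unknown")
        if member.get("status") != "Up":
            problems.append(f"{label}: status is not Up")
        if member.get("out_of_sync_node") != "0":
            problems.append(f"{label}: node is out of sync")
        if parse_version(member.get("splunk_version", "0")) < required:
            problems.append(f"{label}: version is below {minimum_version}")
    return (not problems, problems)
```

The function returns the list of problems rather than only a boolean so that an operator can see which criterion failed before deciding whether to proceed.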
2. Initialize rolling upgrade
To initialize the rolling upgrade, run the following CLI command on any cluster member:
splunk upgrade-init shcluster-members
Or, send a POST request to the following endpoint:
/services/shcluster/captain/control/control/upgrade-init
For endpoint details, see shcluster/captain/control/control/upgrade-init in the REST API Reference Manual.
3. Put a member into manual detention mode
Select a search head cluster member other than the captain and put that member into manual detention mode:
splunk edit shcluster-config -manual_detention on
Or, send a POST request to the following endpoint:
servicesNS/admin/search/shcluster/member/control/control/set_manual_detention \
-d manual_detention=on
For endpoint details, see shcluster/member/control/control/set_manual_detention in the REST API Reference Manual.
For more information on manual detention mode, see Put a search head into detention.
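If you script this step against the REST endpoint, the request can be assembled as follows. This is a minimal sketch that only builds the POST request and does not send it. The host name sh1.example.com and the helper name build_detention_request are hypothetical, and authentication and TLS settings are deliberately left out because they depend on your deployment.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint path as given in this step.
DETENTION_PATH = (
    "/servicesNS/admin/search/shcluster/member/control/control/set_manual_detention"
)


def build_detention_request(mgmt_uri, enabled):
    """Build, but do not send, the POST request that toggles manual detention."""
    body = urlencode({"manual_detention": "on" if enabled else "off"}).encode("ascii")
    return Request(mgmt_uri.rstrip("/") + DETENTION_PATH, data=body, method="POST")


# Hypothetical management URI, used for illustration only.
request = build_detention_request("https://sh1.example.com:8089", enabled=True)
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

To send the request, add the credentials that your environment requires and pass the object to urllib.request.urlopen. The same helper covers turning detention off by passing enabled=False.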
4. Confirm the member is ready for upgrade
Run the following command to confirm that all searches are complete:
splunk list shcluster-member-info | grep "active"
The following output indicates that all historical and real-time searches are complete:
active_historical_search_count:0
active_realtime_search_count:0
Or send a GET request to the following endpoint:
/services/shcluster/member/info
For endpoint details, see shcluster/member/info in the REST API Reference Manual.
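The check in this step lends itself to a polling loop. The following sketch parses lines in the format shown above and waits until both counters reach zero or a timeout passes. The callable fetch_output is hypothetical: it stands for whatever you use to obtain the command output as text, for example a subprocess call. The 180-second default mirrors the default search wait time mentioned earlier, and the 5-second poll interval is an arbitrary choice.

```python
import time


def active_search_counts(output):
    """Extract counters from lines such as 'active_historical_search_count:0'."""
    counts = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        key, value = key.strip(), value.strip()
        if sep and key.startswith("active_") and value.isdigit():
            counts[key] = int(value)
    return counts


def wait_until_idle(fetch_output, timeout_secs=180, poll_secs=5, sleep=time.sleep):
    """Poll until every active search counter is zero.

    Returns True when the member is idle and False when the timeout passes.
    `fetch_output` is a hypothetical callable that returns the output text.
    """
    waited = 0
    while True:
        counts = active_search_counts(fetch_output())
        if counts and all(count == 0 for count in counts.values()):
            return True
        if waited >= timeout_secs:
            return False
        sleep(poll_secs)
        waited += poll_secs
```

The sleep function is a parameter so that the loop can be exercised in tests without real delays.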
5. Upgrade the member
Upgrade the search head following the standard Splunk Enterprise upgrade procedure. See How to upgrade Splunk Enterprise in the Installation Manual.
6. Bring the member back online
- Run the following command on the cluster member:
splunk start
- Turn off manual detention mode:
splunk edit shcluster-config -manual_detention off
Or, send a POST request to the following endpoint:
servicesNS/admin/search/shcluster/member/control/control/set_manual_detention \
-d manual_detention=off
For endpoint details, see shcluster/member/control/control/set_manual_detention in the REST API Reference Manual.
7. Check cluster health status
After you bring the member back online, check that the cluster is in a healthy state.
Run the following command on the cluster member:
splunk show shcluster-status --verbose
Or, send a GET request to the following endpoint to monitor cluster health:
/services/shcluster/status?advanced=1
For endpoint details, see shcluster/status in the REST API Reference Manual.
For information on what determines a healthy search head cluster, see Health check output details.
8. Repeat steps 3-7 for all members
Repeat steps 3-7 until you have upgraded all cluster members.
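The per-member loop over steps 3-7 can be sketched as follows. This is an outline under stated assumptions, not the logic of the bundled template script. The steps object and its method names (enter_detention, wait_until_idle, upgrade, bring_online, cluster_is_healthy) are hypothetical placeholders for your own implementations of each step. The order of members, including when the captain is handled, is left to the caller. Stopping at the first failed check is a conservative design choice of this sketch.

```python
def rolling_upgrade(members, steps):
    """Apply steps 3-7 to each member in order, one member at a time.

    `steps` is a hypothetical object that implements each step. Returns the
    list of members that completed all steps successfully.
    """
    upgraded = []
    for member in members:
        steps.enter_detention(member)             # step 3
        if not steps.wait_until_idle(member):     # step 4
            # Searches are still active; stop so an operator can decide.
            break
        steps.upgrade(member)                     # step 5
        steps.bring_online(member)                # step 6: start, detention off
        if not steps.cluster_is_healthy(member):  # step 7
            # Do not touch further members while the cluster is unhealthy.
            break
        upgraded.append(member)
    return upgraded
```

Comparing the returned list with the input list shows whether the loop ran to completion and, if not, where it stopped.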
9. Upgrade the deployer
Upgrade the deployer immediately after you upgrade the cluster members. The deployer must run the same version as the cluster members, down to the minor level. For example, if members are running 7.1.1, the deployer must run 7.1.x.
To upgrade the deployer, do the following:
- Stop the deployer.
- Upgrade the deployer, following the standard Splunk Enterprise upgrade procedure. See How to upgrade Splunk Enterprise in the Installation Manual.
- Start the deployer.
For more information on the deployer, see Deployer requirements.
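The version rule for the deployer can be expressed as a small check. This is an illustrative sketch; the function name same_minor_release is an assumption, and the version strings are expected in the dotted form used throughout this topic.

```python
def same_minor_release(member_version, deployer_version):
    """Return True when both versions share the same major and minor numbers."""
    return member_version.split(".")[:2] == deployer_version.split(".")[:2]


print(same_minor_release("7.1.1", "7.1.4"))  # True: both are 7.1.x
print(same_minor_release("7.1.1", "7.2.0"))  # False: minor levels differ
```

Only the first two components are compared, so differences at the maintenance level do not affect the result.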
10. Finalize the rolling upgrade
Run the following CLI command on any search head cluster member:
splunk upgrade-finalize shcluster-members
Or, send a POST request to the following endpoint:
/services/shcluster/captain/control/control/upgrade-finalize
For endpoint details, see shcluster/captain/control/control/upgrade-finalize in the REST API Reference Manual.
Example upgrade automation script
Version 7.1.0 and higher includes an example automation script (shc_upgrade_template.py) that you can use as the basis for automating the search head cluster rolling upgrade process. Modify this template script based on your deployment.
shc_upgrade_template.py is located in SPLUNK_HOME/bin and includes detailed usage and workflow information.
shc_upgrade_template.py is an example script only. Do not apply the script to a production instance without editing it to suit your environment and testing it extensively.