About distributed search

Use cases

These are some of the key use cases for distributed search:

  • Horizontal scaling for enhanced performance. Distributed search lets you spread the indexing and searching loads across multiple Splunk Enterprise instances, making it possible to index and search large quantities of data.
  • Access control. You can use distributed search to control access to indexed data. For example, some users, such as security personnel, might need access to data across the enterprise, while others need access to data only in their functional area.
  • Managing geo-dispersed data. Distributed search allows local offices to access their own data, while maintaining centralized access at the corporate level. For example, users in Chicago and San Francisco can search just their local data, while users at headquarters in New York can search their own local data, as well as the data in Chicago and San Francisco.

Distributed search components

With distributed search, a Splunk Enterprise instance called a search head sends search requests to a group of indexers, or search peers, which perform the actual searches on their indexes. The search head then merges the results and returns them to the user. Here is a basic distributed search scenario, with one search head managing searches across several indexers:

[Figure: horizontal scaling, with one search head managing searches across several indexers]
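
The request flow between a search head and its search peers can be illustrated with a small scatter-gather sketch. This is a conceptual model only, not Splunk Enterprise code: the peer names, the in-memory PEER_INDEXES data, and the search_peer and search_head functions are all hypothetical. Each peer searches only its own events, and the search head merges the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "indexes", one list of events per search peer.
PEER_INDEXES = {
    "peer_a": ["error disk full", "login ok", "error timeout"],
    "peer_b": ["login ok", "error timeout"],
    "peer_c": ["error disk full", "error disk full"],
}


def search_peer(peer, term):
    """Search one peer's local events and return partial counts per event."""
    return Counter(event for event in PEER_INDEXES[peer] if term in event)


def search_head(term):
    """Send the search to every peer in parallel, then merge the partials."""
    merged = Counter()
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda peer: search_peer(peer, term), PEER_INDEXES)
        for partial in partials:
            merged.update(partial)  # adds counts for matching keys
    return merged


result = search_head("error")
```

In this sketch, no peer ever sees another peer's events. Only the partial counts travel back to the search head, which is what lets the searching load scale with the number of peers.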

Types of distributed search

Independent search heads

A small distributed search deployment has one independent search head; that is, a search head that is not part of a cluster.

To scale beyond a single search head, deploy a search head cluster.

Search head clusters

A search head cluster is a group of search heads that work together to provide scalability and high availability. It serves as a central resource for searching across a set of search peers.

The search heads in a cluster are, for most purposes, interchangeable. All search heads have access to the same set of search peers. They can also run or access the same searches, dashboards, knowledge objects, and so on.

A search head cluster is the recommended topology when you need to run multiple search heads across the same set of search peers. The cluster coordinates the activity of the search heads, allocates jobs based on the current loads, and ensures that all the search heads have access to the same set of knowledge objects.

See "About search head clustering."

Indexer clusters and search heads

Indexer clusters also use search heads to search across the set of indexers, or peer nodes. The search heads in an indexer cluster can be either independent search heads or members of a search head cluster.

You deploy and configure search heads very differently when they are part of an indexer cluster.

Parallel reduce search processing

If you struggle with extremely large high-cardinality searches, you might be able to apply parallel reduce processing to them to help them complete faster. You must have a distributed search environment to use parallel reduce search processing.

High-cardinality searches are searches that must match, filter, and aggregate fields with extremely large numbers of unique values. During a parallel reduce search process, some or all of a high-cardinality search job is processed in parallel by indexers that have been configured to behave as intermediate reducers for the purposes of the search. This parallelization of reduction work that otherwise would be done entirely by the search head can result in faster completion times for high-cardinality searches.
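
To make the idea of intermediate reducers concrete, here is a minimal sketch of a two-stage aggregation. It is an illustration under stated assumptions, not a description of how Splunk Enterprise implements parallel reduce: the peer names, the sample events, the choice of hash partitioning with zlib.crc32, and every function name are hypothetical. Each key (for example, a user ID) is assigned to exactly one reducer, so the reducers can aggregate their shares independently and the final step only has to combine disjoint results.

```python
import zlib
from collections import Counter

# Hypothetical events held by each search peer: one user ID per event.
PEER_EVENTS = {
    "peer_a": ["u1", "u2", "u3", "u1"],
    "peer_b": ["u2", "u4", "u1"],
    "peer_c": ["u5", "u3", "u3"],
}

# Assumption: two of the peers also act as intermediate reducers.
REDUCERS = ["peer_a", "peer_b"]


def map_phase(events):
    """Each peer counts the keys in its own events."""
    return Counter(events)


def partition(key, reducer_count):
    """Assign a key to one reducer using a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % reducer_count


def parallel_reduce(peer_events, reducers):
    """Route partial counts to reducers by key, then combine the results."""
    buckets = [Counter() for _ in reducers]
    for events in peer_events.values():
        for key, count in map_phase(events).items():
            buckets[partition(key, len(reducers))][key] += count
    # The buckets hold disjoint key sets, so the final step does no
    # per-key merging across reducers; it only combines the buckets.
    final = Counter()
    for bucket in buckets:
        final.update(bucket)
    return final, buckets


final, buckets = parallel_reduce(PEER_EVENTS, REDUCERS)
```

The sketch runs the reducers sequentially for simplicity. The point it illustrates is that the per-key aggregation work is divided among the reducers rather than being done entirely in the final step.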

If you want to take advantage of parallel reduce search processing, your indexers should be operating with a light to medium load on average. You can use parallel reduce search processing whether or not your indexers are clustered.

See "Overview of parallel reduce search processing."