Use subsearch to correlate events

A subsearch is a search enclosed in square brackets that runs first; its results become part of the criteria for the outer search. You can use subsearches to correlate data and evaluate events in the context of the whole event set, including data across different indexes or Splunk Enterprise servers in a distributed environment.

For example, say you have two or more indexes for different application logs, and the event data from these logs shares at least one common field. You can use the values of this field to find events in one data set whose value for that field does not appear in the other:

sourcetype=some_sourcetype NOT [search sourcetype=another_sourcetype | fields field_val]

That search is similar to the SQL "NOT IN" functionality:


SELECT *
FROM some_table
WHERE field_value NOT IN (SELECT field_value FROM another_table)
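
The same filtering idea can be sketched outside of Splunk. The following Python snippet is only an illustration of the "NOT subsearch" logic, not a description of how Splunk evaluates it. The event lists, the `field_val` values, and the `not_in_subsearch` helper are hypothetical and exist only for this sketch.

```python
# Hypothetical sample events; in Splunk these would come from two sourcetypes.
some_events = [
    {"sourcetype": "some_sourcetype", "field_val": "a"},
    {"sourcetype": "some_sourcetype", "field_val": "b"},
    {"sourcetype": "some_sourcetype", "field_val": "c"},
]
another_events = [
    {"sourcetype": "another_sourcetype", "field_val": "b"},
    {"sourcetype": "another_sourcetype", "field_val": "d"},
]


def not_in_subsearch(outer, inner, field):
    """Keep outer events whose `field` value never appears in inner."""
    # "Subsearch" step: collect the values of the shared field first.
    excluded = {e[field] for e in inner if field in e}
    # "Main search" step: drop every event that matches one of those values.
    return [e for e in outer if e.get(field) not in excluded]


unmatched = not_in_subsearch(some_events, another_events, "field_val")
print([e["field_val"] for e in unmatched])
```

The inner collection step runs to completion before the outer filter starts, which mirrors the order in which a subsearch and its main search run.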

Example

To identify the IP address of the Buttercup Games customer with the most purchases, you could run the following search:

sourcetype=access_* status=200 action=purchase
| top limit=1 clientip

Then, you could search the customer's purchase history by running the following search on the customer's IP address, which is 87.194.216.51:

sourcetype=access_* status=200 action=purchase clientip=87.194.216.51
| stats count, distinct_count(productId), values(productId) by clientip

But what if someone else is the top customer the next time you run this search? You would have to run the first search again to find the new top customer's IP address, and then rewrite the second search with that address. Instead, you can use a subsearch to find the top customer's IP address and pass it to the main search every time the search runs:

sourcetype=access_* status=200 action=purchase
    [ search sourcetype=access_* status=200 action=purchase
    | top limit=1 clientip
    | table clientip ]
| stats count, distinct_count(productId), values(productId) by clientip

The search results look something like this:


clientip	count	distinct_count(productId)	values(productId)
87.194.216.51	134	14	BS-AG-G09 CU-PG-G06 DB-SG-G01 DC-SG-G02 FI-AG-G08 FS-SG-G03 MB-AG-G07 MB-AG-T01 PZ-SG-G05 SC-MG-G10 WC-SH-A01 WC-SH-A02 WC-SH-G04 WC-SH-T02
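
To make the two-step logic concrete, here is a rough Python sketch of what the combined search computes: an inner step finds the most frequent `clientip`, and an outer step summarizes that client's purchases. The events are invented sample data (including the second IP address), and the function names are hypothetical. This is an analogy under those assumptions, not a description of Splunk internals.

```python
from collections import Counter

# Invented purchase events, assumed to be already filtered to
# status=200 action=purchase.
events = [
    {"clientip": "87.194.216.51", "productId": "WC-SH-G04"},
    {"clientip": "87.194.216.51", "productId": "DB-SG-G01"},
    {"clientip": "87.194.216.51", "productId": "WC-SH-G04"},
    {"clientip": "10.2.1.44", "productId": "DC-SG-G02"},
]


def top_clientip(evts):
    """Inner step, comparable to `top limit=1 clientip | table clientip`."""
    return Counter(e["clientip"] for e in evts).most_common(1)[0][0]


def purchase_stats(evts, clientip):
    """Outer step, comparable to the `stats ... by clientip` clause."""
    products = [e["productId"] for e in evts if e["clientip"] == clientip]
    return {
        "clientip": clientip,
        "count": len(products),
        "distinct_count(productId)": len(set(products)),
        "values(productId)": sorted(set(products)),
    }


# The inner result feeds the outer step, so nothing is hard-coded.
result = purchase_stats(events, top_clientip(events))
print(result)
```

Because the outer step receives whatever IP address the inner step returns, the summary follows the top customer automatically when the data changes.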