Anatomy of a search

About the search pipeline

The "search pipeline" refers to the structure of a Splunk search, in which consecutive commands are chained together using a pipe character, "|". The pipe character tells Splunk software to use the output or result of one command (to the left of the pipe) as the input for the next command (to the right of the pipe). This enables you to refine or enhance the data at each step along the pipeline until you get the results that you want.

A Splunk search starts with search terms at the beginning of the pipeline. These search terms are keywords, phrases, Boolean expressions, key/value pairs, and so on, that specify which events you want to retrieve from one or more indexes. See "About retrieving events".

The retrieved events can then be passed as inputs into a search command using a pipe character. Search commands tell Splunk software what to do with the events after they are retrieved from the indexes. For example, you might use commands to filter unwanted information, extract more information, evaluate new fields, calculate statistics, reorder your results, or create a chart. Some commands have functions and arguments associated with them. These functions and their arguments enable you to specify how the commands act on your results and which fields to act on; for example, how to create a chart, what kind of statistics to calculate, and what fields to evaluate. Some commands also enable you to use clauses to specify how you want to group your search results.
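
As a rough illustration of this structure, the following search chains three stages together. The sourcetype value and the user field are assumptions chosen for the example and might not exist in your data.

```spl
sourcetype=syslog ERROR
| top limit=5 user
| fields - percent
```

The search terms before the first pipe retrieve the events. The top command then finds the five most common values of user, and the fields command removes the percent column that top adds to its results.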

Quotes and escaping characters

Generally, you need quotes around phrases and field values that include white spaces, commas, pipes, quotes, or brackets. Quotes must be balanced: an opening quote must be followed by an unescaped closing quote. For example:

  • A search such as error | stats count will find the number of events containing the string error.
  • A search such as ... | search "error | stats count" would return the raw events containing the literal string error, a pipe character ( | ), stats, and count, in that order.

Additionally, use quotes around keywords and phrases when you want to search for them literally instead of with their default meaning, as is the case with Boolean operators and field-value pairs. For example:

  • A search for the keyword AND without meaning the Boolean operator: error "AND"
  • A search for this field-value phrase: error "startswith=Error"

The backslash character ( \ ) is used to escape quotes, pipes, and itself. Backslash escape sequences are still expanded inside quotes. For example:

  • The sequence \| as part of a search sends a literal pipe character to the command, instead of the pipe being treated as a separator between commands.
  • The sequence \" sends a literal quote to the command. Use it, for example, to search for a literal quotation mark or to insert a literal quotation mark into a field using rex.
  • The sequence \\ is available as a literal backslash in the command.

If Splunk software does not recognize a backslash sequence, it will not alter it.

  • For example, \s in a search string will be available as \s to the command, because \s is not a known escape sequence.
  • However, in the search string, \\s will be available as \s to the command, because \\ is a known escape sequence that is converted to \.

You cannot search for an asterisk ( * ) by escaping it with a backslash. Splunk software treats the asterisk character as a major breaker, so the character is never written to the index. If you want to search for the asterisk character, you have to run a post-filtering regex search on your data.
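
A minimal sketch of such a post-filtering search follows. It assumes the events of interest are in the _internal index; substitute your own search terms before the pipe.

```spl
index=_internal
| regex _raw="\*"
```

Because \* is not a backslash sequence that the search parser recognizes, it should reach the regex command unchanged, where it matches a literal asterisk in the raw event text.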

For more information about major breakers, read "Overview of event processing" in the Getting Data In manual.

Examples

Example 1: The myfield field is created with the value of 6.
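
One search that fits this description, assuming eval is the command that creates the field and using makeresults only to supply a single placeholder result:

```spl
| makeresults
| eval myfield="6"
```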

Example 2: The myfield field is created with the value of ".
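
One search that fits this description, assuming eval creates the field and makeresults supplies a placeholder result. The \" sequence inside the quotes is the escaped quotation mark:

```spl
| makeresults
| eval myfield="\""
```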

Example 3: The myfield field is created with the value of \.
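
One search that fits this description, assuming eval creates the field and makeresults supplies a placeholder result. The \\ sequence inside the quotes is the escaped backslash:

```spl
| makeresults
| eval myfield="\\"
```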

Example 4: This search would produce an error because of unbalanced quotation marks.
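
One search that fits this description, assuming eval is the command in use and makeresults supplies a placeholder result. The \" sequence is read as an escaped quotation mark, so the opening quote is never closed:

```spl
| makeresults
| eval myfield="\"
```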

Fields

Events and results flowing through the Splunk search pipeline exist as a collection of fields. Fields can come directly from the Splunk index, for example, _time as the time of the event, source as the filename, and so on. Fields can also be derived from a wide variety of sources at search time, such as eventtypes, tags, regex extractions using the rex command, totals coming from the stats command, and so on.

For a given event, a given field name might be present or absent. If present, it might contain a single value or multiple values. Each value is a text string. Values might be of positive length (a string, or text) or zero length (empty strings, or "").

Numbers are stored as strings that contain the digits of the number. For example, a field containing a value of the number 10 contains the characters 1 and 0: "10". Commands that take numbers from values automatically convert them internally to numbers for calculations.
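
As a small sketch of this conversion, the following search stores the characters 10 in a field and then sums that field. The field names value and total are illustrative, and makeresults is used only to supply a placeholder result.

```spl
| makeresults
| eval value="10"
| stats sum(value) AS total
```

Here the stats command is expected to treat the string "10" as the number 10 when it calculates the sum.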

Null field

A null field is not present on a particular result or event. Other events or results in the same search might have values for this field. For example, the fillnull command adds a field and default value to events or results that lack fields present on other events or results in the search.

Empty field

An empty field is shorthand for a field that contains a single value that is the empty string.

Empty value

A value that is the empty string, or "". You can also describe this as a zero-length string.
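
The difference between a null field and an empty field can be sketched with a search like the following. The field names n, note, and state are hypothetical, and makeresults is used only to generate two placeholder results.

```spl
| makeresults count=2
| streamstats count AS n
| eval note=if(n==1, "", null())
| eval state=if(isnull(note), "null", "empty")
| fillnull value="N/A" note
```

In this sketch, the first result is given a note field that holds the empty string, while the second result has no note field at all. The state field should therefore read empty on the first result and null on the second, and fillnull should add a value only to the second result, because an empty string is still a value.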

Multivalue field

A field that contains more than one value. For example, events such as email logs often have multivalue fields in the To: and Cc: information. See "Manipulate and evaluate fields with multiple values" in the Search Manual.
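
As a brief sketch, the following search builds a multivalue field from a comma-separated string and counts its values. The field names to and recipient_count and the addresses are made up for the example.

```spl
| makeresults
| eval to="alice@example.com,bob@example.com"
| makemv delim="," to
| eval recipient_count=mvcount(to)
```

The makemv command splits the single string into two values in the same field, and the mvcount function returns the number of values the field then holds.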