rex command: Overview, syntax, and usage
Use the SPL2 rex command to either extract fields using regular expression named groups, or replace or substitute characters in a field using sed expressions.
The rex command matches the value of the specified field against the unanchored regular expression and extracts the named groups into fields of the corresponding names.
When mode=sed, the sed expression is applied to the value of the chosen field to replace or substitute characters. This sed syntax is also used to mask sensitive data at index time.
If you do not specify a field, the rex command runs against the _raw field. Running the rex command against the _raw field might have a performance impact.
Use the rex command for search-time field extraction or string replacement and character substitution.
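As a rough illustration of the extraction behavior described above, the following Python sketch mimics it with the standard re module. It is not how rex is implemented: Python's re module is not PCRE and requires the (?P<name>...) spelling for named groups, and the function name rex_extract and the sample event are hypothetical.

```python
import re

def rex_extract(event, pattern, field="_raw"):
    """Return a copy of event with the pattern's named groups added as fields."""
    result = dict(event)
    # re.search is unanchored: the pattern may match anywhere in the value.
    match = re.search(pattern, event.get(field, ""))
    if match:
        # Only named groups become fields; groups that did not participate are skipped.
        result.update({k: v for k, v in match.groupdict().items() if v is not None})
    return result

# Hypothetical event used only for illustration.
event = {"_raw": "Mon Mar 19 From: alice@example.com To: bob@example.com"}
extracted = rex_extract(event, r"From: (?P<sender>\S+) To: (?P<recipient>\S+)")
print(extracted["sender"], extracted["recipient"])
```

The field parameter defaults to _raw to mirror the default documented for the field argument. An event that does not match is returned unchanged.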
Syntax
The required syntax is in bold.
rex
[field=<field>] [max_match=<int>] [offset_field=<string>]
( <regex-expression> | mode=sed <sed-expression> )
Required arguments
You must specify either <regex-expression> or mode=sed <sed-expression> when you use the rex command.
regex-expression
Syntax: <string>
Description: A regular expression in Perl-compatible regular expressions (PCRE) format that defines the information to match and extract from the specified field. Quotation marks are required.
mode
Syntax: mode=sed
Description: Specify mode=sed to indicate that you are using a sed (UNIX stream editor) expression.
sed-expression
Syntax: <string>
Description: When mode=sed, specify whether to replace strings (s) or substitute characters (y) in the matching regular expression. No other sed commands are implemented. Quotation marks are required. Sed mode supports the following flags: global (g) and Nth occurrence (N), where N is a number that is the character location in the string.
Optional arguments
field
Syntax: field=<field>
Description: The field that you want to extract information from.
Default: _raw
max_match
Syntax: max_match=<int>
Description: Controls the number of times the regular expression is matched. If greater than 1, the resulting fields are multivalued fields. You can use 0 for unlimited matches.
Default: 1
offset_field
Syntax: offset_field=<string>
Description: If provided, a field is created with the name specified by <string>. The value of this field is the endpoints of the match, expressed as zero-based character offsets into the matched field. For example, if the rex expression is (?<tenchars>.{10}), this matches the first ten characters of the field, and the offset_field value is 0-9.
Default: None
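To make the max_match and offset_field descriptions concrete, here is a minimal Python sketch of the same idea using the standard re module. It is an approximation only: the helper name rex_multi is hypothetical, it always returns lists even for a single match, Python needs the (?P<name>...) spelling for named groups, and the exact offset format that rex produces may differ from the plain start-end strings used here.

```python
import re

def rex_multi(value, pattern, max_match=1):
    """Collect named-group values and inclusive zero-based match offsets."""
    fields, offsets = {}, []
    for count, match in enumerate(re.finditer(pattern, value), start=1):
        for name, text in match.groupdict().items():
            if text is not None:
                fields.setdefault(name, []).append(text)
        # match.end() is exclusive, so subtract 1 for an inclusive endpoint.
        offsets.append(f"{match.start()}-{match.end() - 1}")
        if max_match and count >= max_match:  # max_match=0 means unlimited
            break
    return fields, offsets

# The ten-character example from the offset_field description.
print(rex_multi("abcdefghijklmnop", r"(?P<tenchars>.{10})"))
# Unlimited matches produce a multivalued result.
print(rex_multi("id=7 id=42 id=365", r"id=(?P<id>\d+)", max_match=0))
```

With the default max_match=1 the loop stops after the first match, which mirrors the documented default.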
Usage
SPL2 supports Perl-compatible regular expressions (PCRE).
Pipe characters
A pipe character ( | ) is used in regular expressions to specify an OR condition. For example, A or B is expressed as A | B.
Because pipe characters are used to separate commands in SPL2, you must enclose a regular expression that uses the pipe character in double quotation marks. For example:
...| rex "expression | with pipe"
This is interpreted by SPL2 as a search for the text "expression" OR "with pipe".
Escaping characters with backslashes
The backslash ( \ ) character is used to ignore, or escape, most special characters in regular expressions.
Character classes and string expressions
Regular expressions that include a character class, such as \d or \w, can be specified using one of two methods. The following table describes the methods and shows an example:
Description | Example |
---|---|
Enclose the string expression in quotation marks and escape the backslash character in the character class. | |
Enclose the string expression in forward ( / ) slashes. You don't need to escape the backslash character in the character class. | |
Period characters
The period ( . ) character is used in a regular expression to match any character, except a line break character. If you want to match a period character, you must escape the period character by specifying \. in your regular expression.
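The difference between an escaped and an unescaped period can be checked quickly with any regular expression engine. The sketch below uses Python's re module as a stand-in; it is not PCRE, but it treats the period the same way in this case. The sample strings are made up for illustration.

```python
import re

# An unescaped period matches any character except a line break.
any_char = re.search(r"10.0", "1000") is not None       # the period matches "0"
line_break = re.search(r"10.0", "10\n0") is not None     # the period cannot match "\n"

# An escaped period matches only a literal period.
escaped_miss = re.search(r"10\.0", "1000") is not None
escaped_hit = re.search(r"10\.0", "10.0") is not None

print(any_char, line_break, escaped_miss, escaped_hit)
```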
Asterisk characters
The asterisk ( * ) character is a reserved character in SPL2 and can't be escaped. SPL2 uses the asterisk as a wildcard character.
Double backslash characters
When a search includes a regular expression that contains a double backslash, for example to represent a file path like c:\\temp, the search interprets the first backslash as an escape character. The file path is interpreted as c:\temp. One of the backslashes is removed.
You must escape both backslash characters in a file path by specifying 4 consecutive backslashes for the root portion of the file path. For example: c:\\\\temp. For a longer file path, such as c:\\temp\example, you would specify c:\\\\temp\\example in your regular expression.
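The two layers of escaping can be reproduced outside SPL2. The following Python sketch is an analogy only, not SPL2 itself: ordinary Python string literals also consume one backslash of each pair before the regular expression engine sees the pattern, so the same count of four backslashes is needed to match one literal backslash. The variable names are hypothetical.

```python
import re

path = "c:\\temp"     # the actual text is c:\temp (one backslash)

# Two backslashes: the string layer leaves c:\temp, and the regex layer
# then reads \t as a tab character, so the pattern does not match the path.
two = "c:\\temp"

# Four backslashes: the string layer leaves c:\\temp, and the regex layer
# reads \\ as one literal backslash.
four = "c:\\\\temp"

print(len(four))                          # characters left after the string layer
print(re.search(two, path))               # no match
print(re.search(four, path) is not None)  # match

# A raw string literal skips the string layer, so only the regex layer remains.
print(four == r"c:\\temp")
```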
Sed expression
When using the rex command in sed mode, you have two options: replace (s) or character substitution (y).
The syntax for using sed to replace (s) text in your data is: s/<regex>/<replacement>/<flags>
- <regex> is a PCRE regular expression in searches and in pipelines, which can include capturing groups.
- <replacement> is a string to replace the regex match. Use \n for back references, where "n" is a single digit.
- <flags> can be either g to replace all matches, or a number to replace a specified match.
The syntax for using sed to substitute characters is: y/<string1>/<string2>/
- This substitutes the characters that match <string1> with the characters in <string2>.
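The two sed forms above can be approximated in Python to see what each one does to a value. This is a sketch under stated assumptions, not the rex implementation: the helper names sed_substitute and sed_transliterate are hypothetical, a numeric flag is assumed to mean "replace only the Nth match", an absent flag is assumed to mean "replace only the first match", and Python's re module stands in for PCRE.

```python
import re

def sed_substitute(value, regex, replacement, flag=""):
    """Emulate s/<regex>/<replacement>/<flags> for flags "", "g", or a number N."""
    if flag == "g":
        return re.sub(regex, replacement, value)   # replace every match
    target = int(flag) if flag else 1              # assumption: no flag means first match only
    seen = 0

    def replace_nth(match):
        nonlocal seen
        seen += 1
        # expand() resolves back references such as \1 in the replacement.
        return match.expand(replacement) if seen == target else match.group(0)

    return re.sub(regex, replace_nth, value)

def sed_transliterate(value, string1, string2):
    """Emulate y/<string1>/<string2>/ by mapping characters position by position."""
    return value.translate(str.maketrans(string1, string2))

card = "1234-5678-9012-3456"   # made-up sample value
masked = sed_substitute(card, r"(\d{4}-){3}", "XXXX-XXXX-XXXX-", "g")
swapped = sed_substitute("a1 b2 c3", r"([a-z])(\d)", r"\2\1", "2")
upper = sed_transliterate("abc-cab", "abc", "ABC")
print(masked, swapped, upper)
```

The first call masks all but the last four digits, the second swaps the two captured groups in the second match only, and the third maps each character of the first string to the character at the same position in the second string.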
Differences between SPL and SPL2
The differences between the SPL and SPL2 rex command are described in these sections.
Support for raw string literals
SPL2 supports raw string literals.
Options must be specified before the expressions
The field option must be specified before the <regex-expression> or <sed-expression> argument.
Version | Example |
---|---|
SPL | ...rex "From: (?<from>.*) To: (?<to>.*)" field=myfield |
SPL2 | ...rex field=myfield "From: (?<from>.*) To: (?<to>.*)" |
The max_match and offset_field options must be specified before the <regex-expression> argument.
Version | Example |
---|---|
SPL | ...rex "From: (?<from>.*) To: (?<to>.*)" max_match=10 offset_field=newofield |
SPL2 | ...rex max_match=10 offset_field=newofield "From: (?<from>.*) To: (?<to>.*)" |
See also
rex command
Related information
About Splunk regular expressions in the SPL2 Search Manual
SPL2 and regular expressions in the SPL2 Search Manual