lookup command: Overview, syntax, and usage
The SPL2 lookup command enriches your source data with related information that is in a lookup dataset. Field-value pairs in your source data are matched with field-value pairs in a lookup dataset. You can either append to or replace the values in the source data with the values in the lookup dataset.
Use these links to quickly navigate to the main sections in this topic:
How the SPL2 lookup command works
The following lookup dataset, named products, contains product information and prices for a set of board and card games. For example:
| product_id | product_name | price |
|---|---|---|
| DB-SG-G01 | Mediocre Kingdoms | 24.99 |
| DC-SG-G02 | Dream Crusher | 39.99 |
| FS-SG-G03 | Final Sequel | 24.99 |
| WC-SH-G04 | World of Cheese | 24.99 |
The events contain the field productID. A search was run to summarize the total number of purchase transactions, the total number of products purchased, and the product IDs. The results are organized by ipaddress.
This is a sample of the search results. The products are identified by the productID.
| ipaddress | total_purchases | total_products | productID |
|---|---|---|---|
| 107.3.146.207 | 72 | 3 | DB-SG-G01
FS-SG-G03 WC-SH-G04 |
| 128.241.220.82 | 95 | 2 | DB-SG-G01
DC-SG-G02 |
| 194.215.205.19 | 60 | 4 | DB-SG-G01
DC-SG-G02 FS-SG-G03 WC-SH-G04 |
| 211.166.11.101 | 91 | 2 | DB-SG-G01
WC-SH-G04 |
| 87.194.216.51 | 134 | 3 | DC-SG-G02
FS-SG-G03 WC-SH-G04 |
You can use the lookup command to lookup the product_id in the products dataset, match that with the productID in the events, and return the product_name.
...| lookup products product_id AS productID OUTPUT product_name
The results look like this:
| ipaddress | total_purchases | total_products | productID | product_name |
|---|---|---|---|---|
| 107.3.146.207 | 72 | 3 | DB-SG-G01
FS-SG-G03 WC-SH-G04 |
Dream Crusher
Final Sequel World of Cheese |
| 128.241.220.82 | 95 | 2 | DB-SG-G01
DC-SG-G02 |
Mediocre Kingdoms
Dream Crusher |
| 194.215.205.19 | 60 | 4 | DB-SG-G01
DC-SG-G02 FS-SG-G03 WC-SH-G04 |
Mediocre Kingdoms
Dream Crusher Final Sequel World of Cheese |
| 211.166.11.101 | 91 | 2 | DB-SG-G01
WC-SH-G04 |
Mediocre Kingdoms
World of Cheese |
| 87.194.216.51 | 134 | 3 | DC-SG-G02
FS-SG-G03 WC-SH-G04 |
Dream Crusher
Final Sequel World of Cheese |
Syntax
The SPL2 lookup command supports different syntaxes in different product contexts.
Syntax for searches
The required syntax is in bold.
lookup lookup-dataset (lookup-field [AS event-field] )...
[ (OUTPUT | OUTPUTNEW) ( lookup-destfield [AS event-destfield] )...]
Syntax for pipelines
The required syntax is in bold.
lookup [casesensitive=boolean] [maxmatches=int] [minmatches=int] [defaultmatch="string"]
lookup-dataset (lookup-field [AS event-field] )...
[ (OUTPUT | OUTPUTNEW) ( lookup-destfield [AS event-destfield] )...]
Required arguments
lookup-dataset
Syntax: string
Description: The name of the lookup table that is defined in your datasets.
lookup-field
Syntax: [match-type-function] string
Description: A field in the lookup dataset to match against the events. You can specify multiple lookup-field values.
Searches: This argument must specify a field name.
Pipelines: This argument can specify a field name or a match type function that contains a field name. The following match type functions are supported:
| Function | Description |
|---|---|
cidr() |
Events and lookup entries are matched using CIDR matching. A match occurs if the event value is an IP address that belongs to the CIDR block indicated by the lookup value. |
exact() |
Events and lookup entries are matched only if both field values are exactly the same. Case-insensitive matches are allowed if the |
wildcard() |
Events and lookup entries are matched using regular expressions. A match occurs if the event value matches the regular expression in the lookup field. |
If you do not specify a match type function, exact() is used by default.
Optional arguments
casesensitive
Syntax: casesensitive=boolean
Description: Supported only in Edge Processor and Ingest Processor pipelines. Specifies whether the values in the lookup dataset and the event must use the same letter casing in order to be matched.
Default: false
defaultmatch
Syntax: defaultmatch="string"
Description: Supported only in Edge Processor and Ingest Processor pipelines. The string value to output when the number of lookup matches for an event is less than the minmatches value. For an example that demonstrates how this argument works, see "Using regular expression matching and enforcing a minimum number of matches" in lookup command: Examples.
Default: None
event-destfield
Syntax: AS string
Description: A field in the events. You can specify multiple event-destfield values. If the name of the event-destfield is the same as the lookup-destfield, you don't need to specify the event-destfield. The name of the lookup-destfield is used. This argument is used with OUTPUT | OUTPUTNEW to replace or append field values.
Default: The value of lookup-destfield.
event-field
Syntax: AS string
Description: A field in the incoming events to match with a field in the lookup-dataset. You can specify multiple event-field values. You don't need to specify the event-field if the name of the event-field is the same as the name of the lookup-field.
Searches: This argument can specify a field name.
Pipelines: This argument can specify a field name, or an expression that evaluates to a value to be matched against the values in the lookup-field.
Default: The name specified in the lookup-field argument.
lookup-destfield
Syntax: string
Description: A field in the lookup table to be applied to the events. You can specify multiple lookup-destfield values. This argument is used with OUTPUT | OUTPUTNEW to replace or append field values.
Searches: This argument can specify a field name.
Pipelines: This argument can specify a field name, or an expression that evaluates to a value to be output in the event-destfield.
Default: All fields are applied to the events if no fields are specified.
maxmatches
Syntax: maxmatches=int
Description: Supported only in Edge Processor and Ingest Processor pipelines. The maximum number of lookup matches to return for each event. You can set this argument to an integer between 1 and 1000, inclusive.
Default: 100
minmatches
Syntax: minmatches=int
Description: Supported only in Edge Processor and Ingest Processor pipelines. The minimum number of lookup matches to return for each event. When the number of valid matches is less than the minmatches value, the pipeline outputs the value specified in the defaultmatch argument to make up the difference. For an example that demonstrates how this argument works, see "Using regular expression matching and enforcing a minimum number of matches" in lookup command: Examples.
You can set this argument to any non-negative integer that is less than the maxmatches value. When this argument is set to 0, the command does not require a minimum number of lookup matches.
Default: 0
OUTPUT | OUTPUTNEW
Syntax: OUTPUT | OUTPUTNEW
Description: Specifies whether to replace or append values from the lookup dataset to the events. OUTPUT replaces values in existing event fields with values from the lookup dataset. Where there is no value in a field, OUTPUT adds values from the lookup dataset to the event fields. OUTPUTNEW appends fields and values from the lookup dataset to the events. If the events already have the fields specified in lookup-field, the OUTPUTNEW argument only fills in missing values in those fields. OUTPUT and OUTPUTNEW must be specified in uppercase.
Default: OUTPUT
Usage
If an OUTPUT or OUTPUTNEW clause is not specified, all of the fields in the lookup table that are not the match field are used as output fields.
If the OUTPUT clause is specified, the output lookup fields overwrite existing fields with the same name.
If the OUTPUTNEW clause is specified, the lookup is not performed for events in which the output fields already exist.
Optimization
Whenever possible, perform lookups after transforming commands like stats and timechart.
A transforming command acts like a filter. Running the transforming command before the lookup can minimize the work that the lookup command must do, if the field needed for the lookup is retained by the transforming command.
Here's an example of an optimized search. The transforming command stats is before the lookup command. The stats command retains the status field, which is the field needed for the lookup.
from <dataset> where sourcetype=access_* | stats count() by status | lookup status_desc status OUTPUT description
Here's the same search, but it is not optimized. The lookup is before the transforming command stats. In this example the stats command does not retain the status field needed for the lookup. The stats command includes the description field. There is no optimization advantage to running the stats command before the lookup.
from <dataset> where sourcetype=access_* | lookup status_desc status OUTPUT description | stats count() by description
The lookup in the first search is faster because it only needs to match the results of the stats command and not all the Web access events.
Differences between SPL and SPL2
The differences between the SPL and SPL2 lookup command are described in these sections.
The command options have been removed
The command options local and update are not supported in SPL2.
| Version | Example |
|---|---|
| SPL | ... lookup [local=<bool>] update=<bool> <lookup-dataset>... |
| SPL2 | Not supported |
The list of lookup fields must be comma-delimited
| Version | Example |
|---|---|
| SPL | ... lookup lookupDataset key1 AS field1 key2 AS field2 |
| SPL2 | ... lookup lookupDataset key1 AS field1, key2 AS field2 |
The list of output fields must be comma-delimited
| Version | Example |
|---|---|
| SPL | ... lookup lookupDataset key1 AS field1 OUTPUT out1 AS event1 out2 AS event2 |
| SPL2 | ... lookup lookupDataset key1 AS field1 OUTPUT out1 AS event1, out2 AS event2 |
See also
lookup command