from command: Examples

The following are examples for using the SPL2 from command. To learn more about the from command, see How the SPL2 from command works.

You can specify the clauses in the from command in uppercase or lowercase. These examples use uppercase for readability.

Note: Some of these examples start with the SELECT clause and others start with the FROM clause. Both of these clauses are valid syntax for the from command when it is used in a search.

1. Specify string values in quotations

The following search shows that string values in field-value pairs must be enclosed in double quotation marks.

Because string values must be enclosed in double quotation marks, you can reverse the order of field-value pairs. For example, the previous search can also be specified this way:

2. Specify a JSON array as a dataset literal

The following search uses a JSON array of nested objects as the source dataset:

The results look like this:

For a list of sample arrays that you can use, see Sample dataset literals in the SPL2 Search Manual.

3. Search a metric index

The following search looks for data in the _metrics index:

To use a wildcard in the WHERE clause, you cannot use the asterisk ( * ) wildcard character. You must use the like function. See Comparison and Conditional functions.

4. Pipeline examples

These examples show how to use the from command in a pipeline.

Select data to be processed in a pipeline

The following pipeline selects a subset of the data received by the Edge Processor or Ingest Processor and then sends it to a destination, such as an index or Amazon S3 bucket. The $source parameter refers to an internal dataset in the Edge Processor or Ingest Processor.

$pipeline = | from $source | into $destination

Mask sensitive information using a pipeline

The following pipeline selects a subset of the data received by the Edge Processor or Ingest Processor, replaces the credit card numbers in the _raw field with the word "<redacted>", and then sends the masked data to a destination. The $source parameter refers to an internal dataset in the Edge Processor or Ingest Processor.

$pipeline = | from $source 
| eval _raw=replace(_raw, /[1-5][0-9]{15}/i, "<redacted>") 
| into $destination

Filter extracted data in pipelines

The following example filters a set of audit logs so that only failed login attempts remain. First, the record types and result values must be extracted from the logs.

$pipeline = | from $source  
| rex field=_raw /type=(?P<RecordType>[A-Z_]+).*res=(?P<Result>\w+)/
| where RecordType = "USER_LOGIN"
| where Result = "failed"
| fields - RecordType, Result
| into $destination

5. Search using wildcards

You can use a wildcard character ( * ) in the SELECT clause to search for similar field names. You must enclose the wildcard syntax in single quotation marks. For example:

You can use a wildcard to search for only internal fields, which begin with an underscore ( _ ) character . For example:

The WHERE clause does not support the wildcard character ( * ). However you can use the like function to perform a wildcard search. For example:

The like function supports several syntaxes, see Comparison and Conditional functions.

6. Specify multiple expressions in the WHERE clause

Use the WHERE clause to filter the data by specifying one or more expressions. You need to separate multiple expressions using logical operators, such as AND and OR.

The WHERE clause uses the like function to perform a search with wildcard. The WHERE clause does not support the asterisk ( * ) wildcard character. For more information about the like function, see Comparison and Conditional functions.

For more information about logical operators, see Predicate expressions in the SPL2 Search Manual.

7. Search for multiple terms in events

You can search for multiple terms in your events by using a search literal in the WHERE clause. An AND operator is implied between the terms specified in the search literal. To specify a search literal, you enclose the list of terms in backtick characters ( ` ).

The following search looks for the terms invalid AND user AND sshd[5258] and returns the events that contain all three terms:

For more information, see Search literals in expressions in the SPL2 Search Manual.

8. Specify a single field in the GROUP BY clause

You can specify one or more fields to group by. In this example a single field, host, is specified.

When using the from command, if the GROUP BY clause is specified, the SELECT clause must also be specified. The SELECT clause must contain either an aggregation or the fields in the GROUP BY clause. In this example, the SELECT clause contains the aggregation avg(cpu_usage):

9. Specify a time span in the GROUP BY clause

You can arrange search results in groups using a time span.

When using the from command, if the GROUP BY clause is specified, the SELECT clause must also be specified.

The following search returns web access error information, grouped by host in 5 minute time spans.

There are several ways to specify a time span with the GROUP BY clause, see from command: Syntax.

10. Sorting search results using the ORDER BY clause

Suppose you use the following search to return count of the actions taken, grouped by the productId field.

The results look something like this:

productId count(action)
DC-SG-G02 12
FS-SG-G03 10
MB-AG-G07 17
PZ-SG-G05 4
SF-BVS-G01 11
SF-BVS-T01 6
WC-SH-G04 2
WC-SH-T02 15

By default the results are sorted on the GROUP BY field, productId.

You want to sort the results in descending order based on the count. However, the name of the count field in the output is the name of the aggregation specified in the SELECT clause, count(action). The ORDER BY clause will not sort on a field name that is an aggregation because it contains special characters, the parenthesis. You have two options, you can either rename the aggregation field count(action) in the SELECT clause using the AS keyword, or you can enclose the field name in single quotations, such as ORDER BY 'count(action)' DESC.

Here's the updated search using the rename option:

The results look something like this:

productId Count
MB-AG-G07 17
WC-SH-T02 15
DC-SG-G02 12
SF-BVS-G01 11
FS-SG-G03 10
SF-BVS-T01 6
PZ-SG-G05 4
WC-SH-G04 2

11. Enrich event data with a lookup dataset using the JOIN clause

Consider the following data from a set of events with login information:

_time action userID host port
8:00 AM 29 Nov 2021 Failed password patel yangtze.buttercupgames.com 3390
7:15 AM 29 Nov 2021 Failed password zhang nile.example.net 1851
9:30 PM 15 Nov 2021 Session opened dubois danube.sample.com 1260
6:11 AM 14 Nov 2021 Failed password sullivan volga.example.com 2766
11:20 AM 15 Nov 2021 Failed password martin volga.example.com 3622
08:13 AM 31 Oct 2021 Failed password mayer ganger.example.com 3658
11:59 PM 23 Oct 2021 Failed password patel yangtze.buttercupgames.com 1214

You want to enrich the event data with information from the host_info lookup dataset, which contains information about known hosts:

hostname kind status host_contact
mekong.buttercupgames.com internal allowed alex@buttercupgames.com
yangtze.buttercupgames.com internal allowed claudia@buttercupgames.com
danube.sample.com supplier allowed martin@sample.com
ganger.example.com external allowed david@example.com
volga.example.com external banned

Specifically, you want every event that matches the search criteria to appear in the search results. If there is a match between an event and the host_info lookup dataset, you want to display the kind and status from the host_info lookup dataset with each event. This is referred to as a left join, which is shown in the following image.

An image that shows a Left Join using a Venn diagram. The Venn diagram has two intersecting circles, circle A and circle B. Circle A is completely shaded, including the portion of the circle where it overlaps with circle B.

The A circle represents the event dataset and the B circle represents the host_info lookup dataset.

The following example enriches data in the main event dataset with data from the host_info lookup dataset, where there is a matching host name. An alias for each dataset is created using the AS clause. The WHERE clause filters out events where the host kind is not internal. The SELECT clause specifies which fields to return. The results are organized by the host field.

Note: When you use the JOIN clause, the aliases you specify in the search are not propagated to the search results. For example, in this search you specified m.host, but the search results display host.

The results of this search are shown in the following table. As you can see, the events that have a host with a kind of internal, the buttercupgames.com hosts, have been removed. The results also show that there is no host information for the nile.example.net host.

12. Use consecutive JOIN clauses to return data from multiple datasets

You can create a stacked join search that uses multiple JOIN clauses to return data from multiple datasets.

Consider the following data from a set of events in the orders dataset:

_time clientip action pid quantity
12:00:01 PM 20 Jan 2022 192.0.2.0 purchase DC-SG-G02 1
10:13:34 AM 20 Jan 2022 203.0.113.255 addtochart MB-AG-G07 3
9:55:51 AM 20 Jan 2022 203.0.113.0 purchase WC-SH-A01 1
9:21:25 AM 20 Jan 2022 198.51.100.255 changequantity PZ-SG-G05 2
9:14:17 AM 20 Jan 2022 192.0.2.0 purchase SF-BVS-01 1
8:42:23 AM 20 Jan 2022 198.51.100.0 purchase SF-BVS-G01 1
8:30:45 AM 20 Jan 2022 192.0.2.0 purchase WC-SH-T02 2
7:57:14 AM 20 Jan 2022 198.51.100.0 purchase PZ-SG-G05 1

You want to enrich the orders event data with information from the products lookup dataset, which contains product and price information. Here is an example of the data in the products dataset:

productId product_name price sale_price supplierId
DC-SG-G02 Dream Crusher 39.99 24.99 1238
FS-SG-G03 Final Sequel 24.99 16.99 5017
WC-SH-G04 World of Cheese 24.99 19.99 7024
WC-SH-T02 World of Cheese Tee 19.99 16.99 7024
PZ-SG-G05 Puppies vs. Zombies 4.99 3.99 7045
MB-AG-G07 Manganiello Bros. 38.99 27.99 4111
SF-BVS-G01 Grand Theft Scooter 26.99 21.99 5007
SF-BVS-01 Pony Run 49.99 41.99 5007

You want to display the product names instead of the product IDs in your search results.

You want every order event that matches the search criteria to appear in the results, even if the item ordered does not have a matching entry in the products lookup dataset. Notice that the third order contains the product ID WC-SH-A01, which does not appear in the products lookup dataset.

You can display the product names in the search results by including a JOIN clause to your search that enriches the orders dataset with the data from the products dataset. Specifically, you need to use a LEFT JOIN to accomplish this result. The datasets are joined on the field that the datasets have in common, which is the product ID field.

Here is the search you can use to add the product names to the orders events:

The results look like this:

Because there is no matching product ID for WC-SH-A01 in the products dataset, there is no product name in the search results. Using a LEFT JOIN is a way to highlight missing information from the second, or right-side, dataset.

Now you want to find out the name and city of the supplier that provides each product. You can enrich the search results data with information from the suppliers lookup dataset, based on the supplier ID.

Here is an example of the data in the suppliers dataset:

supplier_id supplier_name city state/province country
1009 Mile High Games Denver Colorado United States
1237 Area 51 Games Roswell New Mexico United States
4111 Isthmus Pastimes Panama City Panama Panama
5007 EuroToys Prague Central Bohemia Czech Republic
5017 Der Kriegsspiel Cologne North Rhine-Westphalia Germany
7024 Happy Fun Games Kyoto Kyoto Japan
7045 Kiwi Game Warehouse Auckland Auckland New Zealand

To display the supplier names and city in your search results you need to add another JOIN clause to your search. Because you want every product in the search results returned, whether or not there is a corresponding supplier, you will use a LEFT JOIN. The products and suppliers datasets can be joined on the supplier ID field.

Looking back at the products dataset, the Dream Crusher product has a supplier ID of 1238, which does not appear in the suppliers dataset. The LEFT JOIN will highlight the absence of this information.

Here is the updated search:

The results of the search look like this:

13. Return data from a view

This search returns the timestamp and client IP fields from a view called mysecurityview.

14. Use the HAVING clause to filter after aggregations

The following example calculate the sum of the bytes field in the main index from events that occurred in the last 5 minutes. The results are grouped by the host field. The sum and the host fields are returned, where the sum of the bytes is greater than I MB.

15. Specify offsets and limits

The following search returns web access error information, grouped by host and 5 minute time spans, that have a count greater than 10. The LIMIT clause is used to return up to 50 results. The OFFSET clause is used to skip the first 20 results, starting with the 21st result.