Write and run federated searches over remote datasets with SPL2
Use SPL2 to write and run federated searches over remote datasets.
Use SPL2 to write and run federated searches over remote datasets in Amazon S3, Azure Blob Storage and Azure Data Lake Storage containers, Azure Databricks delta shares, and more.
SPL2 is fully compatible and can operate in parallel with SPL.
If you are new to SPL2, see the SPL2 Overview.
For an introduction to the SPL2 user interfaces, see Searching data using SPL2 in the SPL2 Search Manual.
For detailed information about any of the SPL2 command syntax shown in these examples, refer to the SPL2 Search Reference.
Getting started with SPL2
SPL2 is an intuitive language that supports both SPL and SQL syntax patterns.
- If you prefer SPL-like syntax, then use the
searchcommand. - If you prefer SQL-like syntax, then use the
fromcommand. This command supports a flexible syntax, so you can choose to start your search using the FROM clause or the SELECT clause.
from command are often more efficient than SPL2 federated searches that use the search command.
from command clause names are shown in uppercase for readability and alignment with SQL syntax. You can specify clause names in uppercase or lowercase. However, you cannot specify clause names in mixed case, such as Group By.
The following table provides examples of valid searches that use SPL-like or SQL-like syntax.
| Scenario | SPL-like syntax | SQL-like syntax |
|---|---|---|
Search a dataset named mydataset. |
CODE
|
CODE
|
Search a dataset named mydataset. Filter its data for events where the value of the status field is 200, the value of the host field is www2. In the search results, include only the fields named _raw, _time, productId and categoryId. |
CODE
|
CODE
|
Search a dataset named mydataset and return the total value of the bytes field for each distinct clientip value in the data. |
CODE
|
CODE
|
When you use SPL-like syntax for an SPL2 search in the Search bar, if the first expression is anything other than index=<index_or_dataset_name>, you must include the search command at the beginning of the search. For example, search 404 index=mydataset is a valid search, but 404 index=mydataset is not.
You cannot mix syntax styles in the same search. For example, this search returns a syntax error: SELECT * index=dataset status=200 host=www2
index=mydataset returns 0 results. But the SPL2 search index=mydataset does return results.
For more information about SPL2 limitations for federated searches, see Federated search limitations.
Converting SPL syntax to SPL2 syntax
sdselect command, consider converting them to SPL2, as sdselect is deprecated and will be removed in an upcoming release.
You see the Convert to SPL2 button on the Search page when the language picker is set to SPL. When you select Convert to SPL2 and the conversion succeeds, the contents in the Search bar are updated into SPL2 and the language picker setting changes to SPL2.
For example, say you have this SPL search:
| sdselect count(scan_count), sum(scan_count) FROM my_csv_data GROUPBY host LIMIT 5
If you select Convert to SPL2, that SPL search changes to this SPL2 search:
FROM my_csv_data GROUP BY host SELECT count(scan_count), sum(scan_count) LIMIT 5
Search and filter data
You can specify field-value pairs in the search or from command to filter for events that contain those combinations of fields and values. You can also specify a time range to filter for events that occurred during that time range.
For information about the supported time range syntax, see Time modifiers in the SPL2 Search Manual.
For example, the following searches return events from an Amazon S3 dataset named my_s3_bucket that pass all of the following filtering conditions:
- The value in the
statusfield is200. - The value in the
hostfield iswww4. - The value in the
actionfield ispurchase. - The timestamp of the event is stored in the
_timefield and indicates that the event happened within the past hour.
| Syntax | Example |
|---|---|
Using the search command |
CODE
|
Using the from command, starting with the FROM clause |
CODE
|
Using the from command, starting with the SELECT clause |
CODE
|
Join two Amazon S3 datasets
You can use the JOIN clause supported in the from command to join two Amazon S3 datasets.
For example, the following searches create an inner join between the customer_id field from the s3_bucket_1 dataset and the cust_id field from the s3_bucket_2 dataset.
| Syntax | Example |
|---|---|
Using the from command, starting with the FROM clause |
CODE
|
Using the from command, starting with the SELECT clause |
CODE
|
Aggregate data and update event fields
You can use the stats command or the SELECT and GROUP BY clauses supported in the from command to calculate aggregation statistics. Then you can use the eval command to add a new field to your events.
| Syntax | Example |
|---|---|
Using the search command |
CODE
|
Using the from command, starting with the FROM clause |
CODE
|
Using the from command, starting with the SELECT clause |
CODE
|