stats command: Examples

The following are examples for using the SPL2 stats command. To learn more about the stats command, see How the SPL2 stats command works.

Many of these examples use the statistical functions. See Overview of SPL2 stats and chart functions.

The AS and BY keywords are displayed in uppercase in the syntax and examples to make the syntax easier to read. You can specify the AS and BY keywords in uppercase or lowercase in your searches.

1. Calculating the sum of a field

If you just want a simple calculation, you can specify the aggregation without any other arguments. For example:

CODE

This search summarizes the bytes for all of the incoming results. One row is returned with one column. The name of the column is the name of the aggregation. For example:

sum(bytes)
3195256256

2. Grouping the results by a field

This example takes the incoming result set and calculates the sum of the bytes field and groups the sums by the values in the host field.

CODE

The results contain as many rows as there are distinct host values. There are two columns returned: host and sum(bytes). If you don't specify a name for the results using the " AS field" syntax, then the names of the columns are the name of the by-clause field and the name of the aggregation.

If there are two distinct hosts, the results are returned as a table similar to this:

host sum(bytes)
host1 3123124124
host2 72132132

3. Specifying multiple aggregations and multiple by-clause fields

You can also specify more than one aggregation and by-clause with the stats command. You can rename the output fields using the AS field clause. For example:

CODE

This search organizes the incoming search results into groups based on the combination of host and sourcetype. It returns the sum of the bytes in the Sum of bytes field and the average bytes in the Average field for each group. If there are two distinct hosts and two distinct sourcetypes, the search will produce results similar to this:

host sourcetype Sum of bytes Average
host1 webaccess 3123124124 3782
host2 webaccess 72132132 3742
host1 json 32132 3213

4. Specifying a time span in the BY clause

This example counts the values in the action field and organized the results into 30 minute time spans. When you use the span argument, the field you use in the by-clause must be either the _time field, or another field with values in UNIX time. For example:

CODE

5. Updating metadata fields after aggregation in pipelines

The following is an example of an Edge Processor or Ingest Processor pipeline that calculates the sum of bytes_out, groups the sums by the values in the server_name and sourcetype fields, and then updates the sourcetype value to be prefixed by orig_. When you set the mode argument to summary, the pipeline updates the sourcetype value to prevent aggregated events from being categorized under the same source type as non-aggregated events.

PYTHON
$pipeline = | from $source 
| stats mode="summary" sum(bytes_out) BY server_name, sourcetype 
| into $destination;

6. Specifying the size of the state window for Edge Processor aggregations

Edge Processors instances calculate aggregations based on the events that are included in a given state window. The following is an example of an Edge Processor pipeline that calculates the sum of bytes_out and groups the sums by the values in the server_name field, based on the set of events that the Edge Processor instance is able to hold and aggregate before 10 minutes have elapsed or 1 gigabyte of disk space has been consumed:

PYTHON
$pipeline = | from $source 
| @maxdelay("10m") @maxdisk("1GB") stats sum(bytes_out) BY server_name 
| into $destination;