stats command: Overview, syntax, and usage

The SPL2 stats command calculates aggregate statistics, such as average, count, and sum, over the incoming search results set or pipeline data. This is similar to SQL aggregation.

If the stats command is used without a BY clause, only one row is returned, which is the aggregation over the entire incoming result set. If a BY clause is used, one row is returned for each distinct value in the field specified in the BY clause.

Use these links to quickly navigate to the main sections in this topic:

How the SPL2 stats command works

What's important to remember about the SPL2 stats command is that the command returns only the fields used in the aggregation.

Suppose these are some of the events in your dataset:

_time host action quantity productId method
6 Apr 2025 9:39:48.000 PM www2 purchase 1 PZ-SG-G05 POST
6 Apr 2025 9:34:10.000 PM www1 view 1 GET
6 Apr 2025 9:34:02.000 PM www3 purchase 2 SC-MG-G10 POST
6 Apr 2025 9:34:01.000 PM www2 remove 1 CU-PG-G06 GET
6 Apr 2025 9:34:01.000 PM www1 purchase 3 POST
6 Apr 2025 9:29:55.000 PM www3 addtocart 2 SC-MG-G10 GET
6 Apr 2025 9:20:51.000 PM www1 addtocart DB-SG-G01 GET
6 Apr 2025 9:12:56.000 PM www2 changequantity 2 FS-SG-G03 GET
6 Apr 2025 9:12:53.000 PM www1 1 DB-SG-G01 GET

Using functions

You can use a wide range of statistical functions with the stats command. See Quick Reference for SPL2 Stats and Charting Functions.

The following stats command performs several aggregate calculations. When you perform more than one aggregation, separate each aggregation with a comma.

CODE

The results look like this:

Grouping results

Use a BY clause when you want to group aggregations by a specific field.

Note: The BY keyword is displayed in uppercase in the syntax and examples to make the syntax easier to read. You can specify this keyword in uppercase or lowercase.

The following stats command groups the results by the action field:

CODE

The results look like this:

You can perform an aggregation on one field and group the results by another field. The following stats command groups the results by the host field:

CODE

The results look like this:

Renaming fields

Use the AS clause to rename a field. The following stats command renames the count(action) field to count:

CODE

The results look like this:

For additional examples, see stats command: Examples.

Note: In searches, you can also use the from command to specify aggregate functions, group by a field, and rename a field. See the Usage section for examples.

Syntax

The SPL2 stats command supports different syntaxes in different product contexts.

Syntax for searches

The required syntax is in bold.

stats

[allnum = boolean] [delim = "string"] [partitions = num]

aggregation ...

( [by-clause] [span=time-span] )

Syntax for pipelines

The required syntax is in bold. Syntax that starts with the at ( @ ) sign represents

annotations. Annotations for commands must be specified before the command name.

[@maxdelay("string")] [@maxdisk("string")]

stats

[prestats = string] [mode = string]

aggregation ...

( [by-clause] [span=time-span] )

Required arguments

aggregation

Syntax: aggregate-function "("field [as field] ")" ["," aggregate-function "("field [as field] ")" ]...

Description: A statistical aggregation function. The function can be applied to an eval expression, or to one or more fields. By default, the name of the field used in the output is the same as your aggregate function. For example, if your stats command is ... | stats sum(bytes) the field name in the output is sum(bytes). Use the as clause to place the result into a new field with a name that you specify, for example ... | stats sum(bytes) as 'sum of bytes'.

For a description of the aggregate functions that you can use with the stats command, see Aggregate functions.

For information about whether a given aggregate function is supported in searches or pipelines, see Compatibility Quick Reference for SPL2 statistical functions.

Optional arguments

allnum

Syntax: allnum=boolean

Description: Supported only in searches. If set to true, computes numerical statistics on each field if and only if all of the values of that field are numerical.

Default: false

by-clause

Syntax: BY field [span=timespan] ["," field [span=timespan] ]...

Description: The name of one or more fields to group the results by. You can specify a time span to apply to the grouping. The by-clause returns one row for each distinct value in the by-clause fields. You cannot use the wildcard character to specify multiple fields with similar names. You must specify each field separately.

Default: If no by-clause is specified, the stats command returns only one row, which is the aggregation over the entire incoming set of data.

For information about timespans, see Specifying time spans in the SPL2 Search Manual.

delim

Syntax: delim=string

Description: Supported only in searches. Specifies how the values in the list() or values() functions are delimited.

Default: A single space

mode

Syntax: mode=summary | passthrough

Description: Supported only in Edge Processor and Ingest Processor pipelines, and applies only to aggregations where the BY clause contains any of the following Splunk metadata fields: host, source, sourcetype, and index. This argument determines whether the pipeline changes the names of the specified metadata fields, in order to differentiate aggregated data from non-aggregated data. You can set this argument to one of the following:
  • summary: The pipeline prefixes the metadata fields specified in the BY clause with orig_.

  • passthrough: The pipeline does not change the names of the metadata fields.

Default: summary

partitions

Syntax: partitions=num

Description: Supported only in searches. If specified, partitions the incoming search results based on the by-clause fields for multithreaded reduce. The partitions argument runs the reduce step (in parallel reduce processing) with multiple threads in the same search process on the same machine. Compare that with parallel reduce that runs the reduce step in parallel on multiple machines.

Default: 1

prestats

Syntax: prestats=raw | fields

Description: Supported only in Edge Processor and Ingest Processor pipelines. This argument determines whether and how the pipeline puts the aggregated data into "prestats" format. Prestats is a Splunk internal format that is used for summary indexing. When data is in prestats format, additional data values are stored in fields that have names prefixed by psrsvd_. You can set this argument to one of the following:
  • raw: The pipeline puts data into prestats format, and stores the psrsvd_-prefixed fields inside the _raw field.

  • fields: The pipeline puts data into prestats format, and stores the psrsvd_-prefixed fields as top-level event fields.

If this argument is not specified, the pipeline does not put data into the prestats format.

Default: None

Optional annotations

@maxdelay

Syntax: @maxdelay("string")

Description: Supported only in Edge Processor pipelines. Specifies the maximum span of time for which the Edge Processor can hold and aggregate incoming events before it emits the result. You can set this annotation to an integer followed by a timescale, such as 5m for 5 minutes. For more information, see Specifying time spans in the SPL2 Search Manual.

Default: 30s

@maxdisk

Syntax: @maxdisk("string")

Description: Supported only in Edge Processor pipelines. Specifies the maximum amount of disk space for which the Edge Processor can hold and aggregate incoming events before it emits the result. You can set this annotation to an integer followed by one of the following abbreviations for a unit of measurement:
Abbreviation Unit of measurement
B Bytes
KB Kilobytes
MB Megabytes
GB Gigabytes

Example: @maxdisk("10KB").

Default: 1MB

Usage

Using the from command instead

In searches, most of the things you can do with the stats command are also possible using the from command.

For example, if your stats command is ...| stats count() BY host, the following searches return the same results:

from command alternatives
FROM dataset GROUP BY host SELECT count(), host
SELECT count(), host FROM dataset GROUP BY host

For more information, see from command: Overview.

Aggregating multivalue fields

When you perform an aggregation over a multivalue field, each of the values in the field is included in the aggregation. Suppose that you have this set of data:

fieldX fieldY
A 1
A [2,3,4]
B [5,6,7,8,9]

When you run this stats command ...| stats count, count(fieldY), sum(fieldY) BY fieldX, these results are returned:

fieldX count count(fieldY) sum(fieldY)
A 2 4 10
B 1 5 35

  • The results are grouped first by the fieldX.
  • The count field contains a count of the rows that contain A or B.
  • The count(fieldY) aggregation counts the rows for the fields in the fieldY column that contain a single value. However, if a field is a multivalue field, the aggregation counts the number of values in the fields.
  • The sum(fieldY) aggregation adds up all of the values in both single value and multivalue fields.

Grouping results

Use the BY clause to group your aggregations.

For example, suppose you have the following events:

_time clientip action pid
2026/01/20 12:00 192.0.2.0 purchase DC-SG-G02
2026/01/20 11:58 203.0.113.0 addtochart MB-AG-G07
2026/01/20 11:58 203.0.113.0 purchase WC-SH-A01
2026/01/20 11:56 198.51.100.255 changequantity PZ-SG-G05
2026/01/20 11:51 192.0.2.0 purchase SF-BVS-01
2026/01/20 11:47 198.51.100.0 purchase SF-BVS-G01
2026/01/20 11:42 192.0.2.0 purchase WC-SH-T02
2026/01/20 11:39 198.51.100.0 purchase PZ-SG-G05

To group the aggregations by the type of action, add | stats count(pid) BY action to your search or pipeline.

The results look like this:

Group results by a timespan

To group aggregations by a timespan, use the span statistical function.

Group results by a multivalue field

When grouping by a multivalue field, the stats command produces one row for each value in the field. For example, suppose the incoming data is this:

fieldA fieldB fieldC
1 x V1, V2, V3
2 y V4, V5
3 z V2, V5

If you specify the fieldC in the <by-clause>, such as ...| stats sum(fieldA) BY fieldC, the results are:

fieldC sum(fieldA)
V1 1
V2 4
V3 1
V4 2
V5 5

Differences between SPL and SPL2

The differences between the SPL and SPL2 stats command are described in these sections. These differences are relevant for searches only, since pipelines do not use SPL.

Command options must be specified before command arguments

The stats-options are:

  • allnum = boolean
  • delim = "string"
  • partitions = num

New span option added to the by-clause

With SPL2 you can specify a time span. The field you use in the by-clause must be either the _time field, or another field in UNIX time. For example:

Version Example
SPL Not supported. The SPL equivalent is ...| bin _time span=5min | stats count (error) BY _time
SPL2 ...| stats count(error) BY _time span=5min

This example returns the count of events in 5 minute intervals.

You can accomplish that same results using the from command. Here are several examples:

from command alternatives
FROM <dataset> GROUP BY span(_time, 5min) SELECT count(error), _time
SELECT count(error), _time FROM <dataset> GROUP BY span(_time, 5min)

For more information about specifying a span, see Specifying time spans in the SPL2 Search Manual.

All functions require parentheses

In SPL, the count function could be specified without parentheses. In SPL2, the parentheses are required when you use the count function. For all other functions, you must specify a field inside the parentheses or BY clause.

Version Example
SPL ...| stats count
SPL2 ...| stats count ()

Field lists must be comma-delimited

If you specify a list of fields in the by-clause, the list must be comma-delimited. Otherwise a parsing error is returned.

Version Example
SPL ...| stats count BY host source
SPL2 ...| stats count () BY host, source

Aggregations must be comma-delimited

If you specify multiple aggregations, the aggregations must be comma-delimited. Otherwise a parsing error is returned.

Version Example
SPL ...| stats avg (fieldX) max (fieldY) BY host
SPL2 ...| stats avg (fieldX), max (fieldY) BY host