Use time fields in sdselect searches

When you run an sdselect search over a federated index that maps to an AWS Glue table that represents an Amazon S3 or Amazon Security Lake dataset, you might encounter difficulties if you use the Time field that has been declared for that federated index in your search. Instead, for best results, set the Unix time field for the federated index that maps to that dataset to _time and then use _time as your time field throughout your search.

When you must use the Unix time field for your sdselect searches

Each federated index has a declared Time field, Time format, and Unix time field that correspond to the AWS Glue table dataset that the federated index is mapped to.

The Time field is the actual time field in the dataset to which the index is mapped, and the Time format is the format of that time field. The Unix time field is a field alias of the Time field that accurately converts the Time field values into numeric UNIX time format values, even if the Time field has a string UNIX time format, a custom string format, or a SQL timestamp data type.

The Unix time field defaults to _time for all federated indexes. The _time field is useful because Splunk software treats _time as a field with both numeric and string properties.

  • Because _time values are in numeric UNIX time format, you can use _time in functions and operations that require numeric values.
  • However, when it makes sense to do so, Splunk Web translates _time values into human-readable timestamp strings that display with the time zone of the Splunk platform deployment running the search.

The following table shows you how different kinds of Time field fields match up against Unix time field fields when you apply time fields to sdselect use cases such as numeric stats aggregation functions, WHERE clauses, and GROUPBY clauses with span arguments. It shows that a Unix time field set to _time is the most versatile option when you need to use a time field in sdselect searches.

Type of time field Time field format Can I apply the time field to a WHERE clause? Can I apply the time field to numeric stats aggregation functions such as avg() or sum()? Can I apply the time field to a GROUPBY clause with a span argument?
Time field %UT (Numeric UNIX time format) Yes Yes No
Time field Custom string format, defined with a series of time format variables. Can also be %s for string UNIX time format. Yes, but the search might return incorrect results due to lexicographical1 comparison. Yes No
Time field %ST (SQL timestamp datatype) No No No
Unix time field Numeric UNIX time format Yes Yes Yes

1 Lexicographical order sorts items based on the values used to encode the items in computer memory. In Splunk software, this is almost always UTF-8 encoding, which is a superset of ASCII. Lexicographical order sorts numbers before letters, based on the first digit. For example, the numbers 10, 9, 70, 100 are sorted lexicographically as 10, 100, 70, 9.

Note: If you are defining an Amazon S3 federated index for a customer-created Glue table, and you prefer to see your human-readable timestamps replaced by numeric UNIX time format values, change the value of Unix time field to a different field name than _time. The Unix time field value cannot be shared by field names in the AWS Glue table dataset to which the federated index maps.

To learn more about declaring the Time field, Time format, and Unix time field for Amazon S3 federated indexes that map to customer-created AWS Glue datasets, see Map an Amazon S3 federated index to a customer-created AWS Glue table dataset.

For more information about Amazon S3 federated indexes that map to Splunk-managed AWS Glue table datasets, see Map an Amazon S3 federated index to a Splunk-managed AWS Glue table dataset.

For more information about Amazon Security Lake federated indexes, see Map Amazon Security Lake federated indexes to AWS Glue tables.

For more information about applying time fields to sdselect WHERE clauses, see Apply date and time eval functions to fields in the WHERE clause.

Applying time range picker selections to sdselect searches

When you apply a time range filter selection from the time range picker in Splunk Web to an sdselect search that does not itself include time range filters or subsearches, the sdselect command applies the selection to the results of the search without any issues.

On the other hand, if you apply a time range filter selection from the time range picker to an sdselect search that also includes 1 or more time range filters or subsearches in its search string, a set of factors determine whether sdselect applies the time range picker selection to the search, and if so, what parts of the search sdselect applies the time range picker selection to.

Consider the following scenarios when you apply a time range picker selection to an sdselect search that includes time ranges in its search string.

  • Does the sdselect search have subsearches? If the sdselect search doesn't have subsearches, does it have time filters that use time fields declared in the definition of the federated index referenced in the FROM clause?

  • Does the sdselect search have 1 subsearch in its FROM clause, but no JOIN clauses?

  • Does the sdselect search use JOIN clauses to join 2 or more datasets? Do those JOIN clauses include subsearches?

The following subsections cover these scenarios in more detail.

Apply a time range picker selection to an sdselect search with 1 subsearch

When your sdselect search contains a subsearch in its FROM clause and has no JOIN clauses, sdselect applies your time range picker selection for that search only to that FROM clause subsearch, and only if that subsearch does not already contain 1 or more time range filters.

Say you have an sdselect search with a FROM clause that uses a subsearch to define its dataset, and the search has no JOIN clauses. When you use the time range picker in Splunk Web to select a time range filter for this sdselect search, the outcome is dependent on whether the FROM clause subsearch contains 1 or more time range filters.

  • If the FROM clause subsearch contains 1 or more time range filters, sdselect does not apply your time range picker selection to the subsearch.

  • If the FROM clause subsearch does not include time range filters, sdselect applies your time range picker selection to the subsearch.

In either case, sdselect does not apply your time range picker selection to the outer sdselect search.

Apply a time range picker selection to an sdselect search that has multiple subsearches

What happens when you apply a time range filter selection from the time range picker in Splunk Web to an sdselect search that contains 2 or more subsearches? An sdselect search can only have 2 or more subsearches if it contains 1 or more JOIN clauses that join the sdselect search on multiple datasets.

When you select a time range filter from time range picker for an sdselect search containing 2 or more subsearches, sdselect applies the time range picker selection only to subsearches in the search that do not have time filters of their own. If all of the subsearches have their own time filters, sdselect does not apply your time range picker selection to any portion of the sdselect search.

Note: If you want your time range picker selection to be applied to an sdselect search with 1 or more JOIN clauses, the best practice is to arrange for the FROM clause and JOIN clauses to define their datasets with subsearches rather than federated indexes.

Using VPC flow log dataset time fields

Note: The following likely applies to you only if you are using a federated index that maps to a Splunk-managed AWS Glue table for a VPC flow log dataset.

When you run sdselect searches over VPC flow log datasets, there are two fields in the VPC flow log schema that you can use as time fields: start and end. When you use end in an sdselect search, you must always put the time field name within single quotes, like this: 'end'.

When you define the federated index for a VPC flow log dataset, you can declare either start or end to be the Time field for the federated index.

See Map an Amazon S3 federated index to a Splunk-managed AWS Glue table for a default format VPC flow log dataset.

When time fields have values with the SQL timestamp data type

Remote datasets in Amazon S3 or Amazon Security Lake might include time fields that have SQL timestamp data type values. The sdselect command does not natively support the timestamp data type. The sdselect command converts timestamp values into a string with a yyyy-mm-dd hh:mm:ss[.nnnnnnnnn] format, which means that timestamp values cannot be directly applied to SPL elements that require numeric values, such as stats aggregation evaluation functions like avg() and sum().

When you run an sdselect search that involves a time field with SQL timestamp data type values, you have options for converting those values into numeric UNIX time format that depend on whether you are dealing with the Time field that is declared for the federated index, or any other time-related field in the remote dataset to which the federated index is mapped. The following table summarizes these options and provides links to more information.

Time-related field with a SQL timestamp data type Value conversion method For more information
The Time field that is declared for the federated index Set the federated index Unix time field to _time and replace the Time field in your searches with _time. See Use _time to convert Time fields with SQL timestamp values.
Any time field in the remote dataset to which your federated index is mapped, including the declared Time field and any Time partition field. Use the timestamp_from_unixtime and timestamp_to_unixtime evaluation functions to convert timestamp values to numeric UNIX time format values. See Employ specialized evaluation functions to utilize time fields with SQL timestamp values in searches.

Note: If you use Federated Analytics, you work with federated indexes that have time_dt as their default Time partition field. The time_dt field has the SQL timestamp data type. If you use time_dt in your searches to filter data partitions, you will use the timestamp_from_unixtime and timestamp_to_unixtime evaluation functions to translate timestamp values.

The sdselect command displays timestamp values in the Coordinated Universal Time (UTC) time zone.

Use _time to convert Time fields with SQL timestamp values

If the Time field for your federated index has the SQL timestamp data type, your best option is to set the Unix time field for that federated index to _time. The Unix time field is an alias for the Time field. The Unix time field converts the SQL timestamp data type values of the Time field into numeric UNIX time format. This conversion is perfect for functions that require numeric time field values.

Splunk Web displays _time values in human-readable string format where it is appropriate to do so, in the local time zone for your Splunk platform deployment.

This example involves a federated index named Time_Field_Timestamp_Data_Type. The time field values for this federated index are as follows:

Time field Time format Unix time field
SQLTimestampTime %ST _time

The Time format variable of %ST indicates that SQLTimestampTime, the declared Time field for this federated index, has values of the SQL timestamp data type.

Here is a search that summarizes the information contained in the AWS Glue table dataset to which the federated index maps.

Here are the results of that search. Notice that the timestamp values for SQLTimestampTime, the Time field from the AWS Glue table, have the UTC time zone, while the timestamps for _time, the Unix time field, are the same values in Pacific Standard Time (PST), the time zone local to the Splunk platform deployment the search ran upon.

Now, here is an sdselect search that is run against that dataset. Notice that this search uses the _time field in several spots, and that the WHERE clause compares _time against a time value in numeric UNIX time format. This UNIX time value is equivalent to March 9, 2023 4:51:21 PM (PT), which means that the search filters events for Larry and Ramon out of its results.

The search returns the following results.

If the search uses SQLTimestampTime for the avg() function, the WHERE clause, or for the GROUPBY clause with the span argument, the search fails and returns error messages. This happens for the following reasons:

  • SQLTimestampTime has time field values with the SQL timestamp data type, which sdselect cannot apply to stats aggregation functions that require numeric values, such as avg() and sum().
  • sdselect cannot apply time field values with the SQL timestamp data type to WHERE clauses.
  • sdselect requires that you apply the Unix time field to GROUPBY clauses with a span argument.

When you use _time, the Unix time field alias of SQLTimestampTime, the search works as expected.

Employ specialized evaluation functions to utilize time fields with SQL timestamp values in searches

When you need to run an sdselect search using time fields that have SQL timestamp values, you can use the timestamp_from_unixtime and timestamp_to_unixtime evaluation functions in the WHERE clause to directly compare timestamps in the numeric UNIX time format with SQL timestamp data type values. You can apply these functions to any field with SQL timestamp values in your data, including the declared Time field and any declared Time partition field.

Note: If you use Federated Analytics to search Amazon Security Lake data, your federated indexes are generated with a default time_dt Time partition field. If you want to get the optimization benefits of partition-based searching when you run sdselect searches over your Amazon Security Lake datasets, you must use the timestamp_from_unixtime and timestamp_to_unixtime evaluation functions to convert time_dt values into numeric UNIX time format values.

The timestamp_from_unixtime evaluation function takes a numeric UNIX time format timestamp or a field with a numeric UNIX time format timestamp as its argument and converts that input into an equivalent SQL timestamp data type value, as in the following example:

The timestamp_to_unixtime evaluation function takes a field with a SQL timestamp data type as its main argument and converts its value into an equivalent numeric UNIX time format timestamp, as in the following example.

Note: Because timestamp_from_unixtime does not transform SQL timestamp data type field values, it is better at optimizing searches involving time partition fields with SQL timestamp values than timestamp_to_unixtime.

For detailed information about these evaluation functions and many more search examples, see Evaluation functions specific to sdselect.