Identify time partitions

Identify time partition fields in your dataset definition to improve federated search performance and reduce search cost.

Partitioning is an organization strategy for large datasets that enables you to search them efficiently. When you partition your data, you organize it into a hierarchical directory structure based on the distinct values of one or more fields in the data.

For example, you might partition your application logs in Amazon S3 by date, breaking them down by year, month, and day. You can then place the files for a single day's worth of data under one Amazon S3 path. If your dataset uses Hive-style partitions, that path looks like s3://my_bucket/logs/year=2025/month=08/day=23/. If your dataset does not use Hive-style partitions, it looks like s3://my_bucket/logs/2025/08/23/.
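As an illustration of the two path layouts, the following minimal Python sketch builds the Amazon S3 prefix for one day of data. The function name partition_path and its parameters are hypothetical and are not part of any product API. The sketch assumes a year/month/day partitioning scheme with zero-padded month and day values, as in the example paths.

```python
from datetime import date


def partition_path(base: str, day: date, hive_style: bool = True) -> str:
    """Return the S3 prefix that holds one day's worth of data.

    Hypothetical helper for illustration only. Assumes the dataset is
    partitioned by year, then month, then day.
    """
    values = [f"{day.year:04d}", f"{day.month:02d}", f"{day.day:02d}"]
    if hive_style:
        # Hive-style partitions encode the field name in each path segment.
        names = ["year", "month", "day"]
        segments = [f"{name}={value}" for name, value in zip(names, values)]
    else:
        # Non-Hive-style partitions carry only the values.
        segments = values
    return base.rstrip("/") + "/" + "/".join(segments) + "/"


hive_prefix = partition_path("s3://my_bucket/logs", date(2025, 8, 23))
plain_prefix = partition_path("s3://my_bucket/logs", date(2025, 8, 23), hive_style=False)
```

With these inputs, hive_prefix is s3://my_bucket/logs/year=2025/month=08/day=23/ and plain_prefix is s3://my_bucket/logs/2025/08/23/.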

Note: You can identify time partition settings for a dataset even if you have not selected Define the time field. Time partition fields exist in the Amazon S3 paths that form the structure of your dataset. The Time field exists as a column in the data catalog that references your dataset.

When you list time partition fields for a dataset, start with the first field by which the data is partitioned, then list the second field, and so on. For example, say your data catalog references a dataset that is partitioned by year, month, and day, like this: s3://my_bucket/logs/year=2025/month=08/day=23/. In this case, you would identify year as the first time partition field, month as the second time partition field, and day as the third time partition field.
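If you are unsure of the order, you can read it directly from a Hive-style path. The following Python sketch is one way to do that. The function name partition_fields is hypothetical, and the sketch assumes that every partition segment has the form name=value. It does not apply to paths that do not use Hive-style partitions, because those paths do not contain field names.

```python
from typing import List


def partition_fields(path: str) -> List[str]:
    """Return partition field names in the order they appear in a path.

    Hypothetical helper for illustration only. Assumes Hive-style
    segments of the form name=value.
    """
    fields = []
    for segment in path.split("/"):
        name, separator, value = segment.partition("=")
        # Keep only well-formed name=value segments.
        if separator and name and value:
            fields.append(name)
    return fields


ordered_fields = partition_fields("s3://my_bucket/logs/year=2025/month=08/day=23/")
```

For the example path, ordered_fields is ['year', 'month', 'day'], which matches the order in which you would list the time partition fields.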

Follow the link that is most appropriate for your needs: