Discovering data for your investigations using the Catalog

Use the Catalog in Splunk Cloud Platform to discover available datasets and find the data you need to power your investigations before writing SPL or SPL2 searches.

The Catalog is a centralized data discovery interface that consolidates information about the datasets that are available to you. Boost your investigations by using the Catalog to identify relevant datasets and learn about them before even writing your first SPL or SPL2 search.

To navigate to the Catalog, do either of the following from the global navigation bar in Splunk Cloud Platform:

  • Select the Catalog icon.

  • Select the Settings icon. Then, in the Knowledge section, select Catalog.

To learn more about the Catalog, see the following sections:

What does the Catalog show me?

The Catalog in Splunk Cloud Platform shows you the datasets that you can access and detailed metadata for each dataset.

The Catalog displays the datasets that are available to you based on the roles granted to your user account and the permissions that are configured in each dataset.

The Catalog supports the following types of datasets:

Federated datasets: Includes the following kinds of datasets from the Data Management app:

  • Amazon S3

  • Azure Databricks

  • Dynamic Data Self Storage (DDSS)

  • Microsoft Azure

  • Snowflake

Splunk indexes: Includes the following kinds of Splunk platform indexes:

  • Events indexes

  • Metrics indexes

The Catalog also displays a variety of information about each dataset. You can select a dataset to open a side panel containing details such as the following:

  • The resource name of the dataset

  • The timestamps of the earliest and latest events in the dataset

  • The number of events in the dataset

  • The type of data source that the events originated from

The exact information that is available in the side panel varies depending on the type of dataset you selected and whether the Schema Collection feature is turned on. For example, dataset statistics like the number of events or the earliest and latest event timestamps are available for Splunk indexes but not for federated datasets.

Note: Be aware that the list of datasets on the Catalog and the dataset statistics in the side panel are subject to a 30-minute refresh rate. Newly added datasets and changes to the dataset statistics are not reflected immediately in the Catalog.

Inspecting event fields in datasets

By default, the Catalog does not display detailed information about the events inside the datasets. However, Splunk platform administrators can configure the Catalog to show the names of the event fields in each dataset by turning on a background search operation called Schema Collection. When Schema Collection is turned on, the dataset side panels show a Fields section that lists the names of the available event fields.

Inspecting these field names allows you to learn about the data schema so that you can start your investigations with more specific and targeted searches on the dataset, eliminating the need for broad, exploratory searches that can be costly and resource-intensive.
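
As a minimal sketch of such a targeted search, suppose the Fields section for a dataset named my_logs lists fields called status and host. These names are hypothetical placeholders, not names taken from an actual Catalog. An SPL2 search along the following lines filters on one known field and counts the matching events by another, instead of retrieving every event to see what it contains:

```spl2
// Hypothetical example: my_logs, status, and host are placeholder names.
// Filter on a field learned from the Fields section, then summarize by host.
FROM my_logs
WHERE status=500
| stats count() BY host
```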

For more information about how an administrator can turn on this feature, see Configure background search operations for the Catalog in the Splunk Cloud Platform Admin Manual.

Note:

Be aware that the event field information is subject to the following limitations:

  • The list of field names for newly added indexes is populated in phases. When the Catalog retrieves the field names for an index for the first time, it runs a search every 60 minutes until all the information is retrieved.

  • The list of field names for existing datasets is subject to a daily refresh rate. Updates to the field names available in the dataset are not reflected immediately in the Catalog.

  • Event fields are not available for metrics indexes. They are also not available for federated datasets where the schema is defined in a customer-managed data catalog such as AWS Glue or Apache Iceberg REST.

What can I use the Catalog for?

Use the Catalog in Splunk Cloud Platform to find relevant datasets, investigate their contents, and start running targeted SPL or SPL2 searches so you can analyze your data more effectively.

Filter the Catalog for datasets of interest, and select each dataset to view more details about it. Using the Catalog to gain these preliminary insights into your data allows you to start your investigations with more focused SPL or SPL2 searches.

For more information about Catalog features and how you can use them to explore, discover, and gain insights into your data, see the following:

Find relevant datasets

Use the filtering and sorting options on the Catalog to find datasets that are of interest to your investigation.

Filter for datasets using keywords

Enter one or more keywords in the Filters field and then select Apply. Separate each keyword with a space.

The Catalog returns a dataset if your keywords partially or fully match any of the following:

  • The information on the page, such as the dataset name or type

  • The general information in the dataset side panel, such as the dataset description or kind

For example, filtering for aws can return datasets named aws_logs or rawstrings, as well as datasets where the kind is aws_s3 or the description is Cold storage in AWS S3 for INFO logs.

Filter datasets by name

Select the Name column header, and then enter a partial or complete dataset name in the Filter field.

The Catalog returns a dataset if any part of the name matches what you entered.

Filter for a specific type of dataset

Select the Type column header, and then select Splunk index or Federated dataset.

Filter for datasets that were created by a specific user

Select the Created by column header, and then enter a user name in the Filter field.

The Catalog returns a dataset only if the name of the user that created the dataset is an exact match with the user name that you entered.

Sort the list of datasets

Select any of the column headers to sort the datasets by ascending or descending order based on the values in the column.

For example, to view older datasets first, select the Created column header so that the down arrow icon changes to an up arrow icon. The up arrow indicates that the datasets are sorted in ascending order by the date and time when they were created.

Investigate the dataset contents

Select a dataset to learn more about it.

When you select a dataset, the Catalog opens a side panel that contains details about the dataset as well as actions you can use to discover more information.

Sample the field values

Select Top 10 values per field to run an SPL2 search that returns the top 10 values from each event field in the dataset.

Make sure to set the time range picker beside the Search bar to an appropriate time range for the events in the selected dataset.
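
The exact search that this action runs is not reproduced here. As a rough, hand-written approximation, the SPL2 fieldsummary command returns summary statistics for each field, and its maxvals argument caps how many distinct values are listed per field. The dataset name my_logs is a hypothetical placeholder:

```spl2
// Approximation only: not necessarily the search that the Catalog generates.
// my_logs is a placeholder dataset name.
// maxvals=10 limits the distinct values listed for each field to 10.
FROM my_logs
| fieldsummary maxvals=10
```

As with the Catalog action, the results depend on the time range that is set in the time range picker.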

Learn more about a Splunk index

For Splunk indexes specifically, you can select View to navigate to the Indexes page and view additional information about the selected index.

For federated datasets, if you have the necessary permissions, you can select Edit to open the dataset for editing in the Data Management app and view additional information about the configuration settings.

Search, transform, and analyze your data

After finding the relevant dataset for your investigation, you can run a targeted search on that dataset and start working with the data that you need.

By using the Catalog to find the right dataset, you can avoid running slow and expensive searches that span multiple datasets, such as index=*.

Run a search with the default time range

Select the dataset on the Catalog, and then select Search from the side panel.

When you select Search, you navigate to the Search & Reporting app and run an SPL2 search that retrieves all the events in the dataset that match the default time range, which is typically the last 30 minutes.

For information about how to continue building and refining your search, refer to the documentation on the following topics:

  • Using SPL2 to work with your data

  • Searching federated datasets

If you want to search your data using SPL instead of SPL2 or write a search without running it immediately, you can manually navigate to the Search & Reporting app and then start working from there. Do the following:

  1. To navigate to the Search & Reporting app, start by selecting Home to return to the Splunk home page. Then, from the Apps panel, select Search & Reporting.

  2. Set the language picker to SPL or SPL2 as desired.

  3. In the Search bar, enter a search that targets the dataset you discovered through the Catalog. If the dataset is a Splunk index, then you can use either SPL or SPL2. For federated datasets, use SPL2.

    For example, the following SPL search targets a Splunk index named my_logs:

        index=my_logs

    As another example, the following SPL2 search targets a federated dataset named my_logs:

        FROM my_logs

For information about using SPL searches, see the following documentation:

See also

Related documentation for creating knowledge objects from search results and managing background search operations.

For information about how to create knowledge objects based on your search results, refer to the following documentation:

  • Visualizing your search results in a dashboard: See the Dashboard Studio manual.

  • Saving your search as a report that you can use again at a later time or run on a scheduled interval: See the Reporting Manual.

  • Setting up alerts that notify you when the search results meet specific conditions: See the Alerting Manual.

For information about background search operations that are used to support the Catalog, see Configure background search operations for the Catalog in the Splunk Cloud Platform Admin Manual.