About Federated Search for Microsoft Azure

Federated Search for Microsoft Azure lets you run federated searches from your Splunk platform deployment over datasets located in Microsoft Azure Data Lake Storage (ADLS) and Azure Blob Storage (ABS) containers. When you run these federated searches, you use familiar SPL2 search commands and syntax.

Note: If you want to search Azure Databricks tables stored remotely in Unity Catalog, see About Federated Search for Azure Databricks.

Connections and datasets

Federated Search for Microsoft Azure is part of the Data Management app, where you'll set up your federated search experience through the definition of connections and datasets.

Connection: A Microsoft Azure connection defines how Splunk software securely authenticates a link between your Splunk Cloud Platform deployment and a federated dataset in an ADLS or ABS container. Connections are reusable and can be associated with multiple Microsoft Azure datasets. Microsoft Azure connections do not specify what data is searchable.
Dataset: A Microsoft Azure dataset is a searchable data object that is associated with a single Microsoft Azure connection. Each Microsoft Azure dataset is defined by an ADLS or ABS container URL.
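
The relationship between connections and datasets can be pictured with a small data model. This is only an illustrative sketch, not the product's API: the class names, the `auth_reference` field, and the sample container URLs are all hypothetical. It shows a connection that carries authentication details but no container URL, and several datasets that each point to exactly one connection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AzureConnection:
    """Hypothetical stand-in for a connection: authentication only, no data location."""
    name: str
    auth_reference: str  # assumed placeholder for whatever credentials the connection stores


@dataclass(frozen=True)
class AzureDataset:
    """Hypothetical stand-in for a dataset: one container URL, exactly one connection."""
    name: str
    container_url: str
    connection: AzureConnection


def datasets_for(connection, datasets):
    """Return the names of all datasets that reuse the given connection."""
    return [d.name for d in datasets if d.connection == connection]


# One connection reused by two datasets (all values are made up).
shared = AzureConnection(name="azure-prod", auth_reference="example-credential-id")
catalog = [
    AzureDataset("firewall_logs", "https://examplestorage.blob.core.windows.net/firewall", shared),
    AzureDataset("audit_logs", "https://examplestorage.dfs.core.windows.net/audit", shared),
]
```

Because the connection holds no container URL, adding a third dataset in this model means creating another `AzureDataset` that points at the same `AzureConnection`, which mirrors the reuse described above.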

Two core workflows

Microsoft Azure datasets and connections support two workflows. One workflow facilitates a combination of data routing and federated search. The other workflow is only for federated search.

Data routing to federated search workflow: Configure a Data routing and federated search dataset in an ADLS or ABS container that Edge Processor pipelines, Ingest Processor pipelines, or both can use as a destination. You can optionally configure the dataset to support federated searches, so that you can use the same dataset to write data to and read data from the ADLS or ABS container.
Federated search only workflow: Create a Federated search only dataset that is stored in an ADLS or ABS container and is backed by a Splunk data catalog. Select this option when you want to focus on search of data that you are storing in Microsoft Azure and do not require a data routing solution.

Splunk-native data catalog generation

Federated Search for Microsoft Azure searches apply filtering and statistical functions to data catalogs that contain column, schema, and partition definitions for datasets in your ADLS or ABS containers. This means that a data catalog must be associated with each Microsoft Azure dataset you intend to search.

Federated Search for Microsoft Azure builds a Splunk-native data catalog for each dataset you define. You can let Splunk software automatically infer the dataset schema and partitions with a crawler, or you can manually configure the dataset schema and partitions yourself.

You can arrange to keep this catalog in sync with your dataset as your dataset changes over time.
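
This page does not describe how the crawler works internally. As a rough illustration of what inferring partitions can mean, the following sketch assumes that files in the container are laid out with Hive-style `key=value` path segments, which is an assumption and not a documented requirement. The function name and the sample paths are hypothetical.

```python
def infer_partition_keys(blob_paths):
    """Return the partition keys shared by every path, in first-seen order.

    Assumes Hive-style directory segments such as 'year=2024/month=01/'.
    """
    common = None
    for path in blob_paths:
        # Ignore the final segment (the file name) and keep only key=value directories.
        directories = path.split("/")[:-1]
        keys = [seg.split("=", 1)[0] for seg in directories if "=" in seg]
        if common is None:
            common = keys
        else:
            # Keep only the keys that also appear in this path.
            common = [k for k in common if k in keys]
    return common or []


# Made-up object paths inside a container.
sample_paths = [
    "year=2024/month=01/events-0001.json",
    "year=2024/month=02/events-0002.json",
]
partition_keys = infer_partition_keys(sample_paths)
```

If a layout does not follow a convention that can be inferred this way, the manual configuration option described above is the alternative.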

What you need to get started

To get started with federated search of data you store in Microsoft Azure, you must have the following things:

You must have a Splunk Cloud Platform (SCP) deployment.
Your user account on the SCP deployment must have a role with the edit_connections and edit_datasets capabilities. See Define roles on the Splunk platform with capabilities in the Splunk Cloud Platform Manage Users and Security manual.
You must have a Microsoft Azure account with data in ADLS or ABS containers that conforms to supported file and compression types.
(Optional) The Azure storage account that contains the Microsoft Azure dataset you want to access may have network-level access restrictions that prevent you from performing read or write operations on that dataset. To get around these restrictions, set up an IP address allow list for the storage account that corresponds to the Cloud region of your Splunk Cloud Platform deployment.
- For instructions, see Set the default public network access rule for an Azure Storage account in the Azure Blob Storage documentation.
- To get an IP address list that corresponds to the Cloud region of your Splunk platform deployment, see IP address lists for Cloud regions.
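
Before you test a dataset, it can help to confirm that every IP range published for your Cloud region is covered by the storage account's allow list. The following is a minimal sketch that uses the Python standard library `ipaddress` module. The function name is hypothetical, and the ranges are reserved documentation addresses, not real Splunk or Azure values. Replace them with the ranges from the IP address lists topic and from your storage account's network settings.

```python
import ipaddress


def uncovered_ranges(required_cidrs, allowed_cidrs):
    """Return the required CIDR ranges that no allow-list entry fully contains."""
    allowed = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    missing = []
    for cidr in required_cidrs:
        network = ipaddress.ip_network(cidr)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        covered = any(
            network.version == entry.version and network.subnet_of(entry)
            for entry in allowed
        )
        if not covered:
            missing.append(cidr)
    return missing


# Placeholder values only (RFC 5737 documentation ranges).
region_ranges = ["192.0.2.10/32", "198.51.100.0/24"]
storage_allow_list = ["192.0.2.0/24"]
still_blocked = uncovered_ranges(region_ranges, storage_allow_list)
```

Any range that the function returns still needs to be added to the storage account's allow list before read or write operations from that range can succeed.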

Checklist of tasks to set up Federated Search for Microsoft Azure

The following checklist guides you through the cross-account setup of Federated Search for Microsoft Azure.


Step	Task	Description
1	Create a Microsoft Azure connection	A connection holds the authentication details that let your Splunk platform deployment run federated searches over Microsoft Azure datasets. Connections can also support the sending of data from Edge Processor or Ingest Processor to an Azure dataset.
2	Define a Microsoft Azure dataset	Provide baseline information for your dataset, including its name and the URL of its ADLS or ABS container. Link it to a connection. Determine whether the dataset is used for Data Routing and Federated Search or just Federated Search.
3	Configure Microsoft Azure dataset details	Define a Federated Search dataset that is in an Azure storage container and is backed by a Splunk Catalog. You can define the dataset's schema and partition keys yourself, or you can let Splunk software use a crawler to automatically infer the schema and partition keys.
4	Give your users role-based access control of federated datasets	After you have successfully created a Microsoft Azure dataset, give your users role-based access to it.
5	Write and run federated searches over federated datasets with SPL2	Run federated searches over your new Microsoft Azure dataset with SPL2.
