Splunk Cloud Connect data management

Splunk Cloud Connect provides a secure integration bridge between your on-premises environment and Splunk Cloud Services (SCS). Review how data is managed, transmitted, and used when you use Splunk Cloud Connect to access on-premises deployments of Splunk Enterprise Security with cloud-native, Splunk-managed, or SCS security extensions such as Threat Intelligence Management (TIM) or Detection Studio (DS).

Data flow differs between Splunk Cloud Connect and Splunk Cloud Platform in the following ways:

  • Data flow in Splunk Cloud Connect: Splunk Cloud Connect operates as an app to onboard and configure the secure channel to your Splunk Cloud Services tenant in the Splunk Cloud Platform. Splunk Cloud Connect does not store, cache, or retain any information that is in transit. It only establishes and maintains a secure, reliable connection between your local environment and the cloud environment.

  • Data flow in Splunk Cloud Platform: Once data reaches Splunk Cloud, it is used to support the features and services that you are entitled to use. Data handled through Splunk Cloud Connect is treated the same way as data in any standard Splunk Cloud service, with the same security, operational, and compliance standards that you expect from Splunk Cloud Services.

Each connected extension (and its included features or services) that can be accessed using Splunk Cloud Connect has specific behaviors and requirements for data flow.

Outbound data from on-premises Splunk deployment

Detection Studio: When you actively initiate a test in Detection Studio, specific data can exit your environment.

Note: Data outflow from your on-premises environment does not occur automatically in the background.
  • Detection content and metadata: Splunk Enterprise Security correlation searches, including detection names, search logic, schedule settings, action configurations, severity assignments, MITRE ATT&CK mappings, and version tracking information, are synchronized to Detection Studio on a scheduled basis. Detection Studio needs this information to support centralized detection management.

  • Environment compatibility metadata: A catalog of your Splunk deployment configuration is retrieved on a scheduled basis to allow Detection Studio to validate detection compatibility. The catalog includes the names and versions of installed applications, available index names, the last activity dates, data model definitions, acceleration status, macro definitions used by detections, lookup table names, source and sourcetype inventory. The catalog also contains configuration metadata but not security event content.

  • Detection test result samples: When a detection test is run within Detection Studio, your Splunk deployment runs the detection search against your historical data and returns a time-bound sample of results. These results include job status, search validation output, a configurable number of matched events (with a default maximum of 1,000 events for each test), and aggregate statistics about the event volume for the tested time window.
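The test-result sampling described above can be sketched as a cap on returned events plus aggregate statistics. This is a minimal illustration: the function and field names are assumptions, and only the default maximum of 1,000 events comes from the description.

```python
# Illustrative sketch of capped detection test result sampling.
# Function and field names are assumptions; the 1,000-event default
# maximum is taken from the description above.

def sample_test_results(matched_events, max_events=1000):
    sample = matched_events[:max_events]  # configurable cap for each test
    stats = {
        "total_matched": len(matched_events),  # aggregate event volume
        "returned": len(sample),
        "truncated": len(matched_events) > max_events,
    }
    return {"status": "done", "events": sample, "stats": stats}
```

Because only a capped sample and aggregate counts are returned, the full matched result set never needs to leave the environment.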

Threat Intelligence Management: A very limited amount of administrative and operational outbound data exchange occurs with this integration.

Note: Outbound data from your environment does not include security event data, log content, investigation records, or any other customer-generated security information.
Outbound data | Description of the outbound data | Reason for the outbound data transfer
Subscription verification | A query to confirm that your tenant has an active Threat Intelligence Management subscription | Periodic administrative check to activate the integration
Enclave configuration retrieval | A request to retrieve the identifiers of the intelligence enclaves that your tenant is configured to access | Required to identify the intelligence sources from which to retrieve information
Retrieval checkpoint state | The timestamp of the last successfully processed indicator | Enables incremental retrieval so that only new indicators are retrieved

Inbound data to on-premises Splunk deployment

Detection Studio: This integration sends a limited set of structured data to your Splunk deployment to support detection testing and operational visibility. This inbound data is written to dedicated indexes that you can view, search, or audit within your Splunk on-premises deployment.

The following data is added to your Splunk on-premises deployment with the Detection Studio integration:

  • Requests for detection test runs: When a test is initiated, a search is sent to run inside your local Splunk on-premises deployment and the results are returned to Detection Studio for scoring.

  • Records for tracking Confidence values: After each detection test, a small, structured record is written to a dedicated index in your environment. Each record contains a detection identifier, a cryptographic hash of the result for de-duplication, a version reference, a timestamp, and an indicator of whether the test found matches to the specified criteria. Raw event content is not included in these records.

  • KPI performance metrics: Aggregated detection portfolio scores are written to a dedicated index in your environment on a scheduled basis. These include scores covering detection coverage, health, priority, confidence, performance, and impact at both the portfolio level and for each individual detection. These metrics power dashboards and trending analysis within your Splunk deployment.
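A confidence-tracking record with the fields described above might look like the following sketch. The field names and the choice of SHA-256 are assumptions; the description specifies only that a cryptographic hash is used for de-duplication.

```python
import hashlib
import json
import time

# Illustrative confidence-tracking record. Field names and the choice of
# SHA-256 are assumptions; the description specifies only a cryptographic hash.

def build_confidence_record(detection_id, version, results):
    # Hash a canonical serialization of the result so identical results
    # produce identical hashes and can be de-duplicated.
    payload = json.dumps(results, sort_keys=True).encode()
    return {
        "detection_id": detection_id,
        "result_hash": hashlib.sha256(payload).hexdigest(),  # de-duplication key
        "version": version,
        "timestamp": int(time.time()),
        "has_matches": bool(results),  # no raw event content is stored
    }
```

Storing only the hash and a match indicator, rather than the events themselves, is what keeps raw event content out of these records.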

Threat Intelligence Management (TIM): The following are some examples of the inbound data sent to your Splunk on-premises deployment with the Threat Intelligence Management integration:

  • Threat indicators: Structured threat indicators across multiple observable categories are retrieved from the cloud service and uploaded directly into the native threat intelligence collections of Splunk Enterprise Security. These include:

    • Network-based indicators: IP addresses and domain names

    • Web-based indicators: URLs

    • Identity-based indicators: Email addresses

    • File-based indicators: Cryptographic hashes and software identifiers

    • Endpoint-based indicators: Registry key paths

    Each indicator is delivered with metadata including its observable type, a confidence value, the intelligence source it originates from, and a threat key that supports the native Splunk Enterprise Security correlation.

  • Threat attribution context: Where available, indicators include threat actor and malware attribution data. This is extracted and loaded into a dedicated threat group collection within Splunk Enterprise Security and supports adversary-centric analysis.

  • Full indicator records: A complete copy of each indicator, including all associated metadata, is maintained in a Mission Control-specific collection within your environment for cross-feature use and unified visibility.
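The categories above can be illustrated as a routing step that classifies each indicator by observable type and selects a target collection. The collection and field names here are assumptions for the sketch, not the exact Splunk Enterprise Security schema.

```python
# Illustrative routing of indicators by observable type. Collection and
# field names are assumptions, not the exact Splunk Enterprise Security schema.

OBSERVABLE_COLLECTIONS = {
    "ip": "ip_intel",             # network-based
    "domain": "ip_intel",         # network-based
    "url": "http_intel",          # web-based
    "email": "email_intel",       # identity-based
    "file_hash": "file_intel",    # file-based
    "registry": "registry_intel"  # endpoint-based
}

def route_indicator(indicator):
    """Select the target collection and keep only the described metadata."""
    collection = OBSERVABLE_COLLECTIONS.get(indicator["observable_type"])
    if collection is None:
        raise ValueError(f"unknown observable type: {indicator['observable_type']}")
    record = {
        "value": indicator["value"],
        "confidence": indicator.get("confidence"),
        "source": indicator.get("source"),
        "threat_key": indicator.get("threat_key"),
    }
    return collection, record
```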

Data retained in Splunk Cloud Platform

Detection Studio: The following table indicates the data that is retained in Splunk Cloud Platform for Detection Studio:

Data category | Retained in Splunk Cloud (Y/N)? | Notes
Detection definitions and metadata | Yes | Core function; this is managed content.
Compatibility analysis results | Yes | Stored as scores and gap metadata, not raw configuration.
Confidence and KPI scores | Yes | Aggregate scores and hashes; these are not raw events.
Detection test result samples | No | Used for score calculations and then discarded.
Raw security event data | No | Not retained.
Environment configuration snapshots | Metadata only | Used for compatibility analysis.

Threat Intelligence Management: Intelligence records in your environment are subject to automated retention management. Records that exceed the defined retention period are discarded automatically. This keeps intelligence collections current and prevents unbounded growth of historical indicator data. The retention period is configurable to align with your operational and compliance requirements.
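The retention behavior described above amounts to a periodic sweep that discards records older than the configured period. A minimal sketch, assuming a simple record shape with a Unix timestamp:

```python
import time

# Minimal retention sweep: keep only records newer than the configured
# retention period. The record shape ({"timestamp": ...}) is an assumption.

def apply_retention(records, retention_days, now=None):
    if now is None:
        now = time.time()
    cutoff = now - retention_days * 86400  # 86400 seconds per day
    return [r for r in records if r["timestamp"] >= cutoff]
```

Making `retention_days` a parameter mirrors the configurable retention period described above.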

Data usage in Threat Intelligence Management

For this integration, data is processed using a structured, three-stage pipeline that operates entirely within your Splunk environment after data is retrieved from the cloud service.

  1. Stage 1 (Retrieval): An authenticated process within your Splunk environment connects to the Threat Intelligence Management cloud service and retrieves available indicators from your configured intelligence sources. During the first retrieval, a defined historical window of indicators is retrieved. Subsequent retrievals are incremental and only indicators that are new or updated since the last successful retrieval are recovered.

  2. Stage 2 (Local staging): Retrieved indicators are temporarily written as structured files to a local directory within your Splunk environment. This intermediate stage allows the pipeline to handle indicator volume without affecting the performance of the Splunk search. The number of staged files is limited and managed automatically.

  3. Stage 3 (Parsing and loading): A separate process reads the staged files, classifies each indicator by observable type, and writes it to the appropriate threat intelligence collection in Splunk Enterprise Security. Processed files are deleted after ingestion. Threat actor and malware attribution data is extracted separately and written to a dedicated threat group collection.
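The three stages above can be sketched as follows. This is a minimal illustration only: the fetch function, file layout, and record fields are hypothetical stand-ins, not the actual Threat Intelligence Management implementation.

```python
import json
import os
import tempfile

# Minimal sketch of the three-stage pipeline described above.
# fetch_indicators, load_into_collection, and the record fields are
# hypothetical stand-ins, not the actual TIM implementation.

def run_pipeline(fetch_indicators, load_into_collection, checkpoint):
    # Stage 1: incremental retrieval since the last checkpoint timestamp.
    indicators = fetch_indicators(since=checkpoint["last_ts"])

    # Stage 2: stage the batch as a structured file in a local directory.
    staging_dir = tempfile.mkdtemp(prefix="tim_staging_")
    batch_path = os.path.join(staging_dir, "batch.json")
    with open(batch_path, "w") as f:
        json.dump(indicators, f)

    # Stage 3: parse the staged file, load each indicator, advance the
    # checkpoint, then delete the processed file.
    with open(batch_path) as f:
        for indicator in json.load(f):
            load_into_collection(indicator)
            checkpoint["last_ts"] = max(checkpoint["last_ts"], indicator["updated"])
    os.remove(batch_path)
    return checkpoint
```

The checkpoint advances only after an indicator is successfully loaded, which is what makes subsequent retrievals incremental.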

Data not leaving your on-premises environment

Detection Studio: The following table indicates the categories of data that are not part of this integration's data exchange:

Data category | Data sent to Detection Studio (Y/N)?
Raw security event archives | No
Full index contents | No
All saved searches (non-detection) | No; only correlation searches are synchronized.
Incident or notable event records | No
Investigation or case data | No
User activity outside detection testing | No
Credentials or authentication secrets | No
Packet capture or network flow data | No

Threat Intelligence Management: The following categories of data are not part of this integration's data exchange:

Data category | Data sent to Threat Intelligence Management (Y/N)?
Security event data | No
Log or index content | No
Detection results or alert records | No
Investigation or case content | No
User activity or behavioral data | No
Packet capture or network flow data | No
Credentials or private keys | No
Threat intelligence collections | No

Security controls

The following table outlines the security controls for your data in Detection Studio and Threat Intelligence Management:

Security control | Description
Encryption | All data in transit is protected by TLS 1.2 or higher.
Message authentication | HMAC-based authentication is applied to all messages.
WebSocket establishment direction | Connections are established outbound from your environment; no inbound connections from the cloud.
No direct database access | No database connections cross the network boundary.
Replay protection | Session controls prevent message interception or replay.
Audit logging | Client API activity is logged to Splunk audit logs in your environment.
Data minimization | Only required data is exchanged. Hashes are used instead of raw content whenever possible.
Role-based access | Detection Studio enforces role-based controls for all operations.
Authentication | Tenant-scoped service credentials authenticate all cloud service interactions.
Encrypted transport | All data retrieved from the cloud service is transmitted over encrypted channels.
Tenant isolation | Intelligence is retrieved only from enclaves configured for your specific tenant.
Incremental retrieval | Only new or updated indicators are retrieved after the initial retrieval.
Staging cleanup | Intermediate files are removed from local storage after successful processing.
Retention management | Aged records are removed automatically according to the configured policy.
Unidirectional intelligence flow | Intelligence enters your environment; your event data does not leave through this integration.
Ability to audit the full collection | All delivered intelligence is visible and searchable within your Splunk environment.
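The HMAC message authentication and replay protection controls can be illustrated with Python's standard `hmac` module. The wire format (timestamp prepended to the body) and the 300-second skew window are assumptions for the sketch, not the actual protocol.

```python
import hashlib
import hmac
import time

# Illustrative HMAC message authentication with timestamp-based replay
# protection. The message format and skew window are assumptions.

def sign_message(key: bytes, body: bytes, ts: int) -> str:
    return hmac.new(key, ts.to_bytes(8, "big") + body, hashlib.sha256).hexdigest()

def verify_message(key: bytes, body: bytes, ts: int, sig: str,
                   max_skew: int = 300) -> bool:
    if abs(time.time() - ts) > max_skew:
        return False  # stale timestamp: likely a replayed message
    expected = sign_message(key, body, ts)
    return hmac.compare_digest(expected, sig)  # constant-time comparison
```

Binding the timestamp into the signature means an attacker cannot reuse a captured signature with a fresh timestamp, and `compare_digest` avoids timing side channels during verification.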

Customer controls

The following table outlines the customer controls for your data in Detection Studio and Threat Intelligence Management:

Controlled data flow | Method used to control the data flow
Whether the integration is active | Activate or deactivate the client component in your environment.
When detection tests run | Tests can be operator-initiated or run automatically; the frequency of automatic tests is configurable.
Test result volume | Configure the maximum event sample size for each test.
On-premises operational independence | Splunk Enterprise Security runs detections regardless of cloud connectivity.
Scope of the intelligence source | Configure the enclaves to include in retrieval.
Retrieval behavior | Adjust scheduling within supported parameters.
Retention period | Configure the retention period to align with your operational and compliance requirements.
Full data visibility | All intelligence collections are accessible using standard Splunk tools.
Operational independence | Splunk Enterprise Security continues using previously retrieved intelligence if cloud connectivity is temporarily unavailable.