Exclude historical data processing from ingest actions rulesets

Configure ingest actions rulesets to separate historical data processing from stream data processing for the same source type.

Ensure you have administrative access to the ingest actions rulesets configuration.

For a given source type, you can configure specific rules to separate historical data processing from stream data processing. This configuration routes historical data to a dedicated index and excludes it from further processing by rulesets designed for stream data.

Note: Each source type can only have one associated ruleset.
  1. Identify the rulesets that require updating:
    1. In Splunk Cloud Platform, select Settings > Ingest actions > Rulesets.
    2. In the sourcetype column, find rulesets that process historical data.

      Alternatively, use the search box on the right to filter for relevant rulesets. For example, you can filter by a sourcetype.

      Ingest actions rulesets page with the sourcetype column and search box for filtering
    3. For the ruleset that you want to update, in the Actions column, select Edit.
  2. Exclude historical data in a ruleset.

    You will add two rules to your ruleset to handle historical data:

    • Configure data routing for historical data.
    • Drop historical data from further processing within this ruleset.
    For example, suppose you want to exclude historical data in a ruleset for the test123 source type, under the following conditions:
    • The existing rules in this ruleset replace the value staging with masked, and then set the index to stream.

    • You want to send historical data to the promote index, and continue ingest actions processing for stream data only.

    Add the following rules at the beginning of your ruleset, before any existing processing rules:

    1. To configure the data routing rule, go to the Route to Destination section and set the following values:
      • In the Condition section, select Eval.

      • In the Eval Expression field, enter isnotnull(splunk_promote_id).

      Note:

      Historical (promote) data uses a different index than the stream data, so you don't have to set an index in this rule.

      On its own, this rule leaves historical data in the ingest actions processing pipeline. The next rule removes it from further processing.

    2. To confirm changes, select Apply.
    3. To configure the data dropping rule, go to the Filter using Eval section and in the Drop Events Matching Eval Expression field, enter isnotnull(splunk_promote_id).
    4. To confirm changes, select Apply.
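The combined effect of the two new rules and the existing rules can be sketched in Python. This is only an illustrative model of the rule order described above, not how ingest actions is implemented: the function names `is_historical` and `apply_ruleset` are hypothetical, and the masking step assumes the existing rule rewrites the `environment` value from `staging` to `masked`, as in the test123 example.

```python
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]


def is_historical(event: Event) -> bool:
    """Model of isnotnull(splunk_promote_id): true when the field is present."""
    return event.get("fields", {}).get("splunk_promote_id") is not None


def apply_ruleset(events: List[Event]) -> Tuple[List[Event], List[Event]]:
    """Return (routed_historical, processed_stream) for a batch of events.

    Rule 1 (route): historical events are routed unchanged.
    Rule 2 (drop): historical events leave the ruleset before later rules run.
    Existing rules (assumed): mask 'staging' as 'masked', then set index to 'stream'.
    """
    routed: List[Event] = []
    processed: List[Event] = []
    for event in events:
        if is_historical(event):
            routed.append(event)  # rule 1: route as-is
            continue              # rule 2: drop from further processing
        body = dict(event.get("event", {}))
        body["environment"] = str(body.get("environment", "")).replace("staging", "masked")
        stream_event = dict(event)
        stream_event["event"] = body
        stream_event["index"] = "stream"
        processed.append(stream_event)
    return routed, processed
```

Because both new rules use the same condition and sit before the existing rules, only stream events ever reach the masking and index-setting steps in this model.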
  3. (Optional) Test your ruleset by sending both stream and historical data through the HTTP Event Collector (HEC).

    Example HEC input:

    
    {
        "event": {
            "message": "stream data",
            "environment": "staging"
        },
        "sourcetype": "test123",
        "index": "main",
        "host": "dataserver992.example.com",
        "fields": {
            "data_manager_input_id": "200"
        }
    }
    {
        "event": {
            "message": "promote data",
            "environment": "staging"
        },
        "sourcetype": "test123",
        "index": "main",
        "host": "dataserver992.example.com",
        "fields": {
            "splunk_promote_id": "400"
        }
    }
                        
    Expected results:

    • The historical data is sent to the promote index and remains unprocessed (environment=staging).
    • The stream data is sent to the stream index and is processed according to your existing rules (environment=masked).
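One way to send the two test events is a short Python script that uses only the standard library. This is a minimal sketch under stated assumptions: `HEC_URL` and `HEC_TOKEN` are placeholders that you must replace with the HEC endpoint and token for your own deployment, and the helper names `make_event`, `build_request`, and `send` are hypothetical. The script posts both events in one request body to the `/services/collector/event` endpoint, using the `Authorization: Splunk <token>` header.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Placeholders (assumptions): replace with your own HEC endpoint and token.
HEC_URL = "https://hec.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"


def make_event(message: str, fields: Dict[str, str]) -> Dict[str, Any]:
    """Build one HEC event shaped like the example input above."""
    return {
        "event": {"message": message, "environment": "staging"},
        "sourcetype": "test123",
        "index": "main",
        "host": "dataserver992.example.com",
        "fields": fields,
    }


def build_request(events: List[Dict[str, Any]],
                  url: str = HEC_URL,
                  token: str = HEC_TOKEN) -> urllib.request.Request:
    """Batch the events into one POST body, one JSON object per line."""
    body = "\n".join(json.dumps(e) for e in events).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Splunk {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def send(events: List[Dict[str, Any]]) -> int:
    """Send the batch and return the HTTP status code."""
    with urllib.request.urlopen(build_request(events), timeout=10) as resp:
        return resp.status


TEST_EVENTS = [
    make_event("stream data", {"data_manager_input_id": "200"}),
    make_event("promote data", {"splunk_promote_id": "400"}),
]

if __name__ == "__main__":
    print(send(TEST_EVENTS))
```

After sending, search the promote and stream indexes to confirm that each event landed where the expected results describe.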
Exclude historical data processing from Ingest Processor pipelines