Exclude historical data processing from Ingest Processor pipelines

Configure Ingest Processor pipelines to exclude historical data processing by adding exclusion rules to specific pipelines.

Before you begin, ensure you have administrative access to the Ingest Processor pipelines configuration.

If you have both Data Manager and Ingest Processor, historical data ingested into your environment is, by default, processed by any Ingest Processor pipelines whose partition conditions are met. To prevent historical data from being processed by Ingest Processor, you can configure your pipelines to exclude this data from processing.

Note: Repeat these steps for each Ingest Processor pipeline that you want to exclude from processing historical data.

  1. In Splunk Cloud Platform, from the menu, in the Data management section, select Pipelines.
  2. On the Ingest Processor Pipelines page, find a pipeline that you want to exclude from processing historical data. You can use the search field to filter the pipelines by name.
  3. In the row that lists the pipeline that you want to exclude, select the more icon and then select Edit.
  4. In the pipeline editor, in the Actions section, select the plus icon and then select Route a subset of data.
  5. In the Route data dialog box, configure the following settings:
    1. Set Field to splunk_promote_id.
    2. Set Action to Exclude.
    3. From the Operator drop-down list, select = is NULL.
    4. Select Apply to confirm.
  6. The newly generated route action is added as the last processing step by default. To ensure historical data is excluded before any other pipeline processing, you must move this route action to the beginning of the pipeline.

    Cut the generated route action from the Search Processing Language, version 2 (SPL2) statement and paste it as the first processing step, before any other processing steps that historical data should skip. This ensures that historical data is routed out of the pipeline immediately.

    Before (default placement):

    
    import route from /splunk.ingest.commands

    $pipeline = | from $source
    | eval processed_by_ingest_processor=true
    | route isnotnull(splunk_promote_id), [
        | into $destination2
    ]
    | into $destination

    After (correct placement for exclusion):

    
    import route from /splunk.ingest.commands

    $pipeline = | from $source
    | route isnotnull(splunk_promote_id), [
        | into $destination2
    ]
    | eval processed_by_ingest_processor=true
    | into $destination
  7. Configure the destination for excluded data.
    1. For the generated route action, in the Route data section, select Send data to.
    2. From the list of available destinations, select splunk_indexer and then select Apply.
  8. Select the Save pipeline button above the actions panel.
  9. Select Save to apply your changes to the pipeline.

The Ingest Processor pipeline now routes historical data, identified by a non-null splunk_promote_id field, directly to the splunk_indexer destination. Only new, streaming data, where splunk_promote_id is null, goes through the subsequent steps in this pipeline.
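
To illustrate why the position of the route action matters, the following Python sketch models the two SPL2 pipelines shown in step 6. It is not Splunk code and does not call any Splunk API. It assumes that the route command diverts matching events to the branch destination and passes all other events on to the next step. The function name run_pipeline and the sample splunk_promote_id value are hypothetical.

```python
from typing import Dict, Tuple


def run_pipeline(event: Dict, route_first: bool) -> Tuple[str, Dict]:
    """Model one event passing through the example pipeline.

    Returns the name of the destination the event reaches and the
    event as it looks on arrival. This is an illustrative model only.
    """
    event = dict(event)  # work on a copy so the caller's event is unchanged

    # Models isnotnull(splunk_promote_id): true for historical data.
    is_historical = event.get("splunk_promote_id") is not None

    # Route action placed first: historical data leaves before any processing.
    if route_first and is_historical:
        return "destination2", event

    # Models: | eval processed_by_ingest_processor=true
    event["processed_by_ingest_processor"] = True

    # Route action in its default last position: historical data is
    # diverted only after the eval step has already modified it.
    if not route_first and is_historical:
        return "destination2", event

    return "destination", event


# Hypothetical sample events; "abc123" is a made-up identifier.
historical = {"_raw": "old event", "splunk_promote_id": "abc123"}
streaming = {"_raw": "new event"}

default_result = run_pipeline(historical, route_first=False)
corrected_result = run_pipeline(historical, route_first=True)
streaming_result = run_pipeline(streaming, route_first=True)
```

In this model, the historical event reaches destination2 under both placements, but only the default placement adds the processed_by_ingest_processor field to it first. With the route action first, the historical event arrives unmodified, and the streaming event still passes through the eval step on its way to destination.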

Create the SplunkDMReadOnly IAM role