Prerequisites

The prerequisites that must be met before ingesting historical data.

Make sure that the following prerequisites are met before ingesting historical data:
  • The Ingest Processor and ingest actions are designed to process incoming data, often routing it to specific Splunk indexes based on criteria like source type. When you create a promote job to ingest historical data, it also relies on source type for its processing.

    The historical data can be ingested and processed through Ingest Processor and ingest actions.

    To exclude historical data from processing by ingest actions rulesets and Ingest Processor pipelines, you must update your ingest actions ruleset and Ingest Processor pipeline configurations.

    Important: Define the rules for how Ingest Processor and ingest actions handle your promote data before you create the promote input or before its scheduled execution time.

    If you do not define specific rules for your promote data, it will automatically be processed by existing Ingest Processor pipelines and ingest actions. These existing pipelines are likely configured for stream data and may route your promote data to an incorrect index, apply redundant or conflicting transformations, or even drop the data entirely.

    Configure your Ingest Processor and ingest actions with explicit routing rules for the source type of your promote data. To exclude the promote data from ingest actions rulesets and Ingest Processor pipelines, follow these procedures:

  • Create the SplunkDMReadOnly IAM role

    Ask your AWS admin to create the SplunkDMReadOnly IAM role in your AWS account. This role lets Splunk Cloud read the configurations from the various AWS services that data is collected from. Configure the SplunkDMReadOnly IAM role with the following trust relationship and policy. Make sure that the AWS administrator replaces the account identifiers in the policy. If you have already created this role, verify that its role policy matches the following role policy.

    Copy the Role Policy statement from Data Manager:
    The following code presents the Role Policy statement but with the <DATA_ACCOUNT_ID> placeholder. Replace this placeholder with your AWS account ID.
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "iam:GetRole",
                    "iam:PassRole",
                    "iam:GetRolePolicy",
                    "iam:ListRolePolicies",
                    "iam:ListAttachedRolePolicies",
                    "iam:GetPolicy",
                    "iam:GetPolicyVersion"
                ],
                "Resource": [
                    "arn:aws:iam::<DATA_ACCOUNT_ID>:role/SplunkDM*",
                    "arn:aws:iam::<DATA_ACCOUNT_ID>:policy/*",
                    "arn:aws:iam::<DATA_ACCOUNT_ID>:role/SplunkUCF-*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": "guardduty:GetMasterAccount",
                "Resource": "arn:aws:guardduty:*:<DATA_ACCOUNT_ID>:detector/*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "securityhub:GetEnabledStandards",
                    "securityhub:GetMasterAccount",
                    "securityhub:ListMembers",
                    "securityhub:ListInvitations"
                ],
                "Resource": "arn:aws:securityhub:*:<DATA_ACCOUNT_ID>:hub/default"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "cloudformation:DescribeStacks",
                    "cloudformation:GetTemplate"
                ],
                "Resource": "arn:aws:cloudformation:*:<DATA_ACCOUNT_ID>:stack/SplunkDM*/*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:ListMetrics",
                    "cloudwatch:GetMetricStatistics",
                    "cloudtrail:DescribeTrails",
                    "guardduty:ListDetectors",
                    "guardduty:ListMembers",
                    "guardduty:ListInvitations",
                    "guardduty:GetFindingsStatistics",
                    "access-analyzer:ListAnalyzers",
                    "sqs:GetQueueUrl",
                    "ec2:DescribeFlowLogs"
                ],
                "Resource": "*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "logs:DescribeLogGroups",
                    "logs:DescribeSubscriptionFilters"
                ],
                "Resource": [
                    "arn:aws:logs:*:<DATA_ACCOUNT_ID>:log-group:*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "firehose:DescribeDeliveryStream"
                ],
                "Resource": [
                    "arn:aws:firehose:*:<DATA_ACCOUNT_ID>:deliverystream/SplunkDM*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "events:DescribeRule"
                ],
                "Resource": [
                    "arn:aws:events:*:<DATA_ACCOUNT_ID>:rule/SplunkDM*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket"
                ],
                "Resource": [
                    "arn:aws:s3:::splunkdmfailed*",
                    "arn:aws:s3:::sdm-dataingest-cft*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "lambda:GetFunction"
                ],
                "Resource": [
                    "arn:aws:lambda:*:<DATA_ACCOUNT_ID>:function:SplunkDM*"
                ]
            }
        ]
    }
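    The placeholder replacement described above can be sketched in Python. This is a minimal illustration, not part of Data Manager: the function name fill_account_id, the shortened template, and the account ID 123456789012 are assumptions made for the example. Use the full Role Policy statement copied from Data Manager in practice.

```python
import json
import re

PLACEHOLDER = "<DATA_ACCOUNT_ID>"


def fill_account_id(policy_template: str, account_id: str) -> dict:
    """Replace every <DATA_ACCOUNT_ID> placeholder and return the parsed policy.

    Hypothetical helper for illustration only.
    """
    # AWS account IDs are 12-digit numbers.
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account ID must be exactly 12 digits")
    filled = policy_template.replace(PLACEHOLDER, account_id)
    # json.loads raises an error if the result is not valid JSON.
    return json.loads(filled)


# Shortened, illustrative template; the real statement has more entries.
template = """
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["lambda:GetFunction"],
            "Resource": ["arn:aws:lambda:*:<DATA_ACCOUNT_ID>:function:SplunkDM*"]
        }
    ]
}
"""

policy = fill_account_id(template, "123456789012")  # example account ID
print(policy["Statement"][0]["Resource"][0])
```

    Parsing the result as JSON catches accidental edits, such as a dropped comma or quote, before the policy is handed to the AWS admin.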
    Copy the Trust Relationship statement from Data Manager:
    The View Trust Relationship statement is available on the Prerequisites page when you create a new input.
    The following code presents the Trust Relationship statement but with placeholders for the following information:
    • <EXTERNAL_ID>

    • <YOUR_AWS_ACCOUNT_ID>

    • <YOUR_IAM_ROLE_NAME>

    These placeholders are replaced with the valid values in the Trust Relationship statement that is available in Data Manager.
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Principal": {
                    "AWS": "arn:aws:iam::<YOUR_AWS_ACCOUNT_ID>:role/<YOUR_IAM_ROLE_NAME>"
                },
                "Condition": {
                    "StringEquals": {
                        "sts:ExternalId": "<EXTERNAL_ID>"
                    }
                }
            }
        ]
    }
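    To show how the three placeholders fit into the statement, the following Python sketch builds the same structure from example values. The function name build_trust_policy and all values passed to it are hypothetical. The authoritative statement, with the valid values already filled in, is the one that Data Manager displays.

```python
import json


def build_trust_policy(account_id: str, role_name: str, external_id: str) -> dict:
    """Assemble a trust relationship document with the same shape as the
    statement above. Hypothetical helper for illustration only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                # <YOUR_AWS_ACCOUNT_ID> and <YOUR_IAM_ROLE_NAME> form the principal ARN.
                "Principal": {
                    "AWS": f"arn:aws:iam::{account_id}:role/{role_name}"
                },
                # <EXTERNAL_ID> must match the ExternalId sent with sts:AssumeRole.
                "Condition": {
                    "StringEquals": {"sts:ExternalId": external_id}
                },
            }
        ],
    }


# Example values only; copy the real ones from Data Manager.
trust_policy = build_trust_policy("123456789012", "ExampleRole", "example-external-id")
print(json.dumps(trust_policy, indent=4))
```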
  • (Optional) Create an onboarding user

    Ask your AWS admin to create the onboarding user in the AWS account. This user allows you to take actions on resources, such as creating CloudFormation stacks and listing S3 buckets. Configure the onboarding user with the following IAM User policy. Make sure that the AWS administrator replaces the account identifiers in the policy.

    Copy the IAM User Policy statement from Data Manager:
    The following code presents the IAM User Policy statement but with the <DATA_ACCOUNT_ID> placeholder. Replace this placeholder with your AWS account ID.
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "iam:GetRole",
                    "iam:PassRole",
                    "iam:CreateRole",
                    "iam:DeleteRole",
                    "iam:PutRolePolicy",
                    "iam:GetRolePolicy",
                    "iam:DeleteRolePolicy",
                    "s3:CreateBucket",
                    "s3:DeleteBucket",
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:ListBucket",
                    "s3:GetBucketVersioning",
                    "s3:ListBucketVersions",
                    "cloudformation:DeleteStack",
                    "cloudformation:DescribeStackEvents",
                    "cloudformation:TagResource"
                ],
                "Resource": [
                    "arn:aws:s3:::splunkdmfailed-*",
                    "arn:aws:s3:::sdm-dataingest-cft-*",
                    "arn:aws:cloudformation:*:<DATA_ACCOUNT_ID>:stack/SplunkDM*",
                    "arn:aws:iam::<DATA_ACCOUNT_ID>:role/SplunkDM*",
                    "arn:aws:iam::<DATA_ACCOUNT_ID>:role/SplunkUCF-*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:DeleteLogGroup",
                    "logs:DescribeLogStreams",
                    "logs:CreateLogStream",
                    "logs:DeleteLogStream",
                    "logs:TagResource",
                    "logs:ListTagsForResource"
                ],
                "Resource": [
                    "arn:aws:logs:*:<DATA_ACCOUNT_ID>:log-group:/aws/kinesisfirehose/SplunkDM*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "lambda:GetFunction",
                    "lambda:DeleteFunction",
                    "lambda:GetFunctionConfiguration",
                    "lambda:CreateFunction",
                    "lambda:InvokeFunction",
                    "lambda:AddPermission",
                    "lambda:RemovePermission"
                ],
                "Resource": [
                    "arn:aws:lambda:*:<DATA_ACCOUNT_ID>:function:SplunkDM*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "firehose:DescribeDeliveryStream",
                    "firehose:CreateDeliveryStream",
                    "firehose:DeleteDeliveryStream",
                    "firehose:UpdateDestination"
                ],
                "Resource": [
                    "arn:aws:firehose:*:<DATA_ACCOUNT_ID>:deliverystream/SplunkDM*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "events:DescribeRule",
                    "events:PutRule",
                    "events:DeleteRule",
                    "events:PutTargets",
                    "events:RemoveTargets"
                ],
                "Resource": [
                    "arn:aws:events:*:<DATA_ACCOUNT_ID>:rule/SplunkDMIAMCredentialReportScheduleRule",
                    "arn:aws:events:*:<DATA_ACCOUNT_ID>:rule/SplunkDMIAMAccessAnalyzerEventBridgeRule",
                    "arn:aws:events:*:<DATA_ACCOUNT_ID>:rule/SplunkDMMetadata*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "cloudformation:CreateStack",
                    "cloudformation:ListStacks",
                    "cloudformation:DescribeStacks",
                    "cloudformation:UpdateStack",
                    "s3:ListAllMyBuckets"
                ],
                "Resource": "*"
            }
        ]
    }
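    Before the policy is applied, it can help to confirm that no placeholder was left unreplaced. The following Python sketch is one way to check this. It assumes that placeholders are upper-case names in angle brackets, as in this topic. The function name find_placeholders is hypothetical.

```python
import re


def find_placeholders(policy_text: str) -> list[str]:
    """Return the distinct <UPPER_CASE> placeholders still present in the text.

    Hypothetical helper for illustration only.
    """
    # Matches tokens such as <DATA_ACCOUNT_ID> or <EXTERNAL_ID>.
    return sorted(set(re.findall(r"<[A-Z_]+>", policy_text)))


# One resource line still has its placeholder, the other has been filled in.
sample = (
    '"arn:aws:iam::<DATA_ACCOUNT_ID>:role/SplunkDM*", '
    '"arn:aws:iam::123456789012:role/SplunkUCF-*"'
)
leftover = find_placeholders(sample)
if leftover:
    print("Unreplaced placeholders:", ", ".join(leftover))
```

    An empty list means every placeholder of this form has been replaced. The check does not validate the values themselves.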
  • Prepare information about your AWS account and the paths to the S3 buckets that you want to ingest.
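    When you collect the S3 paths, it can be convenient to record the bucket name and the key prefix separately. The following Python sketch assumes that the paths are written as s3://bucket/prefix URIs, which this topic does not specify. The function name split_s3_path and the bucket name are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_path(s3_path: str) -> tuple[str, str]:
    """Split an s3://bucket/prefix URI into (bucket, prefix).

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(s3_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an s3:// URI: {s3_path!r}")
    # The URI path starts with "/", which is not part of the object key prefix.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_path("s3://example-archive-bucket/logs/2023/")
print(bucket, prefix)
```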