Upload and run inference on pre-trained AWS SageMaker models in the AI Toolkit

In version 5.6.4 and higher of the AI Toolkit, you can invoke pre-trained AWS SageMaker machine learning (ML) models for inference in the toolkit.

The SageMaker Inference Endpoint Integration feature lets AI Toolkit users invoke their own advanced, custom-built, AWS SageMaker–hosted models directly from Splunk platform searches, dashboards, and alerts, bringing model predictions into Splunk platform workflows using the familiar ML-SPL apply command.

Pro-code users can operationalize advanced ML workloads within the Splunk platform while leveraging SageMaker's managed infrastructure for scalable inference. This eliminates GPU, CPU, and Python library limitations, allowing for inference on large, complex, or custom ML models hosted in AWS, without overloading the search head.

Note: SageMaker models follow the same permission rules as other models you create in the AI Toolkit.

Key benefits

The SageMaker Inference Endpoint Integration feature offers the following benefits:

  • Advanced model support by using SageMaker to host and scale complex ML models that the AI Toolkit cannot run on the Splunk search head.

  • Improved scale and performance by offloading the heavy, high‑cardinality inference from the Splunk search head to managed SageMaker endpoints.

  • Faster operationalization by invoking SageMaker models directly from Splunk platform searches, dashboards, and alerts with a single ML-SPL command.

SageMaker feature permissions

See the following table for the permissions needed to perform SageMaker Inference Endpoint Integration feature operations:
Note: All users can run inference on registered models. Users without the edit_endpoints capability can run models but cannot register new models.
SageMaker model inference operation | Required permissions
Edit, create, test, and delete | edit_endpoints, edit_storage_passwords, and list_storage_passwords
Use the apply command to invoke the SageMaker model | Search permissions and list_storage_passwords

SageMaker feature requirements

You must meet the following requirements to use the SageMaker Inference Endpoint Integration feature:

  • You must be a user of the AWS SageMaker Service and are expected to manage your own AWS customer configurations.

    • All costs associated with SageMaker training and inference are borne directly by the customer in their AWS account.

  • Completion of configuration steps within your instance of AWS SageMaker.

  • Completion of configuration steps from the ML models tab of the AI Toolkit.

SageMaker feature workflow overview

See the following table for the high-level workflow of the SageMaker Inference Endpoint Integration feature:

Workflow step | Description
Build and deploy in AWS | Customers use their existing AWS accounts and SageMaker expertise to build, train, and deploy their models.
Register in the Splunk platform | An administrator securely registers the SageMaker model in the AI Toolkit's governed catalog. This is a one-time, secure setup using IAM roles, with no exposed credentials. Note: The AI Toolkit supports certain content types for the SageMaker Inference Endpoint. See Supported content types.
Invoke the model in the Splunk platform | Pro-code users leverage the apply command in SPL to run the model against their Splunk platform data. Predictions appear directly in Splunk platform searches, dashboards, and alerts for instant operationalization.

Syntax: | apply <endpoint-name> runtime=sagemaker

Example: | apply <sagemaker-endpoint> runtime="sagemaker" features="field1,field2,field3"

Supported content types

The AI Toolkit supports the following content types for the SageMaker Inference Endpoint:

Content type: application/json

Sample input feature mapping:

{
  "cpu_usage": "instances[*].cpu_usage",
  "memory_usage": "instances[*].memory_usage",
  "disk_io": "instances[*].disk_io",
  "network_latency": "instances[*].network_latency",
  "error_count": "instances[*].error_count"
}

Sample output feature mapping:

{
  "result[*].prediction": "log_severity",
  "result[*].confidence": "confidence"
}

OpenAPI spec:
Note: Supports only JSON spec and version 3.0.0.

{
  "openapi": "3.0.0",
  "info": {
    "title": "Log Event Severity Classification API",
    "version": "1.0.0",
    "description": "Classifies log events into severity levels with confidence scores"
  },
  "paths": {
    "/invocations": {
      "post": {
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "type": "object",
                "properties": {
                  "instances": {
                    "type": "array",
                    "items": {
                      "type": "object",
                      "properties": {
                        "cpu_usage": { "type": "number" },
                        "memory_usage": { "type": "number" },
                        "disk_io": { "type": "number" },
                        "network_latency": { "type": "number" },
                        "error_count": { "type": "integer" }
                      },
                      "required": [
                        "cpu_usage",
                        "memory_usage",
                        "disk_io",
                        "network_latency",
                        "error_count"
                      ]
                    }
                  }
                },
                "required": ["instances"]
              }
            }
          }
        },
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "type": "object",
                  "properties": {
                    "result": {
                      "type": "array",
                      "items": {
                        "type": "object",
                        "properties": {
                          "prediction": { "type": "integer" },
                          "confidence": { "type": "number" }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
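
For illustration, assuming the input and output feature mappings above are applied as JSONPath-style templates over the request and response bodies, a single row of Splunk results would produce an exchange shaped like the following. All values are hypothetical.

Sample request body sent to the endpoint:

{
  "instances": [
    {
      "cpu_usage": 45.2,
      "memory_usage": 62.3,
      "disk_io": 125.5,
      "network_latency": 12.3,
      "error_count": 2
    }
  ]
}

Sample response body, whose fields map back onto the search result as log_severity and confidence:

{
  "result": [
    {
      "prediction": 1,
      "confidence": 0.97
    }
  ]
}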

Content type: text/csv

Sample input feature mapping: For CSV, no mapping is needed. The toolkit converts the data to CSV without a column header and sends it as the payload. Provide an empty mapping: {}

Sample output feature mapping: Handled the same way as the input feature mapping. Provide an empty mapping: {}

OpenAPI spec:
Note: Supports only JSON spec and version 3.0.0.

{
  "openapi": "3.0.0",
  "paths": {
    "/invocations": {
      "post": {
        "requestBody": {
          "content": {
            "text/csv": {
              "schema": { "type": "string" },
              "example": "97,166,734,489\n84,523,892,347"
            }
          }
        },
        "responses": {
          "200": {
            "content": {
              "text/csv": {
                "schema": { "type": "string" }
              }
            }
          }
        }
      }
    }
  }
}

SPL examples

See the following SPL examples of the apply command invoking a SageMaker model:

Example 1

| makeresults count=6
| streamstats count
| eval cpu_usage = case(count=1, 45.2, count=2, 89.5, count=3, 23.1, count=4, 34.15, count=5, 31.64, count=6, 98.45)
| eval memory_usage = case(count=1, 62.3, count=2, 94.7, count=3, 38.5, count=4, 50.4, count=5, 43.61, count=6, 104.17)
| eval disk_io = case(count=1, 125.5, count=2, 876.2, count=3, 45.8, count=4, 85.65, count=5, 87.85, count=6, 963.82)
| eval network_latency = case(count=1, 12.3, count=2, 156.7, count=3, 8.2, count=4, 10.25, count=5, 8.61, count=6, 172.37)
| eval error_count = case(count=1, 2, count=2, 47, count=3, 0, count=4, 1, count=5, 1, count=6, 51)
| table cpu_usage memory_usage disk_io network_latency error_count
| fields - _time
| apply sg_metric_alert_classification runtime=sagemaker features="cpu_usage,memory_usage,disk_io,network_latency,error_count"

Example 2

| makeresults count=6
| streamstats count
| eval cpu_usage = case(count=1, 45.2, count=2, 89.5, count=3, 23.1, count=4, 34.15, count=5, 31.64, count=6, 98.45)
| eval memory_usage = case(count=1, 62.3, count=2, 94.7, count=3, 38.5, count=4, 50.4, count=5, 43.61, count=6, 104.17)
| eval disk_io = case(count=1, 125.5, count=2, 876.2, count=3, 45.8, count=4, 85.65, count=5, 87.85, count=6, 963.82)
| eval network_latency = case(count=1, 12.3, count=2, 156.7, count=3, 8.2, count=4, 10.25, count=5, 8.61, count=6, 172.37)
| eval error_count = case(count=1, 2, count=2, 47, count=3, 0, count=4, 1, count=5, 1, count=6, 51)
| table cpu_usage memory_usage disk_io network_latency error_count
| fields - _time
| apply sg_classification-nested-model runtime=sagemaker features="cpu_usage,memory_usage,disk_io,network_latency,error_count"
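
With the sample output feature mapping shown in Supported content types, each result row gains the mapped prediction fields. As a hypothetical illustration, assuming the model's output is mapped to log_severity and confidence as in that sample, you could append a table command to inspect the predictions alongside the input features:

| table cpu_usage memory_usage disk_io network_latency error_count log_severity confidence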

SageMaker feature configuration steps

Configuration is a one-time, secure setup that uses IAM roles with no exposed credentials.

The IAM role that you configure for this feature requires the following minimal AWS permissions, along with an AllowAssumeRole permission to enable secure role assumption through STS:

  • "sagemaker:InvokeEndpoint"

  • "sagemaker:InvokeEndpointAsync"

  • "sagemaker:DescribeEndpoint"

  • "sagemaker:DescribeEndpointConfig"

  • "sagemaker:ListEndpoints"

  • "sagemaker:ListModels"

  • "sagemaker:DescribeModel"

Example IAM policy:
{ 
  "Version": "2012-10-17", 
  "Statement": [ 
    { 
      "Sid": "SageMakerInferenceAccess", 
      "Effect": "Allow", 
      "Action": [ 
        "sagemaker:InvokeEndpoint", 
        "sagemaker:InvokeEndpointAsync", 
        "sagemaker:DescribeEndpoint", 
        "sagemaker:DescribeEndpointConfig", 
        "sagemaker:ListEndpoints", 
        "sagemaker:ListModels", 
        "sagemaker:DescribeModel" 
      ], 
      "Resource": "*" 
    }, 
    { 
      "Sid": "AllowAssumeRole", 
      "Effect": "Allow", 
      "Action": "sts:AssumeRole", 
      "Resource": "*" 
    } 
  ] 
} 
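
In addition to the permissions policy, the sts:AssumeRole call succeeds only if the role's trust policy trusts the AWS identity whose access key ID you later enter in the configuration steps. The following is a minimal sketch of such a trust policy, assuming a hypothetical IAM user named splunk-ai-toolkit in account 123456789012:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "TrustToolkitIdentity",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:user/splunk-ai-toolkit"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}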

Complete the following steps:

  1. Log into the AI Toolkit and navigate to the Models tab.

  2. As shown in the following image, from the +Model button, choose SageMaker:

    This image shows the Models tab of the AI Toolkit. The +Model button on the far right is selected and the option for SageMaker is highlighted.

  3. On the Add SageMaker model window, complete the following fields:

    Field name | Description
    Model name | Required field. The name of the model as created in SageMaker, using the AWS SageMaker "Create endpoint" workflow. The model name must be unique and free of special characters.
    Description | Optional field. Input a description to explain the model's purpose and intended use.
    Endpoint | Required field. The name of the SageMaker Inference Endpoint, created using the AWS SageMaker "Create endpoint" workflow.
    AWS region | Required field. Taken from your AWS credentials.
    AWS access key ID | Required field. Taken from your AWS credentials.
    IAM role ARN | Required field. The IAM role used for the SageMaker inference API call. The toolkit assumes this role through STS to generate temporary credentials, and the role must have the necessary permissions to invoke the SageMaker model.
  4. Select Test connection to confirm the connection information is correctly added.

    1. If you see a Connection successful message, continue to step 5.

    2. If you see an Unable to establish connection message, check the information you added in step 3 and try again.

  5. Complete the remaining fields:

    Field name | Description
    Input feature mapping | Required field. See Supported content types for sample values.
    Output feature mapping | Required field. See Supported content types for sample values.
    Open API for inference endpoint | Required field. See Supported content types for sample values.
    SPL results batch size | Required field. The number of rows the deployed SageMaker endpoint can accept for each inference invocation. The default is 1 and the maximum is 10,000.
  6. Select Add Model when done.
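
If Test connection or the apply command fails, you can sanity-check the AWS side independently of the Splunk platform. The following is a minimal sketch using boto3 that mirrors the role-assumption flow described above, assuming the application/json content type from Supported content types; the role ARN, region, and endpoint name are hypothetical placeholders:

import json

import boto3

# Assume the inference IAM role through STS, as the AI Toolkit does.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/splunk-sagemaker-inference",  # hypothetical
    RoleSessionName="ai-toolkit-connection-test",
)["Credentials"]

# Build a SageMaker runtime client from the temporary credentials.
runtime = boto3.client(
    "sagemaker-runtime",
    region_name="us-east-1",  # hypothetical region
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)

# One row of features, shaped like the sample input feature mapping.
payload = {
    "instances": [
        {
            "cpu_usage": 45.2,
            "memory_usage": 62.3,
            "disk_io": 125.5,
            "network_latency": 12.3,
            "error_count": 2,
        }
    ]
}

response = runtime.invoke_endpoint(
    EndpointName="sg_metric_alert_classification",  # use your endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(response["Body"].read().decode("utf-8"))

If this call returns a prediction but the apply command does not, recheck the values registered in the ML models tab; if this call fails, recheck the IAM policy, trust policy, and endpoint status in AWS.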