Set up an Edge Processor in a Docker container

The following guide explains how to run an Edge Processor within a Docker container, and is intended to demystify the "Install" and "Uninstall" scripts found in your tenant's web UI. If your goal is to get started quickly, copy-pasting the scripts provided in the UI should suffice; the information below provides additional context if needed.

Prerequisites

  • Access to a functioning tenant in the Splunk Cloud Platform environment.

  • Docker installed on the host machines where the containers will be spun up.

Setup and install

Perform the following steps to set up and install an Edge Processor in a Docker container.

Create a shared service principal

The following command provisions a single, shared service principal intended for consumption by multiple Edge Processor instances. This is typically performed once during initial setup, persisted in the principal.yaml file, and used to interact with other Splunk Cloud Platform services.
CODE
docker run --rm \
  -e TENANT=<tenant-id> \
  -e TOKEN=<access-token> \
  splunk/edge-processor:<image-tag> \
  eptools setup > principal.yaml
Note: The value for TOKEN can be found in the install script provided in your tenant's web UI. To locate it, navigate to the Edge Processors page and select or create a processor. Select Actions in the top right-hand corner, then select Install/Uninstall from the dropdown menu and set the Instance type to Docker. The token will be visible in the resulting script.
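As a quick sanity check after running the setup command, you can verify that a non-empty principal.yaml was written to the current directory (this check is a suggestion, not part of the official install script):

```shell
# Hypothetical sanity check: eptools setup should have produced a
# non-empty principal.yaml in the current working directory
if test -s principal.yaml; then
  echo "principal.yaml created"
else
  echo "setup failed: principal.yaml missing or empty" >&2
fi
```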

Running a containerized instance

Now that a shared service principal has been provisioned, spinning up a containerized Edge Processor instance is relatively simple.

First, you'll need to compile a list of ingress ports your instance will use to receive ingested data. To do so, navigate to the Edge Processors page in your tenant's web UI and click the Shared Settings button in the top right-hand corner. Here, ports will be defined for all supported ingestion types, each of which should be mapped to the corresponding port on the host machine via Docker's -p flag. Finally, running the instance simply involves mounting the principal.yaml file to the /opt/splunk-edge/etc/principal.yaml path within the container, publishing the ports gathered previously, and running the image:
CODE
docker run -d \
  -e TENANT=<tenant-id> \
  -e GROUP_ID=<edge-processor-id> \
  -e MACHINE_HOSTNAME=$(hostname) \
  -v $(pwd)/principal.yaml:/opt/splunk-edge/etc/principal.yaml \
  -p <port-1>:<port-1> ··· -p <port-n>:<port-n> \
  splunk/edge-processor:<image-tag>
Note: GROUP_ID is used to associate new instances with a specific Edge Processor. This value can be found by selecting one of the processors listed on the Edge Processors page of your tenant's web UI, then copying its ID field (for example, 431e1ead-fd5b-4af8-ac89-ccae2ae81eda). This value also appears at the end of the page's URL.
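Putting the pieces together, a fully substituted invocation might look like the following. All concrete values here are illustrative assumptions: the tenant name, the image tag, and the ports (assuming HEC on 8088 and S2S on 9997 as configured in Shared Settings); the GROUP_ID is the example ID above.

```shell
# Illustrative only: tenant, ports, and image tag are hypothetical values
docker run -d \
  -e TENANT=my-tenant \
  -e GROUP_ID=431e1ead-fd5b-4af8-ac89-ccae2ae81eda \
  -e MACHINE_HOSTNAME=$(hostname) \
  -v $(pwd)/principal.yaml:/opt/splunk-edge/etc/principal.yaml \
  -p 8088:8088 -p 9997:9997 \
  splunk/edge-processor:latest
```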

Running multiple containerized instances

Because the generated service principal is designed to be shared by multiple Edge Processor instances, spinning up subsequent containers is as simple as rerunning the previous docker run command. Note that modifying GROUP_ID between runs registers the new instances with different Edge Processors in your tenant, all under the same service principal; however, we suggest restricting each service principal to a single processor to avoid unnecessary confusion and potential rate limiting. A single service principal can even be used across different host machines if the same principal.yaml file is mounted in each container, though we also discourage this practice.

Running multiple instances on a single host has its benefits, but it introduces one complication worth addressing: port availability.

Avoiding port conflicts on a single host

When running multiple containers on the same host machine, port conflicts become a concern, since each published container port must bind to a unique port on the host. To prevent conflicts, publish the container port without pinning it to a specific host port (i.e., -p <container_port> instead of -p <host_port>:<container_port>). When the host port is omitted, Docker selects an available, ephemeral port at runtime and publishes the container port accordingly.
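For example, assuming a container that receives HEC traffic on port 8088, publishing only the container port and then querying the host port Docker assigned might look like this (container ID and remaining options are placeholders):

```shell
# Publish container port 8088 without pinning a host port;
# Docker picks a free ephemeral port on the host at runtime
docker run -d -p 8088 {···} splunk/edge-processor:<image-tag>

# Discover which host port Docker assigned to container port 8088
docker port <container-id> 8088
```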

The trade-off of this approach, however, is discoverability. Because host ports are assigned dynamically, clients don't know where to route traffic ahead of time. As a result, you'll need to query the running containers to compile a per-host inventory of published ports, then distribute traffic among them using a load balancer or client-side routing. To list all running containers on the current host and their associated port mappings, run the following:
CODE
docker inspect $(docker ps -q) -f '
  {{- $id := slice .Id 0 12 -}}
  {{- range $cport, $bindings := .NetworkSettings.Ports -}}
    {{- if $bindings -}}
      {{- range $b := $bindings -}}
  {{ printf "%s: %s -> %s:%s\n" $id $cport $b.HostIp $b.HostPort }}
      {{- end -}}
    {{- end -}}
  {{- end -}}
'
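If you only need a quick, less structured view, Docker's built-in list formatting can print the same mappings without a custom inspect template:

```shell
# One line per running container: short ID followed by its port mappings
docker ps --format '{{.ID}}: {{.Ports}}'
```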

Using statically assigned host ports

Instead of publishing ports dynamically, you can assign each container a predefined host port within a reserved range. For example, if each container receives HEC events on port 8088, you might map instances to sequential ports 9001, 9002, 9003, etc. on the host:
CODE
docker run -p 9001:8088 {···}
docker run -p 9002:8088 {···}
docker run -p 9003:8088 {···}
By reserving a dedicated port range, you effectively eliminate runtime ambiguity and simplify traffic routing. Upstream data sources can then distribute traffic across the known port range via client-side load balancing and/or a lightweight, external load balancer. This approach improves predictability and removes the need to query container metadata at runtime. However, it also requires manual coordination to ensure ports are allocated consistently and are not reused unintentionally.

Uninstallation and cleanup

To uninstall a containerized Edge Processor instance, stop and (optionally) remove the container using the following command:
CODE
docker stop <container-id> && docker rm <container-id>
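If you are running many instances on one host, stopping them one by one is tedious. One approach (a sketch, assuming all containers were created from the same image tag, and GNU xargs for the -r flag) is to filter by ancestor image:

```shell
# Stop and remove every container created from the Edge Processor image
docker ps -q --filter ancestor=splunk/edge-processor:<image-tag> \
  | xargs -r docker stop
docker ps -aq --filter ancestor=splunk/edge-processor:<image-tag> \
  | xargs -r docker rm
```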
After doing so, you should see the instance disappear from the UI within a few seconds. Once all instances associated with a specific service principal have been uninstalled, it is recommended (but not required) that you deprovision the service principal by running the following command:
CODE
docker run --rm \
  -e TENANT=<tenant-id> \
  -e TOKEN=<access-token> \
  splunk/edge-processor:<image-tag> \
  eptools cleanup
Note: The value for TOKEN is not the same as before. Instead, it is located on the Settings page of your tenant's Cloud Console UI: navigate to https://console.scs.splunk.com/{tenant-id}/settings and copy the provided Access Token.

Updating the Docker image

Splunk recommends using the latest stable container image when possible. This ensures access to the latest bug fixes and security patches for vulnerabilities in the base image's OS, installed dependencies, and our software. You can either use the latest tag or hand-pick a version from https://hub.docker.com/r/splunk/edge-processor/tags, then run the following command:
CODE
docker pull splunk/edge-processor:<image-tag>
Note that existing containers are not automatically updated when a new image is pulled; they must be stopped, removed, and recreated using the new image.
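A typical update cycle for a single instance therefore looks like the following (placeholders as before; rerun docker run with the same options you used originally):

```shell
# Pull the new image, then replace the running container
docker pull splunk/edge-processor:<image-tag>
docker stop <container-id> && docker rm <container-id>
docker run -d {···} splunk/edge-processor:<image-tag>
```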

Other considerations

  • Port changes made using the Edge Processor Shared Settings page in the UI are not automatically propagated to Docker. When modifying these values, you'll need to manually stop all running containers still bound to the previous ports, recreate them with the new port mappings, then update all affected ingestion flows to target these new ports.

  • Containerizing Edge Processor does not provide any additional control over the deployed binary. The container image version is independent of the Edge Processor binary version, which will continue to follow the standard, non-container lifecycle managed by the Edge Processor team.

  • To ensure automatic recovery after failures or host restarts, use Docker's --restart=always policy when spinning up an Edge Processor container. This relies on the Docker daemon itself starting at boot, which is the default on most systemd-managed hosts.
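The restart policy is a single extra flag on the run command used throughout this guide (placeholders as before):

```shell
# Restart the container automatically on failure or daemon/host restart
docker run -d --restart=always {···} splunk/edge-processor:<image-tag>
```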