System architecture of the Edge Processor solution

The primary components of the Edge Processor solution include the Edge Processor service, Edge Processors, and SPL2 pipelines that support data processing. The following diagram provides an overview of how these components work together:

This diagram shows how the Edge Processor service on Splunk Enterprise works with the Edge Processors installed on the customer's network. Data is generated by a source, collected by agents such as forwarders, sent to an Edge Processor for processing, and then routed to a destination.

Edge Processor service

The Edge Processor service is hosted by Splunk Enterprise. It is part of the data management experience, which is a set of services that fulfill a variety of data ingest and processing use cases.

You can use the Edge Processor service to do the following:

  • Configure and install Edge Processors on your local environment for on-location data processing.
  • Create and apply SPL2 pipelines that determine how each Edge Processor processes and routes the data that it receives.
  • Define source types to identify the kind of data that you want to process and determine how Edge Processors break and merge that data into distinct events.
  • Create connections to the destinations that you want your Edge Processors to send processed data to.

You access the Edge Processor service by logging in to the Splunk Enterprise instance that hosts your data management control plane and navigating to the Data Management app.

When you set up a data management control plane to host the Edge Processor service, the default_telemetry_indexer destination is created and connected to the Splunk Enterprise instance hosting the control plane. By default, Edge Processors use this destination to store the logs and metrics that they generate. The Edge Processor service retrieves these logs and metrics from the instance and displays them in the user interface of the service. However, Splunk best practice is to send this internal data to the indexer tier in your Splunk Enterprise deployment. To do this, configure the default_telemetry_indexer destination to point to the Splunk Enterprise instance hosting your indexers. See Add or manage destinations for more information on updating the configuration settings for your destinations.

Note: These Edge Processor logs and metrics only contain information pertaining to the operational status of a given Edge Processor. They do not contain any of the actual data that you are ingesting and processing through Edge Processors. See the Edge Processors section that follows for more details.

Edge Processors

An Edge Processor is a data processing engine that allocates resources for processing and routing data. You can install an Edge Processor on a single server node in your network or on a cluster of multiple server nodes. Multi-instance Edge Processors provide more powerful data processing capabilities than single-instance Edge Processors. Be aware that multiple Edge Processor instances cannot run on the same machine, so you must install each instance on a different machine.

Each Edge Processor instance is associated with a supervisor, which contacts the OpAMP service at regular intervals to check for system updates, provide telemetry data, and confirm that the instance is still connected to the service. See Sidecar configuration settings in the Admin manual for more information about the OpAMP service. When you use the Edge Processor service to change your Edge Processor configurations or pipeline definitions, or when Splunk releases new features or bug fixes for Edge Processors, the supervisor detects these changes and updates the instance as needed.
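
The supervisor's detect-and-update behavior can be pictured as a reconcile step that compares what the service wants with what the instance is running. The following Python sketch is only an illustration under stated assumptions: the DesiredState and Instance types, their fields, and the reconcile function are hypothetical names, and neither the real supervisor implementation nor the OpAMP message exchange is modeled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DesiredState:
    """Hypothetical view of what the service wants an instance to run."""
    software_version: str
    pipelines: tuple[str, ...]


@dataclass
class Instance:
    """Hypothetical local state of one Edge Processor instance."""
    software_version: str
    pipelines: tuple[str, ...]
    applied_changes: list[str] = field(default_factory=list)


def reconcile(instance: Instance, desired: DesiredState) -> bool:
    """Bring the instance in line with the desired state.

    Returns True if anything was changed, False if it was already current.
    """
    changed = False
    if instance.software_version != desired.software_version:
        # A new software release was published: record and apply the upgrade.
        instance.applied_changes.append(
            f"upgrade {instance.software_version} -> {desired.software_version}"
        )
        instance.software_version = desired.software_version
        changed = True
    if instance.pipelines != desired.pipelines:
        # Pipeline definitions were edited in the service: reload them.
        instance.applied_changes.append("reload pipelines")
        instance.pipelines = desired.pipelines
        changed = True
    return changed
```

In this sketch, calling reconcile once per check-in interval would apply pending changes on the first call and do nothing on later calls until the desired state changes again.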

The supervisor sends the following information from the Edge Processor instance to the Edge Processor service in Splunk Enterprise:

  • Configuration information. This includes details such as the following:
    • The list of applied pipelines
    • The datasets that represent the selected data sources and destinations
    • The names of the Splunk indexes that the Edge Processor sends internal logs and metrics to
    • The version of the Edge Processor software that the instance is running
  • Heartbeats that indicate the status of the Edge Processor instance and confirm if the instance is still connected to the service. These heartbeats include information such as the following:
    • Whether the instance is running or stopped
    • How much CPU and memory the instance is consuming
    • The version of the Edge Processor software that the instance is running
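
To make the two categories of information above more concrete, the following Python sketch models them as simple records. It is an assumption-laden illustration, not the actual wire format: the ConfigReport and Heartbeat types, every field name, and the to_json helper are hypothetical. Note that neither record has a field for event data.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ConfigReport:
    """Hypothetical shape of the configuration information."""
    applied_pipelines: list[str]
    datasets: list[str]           # selected data sources and destinations
    telemetry_indexes: list[str]  # indexes that receive internal logs and metrics
    software_version: str


@dataclass
class Heartbeat:
    """Hypothetical shape of a status heartbeat."""
    instance_id: str
    running: bool
    cpu_percent: float
    memory_mb: float
    software_version: str


def to_json(record) -> str:
    """Serialize either record type to a JSON string with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```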

As an Edge Processor works to process data, it generates logs and metrics containing operational information such as the amount of data that was processed and any events, warnings, or errors that have occurred. The Edge Processor sends these logs and metrics through the default_telemetry_indexer destination, which should be configured to send this internal data to the indexing tier of your Splunk Enterprise deployment.

Note: The information that an Edge Processor instance and its supervisor send to Splunk Enterprise does not contain any of the actual data that is being ingested and processed. The data that you send through an Edge Processor only gets transmitted to the destinations that you choose in the Edge Processor configuration settings and the applied pipelines.

Pipelines

A pipeline is a set of data processing instructions written in SPL2. When you create a pipeline, you write a specialized SPL2 statement that specifies which data to process, how to process it, and where to send the results. For example, you can create a pipeline that filters for syslog events and sends them to a dedicated index in Splunk Enterprise. When you apply a pipeline to an Edge Processor, the Edge Processor uses those instructions to process all the data that it receives from data sources such as Splunk forwarders, HTTP clients, and logging agents.
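
The three parts of a pipeline (which data to process, how to process it, and where to send the results) can be modeled conceptually in Python. This sketch is not SPL2 and does not reflect how Edge Processors are implemented: the make_pipeline function, the Event shape, the list used as a stand-in destination, and the syslog_events index name are all hypothetical. For real pipeline syntax, see Edge Processor pipeline syntax.

```python
from typing import Callable, Iterable

Event = dict[str, str]  # hypothetical event shape: field name -> value


def make_pipeline(
    select: Callable[[Event], bool],
    process: Callable[[Event], Event],
    destination: list[Event],
) -> Callable[[Iterable[Event]], int]:
    """Build a function that filters, transforms, and routes events."""

    def run(events: Iterable[Event]) -> int:
        sent = 0
        for event in events:
            if select(event):                       # which data to process
                destination.append(process(event))  # how to process it, where to send it
                sent += 1
        return sent  # number of events routed to the destination

    return run


# Stand-in for a dedicated index; a real destination is configured in the service.
syslog_index: list[Event] = []

syslog_pipeline = make_pipeline(
    select=lambda e: e.get("sourcetype") == "syslog",
    process=lambda e: {**e, "index": "syslog_events"},  # hypothetical index name
    destination=syslog_index,
)

received = [
    {"sourcetype": "syslog", "_raw": "sshd[311]: session opened"},
    {"sourcetype": "access_combined", "_raw": "GET /index.html 200"},
]
sent_count = syslog_pipeline(received)
```

Here only the syslog event reaches the stand-in destination. The sketch simply skips the other event and does not model what an Edge Processor does with data that a pipeline does not select.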

The Edge Processor solution supports a subset of SPL2 commands and functions. Pipelines can include only the commands and functions that are part of the EdgeProcessor profile. For information about the specific SPL2 commands and functions that you can use to write pipelines for Edge Processors, see Edge Processor pipeline syntax. For a summary of how the EdgeProcessor profile supports different commands and functions compared to other SPL2 profiles, see the following pages in the SPL2 Search Reference: