Set up alerts for Edge Processor metrics
As an Edge Processor administrator, you can set up alerts that trigger when Edge Processor metrics meet certain criteria, so that you can monitor the health and status of your Edge Processors and take action to troubleshoot potential issues. You set up these alerts in the Splunk Cloud Platform deployment that is connected to your cloud tenant.
The following table lists the searches that you can use to set up alerts for Edge Processor metrics, along with actions that you can take when an alert triggers. You create these searches and alerts using Splunk Cloud Platform functionality. For more information on how to configure alerts in Splunk Cloud Platform, see Getting started with alerts in the Splunk Cloud Platform Alerting Manual.
Metrics | Alert trigger conditions | Example search | Action item |
---|---|---|---|
Edge Processor availability | If the Edge Processor is not sending any metrics to the connected Splunk Cloud Platform deployment. Edge Processors send metrics to Splunk Cloud Platform every 30 seconds. If the SPL query for this alert returns 0, that means the Edge Processor has not sent any metrics, indicating that it is not running as expected. | SPL query to see the number of metrics data points that the Edge Processor has sent: | First, verify that the Edge Processor is not in the Error status. See An Edge Processor instance is in the "Error" status for troubleshooting guidance. If this alert persists, then verify that the host machine meets the necessary network requirements and the Edge Processor is able to send data to Splunk Cloud Platform. See Network requirements. |
Edge Processor data ingestion in bytes | If data ingestion is below a certain threshold. For example, 0 indicates that the Edge Processor is not ingesting any data at all. | SPL query to see the amount of ingested data in bytes: | First, verify that the Edge Processor is not in the Error status. See An Edge Processor instance is in the "Error" status for troubleshooting guidance. If the alert persists, then verify that the ports for receiving data are configured correctly, and that your data sources are correctly configured to send data to those ports. See Configure shared Edge Processor settings. |
Edge Processor queue size | If queue size is above a certain threshold. For example, 70%. This indicates that you need to increase your queue size. | SPL query to see latest queue size for each instance: | Increase your queue size to process more data. See these topics for more information: |
Destination data send failure | If the Edge Processor fails to send data to a destination and the number of resulting send errors is above a certain threshold. This indicates that your destination configuration might be incorrect or the destination might be offline. | SPL query to see total send errors per dataset: | Verify that the destination information is correct for Edge Processors by checking the edge.log file. See View logs for the Edge Processor solution for more information. |
CPU usage | If CPU usage on the host is above a certain threshold, meaning that little idle CPU time remains. This indicates that the host CPU can't handle the required workload. | SPL query to see the CPU usage by state for each host: | Determine what is causing the high CPU usage and take action accordingly, such as by increasing CPU specifications or adding another host to manage traffic. See An Edge Processor instance is in the "Warning" status for more information. |
Memory usage | If memory usage on the host is above a certain threshold. This indicates that the host memory can't handle the required workload. | SPL query to see memory usage in bytes per host: | Determine what is causing the high memory usage and take action accordingly, such as by increasing memory specifications. See An Edge Processor instance is in the "Warning" status for more information. |
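
The trigger conditions in the table can be read as simple threshold checks on the values that each search returns. The following Python sketch illustrates only that logic and is not Splunk functionality: the `Snapshot` and `Thresholds` classes, their field names, and every default threshold except the 70% queue example are hypothetical values chosen for illustration. The CPU check is modeled as non-idle CPU share, which is also an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass

# From the table: Edge Processors send metrics every 30 seconds.
METRICS_INTERVAL_SECONDS = 30


def expected_data_points(window_seconds: int) -> int:
    """Approximate number of metric reports one healthy instance sends in a window."""
    return window_seconds // METRICS_INTERVAL_SECONDS


@dataclass
class Snapshot:
    """Hypothetical search results for one instance over one alert window."""
    data_points: int          # metric data points received
    ingested_bytes: int       # data ingested by the Edge Processor
    queue_fill_percent: float # latest queue size as a percentage
    send_errors: int          # total send errors to destinations
    cpu_busy_percent: float   # non-idle CPU share (assumption)
    memory_used_bytes: int    # memory used on the host


@dataclass
class Thresholds:
    """Illustrative thresholds; only the 70% queue value comes from the table."""
    min_ingested_bytes: int = 1
    max_queue_fill_percent: float = 70.0
    max_send_errors: int = 0
    max_cpu_busy_percent: float = 90.0
    max_memory_used_bytes: int = 8 * 1024**3


def triggered_alerts(s: Snapshot, t: Thresholds) -> list[str]:
    """Return the names of the alerts whose trigger condition is met."""
    alerts = []
    if s.data_points == 0:  # no metrics received at all
        alerts.append("availability")
    if s.ingested_bytes < t.min_ingested_bytes:
        alerts.append("data ingestion")
    if s.queue_fill_percent > t.max_queue_fill_percent:
        alerts.append("queue size")
    if s.send_errors > t.max_send_errors:
        alerts.append("destination send failure")
    if s.cpu_busy_percent > t.max_cpu_busy_percent:
        alerts.append("cpu usage")
    if s.memory_used_bytes > t.max_memory_used_bytes:
        alerts.append("memory usage")
    return alerts
```

Because metrics arrive every 30 seconds, an alert window should span several reporting intervals so that one delayed report does not trigger the availability alert. For example, `expected_data_points(300)` shows how many reports a five-minute window would normally contain.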