Drill Into an Anomaly

  • In Alert & Respond > Anomaly Detection, view the Anomalies tab.
  • Double-click an anomaly to open the detailed view.

Initially, the page describes everything that is occurring during the anomaly's Start Time. To review how things change later in the anomaly's lifecycle, click events further along its timeline.

Examine the Anomaly Description

The anomaly description summarizes the anomaly in terms of its affected Business Transaction, the severity level of the selected state transition event, and the top deviating Business Transaction metrics.

In this example, these are:

  • Business Transaction: /r/Checkout
  • Severity Level: Critical
  • Top Deviating Metrics: Average Response Time

The deviating metric is Average Response Time, which indicates that the problem is Checkout responding slowly.
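
If it helps to picture these fields, here is a minimal sketch that uses an invented data structure (not the AppDynamics API) to show how the top deviating metric can be read as the metric that strays furthest from its baseline:

    # Hypothetical illustration only; field names and values are invented.
    from dataclasses import dataclass

    @dataclass
    class DeviatingMetric:
        name: str
        baseline: float   # expected value from the baseline model
        observed: float   # value observed during the anomaly

        @property
        def deviation(self) -> float:
            # Relative deviation from baseline; larger means a bigger departure.
            return abs(self.observed - self.baseline) / self.baseline

    metrics = [
        DeviatingMetric("Average Response Time (ms)", baseline=120.0, observed=980.0),
        DeviatingMetric("Calls per Minute", baseline=300.0, observed=310.0),
    ]

    # The metric that deviates most drives the description: here, Average Response Time.
    top = max(metrics, key=lambda m: m.deviation)
    print("Business Transaction: /r/Checkout | Severity: Critical | Top metric:", top.name)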

Examine the Timeline

The state transition events mark the moments when the anomaly moves between Warning and Critical states.

  • The timeline in this example begins in the Critical state, followed 30 minutes later by a transition to the Warning state, which lasts only eight minutes.
  • Because this simple anomaly starts in the Critical state and remains there for most of its lifecycle, we can probably learn all we need to know from the initial event.

Critical to Warning Timeline

By contrast, more complicated timelines can reveal patterns that help you understand an anomaly. For example, this timeline from a different anomaly repeatedly toggles from a brief Warning state to a longer Critical state:

In this case, you should examine several state change events to determine what clues toggling between states offers about problems in your application.
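
To make that concrete, here is a small sketch that scans a hypothetical sequence of state-transition events and reports how often the anomaly flips between states and how long each state lasts; the event timestamps and shape are invented, not an AppDynamics API:

    # Hypothetical illustration only; event data is invented.
    from datetime import datetime

    events = [
        ("2024-01-01T10:00", "Warning"),
        ("2024-01-01T10:05", "Critical"),
        ("2024-01-01T10:35", "Warning"),
        ("2024-01-01T10:40", "Critical"),
        ("2024-01-01T11:10", "Warning"),
    ]

    transitions = list(zip(events, events[1:]))

    # Frequent flips between Warning and Critical suggest an intermittent problem
    # that is worth inspecting at several state-change events.
    flips = sum(1 for (_, a), (_, b) in transitions if a != b)
    print(f"{flips} state changes across {len(events)} events")

    # How long each state was held, inferred from consecutive event timestamps.
    for (t1, state), (t2, _) in transitions:
        minutes = (datetime.fromisoformat(t2) - datetime.fromisoformat(t1)).total_seconds() / 60
        print(f"{state:8s} held for {minutes:.0f} min")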

Examine the Flow Map

In the example flow map:

  • The START label shows that the Business Transaction begins with the OrderService tier.
  • Between the OrderService tier and its numerous dependencies, two tiers are red—these are the tiers where the system has found Suspected Causes.

You can now focus on determining which of the red tiers contains the root cause of the anomaly.

Note: Anomaly Detection flow maps are different. There are two types of flow maps in Splunk AppDynamics:

  • Anomaly Detection and Automated RCA flow map (described on this page)
  • Business Transaction flow map

Each of these flow maps detects deviating or unhealthy entities in its own way. Therefore, you will see some differences:

  • The two flow maps may show a different health status (as represented by color) for the same entity because each one uses its own algorithm to determine health.
  • User preferences for positioning or hiding entities saved for the Business Transaction flow map have no effect on the Anomaly Detection flow map.
  • Some links between tiers might be shown in one type of flow map but hidden in the other. For example, when no data is flowing through a tier or between tiers:
    • The Business Transaction flow map may hide them as 'inactive' tiers or links.
    • The Anomaly Detection flow map may show them in order to represent the application topology completely.

Examine the Top Suspected Causes

The Top Suspected Causes show likely root causes of a Business Transaction performance problem. You can traverse the call paths up to the following entities to find the root cause of the anomaly (see the sketch after this list):

  • Services, such as a payment service or an order service
  • Backends, such as a database backend or an HTTP backend
  • Cross-applications
  • Infrastructure machine entities, such as servers
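
As a rough mental model of that traversal, the following sketch walks a hypothetical call path from the entry tier toward its downstream services and backends to collect the entities you might inspect; the topology and names are invented for illustration:

    # Hypothetical illustration only; the call-path topology is invented.
    call_path = {
        "OrderService": ["PaymentService1", "Inventory HTTP backend"],
        "PaymentService1": ["Orders database backend"],
    }

    def downstream_entities(start, graph):
        """Depth-first walk of the call path, yielding every entity reachable from start."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            yield node
            stack.extend(graph.get(node, []))

    # Candidate entities to examine for Suspected Causes:
    print(list(downstream_entities("OrderService", call_path)))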

Warning: Currently, the Top Suspected Causes feature is not available for base page experience, databases, and network request issues.

In the following example, we want to know why Checkout is responding slowly. The first Suspected Cause is a Process CPU Burnt issue on the eretail.prod.payment01_1 node of the PaymentService1 tier:

Hover over the Suspected Cause to highlight the relevant entities in the flow map. Everything but the critical path fades away, revealing that OrderService, the tier where the Business Transaction starts and where response time degraded, relies on PaymentService1:

Top Suspected Cause

The second Suspected Cause is an HTTP call on OrderService itself.

Hover to highlight the affected entities.

Which Suspected Cause is the root cause? Which is only a symptom of the overall problem?

  • We have a plausible root cause in the Process CPU Burnt issue on the PaymentService1 tier, which the system ranks as likeliest.
  • Meanwhile, the HTTP call on OrderService bears some analysis:
    • An HTTP call includes both a request and a response
    • We know that the tier on the other end, PaymentService1, has its own problem
    • Therefore, we can infer that the HTTP response from PaymentService1 is what makes the call slow

Now we see that both Suspected Causes originate with PaymentService1, and the HTTP call issue is really a side-effect of the Process CPU Burnt issue. The system's ranking makes sense.

As we continue to investigate, if we decide that the Process CPU Burnt issue is not the root cause, we can reconsider the HTTP call.
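
The following sketch encodes this line of reasoning with the entities from the example; the dependency map and the "symptom" test are illustrative only, not the system's actual ranking algorithm:

    # Hypothetical illustration only; the scoring logic is invented.
    suspected_causes = [
        {"entity": "PaymentService1", "issue": "Process CPU Burnt"},
        {"entity": "OrderService",    "issue": "Slow HTTP call"},
    ]

    # OrderService's HTTP call targets PaymentService1, so a problem on
    # PaymentService1 can explain a slow call observed on OrderService.
    calls = {"OrderService": "PaymentService1"}

    def is_symptom(cause, causes, calls):
        """A slow-call issue is likely a symptom when the tier it calls has its own issue."""
        target = calls.get(cause["entity"])
        return any(other["entity"] == target for other in causes if other is not cause)

    for cause in suspected_causes:
        label = "symptom" if is_symptom(cause, suspected_causes, calls) else "plausible root cause"
        print(f"{cause['entity']}: {cause['issue']} -> {label}")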

Note: There can be anywhere from zero to three Top Suspected Causes. For example, if Average Response Time (ART) is high but every entity connected with it is behaving normally, no suspected cause can be identified, so none is shown.