Troubleshoot Health Rule Violations

A health rule is violated when the health rule processor detects that the health rule's critical or warning condition is true. In this case, a health rule violation is created with a status of Open, and a Health Rule Violation Started - Critical event or a Health Rule Violation Started - Warning event is generated.

A health rule violation ends when it is either resolved (the agent reports metrics indicating that the violated condition is no longer true) or canceled (the health rule processor can no longer accurately assert whether the violation continues or has ended).

Warning: When a violated health rule goes into an unknown state and remains in that state for the full Wait Time after Violation, the health rule violation is canceled once Wait Time after Violation elapses.

When the violation status of a health rule becomes Resolved, a Health Rule Violation Ended - Critical event or a Health Rule Violation Ended - Warning event is generated.

The health rule violation status is canceled when:

  • The health rule is edited.
  • The health rule is disabled.
  • Affected entities or evaluation entities on which the health rule is based are added or removed.
  • The metric values on which the health rule violation is based become UNKNOWN.
  • The health rule's cron-based schedule ends.
  • The health rule violates continuously for 72 hours.

    Note: To keep a health rule violation open beyond 72 hours of continuous violation, contact Splunk AppDynamics Support.
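
The cancellation conditions listed above can be read as a checklist. The following Python sketch is illustrative only: `ViolationContext`, its field names, and `cancellation_reason` are hypothetical and do not correspond to any Controller API. It assumes that `metrics_unknown` is set only after the unknown state has lasted through Wait Time after Violation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Limit on continuous violation stated in the list above.
MAX_CONTINUOUS_VIOLATION = timedelta(hours=72)


@dataclass
class ViolationContext:
    """Hypothetical snapshot of facts about one open violation."""
    started_at: datetime
    rule_edited: bool = False
    rule_disabled: bool = False
    entities_changed: bool = False
    metrics_unknown: bool = False   # assumed to already account for the wait time
    schedule_ended: bool = False


def cancellation_reason(ctx: ViolationContext, now: datetime) -> Optional[str]:
    """Return the first matching cancellation reason, or None if none applies."""
    checks = [
        (ctx.rule_edited, "health rule edited"),
        (ctx.rule_disabled, "health rule disabled"),
        (ctx.entities_changed, "affected or evaluation entities changed"),
        (ctx.metrics_unknown, "metric values became UNKNOWN"),
        (ctx.schedule_ended, "cron-based schedule ended"),
        (now - ctx.started_at >= MAX_CONTINUOUS_VIOLATION,
         "violated continuously for 72 hours"),
    ]
    for matched, reason in checks:
        if matched:
            return reason
    return None
```

A return value of `None` means that none of the listed conditions applies, so in this model the violation stays Open until it is resolved.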

When the violation status of a health rule becomes Canceled, a Health Rule Violation Canceled - Critical event or a Health Rule Violation Canceled - Warning event is generated.

If the same health rule is violated after a violation of it has been resolved or canceled, a new health rule violation is started.
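
The life cycle described so far (a violation opens, then ends as either resolved or canceled, and a later violation of the same rule starts a new one) can be sketched as a small state model. This is a minimal illustration, not the Controller's implementation: the `Violation` and `HealthRule` classes and their methods are hypothetical, and the sketch assumes that resolved violations emit "Ended" events and canceled violations emit "Canceled" events.

```python
class Violation:
    """Hypothetical model of a single health rule violation."""

    def __init__(self, severity: str):
        if severity not in ("Critical", "Warning"):
            raise ValueError("severity must be 'Critical' or 'Warning'")
        self.severity = severity
        self.status = "Open"
        self.events = [f"Health Rule Violation Started - {severity}"]

    def _end(self, status: str, verb: str) -> None:
        # A violation can end only once; later violations are new objects.
        if self.status != "Open":
            raise RuntimeError("violation has already ended")
        self.status = status
        self.events.append(f"Health Rule Violation {verb} - {self.severity}")

    def resolve(self) -> None:
        self._end("Resolved", "Ended")

    def cancel(self) -> None:
        self._end("Canceled", "Canceled")


class HealthRule:
    """Hypothetical container that tracks the violations of one rule."""

    def __init__(self, name: str):
        self.name = name
        self.violations = []

    def violate(self, severity: str) -> Violation:
        # Reuse the violation that is still open; otherwise start a new one.
        if self.violations and self.violations[-1].status == "Open":
            return self.violations[-1]
        violation = Violation(severity)
        self.violations.append(violation)
        return violation


rule = HealthRule("example rule")   # rule name is a placeholder
first = rule.violate("Critical")
first.resolve()
second = rule.violate("Warning")    # a new violation, not a reopened one
second.cancel()
```

After this sequence, `rule.violations` holds two separate violations, each with its own Started event and its own terminal event.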

During the life of a single health rule violation, other types of health rule violation events may also be generated, such as Health Rule Violation Upgraded, Downgraded, and Continues events.

The figure below illustrates the health rule violation life cycle.

Health Rule Violation Lifecycle

The boxes represent the health rule violation statuses that you see in the health rule violations list in Troubleshoot > Health Rule Violations. To get more information about a particular violation, select the violation in the list and click Details. You can also view the health rule violations in the Controller UI. Health rule violation events are listed in the Events tab of various dashboards.

Because there is a set of default health rules, you may see health rule violations reported for your application even if you have not set up your own health rules. Violations reported for the APPDYNAMICS_DEFAULT_TX business transaction are default health rule violations in the All Other Traffic business transaction.

Note: When a health rule violates, you might see a difference of approximately two to three minutes between the time stamps of the violating metrics in the Metric Browser window and the corresponding event on the Events tab of an application. The time stamp of the violating metrics is earlier than the time stamp of the corresponding event. The difference arises from the time the agents take to send data to the Controller.
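
When correlating an event with the metrics that triggered it, the lag described in the note above gives a rough window to inspect in the Metric Browser. The helper below is a hypothetical convenience, not part of any product API. The default lags of two and three minutes come from the approximate figures in the note, so treat the result as a starting point rather than an exact bound.

```python
from datetime import datetime, timedelta


def metric_window_for_event(
    event_time: datetime,
    min_lag: timedelta = timedelta(minutes=2),
    max_lag: timedelta = timedelta(minutes=3),
) -> tuple:
    """Return (earliest, latest) metric time stamps likely tied to the event."""
    # Metrics precede the event, so subtract the lag from the event time.
    return event_time - max_lag, event_time - min_lag


# Example: an event stamped 12:10 points at metrics from about 12:07 to 12:08.
earliest, latest = metric_window_for_event(datetime(2024, 1, 1, 12, 10))
```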