Agent Behavior When Disconnected from the Controller

If there are network problems or agent errors, the Controller may become unreachable. The Controller server may also be down for a variety of reasons.

If the Controller is unreachable for one minute:

  • The agent goes into standby mode, during which it does not detect any transactions.
  • Any collected snapshots and events are dropped, because they consume too much memory to cache.
  • All metrics that have not been posted to the Controller are stored in memory. The memory impact of retaining metrics is minimal.
  • New business transaction registrations that have not been posted to the Controller are stored in memory.
  • The agent attempts to connect to the Controller every minute and resumes normal activity when it can download its full configuration.

If the Controller becomes reachable in the following minute or two:

  • All metrics that have been stored in memory are posted to the Controller.
  • New business transaction registrations that have been stored in memory are posted to the Controller.
  • Snapshots and events collected in the 20 seconds prior to the reconnection are posted to the Controller.
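
The retention rules above can be sketched as a small buffer, purely as an illustration. The class `OfflineBuffer`, its methods, and the `post` callback are hypothetical names chosen for this example; they are not part of the agent's API.

```python
from dataclasses import dataclass, field


@dataclass
class OfflineBuffer:
    """Hypothetical in-memory store used while the Controller is unreachable."""
    metrics: list = field(default_factory=list)
    registrations: list = field(default_factory=list)
    snapshots: list = field(default_factory=list)

    def enter_standby(self):
        # Snapshots and events are too large to cache, so they are dropped.
        # Metrics and business transaction registrations stay in memory.
        self.snapshots.clear()

    def flush(self, post):
        """Post retained data via post(kind, item), then empty the buffers."""
        for metric in self.metrics:
            post("metric", metric)
        for registration in self.registrations:
            post("registration", registration)
        self.metrics.clear()
        self.registrations.clear()
```

In this sketch, `enter_standby()` would be called once the Controller has been unreachable for one minute, and `flush()` once it becomes reachable again.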

If the Controller is not reachable after three failed attempts that are one minute apart:

  • The agent is muted and all business transaction interceptors are disabled. The interceptors are still called when monitored application entry point methods are executed, but they do no work. No new business transactions are discovered or registered. Correlation exit points set a header such as “notxdetect=true”, which tells downstream tiers to ignore the transaction as well.
  • JMX metrics are stored in the application server memory and transmitted to the Controller after reconnection, so there are no gaps in the metric history.
  • Periodic metrics for the last three minutes are stored in memory. Metrics older than three minutes are purged from memory.
  • The agent configuration channel and the metric channel continue to attempt to connect to the Controller once each minute.
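
A three-minute retention window like the one described above could look roughly like the following. This is only a sketch: `MetricWindow` and `RETENTION_SECONDS` are hypothetical names, and the code assumes samples arrive with non-decreasing timestamps expressed in seconds.

```python
from collections import deque

RETENTION_SECONDS = 180  # three minutes, as stated in the text above


class MetricWindow:
    """Hypothetical buffer that keeps only the last three minutes of samples."""

    def __init__(self):
        self._samples = deque()  # (timestamp, name, value), oldest first

    def add(self, timestamp, name, value):
        self._samples.append((timestamp, name, value))
        self.purge(now=timestamp)

    def purge(self, now):
        # Drop samples that are older than the retention window.
        while self._samples and now - self._samples[0][0] > RETENTION_SECONDS:
            self._samples.popleft()

    def pending(self):
        # Samples that would be posted after a successful reconnection.
        return list(self._samples)
```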

The agent attempts to connect to the Controller in seven one-minute intervals and in five-minute intervals afterwards. If the agent is not able to reconnect after five minutes, the license is freed for another agent to use.
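
As an illustration of this schedule, the generator below yields the wait before each reconnection attempt. The function name `reconnect_delays` and its parameters are assumptions made for the example; only the seven one-minute intervals and the five-minute intervals afterwards come from the text.

```python
from itertools import islice


def reconnect_delays(initial_attempts=7, short=60, long=300):
    """Yield the wait, in seconds, before each reconnection attempt."""
    attempt = 0
    while True:
        # Seven one-minute waits first, then five-minute waits indefinitely.
        yield short if attempt < initial_attempts else long
        attempt += 1


# First nine waits of the schedule.
print(list(islice(reconnect_delays(), 9)))
# prints [60, 60, 60, 60, 60, 60, 60, 300, 300]
```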

If the connection is successful and the agent is able to download its full configuration and a license:

  • All periodic metrics, such as JMX metrics and Windows performance counters for the last three minutes, are posted to the Controller. The Controller drops metrics that were collected too long ago, such as when rollups are already completed.
  • The agent is reactivated, business transaction interceptors are re-enabled, business transactions are monitored and possibly snapshotted, new business transactions will be discovered and registered, and downstream correlation is re-enabled.
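
The overall lifecycle described on this page can be summarized as a simplified state function. This is a sketch under stated assumptions, not the agent's implementation: `AgentState`, `next_state`, and `MUTE_AFTER_FAILURES` are hypothetical names, and the model treats any agent that is disconnected but not yet muted as being in standby.

```python
from enum import Enum


class AgentState(Enum):
    ACTIVE = "active"    # connected, interceptors enabled
    STANDBY = "standby"  # Controller unreachable, metrics held in memory
    MUTED = "muted"      # interceptors disabled, no new registrations


MUTE_AFTER_FAILURES = 3  # three failed attempts, one minute apart


def next_state(failed_attempts, configured):
    """Return the agent state after a connection attempt.

    failed_attempts: consecutive failed attempts so far.
    configured: True once the full configuration and a license are downloaded.
    """
    if configured:
        return AgentState.ACTIVE
    if failed_attempts >= MUTE_AFTER_FAILURES:
        return AgentState.MUTED
    return AgentState.STANDBY
```

For example, `next_state(3, False)` returns `AgentState.MUTED`, and any later call with `configured=True` returns `AgentState.ACTIVE`, which corresponds to the reactivation step above.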