Can I set up host dependencies to prevent a flood of alerts?

Yes. From a host's overview page, click the Host Dependencies link to get started.

Host dependencies define one or more hosts which a given host depends on. For example, your ‘web server’ might depend on your ‘core router’. Or your ‘home page’ might depend on both your ‘DB server’ and your ‘HTTP server’.

The upshot of defining these dependencies is that alerts triggered by a host will be suppressed if any of the hosts it depends on is in a failed state.

Using the first example above, if your core router has failed (and presumably already triggered alerts), then when Wormly detects that your web server has also failed, alerts triggered by the web server host will be suppressed.
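The suppression rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Wormly's actual implementation; the function name, the dictionary layout, and the set of failed hosts are all assumptions made for the example.

```python
# Hypothetical model of the suppression rule; names are illustrative only.
def should_suppress(host, dependencies, failed_hosts):
    """Return True if any host that `host` depends on is known to have failed."""
    return any(dep in failed_hosts for dep in dependencies.get(host, ()))


# Dependencies from the examples above: host -> hosts it depends on.
dependencies = {
    "web server": ["core router"],
    "home page": ["DB server", "HTTP server"],
}

# The core router is already known to be down, so web server alerts are suppressed.
web_server_suppressed = should_suppress("web server", dependencies, {"core router"})

# Nothing is known to be down, so web server alerts go out as normal.
web_server_alerts = not should_suppress("web server", dependencies, set())
```

Note that a single failed dependency is enough: the home page is suppressed if either the DB server or the HTTP server is down.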

Note the caveat here: Wormly can only suppress alerts on the web server if it knows that the core router has failed. If Wormly tests the web server before the core router, or if the failure occurs immediately prior to a web server test, then alert suppression cannot occur during that test cycle.

To avoid this scenario, configure each host to send alerts only after an escalation time delay that exceeds the test interval of the hosts it depends on.

For example, if your core router is tested every minute, then ensure that the web server won’t trigger alerts until downtime has exceeded one minute. That way you can be confident that Wormly will have tested the core router before alerts for the web server can be triggered.
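As a quick sanity check on your own settings, the rule above amounts to a simple comparison. The sketch below is a hypothetical helper, not part of any Wormly tool; the function name and the use of seconds are assumptions.

```python
# Hypothetical check; all values are in seconds.
def delay_is_safe(escalation_delay_s, dependency_intervals_s):
    """Return True if the escalation delay exceeds every dependency's test interval."""
    return all(escalation_delay_s > interval for interval in dependency_intervals_s)


# Core router tested every 60 seconds, web server alerts after 90 seconds of downtime.
safe = delay_is_safe(90, [60])

# A delay equal to the test interval does not exceed it, so it is not considered safe.
unsafe = delay_is_safe(60, [60])
```

When a host depends on several others, the delay needs to exceed the longest of their test intervals.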

When configuring a host’s dependencies, you also need to specify a Recovery time. This dictates how long we allow a host to recover after the host(s) it depends on have recovered. For example, once the core router recovers, we might allow 60 seconds for the web server to recover. If the web server fails to recover in that time frame, then alerts will no longer be suppressed.
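The Recovery time can be pictured as a grace window that starts when the dependency recovers. The following sketch assumes timestamps in seconds and treats the end of the window as inclusive; both are assumptions for illustration, and the function name is hypothetical rather than anything Wormly exposes.

```python
# Hypothetical model of the Recovery time window; all values are in seconds.
def still_suppressed(now_s, dependency_recovered_at_s, recovery_time_s):
    """Return True while the dependent host is still inside its recovery window."""
    return now_s - dependency_recovered_at_s <= recovery_time_s


# Core router recovered at t=1000 with a 60 second Recovery time.
within_window = still_suppressed(1030, 1000, 60)  # 30 seconds after recovery
after_window = still_suppressed(1061, 1000, 60)   # 61 seconds after recovery
```

In this model, a web server that is still down 61 seconds after the core router recovers would start triggering alerts again.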
