Alerting: Better Visibility, Easy Correlation — Librato Blog

Alerting: Better Visibility, Easy Correlation

Today we’re happy to announce a few enhancements to our recently redesigned alerting platform. These are the first in a long list of supplemental features that we have in the works, and as always, we’re excited to be shipping features that will improve your visibility and quiet your pager.

Alert On Derivatives

First, we’ve added an option to the alert configuration form that lets you select the derivative of a metric as the signal to check the alert thresholds against. This is useful when you’re sampling an always-incrementing counter (like a packet counter in a router) and you want to alert on that metric’s rate of change. It works on both native counters and gauge metrics. Once you have selected the derivative for the alert, you can also toggle "reset detection". For counters that roll over (reset to ‘0’, usually after overflowing a 32-bit integer), you should enable reset detection so that a counter reset does not trigger the alert.
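To make the idea concrete, here is a minimal sketch of computing a rate of change from counter samples, with optional reset detection. It is only an illustration: the function name `derivative`, the `(timestamp, value)` sample format, and the choice to simply drop the sample where the counter went backwards are all assumptions, not a description of how the alerting platform implements the feature.

```python
from typing import List, Tuple

# Hypothetical sample format: (unix_timestamp_seconds, counter_value)
Sample = Tuple[float, float]


def derivative(samples: List[Sample], reset_detection: bool = False) -> List[Sample]:
    """Return (timestamp, rate_per_second) pairs for consecutive samples.

    With reset_detection enabled, a negative delta is treated as a counter
    reset and that interval is skipped (an assumption made for this sketch).
    """
    rates: List[Sample] = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # ignore duplicate or out-of-order timestamps
        delta = v1 - v0
        if reset_detection and delta < 0:
            continue  # counter went backwards: assume it rolled over
        rates.append((t1, delta / dt))
    return rates


# A packet counter sampled every 60 seconds that resets between t=60 and t=120.
samples = [(0, 100), (60, 160), (120, 10), (180, 70)]
without_detection = derivative(samples)
with_detection = derivative(samples, reset_detection=True)
```

Without reset detection, the rollover shows up as a large negative rate that could trip a threshold; with it enabled, that interval is simply left out of the signal.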

Graphical Previews

As you may have noticed in the screenshot above, we've also added graphical previews to the alert configuration form. The new preview windows display the current threshold setting against historical data and allow you to adjust the threshold by dragging it up and down in the graph. By showing your current threshold against your actual data, this feature should give you a better sense of how sensitive (or not) your current thresholds are, and make it easier to select meaningful thresholds in the future.
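The same sanity check the preview gives you visually can be approximated in a few lines: given some historical values, count how many would have crossed a candidate threshold. This is a rough sketch; the `breaches` helper and the sample data are hypothetical and not part of the product.

```python
from typing import List


def breaches(values: List[float], threshold: float, above: bool = True) -> List[int]:
    """Return the indexes of historical values that cross the threshold.

    above=True checks for values greater than the threshold;
    above=False checks for values less than it.
    """
    if above:
        return [i for i, v in enumerate(values) if v > threshold]
    return [i for i, v in enumerate(values) if v < threshold]


# Hypothetical history for a metric, oldest first.
history = [10, 55, 42, 90, 61]

strict = breaches(history, 60)   # a threshold of 60 is crossed twice
loose = breaches(history, 40)    # a threshold of 40 is crossed four times
```

A threshold that most of the history would have crossed is probably too sensitive; one that nothing crosses may never fire.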

Automated Alert Annotations

Finally, we’ve added automatic generation of annotation events for alerts. Every time an alert triggers a notification, we will now automatically create a matching annotation event in a special librato.alerts annotation stream. This enables you to overlay the history of any given alert (or set of alerts) on any instrument as depicted below:

The naming convention for a given alert will take the form: librato.alerts.#{alert_name}. For each notification that has fired, you will find an annotation in the stream with its title set to the alert name, its description set to the list of conditions that triggered the notification, and a link back to the original alert definition. This feature will enable you to quickly and easily correlate your alerts against any combination of metrics.
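As an illustration of that convention, the sketch below builds the stream name and the fields described above for a single notification. The dictionary layout, the `alert_annotation` helper, the `links` structure, and the example URL are assumptions made for readability; they are not the schema of the annotations API.

```python
from typing import Dict, List


def alert_annotation(alert_name: str, conditions: List[str], alert_url: str) -> Dict:
    """Build a hypothetical annotation record for a fired alert notification."""
    return {
        # Stream name follows the librato.alerts.#{alert_name} convention.
        "stream": f"librato.alerts.{alert_name}",
        # Title is the alert name.
        "title": alert_name,
        # Description lists the conditions that triggered the notification.
        "description": "; ".join(conditions),
        # Link back to the alert definition (structure assumed for this sketch).
        "links": [{"rel": "alert", "href": alert_url}],
    }


event = alert_annotation(
    "router.packets.high",
    ["derivative of router.packets above 5000 for 5 minutes"],
    "https://example.com/alerts/123",  # placeholder URL
)
```

Because every alert gets its own stream under the shared prefix, you can overlay a single alert or several related ones on the same instrument.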

We hope you get some use out of these features, and we look forward to releasing the rest of what we have in the works, so keep your eyes peeled. As always, your feedback is greatly appreciated, so let us know what you think.