Tales of an Ops Team: Papertrail in Production — Librato Blog

Infrastructure as a service acts as a force multiplier, making it possible to bootstrap software solutions at a fraction of the cost and effort required just a few years ago. But all software solutions eventually present a similar set of IT infrastructure conundrums: logging, monitoring, alerting, security, deployment pipelines, and testing, to name just a few. These are hard problems that, one way or another, have to be solved well, and they are best solved with well-integrated solutions.

You may have heard that Librato and Papertrail recently joined forces, but you probably didn't know that before we were partners, we were actually customers of each other. That history bred a level of trust and a sense of reliability that are hard to come by otherwise. We are thrilled to be a part of the same company after three years of collaborating so closely by choice.

One aspect of software-as-a-service that isn't often written about is the crazy amount of behind-the-scenes effort we SaaS providers put into integration. In fact, you get a huge amount of value from leaning on SaaS applications over DIY OSS precisely because we put time and effort into integrating with other complementary SaaS solutions. To give you an idea of what that looks like in practice, let’s take a look at how we use Papertrail at Librato.

Internally, Librato is built on a microservices architecture that consists of a dozen or so services. We've written before that our metrics dashboards are organized around these services. Generally, we maintain one dashboard per service, which consists of metrics chosen by the engineers who build the service. Here, for example, is a screenshot of our dashboard for the L2Met service, which does the heavy lifting for our Heroku integration.

We treat our Papertrail dashboard in pretty much exactly the same way: one section per service, composed of searches created by the engineers who built the service.

All of our services, in one way or another, communicate via HTTP, so web logs are quite important to us. Many of our Papertrail service dashboards break out HTTP response types for the various kinds of inter-service communication, and of course for our front-end API as well.
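To make the idea of breaking out HTTP response types concrete, here is a minimal Python sketch that tallies responses by status class. It is an illustration only, not Librato's or Papertrail's tooling: the function name `status_class_counts`, the sample lines, and the assumption that logs follow the common access log shape (a quoted request followed by a three-digit status code) are all hypothetical.

```python
import re
from collections import Counter

# Assumption: each line contains a quoted request followed by a three-digit
# status code, as in the common/combined access log formats.
STATUS_RE = re.compile(r'"\s(\d{3})\s')

def status_class_counts(lines):
    """Count responses per status class (2xx, 3xx, 4xx, 5xx)."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            # Keep only the first digit so 200 and 204 both land in "2xx".
            counts[match.group(1)[0] + "xx"] += 1
    return counts

# Hypothetical sample lines, invented for this example.
sample = [
    '10.0.0.1 - - [01/Jan/2015:00:00:01 +0000] "GET /v1/metrics HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2015:00:00:02 +0000] "POST /v1/metrics HTTP/1.1" 503 0',
    '10.0.0.3 - - [01/Jan/2015:00:00:03 +0000] "GET /v1/annotations HTTP/1.1" 404 19',
]
print(dict(status_class_counts(sample)))
```

A saved search per status class does the same job inside a log management tool; the sketch simply shows the grouping that such searches encode.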

In our world, problems tend to arise wherever the data rests, so we also send a lot of logs to Papertrail from our queuing and data persistence systems, like Kafka and Cassandra. JVM logs, for example, are one of the best ways to detect stop-the-world garbage collection events and heap problems.
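As a rough sketch of what detecting those pauses from JVM logs can look like, the Python below scans for the "Total time for which application threads were stopped" line that HotSpot JVMs print when stopped-time logging is enabled. The function name `long_pauses`, the one-second threshold, and the sample lines are assumptions made for illustration. Note that this line covers every safepoint pause, of which garbage collection is usually the main contributor.

```python
import re

# Assumption: the JVM is configured to log application stopped time, which
# produces lines containing the phrase matched below.
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)

def long_pauses(lines, threshold_seconds=1.0):
    """Return the pause durations (in seconds) at or above the threshold."""
    pauses = []
    for line in lines:
        match = STOPPED_RE.search(line)
        if match:
            duration = float(match.group(1))
            if duration >= threshold_seconds:
                pauses.append(duration)
    return pauses

# Hypothetical sample lines, invented for this example.
sample = [
    "2015-01-01T00:00:01.000+0000: 12.345: Total time for which application threads were stopped: 0.0004220 seconds",
    "2015-01-01T00:00:09.000+0000: 20.345: Total time for which application threads were stopped: 2.5310000 seconds",
    "2015-01-01T00:00:10.000+0000: 21.000: [GC (Allocation Failure) 512M->128M(1024M), 0.0123 secs]",
]
print(long_pauses(sample))
```

In a hosted log tool the equivalent is a saved search on the same phrase, with an alert attached when matches exceed a duration you consider unhealthy.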

We send alerts on some of these logs, especially on the HTTP side of the world. We have 45 different alerts configured. The majority of them use Papertrail's Slack integration to send alerts into one of our chat channels. We've written before about our reliance on web-based persistent chat for day-to-day engineering work, so no surprises there.

Our next biggest alert category is our own Librato-Papertrail integration. It sends our metrics system a count of the log lines that match the alert criteria, so that those counts can be visualized and alerted on from within Librato's UI.
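The underlying idea, turning matching log lines into a count that can be graphed, can be sketched in a few lines of Python. This is not the integration's actual code or API: the function `matches_per_minute`, the `(timestamp, message)` event shape, and the one-minute bucket size are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime

def matches_per_minute(events, needle):
    """Count events whose message contains `needle`, bucketed by minute."""
    counts = Counter()
    for timestamp, message in events:
        if needle in message:
            # Truncate to the start of the minute to form the bucket key.
            bucket = timestamp.replace(second=0, microsecond=0)
            counts[bucket] += 1
    return sorted(counts.items())

# Hypothetical events, invented for this example.
events = [
    (datetime(2015, 1, 1, 0, 0, 5), "api: 503 upstream timeout"),
    (datetime(2015, 1, 1, 0, 0, 40), "api: 503 upstream timeout"),
    (datetime(2015, 1, 1, 0, 1, 10), "api: 200 ok"),
    (datetime(2015, 1, 1, 0, 2, 3), "api: 503 upstream timeout"),
]
for minute, count in matches_per_minute(events, "503"):
    print(minute.isoformat(), count)
```

Each `(minute, count)` pair is the kind of measurement a metrics system can plot and threshold, which is what makes log-derived counts useful alongside ordinary service metrics.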

Lastly, a few very actionable alerts are delivered directly to PagerDuty, using the Papertrail-PagerDuty integration, where they can properly interrupt the sleep pattern of the Op du Jour.

SaaS systems like Librato and Papertrail, then, make IaaS viable in the long term. They provide a means for everyone, large and small, to build against time-tested, well-architected solutions that work together out of the box and scale at great value. So whether you’re bootstrapping an MVP or refactoring an existing infrastructure, cloud-native solutions enable you to rapidly implement the infrastructure you need to stabilize and grow your product.

We hope you’ve enjoyed this first peek at how Papertrail and Librato work together. Stay tuned to this channel for more on monitoring and logging, and, of course, feel free to dip your toes in: both Librato and Papertrail offer free trials, no credit card required.