Reliable Insights

A blog on monitoring, scale and operational Sanity

September 11, 2017

It’s easy to convert Pull to Push

If you have to choose one of push or pull in your core, which should it be?

August 14, 2017

Which kind of push? Events or metrics?

Continuing our exploration of the ongoing epic saga of push vs. pull, where the very future of humanity is at stake, let's look at two general classes of push that are often conflated.

May 22, 2017

Push needs Service Discovery

It's often claimed that an advantage of push-based monitoring systems is that, compared to pull-based systems like Prometheus, they don't need service discovery. This isn't true, and I'm going to explain why.

February 27, 2017

Label Lookups and the Child

The Prometheus client library guidelines recommend having a Child be returned via labels(). Why?

August 2, 2016

One agent to rule them all

Another not uncommon question we get about Prometheus is why we don't have a single per-machine agent that handles all the collection, and instead have one exporter per application. Doesn't that make it harder to manage?

July 14, 2016

Monitoring without Consensus

When designing a monitoring system and the datastore that goes with it, it can be tempting to go straight for a clustered, highly consistent approach. But is that the best approach?

June 2, 2016

Prometheus Security: Authentication, Authorization and Encryption

A frequently asked question is how to handle various security-related features with Prometheus. Let's take a deeper look at why we chose the approach we did.

May 17, 2016

Prometheus and Alertmanager Architecture

A not uncommon question about Prometheus is why the Alertmanager is a separate binary. Let's look at that.

August 23, 2015

There are 100,000 Seconds in a Day

Just after you've launched is not the best time to find out that you can't handle the load you predicted, or that running costs are much higher than you'd like. By estimating the operational parameters of your system as you design it, you can gain confidence that the system will work as you expect.
