Reliable Insights

A blog on monitoring, scale and operational Sanity

December 11, 2017

Why are Prometheus histograms cumulative?

Have you ever wondered why the buckets in histograms are not just counters of events that fall into each bucket?

December 4, 2017

Using time series as alert thresholds

Alert thresholds are usually hardcoded in the alert. In more sophisticated setups, it is useful to parameterise the threshold based on another time series.

October 23, 2017

Converting Rules to the Prometheus 2.0 Format

With the upcoming release of Prometheus 2.0 comes a new format for writing recording and alerting rules.

September 4, 2017

Functions to Avoid

As PromQL has evolved, some functions should no longer be used.

August 28, 2017

Avoid irate() in alerts

While the irate() function is useful for granular graphs, it is not suitable for alerting.

July 24, 2017

Existential issues with metrics

The Prometheus instrumentation best practices say to "Avoid missing metrics". Let's look at why, and how to deal with it.

May 29, 2017

What’s in a __name__?

You may have noticed that most PromQL functions and operators remove the metric name in their result. Let's look at why.

May 1, 2017

Common query patterns in PromQL

For day-to-day use, there are only a handful of PromQL patterns you need to know. Let's look at them.

March 27, 2017

Combining alert conditions

Prometheus alerts use the same powerful PromQL expressions as queries and graphs. This can be used to produce sophisticated alerts.

March 6, 2017

Booleans, logic and math

Prometheus doesn't have an explicit boolean type or functionality. However, there is a convention, and enough power in PromQL, to work with booleans.
