Reliable Insights

A blog on monitoring, scale and operational sanity

January 2, 2017

Memory usage of Prometheus client libraries

A common question around Prometheus client libraries is how much RAM they'll use on a busy process. There tends to be disbelief when we say it's the same as an inactive server. Let's look deeper.

Read more

December 26, 2016

How does a Prometheus Gauge work?

We previously looked at the counter. How does the Prometheus gauge work?

Read more

September 26, 2016

Instrumenting Python with Prometheus

Python is one of the four languages that have an official Prometheus client. Let's take a quick look at how to use it.

Read more

September 12, 2016

Who wants seconds?

The Prometheus instrumentation guidelines say to use seconds, and the timing functions in client libraries follow this. Why?

Read more

August 22, 2016

Exposing the software version to Prometheus

I've previously mentioned that you shouldn't expose the version of your software either as a target label or via a label on all of your server's metrics, as it'll make using the metrics more challenging. What should you do instead?

Read more

August 8, 2016

On the naming of things

How you choose to name metrics is important. If everyone chose different schemes, it'd lead to confusion and irritation, and prevent us from sharing and reusing each other's work. I'd like to share some guidelines to help keep things sane for everyone.

Read more

April 8, 2016

How does a Prometheus Counter work?

There are four standard types of metric in Prometheus instrumentation: Gauge, Counter, Summary and Histogram. Today we'll have a look at the principles around Counters, and how Prometheus differs from other monitoring systems.

Read more

October 9, 2015

Monitoring Batch Jobs in Python

Prometheus monitoring is usually aimed at long-lived daemons, but what if you've a batch job that you want to monitor?

Read more

