October 1, 2018

What is a job label for?

The job label is one of the labels your targets will always have. So how can you use it?

August 13, 2018

Why predeclare metrics?

The standard way to use metrics in Prometheus is to declare them at file level, before using them. Why?

June 4, 2018

Will using Prometheus instrumentation clients lock me in?

Considering using Prometheus, but worried about committing to using our clients?

April 16, 2018

When to Alert with Prometheus

Alerting is an art. One must be sure to alert just enough to be aware of all problems arising in the monitored system, while at the same time not drowning out the signal with excess noise. In this blog post we'll explain some of the best practices to use when alerting with Prometheus.

March 26, 2018

Why not send graphs with alerts?

You may have noticed that notifications from the Alertmanager are text. Wouldn't it be nice if Prometheus sent graphs along?

February 26, 2018

Dude, where’s my exporter?

So you have just discovered Prometheus and want to try it out, or use it to replace your old monitoring system, but you have run into a part of your stack that you cannot instrument with a client library and for which there are no officially supported exporters. What do you do?

February 19, 2018

Common pitfalls when using the Pushgateway

Jobs of an ephemeral nature are often not around long enough to have their metrics scraped by Prometheus. To remedy this, the Pushgateway was developed: such jobs push their metrics to a metrics cache, which Prometheus can then scrape long after the original jobs have gone away. This blog post discusses some of the common pitfalls users tend to fall into when adding the Pushgateway to their monitoring stack.

December 25, 2017

Keep It Simple scrape_interval-id

How many scrape intervals should you have in a Prometheus?

November 13, 2017

Are increasing timestamps Counters or Gauges?

Every now and then someone asks what metric type an increasing timestamp should be. Let's take a look.

October 30, 2017

Running into burning buildings because the fire alarm stopped

At what point should you consider an alert resolved?

