The node exporter's textfile collector is handy for monitoring machine-level cronjobs. How would you go about that?

Let's say you had a simple bash script and you wanted to track how long it takes, and when it last ran. You could do something like:

#!/bin/bash

# Adjust as needed.
TEXTFILE_COLLECTOR_DIR=/var/lib/node_exporter/textfile_collector/
# Note the start time of the script.
START="$(date +%s)"

# Your code goes here.
sleep 10

# Write out metrics to a temporary file.
END="$(date +%s)"
cat << EOF > "$TEXTFILE_COLLECTOR_DIR/myscript.prom.$$"
myscript_duration_seconds $(($END - $START))
myscript_last_run_seconds $END
EOF

# Rename the temporary file atomically.
# This avoids the node exporter seeing half a file.
mv "$TEXTFILE_COLLECTOR_DIR/myscript.prom.$$" \
  "$TEXTFILE_COLLECTOR_DIR/myscript.prom"

Once the --collector.textfile.directory flag on the node exporter points at the same directory, you're good to go. You could add any other metrics you think are useful, such as whether the script succeeded.
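
As a rough sketch of recording success, the script could capture the exit status of its work and write it out as a third metric. The metric name myscript_last_run_success is made up for illustration, and the fallback to a scratch directory is only there so you can try it without touching a real node exporter:

```shell
#!/bin/bash

# Assumption: set this to your real textfile directory. It falls back to a
# scratch directory so the sketch can be run safely on its own.
TEXTFILE_COLLECTOR_DIR="${TEXTFILE_COLLECTOR_DIR:-$(mktemp -d)}"
START="$(date +%s)"

# Your code goes here. Capture its exit status straight away.
sleep 1
STATUS=$?

END="$(date +%s)"
# Hypothetical metric value: 1 if the job exited with status 0, otherwise 0.
if [ "$STATUS" -eq 0 ]; then SUCCESS=1; else SUCCESS=0; fi

# Write out metrics to a temporary file, as before.
cat << EOF > "$TEXTFILE_COLLECTOR_DIR/myscript.prom.$$"
myscript_duration_seconds $((END - START))
myscript_last_run_seconds $END
myscript_last_run_success $SUCCESS
EOF

# Rename the temporary file atomically.
mv "$TEXTFILE_COLLECTOR_DIR/myscript.prom.$$" \
  "$TEXTFILE_COLLECTOR_DIR/myscript.prom"
```

Writing the metrics even when the job fails means the last run time and the success flag stay in step with each other.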

The node exporter also exposes a metric called node_textfile_mtime_seconds, which records when each textfile collector file was last modified. This is useful for detecting a cronjob that hasn't run in a while.
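
The staleness check itself is just the current time minus the modification time. Here is a minimal local sketch of that same comparison, assuming GNU stat; the helper names file_age_seconds and is_stale are hypothetical, and in practice you would more likely do this comparison on the Prometheus side:

```shell
#!/bin/bash

# Hypothetical helper: print how many seconds ago a file was last modified.
# Assumes GNU coreutils, where `stat -c %Y` prints the mtime as a Unix time.
file_age_seconds() {
  local now mtime
  now="$(date +%s)"
  mtime="$(stat -c %Y "$1")" || return 1
  echo $((now - mtime))
}

# Hypothetical helper: succeed if the file is older than the given threshold.
is_stale() {
  local age
  age="$(file_age_seconds "$1")" || return 2
  [ "$age" -gt "$2" ]
}
```

For example, is_stale /var/lib/node_exporter/textfile_collector/myscript.prom 86400 would succeed if the file had not been rewritten in the last day.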

Instead of using a temporary file, you could also use sponge from moreutils to handle that for you.
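
A sketch of what that might look like is below. sponge soaks up all of its input before writing the output file, so the metrics are piped into it instead of redirected. The write_prom helper is hypothetical, and the fallback branch is only there so the sketch still works when moreutils isn't installed:

```shell
#!/bin/bash

# Assumption: set this to your real textfile directory. It falls back to a
# scratch directory so the sketch can be run safely on its own.
TEXTFILE_COLLECTOR_DIR="${TEXTFILE_COLLECTOR_DIR:-$(mktemp -d)}"

# Hypothetical helper: write everything from stdin to the target file.
write_prom() {
  local target="$1"
  if command -v sponge > /dev/null 2>&1; then
    # sponge reads all of stdin before it writes anything to the target.
    sponge "$target"
  else
    # Fallback without moreutils: temporary file plus rename, as before.
    cat > "$target.$$" && mv "$target.$$" "$target"
  fi
}

START="$(date +%s)"

# Your code goes here.
sleep 1

END="$(date +%s)"
write_prom "$TEXTFILE_COLLECTOR_DIR/myscript.prom" << EOF
myscript_duration_seconds $((END - START))
myscript_last_run_seconds $END
EOF
```

Whether sponge's final write is as atomic as an explicit rename may depend on where it keeps its temporary data, so check its documentation for your system before relying on that.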
