Docker health checks

Somehow I missed the news that, starting with version 1.12, Docker containers support health checks. Such checks don’t just test whether the container itself is running, but whether it’s doing its job right. For instance, a check can ping a containerized web server to see if it responds to incoming requests, or measure memory consumption and see if it’s reasonable. As a Docker health check is a shell command, it can test virtually anything.

When the test fails a few times in a row, the problematic container gets into the “unhealthy” state, which makes no difference in standalone mode (except for a triggered health_status event), but causes the container to restart in Swarm mode.
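To make the idea concrete, here is a minimal sketch of a check script in Python. It assumes the container runs a web server on a hypothetical address (http://localhost:8080/) and relies on Docker's convention that a health check command exits with 0 for healthy and 1 for unhealthy. The function names are illustrative only.

```python
import sys
import urllib.error
import urllib.request


def is_healthy(url, timeout=2.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, HTTP error status, etc.
        return False


def main(argv):
    """Map the check result to the exit code Docker expects."""
    # Hypothetical default endpoint: adjust to whatever the server exposes.
    url = argv[1] if len(argv) > 1 else "http://localhost:8080/"
    return 0 if is_healthy(url) else 1


# As a script entry point this would end with: sys.exit(main(sys.argv))
```

Assuming the script is saved in the image as /healthcheck.py (a made-up path), it could be attached to the container with a Dockerfile line such as HEALTHCHECK --interval=5s --retries=3 CMD python /healthcheck.py, where --retries sets how many consecutive failures mark the container as unhealthy.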

Checking service health status with Consul

In the previous post we created a small Consul cluster which kept track of four services: two web services and two databases. However, we didn’t tell the Consul agents how to monitor those services, so they completely missed the fact that none of the services actually exists. So today we’re going to take a close look at Consul’s health checks and see what effect they have on service discoverability.

Quick intro to Elasticsearch

So far we’ve been dealing with the name-value kind of monitoring data. However, what works well for numeric readings isn’t necessarily useful for textual data. In fact, Grafana, Graphite and Prometheus are useless for the other kinds of monitoring records: logs and traces.

There are many, many tools for dealing with those, but I decided to take a look at Elastic’s ELK stack: Elasticsearch, Logstash and Kibana – a storage, a data processor and a visualization tool, respectively. And today we’ll naturally start with the first letter of the stack: “E”.

What’s Elasticsearch

Elasticsearch is a fast, horizontally scalable open source search engine. It provides an HTTP API for storing and indexing JSON documents, and with the default configuration it behaves a little bit like a searchable NoSQL database.
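As a rough illustration of what storing a JSON document over HTTP looks like, the sketch below builds (but does not send) an indexing request in Python. The node address, the index name "logs" and the document itself are assumptions made up for the example. The _doc path segment is what recent Elasticsearch versions use; older versions expect a type name in its place.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, document):
    """Build (but do not send) a PUT request that stores one JSON document."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical node address, index name and document.
request = build_index_request(
    "http://localhost:9200",
    "logs",
    1,
    {"level": "error", "message": "disk is full"},
)
print(request.get_method(), request.full_url)
# Sending it to a running node would be: urllib.request.urlopen(request)
```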


Tracking application events in Graphite

I don’t know if that’s a coincidence or not, but drastic changes in application metrics usually happen soon after a product upgrade. In fact, whenever I have to deal with a new issue on a production server, the first thing I do is check whether it was recently updated. No wonder it makes sense to record such events along with other monitoring data.

But assuming our monitoring data is in Graphite, how would we do that?
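One possible answer is Graphite's events API, which accepts a JSON description of an event posted to the /events/ URL. The following Python sketch only builds such a request. The Graphite address, the event text and the tag names are hypothetical, and older Graphite versions expect tags as a space-separated string rather than a list.

```python
import json
import time
import urllib.request


def build_event_request(graphite_url, what, tags, data="", when=None):
    """Build (but do not send) a POST request describing one event."""
    payload = {
        "what": what,    # short event title
        "tags": tags,    # labels used to look the event up later
        "data": data,    # free-form details
        "when": int(when if when is not None else time.time()),
    }
    return urllib.request.Request(
        f"{graphite_url}/events/",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical Graphite address and event details.
event = build_event_request(
    "http://localhost",
    what="Deployed version 1.2.3",
    tags=["deploy"],
    when=1500000000,
)
# Sending it would be: urllib.request.urlopen(event)
```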


Scraping application metrics with Prometheus

There are two conceptually different approaches to collecting application metrics. There’s the PUSH approach, where the metrics storage sits somewhere and waits until a metrics source pushes some data into it. For instance, Graphite doesn’t do any collection on its own; it waits until somebody like collectd does the delivery.

There’s a second approach: PULL. In this approach, metrics sources don’t try to be smart and just provide their readings on demand. Whoever needs those metrics can make a call, e.g. an HTTP request, in order to get them.

Prometheus collects metrics using the second approach.
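From the application's side, the PULL approach can be sketched as a tiny HTTP endpoint that returns current readings whenever somebody asks. The Python example below uses only the standard library and the plain "name value" text format that Prometheus understands. The metric names, their values and the port are assumptions invented for the illustration, not taken from a real application.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory readings; a real app would update these as it runs.
METRICS = {"app_requests_total": 42, "app_errors_total": 3}


def render_metrics(metrics):
    """Render readings as one 'name value' line per metric."""
    lines = [f"{name} {value}" for name, value in sorted(metrics.items())]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Readings are served only on demand, and only at /metrics.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrape requests on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint; the collecting side then only needs to know the address and fetch /metrics at its own pace.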

Building dashboards with Grafana

Even though Graphite does a very decent job of displaying individual metric graphs, its dashboard support is quite limited. Of course, we could take its powerful Render URL API and build anything we like in good old HTML, but on the other hand, there’s Grafana.



Creating collectd data source in JavaScript


Exec Plugin

Among the variety of collectd plugins there’s one ‘to rule them all’. If, due to some course of events, all collectd plugins except for Exec were taken from you, you’d still be able to restore all of their functionality with Exec.

As the name suggests, Exec starts an external program or script and interprets its output as a source of data. To be specific, it looks for PUTVAL lines, each of which names a metric and carries its value.

Interestingly, Exec doesn’t specify in what language the script should be written, so anything, including JavaScript, might work. In fact, using JavaScript would be beneficial in some scenarios, e.g. when dealing with RESTful services returning JSON.
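Since the language doesn't matter, the shape of the lines that Exec reads can be sketched in Python just as well. The example assumes collectd's plain text protocol, where a line consists of the PUTVAL keyword, a host/plugin/type identifier, an optional interval and a time:value pair, with N standing for "now". The host name, the plugin instance and the metric name are hypothetical.

```python
import sys


def putval_line(host, plugin, type_name, value, interval=10):
    """Format one reading as a PUTVAL line; 'N' means the current time."""
    identifier = f"{host}/{plugin}/{type_name}"
    return f'PUTVAL "{identifier}" interval={interval} N:{value}'


def emit(line, stream=sys.stdout):
    """Write the line and flush so the reader sees it immediately."""
    stream.write(line + "\n")
    stream.flush()


# Hypothetical host, plugin instance and metric.
emit(putval_line("myhost", "exec-myapp", "gauge-queue_length", 17))
```

Flushing after every line matters because output written to a pipe is usually buffered, and the reading could otherwise arrive much later than it was taken.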

Before we try a JavaScript app as a data source for collectd, let’s talk about PUTVAL lines a little bit more.

Quick intro to rrdtool

I mentioned in the previous post that collectd uses rrdtool for saving its data by default. This results in an .rrd file for each metric, which can later be rendered using the very same rrdtool. RRD files are not something most people are familiar with, and the tool itself isn’t particularly easy to use, so why would such an easy-to-use tool as collectd choose it?

For a number of reasons.