Scraping application metrics with Prometheus

There are two conceptually different approaches to collecting application metrics. The first one is PUSH: the metrics storage sits somewhere and waits until a metrics source pushes some data into it. For instance, Graphite doesn’t do any collection on its own; it waits until somebody like collectd does the delivery.

The second approach is PULL. Here, metrics sources don’t try to be smart and simply provide their readings on demand. Whoever needs those metrics can make a call, e.g. an HTTP request, in order to get some.
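For illustration, here’s what the pull looks like from Prometheus’s side: a minimal prometheus.yml sketch (the job name and target address are made up) telling the server which HTTP endpoints to call and how often:

    # Minimal scrape configuration: poll the app's /metrics endpoint every 15 seconds.
    scrape_configs:
      - job_name: 'my-app'
        scrape_interval: 15s
        static_configs:
          - targets: ['localhost:8080']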

Prometheus collects metrics using the second approach. Continue reading “Scraping application metrics with Prometheus”

Building dashboards with Grafana

Even though Graphite does a very decent job displaying individual metric graphs, its dashboard support is quite limited. Of course, we could take its powerful Render URL API and build anything we like in good old HTML, but on the other hand, there’s Grafana.
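For instance, a single GET request like this hypothetical one returns a ready-made PNG, so a hand-rolled dashboard could be little more than a page of img tags (the metric path is made up):

    http://graphite.example.com/render?target=collectd.myhost.load.load.shortterm&from=-1h&width=600&height=300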

Grafana dashboard

Continue reading “Building dashboards with Grafana”

Creating collectd data source in JavaScript

Graph built with data collected by a JavaScript app

Exec Plugin

Among the variety of collectd plugins there’s one ‘to rule them all’. If by some course of events all collectd plugins except Exec were taken from you, you’d still be able to restore their functionality with Exec alone.

As the name suggests, Exec starts an external program or script and interprets its output as a source of data. To be specific, it looks for lines that follow this scheme:
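    PUTVAL "<host>/<plugin>-<plugin_instance>/<type>-<type_instance>" [interval=<seconds>] <time>:<value>

The <plugin_instance> and <type_instance> parts are optional, and <time> can be the letter N, meaning ‘right now’.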

To be even more specific, lines like these would work (the host and metric names are made up):
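    PUTVAL "myhost/exec-app/gauge-users_online" interval=10 N:42
    PUTVAL "myhost/exec-app/counter-requests" interval=10 1481094010:1701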

What’s interesting, Exec doesn’t specify what language the script should be written in, so anything, including JavaScript, would work. In fact, using JavaScript would be beneficial in some scenarios, e.g. when dealing with RESTful services returning JSON.
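As a preview, a sketch of such a script could be as small as this (the endpoint URL and metric name are made up, and error handling is left out):

    #!/usr/bin/env node
    // Poll a hypothetical REST endpoint and emit PUTVAL lines for collectd's Exec plugin.
    var http = require('http');

    // collectd passes these environment variables to Exec scripts.
    var HOST = process.env.COLLECTD_HOSTNAME || 'myhost';
    var INTERVAL = parseInt(process.env.COLLECTD_INTERVAL || '10', 10);

    function report() {
      http.get('http://localhost:8080/stats', function (res) {
        var body = '';
        res.on('data', function (chunk) { body += chunk; });
        res.on('end', function () {
          var stats = JSON.parse(body);
          // "N" tells collectd to use the current timestamp.
          console.log('PUTVAL "' + HOST + '/exec-app/gauge-users_online" interval=' + INTERVAL + ' N:' + stats.usersOnline);
        });
      });
    }

    setInterval(report, INTERVAL * 1000);
    report();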

But before we try a full JavaScript app as a data source for collectd, let’s talk about PUTVAL lines a little bit more. Continue reading “Creating collectd data source in JavaScript”

Quick intro to rrdtool

I mentioned in the previous post that collectd uses rrdtool for saving its data by default. This results in an .rrd file for each metric, which later can be rendered using the very same rrdtool. RRD files are not something most people are familiar with, and the tool itself isn’t particularly easy to use, so why would such an easy-to-use tool as collectd choose it?
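To get a taste of the tool, here’s a minimal round trip (a sketch; the file name, data source and archive settings are arbitrary):

    # Create an RRD with one gauge metric, expecting a reading every 60 seconds
    # and keeping 24 hours of 1-minute averages.
    rrdtool create temperature.rrd --step 60 \
        DS:temp:GAUGE:120:U:U \
        RRA:AVERAGE:0.5:1:1440

    # Store a reading ("N" means "now").
    rrdtool update temperature.rrd N:42.5

    # Render the last hour as a PNG.
    rrdtool graph temperature.png --start -1h \
        DEF:t=temperature.rrd:temp:AVERAGE \
        LINE1:t#FF0000:"temperature"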

Why? For a number of reasons. Continue reading “Quick intro to rrdtool”

Host monitoring with collectd

Distributed apps introduce a challenge that we usually could avoid in monolithic ones: how do we tell that the app is performing well? I’m not talking about it being user-friendly or providing business value. How do you tell that the components of your distributed app are actually running? Which services are overutilized? Underutilized? Running out of disk space?

There are tools to get those answers, and collectd is one of them.
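As a sneak peek, a minimal collectd.conf answering exactly those questions could look like this (a sketch; plugin availability and the DataDir path vary between distributions):

    # Collect CPU, memory and disk usage, and write the readings to RRD files.
    LoadPlugin cpu
    LoadPlugin memory
    LoadPlugin df
    LoadPlugin rrdtool

    <Plugin rrdtool>
      DataDir "/var/lib/collectd/rrd"
    </Plugin>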

Continue reading “Host monitoring with collectd”

Highly available Kafka cluster in Docker

Apache Kafka cluster in Docker

Up until now we’ve been experimenting with Apache Kafka, a tool built with clustering and high availability in mind, but using exactly one host and availability settings that only a few very optimistic people would call high.

Not today.

Today we’re going to spin up a multi-host Kafka cluster and replicate a topic in it, so that if one host goes down, the data and its availability won’t suffer.
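Just to preview where we’re heading, creating such a replicated topic is a one-liner (assuming three brokers and a ZooKeeper container reachable as zookeeper:2181; the names are made up):

    bin/kafka-topics.sh --create \
        --zookeeper zookeeper:2181 \
        --replication-factor 3 \
        --partitions 1 \
        --topic events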

Continue reading “Highly available Kafka cluster in Docker”

“Hello world” with Apache Kafka

Single node cluster

So it’s time to send some data bits through Apache Kafka. But first, as usual, we need to install it.

Installing Kafka is so trivial that I’ll change my rule and actually explain the process. Here goes the manual, followed by a quick smoke test:

  1. Install Java Development Kit (you probably have it already)
  2. Download Kafka tarball
  3. Uncompress it (tar -xzf kafka_2.11-0.10.1.0.tgz on *nix systems)
  4. Done. You installed Kafka.
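And the promised smoke test: start ZooKeeper and a broker, push a message through the console producer, and watch it arrive at the console consumer (commands match the 0.10.x tarball layout; run the two servers in separate terminals):

    # 1. Start ZooKeeper and the Kafka broker.
    bin/zookeeper-server-start.sh config/zookeeper.properties
    bin/kafka-server-start.sh config/server.properties

    # 2. Create a topic and send a message to it.
    bin/kafka-topics.sh --create --zookeeper localhost:2181 \
        --replication-factor 1 --partitions 1 --topic test
    echo "Hello world" | bin/kafka-console-producer.sh \
        --broker-list localhost:9092 --topic test

    # 3. Read it back.
    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
        --topic test --from-beginning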

Continue reading ““Hello world” with Apache Kafka”

Quick intro to Apache Kafka

What is Apache Kafka

The official definition of Apache Kafka is ‘distributed streaming platform’, which starts to make sense only after reading at least a few chapters of its documentation. However, the idea behind it is relatively simple. In large distributed apps we have many services that produce messages: logs, monitoring events, audit entries – any type of records. On the other hand, there’s a similar number of services that consume that data. Kafka brings these parties together: it accepts data from producers, reliably stores it in topics, and allows consumers to subscribe to them. In other words, Kafka is a love child of a distributed storage and a messaging system.
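To make that concrete, here’s roughly how the two parties could look in JavaScript using the third-party kafka-node client (the topic name and broker address are made up; treat it as a sketch, not a reference):

    // A producer and a consumer meeting at the same topic.
    var kafka = require('kafka-node');
    var client = new kafka.KafkaClient({ kafkaHost: 'localhost:9092' });

    // The producing side: publish a record to the 'logs' topic.
    var producer = new kafka.Producer(client);
    producer.on('ready', function () {
      producer.send([{ topic: 'logs', messages: ['user logged in'] }], function (err) {
        if (err) { console.error(err); }
      });
    });

    // The consuming side: subscribe to the same topic.
    var consumer = new kafka.Consumer(client, [{ topic: 'logs' }], { autoCommit: true });
    consumer.on('message', function (message) {
      console.log(message.value);
    });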

Apache Kafka

Continue reading “Quick intro to Apache Kafka”

Building RabbitMQ Cluster

Cluster with RabbitMQ

As I promised last time, it’s time to check out a RabbitMQ feature we can consider advanced – clustering. A RabbitMQ cluster is a set of individual nodes that share the same users, queues, exchanges and runtime parameters. New nodes can come and go, or be located on different continents, yet to a connected client they will look like one entity.

Clustering is not the same as replication or high availability, though. Yes, users and whatever else is usually necessary for a node to work will be copied across all nodes. Queues, however, will reside on the node where they were initially created, though they will be accessible from any node. If a node goes down, its queues go with it.
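For a preview, joining a fresh node to an existing one takes just a few rabbitmqctl commands (the node names are hypothetical, and both machines must share the same Erlang cookie):

    # On the joining node (rabbit@node2):
    rabbitmqctl stop_app
    rabbitmqctl join_cluster rabbit@node1
    rabbitmqctl start_app

    # Verify from any node:
    rabbitmqctl cluster_status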

Continue reading “Building RabbitMQ Cluster”