So far, all the examples I made for the Docker Swarm Mode and Kubernetes blog posts were built around some sort of service: a web server, a message queue, a message bus. After all, “service” is the main concept in Swarm Mode, and even the whole micro-service application thing has, well, a “service” in it. But what about one-off jobs: maintenance tasks, scheduled events, or anything else that we need to run just sometimes, not as a service?
For instance, take unit tests. If the test suite takes 60 minutes to run, I could take 60 containers, distribute the tests among them, and throw the whole thing into a cluster. That’s a valid use case, and the end result is definitely not a service.
Doing so in Swarm would be possible, but tricky. We’d still have to create a service, and as a bare minimum we’d have to tell it not to restart containers that have finished. It’s much simpler with Kubernetes, though. Not only can we schedule execution of individual pods without a service, k8s also has Job and Cron Job workloads that make executing and controlling such pods much simpler. Let’s see how we can use these three.
Setup
To follow along you’ll need VirtualBox, minikube and kubectl. Access to Google Container Engine will also do. I covered the local setup details before, so let’s skip that part and jump straight into the cluster.
Pods
Assume you need a task to run. For instance, for some strange reason you decided to find all prime numbers between 1 and 70 using a bash script and Kubernetes. Such things happen, you know. This one-liner does the math (and it does, I checked):
```bash
current=0; max=70; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [ `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done
```
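If the one-liner is hard to follow, here’s a sketch of the same trial-division algorithm unrolled into a readable function (the `primes_upto` name is my own; the original doesn’t use a function):

```shell
#!/usr/bin/env bash
# Trial-division prime search, same logic as the one-liner above.
# Like the original, it also prints 1 (strictly speaking 1 is not
# prime, but we keep the output identical to the one-liner).
primes_upto() {
  local max=$1 i j is_prime
  echo 1; echo 2
  for ((i = 3; i <= max; i++)); do
    is_prime=1
    # Check every candidate divisor from 2 up to i-1.
    for ((j = 2; j < i; j++)); do
      if ((i % j == 0)); then
        is_prime=0
        break
      fi
    done
    if ((is_prime)); then echo "$i"; fi
  done
}

primes_upto 70
```
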
Then we can use this Pod configuration to schedule the task in a cluster:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: primes
spec:
  containers:
  - name: primes
    image: ubuntu
    command: ["bash"]
    args: ["-c", "current=0; max=70; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [ `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done"]
  restartPolicy: Never
```
There’s nothing special in this pod except that we explicitly tell it not to restart once it’s finished. Now we can deploy the pod with kubectl create -f pod.yml, watch it start, and then follow its STDOUT with kubectl logs -f primes, where the pod will echo its findings.
```
kubectl get pod
#NAME      READY     STATUS    RESTARTS   AGE
#primes    1/1       Running   0          3s

kubectl logs -f primes
#1
#2
#3
#...
#61
#67
```
After it’s done, we can still see the pod with kubectl get pod, but this time we need to pass it the --show-all parameter. After all, the pod is no longer running.
```
kubectl get pod --show-all
#NAME      READY     STATUS      RESTARTS   AGE
#primes    0/1       Completed   0          1m
```
However, this approach to executing tasks lacks a few important features. Firstly, what if the node this worker pod was running on suddenly shuts down? The pod dies along with it. Wouldn’t it be cool if something rescheduled it elsewhere?
Moreover, finding prime numbers in bash is slow. We could make it faster by splitting the 1..70 range into, let’s say, three smaller ranges (1-30, 31-50, 51-70) and distributing them between multiple pods. But this means we’d need to create those pods manually. Not a big deal for three pods, but quite a problem for a hundred.
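A sketch of what that manual fan-out could look like: a loop that stamps out one pod manifest per sub-range. The file names, pod names and the placeholder command are all my own invention here; the real script would have to take the bounds as parameters.

```shell
#!/usr/bin/env bash
# Generate one pod manifest per sub-range of the 1..70 search space.
# Afterwards they could all be submitted with: kubectl create -f .
for range in "1 30" "31 50" "51 70"; do
  # Word-split the range into positional parameters: $1=min, $2=max.
  set -- $range
  cat > "pod-primes-$1-$2.yml" <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: primes-$1-$2
spec:
  containers:
  - name: primes
    image: ubuntu
    command: ["bash", "-c", "echo searching primes between $1 and $2"]
  restartPolicy: Never
EOF
done

# Three manifests now exist: pod-primes-1-30.yml, pod-primes-31-50.yml,
# pod-primes-51-70.yml
ls pod-primes-*.yml
```

Even automated like this, we’d still own the bookkeeping: naming, submitting and cleaning up every pod ourselves.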
This is exactly where Job workloads come in.
Jobs
A Job is a special kind of controller that creates and manages a set of pods doing some finite piece of work. Like a Deployment, a Job will recreate its pods in case of a node failure. It also has a parallelism property to specify how many pods should be doing the job at once, and a completions property for how many of them should succeed before the whole job counts as “finished”.
Here’s how simple it is to convert a Pod from previous example into a Job workload:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: primes
spec:
  template:
    metadata:
      name: primes
    spec:
      containers:
      - name: primes
        image: ubuntu
        command: ["bash"]
        args: ["-c", "current=0; max=70; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [ `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done"]
      restartPolicy: Never
```
It’s basically a copy-paste exercise. However, the job becomes much more interesting if we tell it to run up to 4 workers in parallel for as long as it takes to get 8 successful completions:
```yaml
#...
spec:
  completions: 8
  parallelism: 4
  template:
  #...
```
Of course, those 4 parallel pods will all be doing the same thing, finding prime numbers, but in real life they could be picking tasks from a message queue, a database, or anywhere else.
Now, create the job, give it some time, and see what’s happening in the pods area:
```
kubectl create -f job.yml
#job "primes" created

# several seconds later
kubectl get pods --show-all
#NAME            READY     STATUS              RESTARTS   AGE
#primes-5g2xp    0/1       Completed           0          31s
#primes-9l9tf    0/1       ContainerCreating   0          4s
#primes-d2jwk    0/1       Completed           0          14s
#primes-pxhqx    0/1       ContainerCreating   0          4s
#primes-rvq5x    0/1       Completed           0          31s
#primes-rxdrw    1/1       Running             0          4s
#primes-sw2lq    0/1       Completed           0          31s
#primes-v5bv8    0/1       Completed           0          31s
```
Yup, they are definitely doing something in parallel.
We could also try other parallelism and completions combinations. For example, skipping parallelism would cause 8 pods to run one after another until we reach the completions count. Alternatively, skipping completions would cause 4 parallel pods to be scheduled, and as soon as they’re done, the Job would be done as well.
Cron Jobs
Just as a Job starts worker pods, a Cron Job starts a Job on a schedule. The schedule property uses the crontab format, so if you’ve ever created a cron job on Linux, you already know how to configure a Cron Job in Kubernetes.
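As a quick refresher, the five crontab fields are, left to right: minute, hour, day of month, month, and day of week. The schedule we’re about to use reads like this:

```
# */1 * * * *
# │   │ │ │ └── day of week (0-6, Sunday = 0)
# │   │ │ └──── month (1-12)
# │   │ └────── day of month (1-31)
# │   └──────── hour (0-23)
# └──────────── minute (0-59); */1 means "every minute"
```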
Let’s assume that calculating prime numbers has become so important for our company that we need to recalculate them once per minute. Absolutely no problem: I’ll just copy the previous job configuration into the jobTemplate of a Cron Job and set up a schedule:
```yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: primes
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      completions: 8
      parallelism: 4
      template:
        metadata:
          name: primes
        spec:
          containers:
          - name: primes
            image: ubuntu
            command: ["bash"]
            args: ["-c", "current=0; max=70; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [ `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done;"]
          restartPolicy: Never
```
We can create and monitor it the same way we did for regular jobs.
```
kubectl create -f cron.yml
#cronjob "primes" created

kubectl get cronjobs
#NAME      SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
#primes    */1 * * * *   False     0         <none>

kubectl get cronjobs
#NAME      SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE                     AGE
#primes    */1 * * * *   False     1         Tue, 28 Nov 2017 00:24:00 -0500
```
Absolutely no surprises.
```
kubectl get pods --show-all
#NAME                      READY     STATUS      RESTARTS   AGE
#primes-1511846640-5m9tz   1/1       Running     0          5s
#primes-1511846640-5xvqj   0/1       Completed   0          32s
#primes-1511846640-dn8mq   1/1       Running     0          5s
#primes-1511846640-g98qb   0/1       Completed   0          32s
#primes-1511846640-hk7rl   0/1       Completed   0          32s
#primes-1511846640-kkcks   1/1       Running     0          5s
#primes-1511846640-vv5zm   0/1       Completed   0          32s
#primes-1511846640-xlf4r   1/1       Running     0          5s
```
Summary
So as you can see, there isn’t just one way to run one-off jobs in Kubernetes; there are actually three. For small ad-hoc tasks, running them directly in pods should be enough. When a task can be parallelized and we care about giving a second chance to pods that failed for some reason, a Job sounds like a reasonable choice to make. Finally, if we’re talking about a task that should run on a schedule, a Cron Job is the way to go.