In one of my previous posts, about Docker health checks, closer to the end I managed to build an image from a Dockerfile and run it as a service in Docker in Swarm mode. To be honest, I’m a little bit surprised that Docker allowed me to do that. That Swarm cluster could’ve had more than one host. What if the service landed somewhere where the underlying image didn’t exist? A Swarm node wouldn’t copy the image to the node that needs it, right? Or would it?
Let’s try replicating our service based on a custom image across all hosts of a multi-host Swarm cluster and see how that goes (spoiler: we’ll need a private registry in order for that to work).
Attempt 1. Simply use locally built image
Create Swarm
As usual, I have docker-machine and VirtualBox installed, so creating a multi-host Swarm cluster is a trivial task. First, create one VM for the manager node and two more for the worker nodes:
$ docker-machine create master
$ docker-machine create worker-1
$ docker-machine create worker-2
$ docker-machine ls
# NAME       ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER        ERRORS
# master     -        virtualbox   Running   tcp://192.168.99.100:2376           v17.06.0-ce
# worker-1   -        virtualbox   Running   tcp://192.168.99.101:2376           v17.06.0-ce
# worker-2   -        virtualbox   Running   tcp://192.168.99.102:2376           v17.06.0-ce
Piece of cake. Then let’s turn them into a Swarm cluster by running docker swarm init on the master node and executing the ‘join’ command it produces on the other two hosts:
docker-machine ssh master \
  docker swarm init --advertise-addr 192.168.99.100
# Swarm initialized: current node (uiywvzl6p0guvbxgrhv1td7jz) is now a manager.
#
# To add a worker to this swarm, run the following command:
#
#     docker swarm join --token SWMTKN-1-2ok9r6uroyreyghfpkqlj42cninm63rzy4vifjiu9rzg8z0okv-7lu70r5goheti58bcr74m00zf 192.168.99.100:2377
#
# To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

docker-machine ssh worker-1 \
  docker swarm join --token SWMTKN-1-2ok9r6uroyreyghfpkqlj42cninm63rzy4vifjiu9rzg8z0okv-7lu70r5goheti58bcr74m00zf 192.168.99.100:2377
# This node joined a swarm as a worker.

docker-machine ssh worker-2 \
  docker swarm join --token SWMTKN-1-2ok9r6uroyreyghfpkqlj42cninm63rzy4vifjiu9rzg8z0okv-7lu70r5goheti58bcr74m00zf 192.168.99.100:2377
# This node joined a swarm as a worker.
Swarm is ready. Now let’s install the helper viz service and get to business.
Install visualization service
When dealing with multiple hosts and the services on them, it’s useful to have some sort of visual feedback. That’s why I think having a visualization service in a cluster is a good idea. First, I’ll connect the local Docker client to the Docker Engine on the manager host, and then we can create the viz service directly in the cluster.
# Connect to Docker Engine in master
eval $(docker-machine env master)

# Deploy visualization service
docker service create \
  --name=viz \
  --publish=8080:8080 \
  --constraint=node.role==manager \
  --mount=type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
  dockersamples/visualizer
Checking the master’s port 8080 produces this nice picture, which means we’re safe to go.
Building a custom image
Even though any Dockerfile would suffice, I picked the one with the JS file from the Docker health checks post. It’s just simpler for me, and I know it worked at least once. So here’s the Dockerfile I used back then:
FROM node
COPY server.js /
EXPOSE 8080 8081
HEALTHCHECK --interval=5s --timeout=10s --retries=3 CMD curl -sS 127.0.0.1:8080 || exit 1
CMD [ "node", "/server.js" ]
And the server.js that accompanied it:
"use strict";
const http = require('http');

function createServer () {
  return http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('OK\n');
  }).listen(8080);
}

let server = createServer();

http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  if (server) {
    server.close();
    server = null;
    res.end('Shutting down...\n');
  } else {
    server = createServer();
    res.end('Starting up...\n');
  }
}).listen(8081);
Because our local Docker client is still connected to the master Docker Engine, we can simply build the image and expect that it’ll be available on at least one host in the cluster – master:
docker build . -t server:latest
# ...
# Successfully tagged server:latest
And now to the main part: creating and scaling a service based on the custom image.
Starting the service in Swarm
Creating a service from a custom image on a single node that knows about the image is easy. I did it once, and there’s no reason it shouldn’t work again.
docker service create --name=node-server server
# image server:latest could not be accessed on a registry to record
# its digest. Each node will access server:latest independently,
# possibly leading to different nodes running different
# versions of the image.
#
# be04kf5lkimmwganpqgii1j7h
By the way, the service create output did warn us that the image was not taken from a registry.
Checking it in the visualization service confirms it did work:
And now the moment of truth: will it scale?
docker service scale node-server=3
# node-server scaled to 3
Hm, interesting. It did scale. But how? Let’s check what the service’s tasks have to say:
docker service ls
# ID            NAME         MODE        REPLICAS  IMAGE                            PORTS
# be04kf5lkimm  node-server  replicated  3/3       server:latest
# fbpsri2f35zd  viz          replicated  1/1       dockersamples/visualizer:latest  *:8080->8080/tcp

docker service ps be04kf5lkimm
# ID            NAME               IMAGE          NODE      DESIRED STATE  CURRENT STATE            ERROR                            PORTS
# m7rk2hj6uaoq  node-server.1      server:latest  master    Running        Running 8 seconds ago
# kgq0oto5d1o8   \_ node-server.1  server:latest  worker-2  Shutdown       Rejected 21 seconds ago  "No such image: server:latest"
# tb1d1q8rarl4   \_ node-server.1  server:latest  worker-2  Shutdown       Rejected 27 seconds ago  "No such image: server:latest"
# vgm1k2spyhnj   \_ node-server.1  server:latest  worker-2  Shutdown       Rejected 31 seconds ago  "No such image: server:latest"
# njsqyfu0jt1m   \_ node-server.1  server:latest  worker-1  Shutdown       Rejected 36 seconds ago  "No such image: server:latest"
# u0yj5jteuali  node-server.2      server:latest  master    Running        Running 9 seconds ago
# w6h0m9s02hbd   \_ node-server.2  server:latest  worker-2  Shutdown       Rejected 23 seconds ago  "No such image: server:latest"
# 3go9gqapkt3g   \_ node-server.2  server:latest  worker-1  Shutdown       Rejected 29 seconds ago  "No such image: server:latest"
# j6yzc4k8lblm   \_ node-server.2  server:latest  worker-1  Shutdown       Rejected 33 seconds ago  "No such image: server:latest"
# 9679vv7vjis3   \_ node-server.2  server:latest  worker-2  Shutdown       Rejected 35 seconds ago  "No such image: server:latest"
# j33pwzps37k8  node-server.3      server:latest  master    Running        Running 29 seconds ago
Dirty cheater: it put all replicas onto the same host, master. But it’s really interesting to see the number of failed tasks, which did try deploying the service on the worker nodes but all failed with the quite predictable error No such image: server:latest. What a surprise. But at least we confirmed that Swarm won’t copy custom images between its hosts, and we need to find another approach.
Let’s remove the service and try something else.
docker service rm node-server
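For completeness, there is a brute-force workaround we won’t be pursuing: manually shipping the image to each node with docker save and docker load. This is the “deploy the image on every host” approach, and the sketch below is purely hypothetical – the copy_image_to_node helper and the /tmp and /home/docker paths are my own naming, not anything from the cluster above:

```shell
# Hypothetical workaround: ship an image to a node with no registry involved.
# copy_image_to_node and the paths used here are assumptions for this sketch.
copy_image_to_node() {
  node="$1"
  image="$2"
  # Derive a filesystem-safe tarball name from the image reference.
  tarball="/tmp/$(echo "$image" | tr '/:' '__').tar"
  docker save "$image" -o "$tarball"                                # export image to a tar archive
  docker-machine scp "$tarball" "$node":/home/docker/image.tar      # copy it to the node
  docker-machine ssh "$node" docker load -i /home/docker/image.tar  # import it there
}

# Usage (would need a real swarm):
#   for node in worker-1 worker-2; do copy_image_to_node "$node" server:latest; done
```

It works, but you’d have to repeat it on every node after every rebuild, which is exactly why a registry is the better approach.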
Attempt 2. Use private registry in Swarm
The obvious something else is some sort of in-swarm Docker Hub: a private registry service. If it’s available within the cluster, we can push our custom image into it and refer to it during node-server creation. Creating a private registry is in general fairly easy to do, and something like docker run -d -p5000:5000 registry:latest would do the trick. However, there’s a catch: an HTTP-based registry is only available from localhost, so if we need to make it available to the whole cluster, we need HTTPS and therefore a trusted SSL certificate (I don’t want to use the insecure-registries Docker Engine flag, or whatever it’s called now). (Update: as smart people mentioned, if the registry is going to be pushed to and pulled from only within the cluster, we can still go with HTTP – localhost is shared within the cluster.) On the other hand, there’s a couple of tricks we can use to get away with a self-signed certificate, so here’s the plan:
- Generate a self-signed SSL certificate for, let’s say, myregistry.com, which will point to our in-swarm registry.
- Force Docker to trust that certificate.
- Add myregistry.com to /etc/hosts of master and worker-* nodes, so they know where the registry is.
- Create the registry service in HTTPS mode.
- Push the server:latest image into the registry.
- Create a service from the image from the private registry.
That’s a lot of things to do, so let’s get started.
Generate self-signed certificate
Fortunately, this part is trivial. At least on *nix systems:
openssl req -newkey rsa:4096 -nodes -sha256 \
  -keyout registry.key -x509 -days 365 \
  -out registry.crt
It asked several questions along the way, which I safely ignored, but one of them – Common Name – is important. That should be the name of our registry service: myregistry.com.
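If you’d rather skip the interactive questions entirely, openssl can take the subject on the command line. A sketch of the same command in non-interactive form – the only thing that matters here is setting the Common Name via -subj:

```shell
# Same self-signed certificate, generated without any prompts:
# -subj sets the subject directly, so Common Name becomes myregistry.com.
openssl req -newkey rsa:4096 -nodes -sha256 \
  -keyout registry.key -x509 -days 365 \
  -subj "/CN=myregistry.com" \
  -out registry.crt
```

Handy for scripting the whole setup, which we’ll want anyway once there’s more than one node to provision.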
Force Docker to trust self-signed certificate
That one would also be simple to do if it wasn’t 1 AM in my time zone. The secret is to place the registry.crt file into Docker Engine’s certificates store. I did that in three steps:
- Copy registry.crt file to Swarm host,
- create a folder for the certificate,
- move registry.crt into that folder.
docker-machine scp registry.crt master:/home/docker/ && \
docker-machine ssh master sudo mkdir -p /etc/docker/certs.d/myregistry.com:5000 && \
docker-machine ssh master sudo mv /home/docker/registry.crt /etc/docker/certs.d/myregistry.com:5000/ca.crt
This will make the master node trust the certificate, but we need to repeat this command for worker-1 and worker-2 as well. I’m almost certain there’s a much simpler way to do that, but after midnight this is as far as I can go.
Add myregistry.com to /etc/hosts
The master node happened to have the 192.168.99.100 IP address, and this is the node that will host the registry, so myregistry.com should point to that IP.
docker-machine ssh master
sudo sh -c "echo ' 192.168.99.100 myregistry.com' >> /etc/hosts"
exit
This also needs to be done to worker nodes.
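That repetition can be scripted too; here’s a minimal sketch, assuming the node names from earlier (add_registry_host is a hypothetical helper name, and it runs the same sh -c trick non-interactively over docker-machine ssh):

```shell
# Sketch: append the registry's /etc/hosts entry on a given node.
# add_registry_host is a hypothetical helper, not from the post.
add_registry_host() {
  node="$1"
  docker-machine ssh "$node" "sudo sh -c 'echo \"192.168.99.100 myregistry.com\" >> /etc/hosts'"
}

# Usage:
#   for node in master worker-1 worker-2; do add_registry_host "$node"; done
```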
Create registry in HTTPS mode
The registry service will need access to the certificate files, and I didn’t want to use docker secret to store them (for no particular reason), so first we need to copy them to the master node:
docker-machine scp registry.crt master:/home/docker/ && \
docker-machine scp registry.key master:/home/docker/
And now the time for our own Docker registry has come:
docker service create --name registry --publish=5000:5000 \
  --constraint=node.role==manager \
  --mount=type=bind,src=/home/docker,dst=/certs \
  -e REGISTRY_HTTP_ADDR=0.0.0.0:5000 \
  -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt \
  -e REGISTRY_HTTP_TLS_KEY=/certs/registry.key \
  registry:latest
Behold! One more colored rectangle on the cluster map:
Push server image into private registry
We could’ve just updated the tag of the existing server image, but rebuilding the image with the right tag takes about the same number of keystrokes, so why bother:
docker build . -t myregistry.com:5000/server:latest
Because our image tag starts with the registry URL, the push will find its way to the right place:
docker push myregistry.com:5000/server:latest
# The push refers to a repository [myregistry.com:5000/server]
# 309eab97be6f: Pushed
Now our image is supposed to be in the registry hosted in the Swarm. We’ll confirm that in a moment.
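One quick way to confirm is to ask the registry for its catalog over the v2 HTTP API. A sketch – registry_catalog is my own wrapper name, and -k is there only because our certificate is self-signed:

```shell
# Sketch: list the repositories stored in the private registry.
# registry_catalog is a hypothetical wrapper; -k skips cert verification
# because ours is self-signed.
registry_catalog() {
  curl -ks https://myregistry.com:5000/v2/_catalog
}

# Against our swarm this should print something like:
#   {"repositories":["server"]}
```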
Create and scale Swarm service
The moment of truth:
docker service create --name=node-server myregistry.com:5000/server
docker service scale node-server=3
And visualizer:
It worked! The scriptures didn’t lie: using a private registry does solve the problem of custom images.
Summary
Using custom Docker images in Swarm is not as trivial as using them on a standalone Docker host. Unless you deploy your image to every host in the cluster, which sounds quite impractical, you’ll have to use a private registry for it. Fortunately, with a legit SSL certificate it’s fairly easy to do. Even with a self-signed one: if your Swarm hosts are provisioned automatically (as they should be), configuring them to trust a self-signed certificate is quite straightforward.
Thank you so much for that tidy and useful blog post, describing it right to the point.
Well done!
You’re welcome. I’m glad you found it useful.
Hello, thank you so much for this great post. I have pushed an image to the private repository, but the worker nodes still can’t see that image, and the service is still limited to the server node. Even if I try to go through the worker nodes to pull the image, it always gives me this message: ”Error response from daemon: Get https://myregistry.com:5000/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)”.
Also, on the server node, if I try ‘curl -k https://myregistry.com:5000/v2/_catalog‘, it gives me this error: ”curl: (7) Failed to connect to myregistry.com port 5000: Timed out“. Would you please let me know what I am doing wrong?
You are welcome. It might be hard to diagnose this without seeing it all, but can you ping the host itself? E.g.
ping myregistry.com
. There was the following code block in the post, which was needed exactly for making the host visible:
docker-machine ssh master
sudo sh -c "echo ' 192.168.99.100 myregistry.com' >> /etc/hosts"
exit
If myregistry.com and its IP address is not in /etc/hosts, you won’t be able to access the host via this name. If it’s there and still doesn’t work, can you ping the host via its IP address? Maybe your hosts are in separate networks.
Hello Pav, thank you so much for your reply. So, yes, I followed your steps and was able to push an image to the private registry. But the problem now is that when I run the service, the worker node still can’t pull the image, and it only works with the manager node. Even when I tried to pull the image from the worker node, it gave me this error: ”Error response from daemon: Get https://myregistry.com:5000/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)”. I can see the image in the private registry from the manager node only.
Hi Pav,
Great post!
I am trying to figure out if it is possible to deploy a stack by simply declaring the registry as a service inside docker-compose.yml without having to create the registry service prior to the docker stack deploy. What I see from documentation and examples I’ve come up with so far, is that the registry must be created first and then the images be pushed into it. In theory though it would be great to deploy a stack in the swarm without this extra step…
Any thoughts?
Hi George,
Very theoretically, if the first of your docker-compose services deploys a registry, and the second one downloads private image .tar from somewhere and pushes it into the registry, then probably the third service could pull the image from it. But I’ve never tried this and the whole approach depends on at what stage docker-compose pulls the images for its services. If it happens on demand – right before the service is going to be initialized, then it might work.
But again, what happens next with the second service? It pulled the .tar, it did its job, so probably it should vanish. But the last time I checked Swarm there were no way to create one-time jobs or services.
So basically, the whole idea might work, but it looks like it’s going to be messy 🙂
Hello George
Have you solved this issue? I’m facing the same issue too, could you please help me?
In my scenario I am able to pull the image from my private registry, but if I run docker service create with the name and the image, it throws the following error:
image sha256:d887e0ca5f95160f532a23c907593ca20d5b6653ac0e85f34837ed49175b2f82 could not be accessed on a registry to record
its digest.
Bhaskar,
The real answer to orchestration is Kubernetes.
If you need to use local registry, just do it with docker commands to create the service, push the images to it and then use it in your stack. No, you can’t do it in one go with compose.
Or use a private registry service. Like docker hub.
One more point, and that is why I said Kubernetes, because docker is weak on multi-host. And especially on persistent data. Even swarm.
Thank you sir, just what I was looking for.
Keep up the good work.
In case anyone finds it useful, while trying to take advantage of the “localhost” trick, by deploying the registry to the swarm and referencing it using http://localhost:5000/v2/, I seem to have come across an old docker bug related to conflicts between IPv4 and IPv6 and exposed ports.
If anyone is having problems reaching your own local registry when using localhost, try the loopback 127.0.0.1 instead.
Meaning, this should be reachable from all nodes in the cluster:
curl http://127.0.0.1:5000/v2/
Awesome article!
Can you elaborate a bit more regarding what you mean by:
> (Update: as smart people mentioned, if registry is going to be pushed to and pulled from only from within the cluster, we still can go with HTTP – localhost is shared within the cluster).
In what way the localhost is shared within the cluster?
If a worker tries to pull from localhost:5000/someimage, will it succeed if the registry is on another node in the cluster?
Thank you, Amit.
Yes, you can skip the whole SSL and /hosts setup and just use
127.0.0.1
address instead (haven’t testedlocalhost
, though). Any node in the swarm will be able to pull from that registry. I don’t know Docker networking internals, but it looks like in swarm mode it treats localhost as the cluster itself. It’s not something fundamentally new – containers within the same pod in Kubernetes share 127.0.0.1 as well.
This is how the code from the post would change for ‘localhost’ mode:
docker service create --name registry --publish=5000:5000 -e REGISTRY_HTTP_ADDR=0.0.0.0:5000 registry:latest
docker build . -t 127.0.0.1:5000/server:latest
docker push 127.0.0.1:5000/server:latest
docker service create --name=node-server 127.0.0.1:5000/server
Awesome. Thank you.
This also worked for me, thanks for the article!
its working perfectly , Thank you
Does this still work after a complete restart?
Can’t test it now, but after a host machine restart your VirtualBox VMs should remain stopped. After the VMs restart, I’d be slightly surprised if the swarm and its services, including the registry, stopped working. But it needs confirming.
Any chance you have something like this where Gitlab CI pushes new versions of the image to the registry?
Not the image, but something similar. [Here](https://codeblog.dotsandbrackets.com/continuous-integration-deployment/), closer to the end of the post, gitlab-runner in one container copies build results to another one. If your container is a part of a swarm, instead of
tsc
andcp
for build and upload you could usedocker build
anddocker push
. Being inside of a swarm you can push directly to127.0.0.1:5000
and avoid setting up SSL and self-signed certificates.In this step: –
Add myregistry.com to /etc/hosts
Do we need to add same master node’s IP address to point to myregistry.com in worker nodes /etc/hosts ??
Hello my dear, how are you?
It’s 2024 and your altruistic attitude of sharing your knowledge helped me resolve this issue of registering docker images in my Homelab Docker Swarm because every time it kept returning an error that the “image” was not located on one of the nodes.
Some tips for anyone who reads after me:
– If the blocking message returns: “dial tcp 172.16.0.211:5000: connect: connection refused” check the logs to see if the certificate was imported into your myregistry service. It will show a message that the certificate “mydomain.crt not found”.
After correcting this step (I was mapping the directory the wrong way) the system worked like magic.
– To make viewing logs easier, I recommend the portainer.io community version.
Thank you very much Pav. Success in your personal and professional journey.