Using private registry in Docker Swarm

In one of my previous posts, about Docker health checks, closer to the end I built a custom image from a Dockerfile and ran it as a service in Docker Swarm mode. To be honest, I’m a little bit surprised that Docker allowed me to do that. That Swarm cluster could’ve had more than one host. What if the service landed somewhere where the underlying image didn’t exist? Swarm wouldn’t copy the image to the node that needs it, right? Or would it?

Let’s try replicating a service based on a custom image across all hosts of a multi-host Swarm cluster and see how that goes (spoiler: we’ll need a private registry in order for that to work).

Attempt 1. Simply use a locally built image

Create Swarm

As usual, I have docker-machine and VirtualBox installed, so creating a multi-host Swarm cluster is a trivial task. First, create one VM for the manager node and two more for the worker nodes:
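The commands for that would look roughly like this (the node names master, worker-1 and worker-2 are the ones used throughout the post):

  docker-machine create -d virtualbox master
  docker-machine create -d virtualbox worker-1
  docker-machine create -d virtualbox worker-2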

Piece of cake. Then let’s turn them into a Swarm cluster by running docker swarm init on the master node and executing the ‘join’ command it produces on the other two hosts:
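Something along these lines; the join token below is just a placeholder, since swarm init prints the exact join command to run on the workers:

  docker-machine ssh master docker swarm init --advertise-addr 192.168.99.100
  # 'swarm init' output contains the real token for the following two commands
  docker-machine ssh worker-1 docker swarm join --token <token-from-init-output> 192.168.99.100:2377
  docker-machine ssh worker-2 docker swarm join --token <token-from-init-output> 192.168.99.100:2377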

Swarm is ready. Now let’s install a helper viz service and get down to business.

Install visualization service

When dealing with multiple hosts and services on them, it’s useful to have some sort of visual feedback. That’s why I think having a visualization service in the cluster is a good idea. First, I’ll connect the local Docker client to the Docker Engine on the manager host, and then we can create the viz service directly in the cluster.
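A sketch of both steps, using the stock dockersamples/visualizer image (the exact image and options in the original setup may have differed slightly):

  eval $(docker-machine env master)
  docker service create --name viz \
    --publish 8080:8080 \
    --constraint node.role==manager \
    --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
    dockersamples/visualizer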

Checking master‘s port 8080 produces this nice picture, which means we’re good to go.

visualization service

Building a custom image

Even though any Dockerfile would suffice, I picked the one with the JS file from the Docker health checks post. It’s just simpler for me, and I know it worked at least once. So here’s the Dockerfile I used back then:
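The exact file lives in the health checks post; a minimal sketch of it looks roughly like this (the base image tag and the health check parameters are approximations):

  FROM node
  COPY server.js /
  EXPOSE 8080
  # health check probing the node server; exact interval/timeout values may differ
  HEALTHCHECK --interval=5s --timeout=10s --retries=3 CMD curl -sS http://127.0.0.1:8080/ || exit 1
  CMD ["node", "/server.js"]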

And server.js that accompanied it:
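The original server.js did a bit more, since it was built to demonstrate health checks, but a plain HTTP responder on port 8080 is enough for this experiment:

  const http = require('http');

  // minimal HTTP server listening on the port the Dockerfile exposes
  http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Hello from node-server\n');
  }).listen(8080);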

Because our local Docker client is still connected to the master’s Docker Engine, we can simply build the image and expect that it’ll be available on at least one host in the cluster – the master:
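The image is called server:latest, so from the directory with the Dockerfile and server.js the build is simply:

  docker build . -t server:latest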

And now to the main part: creating and scaling the service based on custom image.

Starting the service in Swarm

Creating a service from a custom image on a single node that knows about the image is easy. I did it once, there’s no reason it shouldn’t work again.
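The service name node-server comes up later in the post, so the command was presumably along these lines:

  docker service create --name node-server server:latest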

By the way, service create output did warn us that the image was not taken from the registry.

Checking it in visualization service confirms it did work:

One instance of node server

And now the moment of truth: will it scale?
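Scaling is a one-liner; the replica count here is arbitrary, the point is just to force tasks onto the worker nodes:

  docker service scale node-server=6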

Hm, interesting. It did scale. But how? Let’s check what the service’s tasks have to say:
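That’s what docker service ps is for – it lists the service’s tasks, including the failed ones, and the nodes they were assigned to:

  docker service ps node-server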

Dirty cheater, it put all the replicas onto the same host: master. But it’s really interesting to see a number of failed tasks, which did try to deploy the service on the worker nodes but failed with a quite predictable error: No such image: server:latest. What a surprise. But at least we confirmed that Swarm won’t copy-paste custom images between its hosts, and we need to find another approach.

Let’s remove the service and try something else.
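Cleanup is a single command:

  docker service rm node-server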

Attempt 2. Use a private registry in Swarm

The obvious something else is some sort of in-swarm Docker Hub: a private registry service. If it’s available within the cluster, we could push our custom image into it and refer to it during node-server creation. Creating a private registry in general is fairly easy, and something like docker run -d -p5000:5000 registry:latest would do the trick. However, there’s a catch: an HTTP-based registry is only accessible from localhost, so if we need to make it available to the whole cluster, we need HTTPS and therefore a trusted SSL certificate (I don’t want to use the insecure-registries Docker Engine option or whatever it’s called now). (Update: as smart people mentioned, if the registry is going to be pushed to and pulled from only from within the cluster, we can still go with HTTP – localhost is shared within the cluster.) On the other hand, there are a couple of tricks we can use to get away with a self-signed certificate, so here’s the plan:

  1. Generate a self-signed SSL certificate for, let’s say, myregistry.com, which will point to our in-swarm registry.
  2. Force Docker to trust that certificate.
  3. Add myregistry.com to /etc/hosts of the master and worker-* nodes, so they know where the registry is.
  4. Create the registry service in HTTPS mode.
  5. Push the server:latest image into the registry.
  6. Create a service from the image in the private registry.

That’s a lot of things to do, so let’s get started.

Generate a self-signed certificate

Fortunately, this part is trivial. At least on *nix systems:
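The usual openssl one-liner does the job; registry.crt is the file name used in the next step, while registry.key for the private key (and the exact flags) are just a reasonable guess at what was used:

  openssl req -newkey rsa:4096 -nodes -sha256 -keyout registry.key -x509 -days 365 -out registry.crt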

It asked several questions along the way which I safely ignored, but one of them – Common Name – is important. That should be the name of our registry service: myregistry.com.

Force Docker to trust self-signed certificate

That one would also be simple to do, if it weren’t 1 AM in my time zone. The secret is to place the registry.crt file into the Docker Engine’s certificate store. I did that in three steps (commands sketched right after the list):

  • copy the registry.crt file to the Swarm host,
  • create a folder for the certificate,
  • move registry.crt into that folder.
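With docker-machine, that boils down to something like the following; /etc/docker/certs.d/<registry host>:<port>/ca.crt is the standard location where the Docker Engine looks for registry certificates:

  docker-machine scp registry.crt master:registry.crt
  docker-machine ssh master sudo mkdir -p /etc/docker/certs.d/myregistry.com:5000
  docker-machine ssh master sudo mv registry.crt /etc/docker/certs.d/myregistry.com:5000/ca.crt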

This will make the master node trust the certificate, but we need to repeat these commands for worker-1 and worker-2 as well. I’m almost certain that there’s a much simpler way to do that, but after midnight this is as far as I can go.

Add myregistry.com to /etc/hosts

The master node happened to have the IP address 192.168.99.100, and that’s the node that will host the registry, so myregistry.com should point to that IP:
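On the master it’s one appended line inside an SSH session:

  docker-machine ssh master
  sudo sh -c "echo '192.168.99.100 myregistry.com' >> /etc/hosts"
  exit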

This also needs to be done on the worker nodes.

Create registry in HTTPS mode

The registry service will need access to the certificate files, and I didn’t want to use docker secret to store them (for no particular reason), so first we need to copy them to the master node:
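docker-machine scp again; the certs folder under the docker user’s home directory (/home/docker/certs on these boot2docker VMs) is just an example location, referenced by the service definition below:

  docker-machine ssh master mkdir -p certs
  docker-machine scp registry.crt master:certs/registry.crt
  docker-machine scp registry.key master:certs/registry.key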

And now the time for our own Docker registry has come:
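A sketch of the service definition: the registry image understands the REGISTRY_HTTP_TLS_CERTIFICATE and REGISTRY_HTTP_TLS_KEY settings, the certificates folder gets bind-mounted into the container, and a placement constraint pins the service to master, where the files actually are:

  docker service create --name registry \
    --publish 5000:5000 \
    --constraint node.hostname==master \
    --mount type=bind,source=/home/docker/certs,target=/certs \
    -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt \
    -e REGISTRY_HTTP_TLS_KEY=/certs/registry.key \
    registry:latest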

Behold! One more colored rectangle on the cluster map:

registry service

Push server image into private registry

We could’ve just updated the tag of the existing server image, but rebuilding the image with the right tag takes about the same number of keystrokes, so why bother:
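Same docker build, but with the tag pointing at the registry host and port:

  docker build . -t myregistry.com:5000/server:latest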

Because the image tag starts with the registry URL, the push will find its way to the right place:
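A plain docker push is all it takes:

  docker push myregistry.com:5000/server:latest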

Now our image is supposed to be in the registry hosted in the Swarm. We’ll confirm that in a moment.

Create and scale Swarm service

The moment of truth:
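Create the service from the fully qualified image name and scale it out (the replica count is, again, arbitrary):

  docker service create --name node-server myregistry.com:5000/server:latest
  docker service scale node-server=6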

And visualizer:

replicated node server

It worked! The scriptures didn’t lie, and using a private registry does solve the problem of custom images.

Summary

Using custom Docker images in Swarm is not as trivial as using them on a standalone Docker host. Unless you deploy your image to every host in the cluster, which sounds quite impractical, you’ll have to use a private registry. Fortunately, with a legit SSL certificate that’s fairly easy to do. Even with a self-signed one, if your Swarm hosts are provisioned automatically (as they should be), configuring them to trust the certificate is quite straightforward.

22 thoughts on “Using private registry in Docker Swarm”

  1. Hello, thank you so much for this great post. I have pushed an image to the private registry, but the worker nodes still can’t see that image, and the service is still limited to the server node. Even if I try to pull the image directly from the worker nodes, it always gives me this message: “Error response from daemon: Get https://myregistry.com:5000/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)”.
    Also, on the server node, if I try ‘curl -k https://myregistry.com:5000/v2/_catalog‘, it gives me this error: “curl: (7) Failed to connect to myregistry.com port 5000: Timed out”. Would you please let me know what I am doing wrong?

    1. You are welcome. It might be hard to diagnose this without seeing it all, but can you ping the host itself, e.g. ping myregistry.com? There was the following code block in the post, which was needed exactly for making the host visible:

      docker-machine ssh master
      sudo sh -c "echo ' 192.168.99.100 myregistry.com' >> /etc/hosts"
      exit

      If myregistry.com and its IP address are not in /etc/hosts, you won’t be able to access the host via this name. If it’s there and it still doesn’t work, can you ping the host via its IP address? Maybe your hosts are in separate networks.

      1. Hello Pav, thank you so much for your reply. So yes, I followed your steps and was able to push an image to the private registry. But the problem now is that when I run the service, the worker nodes still can’t pull the image, and it only works on the manager node. I even tried to pull the image from a worker node, but it gave me this error: “Error response from daemon: Get https://myregistry.com:5000/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)”. I can see the image in the private registry from the manager node only.

  2. Hi Pav,
    Great post!
    I am trying to figure out if it is possible to deploy a stack by simply declaring the registry as a service inside docker-compose.yml, without having to create the registry service prior to the docker stack deploy. From the documentation and the examples I’ve come across so far, it seems the registry must be created first and the images pushed into it beforehand. In theory, though, it would be great to deploy a stack in the swarm without this extra step…
    Any thoughts?

    1. Hi George,
      Very theoretically, if the first of your docker-compose services deploys a registry, and the second one downloads the private image .tar from somewhere and pushes it into the registry, then the third service could probably pull the image from it. But I’ve never tried this, and the whole approach depends on the stage at which docker-compose pulls the images for its services. If that happens on demand – right before a service is about to be initialized – then it might work.
      But again, what happens next with the second service? It pulled the .tar, it did its job, so probably it should vanish. But the last time I checked Swarm, there was no way to create one-time jobs or services.
      So basically, the whole idea might work, but it looks like it’s going to be messy 🙂

    2. Hello George,

      have you solved this issue? I’m facing the same issue too, could you please help me?

      In my scenario I am able to pull the image from my private registry, but if I run docker service create with a name and the image, it throws the following error:
      image sha256:d887e0ca5f95160f532a23c907593ca20d5b6653ac0e85f34837ed49175b2f82 could not be accessed on a registry to record its digest.

      1. Bhaskar,
        The real answer to orchestration is Kubernetes.
        If you need to use a local registry, just do it with docker commands: create the service, push the images to it and then use it in your stack. No, you can’t do it in one go with compose.
        Or use a private registry service, like Docker Hub.
        One more point, and that is why I said Kubernetes: Docker is weak on multi-host, and especially on persistent data. Even Swarm.

  3. In case anyone finds it useful, while trying to take advantage of the “localhost” trick, by deploying the registry to the swarm and referencing it using http://localhost:5000/v2/, I seem to have come across an old docker bug related to conflicts between IPv4 and IPv6 and exposed ports.

    If anyone is having problems reaching your own local registry when using localhost, try the loopback 127.0.0.1 instead.
    Meaning, this should be reachable from all nodes in the cluster:
    curl http://127.0.0.1:5000/v2/

  4. Awesome article!

    Can you elaborate a bit more regarding what you mean by:
    > (Update: as smart people mentioned, if registry is going to be pushed to and pulled from only from within the cluster, we still can go with HTTP – localhost is shared within the cluster).

    In what way is localhost shared within the cluster?
    If a worker tries to pull from localhost:5000/someimage, will it succeed if the registry is on another node in the cluster?

    1. Thank you, Amit.
      Yes, you can skip the whole SSL and /etc/hosts setup and just use the 127.0.0.1 address instead (haven’t tested localhost, though). Any node in the swarm will be able to pull from that registry. I don’t know Docker networking internals, but it looks like in swarm mode it treats localhost as the cluster itself. It’s not something fundamentally new – containers within the same pod in Kubernetes share 127.0.0.1 as well.
      This is how the code from the post would change for ‘localhost’ mode:
      docker service create --name registry --publish=5000:5000 -e REGISTRY_HTTP_ADDR=0.0.0.0:5000 registry:latest
      docker build . -t 127.0.0.1:5000/server:latest
      docker push 127.0.0.1:5000/server:latest
      docker service create --name=node-server 127.0.0.1:5000/server

    1. Can’t test it now, but after a host machine restart your VirtualBox VMs should remain stopped. After the VMs restart, I’d be slightly surprised if the swarm and its services, including the registry, stopped working. But that needs confirming.

      1. Any chance you have something like this where Gitlab CI pushes new versions of the image to the registry?

        1. Not the image, but something similar. [Here](https://codeblog.dotsandbrackets.com/continuous-integration-deployment/), closer to the end of the post, gitlab-runner in one container copies build results to another one. If your container is a part of a swarm, instead of tsc and cp for build and upload you could use docker build and docker push. Being inside of a swarm you can push directly to 127.0.0.1:5000 and avoid setting up SSL and self-signed certificates.

  5. In this step:
    Add myregistry.com to /etc/hosts

    Do we need to add the same master node’s IP address pointing to myregistry.com in the worker nodes’ /etc/hosts?

  6. Hello my dear, how are you?

    It’s 2024 and your altruistic attitude of sharing your knowledge helped me resolve this issue of registering docker images in my Homelab Docker Swarm because every time it kept returning an error that the “image” was not located on one of the nodes.

    Some tips for anyone who reads after me:

    – If the blocking message “dial tcp 172.16.0.211:5000: connect: connection refused” comes back, check the logs to see whether the certificate was imported into your myregistry service. They will show a message that the certificate “mydomain.crt” was not found.

    After correcting this step (I was mapping the directory the wrong way) the system worked like magic.

    – To make viewing logs easier, I recommend the portainer.io community version.

    Thank you very much Pav. Success in your personal and professional journey.
