Persistent data in Docker volumes

Docker volumes

Docker containers are supposed to be small, single-process, easily replaceable instances, so it’s not particularly clear how persistent data fits into that picture. Imagine you have a MySQL container that you’ve decided to upgrade. What will you do with its database files? In the container world, “upgrade” means “nuke the old one, start a new one”, and your data will turn into radioactive ashes along with the rest of the container’s file system.

However, along with the problem Docker also provides a solution: Docker volumes.

What are Docker volumes

Generally speaking, a Docker volume is just a host directory mounted into a container’s file system. As it no longer belongs to the container’s FS, it’s not a problem to delete one container, create another one and mount the existing data volume to it. There are several approaches to using Docker volumes, and today we’ll take a look at three of them.

1. Simple directory mounts

The simplest approach is mounting an arbitrary host directory into the container’s FS. Imagine you’re running a mysql container and want to preserve its data files during upgrades, or just perform occasional backups. We can map a host directory to the container’s data directory with the -v argument, so anything mysql writes to e.g. /var/lib/mysql will end up in the relative safety of the host FS.
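A minimal sketch of such a mount follows. The host path /srv/mysql-data and the root password are placeholders chosen for illustration; only the container name mysqlserver and the /var/lib/mysql target come from the text.

```shell
# Run MySQL with a host directory mounted over its data directory.
# /srv/mysql-data and the password value are illustrative assumptions.
docker run -d --name mysqlserver \
  -v /srv/mysql-data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=example \
  mysql
```

The -v argument takes the form host-path:container-path, so everything the server writes under /var/lib/mysql lands in /srv/mysql-data on the host.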

When we destroy mysqlserver, its data will survive.
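To see this in practice, something along these lines should work, assuming the container was started with the hypothetical /srv/mysql-data host directory:

```shell
# Force-remove the container (a running container is stopped first).
docker rm -f mysqlserver

# The database files are still on the host; sudo may be needed
# because the files are owned by the user MySQL ran as.
sudo ls /srv/mysql-data
```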

Now we can start a new mysql container, mount the same data directory to it and continue as if nothing had happened.
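As a sketch, the replacement container is started the same way as the old one, again assuming the placeholder path /srv/mysql-data. For a real upgrade you would pick a newer image tag instead of the bare mysql name.

```shell
# Start a fresh container on top of the existing data directory.
# Path and password are the same illustrative placeholders as before.
docker run -d --name mysqlserver \
  -v /srv/mysql-data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=example \
  mysql
```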

Read-only mounts

If a container isn’t supposed to update the data in a mounted directory, the mount can be made read-only by simply adding the :ro suffix. Obviously, it doesn’t make much sense to do so for a database mount, but for a web server it’s quite logical.
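Here is a minimal sketch of a read-only mount. The nginx image, the host path /srv/www and the published port are assumptions picked for the example, not details from the original post.

```shell
# Serve static files from the host; the container cannot modify them.
# /srv/www, port 8080 and the nginx image are illustrative assumptions.
docker run -d --name webserver \
  -v /srv/www:/usr/share/nginx/html:ro \
  -p 8080:80 \
  nginx
```

Any attempt to write under /usr/share/nginx/html from inside the container should then fail with a read-only file system error.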

Listing existing mounts

Don’t even try to remember what exactly is mounted to which container; our brains don’t work like that. Instead, docker inspect %container% will tell you not only the network, host and container settings, but also which volumes and mounts the container uses at the moment.
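The full inspect output is long, so a Go template can narrow it down to the mounts. A short sketch, assuming a container named mysqlserver exists:

```shell
# Everything Docker knows about the container.
docker inspect mysqlserver

# Only the mounts section, printed as JSON.
docker inspect --format '{{ json .Mounts }}' mysqlserver
```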

2. Docker data volumes

Let’s try another thing: run the first mysql example one more time, but this time skip the -v (volume) argument. Then, if we inspect the container, it’ll be hard not to notice that it still has a volume attached to it!
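A sketch of that experiment, assuming any earlier container named mysqlserver has been removed and using a placeholder password:

```shell
# Same container as before, but without any -v argument.
docker run -d --name mysqlserver \
  -e MYSQL_ROOT_PASSWORD=example \
  mysql

# Show the mounts Docker attached on its own.
docker inspect --format '{{ json .Mounts }}' mysqlserver
```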

Yes, this time it has a weird name, and the mount source path is much longer, but it still points to the /var/lib/mysql data folder. How is that possible?

The answer lies at the bottom of the mysql image’s Dockerfile, in a VOLUME instruction.

VOLUME /var/lib/mysql creates a new volume attached to /var/lib/mysql. It behaves somewhat like a regular directory mount, but it is not quite the same. Whenever Docker sees a volume declaration, it generates a unique 64-character name for it and creates a new mount directory (a volume) in the host FS: /var/lib/docker/volumes/%name%. When the container starts for the first time, unlike with regular host directory mounts, Docker copies whatever the container had in /var/lib/mysql to the volume, and after that uses the content from the volume, not from the container’s FS. That has an important implication: when I create a container from a newer mysql image that also has newer content in /var/lib/mysql, and the volume already exists, that new content will be ignored.
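The declaration can also be checked without opening the Dockerfile, since it is recorded in the image metadata. A small sketch, assuming the mysql image has already been pulled:

```shell
# List the volume declarations baked into the image.
docker image inspect --format '{{ json .Config.Volumes }}' mysql

# See where an anonymous volume of a running container lives on the host.
docker inspect --format '{{ range .Mounts }}{{ .Name }} -> {{ .Source }}{{ "\n" }}{{ end }}' mysqlserver
```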

Creating data volumes from a command line

Volume declarations don’t have to be in a Dockerfile; we can create them from the command line as well.
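A minimal sketch of such a command. The busybox image is an assumption; any small image that ships touch would do.

```shell
# -v with only a container path declares an anonymous volume at /data.
# The busybox image is an illustrative choice.
docker run -v /data busybox touch /data/README.md
```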

Running a container with -v /data creates a new volume mounted at /data; a touch /data/README.md command then creates an empty file in it, and after that the container immediately exits.

Unlike with regular host mounts, Docker keeps track of all the volumes it has created and provides a set of docker volume ... commands for them. One of them is ls, which prints out existing volumes. By using it we can find the volume we just created.
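For example:

```shell
# Print the driver and name of every volume Docker knows about.
docker volume ls

# Print the names only, which is handy for scripting.
docker volume ls -q
```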

Docker volumes don’t belong exclusively to the containers that created them. Using the name, we can mount them to any number of other containers.
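A sketch of mounting an existing volume by name. VOLUME_NAME is a placeholder to be replaced with a name printed by docker volume ls, and busybox is again an assumed image.

```shell
# Placeholder: substitute a real name from `docker volume ls`.
VOLUME_NAME="replace-with-volume-name"

# Mount the volume into a throwaway container and list its contents.
docker run --rm -v "$VOLUME_NAME":/data busybox ls /data
```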

A classic example of using this feature is connecting a volume to a container that is going to back up the data.
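One possible shape of such a backup, shown as a sketch. The volume name is a placeholder, and the busybox image, archive name and use of the current directory as the backup target are all assumptions.

```shell
# Placeholder: substitute a real name from `docker volume ls`.
VOLUME_NAME="replace-with-volume-name"

# Mount the volume read-only, mount the current host directory as /backup,
# and archive the volume contents into it.
docker run --rm \
  -v "$VOLUME_NAME":/data:ro \
  -v "$(pwd)":/backup \
  busybox tar cf /backup/data-backup.tar -C /data .
```

Mounting the source volume with :ro keeps the backup container from accidentally modifying the data it is copying.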

Creating volumes without containers

You don’t even need a container to create a volume. We can use the docker volume create command to create a volume in advance and then mount it to any container that needs one. The beauty of this approach is that we can choose the volume name and once and for all get rid of that 64-character monstrosity.
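A short sketch, where the volume name mysql-data and the password are illustrative choices:

```shell
# Create a named volume up front.
docker volume create mysql-data

# Mount it by name instead of by host path.
docker run -d --name mysqlserver \
  -v mysql-data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=example \
  mysql
```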

Creating shared-storage data volumes

The volume create command also provides something much more powerful than just choosing a name. Until now we’ve been creating volumes that live in the local file system, which is not particularly scalable. To address that, Docker supports volume plugins that store volume data in other locations: Azure, DigitalOcean, and several others.

Installing and configuring such plugins can be tricky, but once that’s done, using a plugin is just a matter of adding one more argument to the volume create command.
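The extra argument is --driver, optionally accompanied by driver-specific --opt settings. In the sketch below the plugin name example/volume-plugin and the size option are purely hypothetical; the real values depend on the plugin you installed.

```shell
# Hypothetical plugin name and option; consult your plugin's documentation.
docker volume create \
  --driver example/volume-plugin \
  --opt size=10G \
  shared-data

# Once created, it is mounted like any other named volume.
docker run --rm -v shared-data:/data busybox ls /data
```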

3. Volume containers

There’s also an old Docker pattern called the data-only container. As the name suggests, it’s a container with one or more attached volumes, whose sole responsibility is to exist and provide those volumes to others. The container doesn’t even have to be running.
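A minimal sketch of such a container. The name datastore, the /data path and the busybox image are assumptions made for the example.

```shell
# docker create sets the container up without ever starting it.
docker create -v /data --name datastore busybox
```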

Once we’ve got such a container, we can attach all of its volumes at once to other containers with the --volumes-from argument.
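For instance, assuming a data-only container with the hypothetical name datastore and a volume at /data:

```shell
# Write a file through one container...
docker run --rm --volumes-from datastore busybox touch /data/hello.txt

# ...and read it back through another.
docker run --rm --volumes-from datastore busybox ls /data
```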

Honestly, I don’t see any benefit in having a dedicated volume container. The same functionality can be achieved with regular data volumes at zero overhead. Maybe this feature made more sense before Docker 1.8 introduced the Volumes API, but now it’s more confusing than useful.

Summary

Today we took a look at several ways to persist data for Docker containers: host directory mounts, data volumes and volume containers. While all three are supported by Docker, only the second one, the data volume, looks like the ‘true’ way to do the job. After all, directory mounts work only on the local host, volume containers don’t add much value compared to named data volumes, and only data volumes work well both on the local host and anywhere in the cloud, assuming you have installed a plugin for that.
