We have a Docker Swarm deployment running on two nodes (node1 and node2) for our business application.
The application needs a volume to store persistent data. Since it is not known in advance on which node (node1 or node2) a container will be scheduled, and since we may want to run a container of our application on each node, we needed a solution that provides a shared volume to all nodes.
To share a volume, we set up an NFS server on a third node (node3) with the following /etc/exports file:
/srv *(rw,sync,anonuid=1000,anongid=1000,all_squash,subtree_check,crossmnt,fsid=root)
(I use anonuid/anongid to explicitly map the files in the export to a known user on the node3 system. all_squash is used to make sure that all accessing users, whatever their own uid/gid, are mapped to this local user.)
In our docker-compose.yml we use the following setup to include the volume:
volumes:
  nfs-data:
    driver: local
    driver_opts:
      type: nfs
      o: nfsvers=4,addr=node3.example.com,rw,nolock,soft
We then ran into a problem where the container would not start up. The error message was:
failed to copy file info for /var/lib/docker/volumes/MY_CONTAINER_nfs-data/_data: failed to chown /var/lib/docker/volumes/MY_CONTAINER_nfs-data/_data: lchown /var/lib/docker/volumes/MY_CONTAINER_nfs-data/_data: operation not permitted
After some digging around, I found out that the problem lies in the exported NFS directory on the node3 server being initially empty. As soon as I put an empty file into it, the container starts up fine on both node1 and node2.
Does anyone have an explanation for that?
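The workaround described above can be sketched as a small shell helper to run on node3. This is only an illustration: the function name seed_export and the placeholder file name .keep are made up for this example, and it assumes the caller has write access to the exported directory.

```shell
# Hypothetical helper: make sure an NFS export directory is not empty,
# so the volume is not in the empty state described above.
seed_export() {
  dir="$1"
  # Only add the placeholder if the directory has no entries at all.
  if [ -z "$(ls -A "$dir")" ]; then
    # .keep is an arbitrary name chosen for this sketch; any file should do.
    touch "$dir/.keep"
  fi
}
```

With the /etc/exports file shown above, this would be called on node3 as seed_export /srv.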
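When a named volume is initialized from an empty/new state, Docker copies the contents of the image directory into the named volume. That copy includes file ownership and permissions, which is where the chown in the error message comes from. There are several options to deal with this: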
RUN chown -R 1000:1000 /path
This should prevent issues, but you'll want to test to be sure there isn't still a chown trying to run from Docker, depending on how it initializes these files.
The example from Docker's documentation on the "nocopy" option looks like: