I'm running a small set of Docker Swarm nodes on Raspberry Pis, using GlusterFS as shared storage for the Docker volumes. I originally set this up on Ubuntu Server 21.04 (hirsute), which has gluster 9.0 in its default packages. This was working great, with only occasional blips in the mounted volumes, seemingly when updates were applied silently in the background.
However, since upgrading all 3 nodes to 21.10 (impish), and therefore to gluster 9.2, I've been having no end of issues. When some containers start up and interact with their files (I'm foggy on the specifics), mount.glusterfs on the node running the container seemingly crashes, and this happens on any node. The log reports signal 11 (a segmentation fault). Below is the journalctl output for the related mount unit:
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: pending frames:
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: frame : type(0) op(0)
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: frame : type(0) op(0)
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: frame : type(1) op(LK)
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: frame : type(0) op(0)
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: frame : type(0) op(0)
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: frame : type(1) op(OPEN)
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: frame : type(1) op(OPEN)
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: patchset: git://git.gluster.org/glusterfs.git
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: signal received: 11
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: time of crash:
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: 2021-12-12 05:18:42 +0000
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: configuration details:
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: argp 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: backtrace 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: dlfcn 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: libpthread 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: llistxattr 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: setfsid 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: spinlock 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: epoll.h 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: xattr.h 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: st_atim.tv_nsec 1
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: package-string: glusterfs 9.2
Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: ---------
Dec 12 05:20:39 node1 systemd[1]: Unmounting /mnt/gfs/docker1...
Dec 12 05:20:39 node1 systemd[1]: mnt-gfs-docker1.mount: Deactivated successfully.
Dec 12 05:20:39 node1 systemd[1]: Unmounted /mnt/gfs/docker1.
Dec 12 05:20:39 node1 systemd[1]: mnt-gfs-docker1.mount: Consumed 6h 49min 33.197s CPU time.
As I'm new to gluster, nothing in this log tells me why the crash is happening, and I can't find any specifics about what is going on at the time that might trigger it. I've checked the heal status of the volume and there are no pending files. After a full heal, nothing needed to be synced (even with a few containers still running).
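In case it helps with comparing dumps across nodes, here is a minimal sketch of how the crash block above could be summarised. It assumes the journalctl line format shown in the log; the function name summarise_crash is hypothetical and not part of gluster.

```python
import re
import signal

# Matches the journal prefix seen in the log above, e.g.
# "Dec 12 05:18:42 node1 mnt-gfs-docker1[12330]: "
PREFIX = re.compile(r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+\S+\[\d+\]:\s?")


def summarise_crash(lines):
    """Collect the signal, named pending-frame ops and package string."""
    summary = {"signal": None, "ops": [], "package": None}
    for raw in lines:
        msg = PREFIX.sub("", raw.strip())
        if m := re.match(r"signal received: (\d+)", msg):
            number = int(m.group(1))
            try:
                # Translate the number to a name, e.g. 11 to SIGSEGV.
                summary["signal"] = signal.Signals(number).name
            except ValueError:
                summary["signal"] = str(number)
        elif m := re.match(r"frame : type\((\d+)\) op\((\w+)\)", msg):
            # Keep only frames with a named operation; skip op(0).
            if m.group(2) != "0":
                summary["ops"].append(m.group(2))
        elif m := re.match(r"package-string: (.+)", msg):
            summary["package"] = m.group(1)
    return summary
```

Feeding it the lines from journalctl for the mount unit on each node would show whether every crash has the same signal and the same pending operations (LK and OPEN in the dump above).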
TL;DR: the gluster volume keeps crashing and unmounting on client nodes, despite seemingly no issues with the underlying gluster bricks. The only log entries are the ones above, and I can't see a specific issue identified in them.
What is causing this and how can I prevent it?