I am running Ubuntu 24.04 using ZFS for my filesystems. This is on a laptop whose only storage device is a WD Black SN850X NVMe card. The default Ubuntu installation process configured two ZFS pools:
```
                                          capacity     operations     bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
bpool                                    187M  1.69G      0      0    381    204
  86349523-abd9-7a45-ab84-60d7622c240f   187M  1.69G      0      0    381    204
--------------------------------------  -----  -----  -----  -----  -----  -----
rpool                                    286G   634G     13     31  1.11M   796K
  cc31ec4d-1dd2-ed4f-9f90-fa99ec5aa3a2   286G   634G     13     31  1.11M   796K
--------------------------------------  -----  -----  -----  -----  -----  -----
```
`/tmp` is part of the root mount, which is in rpool. My `/tmp` folder briefly had over 2 million files in it due to a bug in some code. When there were that many files in it, performance took a nosedive -- even just listing files (without sorting) would pause for upwards of a second. I removed most of the files, and things are back down to a manageable level now. But operations on the list of files in `/tmp` are still slow.
When I time `ls --sort=none` on e.g. `/bin`, which has 2,842 entries in it, I get something like:
```
real    0m0.088s
user    0m0.001s
sys     0m0.075s
```
But the same command on `/tmp`, which currently has 4,444 entries:
```
real    0m0.472s
user    0m0.007s
sys     0m0.446s
```
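(For reproducibility, a measurement like the above can be taken as follows; redirecting to `/dev/null` is my addition, to keep terminal rendering out of the timing:)

```
# Time an unsorted listing without paying for terminal output.
time ls --sort=none /tmp > /dev/null
```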
It seems that briefly housing 2 million files has left a permanent impact on the structure of `/tmp`. Is there a way to fix this? Or do I just need to make a new `/tmp` and cut over to it?
Somewhere above millions of files in a directory, performance will be much worse. It does not really matter which file system you use or how many IOPS the block device can deliver: POSIX semantics impose significant overhead to maintain the files-in-a-directory concept. Diagnosing it then becomes an exercise in understanding file system internals.
From your flame graph, I am not surprised that most of the stacks originate in readdir calls. I am surprised that the top of the stack, where the time is actually spent, is mostly LZ4 decompression, which is a fast algorithm. Hundreds of milliseconds of CPU time doing that implies lots of metadata, lots of calls to getdents64, or something else being slow.
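One quick way to test the getdents64 theory (my suggestion, not something from the thread) is to count the syscalls directly:

```
# Count getdents64 calls made during an unsorted listing of /tmp.
# Far more calls than you'd expect for ~4,400 entries points at a
# bloated on-disk directory structure rather than at ls itself.
strace -c -e trace=getdents64 ls --sort=none /tmp > /dev/null
```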
From what little I understand about the ZFS on-disk format, datasets have their own sets of objects. So yes, you could make a new tmp dataset in the root pool and mount it over the existing /tmp. Copying data is not required, as these are temporary files.
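A minimal sketch of that approach, assuming the rpool from the question and a hypothetical dataset name `rpool/tmp` (best done when nothing is holding files in /tmp open):

```
# Create a fresh dataset and mount it over the existing /tmp.
# "rpool/tmp" is a placeholder; adjust it to your pool layout.
sudo zfs create -o mountpoint=/tmp -o setuid=off -o devices=off rpool/tmp
sudo chmod 1777 /tmp   # restore the sticky, world-writable /tmp permissions
```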
Or put a tmpfs on /tmp, and simplify things by taking both ZFS and the block device out of the picture.
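For example, a typical `/etc/fstab` line for that (the `size=2G` cap is an arbitrary illustration, not a value from this thread):

```
# RAM-backed /tmp; contents are discarded on every reboot.
tmpfs  /tmp  tmpfs  mode=1777,nosuid,nodev,size=2G  0  0
```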
Way too late to prevent this too-many-files problem, but OpenZFS does have object quotas: `groupobjquota@group` to set, and `zfs groupspace` to list. The same exists per user (`userobjquota@user`, `zfs userspace`) and per project. A sketch is below.
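A minimal sketch, using a hypothetical group name `devs`, the placeholder dataset `rpool/tmp` from above, and an arbitrary one-million-object cap:

```
# Limit how many objects (files, directories) the group may create
# on this dataset; the names and the limit are all placeholders.
sudo zfs set groupobjquota@devs=1000000 rpool/tmp

# Show per-group object usage and quotas for the dataset.
zfs groupspace -o name,objused,objquota rpool/tmp
```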
I now have the answer to this. So yes, it is a known issue. In the internal terminology of ZFS, "if ZAP records are deleted such that an entire leaf block of the ZAP object is emptied, the block is not reclaimed." But not only is it a known issue, it is a fixed issue. :-) The fix isn't yet in any shipping version, but it is expected to be soon.
This is the fix:
https://github.com/openzfs/zfs/pull/15888
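Until a release ships with that fix, you can inspect the bloated directory ZAP yourself with zdb (a sketch: the dataset name is a placeholder, and it relies on ZFS inode numbers equaling object numbers):

```
# Dump the ZAP object backing /tmp to inspect its leaf blocks.
# "rpool/ROOT/ubuntu" is a placeholder; use your real root dataset.
OBJ=$(stat -c %i /tmp)
sudo zdb -dddd rpool/ROOT/ubuntu "$OBJ"
```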