I recently installed Munin on a development web server to keep track of system usage. I've noticed that the system's inode usage is climbing by about 7-8% per day even though the disk usage has barely increased at all. I'm guessing something is writing a ton of tiny files but I can't find what / where.
I know how to find disk space usage but I can't seem to find a way to summarize inode usage.
Is there a good way to determine inode usage by directory so I can locate the source of the usage?
Don't expect this to run quickly...
cd to a directory where you suspect there might be a subdirectory with lots of inodes. If this script takes a huge amount of time, you've likely found where in the filesystem to look. /var is a good start...
Otherwise, if you change to the top directory in that filesystem and run this and wait for it to finish, you'll find the directory with all the inodes.
I'm not worried about the cost of sorting. I ran a test: sorting the unsorted output of that run, against 350,000 directories, took 8 seconds. The initial find took . The real cost is opening all those directories in the while loop (the loop itself takes 22 seconds). (The test data was a subdirectory containing 350,000 directories, one of which had a million files; the rest had between 1 and 15 directories.)
Various people have pointed out that ls isn't great for this because it sorts its output. I'd tried echo, but that's not great either. Someone else pointed out that stat gives this info (the number of directory entries) but that it isn't portable. It turns out that find -maxdepth is really fast at opening directories and counts dot-files too, so here it is; points for everyone!
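Roughly, a sketch of that approach (not necessarily the exact command, but the same idea):

    find . -type d | while read -r dir; do
        # count the entries directly inside each directory; find doesn't sort,
        # unlike ls, so huge directories don't pay a sorting penalty
        echo "$(find "$dir" -maxdepth 1 | wc -l) $dir"
    done | sort -rn | head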
If the issue is one directory with too many files, here is a simple solution:
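Something along these lines, using a directory's own size as a proxy for how many entries it holds (the 100 KB threshold is just a guess; tune it for your filesystem):

    # directories whose own size exceeds ~100 KB usually hold a huge number of entries
    find / -xdev -type d -size +100k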
The idea behind the find line is that the size of a directory is proportional to the number of files directly inside it, so here we look for directories with tons of files inside them. If you don't want to guess a number, and prefer to list all suspect directories ordered by "size", that's easy too:
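For instance (again just a sketch; drop the sort if you want incremental output):

    # print each directory's own size in bytes and its path, biggest last
    find / -xdev -type d -printf '%s %p\n' | sort -n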
Grrr, commenting requires 50 rep. So this answer is actually a comment on chris's answer.
Since the questioner probably doesn't care about all the directories, only the worst ones, using sort is likely expensive overkill.
This isn't as complete as your version, but what it does is print a line only when its count is larger than the previous maximum, greatly reducing the noise printed out and saving the expense of the sort.
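A rough sketch of that idea (the counting loop is an assumption; the perl filter is the interesting part):

    find . -type d | while read -r dir; do
        echo "$(ls "$dir" | wc -l) $dir"
    done |
    # only print a line when its count beats the largest count seen so far
    perl -ane 'if ($F[0] > $max) { print; $max = $F[0] }'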
The downside is that if you have two very large directories, and the first happens to have one more inode than the second, you'll never see the second.
A more complete solution would be to write a smarter perl script that keeps track of the top 10 values seen, and prints those out at the end. But that's too long for a quick serverfault answer.
Also, some mildly smarter perl scripting would let you skip the while loop entirely. On most platforms ls sorts its results, and that can also be very expensive for large directories; the ls sort is not necessary here, since all we care about is the count.
You can use this little snippet:
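Something along these lines (counting everything under each immediate subdirectory of the current folder):

    find . -mindepth 1 -maxdepth 1 -type d | while read -r dir; do
        # total number of entries (files + directories) under each subdirectory
        printf '%d\t%s\n' "$(find "$dir" | wc -l)" "$dir"
    done | sort -n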
It will print out how many files and directories are in each of the directories in the current folder, with the largest offenders at the bottom. It will help you find directories that have lots of files.
This isn't a direct answer to your question, but searching for recently modified files with a small size using find might narrow down your search:
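For example (both thresholds are guesses; adjust to taste):

    # small files (under ~10 KB) modified within the last day
    find / -xdev -type f -size -10k -mtime -1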
ls won't find files whose names start with a period. Using find avoids this. This finds every file in the directory tree, strips off the basename from the end of each path, and counts the number of times each directory path appears in the resulting output. You may have to put the "!" in quotes if your shell complains about it.
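A sketch of that pipeline:

    # every non-directory entry, basename stripped off, then a count per directory,
    # worst offenders first
    find . ! -type d | sed 's,/[^/]*$,,' | sort | uniq -c | sort -rn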
Inodes can also be used up by files that have been deleted but are being held open by a running process. If this Munin package includes any constantly-running programs, another thing to check is whether it's holding open an unusual number of files.
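One quick way to check (assuming lsof is available):

    # open files whose link count is 0, i.e. deleted but still held open
    lsof -nP +L1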
I'd brute force this one: run tripwire on the entire device for a baseline, then run a check some time later and the offending directory will stick out like a sore thumb.
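With Open Source Tripwire that's roughly (assuming the policy and database are already set up):

    # build the baseline database
    tripwire --init
    # ...some time later, report what has changed against the baseline
    tripwire --check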
(not being able to comment is really getting old - this is for egorgry)
egorgry - ls -i prints the inode NUMBER for an entry, not the inode COUNT.
Try it with a file in your directory - you'll (probably) see an equally high number, but it's not the count of inodes, it's just the inode # your directory entry points to.
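For example (the number shown is purely illustrative):

    # prints the inode NUMBER of the entry, not a count of inodes used
    ls -id /var/tmp
    # sample output: 1048578 /var/tmp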
Update
A one-liner that returns the inode count for each child of a given directory, with the biggest entries at the bottom:
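One way to get that (assumes a GNU du recent enough to support --inodes; replace /path/to/dir with the directory you're investigating):

    du --inodes --max-depth=1 /path/to/dir 2>/dev/null | sort -n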
Original Answer
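The original answer was a small script; a rough sketch of the idea (the file name count_inodes.sh is hypothetical, and this version counts entries per subdirectory rather than relying on ls -i, per the comment above):

    #!/bin/bash
    # count_inodes.sh (hypothetical name): report how many entries (roughly,
    # inodes) live under each subdirectory of the directory given as $1
    for dir in "${1:-.}"/*/; do
        echo "$(find "$dir" | wc -l) $dir"
    done | sort -n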
Run it like this (assuming the above script resides in an executable file in your working directory):
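For example, with the hypothetical name above:

    ./count_inodes.sh /var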
inode usage is approximately one per file or directory, right? So do
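perhaps simply:

    find [path] | wc -l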
to count approximately how many inodes are used under [path].