Background: physical server, about two years old, 7200-RPM SATA drives connected to a 3Ware RAID card, ext3 FS mounted noatime and data=ordered, not under crazy load, kernel 2.6.18-92.1.22.el5, uptime 545 days. Directory doesn't contain any subdirectories, just millions of small (~100 byte) files, with some larger (a few KB) ones.
We have a server that has gone a bit cuckoo over the course of the last few months, but we only noticed it the other day when it started being unable to write to a directory due to it containing too many files. Specifically, it started throwing this error in /var/log/messages:
ext3_dx_add_entry: Directory index full!
The disk in question has plenty of inodes remaining:
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda3 60719104 3465660 57253444 6% /
So I'm guessing that means we hit the limit of how many entries can be in the directory file itself. No idea how many files that would be, but it can't be more, as you can see, than three million or so. Not that that's good, mind you! But that's part one of my question: exactly what is that upper limit? Is it tunable? Before I get yelled at—I want to tune it down; this enormous directory caused all sorts of issues.
Anyway, we tracked down the issue in the code that was generating all of those files, and we've corrected it. Now I'm stuck with deleting the directory.
A few options here:
1. rm -rf (dir)
I tried this first. I gave up and killed it after it had run for a day and a half without any discernible impact.
2. unlink(2) on the directory: Definitely worth consideration, but the question is whether it'd be faster to delete the files inside the directory via fsck than to delete them via unlink(2). That is, one way or another, I've got to mark those inodes as unused. This assumes, of course, that I can tell fsck not to drop the files' entries into /lost+found; otherwise, I've just moved my problem. In addition to all the other concerns, after reading about this a bit more, it turns out I'd probably have to call some internal FS functions, as none of the unlink(2) variants I can find would allow me to just blithely delete a directory with entries in it. Pooh.
3. while [ true ]; do ls -Uf | head -n 10000 | xargs rm -f 2>/dev/null; done
This is actually the shortened version; the real one I'm running, which just adds some progress-reporting and a clean stop when we run out of files to delete, is:
export i=0; time ( while [ true ]; do ls -Uf | head -n 3 | grep -qF '.png' || break; ls -Uf | head -n 10000 | xargs rm -f 2>/dev/null; export i=$(($i+10000)); echo "$i..."; done )
This seems to be working rather well. As I write this, it has deleted 260,000 files in the past thirty minutes or so.
- As mentioned above, is the per-directory entry limit tunable?
- Why did it take "real 7m9.561s / user 0m0.001s / sys 0m0.001s" to delete a single file, which was the first one in the list returned by ls -U, and it took perhaps ten minutes to delete the first 10,000 entries with the command in #3, but now it's hauling along quite happily? For that matter, it deleted 260,000 in about thirty minutes, but it's now taken another fifteen minutes to delete 60,000 more. Why the huge swings in speed?
- Is there a better way to do this sort of thing? Not store millions of files in a directory; I know that's silly, and it wouldn't have happened on my watch. Googling the problem and looking through SF and SO offers a lot of variations on find that are not going to be significantly faster than my approach for several self-evident reasons. But does the delete-via-fsck idea have any legs? Or something else entirely? I'm eager to hear out-of-the-box (or inside-the-not-well-known-box) thinking.
Final script output!:
2970000...
2980000...
2990000...
3000000...
3010000...
real 253m59.331s
user 0m6.061s
sys 5m4.019s
So, three million files deleted in a bit over four hours.
Update August 2021
This answer continues to attract a lot of attention, and I feel it's so woefully out of date that it's largely redundant now.
Doing a find ... -delete is most likely going to produce acceptable results in terms of performance. The one area I felt might yield higher performance is tackling the 'removing' part of the problem instead of the 'listing' part.
I tried it and it didn't work. But I felt it was useful to explain what I did and why.
In today's newer kernels, through the use of the io_uring subsystem in the kernel (see man 2 io_uring_setup), it is actually possible to attempt to perform unlinks asynchronously -- meaning we can submit unlink requests without waiting or blocking to see the result. This program basically reads a directory, submits hundreds of unlinks without waiting for the result, then reaps the results later once the system is done handling the requests. It tries to do what dentls did but uses io_uring. It can be compiled with gcc -o dentls2 dentls2.c -luring.
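The dentls2.c source isn't reproduced here, but a minimal sketch of the same io_uring idea looks roughly like the following. Treat it as an illustration, not the original program: the file name uring_unlink.c, the batch size, and the lack of per-completion error handling are my assumptions, and it needs a liburing with io_uring_prep_unlinkat support (kernel 5.11 or newer).

/* uring_unlink.c -- sketch of batched asynchronous unlinks via io_uring.
 * NOT the original dentls2.c; error handling is deliberately minimal.
 * Build: gcc -o uring_unlink uring_unlink.c -luring
 * Usage: ./uring_unlink /path/to/huge/dir
 */
#include <dirent.h>
#include <liburing.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BATCH 256                          /* unlink requests queued before reaping */

static void reap(struct io_uring *ring, unsigned *inflight)
{
    while (*inflight) {
        struct io_uring_cqe *cqe;
        if (io_uring_wait_cqe(ring, &cqe) < 0)
            break;
        free(io_uring_cqe_get_data(cqe));  /* the strdup'd file name */
        io_uring_cqe_seen(ring, cqe);      /* cqe->res errors are ignored here */
        (*inflight)--;
    }
}

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <directory>\n", argv[0]);
        return 1;
    }
    DIR *dir = opendir(argv[1]);
    if (!dir) { perror("opendir"); return 1; }
    int dfd = dirfd(dir);                  /* unlink names relative to this fd */

    struct io_uring ring;
    if (io_uring_queue_init(BATCH, &ring, 0) < 0) {
        fprintf(stderr, "io_uring_queue_init failed\n");
        return 1;
    }

    unsigned inflight = 0;
    struct dirent *de;
    while ((de = readdir(dir)) != NULL) {
        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;

        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        if (!sqe) {                        /* submission queue full: flush it */
            io_uring_submit(&ring);
            reap(&ring, &inflight);
            sqe = io_uring_get_sqe(&ring);
            if (!sqe)
                break;                     /* shouldn't happen at this batch size */
        }
        char *name = strdup(de->d_name);   /* must stay valid until completion */
        io_uring_prep_unlinkat(sqe, dfd, name, 0);
        io_uring_sqe_set_data(sqe, name);
        inflight++;

        if (inflight >= BATCH) {           /* submit a batch, then reap results */
            io_uring_submit(&ring);
            reap(&ring, &inflight);
        }
    }

    io_uring_submit(&ring);                /* flush the final partial batch */
    reap(&ring, &inflight);

    io_uring_queue_exit(&ring);
    closedir(dir);
    return 0;
}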
The results were, ironically, the opposite of what I suspected. But why?
TMPFS with 4 million files
Using find:
BTRFS with 10 million files
Using find:
So it looks as if batched syscalls don't make an improvement in real time. The new dentls2 spends much more time working (four times as much) only to result in worse performance. So it's a net loss in overall efficiency and worse latency: dentls2 is worse.
The cause is that io_uring produces kernel dispatcher threads to do the unlink work internally, but the directory inode being worked on can only be modified by a single writer at a time.
Basically, using the uring we're creating lots of little threads, but only one thread is allowed to delete from the directory. We've just created a bunch of contention and eliminated the advantage of doing batched IO.
Using eBPF you can measure the unlink frequencies and watch what causes the delays.
In the case of BTRFS, it's the kernel function btrfs_commit_inode_delayed_inode that acquires the lock when unlink is called.
With dentls2:
Using find ... -delete:
You can see from the histogram that find spends 3258 nanoseconds on average in btrfs_commit_inode_delayed_inode, but dentls2 spends 8533 nanoseconds in the function. The histogram also shows that, overall, the io_uring threads spend at least twice as long waiting on the lock, with the majority of calls taking 4096-8091 nanoseconds versus the majority in find taking 2048-4095 nanoseconds.
find is single-threaded and isn't contending for the lock, whereas dentls2 is multi-threaded (due to the uring), which produces lock contention, and the delays that result are reflected in the analysis.
Conclusion
All in all, on modern systems (as of this writing) there is less and less you can do in software to make this go faster than it already goes.
It used to be that by reading a large buffer from the disk you could compound many expensive IO calls into one large sequential read, instead of the seeky IO that small getdents() buffers typically ended up producing.
Also, due to other improvements, there is less overhead to simply invoking system calls, and major improvements in sequential/random IO access times have eliminated the big IO bottlenecks we used to experience.
On my systems, this problem has become memory/CPU bound. There's a single-accessor problem on (at least) BTRFS which limits you to a single CPU's/program's worth of unlinks per directory at a time. Trying to batch the IOs yields at best minor improvements, even in the ideal circumstance of using tmpfs, and is typically worse on a real-world filesystem.
To top it off, we really don't have this problem anymore -- gone are the days of 10 million files taking 4 hours to remove.
Just do something simple like find ... -delete. No amount of optimization I tried seemed to yield major performance improvements worth the coding (or analysis) over a simple default setup.
Original Answer
Whilst a major cause of this problem is ext3 performance with millions of files, the actual root cause is different.
When a directory needs to be listed, readdir() is called on the directory, which yields a list of files. readdir is a POSIX call, but the real Linux system call being used here is getdents. getdents lists directory entries by filling a buffer with entries.
The problem is mainly down to the fact that readdir() uses a fixed buffer size of 32Kb to fetch files. As a directory gets larger and larger (the size increases as files are added), ext3 gets slower and slower to fetch entries, and readdir's 32Kb buffer size is only sufficient to include a fraction of the entries in the directory. This causes readdir to loop over and over and invoke the expensive system call again and again.
For example, on a test directory I created with over 2.6 million files inside, running "ls -1 | wc -l" shows a large strace output of many getdents system calls.
Additionally the time spent in this directory was significant.
The method to make this a more efficient process is to call getdents manually with a much larger buffer. This improves performance significantly.
Now, you're not supposed to call getdents yourself, so no interface exists to use it normally (check the man page for getdents to see!); however, you can call it manually and make your system call invocation way more efficient.
This drastically reduces the time it takes to fetch these files. I wrote a program that does this.
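The original dentls source isn't reproduced here; below is a stripped-down sketch of the big-buffer getdents64 technique. The file name, the 5 MB buffer size, the --delete flag, and the absence of per-file error handling are illustrative choices of mine, not the original program's interface.

/* bigls.c -- sketch of listing/deleting a huge directory with large getdents64 buffers.
 * NOT the original dentls; adapt before relying on it.
 * Build: gcc -o bigls bigls.c
 * Usage: ./bigls /path/to/dir [--delete]
 */
#define _GNU_SOURCE
#include <dirent.h>          /* struct dirent64 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define BUF_SIZE (5 * 1024 * 1024)   /* 5 MB: thousands of entries per syscall */

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <directory> [--delete]\n", argv[0]);
        return 1;
    }
    int do_delete = (argc > 2 && strcmp(argv[2], "--delete") == 0);

    int fd = open(argv[1], O_RDONLY | O_DIRECTORY);
    if (fd < 0) { perror("open"); return 1; }

    char *buf = malloc(BUF_SIZE);
    if (!buf) { perror("malloc"); return 1; }

    long deleted;
    do {
        deleted = 0;
        long nread;
        /* One getdents64 call pulls in as many entries as fit in the buffer. */
        while ((nread = syscall(SYS_getdents64, fd, buf, BUF_SIZE)) > 0) {
            for (long pos = 0; pos < nread; ) {
                struct dirent64 *d = (struct dirent64 *)(buf + pos);
                pos += d->d_reclen;
                if (!strcmp(d->d_name, ".") || !strcmp(d->d_name, ".."))
                    continue;
                if (do_delete) {
                    if (unlinkat(fd, d->d_name, 0) == 0)   /* errors ignored here */
                        deleted++;
                } else {
                    puts(d->d_name);
                }
            }
        }
        if (nread < 0) { perror("getdents64"); return 1; }
        /* Deleting while iterating may cause some filesystems to skip entries,
           so rewind and repeat until a full pass removes nothing. */
        lseek(fd, 0, SEEK_SET);
    } while (do_delete && deleted > 0);

    free(buf);
    close(fd);
    return 0;
}

The key design point is simply that each getdents64 call amortizes one syscall over thousands of entries instead of the few hundred a 32Kb buffer holds.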
Whilst this does not combat the underlying fundamental problem (lots of files in a filesystem that performs poorly at it), it's likely to be much, much faster than many of the alternatives being posted.
As a final thought, one should remove the affected directory and remake it afterwards. Directories only ever increase in size and can remain poorly performing even with a few files inside, due to the size of the directory.
Edit: I've cleaned this up quite a bit. I added an option to let you delete on the command line at runtime, and removed a bunch of the treewalk stuff which, honestly, looking back was questionable at best and was also shown to produce memory corruption.
You can now do
dentls --delete /my/path
New results, based on a directory with 1.82 million files:
Was kind of surprised this still works so well!
The data=writeback mount option deserves to be tried, in order to prevent journaling of the file system. This should be done only during the deletion; there is a risk, however, if the server is shut down or rebooted during the delete operation.
According to this page:
The option is set either in fstab or during the mount operation, replacing data=ordered with data=writeback. The file system containing the files to be deleted has to be remounted.
Would it be possible to back up all of the other files from this file system to a temporary storage location, reformat the partition, and then restore the files?
There is no per-directory file limit in ext3, just the filesystem inode limit (I think there is a limit on the number of subdirectories, though).
You may still have problems after removing the files.
When a directory has millions of files, the directory entry itself becomes very large. The directory entry has to be scanned for every remove operation, and that takes various amounts of time for each file, depending on where its entry is located. Unfortunately even after all the files have been removed the directory entry retains its size. So further operations that require scanning the directory entry will still take a long time even if the directory is now empty. The only way to solve that problem is to rename the directory, create a new one with the old name, and transfer any remaining files to the new one. Then delete the renamed one.
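A minimal sketch of that rename-and-recreate swap, with hypothetical paths (the shell equivalent is just a rename plus a directory creation):

/* swapdir.c -- sketch of swapping a bloated directory for a fresh one.
 * The /srv/spool paths are placeholders; adapt and harden as needed. */
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    /* Move the huge (now mostly empty) directory aside... */
    if (rename("/srv/spool", "/srv/spool.old") != 0) {
        perror("rename");
        return 1;
    }
    /* ...and recreate a fresh, small directory under the old name. */
    if (mkdir("/srv/spool", 0755) != 0) {
        perror("mkdir");
        return 1;
    }
    /* Any remaining files can then be moved from /srv/spool.old into
       /srv/spool, and /srv/spool.old deleted at leisure. */
    return 0;
}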
I haven't benchmarked it, but this guy did:
TLDR: use rsync -a --delete emptyfolder/ x.
This question has 50k views, and quite a few answers, but nobody seems to have benchmarked all the different replies. There's one link to an external benchmark, but that one's over 7 years old and didn't look at the program provided in this answer: https://serverfault.com/a/328305/565293
Part of the difficulty here is that the time it takes to remove a file depends heavily on the disks in use and the file system. In my case, I tested with a consumer SSD running BTRFS on Arch Linux (updated as of 2020-03), but I got the same ordering of the results on a different distribution (Ubuntu 18.04), filesystem (ZFS), and drive type (HDD in a RAID10 configuration).
Test setup was identical for each run:
Test results:
- rm -rf x: 30.43s
- find x/ -type f -delete: 29.79s
- perl -e 'for(<*>){((stat)[9]<(unlink))}': 37.97s
- rsync -a --delete empty/ x: 25.11s
- ./dentls --delete x: 29.74s (the program from this answer, modified to not print anything or wait before it deletes files)
The rsync version proved to be the winner every time I repeated the test, although by a pretty low margin. The perl command was slower than any other option on my systems.
Somewhat shockingly, the program from the top answer to this question proved to be no faster on my systems than a simple rm -rf. Let's dig into why that is.
First of all, the answer claims that the problem is that rm is using readdir with a fixed buffer size of 32Kb with getdents. This proved not to be the case on my Ubuntu 18.04 system, which used a buffer four times larger. On the Arch Linux system, it was using getdents64.
In addition, the answer misleadingly provides statistics giving its speed at listing the files in a large directory, but not removing them (which is what the question was about). It compares dentls to ls -u1, but a simple strace reveals that getdents is not the reason why ls -u1 is slow, at least not on my system (Ubuntu 18.04 with 1000000 files in a directory):
This ls command makes a million calls to lstat, which slows the program way down. The getdents calls only add up to 0.455 seconds. How long do the getdents calls take in dentls on the same folder?
That's right! Even though dentls only makes 12 calls instead of 245, it actually takes the system longer to run these calls. So the explanation given in that answer is actually incorrect, at least for the two systems I've been able to test this on.
The same applies to rm and dentls --delete. Whereas rm takes 0.42s calling getdents, dentls takes 0.53s. In either case, the vast majority of the time is spent calling unlink!
So in short, don't expect to see massive speedups running dentls, unless your system is like the author's and has a lot of overhead on individual getdents calls. Maybe the glibc folks have considerably sped it up in the years since the answer was written, and it now takes a linear amount of time to respond for different buffer sizes. Or maybe the response time of getdents depends on the system architecture in some way that isn't obvious.
find simply did not work for me, even after changing the ext3 fs's parameters as suggested by the users above. It consumed way too much memory. This PHP script did the trick: fast, insignificant CPU usage, insignificant memory usage:
I posted a bug report regarding this trouble with find: http://savannah.gnu.org/bugs/?31961
I recently faced a similar issue and was unable to get ring0's data=writeback suggestion to work (possibly due to the fact that the files are on my main partition). While researching workarounds I stumbled upon this:
This will turn off journaling completely, regardless of the data option given to mount. I combined this with noatime, and the volume had dir_index set, and it seemed to work pretty well. The delete actually finished without me needing to kill it, my system remained responsive, and it's now back up and running (with journaling back on) with no issues.
Make sure you do:
which should speed things up a bit as well.
ls is a very slow command. Try: