
Question

Codejoy

Asked: 2024-11-21 14:10:41 +0800 CST

Strange error using du -h -d1


I have an aging mail server that simply runs Courier, Postfix and SMTP. It is a KVM VM with two drives, one for the OS and one for data. The data drive is very full (it reports 100% full even though used and available show a spread of about 24GB). I am not sure what is eating up space and then releasing it. top shows mostly just postfix's imapd doing stuff. I cannot get iotop on this machine. To start freeing up space in users' mailboxes, I figured I would run du -h -d1 to find the biggest offenders. Well, this command runs SLOW, slower than it ever has. Since it ran so slowly, I figured I would run it in a screen session:

du -h -d1 > mailboxsizes.txt

That way I could come back to it in the morning and see the usage. It wrote out about 6 mailboxes, the largest being 2.2GB, and then nothing. So I went to the actual machine to see what the command was doing, or whether it was still running, and saw this:

[root@xmail]# du -h -d1 > /root/mailboxsizes.txt
[14280.306953] INFO: task imapd:12559 blocked for more than 120 seconds.
[14280.307710] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[14280.309680] imapd           D ffff8800d3d9cd98     0 12559      1 0x00000080
[14280.310591]  ffff8800b17bbc20 0000000000000086 ffff880057bbce70 ffff8800b17bbfd8
[14280.310591]  ffff8800b17bbfd8 ffff8800b17bbfd8 ffff880057bbce70 ffff8800d3d9cd90
[14280.313532]  ffff8800d3d9cd94 ffff880057bbce70 00000000ffffffff ffff8800d3d9cd98
[14280.313532] Call Trace:
[14280.315669]  [<ffffffff8168d159>] schedule_preempt_disabled+0x29/0x70
[14280.316637]  [<ffffffff8168adb5>] __mutex_lock_slowpath+0xc5/0x1c0
[14280.316637]  [<ffffffff81208e17>] ? unlazy_walk+0x87/0x140
[14280.318543]  [<ffffffff8168a21f>] mutex_lock+0x1f/0x2f
[14280.319516]  [<ffffffff81683c93>] lookup_slow+0x33/0xa7
[14280.320690]  [<ffffffff8120c8f3>] path_lookupat+0x773/0x7a0
[14280.321718]  [<ffffffff81183775>] ? filemap_fault+0x215/0x410
[14280.321718]  [<ffffffff811de5e5>] ? kmem_cache_alloc+0x35/0x1e0
[14280.323363]  [<ffffffff8120f23f>] ? getname_flags+0x4f/0x1a0
[14280.324348]  [<ffffffff8120c94b>] filename_lookup+0x2b/0xc0
[14280.324348]  [<ffffffff81210367>] user_path_at_empty+0x67/0xc0
[14280.325307]  [<ffffffff811b1431>] ? handle_mm_fault+0x6b1/0xfe0
[14280.327150]  [<ffffffff812103d1>] user_path_at+0x11/0x20
[14280.327965]  [<ffffffff81203843>] vfs_fstatat+0x63/0xc0
[14280.328093]  [<ffffffff81203dae>] SYSC_newstat+0x2e/0x60
[14280.328093]  [<ffffffff81692875>] ? do_page_fault+0x35/0x90
[14280.330895]  [<ffffffff8168ea88>] ? page_fault+0x28/0x30
[14280.331790]  [<ffffffff8120408e>] SyS_newstat+0xe/0x10
[14280.331857]  [<ffffffff81697089>] system_call_fastpath+0x16/0x1b

I am new to sysadmin work and have zero idea what any of this is telling me, save that it has something to do with imapd. I have rebooted this machine several times and it barely released any hard drive space or, seemingly, resources. I cannot figure out what is going on or why du failed the way it did above. Mostly I am here asking where to even start. While this machine is old and has always had its moments, it has never done this before (I do acknowledge the data drive is low on space), but if I clear space, something eats it up.

For completeness:

df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        2.9G     0  2.9G   0% /dev
tmpfs           2.9G     0  2.9G   0% /dev/shm
tmpfs           2.9G   41M  2.8G   2% /run
tmpfs           2.9G     0  2.9G   0% /sys/fs/cgroup
/dev/vda3        21G   18G  2.1G  90% /
/dev/vdb        459G  435G  442M 100% /mail
/dev/vda1       976M  119M  790M  14% /boot
tmpfs           581M     0  581M   0% /run/user/0
tmpfs           581M     0  581M   0% /run/user/1000


top - 06:06:53 up  6:52,  3 users,  load average: 36.42, 36.64, 31.74
Tasks: 346 total,   8 running, 338 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.4 us,  1.2 sy,  0.0 ni,  0.0 id, 89.9 wa,  0.0 hi,  0.0 si,  1.5 st
KiB Mem :  5946284 total,   130280 free,  2278016 used,  3537988 buff/cache
KiB Swap:  2516988 total,  1906332 free,   610656 used.  3362528 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND     
19174 postfix   20   0   27028   5504   1488 R   8.3  0.1   0:38.53 imapd       
19586 postfix   20   0   27028   5504   1488 D   7.6  0.1   0:32.18 imapd       
19008 postfix   20   0   27028   5504   1488 R   7.0  0.1   0:48.61 imapd       
19372 postfix   20   0   27464   5872   1504 D   4.3  0.1   0:30.38 imapd       
20087 postfix   20   0   27028   5504   1488 D   4.3  0.1   0:23.27 imapd       
20188 postfix   20   0   27028   5504   1488 D   4.3  0.1   0:23.31 imapd       
20353 postfix   20   0   27028   5508   1488 D   4.3  0.1   0:23.05 imapd       
19963 postfix   20   0   27028   5508   1488 D   4.0  0.1   0:23.85 imapd       
20275 postfix   20   0   27028   5508   1488 D   4.0  0.1   0:22.56 imapd       
18460 postfix   20   0   29348   5748   1588 R   3.7  0.1   0:38.09 imapd       
20236 postfix   20   0   27028   5516   1488 D   3.7  0.1   0:22.86 imapd       
   32 root      20   0       0      0      0 S   1.7  0.0   5:57.44 kswapd0     
20079 postfix   20   0   32728   9152   1520 S   1.7  0.2   0:01.90 imapd       
19702 postfix   20   0   27028   5516   1488 D   1.3  0.1   0:27.77 imapd       
18575 postfix   20   0   30472   6848   1596 D   1.0  0.1   0:14.86 imapd       
19782 postfix   20   0   27028   5508   1488 D   1.0  0.1   0:27.02 imapd       
 1026 root      20   0 1174028  22616   8992 S   0.7  0.4   2:53.90 fail2ban-s+

I am not sure what to look at or try next so that I can du some folders, find out whose old inboxes we are keeping around, and get rid of them to free up space and hopefully make the server perform better.

My only idea is to systemctl stop postfix for a bit, see whether du and ls work better, and double-check that top isn't maxed out with that stopped.

Also, in case it is relevant, an iostat:

iostat
Linux 3.10.0-514.16.1.el7.x86_64 (xmail)    11/21/2024  _x86_64_    (3 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.95    0.01    1.29   88.86    0.65    3.24

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
vda              11.92       252.15        61.60    6335865    1547820
vdb            1517.62     62131.62        78.76 1561227117    1979120

1 Answer

HBruijn · Answer 1 · 2024-11-21T18:47:12+08:00

The message on your display is a kernel warning. Such warnings typically get broadcast to the console, recorded in the kernel ring buffer (typically displayed with the dmesg command, optionally with the -T switch for human-readable timestamps) and/or copied to log files in /var/log.
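As a rough sketch of how to pull those warnings back out of the ring buffer, the helper below (the name hung_tasks is made up for this example) reads kernel log text on stdin and counts how often each task was reported as blocked. It assumes the message format shown in the question.

```shell
#!/usr/bin/env bash
# Summarise "hung task" warnings read from stdin: one line per task
# (name:pid), prefixed with the number of times it was reported.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\) blocked for more than [0-9]* seconds.*/\1/p' \
        | sort | uniq -c | sort -rn
}

# Typical use (reading the ring buffer needs root on many systems):
#   dmesg -T | hung_tasks
```

If the list is dominated by imapd processes, that supports the idea that the mail store itself is the bottleneck rather than the du command.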

Often the command(s) you're running in that console are not the cause of the error that gets displayed.

In this case there might be a correlation though.

Most open source IMAP implementations store messages in the Maildir format, where each e-mail message is an individual file. On a file system with 450 GB of mail that means a lot of directories, many holding a great number of e-mails and therefore a lot of files. Running du on that /mail file system is likely to be (extremely) inefficient, depending on the file system it uses. You can check which file system is in use with, for example, cat /proc/mounts and/or by looking in /etc/fstab. Some file systems perform better than others, but tens of thousands of files in a single directory (in a single Inbox or other mail folder) can already be problematic.
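A minimal sketch of both checks, assuming GNU coreutils and findutils (as shipped on the CentOS 7 system in the question). The function names are invented for this example, and /mail is the mount point taken from the df output.

```shell
#!/usr/bin/env bash
# Print the file system type that backs a path (for example ext4 or xfs).
fs_type() {
    df --output=fstype "$1" | tail -n 1
}

# For each directory directly under $1, print "<file count><TAB><dir>",
# largest first. find emits one dot per file, so unusual file names
# cannot skew the count.
files_per_dir() {
    local dir
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        printf '%s\t%s\n' "$(find "$dir" -type f -printf '.' | wc -c)" "$dir"
    done | sort -rn
}

# Example (run at idle I/O priority so it competes less with imapd):
#   fs_type /mail
#   export -f files_per_dir; ionice -c3 bash -c 'files_per_dir /mail' | head
```

Counting files is usually cheaper than du because it does not need the size of every message, but on a disk this busy it can still take a long time.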

Similarly it will also be taxing when the imapd needs to index such an Inbox or other mail folder.

A full filesystem, as already displayed with the df command, aggravates the issue. Your users are probably still receiving messages, reading them and deleting them so there is probably quite some load.

It may also be worth running df -i to check whether you're running out of inodes as well.
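For instance, a small wrapper around df can reduce the inode check to a single number. This is only a sketch: inode_use_pct is a hypothetical helper name, and it relies on the --output option of GNU df.

```shell
#!/usr/bin/env bash
# Print inode usage for the file system holding $1 as a bare percentage
# (e.g. "87"). Prints nothing if the file system reports no inode counts.
inode_use_pct() {
    df --output=ipcent "$1" | tail -n 1 | tr -dc '0-9'
}

# Example: warn when /mail (mount point from the question) is nearly out.
#   pct=$(inode_use_pct /mail)
#   [ "${pct:-0}" -gt 90 ] && echo "inodes nearly exhausted on /mail"
```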

In general (nearly) full file systems suffer from all kinds of performance issues.

For a VM it is usually possible, and trivial, to assign more storage from the hypervisor to the virtual disk and then grow the /mail file system, after which many of your problems should go away. Both of those are usually online operations that can be safely executed during business hours.
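Which command grows the file system depends on its type. The sketch below only prints the command it would use rather than running it. grow_cmd is a made-up helper, and it covers just ext3/ext4 and XFS. It assumes the virtual disk has already been enlarged on the hypervisor and that, as the df output suggests, the file system sits directly on /dev/vdb with no partition to resize first.

```shell
#!/usr/bin/env bash
# Print (do not run) the command that grows a mounted file system to fill
# its already-enlarged block device.
grow_cmd() {
    local fstype=$1 device=$2 mountpoint=$3
    case "$fstype" in
        ext3|ext4) echo "resize2fs $device" ;;       # takes the device
        xfs)       echo "xfs_growfs $mountpoint" ;;  # takes the mount point
        *)         echo "unsupported file system: $fstype" >&2; return 1 ;;
    esac
}

# Example:
#   grow_cmd "$(df --output=fstype /mail | tail -n 1)" /dev/vdb /mail
```

Take a snapshot or backup before resizing, even though both tools are designed to work on a mounted file system.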

The opposite is to find wasted storage and reclaim that by deleting data that is no longer needed.

As to freeing up space:

Remove the Maildir and data of user accounts that no longer exist.
The IMAP protocol requires that clients mark individual messages as deleted, but that does not actually delete the files yet from a Maildir. That requires a separate EXPUNGE command. Some broken/legacy clients don't issue that expunge and deleted messages will linger indefinitely on your mail servers.
Other mail clients "delete" mail by moving the messages to a Trash or similar named folder where they can also linger indefinitely. Emptying their Trash for them can be useful to safely free up necessary space.
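To estimate how much space lingering deleted messages occupy, one option is to look at the Maildir file names themselves. In the Maildir convention, flags follow ":2," at the end of the file name, and T marks a message as trashed (the IMAP \Deleted flag). The sketch below only reports and deletes nothing. The function names are invented for this example, and it assumes GNU find.

```shell
#!/usr/bin/env bash
# List messages under $1 whose Maildir file name carries the T (trashed) flag.
list_trashed() {
    find "$1" -path '*/cur/*' -name '*:2,*T*' -type f
}

# Total size in bytes of those messages.
trashed_bytes() {
    find "$1" -path '*/cur/*' -name '*:2,*T*' -type f -printf '%s\n' \
        | awk '{ sum += $1 } END { print sum + 0 }'
}

# Example (the per-user path is an assumption about the layout under /mail):
#   trashed_bytes /mail/someuser
```

A large total for a user suggests their client never sends EXPUNGE. Folders named Trash can be sized the same way by pointing the functions at that folder.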

Depending on the mail server software you're using, there may be specific tools to help with such clean-up, like the doveadm expunge tool, which can search for e-mail messages carrying the DELETED flag and properly remove them from your file system.
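As an illustration only: doveadm belongs to Dovecot, so this does not apply to the Courier setup described in the question. The wrapper below builds a doveadm expunge call for one user. expunge_deleted and the RUN variable are hypothetical names, and by default the command is printed instead of executed.

```shell
#!/usr/bin/env bash
# Expunge messages flagged as deleted in every mailbox of one user.
# Dry run by default: the command is only printed unless RUN=1 is set.
expunge_deleted() {
    local user=$1
    local cmd=(doveadm expunge -u "$user" mailbox '*' DELETED)
    if [ "${RUN:-0}" = "1" ]; then
        "${cmd[@]}"
    else
        printf '%q ' "${cmd[@]}"   # shell-quoted, safe to copy and paste
        printf '\n'
    fi
}

# Example:
#   expunge_deleted jane          # show what would be run
#   RUN=1 expunge_deleted jane    # actually expunge
```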
