I'm running a mail server for about 30 people. I have had zero issues with it. But last week, several users began reporting an error in their email client, Outlook:
Checking the server mail log around the time of the error, I could only find these entries, all happening around the same time. I'm not even sure whether these entries are related to the Outlook error (they don't seem to have anything to do with SMTP), but the fact that the connections are all closed around the same time and the long "waiting input" times look suspicious:
81218 Jan 18 11:56:56 ip-172-30-0-131 dovecot: imap(t.olixxxx)<3739040></Z84+joPNhRsOgYu>: Connection closed (IDLE running for 0.001 + waiting input for 1175.376 secs, 2 B in + 10 B out, state=wait-input) in=182 out=172366 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
81219 Jan 18 11:56:56 ip-172-30-0-131 dovecot: imap(s.damxxxx)<3739037><iQY3+joPottsOgYu>: Connection closed (IDLE running for 0.001 + waiting input for 1174.763 secs, 2 B in + 10 B out, state=wait-input) in=182 out=799331 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
81220 Jan 18 11:56:59 ip-172-30-0-131 postfix/smtpd[3740240]: warning: hostname 179.hosted-by.198xd.com does not resolve to address 45.129.14.179: Name or service not known
81221 Jan 18 11:56:59 ip-172-30-0-131 postfix/smtpd[3740240]: connect from unknown[45.129.14.179]
81222 Jan 18 11:57:00 ip-172-30-0-131 dovecot: imap(j.pomexxxxx)<3739095><k7z3/zoPqLdsOgYu>: Connection closed (IDLE running for 0.001 + waiting input for 1078.221 secs, 2 B in + 10 B out, state=wait-input) in=165 out=801497 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
81223 Jan 18 11:57:00 ip-172-30-0-131 dovecot: imap(a.cerxxxxx)<3739042><JCXQ+joPu5JsOgYu>: Connection closed (IDLE running for 0.001 + waiting input for 1169.527 secs, 2 B in + 10 B out, state=wait-input) in=182 out=303618 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
81224 Jan 18 11:57:00 ip-172-30-0-131 dovecot: imap(h.foxxxxx)<3739034><kpEo+joP9g5sOgYu>: Connection closed (IDLE running for 0.001 + waiting input for 1180.675 secs, 2 B in + 10 B out, state=wait-input) in=194 out=1927 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
81225 Jan 18 11:57:00 ip-172-30-0-131 dovecot: imap(dxxxxxx)<3739057><xljV/DoPPnZsOgYu>: Connection closed (IDLE running for 0.001 + waiting input for 1135.454 secs, 2 B in + 10 B out, state=wait-input) in=182 out=458253 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
The errors aren't happening all the time for users but often enough to be annoying. I'm running dovecot and postfix on Debian bullseye.
Dovecot is your IMAP server. It allows mail clients to retrieve emails delivered to the mailbox. Sending of emails (SMTP) is handled by Postfix in your setup. The error you've shown us relates to SMTP. The subject of your post is wrong.
If both servers are running on the same host, and you only see SMTP errors, this suggests an issue with the SMTP server rather than a problem with the host or the network (but is FAR from conclusive).
Did you check all the logs or just the mail log?
BTW these log entries are not errors per se - and don't directly correlate with what you have shown us in the picture.
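To see what those Dovecot entries actually record, a small sketch like the one below can pull the user and the "waiting input" time out of each "Connection closed" line. It assumes the line format quoted in the question; the sample line is shortened from the post, and the function name `idle_waits` is just illustrative.

```python
import re

# Matches the Dovecot "Connection closed (IDLE running ...)" lines from the question.
IDLE_CLOSE = re.compile(
    r"imap\((?P<user>[^)]+)\).*?Connection closed \(IDLE running for "
    r"(?P<idle>[\d.]+) \+ waiting input for (?P<wait>[\d.]+) secs"
)

def idle_waits(lines):
    """Return (user, seconds spent waiting for input) for each idle-close entry."""
    results = []
    for line in lines:
        match = IDLE_CLOSE.search(line)
        if match:
            results.append((match.group("user"), float(match.group("wait"))))
    return results

# First line is shortened from the question; the second is a Postfix line that is skipped.
sample = [
    "Jan 18 11:56:56 ip-172-30-0-131 dovecot: imap(t.olixxxx)<3739040></Z84+joPNhRsOgYu>: "
    "Connection closed (IDLE running for 0.001 + waiting input for 1175.376 secs, "
    "2 B in + 10 B out, state=wait-input) in=182 out=172366",
    "Jan 18 11:56:59 ip-172-30-0-131 postfix/smtpd[3740240]: connect from unknown[45.129.14.179]",
]

for user, wait in idle_waits(sample):
    print(f"{user}: waited {wait / 60:.1f} minutes for input before the connection closed")
```

Run against the quoted entries, every session had been sitting in IMAP IDLE for roughly 18 to 20 minutes before it closed, which looks more like idle client connections going away than a server-side failure.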
Even for a small setup like this, having some monitoring in place is probably advisable. I'd also suggest testing whether Outlook reports an error if it loses (established) connectivity to an IMAP server.
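As a very rough starting point for monitoring, something like the following checks whether the mail ports accept TCP connections at all. This is only a sketch: the host name and the port list are placeholders, not taken from the post, and a real setup would more likely use an existing monitoring tool.

```python
import socket

def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False

# Hypothetical port list; adjust to the services the server actually exposes.
MAIL_PORTS = {"smtp": 25, "submission": 587, "imaps": 993}

def check_mail_host(host, ports=MAIL_PORTS, timeout=5.0):
    """Return a dict mapping each service name to True (reachable) or False."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}

# Example usage (placeholder host name):
#   for name, ok in check_mail_host("mail.example.com").items():
#       print(f"{name}: {'reachable' if ok else 'UNREACHABLE'}")
```

Running a check like this from the network where the users sit, not just from the server itself, shows whether the problem is specific to that network.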
The root cause of the problem was a user who had entered the wrong password on their mobile device. While on the client's Wi-Fi network, the device kept retrying the login. When the number of failed login attempts reached a threshold, fail2ban banned the IP address from the server for 10 minutes. Because the ban applies to the IP address rather than to the account, this impacted everyone else's ability to log in from that address.
Once the address was banned, the mail log did not show any activity from it, so the logs I posted here were me chasing ghosts.
However, I luckily spotted a log entry for that user with an "auth failed" message in it. This is when a light bulb went off: "the user might be getting banned by fail2ban." The only remaining mystery was why other users were getting banned. This became obvious shortly thereafter, when I saw that one of the IP addresses used by the user with the bad password was a T-Mobile IP address.
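For anyone chasing something similar, a sketch along these lines counts "auth failed" entries per remote IP and user, which makes a single misconfigured device stand out quickly. The sample lines are made up for illustration (documentation IP range, invented user names) and only approximate Dovecot's login log format, so the regular expression may need adjusting for your version.

```python
import re
from collections import Counter

# Looks for "auth failed" followed by user=<...> and rip=<remote IP> on the same line.
AUTH_FAILED = re.compile(
    r"auth failed.*?user=<(?P<user>[^>]*)>.*?rip=(?P<rip>[0-9a-fA-F.:]+)"
)

def failed_logins(lines):
    """Count 'auth failed' log entries per (remote IP, user) pair."""
    counts = Counter()
    for line in lines:
        match = AUTH_FAILED.search(line)
        if match:
            counts[(match.group("rip"), match.group("user"))] += 1
    return counts

# Illustrative lines only, not taken from the real server log.
sample = [
    "Jan 18 11:40:01 mail dovecot: imap-login: Disconnected (auth failed, 1 attempts in 2 secs): "
    "user=<alice>, method=PLAIN, rip=203.0.113.7, lip=172.30.0.131, TLS, session=<aaaa>",
    "Jan 18 11:40:09 mail dovecot: imap-login: Disconnected (auth failed, 1 attempts in 2 secs): "
    "user=<alice>, method=PLAIN, rip=203.0.113.7, lip=172.30.0.131, TLS, session=<bbbb>",
    "Jan 18 11:41:00 mail dovecot: imap-login: Login: user=<bob>, method=PLAIN, "
    "rip=203.0.113.7, lip=172.30.0.131, TLS, session=<cccc>",
]

for (rip, user), count in failed_logins(sample).most_common():
    print(f"{rip} {user}: {count} failed logins")
```

On the real server you would feed it the lines of the mail log instead of `sample`. Whether a given address is currently banned can then be confirmed with `fail2ban-client status <jail>`, where the jail name depends on your fail2ban configuration.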
Based on the information you provided, it seems that the issue you are experiencing is related to Dovecot. The log entry you shared indicates that the connection was closed after waiting for input for 1175.376 seconds. This could be due to a variety of reasons, such as network connectivity issues, incorrect configuration settings, or resource overuse. To troubleshoot this issue, I recommend following the steps at https://www.linode.com/docs/guides/troubleshooting-problems-with-postfix-dovecot-and-mysql/