This server has been providing email service since Red Hat v1.1 (circa 1997, I think) and is now on Fedora 37; it has been kept current through many hardware and OS updates along the way. We chose postfix and dovecot early on and are still using them. And I've been the system mangler all this time.
A week ago tomorrow we had our /var tree wiped out due to a bug in a backup script - d'oh! It required a full rebuild of the OS, "from scratch", to the same version. All the software not included in the Fedora Server 37 distribution was loaded fresh via dnf install. And we got our full config back from our good backups.
On Monday, two days ago, I noticed the system was sluggish but didn't have time to look into it. I also noticed our link to the internet seemed to have performance problems. That was a clue...
Then I decided to get spamassassin working again - it takes time to configure a mature environment like this! SA had been disabled before the loss of /var, so it didn't just start back up. Anyway, when I went to check /var/log/maillog to see if it was doing its job, I found all these "mail being relayed" messages?! Whisky Tango Foxtrot?!
I then checked the mail queues - hundreds of thousands to Gmail alone! WOW!
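For anyone wanting to see where a flooded queue is headed, here is a minimal sketch in Python. It assumes the queue listing comes from postqueue -j (Postfix 3.1 and later), which prints one JSON object per queued message with a "recipients" list. The function name and the sample data are made up for illustration; this is not what I actually ran.

```python
import json
from collections import Counter


def count_recipient_domains(json_lines):
    """Count queued recipients per domain from `postqueue -j` output.

    Each non-empty line is expected to be one JSON object describing one
    queued message, with a "recipients" list of {"address": ...} objects.
    """
    counts = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        for rcpt in entry.get("recipients", []):
            address = rcpt.get("address", "")
            # Everything after the last "@" is the domain; lowercase it so
            # "Gmail.com" and "gmail.com" are counted together.
            if "@" in address:
                domain = address.rpartition("@")[2].lower()
            else:
                domain = "(no domain)"
            counts[domain] += 1
    return counts


# Made-up sample lines, only to show the expected shape of the input.
sample = [
    '{"queue_id": "4F1A2B3C4D", "sender": "x@example.com", '
    '"recipients": [{"address": "a@gmail.com"}, {"address": "b@Gmail.com"}]}',
    '{"queue_id": "5E6F7A8B9C", "sender": "y@example.com", '
    '"recipients": [{"address": "c@example.org"}]}',
]
for domain, n in count_recipient_domains(sample).most_common():
    print(domain, n)
```

On a live system the same function could be fed the lines of postqueue -j instead of the sample list, for instance by passing it sys.stdin in a small wrapper script.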
For now, I've turned off ALL outbound emails with:
default_transport = error: Sorry spammers, we're not sending your email! So sue us!
Then I began trying to figure out what went wrong.
I DID find SOME were getting through claiming to be 127.0.0.1, so I closed that down. And I methodically went through all the various (and copious) postfix configuration options and couldn't find a thing wrong...
So, I went to use one of those scripted open-relay testing websites that try a dozen or so different hacks that spammers use to convince otherwise well-configured servers to relay their mail, but I couldn't find any - the last time I looked, there were a half-dozen or so such websites! (What happened to 'em?! If you know of one, please tell me!)
And so I used nmap. It does NOT do a comprehensive job, or if it can, I'm not familiar with how. But I turned sending back on and tested. In testing, it says:
Host is up (0.00027s latency).
rDNS record for <ip-addr>: <reverse-lookup-map>
PORT STATE SERVICE
25/tcp open smtp
|_smtp-open-relay: Server doesn't seem to be an open relay, all tests failed
465/tcp filtered smtps
587/tcp open submission
|_smtp-open-relay: Server isn't an open relay, authentication needed
MAC Address: [its mac address] (controller's mfg name)
Nmap done: 1 IP address (1 host up) scanned in 22.03 seconds
The ONLY two websites I could find to test it and report were non-responsive: the first loaded but didn't respond, and then wouldn't reload when I tried reloading the page; the second kept saying it was busy, try again later.
So ... back to figuring it out "by hand."
OK, so NOW what do we do?
ALL requests for setting information will be gladly honored, but the config file is first of all huge, and secondly it contains a lot of private information we don't want out there.
More information - at anx's request, the output of postconf -n:
alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
broken_sasl_auth_clients = yes
command_directory = /usr/sbin
compatibility_level = 3.6
daemon_directory = /usr/libexec/postfix
data_directory = /var/lib/postfix
debug_peer_level = 10
debug_peer_list = <past-not-current-external-ip>
debugger_command = PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin ddd $daemon_directory/$process_name $process_id & sleep 5
default_transport = error: <our-middle-finger-to-spammers>
disable_vrfy_command = yes
html_directory = no
inet_interfaces = all
inet_protocols = all
local_recipient_maps = unix:passwd.byname $alias_maps
mail_owner = postfix
mailbox_size_limit = 1073741824
mailq_path = /usr/bin/mailq.postfix
manpage_directory = /usr/share/man
message_size_limit = 536870912
meta_directory = /etc/postfix
milter_default_action = accept
mydestination = $myhostname, localhost.$mydomain, localhost, <list-of-60ish-domain-names>
mydomain = <primary-domain>
myhostname = mail.<primary-domain>
mynetworks = <list-of-5-internal-ips>
mynetworks_style = subnet
newaliases_path = /usr/bin/newaliases.postfix
proxy_interfaces = <a-non-extant-external-ip-we-used-to-have>
queue_directory = /var/spool/postfix
readme_directory = /usr/share/doc/postfix/README_FILES
relay_domains = $mydestination, <list-of-11-internal-ips-most-don't-exist-now>
sample_directory = /usr/share/doc/postfix/samples
sendmail_path = /usr/sbin/sendmail.postfix
setgid_group = postdrop
shlib_directory = /usr/lib64/postfix
smtp_tls_CAfile = /etc/pki/tls/certs/ca-bundle.crt
smtp_tls_CApath = /etc/pki/tls/certs
smtp_tls_security_level = may
smtpd_helo_required = yes
smtpd_helo_restrictions = permit_mynetworks, check_helo_access hash:/etc/postfix/helo_access, reject_invalid_helo_hostname, reject_non_fqdn_helo_hostname
smtpd_recipient_restrictions = permit_mynetworks, permit_sasl_authenticated, reject_unauth_pipelining, reject_non_fqdn_recipient, reject_unknown_recipient_domain, reject_unauth_destination, check_sender_access hash:/etc/postfix/sender_access, check_client_access hash:/etc/postfix/pop-before-smtp, permit_mynetworks
smtpd_sasl_auth_enable = yes
smtpd_sasl_path = /var/spool/postfix/private/auth
smtpd_sasl_security_options = noanonymous
smtpd_sasl_tls_security_options = noanonymous
smtpd_sasl_type = dovecot
smtpd_sender_restrictions = permit_mynetworks, permit_sasl_authenticated, check_client_access hash:/etc/postfix/pop-before-smtp, reject_non_fqdn_sender, reject_unknown_sender_domain
smtpd_tls_cert_file = /etc/letsencrypt/live/<primary-domain-name>/fullchain.pem
smtpd_tls_dh1024_param_file = $config_directory/dh2048.pem
smtpd_tls_dh512_param_file = $config_directory/dh512.pem
smtpd_tls_key_file = /etc/letsencrypt/live/<primary-domain-name>/privkey.pem
smtpd_tls_security_level = may
soft_bounce = no
strict_mailbox_ownership = no
unknown_local_recipient_reject_code = 550
More information - most also at the request of anx:
- Postfix uses ports 25 (smtp) and 587 (submission or msa).
- Dovecot uses ports 993 (imaps) and 995 (pop3s), while it also listens on 143 & 110 (imap/pop), which are blocked by (multiple) firewalls.
- Output of postconf -M:
smtp inet n - n - - smtpd
submission inet n - n - - smtpd -o syslog_name=postfix/submission -o smtpd_tls_security_level=encrypt -o smtpd_sasl_auth_enable=yes
pickup unix n - n 60 1 pickup
cleanup unix n - n - 0 cleanup
qmgr unix n - n 300 1 qmgr
tlsmgr unix - - n 1000? 1 tlsmgr
rewrite unix - - n - - trivial-rewrite
bounce unix - - n - 0 bounce
defer unix - - n - 0 bounce
trace unix - - n - 0 bounce
verify unix - - n - 1 verify
flush unix n - n 1000? 0 flush
proxymap unix - - n - - proxymap
proxywrite unix - - n - 1 proxymap
smtp unix - - n - - smtp
relay unix - - n - - smtp -o syslog_name=postfix/$service_name
showq unix n - n - - showq
error unix - - n - - error
retry unix - - n - - error
discard unix - - n - - discard
local unix - n n - - local
virtual unix - n n - - virtual
lmtp unix - - n - - lmtp
anvil unix - - n - 1 anvil
scache unix - - n - 1 scache
postlog unix-dgram n - n - 1 postlogd
Information yet to fetch:
- What's in the "locally added received header"s.
- Analysis of postfix log "stack" for a given queue ID.
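As a starting point for that second item, here is a minimal Python sketch that pulls every log line belonging to one queue ID out of the mail log. It assumes the usual Postfix log shape, "postfix/<service>[pid]: <queue-id>: ...", where the service part may contain a slash (as with the postfix/submission syslog_name in the master.cf above). The function name and sample lines are made up for illustration.

```python
import re


def lines_for_queue_id(log_lines, queue_id):
    """Return every Postfix log line that belongs to one queue ID, in order."""
    # Postfix lines look like "... postfix/smtpd[1234]: QUEUEID: ..."; the
    # service name may contain a slash (e.g. postfix/submission/smtpd).
    pattern = re.compile(
        r"postfix/[^\s\[]+\[\d+\]: " + re.escape(queue_id) + r": "
    )
    return [line.rstrip("\n") for line in log_lines if pattern.search(line)]


# Made-up log lines in the usual Postfix shape, for illustration only.
sample_log = [
    "Jan 11 10:00:01 mail postfix/submission/smtpd[2101]: 4F1A2B3C4D: "
    "client=unknown[203.0.113.7], sasl_method=LOGIN, sasl_username=someuser\n",
    "Jan 11 10:00:01 mail postfix/cleanup[2105]: 4F1A2B3C4D: "
    "message-id=<abc@example.com>\n",
    "Jan 11 10:00:02 mail postfix/qmgr[1500]: 5E6F7A8B9C: "
    "from=<y@example.com>, size=1234, nrcpt=1 (queue active)\n",
    "Jan 11 10:00:03 mail postfix/qmgr[1500]: 4F1A2B3C4D: removed\n",
]
for line in lines_for_queue_id(sample_log, "4F1A2B3C4D"):
    print(line)
```

The first line of such a "stack" is usually the interesting one here: it shows which client handed the message in and, if the session authenticated, which sasl_username it used.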
HOLD THE PROGRAM!
Simply HAVING to send a few vital emails out, but not wanting to give the spammers a single use of our systems, I decided to try just turning off dovecot... I found that turning off dovecot and restoring the default_transport setting allowed outbound to work normally and Postfix did NOT become an open relay! YAY!
SURE, it won't work for our non-local users (which number > 1 and < 100), but hey, "you gotta do what you gotta do..."
I think this shifts the focus considerably; dovecot is THE issue.
I'm now convinced it was not a Postfix issue but a Dovecot issue. However, this answer may well be helpful to others.

In particular, determining that it was likely not a Postfix issue was a bigger learning curve than I expected for someone who's used the same system (as a system admin) for literally two decades and then some.

What I hadn't realized was that with this kind of configuration, Postfix itself "gives up authority control" to Dovecot. And, well, at this point, Dovecot's ability to receive valid connections from those of us who are not in the internal network except when we're on-site has become an important feature of our functioning.

Thankfully, it's less vital for us than for many other organizations as we have other alternatives, but the reality is that most don't want to bother to log in via the command line just to deal with mail - I suspect this is the most popular position in this circumstance! But that takes nothing away from recognizing what I had not:

You cannot just stop Dovecot, once configured, and expect to be able to send WITHOUT reconfiguring Postfix to not use Dovecot. And you can't even just block the ports to Dovecot from the outside, because Postfix will just call Dovecot directly anyway!

Ultimately, as per my understanding (glad to be proven wrong!), once configured, Postfix gives up darned near all security authority to Dovecot for any connection that pretends to be a valid user, though all "internal" sends and receives function normally with Dovecot running and Postfix prevented from sending (as described in the question).

I figure it's both fruitless and inappropriate to use this question to ask about Dovecot's issues, and while I think these comments constitute a valid answer, I will need to ask another question about Dovecot because "I'll be damned if I can find what's wrong!"

I strongly suspect either I'm making a (likely not uncommon) gaffe OR there's a bug in the version we have now, post rebuild, as the config before reload was NOT sending spam, and that's what we had been using for several days before the spammers began using our system! Notably, when we didn't stop Postfix from sending, NO external system we found / used to test our system thinks our system is an open mail relay, but there's NO denying it relayed hundreds of thousands of emails improperly! Further, opening it back up (permitting Postfix to send) promptly allowed ...

Worth citing here is that I have checked the logs and, with the limited time I have, given all the demands on me at present, I have with 90%+ certainty confirmed that there's no "stolen credentials," but that has yet to be proven. I sure don't see the evidence in the logs, though I'm not keen to open up to the spammers again with new logging configuration to gain more data. PERHAPS it's worth always having maximum authorization logging for just this reason!
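For anyone wanting to repeat the "stolen credentials" check on logs they already have, here is a minimal Python sketch. It assumes Postfix's smtpd logs authenticated sessions in the usual form, "client=host[ip], sasl_method=..., sasl_username=...", and tallies how many messages each SASL username submitted and from which client addresses. The function name and sample lines are made up for illustration; it is not a substitute for proper auth logging on the Dovecot side.

```python
import re
from collections import Counter, defaultdict

# Matches the client address and SASL username on a Postfix smtpd line.
SASL_RE = re.compile(
    r"client=[^\[\s]*\[(?P<ip>[^\]]+)\].*\bsasl_username=(?P<user>[^\s,]+)"
)


def sasl_logins_by_user(log_lines):
    """Map each SASL username to a Counter of client IPs it submitted from."""
    logins = defaultdict(Counter)
    for line in log_lines:
        match = SASL_RE.search(line)
        if match:
            logins[match.group("user")][match.group("ip")] += 1
    return logins


# Made-up log lines in the usual Postfix shape, for illustration only.
sample_log = [
    "Jan 11 10:00:01 mail postfix/submission/smtpd[2101]: 4F1A2B3C4D: "
    "client=unknown[203.0.113.7], sasl_method=LOGIN, sasl_username=someuser",
    "Jan 11 10:00:05 mail postfix/submission/smtpd[2101]: 5E6F7A8B9C: "
    "client=unknown[203.0.113.7], sasl_method=LOGIN, sasl_username=someuser",
    "Jan 11 10:01:00 mail postfix/smtpd[2200]: 6A7B8C9D0E: "
    "client=mx.example.org[198.51.100.4]",
]
for user, ips in sasl_logins_by_user(sample_log).items():
    print(user, dict(ips))
```

A single username submitting thousands of messages from addresses nobody recognizes would point at compromised or guessable credentials rather than a relay hole; lines with a client= but no sasl_username were accepted without authentication and are not counted here.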