Recently our internal SMTP relay server stopped working - or rather, stopped sending emails. It still accepts mail submissions, and a simple Telnet session shows no errors, but all messages get stuck in its queue.
This all happened after our virtualization server ran out of disk space and suspended quite a few machines.
I suspect this is a DNS issue. I was able to set up a working relay server inside a virtual machine (on my own workstation), but I had to configure 8.8.8.8 as the DNS server for that machine.
As soon as I reverted to our DHCP-assigned DNS server, my own mail server stopped relaying as well.
Unfortunately, there appear to be no useful logs whatsoever. I don't know if the mail server is having issues resolving some domain name or connecting to some IP. On my virtual machine with Google DNS, the log contains the send message requests followed by a set of relaying entries - i.e. connecting to another mail server and forwarding the email. On the actual server where the issue occurs, there are no entries about relaying at all.
To make matters worse, the machine hosting the mail server is also the primary DNS machine for our internal network (don't look at me, I didn't set that up), so I cannot just use Google's DNS and call it a day - doing so would probably break a lot of other things.
Any idea what could possibly be wrong? Alternatively - is there any way to find out the EXACT reason messages aren't leaving the queue folder?
I've eventually managed to find the answer.
Someone had turned on the "disable recursion" advanced option in the DNS server.
I'm not sure why exactly turning recursion off effectively disabled the SMTP relay server's capability, but in my case this was the direct cause of the problem.
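A likely explanation, offered as an assumption rather than something confirmed in this thread: to relay a message, the SMTP service has to look up the MX record of each recipient domain, and a DNS server with recursion disabled only answers for zones it hosts itself, so lookups for external domains fail and the messages stay queued. The sketch below is one way to check whether a DNS server offers recursion, using only the Python standard library. It sends a single MX query over UDP and reads the "recursion available" (RA) flag and the response code from the reply header. The function names, the server address 192.0.2.53, and the domain example.com are placeholders, not values from the setup described here.

```python
import socket
import struct

MX = 15  # DNS record type for mail exchangers


def build_query(name, qtype=MX, query_id=0x1234):
    """Build a minimal DNS query packet with the 'recursion desired' bit set."""
    # Header: id, flags (0x0100 = RD), 1 question, 0 answer/authority/additional
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte
    labels = name.strip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # Question: name, record type, class IN (1)
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_header(response):
    """Extract the fields of interest from the 12-byte DNS response header."""
    if len(response) < 12:
        raise ValueError("response shorter than a DNS header")
    query_id, flags, _qd, answers, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "id": query_id,
        "recursion_available": bool(flags & 0x0080),  # RA bit
        "rcode": flags & 0x000F,  # 0 = no error, 5 = refused
        "answer_count": answers,
    }


def probe(server, name, timeout=3.0):
    """Send one MX query to `server` over UDP and return the parsed header."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _ = sock.recvfrom(512)
    return parse_header(data)
```

Calling `probe("192.0.2.53", "example.com")` with the internal DNS server's real address and an external domain should show `recursion_available` as `False` while the option is on, typically with an `answer_count` of 0. Running the same call against 8.8.8.8 gives a baseline for comparison.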
I had a space at the front of the DNS entry.