On a VPS with 1 CPU core and 2GB RAM, I run MySQL and Apache 2 for a low-traffic website. Sometimes the machine slows down or stops serving requests through Apache or MySQL.
That's why I set up Nagios, which sends me alerts like "Service Alert: localhost/Current Load is WARNING" after 5-10 days of uptime. I can then log in through SSH and check RAM with "free": there is still enough, with 500MB+ available and only 60MB of swap in use.
Since the system slowed down again, I checked the syslog and found lots of these entries:
Jun 30 23:46:31 cl22 postfix/error[2190]: 46D8974323: to=, relay=none, delay=294806, delays=294803/3/0/0, dsn=4.4.3, status=deferred (delivery temporarily suspended: Host or domain name not found. Name service error for name=zombine.com type=MX: Host not found, try again)
Jun 30 23:46:31 cl22 postfix/error[2193]: 49CB374123: to=, relay=none, delay=154189, delays=154185/3.1/0/0, dsn=4.4.3, status=deferred (delivery temporarily suspended: Host or domain name not found. Name service error for name=zombine.com type=MX: Host not found, try again)
Jun 30 23:46:31 cl22 postfix/error[2153]: 4E2C874250: to=, relay=none, delay=433708, delays=433704/3.1/0/0, dsn=4.4.3, status=deferred (delivery temporarily suspended: Host or domain name not found. Name service error for name=zombine.com type=MX: Host not found, try again)
Jun 30 23:46:31 cl22 postfix/error[2176]: 480D874180: to=, relay=none, delay=174308, delays=174304/3.1/0/0, dsn=4.4.3, status=deferred (delivery temporarily suspended: Host or domain name not found. Name service error for name=zombine.com type=MX: Host not found, try again)
How can I find out which process is causing the load? It's far too much for a 1-core VPS: WARNING - load average: 3.06, 5.79, 3.42
MySQL is OK, and Apache 2 seems to be OK. Postfix maybe not? Is there anything else I have not identified yet?
Please let me know how to find the offending process and how to temporarily renice or deprioritize postfix etc. so that Apache 2 and MySQL remain healthy. These two processes are important to me. So are the outgoing emails, because they carry messages to clients.
According to the logs you've shown, the domain postfix is trying to deliver to, zombine.com, either can't be resolved by your DNS or doesn't have an MX record, which is why postfix is reporting errors. You could run a cron job every 5 minutes or so that checks whether a new error has been added to syslog, then runs top and emails the results to you. From there you can figure out which process is consuming the most CPU or memory.
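A minimal sketch of such a cron job, under a few assumptions: it triggers on the 1-minute load average rather than on new syslog entries, and it appends to a log file instead of emailing, since mail delivery is itself under suspicion here. THRESHOLD, LOGFILE and the function names are illustrative, not taken from the post.

```shell
#!/bin/sh
# Sketch: log the busiest processes whenever the 1-minute load is high.
# THRESHOLD and LOGFILE are illustrative assumptions, not values from the post.
THRESHOLD="${THRESHOLD:-2.0}"
LOGFILE="${LOGFILE:-/var/log/load-snapshot.log}"

# Exit 0 when the load in $1 is greater than the limit in $2.
load_exceeds() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load + 0 > max + 0) }'
}

# Append a timestamp and the top CPU consumers (with nice value and state).
snapshot() {
    {
        date
        ps -eo pid,ni,stat,pcpu,pmem,comm --sort=-pcpu | head -n 15
        echo
    } >> "$LOGFILE"
}

main() {
    # The first field of /proc/loadavg is the 1-minute load average (Linux).
    read -r load1 _rest < /proc/loadavg
    if load_exceeds "$load1" "$THRESHOLD"; then
        snapshot
    fi
}

# Only act when called as "script run", so the functions can be sourced safely.
if [ "${1:-}" = "run" ]; then
    main
fi
```

Saved under a hypothetical path such as /usr/local/bin/load-snapshot.sh, it could be scheduled with a crontab entry like "*/5 * * * * /usr/local/bin/load-snapshot.sh run". In the log, processes whose STAT column starts with D are in uninterruptible sleep, usually waiting on disk, and they count towards the load average even when CPU usage looks low. Once the culprit is known, renice -n 10 -p PID lowers its priority.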
That error is not related to the email address itself; it's a DNS problem. If this server sends the emails, make sure it can see the MX record for your domain zombine.com:
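One way to run that check, sketched under the assumption that dig is installed (package dnsutils on Debian/Ubuntu). The function names count_mx and check_mx are made up for this example.

```shell
# Sketch: check whether MX records for a domain are visible from this server.
# Assumes dig is installed; count_mx and check_mx are hypothetical helpers.

# Count "<priority> <host>" answer lines in `dig +short MX` output on stdin.
count_mx() {
    awk 'NF == 2 && $1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }'
}

# Usage: check_mx zombine.com
check_mx() {
    n=$(dig +short MX "$1" | count_mx)
    if [ "$n" -gt 0 ]; then
        echo "$1: $n MX record(s) visible"
    else
        echo "$1: no MX records returned"
        return 1
    fi
}
```

An empty result can mean either that no MX record is published or that the resolver itself is failing. Running dig MX zombine.com without +short shows the status in the response header (for example NXDOMAIN or SERVFAIL), which helps tell the two apart.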
Postfix will continue attempting to send these emails over and over for days in case of a "recoverable" failure like this one.
Another thing to check is whether you have disk I/O problems: look at the I/O wait ("wa") and hardware interrupt ("hi") CPU figures in top. If that is the issue, you can install and run iotop to see which process is generating the I/O.

You can configure these parameters (in days) to adjust how long postfix keeps trying to deliver undeliverable mail:
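The two main.cf parameters that control this are maximal_queue_lifetime (for normal mail) and bounce_queue_lifetime (for bounce notifications), both defaulting to 5d. A minimal sketch; the 2d value is an illustrative assumption, not a recommendation from the post:

```
# /etc/postfix/main.cf
# Give up on deferred mail after 2 days instead of the default 5.
maximal_queue_lifetime = 2d
# Same limit for undeliverable bounce notifications.
bounce_queue_lifetime = 2d
```

Changes take effect after postfix reload.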
Additionally, make sure the following settings are correct to ensure you are not operating an open relay (this can be a source of unwanted SMTP traffic as people use your server to send spam):
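A hedged sketch of the main.cf settings this usually refers to. The values are assumptions for a server that only sends mail generated on the machine itself. smtpd_relay_restrictions needs Postfix 2.10 or later; older versions carry the same rules in smtpd_recipient_restrictions.

```
# /etc/postfix/main.cf (illustrative values, adjust to your setup)
# Only the local machine may relay without authenticating.
mynetworks = 127.0.0.0/8 [::1]/128
# Refuse to relay mail for domains this server is not responsible for.
smtpd_relay_restrictions = permit_mynetworks, permit_sasl_authenticated, reject_unauth_destination
```

Running postconf -n shows the non-default settings currently in effect, which makes it easy to compare against the values above.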
Then, empty your mail queue:
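Since outgoing mail to clients matters here, a narrower option than wiping everything is to remove only the messages for the unresolvable domain. The sketch below assumes the standard postqueue -p listing format; queue_ids_for_domain is a hypothetical helper name, and it does a plain substring match on "@domain" across sender and recipient lines.

```shell
# Sketch: list queue IDs of messages involving a given domain.
# Reads `postqueue -p` output on stdin; the argument is the domain.
# Usage (as root):
#   postqueue -p | queue_ids_for_domain zombine.com | postsuper -d -
queue_ids_for_domain() {
    # Drop the header line, then treat each blank-line-separated entry as one
    # record; the first field of a record is the queue ID, which may carry a
    # trailing * (active) or ! (on hold) marker that tr strips off.
    tail -n +2 |
        awk -v addr="@$1" 'BEGIN { RS = "" } index($0, addr) { print $1 }' |
        tr -d '*!'
}
```

To clear the whole queue instead, including any legitimate mail still waiting for delivery, the blanket form is postsuper -d ALL, run as root.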
This is case-sensitive for safety reasons. You should then find that postqueue -p shows an empty queue.