Our Zimbra server experiences unexplained slowdowns every couple of days that are only resolved by rebooting the server. From the end user's perspective, sending a message from webmail eventually times out. From the system terminal, logging in, switching users, and restarting the Zimbra services are all slow. Switching users with 'su -' takes up to 2 minutes.
Restarting all the Zimbra services and the DNS services does not resolve the problem; only a complete reboot does. After rebooting, logging in, switching users, and restarting services are all fast again.
We are using dnsmasq for split DNS, which our environment needs because of NAT, but DNS queries return results immediately. We are using an external LDAP database for authentication, but no other servers using it show any problems, and it is not under heavy load either. Everything else is a default install and configuration.
There are no obvious errors in the system logs. Server load and disk I/O are the same whether or not the problem is occurring.
Originally this happened about once a week, usually on a Monday or Tuesday. This week, it happened on both Monday and Thursday.
My version is:
zimbra@servername ~ $ zmcontrol -v
Release 7.2.1_GA_2790.RHEL6_64_20120815212147 UNKNOWN_64 FOSS edition.
Has anyone encountered or solved such a problem?
I've found that rsyslog, when forwarding logs via TCP to a remote host, will sometimes hang when it can't reach that host. Even when the remote host comes back up, rsyslog remains hung, and as a result it slows down everything else on the system that tries to log. Restarting rsyslog does the trick when it happens, but restarting it regularly via a cron job never seemed to work for me. The best solution I found is to not have the remote host go down so much. :)
However, there are tweaks that can be made to rsyslog so that it queues rather than locking up. You might still experience the issue, and in that case no logs will be forwarded until rsyslog is restarted, but it will not affect the system as a whole.
Comment out your current forwarding rule, and drop this at the end of your rsyslog.conf:
You will need to make sure /var/spool/rsyslog exists, because rsyslog will not create the directory itself.
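For reference, a queued forwarding rule of this kind might look roughly like the sketch below. It uses the legacy directive syntax of the rsyslog versions shipped with RHEL 6 and follows the pattern in the rsyslog documentation for reliable forwarding. The remote host name loghost.example.com, the port, the queue file name fwdRule1, and the 1g disk limit are placeholders I've assumed for illustration, so adjust them to your setup.

```
# Directory for the on-disk queue spool files (must already exist)
$WorkDirectory /var/spool/rsyslog
# Use an in-memory linked-list queue so the forwarding action runs asynchronously
$ActionQueueType LinkedList
# Giving the queue a file name makes it disk-assisted; the name is an arbitrary prefix
$ActionQueueFileName fwdRule1
# Upper limit on disk space used by the spool files
$ActionQueueMaxDiskSpace 1g
# Write queued messages to disk on shutdown instead of discarding them
$ActionQueueSaveOnShutdown on
# Retry indefinitely while the remote host is unreachable
$ActionResumeRetryCount -1
# Forward everything over TCP (@@); hypothetical host and port
*.* @@loghost.example.com:514
```

The queue directives apply to the next action that follows them, so they have to appear directly before the forwarding line. Restart rsyslog afterwards for the change to take effect.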