We host our own Redmine Rails web application internally on Apache, using mod_auth_kerb and our internal Kerberos for authentication.
We have two internal Kerberos servers, KDC1 and KDC2. KDC1 is the master and KDC2 is a slave of KDC1.
When KDC1 is working we don't have a problem: Redmine on our Apache with Passenger setup is responsive.
The Kerberos servers are running Debian Lenny. The Redmine Apache2 server is running Debian Squeeze.
KDC1 recently went offline due to hardware issues. During this time every Redmine page load was very slow, taking around 10 seconds. Redmine worked and Kerberos authentication via the slave, KDC2, worked, but it was very slow.
For every Redmine page load the Redmine Apache system would start looking for KDC1 and eventually use KDC2. This process took several seconds each time.
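As a rough illustration of why a dead first KDC could cost several seconds on every request, the sketch below models a client that tries each KDC in order and waits out a timeout for every attempt against an unreachable one. This is a simplified model under stated assumptions, not the actual algorithm used by the Kerberos library; the function name `failover_delay` and the example values are hypothetical.

```python
def failover_delay(kdcs, alive, timeout_s, max_retries):
    """Return (kdc_used, seconds_waited) for a client that tries KDCs in order.

    Simplified model, NOT the Kerberos library's real algorithm: each
    unreachable KDC costs timeout_s for the first attempt plus timeout_s
    for every retry before the client moves on to the next KDC.
    """
    waited = 0.0
    for kdc in kdcs:
        if kdc in alive:
            return kdc, waited
        waited += timeout_s * (1 + max_retries)
    # No KDC answered at all.
    return None, waited


# Hypothetical values: 3-second timeout, 2 retries, KDC1 down.
used, waited = failover_delay(["KDC1", "KDC2"], {"KDC2"}, timeout_s=3, max_retries=2)
print(used, waited)  # KDC2 9.0
```

Under this model, a 1-second timeout with no retries would bring the wait down to about 1 second per request, which is what the settings tried below were meant to achieve.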
I experimented with the options below, using different values, in /etc/krb5.conf on the Redmine Apache server:
[libdefaults]
default_realm = DOMAIN.COM
kdc_timeout = 1
max_retries = 0
I tried different values and ran tcpdump to see where the delays were and whether changing the above settings made a difference. I didn't see any difference in the tcpdump captures or in Redmine page load times in the browser.
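Alongside packet captures, a small timing probe can show how long a single connection attempt to each KDC takes from the web server. The sketch below is a minimal example, assuming the KDCs accept TCP connections on port 88; Kerberos clients typically try UDP first, so it measures basic reachability rather than reproducing the library's behavior. The function name `time_tcp_connect` is hypothetical.

```python
import socket
import time


def time_tcp_connect(host, port=88, timeout_s=2.0):
    """Return (reachable, elapsed_seconds) for one TCP connection attempt.

    Assumes the KDC listens on TCP; a refused connection, a timeout, or a
    name resolution failure all count as unreachable.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            reachable = True
    except OSError:
        reachable = False
    return reachable, time.monotonic() - start
```

For example, calling `time_tcp_connect("kdc1.domain.com")` and then the same for the second KDC's host name (whatever it is in your setup) would show whether the failed server is rejecting connections quickly or leaving the client to wait out the full timeout.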
Am I doing this wrong? Is it possible to make our Redmine Apache system use KDC2 faster, fast enough that it's not noticeably slower if KDC1 fails?
What are some good ways, or what is the best way, to set up Kerberos for high-availability failover?
If I can't speed up failover with the current slave setup, I may try something else: instead of a slave, creating two identical KDC1 servers and using Heartbeat to fail over the IP address for kdc1.domain.com in the event of a failure. I haven't gotten around to that yet.
Thanks in advance.