I have a Windows Server 2003 R2 32-bit machine running Apache 2.4.2 with OpenSSL 1.0.1c and PHP 5.4.5 via mod_fcgid 2.3.7. This configuration worked fine for some hours, but then the site could no longer be reached by its domain name, say www.example.com, although it could still be reached by its IP address.
In particular, https://www.example.com/ yielded a connection error, while http://123.1.2.3/ worked just fine. Yes, the first is https and the second is http.
The error and access logs were clean, i.e. they showed no signs of problems: just the usual messages, which stopped while the site couldn't be reached.
After some investigation, a simple restart of Apache solved the problem. Unfortunately, I didn't have the chance to test whether https://123.1.2.3/ worked as well, or whether http://www.example.com/ was still redirected to https as usual.
So, does anyone have any idea what happened? Before I get tired of Apache and ditch it in favor of Nginx?
Edit: Some log information.
The last line of sslerror.log is from 90 minutes before the problem occurred, so I guess it's not important. ssl_request.log shows nothing interesting either; these are the last two lines before the problem:
[28/Aug/2012:17:47:54 +0200] x.x.x.x TLSv1.1 ECDHE-RSA-AES256-SHA "GET /login HTTP/1.1" 1183
[28/Aug/2012:17:47:45 +0200] y.y.y.y TLSv1 ECDHE-RSA-AES256-SHA "POST /upf HTTP/1.1" 73
The previous lines are all the same and don't seem interesting, except 4 lines like these 30-40 seconds before the problem:
[28/Aug/2012:17:47:14 +0200] z.z.z.z TLSv1 ECDHE-RSA-AES256-SHA "-" -
These are the corresponding lines from sslaccess.log:
z.z.z.z - - [28/Aug/2012:17:47:14 +0200] "-" 408 -
...
x.x.x.x - - [28/Aug/2012:17:47:54 +0200] "GET /login HTTP/1.1" 200 1183
y.y.y.y - - [28/Aug/2012:17:47:45 +0200] "POST /upf HTTP/1.1" 200 73
It seems some connections timed out?
The virtual server listening on port 80 usually redirects all connections to https, so access.log has shown nothing since 40 minutes before the problem. error.log shows some warnings 4 minutes before the issue:
[Tue Aug 28 17:53:30.921034 2012] [fcgid:warn] [pid 1964:tid 1728] mod_fcgid: process 1852 graceful kill fail, sending SIGKILL
I get a lot of these warnings; I guess that's normal?
This sounds like a DNS issue. When the site becomes unreachable by name, you first need to ensure that the name correctly resolves to the IP address of the server. There are many ways to do this, such as performing an nslookup or even just a ping of the name. Only if you are indeed getting the correct address should you start looking at the Apache end of things.
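If you would rather script the check than run nslookup by hand, something along these lines might do. It is only a rough sketch: the helper name resolves_to is made up for illustration, www.example.com and 123.1.2.3 are the placeholders from the question, and it asks the local system resolver, so it reflects what that particular machine sees rather than what every client sees.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.getaddrinfo):
    """Check whether hostname currently resolves to expected_ip.

    Returns (ok, addresses), where addresses is the sorted list of
    distinct IPs the resolver returned. The resolver argument exists
    only so the lookup can be swapped out (hypothetical helper, not
    part of any library).
    """
    try:
        # Port None means "no particular service"; we only want addresses.
        infos = resolver(hostname, None)
    except socket.gaierror:
        # Name did not resolve at all: a strong hint of a DNS problem.
        return False, []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    addresses = sorted({info[4][0] for info in infos})
    return expected_ip in addresses, addresses
```

Calling resolves_to("www.example.com", "123.1.2.3") while the site is unreachable by name would tell you whether to keep looking at DNS (no addresses, or the wrong ones) or move on to Apache (the expected address is in the list).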