We recently migrated our site to a load-balanced Apache cluster behind Varnish. Since then, a very small subset of users has reported that they cannot view any pages. I have narrowed the issue down quite a bit. The issue was not present before the move, when the old infrastructure was a single large box.
We are on Rackspace Cloud running 8 apache2 instances behind Varnish 3.0, all load balanced using Rackspace Cloud load balancers (Zeus), plus 2 MySQL instances, for a total of 10 servers, all Linux.
For an affected user:
- User can view a static HTML file.
- User can view a static asset such as an image.
- User cannot view any PHP file, even a simple one that contains only phpinfo();
- User still cannot view any PHP file when the load balancer is taken out of the picture.
The Apache, Varnish and PHP error logs show nothing of note; the affected requests appear only as ordinary entries in the Apache access logs. PHP error reporting is set to log rather than display. I switched it to display for a short time, and the user still got a blank page without an error.
Servers are:
- Ubuntu Maverick 10.10
- Apache 2.2.16-1ubuntu3.1 (mpm-worker)
- PHP 5.3.3-1ubuntu9.5 (used in fcgi)
- PHP APC is in use
- Application is built on CodeIgniter
- Varnish was 2.1.3, now 3.0.0 - issue was present with both versions
- MySQL is the database backend in a master-master setup, but since the problem also occurs with a file containing only phpinfo(); I am sure the database is not the issue.
Snapshots of some configurations:
- PHP FCGI - http://pastebin.com/6cepWbxp
- Apache Virtual Host - http://pastebin.com/FfxhYwSD
- Varnish VCL - http://pastebin.com/tAcuyfLR
- List of all Apache modules running - http://pastebin.com/absHpXm5
I can provide any/all logs needed to debug further, but there is nothing of note in them for the users having this issue: typical access entries from Apache, no errors from PHP.
I have a feeling it might somehow be related to PHP session storage, although I cannot confirm this.
Any insight into the problem is greatly appreciated. Just to reiterate one final time, this issue only affects a very small handful of users. 5-10 have contacted us about it, but I assume the real number is larger, since some people will not have bothered to report it. The 5-10 users who have contacted us span various continents/countries/ISPs.
Maybe this will help you: do you have KeepAlive set to On?
We had the same problem with Varnish and mpm-itk. When mpm-itk received a request for a different vhost than the one the current keep-alive session belonged to, it simply killed the connection. Any normal browser would then reconnect, but Varnish does not do so in its default configuration. With KeepAlive turned off this can no longer happen, which fixed the problem for us.
I know you are not using mpm-itk, but it may be worth a try.
Another idea: can you access the site while bypassing Varnish and try to reproduce the blank page? That way you might find out whether Varnish is the problem.
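One way to run that comparison is a small script that sends the same request once to Varnish and once directly to an Apache backend, setting the Host header by hand so the right virtual host answers. The following is only a rough sketch: the function names, addresses, port and path are hypothetical placeholders, and treating an empty body as the "blank page" symptom is an assumption about how the problem shows up.

```python
import http.client


def classify(status, body):
    """Label a response; an empty body is treated as the blank-page symptom."""
    if len(body.strip()) == 0:
        return "blank"
    if 200 <= status < 400:
        return "ok"
    return "error"


def probe(address, port, path, host_header, timeout=10):
    """Fetch path from address:port, sending host_header so the vhost matches.

    Returns (status, body_length, label).
    """
    conn = http.client.HTTPConnection(address, port, timeout=timeout)
    try:
        # Supplying our own Host header replaces the one http.client would add.
        conn.request("GET", path,
                     headers={"Host": host_header, "Connection": "close"})
        resp = conn.getresponse()
        body = resp.read()
        return resp.status, len(body), classify(resp.status, body)
    finally:
        conn.close()


# Hypothetical usage (addresses and hostname are placeholders):
#   probe("203.0.113.10", 80, "/info.php", "www.example.com")  # via Varnish
#   probe("10.0.0.5", 80, "/info.php", "www.example.com")      # Apache directly
```

If both paths return a blank response, Varnish is probably not the cause. Note that this only helps when run from a client that actually reproduces the problem.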
This turned out to be an issue where PHP was unable to write to the specified log file due to a permissions problem, while display of errors was disabled.
As a result, the PHP error was neither displayed nor logged. The root cause of the blank pages themselves was a strange PHP fatal error.
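For anyone hitting something similar, a quick check of whether the log path is writable can save a lot of time. Below is a minimal sketch, not what was actually used here: the function name and return strings are made up for illustration, and it only reports on the user running the script, so it would need to be run as the same user the PHP FastCGI processes run as.

```python
import os


def check_log_writable(log_path):
    """Return a short diagnosis of whether the current user can write log_path."""
    log_dir = os.path.dirname(os.path.abspath(log_path))
    if os.path.exists(log_path):
        if os.path.isdir(log_path):
            return "path is a directory"
        if os.access(log_path, os.W_OK):
            return "ok"
        return "file not writable"
    # The file does not exist yet, so PHP would have to create it.
    if not os.path.isdir(log_dir):
        return "directory missing"
    if os.access(log_dir, os.W_OK | os.X_OK):
        return "ok (file would be created)"
    return "directory not writable"
```

Calling it with the path configured as error_log in php.ini, for example check_log_writable("/var/log/php/error.log") (a placeholder path), should return "ok" on a healthy server. Anything else means PHP errors may be silently lost while display_errors is off.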