In New Relic, one of the metrics displayed as part of the application response time is "Request Queue":
To collect request queuing time, you need to mark the HTTP request with a timestamp when queuing starts. [1]
This is done by adding an HTTP header in the Apache httpd.conf:
RequestHeader set X-Request-Start "%t"
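For context, mod_headers expands %t to the time the request was received, in microseconds since the epoch, prefixed with "t=". Below is a minimal Python sketch of how an agent on the application side might turn that header into a queuing time. The function names are hypothetical and this is an assumption about the general approach, not New Relic's actual implementation.

```python
import time


def parse_request_start(header_value):
    """Return the timestamp in microseconds from a header like 't=1300000000000000'."""
    value = header_value.strip()
    # mod_headers prefixes the %t value with "t="
    if value.startswith("t="):
        value = value[2:]
    return int(value)


def queue_time_ms(header_value, now_us=None):
    """Milliseconds elapsed between the front-end timestamp and now_us."""
    if now_us is None:
        # Default to the current wall-clock time in microseconds
        now_us = int(time.time() * 1_000_000)
    start_us = parse_request_start(header_value)
    # Clamp at zero so a slightly fast front-end clock cannot yield a negative time
    return max(0, now_us - start_us) / 1000.0
```

With this arithmetic, a header of "t=1000000" read at 1250000 microseconds gives 250 ms, so the figure covers everything between Apache stamping the request and the application reading the header.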
New Relic mentions that:
For the request queuing bucket, a site operator can provision more application instances.
However, we have seen that adding new application instances (i.e. web nodes) doesn't affect the request queuing time: it stays constant, measured at around 250ms.
What factors affect the request queue length and how can it be reduced?
[1] http://support.newrelic.com/help/kb/features/tracking-front-end-time