I'm about to pull my hair out on this one. I have a Rails app hosted on Heroku, using Zerigo Free for DNS. The app is available at:
- smartvark.com (preferred)
- www.smartvark.com (some users insist on typing www, this redirects to remove the www)
- smartvark.heroku.com (not given to users but perhaps useful for comparison in troubleshooting)
Users are intermittently (roughly 1 in 50 requests) experiencing extremely long load times (~2 min). When I try to triage by watching my server logs, their requests don't seem to hit the server until the end of the wait period. Typical load times for the site are fast, around 200-400 ms. I am using New Relic and it isn't indicating any server load issues, although its end-user beacon does pick up the problem and charts the time as "Network".
Using Firebug and Chrome DevTools I am able to see the timeline when this happens on my machine. Both show a long wait before any response, which Firebug classifies as "DNS lookup" and Chrome doesn't seem to classify. After the first response arrives, the rest of the site loads very fast.
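
To check whether name resolution alone accounts for the delay, the lookup can be timed outside the browser. Below is a minimal Ruby sketch using the standard library's `Resolv::DNS` and `Benchmark`. The helper names `time_lookup` and `slow_lookups`, the attempt count, and the 1-second threshold are assumptions made up for illustration, not part of the app. Note that `Resolv::DNS` queries whatever nameserver the machine is configured to use, so a caching resolver in between may hide some slow lookups.

```ruby
require "resolv"
require "benchmark"

# Hostnames from the question; the heroku.com name is there for comparison.
HOSTS = %w[smartvark.com www.smartvark.com smartvark.heroku.com].freeze

# Times a single lookup of `host`.
# Returns [elapsed_seconds, addresses], or [elapsed_seconds, exception] on failure.
# `resolver` is anything that responds to getaddresses(host).
def time_lookup(host, resolver: Resolv::DNS.new)
  result = nil
  seconds = Benchmark.realtime do
    result = begin
      resolver.getaddresses(host).map(&:to_s)
    rescue StandardError => e
      e
    end
  end
  [seconds, result]
end

# Performs `attempts` lookups per host and collects every lookup that took
# longer than `threshold` seconds as [host, elapsed_seconds, result].
def slow_lookups(hosts, attempts:, threshold:, resolver: Resolv::DNS.new)
  slow = []
  hosts.each do |host|
    attempts.times do
      seconds, result = time_lookup(host, resolver: resolver)
      slow << [host, seconds, result] if seconds > threshold
    end
  end
  slow
end
```

Calling `slow_lookups(HOSTS, attempts: 100, threshold: 1.0)` and printing the result would show which of the three names, if any, occasionally resolves slowly. If only the Zerigo-hosted names show up and `smartvark.heroku.com` never does, that would point at the DNS setup rather than at Heroku or the app.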