I have a Drupal website with many modules (don't ask the number). For six months the website has been stable, but recently the servers began to seize up. Generally MySQL reaches the maximum number of concurrent connections (1000) and the website crashes.
I want to find out which pages on the site are being visited, or which cron or drush processes are running, when the site goes down.
What is the best strategy for finding out this information?
Do I parse the apache logs, and see what web pages were visited, then proceed to benchmark the last 100 pages on the log and see how much memory they are consuming, for example?
Or is there a more accurate way of saying "this particular page or process brought down your site"?
I know that there's the PHP log, the Apache log, the MySQL log, and the top command, but together they seem like too much information, none of it conclusive.
I don't work with Drupal or MySQL, but you seem to have identified all the parts you need to look at to start solving a problem like this.
Since the DB is the point of failure (just an assumption), I would suggest working backwards: MySQL > PHP > Apache > OS > Network. At every layer, note the time of the failure and the error logged at that moment, then go back in time a little bit to see what led up to it. Does your hosting service provide network logs/stats? See if you can get that data as well.
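For the Apache layer, "going back in time a little bit" could look something like the rough Python sketch below. It counts requests per path in the minutes before a known failure time. I'm assuming the access log uses the standard common/combined format; the function name requests_before, the 10-minute window, and the command-line arguments are all just made up for illustration, so adjust them to your setup.

```python
import re
import sys
from collections import Counter
from datetime import datetime, timedelta

# Assumes Apache common/combined log format:
# host ident user [time] "METHOD path protocol" status ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3})')
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def requests_before(lines, failure_time, window_minutes=10):
    """Count requests per path in the window leading up to failure_time.

    failure_time must be timezone-aware, because the log timestamps are.
    """
    start = failure_time - timedelta(minutes=window_minutes)
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        when = datetime.strptime(match.group(2), TIME_FORMAT)
        if start <= when <= failure_time:
            counts[match.group(4)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage:
    #   python busy_pages.py access.log "12/Mar/2013:10:10:00 +0000"
    log_path, failure_text = sys.argv[1], sys.argv[2]
    failure = datetime.strptime(failure_text, TIME_FORMAT)
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for path, hits in requests_before(handle, failure).most_common(20):
            print(f"{hits:6d}  {path}")
```

The failure time would come from the MySQL side (the moment the connection limit was hit). Paths that dominate the output are candidates to investigate, not proof of the cause. A cron run started from the command line or via drush will not show up in the Apache log at all.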
Also, have you heard of New Relic? They have a free version of their diagnostic tool: check out http://newrelic.com. They seem to be running a 7-day promo for their "Gold" release, which might help.
Good luck!
KM