I'm currently using New Relic's performance monitoring service. It works great so far, except that I'd like some things automated. Right now I just get an email notification when performance is bad or my site is down. If the site is down, I'd like to try restarting the application server (killing the process if necessary, which it sometimes is). If that doesn't work after a period of time, I'd like to try rebooting the whole machine. I even paid for PagerDuty, which can parse New Relic email notifications and call or SMS me following an escalation procedure. But it can't run scripts...
Seems like this would be a popular feature of any website monitoring tool... anything good out there?
The problem with hosted monitoring services, such as New Relic, running user-provided scripts is security: unless the script is very well sandboxed, it could do damage to the monitoring service's own systems.
The only way they can really do it securely is to offer a very limited set of possible reactions that can be made safe. The most common one would be something like HTTP callbacks, where the monitoring service makes a POST to a URL of your choice, containing data on what's happened, and you react to it however you need. The downside, of course, is that you need yet another service running in your infrastructure that receives these events and acts on them.
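To make the HTTP callback idea concrete, here is a minimal sketch of the receiving side using only the Python standard library. Everything specific in it is an assumption for illustration: the JSON field `status` with the value `down`, the port, and the `myapp.service` unit name are hypothetical, and a real monitoring service will send its own payload format.

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical command; substitute whatever restarts your application server.
RESTART_CMD = ["systemctl", "restart", "myapp.service"]


def choose_command(event):
    """Map a decoded alert payload to a command to run, or None to do nothing."""
    # Assumption: the service sends {"status": "down"}; real payloads differ.
    if event.get("status") == "down":
        return RESTART_CMD
    return None


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = json.loads(self.rfile.read(length) or b"{}")
        except ValueError:
            # Body was not valid JSON.
            self.send_response(400)
            self.end_headers()
            return
        command = choose_command(event) if isinstance(event, dict) else None
        if command is not None:
            # Start the command without waiting, so the caller gets a prompt reply.
            subprocess.Popen(command)
        self.send_response(204)
        self.end_headers()


if __name__ == "__main__":
    # Bound to localhost here; add authentication before exposing it publicly.
    HTTPServer(("127.0.0.1", 8080), AlertHandler).serve_forever()
```

Keeping the decision in `choose_command` separate from the HTTP handling makes it easy to test and to extend with further reactions. Since anyone who can reach the URL can trigger a restart, you would want some form of shared secret or source filtering in front of it.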
I can't find anything in a quick Google search for New Relic that would cover this sort of thing; it's entirely possible that they don't handle it, and e-mail/SMS notifications are the best you're going to get without going with another monitoring service.
It's for these sorts of reasons that I prefer to run my own monitoring infrastructure -- setups like New Relic might be useful for the specialist expertise they can offer in monitoring, say, Rails application performance, but for managing the infrastructure itself, I keep it in-house.
Well, Nagios check scripts usually only report status back to Nagios, but nothing would prevent them from doing more when they have to report a WARNING or CRITICAL.
Edit: Technically, this works and is easy to do, but it might have unforeseen consequences. A better solution is to configure Nagios to react to the problem using its event handler mechanism.
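As a rough sketch of what such an event handler could look like, the script below implements the escalation from the question: restart the application server while Nagios is still retrying, and reboot once the problem becomes a hard failure. It assumes Nagios passes the `$SERVICESTATE$`, `$SERVICESTATETYPE$` and `$SERVICEATTEMPT$` macros as the three arguments; the commands and the `myapp.service` name are hypothetical, and the thresholds are just one possible policy.

```python
import subprocess
import sys

# Hypothetical commands; replace with whatever suits your host.
RESTART_CMD = ["systemctl", "restart", "myapp.service"]
REBOOT_CMD = ["systemctl", "reboot"]


def plan_action(state, state_type, attempt):
    """Return "restart", "reboot", or "none" for one Nagios state change."""
    if state != "CRITICAL":
        return "none"
    if state_type == "HARD":
        # Retries are exhausted, so the earlier restart did not help.
        return "reboot"
    if state_type == "SOFT" and attempt >= 2:
        # Still failing on a retry: try restarting the application server.
        return "restart"
    return "none"


def main(argv):
    # Expected arguments: service state, state type, current check attempt.
    state, state_type, attempt = argv[1], argv[2], int(argv[3])
    action = plan_action(state, state_type, attempt)
    if action == "restart":
        subprocess.call(RESTART_CMD)
    elif action == "reboot":
        subprocess.call(REBOOT_CMD)
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

You would hook it up with a command definition that passes the three macros and reference that command in the service's `event_handler` directive. The Nagios user also needs permission to run the restart and reboot commands, for example through a narrowly scoped sudo rule.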
If a hosted solution is ok, AlertFox can run scripts ("macros") on error. These macros could, for example, log into your web host's configuration panel and trigger a reboot.
If you want a lightweight solution, you can use monit.
Moreover, it can be integrated with Nagios later on if you want.
SeaLion can run any command-line tool or script, which gives you a lot of flexibility. For example, you can write your own bash scripts and have GNU Mailutils send email notifications. It also has alerts for the most commonly used metrics, such as CPU, memory, and load average.