For my staging server, I'm trying to figure out a way to block search engine bots entirely at the server level, rather than with an individual .htaccess or robots.txt file for each site. The idea is that it's out of sight and out of mind when creating a new site on the staging server. Is there a way to detect the bot's user agent using an Apache module and block that connection on a server-wide level?
Thanks!
My suggestion would be to block everything OTHER than a known-good testing agent string. That way you also block bots you've never heard of. You could also block everything except a known-good set of IPs by checking %{REMOTE_ADDR}.
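
As a rough sketch of that allowlist idea, something like the following could go in the main server config (httpd.conf or an included file), outside any virtual host. It assumes Apache 2.4 and uses mod_authz_core's Require directives instead of mod_rewrite conditions, which is a substitution on my part. The agent string "StagingTester/1.0" and the 203.0.113.0/24 range are made-up placeholders, so swap in your own values.

```apache
# Assumes Apache 2.4 (mod_authz_core and mod_authz_host loaded).
# Placed in the main server config so it covers every site on the box.
<Location "/">
    <RequireAny>
        # Placeholder range (reserved for documentation); use your real office/VPN IPs.
        Require ip 203.0.113.0/24

        # Hypothetical testing agent string; set your browser or test tool to send it.
        Require expr "%{HTTP_USER_AGENT} == 'StagingTester/1.0'"
    </RequireAny>
</Location>
```

Any request that matches neither rule should get a 403, which covers crawlers you have never heard of. Keep in mind that a user agent string is trivial to spoof, so the IP rule is the stronger of the two, and a virtual host that sets its own authorization rules in a <Location> block may override this one.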