I'd like to block some spiders and bad bots by User-Agent string for all of my virtual hosts via httpd.conf, but so far without success. Below are the relevant contents of my httpd.conf file. Any ideas why this isn't working? env_module is loaded.
SetEnvIfNoCase User-Agent "^BaiDuSpider" UnwantedRobot
SetEnvIfNoCase User-Agent "^Yandex" UnwantedRobot
SetEnvIfNoCase User-Agent "^Exabot" UnwantedRobot
SetEnvIfNoCase User-Agent "^Cityreview" UnwantedRobot
SetEnvIfNoCase User-Agent "^Dotbot" UnwantedRobot
SetEnvIfNoCase User-Agent "^Sogou" UnwantedRobot
SetEnvIfNoCase User-Agent "^Sosospider" UnwantedRobot
SetEnvIfNoCase User-Agent "^Twiceler" UnwantedRobot
SetEnvIfNoCase User-Agent "^Java" UnwantedRobot
SetEnvIfNoCase User-Agent "^YandexBot" UnwantedRobot
SetEnvIfNoCase User-Agent "^bot*" UnwantedRobot
SetEnvIfNoCase User-Agent "^spider" UnwantedRobot
SetEnvIfNoCase User-Agent "^crawl" UnwantedRobot
SetEnvIfNoCase User-Agent "^NG\ 1.x (Exalead)" UnwantedRobot
SetEnvIfNoCase User-Agent "^MJ12bot" UnwantedRobot
<Directory "/var/www/">
Order Allow,Deny
Allow from all
Deny from env=UnwantedRobot
</Directory>
<Directory "/srv/www/">
Order Allow,Deny
Allow from all
Deny from env=UnwantedRobot
</Directory>
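If it helps with debugging, I can add something like this to one of the vhosts to confirm whether the variable is even being set (assuming mod_log_config is available; the log path and format nickname are just placeholders):

LogFormat "%h %>s \"%{User-Agent}i\" unwanted=%{UnwantedRobot}e" botdebug
CustomLog "/srv/www/domain.com/logs/botdebug_log" botdebug

If unwanted= logs a 1 for the bot requests, the SetEnvIfNoCase patterns are matching and the problem is on the Allow/Deny side; if it stays -, the patterns never match in the first place.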
EDIT - @Shane Madden: I do have .htaccess files in each virtual host's document root with the following:
order allow,deny
deny from xxx.xxx.xxx.xxx
deny from xx.xxx.xx.xx
deny from xx.xxx.xx.xxx
...
allow from all
Could that be creating a conflict?

Sample VirtualHost config:
<VirtualHost xx.xxx.xx.xxx:80>
    ServerAdmin [email protected]
    ServerName domain.com
    ServerAlias www.domain.com
    DocumentRoot /srv/www/domain.com/public_html/
    ErrorLog "|/usr/bin/cronolog /srv/www/domain.com/logs/error_log_%Y-%m"
    CustomLog "|/usr/bin/cronolog /srv/www/domain.com/logs/access_log_%Y-%m" combined
</VirtualHost>
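In case the testing method matters, here's the kind of request I'd use to check whether the block takes effect (the hostname and User-Agent string are just examples):

curl -I -A "Sosospider" http://www.domain.com/

A 403 on that request would mean the Deny is kicking in.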