I'm trying to detect bad bot activity. If information.php, which is disallowed in robots.txt but open to normal users, has been requested more than 500 times in a day by the same IP address, I'd say it's a safe bet to mark that user as a bot.
Also, if I get more than 100 bad login requests from an IP address, I'd like to mark that as bot activity. (Although I think I'd be better off presenting the user with a captcha after one bad login attempt, an IP-based solution will be a bit more difficult to implement than a session-cookie-based solution, IMO.)
And finally, I'd like to block the IPs that have been marked for bot activity. How do I go about doing all this?
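For reference, this is roughly the kind of counting I have in mind. It's only a sketch against a standard Apache combined access log; the log path and page name are just from my own setup:

    # rough sketch: count hits to /information.php per client IP in the access log
    # ($1 = client IP, $7 = request path in the combined log format),
    # then print any IP over the 500-request threshold
    awk '$7 == "/information.php" { hits[$1]++ }
         END { for (ip in hits) if (hits[ip] > 500) print ip }' /var/log/apache2/access.log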
You should have a look at fail2ban, which parses log files (Apache, SSH, and so on) and takes action against suspicious users (blocking them for a while via iptables).
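As a rough illustration of what that looks like in practice (the jail name, filter name, log path, and thresholds below are placeholders I made up for this example, not something fail2ban ships with), a jail for the information.php case might look like this:

    # /etc/fail2ban/jail.local (rough sketch; names, paths, and thresholds are placeholders)
    [information-php]
    enabled  = true
    port     = http,https
    # expects a matching filter in /etc/fail2ban/filter.d/information-php.conf
    filter   = information-php
    logpath  = /var/log/apache2/access.log
    # ban an IP (via the default iptables action) after 500 matches within one day
    maxretry = 500
    findtime = 86400
    bantime  = 86400

A second jail with a lower maxretry, pointed at your application's login-failure log, would cover the bad-login case the same way.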
What you can opt for is a host-based intrusion detection system (HIDS), which parses logs and automatically blocks the offending host for a certain period. Examples are Fail2ban and OSSEC.
Furthermore, OSSEC features a server-client model, so you can have a centralized setup that allows for better manageability should you have more than one web server.
I had to implement the same solution a few days ago, and fail2ban is the right tool for this task. There is a good how-to: Fail2ban protect web server http DOS attack
Just modify the failregex for your case.
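For the information.php case from the question, the filter could look roughly like this (only a sketch; it assumes the standard combined log format, and the file name is just an example):

    # /etc/fail2ban/filter.d/information-php.conf (rough sketch)
    [Definition]
    # match any request for /information.php; <HOST> captures the client IP
    failregex = ^<HOST> .*"(GET|POST|HEAD) /information\.php
    ignoreregex =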
It sounds like you want mod_security. mod_security lets you use or write powerful rules to detect and react to activity on your web server. It is immensely powerful and should suit most of your needs without too much trouble.
It is also widely used, so getting help should be easy.
http://www.modsecurity.org/projects/modsecurity/
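To give a feel for the rule language, here is only a sketch in ModSecurity 2.x syntax; the counter name, rule ids, and thresholds are placeholders, and it assumes persistent collections are enabled (SecDataDir is set):

    # initialise a per-IP collection for every request
    SecAction "id:1000,phase:1,nolog,pass,initcol:ip=%{REMOTE_ADDR}"
    # count hits to /information.php and let the counter expire after a day
    SecRule REQUEST_URI "@beginsWith /information.php" \
        "id:1001,phase:1,nolog,pass,setvar:ip.info_hits=+1,expirevar:ip.info_hits=86400"
    # once an IP goes over the limit, start denying its requests
    SecRule IP:INFO_HITS "@gt 500" \
        "id:1002,phase:1,deny,status:403,log,msg:'Possible bot hammering information.php'"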