How can I create a filter to block these with fail2ban?
476 Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)
892 ltx71 - (http://ltx71.com/)
5367 Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, [email protected])
6449 Barkrowler/0.9 (+http://www.exensa.com/crawl)
This list comes from the following command:
sudo cat /var/log/apache2/access.log | awk -F\" '{print $6}' | sort | uniq -c | sort -n
I've tried apache-badbots.conf, but it does not seem to work.
The correct way to deal with annoying bots is to block them in robots.txt, but your comments indicate they are ignoring that directive. Blocking by user agent will ultimately be a cat-and-mouse game, but if you want to do it, the steps are as follows.
First, enable the apache-badbots jail, which reads the Apache access log, if you haven't already. Create the file
/etc/fail2ban/jail.d/apache-badbots.local
containing an [apache-badbots] section with enabled = true. The main portion of the apache-badbots jail is defined in
/etc/fail2ban/jail.conf
so all you have to do is enable it.
Next, modify the apache-badbots filter to include your bots. Edit
/etc/fail2ban/filter.d/apache-badbots.conf
and find the line for custom bots, badbotscustom. The bots are specified using a regular expression. Either replace the existing entries or tack yours on the end, separated with | characters.
Next, modify the failregex line so that the regular expression matches any part of the user agent, not just the whole thing: add .* immediately before and immediately after the group that combines the two bot lists.
Finally, reload the fail2ban configuration.
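The steps above can be sketched as a small shell script. This is only an illustration under assumptions: F2B_ROOT, BOTS, and the function names are made up for this example, the bot names are taken from the user agents in the question, and the exact badbotscustom and failregex lines differ between fail2ban versions, so check your own file before editing it. The ua_matches helper does not touch fail2ban at all; it uses grep -E on a log line to show why the two extra .* matter.

```shell
#!/bin/sh
# Sketch only: F2B_ROOT, BOTS and the function names are illustrative assumptions.
# Point F2B_ROOT at a scratch copy of /etc/fail2ban for a dry run.
F2B_ROOT="${F2B_ROOT:-/etc/fail2ban}"
# Bot names from the question; they must not contain "/" or "&" (used by sed below).
BOTS='BLEXBot|ltx71|DotBot|Barkrowler'

# Step 1: enable the jail with a small override file.
enable_jail() {
    mkdir -p "$F2B_ROOT/jail.d"
    printf '[apache-badbots]\nenabled = true\n' \
        > "$F2B_ROOT/jail.d/apache-badbots.local"
}

# Step 2: tack the extra bots onto the end of the badbotscustom line.
add_custom_bots() {
    conf="$F2B_ROOT/filter.d/apache-badbots.conf"
    tmp=$(mktemp)
    # "&" in the sed replacement stands for the whole matched line.
    sed "s/^badbotscustom *=.*/&|$BOTS/" "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Step 3 (the failregex change) is best made by hand. This helper only
# illustrates its effect with grep -E; a plain group stands in for the
# non-capturing group in the real filter, which ERE does not support.
#   $1 = "whole" (user agent must equal a bot name) or "partial" (extra .*)
#   $2 = one access log line
ua_matches() {
    case "$1" in
        whole)   re="^[^ ]+ -.*\"(GET|POST).*HTTP.*\"($BOTS)\"\$" ;;
        partial) re="^[^ ]+ -.*\"(GET|POST).*HTTP.*\".*($BOTS).*\"\$" ;;
        *) return 2 ;;
    esac
    printf '%s\n' "$2" | grep -Eq "$re"
}

# Step 4: reload fail2ban (needs root, so it only runs when called).
reload_fail2ban() {
    sudo fail2ban-client reload
}
```

Pointing F2B_ROOT at a scratch copy of /etc/fail2ban lets you try enable_jail and add_custom_bots without touching the live configuration. For a log line whose user agent is the BLEXBot string from the question, ua_matches partial succeeds and ua_matches whole fails, because the bot name is only part of the user agent.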
This information may be helpful for reference.
Looking at
/etc/fail2ban/filter.d/apache-badbots.conf
on an up-to-date Ubuntu 16.04 server I have, it looks outdated. In particular, a comment in the file points to this. I generated a new one from the fail2ban git repository, but it still didn't include those bots (maybe the source is outdated or incomplete). If you're curious, you can generate a new one by running the generator script from a checkout of the repository.
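A minimal sketch of regenerating the filter, assuming the generator still lives at files/gen_badbots in the fail2ban repository and is executable (check your checkout, as the layout may change). The function name and its optional work-directory argument are made up for this example, and nothing runs until you call it; it needs git and network access.

```shell
#!/bin/sh
# Sketch only: regenerate_badbots and its argument are illustrative names.
regenerate_badbots() {
    workdir="${1:-$(mktemp -d)}"
    git clone https://github.com/fail2ban/fail2ban.git "$workdir/fail2ban" || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    # The generator is started from the repository root because it is
    # assumed to write its output to a relative path.
    ( cd "$workdir/fail2ban" && ./files/gen_badbots ) || return 1
    # Print where the regenerated filter is expected to be.
    printf '%s\n' "$workdir/fail2ban/config/filter.d/apache-badbots.conf"
}
```

Calling regenerate_badbots with no argument clones into a fresh temporary directory, so the installed configuration under /etc/fail2ban is left alone until you copy the file over yourself.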
The new file will be available at
config/filter.d/apache-badbots.conf
in the cloned repository. If you want to use it, replace
/etc/fail2ban/filter.d/apache-badbots.conf
with it.
For reference, the definition of the apache-badbots jail is in
/etc/fail2ban/jail.conf
The
%(apache_access_log)s
variable in that definition comes from
/etc/fail2ban/paths-debian.conf
and is defined as
/var/log/apache2/*access.log
For reference, the
apache-badbots.conf
that I generated is the generator's unmodified output.
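Related to the log path glob mentioned above, and separate from the generated filter itself, here is a small sketch for checking which files the jail would read and for dry-running a filter against one of them. The function names and default paths are assumptions for this example; fail2ban-regex and fail2ban-client status are the standard fail2ban tools, but nothing here runs until you call a function.

```shell
#!/bin/sh
# Sketch only: function names and default paths are illustrative assumptions.

# Print the files that the *access.log glob matches in a directory.
list_access_logs() {
    dir="${1:-/var/log/apache2}"
    for f in "$dir"/*access.log; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Dry-run a filter against one log file; this reports matches and bans nothing.
dry_run_filter() {
    log="${1:-/var/log/apache2/access.log}"
    filter="${2:-/etc/fail2ban/filter.d/apache-badbots.conf}"
    fail2ban-regex "$log" "$filter"
}

# Show the jail's current state, including the files it watches (needs root).
show_jail_status() {
    sudo fail2ban-client status apache-badbots
}
```

Running dry_run_filter before and after editing the filter is a quick way to see whether the new bot names actually match lines in your log.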