Because of an amateur DDoS attack on my website, I had to deny some traffic with .htaccess, which worked fine.
Unfortunately, it also blocks Googlebot/Bingbot:
# with Order Allow,Deny a matching Deny always beats a matching Allow
Order Allow,Deny
Allow from all
Deny from 54.
# flag requests without a referrer and Wget clients
SetEnvIfNoCase Referer "^$" bad_user
SetEnvIfNoCase User-Agent "^Wget" bad_user
Deny from env=bad_user
It simply blocks all traffic from 54.x.x.x
(the only traffic I get from that range is from infected Amazon cloud machines - I know I could deny just the ~30 IP ranges belonging to the Amazon cloud instead of the whole 54.x.x.x,
but I needed a fast solution).
The rest of the bots (most of them from China, Taiwan and so on) don't send a referrer at all, so:
SetEnvIfNoCase Referer "^$" bad_user
blocks them all.
But it also has side effects:
- When somebody visits my page from a bookmark or types the address directly into the browser (e.g. because they read it on a business card), they won't see my website.
- Googlebot and Bingbot (as well as other, less important bots) usually don't send a referrer either.
#1 is an inconvenience, but #2 is a real problem I have to solve quickly.
I've found that the bots important to me identify themselves with these user-agent strings:
66.249.64.119 - - [...] "GET /robots.txt HTTP/1.1" 403 534 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.64.119 - - [...] "GET /programowanie/ HTTP/1.1" 403 537 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.64.115 - - [...] "GET /3d-graphic/ HTTP/1.1" 403 535 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
207.46.13.4 - - [...] "GET /robots.txt HTTP/1.1" 403 534 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.13.4 - - [...] "GET / HTTP/1.1" 403 524 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
Is it possible in .htaccess
to somehow combine my rules with an overriding exception like "but if the user agent contains "Googlebot" or "bingbot", let it through", even if the request has no referrer?
If not, maybe I can add something to robots.txt
to tell Google/Bing that they should send a referrer with their requests (though I doubt they would take it into account)?
I have found some solution for #2: switch to
Order Deny,Allow
and add explicit Allow rules for the search-engine bots. Thanks to that order, the rules work this way:
- block all traffic from 54.x.x.x,
- also block all traffic without a referrer,
- but let through any request whose user agent contains http://www.bing.com/bingbot.htm or http://www.google.com/bot.html.
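Something along these lines should express it (the good_bot variable name and the exact user-agent patterns are just the ones I picked as an example; any names/patterns will do):
SetEnvIfNoCase Referer "^$" bad_user
SetEnvIfNoCase User-Agent "^Wget" bad_user
# mark Googlebot/Bingbot by the URL they carry in their user-agent string
SetEnvIfNoCase User-Agent "www\.google\.com/bot\.html" good_bot
SetEnvIfNoCase User-Agent "www\.bing\.com/bingbot\.htm" good_bot
Order Deny,Allow
Deny from 54.
Deny from env=bad_user
# with Order Deny,Allow, a request matching both Deny and Allow is allowed
Allow from env=good_bot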
Anyway, I will wait for other answers, because I'm not sure this is an optimal solution for #2. And I still did not manage to solve #1.
So if you just want to block bots that send no referrer while still letting Googlebot and Bingbot through, you can use my .htaccess code without the
deny from 54.
and
SetEnvIfNoCase User-Agent "^Wget" bad_user
lines, which are specific to my case (the DDoS).
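For that general case the rules reduce to something like this (again, good_bot is only an example name for the environment variable):
SetEnvIfNoCase Referer "^$" bad_user
SetEnvIfNoCase User-Agent "www\.google\.com/bot\.html" good_bot
SetEnvIfNoCase User-Agent "www\.bing\.com/bingbot\.htm" good_bot
Order Deny,Allow
Deny from env=bad_user
Allow from env=good_bot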