I want to reduce, or mitigate the effects of, malicious layer 7 traffic (targeted attacks, generic evil automated crawling) that reaches my backend and makes it very slow or even unavailable. This concerns load-based attacks as described in https://serverfault.com/a/531942/1816
Assume that:
- I use a not very fast backend/CMS (e.g ~1500ms TTFB for every dynamically generated page). Optimizing this is not possible, or simply very expensive in terms of effort.
- I've fully scaled up, i.e I'm on the fastest H/W possible.
- I cannot scale out, i.e the CMS does not support master-to-master replication, so it's only served by a single node.
- I use a CDN in front of the backend, powerful enough to handle any traffic, which caches responses for a long time (e.g 10 days). Cached responses (hits) are fast and do not touch my backend again. Misses will obviously reach my backend.
- The IP of my backend is unknown to attackers/bots.
- Some use cases, e.g POST requests or logged in users (small fraction of total site usage), are set to bypass the CDN's cache so they always end up hitting the backend.
- Changing anything on the URL in a way that makes it new/unique to the CDN (e.g addition of a `&_foo=1247895239`) will always end up hitting the backend.
- An attacker who has studied the system first will very easily find very slow use cases (e.g paginated pages down to the 10,000th result) which they'll be able to abuse together with the random parameters of #7 to bring the backend to its knees.
- I cannot predict all known and valid URLs and legit parameters of my backend at a given time in order to somehow whitelist requests or sanitize the URL on the CDN and reduce unnecessary requests from reaching the backend. e.g `/search?q=whatever` and `/search?foo=bar&q=whatever` will 100% produce the same result because `foo=bar` is not something that my backend uses, but I cannot sanitize that on the CDN level.
- Some attacks are from a single IP, others are from many IPs (e.g 2000 or more) which cannot be guessed or easily filtered out via IP ranges.
- The CDN provider and the backend host provider both offer some sort of DDoS protection feature, but the attacks that can bring my backend down are very small (e.g only 10 requests per second) and are never classified as DDoS attacks by these providers.
- I do have monitoring in place and instantly get notified when the backend is stressed, but I don't want to be manually banning IPs because this is not viable (I may be sleeping, working on something else, on vacation or the attack may be from many different IPs).
- I am hesitant to introduce a per-IP limit on connections or requests per second on the backend (see the sketch after this list for the kind of rule I mean), since I will, at some point, end up denying access to legit users. e.g imagine a presentation/workshop about my service taking place in a university or large company, where tens or hundreds of browsers will use the service almost simultaneously from a single IP address. If these are logged in, they'll always reach my backend and not be served by the CDN. Another case is public sector users all accessing the service from a very limited number of IP addresses (provided by the government). So this would deny access to legit users, and it would not help at all against attacks from many IPs, each of which only makes a couple of requests.
- I do not want to permanently blacklist large IP ranges of whole countries which are sometimes the origins of attacks (e.g China, Eastern Europe), because this is unfair and wrong, it would deny access to legit users from those areas, and it would do nothing against attacks coming from elsewhere.
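For reference, the kind of per-IP limit I'm hesitant about would look roughly like the following (nginx shown purely as an example; the zone name, rate and backend address are made up):

```
# Inside the http {} block: allow roughly 5 requests/second per client IP,
# with a small burst allowance. This is exactly the sort of rule that would
# lock out a classroom or an office building sharing a single IP.
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=5r/s;

server {
    listen 80;

    location / {
        limit_req zone=per_ip burst=20 nodelay;
        proxy_pass http://127.0.0.1:8080;   # placeholder for the real backend
    }
}
```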
So, what can I do to handle this situation? Is there a solution that I've not taken into consideration in my assumptions that could help?
I live in a similar environment (I don't manage it directly, but I work with the team that does) and we've found two solutions that work well together. In our case we host the application ourselves, so we have full control over traffic flow, but the idea remains the same.
Some of your constraints are quite hard, and I'd argue contradictory, but I think they can be worked around. I'm not sure what your CDN is, but I presume it's a black box that you don't really control.
I would suggest setting up another (caching) layer in front of your application to control and modify traffic; we use Varnish for this, mostly for caching but also for mitigating malicious traffic. It can be quite small and doesn't have to cache for as long as the CDN, since it should only see very little traffic.
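To give an idea, here is a minimal VCL 4.0 sketch along those lines (not our exact setup; the backend address, TTLs and timeouts are placeholders you'd tune to your own site):

```
vcl 4.0;

import std;

backend origin {
    .host = "127.0.0.1";          # placeholder: your real backend
    .port = "8080";
    .first_byte_timeout = 10s;    # headroom for a ~1500ms-TTFB backend
}

sub vcl_recv {
    # Sort query parameters so ?a=1&b=2 and ?b=2&a=1 become one cache object.
    set req.url = std.querysort(req.url);
}

sub vcl_backend_response {
    # Cache for a while and keep stale copies around ("grace") so that a slow
    # or overloaded backend mostly results in slightly stale pages, not errors.
    set beresp.ttl = 10m;
    set beresp.grace = 24h;
}
```

Two useful side effects: for cacheable content Varnish collapses concurrent misses for the same URL into a single backend request, and per-client rate limiting can be bolted on later (e.g with the vsthrottle vmod from varnish-modules) if you decide you need it.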
I feel your pain - but you are facing an impossible task. You can't have your cake and eat it.
Normally I would suggest fail2ban as the tool to address this (if webserver rate limiting is not an option). However, not only do you explicitly say that you can't support even a temporary ban, but since your traffic comes via a CDN you would also need to build a lot of functionality to report the offending address and to apply the block at the CDN rather than on your own firewall. A hypothetical sketch of the "normal" setup follows, mostly to show what you'd be giving up.
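Purely for illustration, that "normal" fail2ban approach would be something like this (hypothetical jail and filter; paths and thresholds are made up). Behind a CDN the source address in your access log is the CDN edge, not the attacker, which is exactly why this doesn't fit here without extra plumbing:

```
# /etc/fail2ban/jail.local (hypothetical): ban any IP that makes more than
# 60 requests within 60 seconds, for 10 minutes.
[http-flood]
enabled  = true
port     = http,https
filter   = http-flood
logpath  = /var/log/nginx/access.log
maxretry = 60
findtime = 60
bantime  = 600

# /etc/fail2ban/filter.d/http-flood.conf (hypothetical): treat every request
# line as a "failure", capturing the client IP as <HOST>.
[Definition]
failregex = ^<HOST> -
ignoreregex =
```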
You only have two courses of action left that I can see:
1) flatten the site into HTML files and serve them as static content (a rough sketch follows below)
2) get a job somewhere else
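For option 1, a crude sketch of the flattening step (the hostname and target directory are placeholders, and anything genuinely dynamic such as search, logins or POSTs is lost by definition):

```
# Crawl the site into a static mirror that any web server (or the CDN itself)
# can serve. --convert-links rewrites internal links so the copy works on its
# own; --adjust-extension appends .html where needed; --wait is just politeness
# towards your own slow backend.
wget --mirror \
     --convert-links \
     --adjust-extension \
     --page-requisites \
     --no-parent \
     --wait=1 \
     --directory-prefix=/var/www/static-mirror \
     https://www.example.com/
```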