I am looking for a way to flexibly manage outbound HTTP/HTTPS traffic in a way that respects site policies and can be deployed at the "edge" of our datacenter network.
For example, we use several Web APIs that have throttling limits like "no more than 4 requests per second" or "max 50K requests per day", etc. Many people at the company use services like these, so I cannot centrally manage all requests in software. People run these things on different schedules and at different intensities. We are fine with that (it meets internal needs), but we realize that, in aggregate, we may generate so much concurrent traffic that we get blocked by a site, however unintentionally.
What I am hoping is that we can leverage bandwidth management / traffic shaping solutions that already exist in the networking hardware world and deploy such a device at the edge of our datacenter network.
Ideally, I could then write L4 or L7 routing rules that ensure that no more than, for example, 4 req/sec outbound are generated by our datacenter. The rest of the requests would, again ideally, be queued by the hardware for some reasonable length of time, with anything beyond queue capacity simply being refused. I realize there's no free lunch and that throttling will not solve the underlying demand (requests) vs. supply (site policies) problem. However, it would allow us to "smooth out" requests over some window, say a day, so that we could use an external service in a properly restrained manner, yet maximize our use of it.
Does anyone know of a network-level bandwidth management solution like this? If so, would it also support rules based not only on something like the URL in an HTTP request, but also on additional HTTP headers?
The capabilities of netfilter are almost boundless. In this case I'd use the limit module in iptables. Be aware: there is no way of limiting rates in TCP/IP without dropping packets. You can queue the packets up, but eventually, when the queue is full, packets get dropped. So we are going to drop SYN packets. I haven't tried this myself; the very long retry timeouts are probably why nobody does it this way, e.g. a browser can appear locked up while it waits for the retransmitted SYN to get through.
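A minimal sketch of the idea (the chain name, port and 4/second rate are placeholders picked to match the question):

```
# Send new connections (SYN packets) through a secondary chain that accepts
# at most 4 per second and drops the rest.
iptables -N LIMIT_API
iptables -A INPUT -p tcp --syn --dport 80 -j LIMIT_API
iptables -A LIMIT_API -m limit --limit 4/second --limit-burst 4 -j ACCEPT
iptables -A LIMIT_API -j DROP
```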
I used a secondary chain above just to show the concept. You can also do this on a router; in that case you have to create a chain per server or entity you want to limit, and use FORWARD instead of INPUT.
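For example (hypothetical: 203.0.113.10 stands in for one external API endpoint):

```
# Per-destination chain on the router: limit new outbound connections to one service.
iptables -N LIMIT_SVC1
iptables -A FORWARD -p tcp --syn -d 203.0.113.10 --dport 443 -j LIMIT_SVC1
iptables -A LIMIT_SVC1 -m limit --limit 4/second --limit-burst 4 -j ACCEPT
iptables -A LIMIT_SVC1 -j DROP
```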
Queuing
In this solution there is no long-term queuing. You can play with the limit and limit-burst parameters. It would also be possible to send the SYN packets through a queueing discipline, but the setup is much more complex and I can't see how it makes things better with regard to dropping SYN packets.
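Just to illustrate what that would involve (a sketch only; the interface eth0, the mark value and the rates are assumptions):

```
# Mark SYN packets towards the service, then steer marked traffic into a slow HTB class.
iptables -t mangle -A FORWARD -p tcp --syn -d 203.0.113.10 -j MARK --set-mark 10
tc qdisc add dev eth0 root handle 1: htb default 20
tc class add dev eth0 parent 1: classid 1:20 htb rate 100mbit   # everything else
tc class add dev eth0 parent 1: classid 1:10 htb rate 32kbit    # marked SYNs
tc filter add dev eth0 parent 1: protocol ip handle 10 fw flowid 1:10
```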
URL matching
URL matching is also possible; in that case you drop the matching packet and delay the connection until TCP retransmits it. I have done such things with the recent module, BUT I used it to prevent brute-force attacks and port scanning, so I didn't care about the connections I was limiting. Proper handling of connections will get difficult!
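As a rough illustration of the URL-matching idea (this sketch uses the string and limit matches rather than recent; it only works for plain HTTP, and the path, port and rate are placeholders):

```
# Pass at most 4 packets/second whose payload contains this request line;
# drop the rest, which stalls those connections until TCP retransmits.
iptables -A FORWARD -p tcp --dport 80 -m string --string "GET /v1/example" --algo bm \
  -m limit --limit 4/second -j ACCEPT
iptables -A FORWARD -p tcp --dport 80 -m string --string "GET /v1/example" --algo bm -j DROP
```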