Suppose an office of people wants to limit HTTP downloads to a maximum of 40% of their internet connection's bandwidth, so that downloads don't block other traffic.
We say "it's not supported in your firewall", and they say the inevitable line "we used to be able to do it with our Netgear/DLink/DrayTek".
Thinking about it, a download is like this:
HTTP GET request
Server sends file data as TCP packets
Client acknowledges receipt of TCP packets
Repeat until download finished.
The speed is determined by how fast the server sends data to you, and how fast you acknowledge it.
So, to limit download speed, you have two choices:
1) Instruct the server to send data to you more slowly - and I don't think there's any protocol feature to request that in TCP or HTTP.
2) Acknowledge packets more slowly by limiting your upload speed, which also ruins your upload speed for everything else.
How do devices do this limiting? Is there a standard way?
TCP itself implements congestion control.
These rate limiters simply throw away any packets that exceed the limit. TCP handles this, ensuring that all the data still arrives, and arrives in order: the client doesn't ACK the dropped packets, so the server resends them.
The server's TCP stack will resend the packets, and it will also dial back a bit on its send rate because it figures there's congestion between it and the client. It'll speed back up until the rate limiter drops packets again, and so on.
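To make that back-off-and-speed-up cycle concrete, here is a rough Python sketch. It is a toy model, not real TCP: the function name `simulate_aimd`, the additive step, the halving factor and the 400 kbps limit are all assumptions picked for illustration, and real congestion control algorithms are considerably more involved.

```python
def simulate_aimd(limit_kbps, rounds, start_kbps=10.0,
                  increase_kbps=10.0, backoff=0.5):
    """Return the sender's rate for each round trip.

    Toy model (assumption): the limiter drops packets whenever the
    sender exceeds limit_kbps, and the sender reacts one round later.
    """
    rate = start_kbps
    history = []
    for _ in range(rounds):
        history.append(rate)
        if rate > limit_kbps:
            # Packets were dropped: the sender infers congestion
            # and cuts its rate multiplicatively.
            rate = rate * backoff
        else:
            # No loss seen: the sender probes for more bandwidth.
            rate = rate + increase_kbps
    return history


rates = simulate_aimd(limit_kbps=400.0, rounds=200)
steady = rates[100:]                  # skip the initial ramp-up
average = sum(steady) / len(steady)   # long-run average send rate
```

In this model the rate saws up and down just around the limit, so the long-run average ends up below the configured cap without the server ever being told what the cap is.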
The best description I've ever heard that made sense of TCP's inherent throttling method was off a recent Security Now podcast. To quote Steve Gibson:
3) Your router / firewall device puts incoming data into a QoS bucket and only empties that bucket at the rate you requested. The incoming stream adapts to that speed, because the computers inside only receive (and therefore only acknowledge) data at that rate. Also, the occasional (purposefully) dropped packet works really well for slowing down a connection.
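As a sketch of what such a bucket might look like, here is a small token-bucket shaper in Python. Everything in it is hypothetical (the `TokenBucketShaper` class, the 15 bytes/ms rate, the 20-packet queue); actual routers do this in the kernel or in hardware, and often with more elaborate queueing disciplines.

```python
from collections import deque


class TokenBucketShaper:
    """Queue packets and release them at a fixed rate (times in integer ms)."""

    def __init__(self, rate_bytes_per_ms, burst_bytes, queue_limit):
        self.rate = rate_bytes_per_ms
        self.burst = burst_bytes
        self.tokens = burst_bytes      # start with a full bucket
        self.queue = deque()
        self.queue_limit = queue_limit
        self.dropped = 0
        self.last_ms = 0

    def enqueue(self, size):
        # A full queue means the packet is dropped; the sender's TCP
        # stack will notice the loss and slow down.
        if len(self.queue) >= self.queue_limit:
            self.dropped += 1
            return False
        self.queue.append(size)
        return True

    def dequeue(self, now_ms):
        # Refill tokens for the elapsed time, capped at the burst size.
        elapsed = now_ms - self.last_ms
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_ms = now_ms
        sent = []
        # Forward queued packets only while there are tokens to pay for them.
        while self.queue and self.queue[0] <= self.tokens:
            size = self.queue.popleft()
            self.tokens -= size
            sent.append(size)
        return sent


# One simulated second: 3 x 1500-byte packets arrive every 10 ms,
# far more than the 15 bytes/ms the shaper is allowed to forward.
shaper = TokenBucketShaper(rate_bytes_per_ms=15, burst_bytes=3000, queue_limit=20)
forwarded = 0
for now_ms in range(0, 1000, 10):
    for _ in range(3):
        shaper.enqueue(1500)
    forwarded += sum(shaper.dequeue(now_ms))
```

However fast packets arrive, the bytes forwarded stay within the burst allowance plus rate times elapsed time, and the overflow shows up as drops.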
When trying to find a device that handles this, look for QoS (Quality of Service) in the configuration / documentation. Linux (or BSD) boxes are also handy for this.
You use a firewall or device that supports QoS (quality of service) limiting.
You could build a Linux system to act as the office gateway and have it use traffic shaping to achieve this. It just needs multiple NICs installed, and then every machine points to it as its gateway.
As a bonus, you could configure a proxy server on it to help ease traffic too. Something like Squid. There may be turnkey routing appliance distributions that can do this too.
The HTTP protocol doesn't provide facilities to limit the used bandwidth, and even if it did, that would be a client-side setting, on which network administrators couldn't have any control.
Bandwidth limiting (one of the features usually grouped under "Quality of Service") is usually managed on routers/firewalls, which handle all the incoming and outgoing traffic to/from a network; the ones that support this usually let you configure policies such as "let any single client computer use at most 10% of all the available bandwidth", or "give SMTP priority over FTP so that emails can flow even when someone is doing a heavy download".
How exactly this is accomplished depends on the router/firewall used, but the most basic way is to just throw away packets that exceed the configured limits; TCP will ensure they get retransmitted, and will eventually be able to pass through the bottleneck.
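For the "give SMTP priority over FTP" kind of policy, one simple approach is a strict-priority queue. The Python sketch below is only an illustration: the `PRIORITY` table and the `PriorityScheduler` class are made up here, and real devices typically classify traffic by port or packet marking and combine priorities with rate limits so that low-priority traffic isn't starved.

```python
import heapq
from itertools import count

# Hypothetical priority table: a lower number is served first.
PRIORITY = {"smtp": 0, "http": 1, "ftp": 2}


class PriorityScheduler:
    """Always send the highest-priority waiting packet next."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker: FIFO within one priority class

    def enqueue(self, protocol, packet):
        entry = (PRIORITY[protocol], next(self._order), protocol, packet)
        heapq.heappush(self._heap, entry)

    def dequeue(self):
        _, _, protocol, packet = heapq.heappop(self._heap)
        return protocol, packet


sched = PriorityScheduler()
arrivals = [("ftp", "f1"), ("ftp", "f2"), ("smtp", "s1"),
            ("ftp", "f3"), ("smtp", "s2")]
for protocol, packet in arrivals:
    sched.enqueue(protocol, packet)

# Drain the queue: the emails jump ahead of the FTP download.
served = [sched.dequeue()[1] for _ in range(len(arrivals))]
```

Here the two SMTP packets leave first even though they arrived behind FTP traffic, and the FTP packets keep their original order among themselves.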