I'm running an API Proxy server and want to compress the (uncompressed) response contents I get from the API endpoints in order to send them faster to the client that initiated the request.
However, I'm wondering whether there is a tipping point at which compressing on the server, sending the compressed response to the client, and decompressing it on the client side would actually take longer than just sending the uncompressed response directly to the client.
The answer depends on the compressibility of your responses and the average response size. For small responses, the gzip header and trailer overhead can make the output longer than the uncompressed input and chew CPU unnecessarily.
Tomcat, as an example, uses 2 KB (2048 bytes) as the default minimum compressible size. If your API is returning JPGs, compression is going to be a losing proposition as well, because that data is already compressed.
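To see both effects for yourself, here is a minimal sketch using Python's standard gzip module. The helper name gzip_delta and the sample payloads are made up for illustration, and random bytes are only a stand-in for already-compressed data such as a JPG.

```python
import gzip
import os


def gzip_delta(payload: bytes) -> int:
    """Return compressed size minus original size.

    A positive result means gzip made the payload bigger.
    """
    return len(gzip.compress(payload)) - len(payload)


# Hypothetical sample payloads, not taken from any real API.
tiny = b'{"ok":true}'                               # a few bytes of JSON
repetitive = b'{"id": 1, "name": "example"}' * 200  # larger, highly compressible
random_like = os.urandom(4096)                      # stand-in for JPG-like data

for label, payload in [("tiny", tiny),
                       ("repetitive", repetitive),
                       ("random_like", random_like)]:
    print(label, len(payload), gzip_delta(payload))
```

The tiny and random-like payloads should come out larger after gzip, while the repetitive one shrinks considerably. Exact numbers depend on the payload and the compression level, so measure with your own responses.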
Your approach should be to make a histogram of your response sizes and compression ratios, and tune your compression filter to skip objects that are too small to provide reasonable compression.
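A rough way to build that histogram is sketched below, assuming you can capture a sample of real response bodies as bytes. The function name ratio_by_size_bucket and the bucket edges are arbitrary choices for illustration, not values from Tomcat or any other server.

```python
import gzip
from collections import defaultdict


def ratio_by_size_bucket(bodies, bucket_edges=(256, 1024, 2048, 8192)):
    """Map each size bucket (keyed by its upper bound in bytes) to the
    mean compressed/original size ratio of the bodies that fall in it.

    A ratio above 1.0 means compression made those responses bigger.
    """
    buckets = defaultdict(list)
    for body in bodies:
        if not body:
            continue  # skip empty bodies to avoid dividing by zero
        # Smallest edge that fits the body, or infinity for oversized ones.
        upper = next((e for e in bucket_edges if len(body) <= e), float("inf"))
        buckets[upper].append(len(gzip.compress(body)) / len(body))
    return {upper: sum(r) / len(r) for upper, r in sorted(buckets.items())}


# Hypothetical sample; in practice, feed in bodies captured from your proxy.
samples = [b'{"ok":true}', b'{"id": 1, "name": "example"}' * 200]
stats = ratio_by_size_bucket(samples)
for upper, ratio in stats.items():
    print(f"<= {upper} bytes: mean ratio {ratio:.2f}")
```

Buckets whose mean ratio is close to or above 1.0 are candidates for skipping; set the filter's minimum size just above the largest such bucket. Size ratio is only half the picture, so it is worth timing the compression step under realistic load as well.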