The core of this problem is that our application uses websockets for real-time interfaces. We are testing our app in a new environment, and strangely we are seeing the delay of TCP packets carrying websocket traffic grow as websocket activity increases.
For example, if a single websocket event occurs with no other activity in a 1-minute period, the response from the server is effectively instantaneous. However, if we slowly increase client activity, the latency of the server's responses grows roughly linearly (each packet takes longer to reach the client as activity increases).
For those wondering, this is NOT app-related, since our logs show the server is running and responding to requests in under 100ms as desired. The delay appears after the server has processed the request and handed the TCP packet off to be sent to the client (not on the inbound leg).
Architecture

This new environment uses a virtual IP address, with keepalived on a load balancer distributing traffic between instances. Two boxes sit behind the balancer and all traffic runs through it. Our hosting provider manages the balancer and we have no control over that part of the architecture.
Theory

Could this somehow be related to something buffering the packets in the new environment?
Thanks for your help.
Buffering sounds like a reasonable theory. I would take a packet capture on your app servers to make sure you don't see retransmissions or other abnormal behavior in the TCP stream (e.g. the TCP window zeroing out). Wireshark with a capture filter for the client IP would work for this.
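If you'd rather script that check than eyeball it in Wireshark, a rough sketch with scapy (a third-party Python packet library) might look like the following; 203.0.113.10 is a placeholder for one of your client IPs, and the retransmit check is deliberately simplistic.

```python
# Rough sketch: flag zero-window advertisements and likely retransmissions
# for traffic to/from one client. 203.0.113.10 is a placeholder address.
from scapy.all import IP, TCP, sniff

seen = set()  # (src, dst, sport, dport, seq) tuples already observed

def inspect(pkt):
    if not (pkt.haslayer(IP) and pkt.haslayer(TCP)):
        return
    tcp = pkt[TCP]
    # A zero receive window means one side is telling the other to stop
    # sending -- a classic sign of buffering/backpressure in the path.
    if tcp.window == 0:
        print(f"zero window: {pkt[IP].src}:{tcp.sport} -> {pkt[IP].dst}:{tcp.dport}")
    # A repeated sequence number carrying payload is a likely retransmission.
    key = (pkt[IP].src, pkt[IP].dst, tcp.sport, tcp.dport, tcp.seq)
    if len(tcp.payload) > 0:
        if key in seen:
            print(f"possible retransmit: {key}")
        seen.add(key)

# Capture only traffic involving the client in question (needs root).
sniff(filter="tcp and host 203.0.113.10", prn=inspect, store=False)
```

Either way (script or Wireshark), run it on the app servers while you reproduce the slowdown, so you can line the anomalies up with the latency you observe.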
If the packet capture looks clean, asking your provider to run a capture on their load balancer so you can analyze it is a reasonable request.
Lastly, have you tested from multiple locations and different machines? Perhaps the buffering is happening somewhere between the client and the provider, or something strange is going on with the client itself (running packet captures on the client and your servers at the same time can also be enlightening). A small client-side timing script, as sketched below, makes it easy to compare locations.
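This sketch uses the third-party Python websockets package, with wss://example.com/ws standing in for your real endpoint, and it assumes the server responds to each message it receives; adjust for your own protocol.

```python
# Minimal client-side latency check: send a message, time the response.
# wss://example.com/ws is a placeholder; the echo-style behavior is assumed.
import asyncio
import time
import websockets

async def measure(uri="wss://example.com/ws", n=50, delay=0.1):
    async with websockets.connect(uri) as ws:
        for i in range(n):
            start = time.perf_counter()
            await ws.send(f"ping {i}")
            await ws.recv()  # wait for the server's response
            rtt_ms = (time.perf_counter() - start) * 1000
            print(f"message {i}: {rtt_ms:.1f} ms")
            await asyncio.sleep(delay)  # shrink this to increase the load

asyncio.run(measure())
```

Run it from a couple of different networks and machines while the packet captures are going, and compare how the timings grow with load in each location.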
It's actually expected behavior. As the amount of data in flight increases, the transmit window grows, which, up to a point, increases the size of the packets sent. Larger packets mean more efficiency (less overhead and less time spent on acknowledgements), but they also mean more latency. This is the tradeoff for maintaining reliable delivery while still getting reasonable throughput across networks of widely varying latency and bandwidth.
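One concrete thing you can look at on your own boxes is how much data the kernel is willing to queue per socket, since that is effectively the buffer your messages wait in behind earlier segments. A quick sketch; note these are the system defaults, and your websocket server library may override them.

```python
# Inspect the default per-socket send/receive buffer sizes the kernel
# assigns to a new TCP socket on this host.
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(f"default SO_SNDBUF: {snd} bytes, SO_RCVBUF: {rcv} bytes")
s.close()
```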
Check out the PSH flag, which causes TCP to flush its buffers sooner; it may make a marginal difference in some circumstances. If you need consistently low latency for lots of small messages, however, you might be better served by UDP: you'll have to handle delivery guarantees yourself, but you may get more consistent latency.
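You can't set PSH directly from a normal sockets API, but the usual way to get a similar effect is TCP_NODELAY, which disables Nagle's algorithm so small writes go out immediately (with PSH set) instead of being coalesced. A minimal sketch on a plain TCP socket; how you reach the underlying socket of your websocket server depends on the library, so treat the connection setup here as a placeholder.

```python
# Disable Nagle's algorithm so small writes are flushed immediately rather
# than coalesced while waiting for outstanding data to be acknowledged.
# The peer address is a placeholder; apply the setsockopt to the sockets
# your websocket server actually uses.
import socket

sock = socket.create_connection(("example.com", 80))
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Each small send now results in its own segment going out right away.
sock.sendall(b"small, latency-sensitive message")
sock.close()
```

The tradeoff is the same one described above: you trade some efficiency (more, smaller packets) for lower and more predictable per-message latency.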