I cannot figure out how to get Monit to monitor the number of open/established TCP/IP connections on a server so an alert can be sent when "too many" are open. Do you know how this can be set up?
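Here's the kind of helper I'm imagining wiring into a Monit program/status check, i.e. a small script Monit runs that exits non-zero when the count is too high (a rough sketch; the threshold and the "check program" wiring are my assumptions, not something I've verified against my Monit version):

    import sys

    THRESHOLD = 10000   # made-up alert threshold for "too many" connections
    ESTABLISHED = "01"  # state code used in /proc/net/tcp and /proc/net/tcp6

    def count_established():
        count = 0
        for path in ("/proc/net/tcp", "/proc/net/tcp6"):
            try:
                with open(path) as f:
                    next(f)  # skip the header line
                    for line in f:
                        fields = line.split()
                        if len(fields) > 3 and fields[3] == ESTABLISHED:
                            count += 1
            except IOError:
                pass  # e.g. IPv6 disabled
        return count

    if __name__ == "__main__":
        n = count_established()
        print("%d established TCP connections" % n)
        sys.exit(1 if n > THRESHOLD else 0)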
I often see web app architectures with an SLB / reverse proxy in front of a bunch of app servers.
What happens when the number of connections to the SLB requires too many resources for a single SLB to handle effectively? For a concrete yet over-the-top example, consider 2 million persistent HTTP connections. Clearly a single SLB cannot handle this.
What is the recommended configuration for scaling out an SLB?
Is it typical to create a group/cluster of LBs? If so, how is client load spread out between the group of LBs?
Let's say I have 2 machines in the same data center but not necessarily in the same rack.
How common would dropped packets be when sent using UDP between these two machines?
I'm asking under the assumption that, since there are at most a few switches between the machines, the packets will not be dropped at all.
How common is out-of-order packet arrival within the same data center? My assumption is there's but one route 99.9% of the time so this cannot happen.
However, anytime I catch myself thinking in absolute terms I know I must be missing something!
What background information do I need to gain a better understanding of when to expect dropped packets, how often they might be dropped, and how often they might arrive out of order between machines in the same data center?
Ultimately I'm trying to decide between using multicast UDP or PGM when communicating between different Linode VPS instances located in the same data center. The information must arrive, and it must arrive in order. Sure, UDP does not sound so great then!
But, if one can expect almost perfect or perfect delivery in the same data center, then it's fine. But, I am testing that assumption.
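To test that assumption rather than argue about it, I was planning to run something like this little probe between the two Linodes: one side numbers its datagrams, the other counts gaps and out-of-order arrivals (a rough sketch; port, count, payload size, and pacing are placeholders):

    import socket, struct, sys, time

    PORT, COUNT, PAYLOAD = 9999, 100000, 512

    def send(host):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        for seq in range(COUNT):
            # 4-byte sequence number padded out to PAYLOAD bytes
            sock.sendto(struct.pack("!I", seq).ljust(PAYLOAD, b"\0"), (host, PORT))
            if seq % 100 == 0:
                time.sleep(0.001)  # pace the sender a little
        print("sent %d datagrams" % COUNT)

    def recv():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", PORT))
        sock.settimeout(5.0)
        received, reordered, highest = 0, 0, -1
        try:
            while True:
                data, _ = sock.recvfrom(PAYLOAD)
                seq = struct.unpack("!I", data[:4])[0]
                received += 1
                if seq < highest:
                    reordered += 1  # arrived after a higher sequence number
                highest = max(highest, seq)
        except socket.timeout:
            pass
        print("received %d/%d, lost ~%d, out-of-order %d"
              % (received, COUNT, COUNT - received, reordered))

    if __name__ == "__main__":
        send(sys.argv[2]) if sys.argv[1] == "send" else recv()

Run it with "recv" on one box, then "send <receiver-ip>" on the other, and see what the numbers actually look like.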
Thanks.
I am designing a network service in which clients connect and stay connected -- the model is not far off from IRC less the s2s connections.
I could use some help understanding how to do capacity planning, in particular with the system resource costs associated with handling messages from/to clients.
There's an article that tried to get 1 million clients connected to the same server [1]. Of course, most of these clients were completely idle in the test. If the clients sent a message every 5 seconds or so the system would surely be brought to its knees.
But... how do you do less hand-waving and, you know, actually measure such a breaking point?
We're talking about messages being sent by a client over a TCP socket, into the kernel, and read by an application. The data is shuffled around in memory from one buffer to another. Do I need to consider memory throughput ("5 GT/s" [2], etc.)?
I'm pretty sure I have the ability to measure the basic memory requirements due to TCP/IP buffers, expected bandwidth, and CPU resources required to process messages. I'm a little dim on what I'm calling "throughput".
Help!
Also, does anyone really do this? Or, do most people sort of hand-wave and see what the real world offers, and then react appropriately?
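The closest I've come to measuring rather than hand-waving is a load generator along these lines: open a pile of long-lived connections, have each send a small message every few seconds, and watch when latency starts to climb (a sketch only; it assumes a hypothetical echo-style service on HOST:PORT, and all the numbers are placeholders):

    import asyncio, time

    HOST, PORT = "127.0.0.1", 9000           # hypothetical service under test
    CONNECTIONS, INTERVAL, MESSAGE = 5000, 5.0, b"ping\n"
    latencies = []

    async def client():
        reader, writer = await asyncio.open_connection(HOST, PORT)
        while True:
            start = time.monotonic()
            writer.write(MESSAGE)
            await writer.drain()
            await reader.readline()          # wait for the echoed reply
            latencies.append(time.monotonic() - start)
            await asyncio.sleep(INTERVAL)

    async def report():
        while True:
            await asyncio.sleep(10)
            if latencies:
                latencies.sort()
                p99 = latencies[int(len(latencies) * 0.99)]
                print("msgs=%d p99=%.1f ms" % (len(latencies), p99 * 1000))
                latencies.clear()

    async def main():
        tasks = [asyncio.create_task(client()) for _ in range(CONNECTIONS)]
        tasks.append(asyncio.create_task(report()))
        await asyncio.gather(*tasks)

    asyncio.run(main())

The idea would be to keep raising CONNECTIONS (and lowering INTERVAL) until the p99 latency or the server's CPU falls over, and call that the breaking point.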
[1] http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3/
Let's say you have 2 servers, each with 8 CPU cores.
The servers each run 8 network services that each host an arbitrary number of long-lived TCP/IP client connections.
Clients send messages to the services.
The services do something based on the messages, and potentially notify N>1 of the clients of state changes.
Sure, it sounds like a botnet but it isn't. Consider how IRC works with c2s and s2s connections and s2s message relaying.
- The servers are in the same data center.
- The servers can communicate over a private VLAN @1GigE.
- Messages are < 1KB in size.
How would you coordinate which services on which host should receive and relay messages to connected clients for state change messages?
There's an infinite number of ways to solve this problem efficiently.
- AMQP or similar messaging middleware (RabbitMQ, ZeroMQ, etc.; see the ZeroMQ sketch below)
- Spread Toolkit
- N^2 connections between all services (bad)
- Heck, even run IRC!
- ...
I'm looking for a solution that:
- perhaps exploits the fact that there's only a small closed cluster
- is easy to admin
- scales well
- is "dumb" (no weird edge cases)
What are your experiences?
What do you recommend?
Thanks!
I'm new to handling the infrastructure for production service deployments. My intuition tells me that if I want my service to be "up" as much as possible, and yet can only afford say 2 dedicated servers (startup time!), then I should make one server a redundant copy of the other. Then set up failover, replication, etc.
However, after reading some case studies and even hearing that Stack Overflow and OK Cupid only have a single database server, perhaps I'm overthinking things?
I kind of hate having to spend say $250/mo. on a leased server that acts as a backup just in case.
This all depends on the service you provide, but come on, Stack Overflow must be important enough to require a redundant database.
OK, enough rambling. What am I missing? Help! Thanks.
I have 2 dedicated servers provisioned for my next project's datastores. The datastores are configured for master-slave replication. There's no inherent automatic failover but I of course want this. That is, I'd love for access to the master datastore to always just work without having to configure a client library to detect when a master is down and failover to the slave.
I've seen Wackamole which is based on the Spread Toolkit. You provide Wackamole with a set of IPs and a bunch of nodes, and regardless of the up/down state of any of the nodes, those IPs will stay available/up. Wackamole detects when a node goes down and ARPs the IP(s) that were up on the now-down node. It's pretty neat actually.
So, my thought was to use Wackamole to keep the 2 virtual private IPs available/up. Clients would then always use the same private IP to access the master datastore, and a second, distinct IP for the slave datastore, even if both IPs ended up hosted on the same node.
My datastore servers are accessed over a private network. I am unsure if this messes with Wackamole though.
Is this lunacy? How do you generally handle automatic failover of private services like a datastore?
FWIW, it shouldn't matter, but the datastore is Redis. I don't want to hear "use MySQL" please :)
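For what it's worth, my mental model of the watchdog logic a Wackamole-style setup would need to perform for the Redis pair is roughly this (a sketch only; IPs, port, and retry counts are placeholders, redis-py is assumed, and the actual IP takeover is still left to the floating-IP layer):

    import time
    import redis

    MASTER, SLAVE, PORT = "10.0.0.10", "10.0.0.11", 6379
    RETRIES, INTERVAL = 3, 2.0

    def master_alive():
        try:
            redis.Redis(host=MASTER, port=PORT, socket_timeout=1.0).ping()
            return True
        except redis.RedisError:
            return False

    def watch():
        failures = 0
        while True:
            failures = 0 if master_alive() else failures + 1
            if failures >= RETRIES:
                # Promote the slave (SLAVEOF NO ONE); the floating IP should
                # move to it as well.
                redis.Redis(host=SLAVE, port=PORT).slaveof()
                print("master down; promoted slave")
                return
            time.sleep(INTERVAL)

    if __name__ == "__main__":
        watch()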
Thanks.
What is the point of having more than a single public IP address for a dedicated server with a single network interface? I see a lot of dedicated server hosting offers that show something like "public IPs: 8". What am I missing? Thanks!
I have provisioned a server with 8 cores and plan on deploying a network service. For spreading out request load I'd like to run 8 instances of my service. Nothing shocking here. I do not have access to a hardware load balancer. I should mention that I currently have allocated 5 public IP addresses (but I can get more).
Thus, I would like to hear your recommendations for structuring a software load balancing solution.
The obvious choices would be to:
- use HAProxy;
- pre-fork my application (like Facebook's Tornado and Unicorn both do; see the sketch below); or
- insert your idea here.
My goals are to:
- spread request load between the service instances; and
- allow for rolling restarts of my service (code upgrades).
I should mention that this is not an HTTP-based service, so nginx and the like are out.
I do not love HAProxy because of its memory requirements; it seems to require a read and write buffer per client connection. Thus, I would have buffers at the kernel level, in HAProxy, and in my application. This is getting silly! Perhaps I'm missing something in this regard though?
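For the pre-fork option, the shape I have in mind is the classic one: the parent binds the listening socket once, forks a worker per core, and each worker accept()s from the shared socket so the kernel spreads connections without an extra proxy buffer layer (a sketch; the port, worker count, and toy handler are placeholders):

    import os
    import socket

    PORT, WORKERS = 9000, 8

    def serve(listener, worker_id):
        while True:
            conn, addr = listener.accept()   # kernel picks which worker wins
            conn.sendall(("hello from worker %d\n" % worker_id).encode())
            conn.close()

    def main():
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("", PORT))
        listener.listen(1024)

        for worker_id in range(WORKERS):
            if os.fork() == 0:               # child: inherit the socket, accept
                serve(listener, worker_id)
                os._exit(0)

        # Parent: supervise workers; replacing them one at a time here is what
        # would make rolling code upgrades possible.
        for _ in range(WORKERS):
            os.wait()

    if __name__ == "__main__":
        main()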
Thanks!
I am a developer that has the good+bad situation of designing a network service that will be hit very hard by iPhone clients. The iPhone app has over 10MM downloads in the past year and now I'm bringing the users online to interact with each other.
I would like to tune the TCP implementation for the servers that will host my TCP-based network service. The per-request size sent will be "small" (say < 256 bytes). OK, you figured it out, it's a game server (shocker!).
FYI, I am not interested in UDP (or a reliable layer atop UDP as seen in ENet and RakNet for instance) for this particular service as the games are not Quake-like; all packets must be reliably received, and that's what TCP was designed for. Thus, the connections between the iPhone client and the service will be "long-lived" (as much as possible -- tunnels and elevators be damned!).
FYI, I'm running the service on a 100Mbps uplink on servers that run Linux 2.6.18-164.9.1.el5.
My goals are to simultaneously:
- keep latency as low as possible; and
- minimize the amount of memory used per connected client.
There are a large number of TCP-related knobs to tweak! After some basic research it seems that most people recommend leaving the settings as is. However, there are a number of settings that seem like they should be tweaked for particular cases. That's a little vague I know, and that's why I'm asking for help.
Things to consider tuning for small requests/responses on flaky networks while minimizing memory as much as possible might be:
- memory available to the TCP/IP implementation
- setting the "nodelay" option (disable Nagle algorithm since this is a semi-real-time game server)
- congestion control algorithms
- etc. (what else?)
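Per socket, the first two items in that list seem straightforward enough; something like this is what I'd start with (the 32 KB buffer sizes are just a number to experiment with, not a recommendation):

    import socket

    # Cap kernel buffers on the listener so accepted sockets inherit the smaller
    # sizes (and the TCP window is negotiated accordingly).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 32 * 1024)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 32 * 1024)
    listener.bind(("", 9000))
    listener.listen(128)

    conn, _ = listener.accept()
    # Disable Nagle so small, latency-sensitive game messages go out immediately.
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)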
Consider TCP congestion control algorithms:
- reno: Traditional TCP used by almost all other OSes
- cubic: CUBIC-TCP
- bic: BIC-TCP
- htcp: Hamilton TCP
- vegas: TCP Vegas
- westwood: optimized for lossy networks
My servers default to bic whose "goal is to design a protocol that can scale its performance up to several tens of gigabits per second over high-speed long distance networks while maintaining strong fairness, stability and TCP friendliness."
Just from the tiny description, Westwood sounds more apropos since it "is intended to better handle large bandwidth-delay product paths (large pipes), with potential packet loss due to transmission or other errors (leaky pipes), and with dynamic load (dynamic pipes)".
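If it helps, the way I was planning to experiment is to check what's available and then try Westwood on a single socket rather than flipping the system-wide default (a sketch; it needs the tcp_westwood module loaded and a Python new enough to expose TCP_CONGESTION, which is Linux-only):

    import socket

    with open("/proc/sys/net/ipv4/tcp_congestion_control") as f:
        print("system default:", f.read().strip())
    with open("/proc/sys/net/ipv4/tcp_available_congestion_control") as f:
        print("available:", f.read().strip())

    # Try Westwood on one socket without touching the system-wide default.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"westwood")
    print("socket is using:",
          sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16).strip(b"\0"))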
Am I getting in too deep here or is this par for the course?
What types of things do you guys tune TCP/IP for generally? How? What rules of thumb are there to know?
What words of wisdom do you have for my particular case?
Thanks a lot!