I have a custom server application that runs on Windows 2008 R2. It is a home-grown Windows Service written in .NET supporting a number of custom terminals. I have a test machine with a similar specification to the live server, and a set of client simulators that I can use to produce a load that is a reasonable approximation of the real system. I need to be able to support 12,000 of these terminals, and at present the server is running out of memory (paging is going through the roof).
My plan was to start only 100 of the simulators, measure memory usage, then start 100 more, measure memory again, and repeat until paging starts going up (in reality I will be taking more than three data points). This should give me a figure for the amount of extra memory required per 100 simulators and enable me to project how much memory is required in total. I only need a rough idea, +/- 30 GB, to avoid buying the full 2 TB ($150,000 worth) that the server will take. My question is whether this is a reasonable method to use, and if so, which performance counters you would monitor to give the amount of memory actually being used.
I am specifically asking about memory here, as the difference between Working Set, Private Bytes, Committed, Shared, Virtual and all the other memory terms confuses me. I think I can manage to monitor CPU, IO and networking by myself. The other thing I have noticed is that the .NET cache adjusts its memory usage depending on what is available, which makes a trend hard to spot.
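For what it's worth, this is the sort of sampling step I had in mind after each batch of 100 simulators. It's only a rough C# sketch; the counter names are my best guess at what to watch, and "MyService" is a stand-in for the real process instance name:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class MemSample
{
    static void Main()
    {
        // System-wide counters: Available MBytes should trend down and
        // Pages/sec (hard page faults) should stay near zero until the
        // box starts to thrash.
        var available = new PerformanceCounter("Memory", "Available MBytes");
        var pagesSec  = new PerformanceCounter("Memory", "Pages/sec");
        var committed = new PerformanceCounter("Memory", "Committed Bytes");
        // Per-process: "MyService" is a placeholder for the real instance name.
        var priv = new PerformanceCounter("Process", "Private Bytes", "MyService");

        // Rate counters return 0 on the first read; prime them and wait.
        pagesSec.NextValue();
        Thread.Sleep(1000);

        Console.WriteLine("Available:     {0:N0} MB", available.NextValue());
        Console.WriteLine("Pages/sec:     {0:N1}", pagesSec.NextValue());
        Console.WriteLine("Committed:     {0:N0} MB", committed.NextValue() / (1024 * 1024));
        Console.WriteLine("Private Bytes: {0:N0} MB", priv.NextValue() / (1024 * 1024));
    }
}
```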
Honestly? I DON'T.
When spec'ing a server that will see any kind of real workload, I cram in as much RAM as I can reasonably afford (systems are more likely to wind up RAM-constrained than CPU- or disk-constrained; the only other guaranteed bottleneck is the front-side bus).
If you want to figure out how much RAM your application may use, a basic load test like you've proposed is a good start. But if you already have this system in production (it sounds like you do) and your production system is swapping, your task is easier: figure out how much swap space you are using, then add at least 2x that much RAM (rounded up to fit your system's DIMM-size constraints).
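A minimal sketch of that arithmetic, assuming you can read page file usage via WMI (Win32_PageFileUsage reports sizes in MB):

```csharp
using System;
using System.Management; // add a reference to System.Management.dll

class SwapCheck
{
    static void Main()
    {
        var searcher = new ManagementObjectSearcher(
            "SELECT Name, CurrentUsage, PeakUsage FROM Win32_PageFileUsage");
        uint peakMb = 0;
        foreach (ManagementObject pf in searcher.Get())
        {
            Console.WriteLine("{0}: current {1} MB, peak {2} MB",
                pf["Name"], pf["CurrentUsage"], pf["PeakUsage"]);
            peakMb += (uint)pf["PeakUsage"];
        }
        // The rule of thumb above: add at least 2x the swap you're using,
        // then round up to your DIMM sizes.
        Console.WriteLine("Add at least {0:N0} MB of RAM", peakMb * 2);
    }
}
```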
If you perform a load test to get rough numbers and extrapolate from there, remember to factor in a few things:
The memory curve will probably have two distinct segments: an initial sharp ramp-up as frameworks and shared libraries are cached, then a shallower slope as each new instance's un-shareable code and data go into memory. Extrapolate from the second segment (see the sketch after this list).
You still need free RAM for disk and shared library caching, and for the OS.
(This should be at least a few gigs over what your app needs)
ALL software leaks memory (at least all practical software does), so watch for that in your testing and be sure you have the room to deal with a leak.
Your load will probably increase over the lifetime of the server. Plan accordingly.
(If you don't have good capacity planning numbers, double today's workload and plan to handle that).
Buying too much RAM today is cheaper than having your environment fall over tomorrow.
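To make the extrapolation concrete, here's a rough sketch of fitting a line to the steady-state segment of your samples and projecting out to 12,000. The numbers are hypothetical; substitute your own measurements:

```csharp
using System;
using System.Linq;

class Extrapolate
{
    static void Main()
    {
        // Samples from the load test: simulator count vs. committed MB.
        // Use only the steady-state segment (after the initial ramp-up
        // where frameworks/shared libraries get cached). These figures
        // are made up for illustration.
        double[] n  = { 100, 200, 300, 400 };
        double[] mb = { 9500, 12700, 15900, 19100 };

        // Ordinary least-squares fit: mb = slope * n + intercept
        double meanN = n.Average();
        double meanM = mb.Average();
        double slope = n.Zip(mb, (x, y) => (x - meanN) * (y - meanM)).Sum()
                     / n.Sum(x => (x - meanN) * (x - meanN));
        double intercept = meanM - slope * meanN;

        Console.WriteLine("~{0:F1} MB per simulator", slope);
        Console.WriteLine("Projected for 12,000 terminals: {0:N0} MB",
                          slope * 12000 + intercept);
    }
}
```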
Thanks, the update at least gives everyone a clue. That you are contemplating 2 TB of memory means you're playing in a different ballpark to the usual setups. Big system. I hate to think how much heat that's going to be putting out.
Given that it's an internal server process and that you are running out of memory (you don't say at what level paging starts), I would want to eliminate the possibility that the server process is consuming ever-larger amounts of memory before going any further. If this is occurring, it makes no difference what else you do; the system will stop at some point.
I don't know of any generic tools that will give you much more than a basic overview of what's going on beyond what comes with Windows. The service process itself is a black box, so your dev team needs to provide monitoring tools.
Quick back-of-the-envelope calculation: 2 TB spread across 12,000 terminals is roughly 175 MB per session. That would not be out of the range of a normal .NET exe's working set.
Does the service have multiple threads? If it is launching a thread for each connection, it would be worth looking at how it does this. Process Explorer (procexp.exe) from Microsoft's Sysinternals is an easy way to see whether you have multiple threads and what those threads are consuming. It doesn't know about .NET, but it will give you the Win32 counters.
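If it is thread-per-connection, the stack reservations alone matter: each .NET thread reserves 1 MB of stack by default, so 12,000 connection threads would reserve roughly 12 GB of address space before any heap. A quick sketch for counting them ("MyTerminalService" is a placeholder for the real process name):

```csharp
using System;
using System.Diagnostics;

class ThreadCheck
{
    static void Main()
    {
        // "MyTerminalService" stands in for the actual service process name.
        foreach (Process p in Process.GetProcessesByName("MyTerminalService"))
        {
            Console.WriteLine("PID {0}: {1} threads, private bytes {2:N0} MB",
                p.Id, p.Threads.Count, p.PrivateMemorySize64 / (1024 * 1024));
        }
        // At the default 1 MB stack reservation per thread, multiply the
        // thread count by 1 MB to estimate address space tied up in stacks.
    }
}
```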
Can you indicate how much memory was in use and how many connections were open in your earlier testing, before it started paging?
So, how do you establish whether the server process has a memory leak? It could be consuming more memory for each session that connects, or it could be accumulating memory and never freeing it.
What you could do is:
- Pick a number of sessions that does not provoke paging and simulate that number of connections.
- Run the simulation over a few hours and use perfmon to watch the basic memory counters.
- Repeat these tests with sessions that connect briefly and then disconnect.
The idea is to see whether the service consumes more and more memory with each session that comes and goes, or whether open sessions provoke ever-increasing memory usage.
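A rough sketch of the perfmon side, polled from code instead of the GUI ("MyTerminalService" is again a placeholder). If Private Bytes climbs while "# Bytes in all Heaps" stays flat, the growth is likely unmanaged; if both climb under a constant session count, the managed side is accumulating:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class LeakWatch
{
    static void Main()
    {
        var privateBytes = new PerformanceCounter(
            "Process", "Private Bytes", "MyTerminalService");
        var gcHeap = new PerformanceCounter(
            ".NET CLR Memory", "# Bytes in all Heaps", "MyTerminalService");

        // Sample once a minute for a few hours (Ctrl+C to stop); a steady
        // climb under a constant session count suggests a leak, while a
        // climb that tracks session count suggests per-session cost.
        while (true)
        {
            Console.WriteLine("{0:T}  private {1:N0} MB  gc heap {2:N0} MB",
                DateTime.Now,
                privateBytes.NextValue() / (1024 * 1024),
                gcHeap.NextValue() / (1024 * 1024));
            Thread.Sleep(TimeSpan.FromMinutes(1));
        }
    }
}
```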