I have a WCF service app hosted in IIS. On startup, it fetches a really expensive (in terms of time and CPU) resource to use as a local cache.
Unfortunately, IIS seems to recycle the process on a fairly regular basis, so I am trying to change the settings on the Application Pool to make sure that IIS does not recycle the application. So far, I've changed the following:
- Limit Interval under CPU from 5 to 0.
- Idle Time-out under Process Model from 20 to 0.
- Regular Time Interval under Recycling from 1740 to 0.
Will this be enough? And I have specific questions about the items I changed:
- What specifically does Limit Interval setting under CPU mean? Does it mean that if a certain CPU usage is exceeded, the application pool will be recycled?
- What exactly does "recycled" mean? Is the application completely torn down and started up again?
- What is the difference between "Worker Process shutdown" and "Application Pool recycling"? The documentation for the Idle Time-out under Process Model talks about shutting down the worker process. While the docs for Regular Time Interval under Recycling talk about application pool recycling. I don't quite grok the difference between the two. I thought the w3wp.exe is the worker process which runs the application pool. Can someone explain the difference to the application between the two?
I tagged this with both IIS7 and IIS7.5 because the app will run on both, and I hope the answers are the same for both versions.
Image for reference:
Recycling
Recycling is usually* where IIS starts up a new process as a container for your application, and then gives the old one up to ShutdownTimeLimit to go away of its own volition before it's killed.
* usually: see the DisallowOverlappingRotation / "Disable overlapped recycle" setting.
It is destructive, in that the original process and all its state information are discarded. Using out-of-process session state (e.g., State Server or a database, or even a cookie if your state is tiny) can allow you to work around this.
But it is by default overlapped, meaning the duration of an outage is minimized because the new process starts and is hooked up to the request queue before the old one is told "you have [ShutdownTimeLimit] seconds to go away. Please comply."
Settings
To your question: all the settings on that page control recycling in some way. "Shutdown" might be described as "proactive recycling" - where the process itself decides it's time to go, and exits in an orderly manner.
Reactive recycling is where WAS detects a problem and shoots the process (after establishing a suitable replacement W3WP).
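For reference, the overlap behaviour and the grace period mentioned above map to two attributes in applicationHost.config. This is only a sketch showing what I believe are the default values; "MyWcfPool" is a placeholder pool name, not something from the question.

```xml
<!-- Fragment of %windir%\System32\inetsrv\config\applicationHost.config -->
<!-- "MyWcfPool" is a hypothetical pool name used for illustration -->
<applicationPools>
  <add name="MyWcfPool">
    <!-- Grace period the old worker gets before it is killed (90 seconds) -->
    <processModel shutdownTimeLimit="00:01:30" />
    <!-- false = overlapped recycle: the replacement starts before the old worker is told to go -->
    <recycling disallowOverlappingRotation="false" />
  </add>
</applicationPools>
```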
Now, here's some stuff that can cause recycling of one form or another:
What To Do:
Generally:
Disable Idle timeouts. 20 minutes of inactivity = boom! Old process gone! New process on the next incoming request. Set that to zero.
Disable Regular time interval - the 29 hour default has been described as "insane", "annoying" and "clever" by various parties. Actually, only two of those are true.
Optionally, turn on DisallowRotationOnConfigChange (above, Disable Recycling for Configuration Changes) if you just can't stop playing with it - this allows you to change any app pool setting without it instantly signaling to the worker processes that they need to be killed. You then need to manually recycle the App Pool to get the settings to take effect, which lets you pre-set settings and then use a change window to apply them via your recycle process.
As a general principle, leave pinging enabled. That's your safety net. I've seen people turn it off, and then the site hangs indefinitely sometimes, leading to panic... so if the settings are too aggressive for your apparently-very-very-slow-to-respond app, back them off a bit and see what you get, rather than turning it off. (Unless you've got auto-crash-mode dumping set up for hung W3WPs through your own monitoring process)
That's enough to cause a well-behaved process to live forever. If it dies, sure, it'll be replaced. If it hangs, pinging should pick that up and a new one should start within 2 minutes (by default; worst-case calc should be: up to ping frequency + ping timeout + startup time limit before requests start working again).
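As a rough sketch, the recommendations above could look like this in applicationHost.config. The pool name "MyWcfPool" is a placeholder, and the ping values shown are what I understand to be the defaults; adjust to taste rather than treating this as the one true configuration.

```xml
<!-- Fragment of %windir%\System32\inetsrv\config\applicationHost.config -->
<!-- "MyWcfPool" is a hypothetical pool name used for illustration -->
<applicationPools>
  <add name="MyWcfPool">
    <!-- idleTimeout of zero: never shut down an idle worker process -->
    <!-- Pinging stays enabled as the safety net for hung workers -->
    <processModel idleTimeout="00:00:00"
                  pingingEnabled="true"
                  pingInterval="00:00:30"
                  pingResponseTime="00:01:30" />
    <!-- Optional: pool setting changes wait for a manual recycle -->
    <recycling disallowRotationOnConfigChange="true">
      <!-- time of zero: no regular time interval recycle (default is 1.05:00:00, i.e. 29 hours) -->
      <periodicRestart time="00:00:00" />
    </recycling>
  </add>
</applicationPools>
```

The same values can be set through the Advanced Settings dialog of the Application Pool in IIS Manager; the XML is just where they end up.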
CPU limiting isn't normally interesting, because by default it's turned off, and it's also configured to do nothing anyway; if it were configured to kill the process, sure, that'd be a recycling trigger. Leave it off. Note for IIS 8.x, CPU Throttling becomes an option too.
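To make the CPU settings concrete, here is a sketch of the element behind them, with what I believe are the default values. "MyWcfPool" is a placeholder name. As far as I know, the "Limit Interval" in the UI corresponds to resetInterval, which is the window over which CPU usage is measured against the limit.

```xml
<!-- Fragment of applicationHost.config; "MyWcfPool" is a hypothetical pool name -->
<applicationPools>
  <add name="MyWcfPool">
    <!-- limit of zero: CPU limiting is off -->
    <!-- action NoAction: even with a limit set, exceeding it does not kill the worker -->
    <!-- resetInterval: the measurement window, 5 minutes here -->
    <cpu limit="0" action="NoAction" resetInterval="00:05:00" />
  </add>
</applicationPools>
```

Only an action of KillW3wp would turn this into a recycling trigger.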
An (IIS) AppPool isn't a (.Net) AppDomain (but may contain one/some)
But... then we get into .Net land, and AppDomain recycling, which can also cause a loss of state. (See: https://blogs.msdn.microsoft.com/tess/2006/08/02/asp-net-case-study-lost-session-variables-and-appdomain-recycles/)
Short version, you do that by touching a web.config file in your content folder (again with the picking!), or by creating a folder in that folder, or an ASPX file, or.. other things... and that's about as destructive as an App Pool recycle, minus the native-code startup costs (it's purely a managed code (.Net) concept, so only managed code initialization stuff happens here).
Antivirus can also trigger this as it scans web.config files, causing a change notification, causing....
Kindly check:
Why Do We Recycle Our Application Pools?
If you browse the web to find the reason why application pools are configured to recycle automatically periodically, you’ll be hard pressed to find a reasonable answer that doesn’t pertain to memory issues. It’s like the community in general has pretty much accepted the fact that our web applications (or service layers hosted in IIS) will need to be recycled to avoid memory problems.
I’ve always been of the opinion that if your code requires periodic restarts to keep working correctly, then something is clearly wrong. There is a bug in your code somewhere and you need to fix that, instead of restarting the process occasionally to make the problem ‘go away’.
We really need to start focusing more on memory management in .NET and on making sure that our applications can keep running without problems.
Based on the OP scenario (long initialization on startup / warm up), another thing to check is Startup time limit (seconds) which has a default value of 90 seconds. If initialization takes more than Startup time limit, the worker process can be terminated.
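A minimal sketch of raising that limit in applicationHost.config follows. The pool name "MyWcfPool" and the 5 minute value are assumptions for illustration; pick a value that comfortably covers your own warm-up time.

```xml
<!-- Fragment of applicationHost.config; "MyWcfPool" is a hypothetical pool name -->
<applicationPools>
  <add name="MyWcfPool">
    <!-- Allow up to 5 minutes for the worker process to start (default is 00:01:30) -->
    <processModel startupTimeLimit="00:05:00" />
  </add>
</applicationPools>
```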
The answer is, you can prevent the AppPool from ever recycling, but you should not.
The reason is that if there is a memory leak, it will eventually eat up all of the server's memory, and Windows will blue screen or throw out-of-memory exceptions that take down other sites on the same IIS server.
So, decide how much memory is permitted to be used by that site, and set the above settings to recycle when that limit is reached.
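For example, a memory-based recycle might be configured roughly like this. The pool name "MyWcfPool" and the 1 GB figure are made up for illustration; privateMemory is expressed in kilobytes.

```xml
<!-- Fragment of applicationHost.config; "MyWcfPool" is a hypothetical pool name -->
<applicationPools>
  <add name="MyWcfPool">
    <recycling>
      <!-- time of zero: no time-based recycle -->
      <!-- privateMemory: recycle once the worker uses about 1 GB of private memory (1048576 KB) -->
      <periodicRestart time="00:00:00" privateMemory="1048576" />
    </recycling>
  </add>
</applicationPools>
```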
Normally, the recycle is done gracefully, so end users are not aware of it. But if you are using Blazor Server, all sessions run on the server, and all state will be lost. In practice, I see a Blazor app show the "connecting..." message for about 5 seconds while the recycle occurs. In other words, it is not graceful for Blazor Server apps.
The moral of the story is what was mentioned earlier: ensure your site does not leak memory. Test your memory usage early in the dev process; don't wait until it goes live. Blazor Server is memory intensive, and in my experience I have had to spend quite a bit of time debugging memory issues. This is not a fault of Blazor; it is just in the nature of Blazor Server apps to require very tight code. Earlier in .NET, I never worried about memory, as the GC would handle all of that, but running inside IIS is a different story.