I have a Spree site running the following stack:
- Nginx 1.0.8
- Passenger 3.0.9
- Ruby 1.9.2-p290
- Rack 1.3.6
- Rails 3.1.4
- Spree 0.70.5
I recently upgraded from Spree 0.70.3, which also brought a Deface upgrade from 0.7.x to 0.8.0. Since then things have been very unstable.
Recently we've seen some CPU-hogging processes which drive load up on the server and grind the whole thing to a stop. They're Rack processes and it looks like Passenger is starting them; they're owned by the site-runner user, an unprivileged user who owns the application code. (Passenger automatically runs the site code as the user who owns it.) If I restart Nginx and kill the runaway processes, it helps for a while, but eventually similar processes return and bog things down again.
ETA: I'm looking now at passenger-status and passenger-memory-stats, which suggest these are Passenger's application processes. If one is running away or hanging, there must be an issue with my app.
What's my best option for figuring out where this is hanging?
The Rack processes are the application servers running your site code, not Passenger itself. I'd suspect the recent upgrades first and work through the usual troubleshooting around them. Here's what a request looks like on your system.
Your system will have multiple Rack processes because each one is single-threaded and can only handle one request at a time. Passenger's job is to proxy incoming requests to the Rack processes and to start, stop, and recycle those processes as needed. A Rack process generally takes 5-45 seconds to start, depending on the complexity of your app, so you'll usually have a few running even when no requests are being served.
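Since each Rack process serves one request at a time, a single slow request ties up a whole process, so it can help to find out which requests are slow. Below is a minimal sketch of a Rack middleware that logs them. The class name SlowRequestLogger and its :threshold and :logger options are made up for illustration; they are not part of Rack, Passenger, or Spree.

```ruby
require 'logger'

# Hypothetical middleware (not part of Rack, Passenger, or Spree):
# logs any request whose call takes longer than :threshold seconds.
class SlowRequestLogger
  def initialize(app, options = {})
    @app = app
    @threshold = options.fetch(:threshold, 5.0)               # seconds
    @logger = options.fetch(:logger) { Logger.new($stderr) }
  end

  def call(env)
    started = Time.now
    @app.call(env) # the response is returned unchanged
  ensure
    # Runs whether the app returned normally or raised.
    elapsed = Time.now - started
    if elapsed > @threshold
      @logger.warn("slow request pid=#{Process.pid} " \
                   "#{env['REQUEST_METHOD']} #{env['PATH_INFO']} " \
                   "took #{'%.2f' % elapsed}s")
    end
  end
end
```

In a Rails 3.1 app this could be wired in with something like config.middleware.use SlowRequestLogger, :threshold => 2.0 in config/application.rb. Note the limitation: a request that never returns (a true hang or infinite loop) never reaches the logging code, so this only catches requests that are slow but eventually finish. The logged pid can then be matched against the process IDs shown by passenger-status.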