This happens on a number of PCs (WinXP SP3): an application will simply lock up. It may be the browser (usually Firefox in our case), Visual Studio, or something else, but every time it looks the same: the application locks up, sometimes for minutes. There is no spike in CPU usage. Usually the other applications that are open at the time continue to work fine, but sometimes the hang takes the explorer.exe process with it.
Given the lack of CPU spike, or any other metric I can see, how can I diagnose what is causing this issue?
It could be a network problem. The clients may have trouble resolving names (NetBIOS and DNS) or simply trouble accessing network resources, so anything that accesses files, for instance, may appear to freeze while waiting for various timeouts to expire.
Simple things like roaming user profiles on a file server with some folder redirection, combined with a "faulty" network (a malfunctioning or misconfigured switch), can cause delays and user interface response issues lasting several minutes in Windows if the machine is part of a domain. Since explorer.exe is mentioned as a frequent casualty, it's not too far-fetched, I guess.
Does it still happen, or does it get resolved, if you unplug the client from all network connections (so that those connections actually get closed on the client)?
Does it happen if you log on as a local user instead, so that no user-level group policies get applied?
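If you want numbers rather than a hunch, a small script can time how long it takes to list the folders the profile depends on. This is only a rough sketch, assuming Python is available on the affected client; the share path in the example is hypothetical, and the 2-second threshold is an arbitrary starting point, not a documented Windows value.

```python
import os
import time


def time_path_access(path):
    """Try to list `path` once; return (seconds_taken, error_or_None)."""
    start = time.perf_counter()
    try:
        os.listdir(path)
        error = None
    except OSError as exc:
        # Unreachable shares and missing folders both end up here.
        error = exc
    return time.perf_counter() - start, error


def slow_paths(paths, threshold=2.0):
    """Return (path, seconds, error) for every path that failed or was slow."""
    flagged = []
    for path in paths:
        seconds, error = time_path_access(path)
        if error is not None or seconds >= threshold:
            flagged.append((path, seconds, error))
    return flagged


if __name__ == "__main__":
    # Hypothetical share name; substitute your own profile share
    # and redirected folders.
    candidates = ["\\\\fileserver\\profiles", os.path.expanduser("~")]
    for path, seconds, error in slow_paths(candidates):
        print("{}: {:.1f}s {}".format(path, seconds, error or "slow"))
```

Running it once while things are fine and again during a freeze should show whether the network paths are the part that stalls.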
You might want to look into Process Explorer. This free app lets you see the ins and outs of what is happening on the system. It should give you a good look at what is hanging while the program is running.
Download Process Explorer here
What anti-virus are you running? I have seen this problem before with Norton, when it gets it into its head that a process is a virus and holds onto it for minutes at a time before deciding that it's not a virus after all.
Though it's not completely fleshed out yet, this question has a good starter list. You're going to need to find a smoking gun in the event logs or through eliminating startup items, unless you can find a trend in similar apps/hardware to narrow it down.
Check the following:
If you have roaming profiles, check the server hosting them. Look at the event logs on that server (System node) for hardware errors. Check the network cable, etc.
Check DNS. This will lead you to the domain controller(s). Check the event logs on those servers as well. Verify proper network connection.
Check your switches. A managed switch with a misconfiguration, such as a storm control setting, can cause periodic hiccups.
If this happens once in a while, I've often seen DNS as the issue. Sometimes running "dcdiag" on the domain controller helps track the issue down.
One last quirk I've seen: if the clients are mapped to a printer that is no longer in service, the apps can take a while to load. Just removing that printer from the "Printers" section has also fixed this problem.
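To test the DNS suspicion from the list above, you could probe name resolution repeatedly from an affected client and count slow or failed lookups. The following is a minimal sketch, not a replacement for dcdiag; the `probe` helper, the host name in its docstring, and the 1-second "slow" cutoff are my own assumptions.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.gethostbyname):
    """Resolve `hostname` once; return (seconds_taken, address_or_None)."""
    start = time.perf_counter()
    try:
        address = resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError.
        address = None
    return time.perf_counter() - start, address


def summarize(samples, slow_after=1.0):
    """Count failed and slow lookups in a list of (seconds, address) samples."""
    failures = sum(1 for _, address in samples if address is None)
    slow = sum(
        1
        for seconds, address in samples
        if address is not None and seconds >= slow_after
    )
    return {"total": len(samples), "failures": failures, "slow": slow}


def probe(hostname, count=20, delay=5.0):
    """Look up `hostname` `count` times, pausing `delay` seconds between tries.

    Example (hypothetical host name): probe("dc1.example.local")
    """
    samples = []
    for _ in range(count):
        samples.append(time_lookup(hostname))
        time.sleep(delay)
    return summarize(samples)
```

If the summary shows failures or slow lookups that line up with the freezes, that points at DNS or the path to the domain controller rather than at the applications themselves.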