We're running TFS 2010 Standard Edition, and we've set up an automated build to run whenever someone checks in code. We run through all of the automated tests (built with MSTest) as part of the build. We've configured the build to run the tests as a 64-bit process, but the QTAgent.exe that runs the tests grows in memory while the tests are running. It currently reaches 8 GB for the ~650 tests we have, and the test run slowed significantly when we went from 450 tests to 650 tests.
When we run all of the tests in the local development environment, memory seems to be freed at least with each TestClass and never exceeds a certain level. The time it takes to run all tests has not increased significantly in the local development environment.
Is there a way to configure the build service to free up memory with each Test or each TestClass? With the way things are currently running, the build process gets very slow when we start to run out of memory on the machine.
Edit: I found the MSTest invocation in the build log, ran it manually, and saw the same runaway memory growth. I removed the /publish, /publishbuild, /teamproject, /platform, and /flavor parameters from the invocation of MSTest, in case the test runner was holding onto results until the end, but the behavior didn't change. I ran the same command line on a dev box, separate from the build server, and the memory was freed frequently. It seems there must be something wrong or different about the build server that is causing it to behave differently, but I'm stumped as to where to look.
I've looked at qtagent.exe.config, mstest.exe.config, versions of both executables. What else might affect this?
I eventually found the reason why memory was being consumed so quickly during the automated build. By creating a dump of QTAgent.exe, we were able to examine its content to find what is on the .NET heap. It turns out that we were placing a large number of objects into a cache object, and since the cache object lives for the entire process, all of these objects were surviving garbage collection.
The program logic in our tests generates some random data and then checks to see if that data is unique. For the most part, it should be, but there is occasionally a collision. If there is a collision, we end up caching the previous data. This shouldn't be a problem because our tests clean up their data after they're done and so the risk of collision should be relatively low.
By examining what we had in the cache, we found that we must have been running into a lot of collisions, and it turned out that our cleanup logic was failing only in the automated build/test environment due to some improperly configured dependencies. This meant that every time we ran our tests, we generated more data that wasn't being cleaned up, and subsequent runs ran the risk of loading that data into the cache for the duration of the test run.
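To make the mechanism concrete, here is a minimal sketch in Python (not our actual .NET test code). Everything in it is a hypothetical stand-in: `store` plays the role of the shared test data store, `cache` is the process-lifetime cache, and the test count, key space, and record size are made-up numbers chosen only for illustration.

```python
import random


def run_suite(store, cache, n_tests, key_space, cleanup_works, rng):
    """Simulate one test run: each test creates one random record."""
    for _ in range(n_tests):
        key = rng.randrange(key_space)
        # Uniqueness check: on a collision, the existing record is loaded
        # into the cache, where it stays for the rest of the process.
        while key in store:
            cache[key] = store[key]
            key = rng.randrange(key_space)
        store[key] = bytearray(1024)  # the test's own data
        if cleanup_works:
            del store[key]  # the test cleans up after itself


def simulate(builds, cleanup_works, seed=0):
    """Return the cache size at the end of each simulated build."""
    rng = random.Random(seed)
    store = {}  # persists across builds, like a shared database
    cache_sizes = []
    for _ in range(builds):
        cache = {}  # a new test process starts with an empty cache
        run_suite(store, cache, n_tests=650, key_space=10_000,
                  cleanup_works=cleanup_works, rng=rng)
        cache_sizes.append(len(cache))
    return cache_sizes


if __name__ == "__main__":
    print("cleanup working:", simulate(5, cleanup_works=True))
    print("cleanup failing:", simulate(5, cleanup_works=False))
```

With working cleanup the store is empty at the start of every test, so nothing ever collides and the cache stays empty. With failing cleanup, leftover records accumulate from build to build, collisions become more likely, and each new test process ends up holding more cached objects than the one before.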
I recognize that having tests that are not completely isolated was part of our problem, but I'd like to call out a few other lessons learned:
Not really, the problem is caused by the Test Agent holding on to the results of all tests so it can update the log at the end. My recommendation would be to move away from Visual Studio tests to another test runner (like NUnit).
Moving over will cause you some headaches, but you should end up with a cleaner solution.