With the problems McAfee customers have had over the last week, there has been a lot of opinion that not only should AV vendors test better, but that customers should also test AV signatures before deploying them.
Is this feasible? If you are already doing this, do you take other measures to minimise exposure to malware while you are testing?
Update: by "feasible" I mean: is it feasible to run a full gamut of tests for each AV signature/definition update, or do you restrict testing to critical apps or critical functionality? Do you take a different approach for servers than for desktops?
It shouldn't be much trouble to set up a test environment that gets the definitions first (by an hour, or whatever your AV vendor allows), where any AV alerts it generates halt company-wide deployment until someone confirms them.
Setting this up as a manual testing process is bound to fail, as someone else has said, due to boredom, not because it isn't possible.
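To sketch what automating that gate might look like: a small script on the management side that scans detection logs from the early-update machines and holds the company-wide push if anything fired against the new DAT. The share path, DAT version string and log format below are all assumptions, not any particular vendor's layout:

```python
# Hypothetical canary gate: hold the company-wide DAT push if the machines
# that received the new definitions early have logged any new detections.
from pathlib import Path

CANARY_LOG_DIR = Path(r"\\fileserver\av-canary-logs")  # assumed log share
NEW_DAT_MARKER = "DAT 5958"                             # assumed DAT version string

def canary_alerts(log_dir: Path, dat_marker: str) -> list[str]:
    """Collect alert lines logged by the canary machines for the new DAT."""
    alerts = []
    for log_file in log_dir.glob("*.log"):
        for line in log_file.read_text(errors="ignore").splitlines():
            if dat_marker in line and "DETECTION" in line.upper():
                alerts.append(f"{log_file.name}: {line.strip()}")
    return alerts

if __name__ == "__main__":
    alerts = canary_alerts(CANARY_LOG_DIR, NEW_DAT_MARKER)
    if alerts:
        print("HOLD deployment - canary machines raised alerts:")
        print("\n".join(alerts))
    else:
        print("No canary alerts - OK to promote the DAT to production.")
```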
But just to add some perspective to the "debate": on Windows, switch to something like Software Restriction Policies or, on Windows 7 / Server 2008 R2, the much improved version called AppLocker. It takes the more mature approach of only allowing specified programs to run, not the other way around...
...You can do this with code-signing certificates too, so that, say, all Microsoft applications are allowed and all internally developed applications are allowed, without having to specify individual executables.
When properly implemented, no AV software is needed.
I can think of no firewall today that starts out with "allow everything except specified malicious traffic". Unless someone has done a poor job, they are all "block everything except specifically allowed traffic".
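To make the default-deny, publisher-based allow-listing idea concrete, here's a rough sketch of the decision logic. The publisher lookup is stubbed out with invented data; the real thing (AppLocker) verifies the Authenticode signature of the executable itself:

```python
# Rough sketch of publisher-based allow-listing, the way AppLocker's
# publisher rules work conceptually. get_publisher() is a stand-in for
# reading the Authenticode signer of an executable.
ALLOWED_PUBLISHERS = {
    "O=MICROSOFT CORPORATION, L=REDMOND, S=WASHINGTON, C=US",
    "O=EXAMPLE INTERNAL DEV TEAM, C=US",   # hypothetical internal signing cert
}

# Invented signer data purely for the example; unsigned binaries return None.
FAKE_SIGNATURES = {
    r"C:\Windows\notepad.exe": "O=MICROSOFT CORPORATION, L=REDMOND, S=WASHINGTON, C=US",
    r"C:\Tools\internal-app.exe": "O=EXAMPLE INTERNAL DEV TEAM, C=US",
}

def get_publisher(exe_path: str) -> str | None:
    """Stub: return the signer of exe_path, or None if unsigned/unknown."""
    return FAKE_SIGNATURES.get(exe_path)

def allowed_to_run(exe_path: str) -> bool:
    # Default deny: anything unsigned or from an unknown publisher is blocked.
    return get_publisher(exe_path) in ALLOWED_PUBLISHERS

for path in (r"C:\Windows\notepad.exe", r"C:\Temp\malware.exe"):
    print(path, "->", "allow" if allowed_to_run(path) else "deny")
```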
Using McAfee's own ePO (ePolicy Orchestrator) AV client management server you can do just this: by combining control over when you download the DATs from McAfee with setting your test machines to get the updates before the general population, you can try out the DATs for a day or two before deploying them. I imagine most "enterprise" AV solutions have similar mechanisms.
I have two problems with this, however:
1) You will get bored. One single dodgy DAT out of several thousand in, what, 5-6 years?
It comes down to a simple cost-benefit equation: if the several hundred "man hours" of testing every year cost less than the expected profit loss from a loss of service caused by a dodgy update (Coles Supermarkets, for example), then of course do it (see the back-of-the-envelope sketch below).
2) More importantly, it is really important to get new definitions out to your users as soon as possible. To illustrate: just last week I was handed a USB drive infected with an Autorun-type malware. It wasn't in the current DAT and was picked up by the next day's set of definitions (I only spotted it because I was using a Mac and noticed the suspicious files).
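Here is that back-of-the-envelope cost-benefit sketch, with every figure invented purely for illustration:

```python
# Back-of-the-envelope cost-benefit check (all figures are made up).
hours_per_year_testing = 300        # "several hundred man hours"
hourly_cost = 60                    # loaded cost per staff hour
testing_cost = hours_per_year_testing * hourly_cost

bad_dats_per_year = 1 / 5           # roughly one dodgy DAT in five years
outage_cost = 250_000               # lost profit plus cleanup for one bad update
expected_outage_cost = bad_dats_per_year * outage_cost

print(f"Testing cost:          ${testing_cost:,.0f} per year")
print(f"Expected outage cost:  ${expected_outage_cost:,.0f} per year")
print("Worth testing" if testing_cost < expected_outage_cost else "Not worth testing")
```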
Fundamentally, McAfee should never have released a DAT that generated a false positive on a Windows system file! We pay McAfee a significant sum of money every year and expect their quality control to be better than this!
---Edit----
As an aside, did anyone else think of this xkcd when they saw cash registers running AV software...
The thing to remember is that the longer you wait to deploy the definitions, the longer the window of opportunity for new infections to take root.
The AV vendor is supposed to be testing things and McAfee acknowledged that they screwed up their internal tests.
For you to test it yourself, you'd have to have a simulated environment running a machine for each of your deployment configurations, with the exact DLL installations, application combinations, update combinations, etc.
...and even then you'd probably have no guarantee of catching every edge case.
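One way to keep that test matrix manageable is to fingerprint each machine's installed-software inventory and only build a test image per distinct fingerprint. A rough sketch, with the inventory data invented for illustration (in practice it would come from your asset management system):

```python
# Sketch: group machines by a hash of their installed-software inventory so
# you only need one test image per distinct configuration.
import hashlib
from collections import defaultdict

# Invented example inventory; replace with real asset data.
inventories = {
    "PC-001": ["Windows XP SP3", "Office 2003", "SAP GUI 7.1"],
    "PC-002": ["Windows XP SP3", "Office 2003", "SAP GUI 7.1"],
    "PC-003": ["Windows 7", "Office 2010"],
}

def fingerprint(packages: list[str]) -> str:
    """Stable hash of a sorted package list."""
    return hashlib.sha1("\n".join(sorted(packages)).encode()).hexdigest()[:12]

groups: dict[str, list[str]] = defaultdict(list)
for host, packages in inventories.items():
    groups[fingerprint(packages)].append(host)

print(f"{len(inventories)} machines, {len(groups)} distinct configurations to test")
for fp, hosts in groups.items():
    print(fp, hosts)
```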
BUT you can control the impact by using backups. You may not be able to stop disasters from happening, but you can control the outcome: having backups available means you can get machines back up and online, and you can usually roll back changes (the latest McAfee issue was a fluke that shot Windows square in the brain, but once people knew what caused it, the quarantined file could at least be copied back with a Linux boot disc, from the sound of it...).
So in the end you're asking to duplicate a lot of work for little payoff and more risk to your users. You'd be better off making sure you keep periodic backups so you can restore your users' systems and protect their data.
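As a minimal sketch of the kind of pre-flight check that implies, you could hold a rollout until every machine has a recent backup; the backup catalogue and the 48-hour threshold here are assumptions:

```python
# Sketch: refuse to push an update unless every machine has a backup newer
# than MAX_AGE. The catalogue data is invented for the example.
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=48)
now = datetime(2010, 4, 28, 9, 0)            # fixed "now" so the example is repeatable

last_backup = {                              # hypothetical backup catalogue
    "PC-001": datetime(2010, 4, 27, 22, 0),
    "PC-002": datetime(2010, 4, 25, 22, 0),  # stale
}

stale = [host for host, when in last_backup.items() if now - when > MAX_AGE]
if stale:
    print("Hold the rollout - machines without a recent backup:", stale)
else:
    print("All machines backed up recently - OK to roll out.")
```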
If you rely primarily on heuristic detection rather than signatures, there are fewer updates, and therefore fewer updates to test. I'd like to think we're slowly moving in this direction.
With that baseline heuristic detection in place, you have more time to test the signature updates, and you could quite easily run those updates through a Continuous Integration or build server, as developers do. However, that would take quite a lot of infrastructure, especially if you're testing against every possible setup you have running in your enterprise.
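A sketch of what such a CI-style run might look like: revert a test VM to a clean snapshot, push the new definitions, and run a handful of smoke tests. The `my-vm-tool` / `my-av-tool` commands and the test scripts are placeholders, not any particular vendor's tooling:

```python
# Sketch of a CI-style smoke test for a definition update. The external
# commands are placeholders for whatever your VM platform and AV management
# tool actually expose.
import subprocess

VM_NAME = "win7-gold-image"            # hypothetical test VM
SMOKE_TESTS = [                        # hypothetical test scripts
    "check_boot.py",
    "check_critical_apps.py",
    "check_no_false_positives.py",
]

def run(cmd: list[str]) -> bool:
    """Run a command and report pass/fail (a missing tool counts as a failure)."""
    try:
        ok = subprocess.run(cmd, capture_output=True, text=True).returncode == 0
    except FileNotFoundError:
        ok = False
    print(" ".join(cmd), "->", "PASS" if ok else "FAIL")
    return ok

def test_definition_update() -> bool:
    # Revert the VM to a clean snapshot and push the new definitions.
    run(["my-vm-tool", "revert", VM_NAME, "--snapshot", "clean"])
    run(["my-av-tool", "push-definitions", VM_NAME, "--latest"])
    # Run every smoke test (no short-circuiting) and require all to pass.
    results = [run(["python", test]) for test in SMOKE_TESTS]
    return all(results)

if __name__ == "__main__":
    print("Definitions approved" if test_definition_update() else "Definitions rejected")
```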
You could stage updates to a subset, say 10%, on a much more frequent basis. If your phone starts ringing, well, at least only 10% of your computers have a problem. We used this method for deploying Windows updates: we would roll out to 10% of machines for two days and then to the rest. You could also find an AV vendor with a better reputation for not hosing your systems.
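A small sketch of how you could pick that 10% deterministically, so the same machines land in the pilot ring every time (the hostnames are invented):

```python
# Sketch: deterministically bucket machines into a pilot ring by hashing the
# hostname, so roughly 10% of machines always get updates a couple of days early.
import hashlib

def in_pilot_ring(hostname: str, percent: int = 10) -> bool:
    """Stable assignment: the same hostname always lands in the same ring."""
    bucket = int(hashlib.md5(hostname.encode()).hexdigest(), 16) % 100
    return bucket < percent

hosts = [f"PC-{i:03d}" for i in range(1, 21)]      # invented hostnames
pilot = [h for h in hosts if in_pilot_ring(h)]
print(f"Pilot ring ({len(pilot)}/{len(hosts)}):", pilot)
```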