Update: EMC has dropped our warranty and support, so this is going to be an insurance case. Dell says that we can get a professional cleaning agency to refurbish the servers and keep our warranty. Cisco says "maybe". HP is still silent :(
Final update: EMC turned around and approved cleaning from a certified company. The VNX got shipped back to us today and works just fine. The rest of the server room is also getting cleaned, and our losses are limited to a couple of tape drives. The insurance company picks up the bill for just about everything else.
The original question:
Here's the story..
The owners of the building we lease office space from decided to renovate the exterior. This involved some pretty heavy work at the level where our server room is, including replacing windows which are fitted inside a concrete wall.
My red alert went off when I heard that they were going to do the same thing with our server room (yes, our server room has a window. We're a small shop with 3 racks. The window is secured with steel bars.) I explicitly told the contractor that they needed to put up a temporary wall between our racks and the original wall - and to make sure that the temporary wall was 100% air- and water-tight. They promised to do so.
The temporary wall has a small door in it, so that workers can go in and out during the day (through our server room, which was the only option....). On several occasions I found the small door only half-way shut while working evenings/nights. I locked the door, and thought that they would hopefully get the point soon and keep the door shut. I even gave an electrician a mouthful when I saw that he didn't close the door properly.
By this point - I bet that most of you get a picture of what happened. Yes, they probably left the door open while drilling in the concrete.
I present to you our 4-week-old EMC VNX:
I'll even put in a little bonus, here is the APC UPS one rack further away from the temporary wall. See the nice little landing strip from my finger?
What should I do? The only thing that comes to mind is to either call all our suppliers (EMC, HP, Dell, Cisco) and get them to send technicians to check out all the gear in the server room, or get some kind of certified 3rd-party consultant to check all of it. Would you run production systems on this gear? How long?
I should also note that our air conditioning isn't exactly enterprise-grade, given the nature of our small room. It's just a single inverter, which failed once before I started working here (failed inverters usually lead to water dripping out).
First: When the server room becomes a building site, you must remove all the servers, for several reasons.
The only safe bet is to move your servers to a different room. This means additional shutdowns, but during the course of the building activities, you will probably have to shut down anyway.
(mind you, the following is my subjective opinion) Now this has happened. The fans are covered with mineral dust. This potentially reduces their lifetime by affecting the bearings.
I would not expect the servers to massively fail, but I would expect to have a higher percentage of failures. At a fairly large customer, fan outages rose after building activities.
But then again, I would not go to great lengths to clean the systems. If the cleaning leads to damage, that damage is likely bigger than a failed fan. Unless, of course, the dust contains metallic particles, which changes the whole game.
What to do? Clean the room, clean the outside of the servers, reserve some cash for earlier replacements in the future.
The damage is very probably not something you can successfully take to court. The part of the damage that you can prove is probably not worth going to court for. You might try to get some amount of compensation out of the landlord, perhaps a reduced payment or so.
Really, servers can withstand some level of dust but this is just too much. We clean our servers regularly during downtime with a PC vacuum by 3M. It's a nice thing to have around the office.
But for now, start cleaning. The faster you get the dust out of there, the better. Try to keep heatsinks and fans clear of dust. If a heatsink or fan is covered in dust, its ability to dissipate heat is much worse than that of a clean unit.