TL;DR
I'm pretty sure our small network has been infected by some sort of worm/virus. It seems to only be afflicting our Windows XP machines, however. Windows 7 machines and Linux (well, yea) computers seem to be unaffected. Anti-virus scans are showing nothing, but our domain server has logged thousands of failed login attempts on various valid and invalid user accounts, particularly the administrator. How can I stop this unidentified worm from spreading?
Symptoms
A few of our Windows XP users have reported similar problems, although not entirely identical. They all experience random shutdowns/restarts that are software initiated. On one of the computers a dialog pops up with a countdown until system restart, apparently started by NT-AUTHORITY\SYSTEM and has to do with an RPC call. This dialog in particular is exactly the same as those described in articles detailing older RPC exploit worms.
When two of the computers rebooted, they came back up at the login prompt (they are domain computers) but the user name listed was 'admin', even though they hadn't logged in as admin.
On our Windows Server 2003 machine running the domain, I noticed several thousand login attempts from various sources. They tried all different login names including Administrator, admin, user, server, owner and others.
Some of the logs listed IPs, some didn't. Of the failed logins that did have a source IP address, two correspond to the two Windows XP machines experiencing reboots. Just yesterday I noticed a bunch of failed login attempts from an outside IP address. A traceroute showed that outside IP address to be from a Canadian ISP. We shouldn't have any connections from there, ever (we do have VPN users though). So I am still not sure what's going on with the login attempts coming from a foreign IP.
It seems obvious that some sort of malware is on these computers, and part of what it does is try to enumerate passwords on domain accounts to gain access.
What I've Done So Far
After realizing what was happening, my first step was to make sure everyone was running up-to-date anti-virus and ran a scan. Of the computers affected, one of them had an expired anti-virus client, but the other two were running current versions of Norton, and full scans of both systems turned up nothing.
The server itself regularly runs up-to-date anti-virus, and has not shown any infections.
So 3/4 of the Windows NT based computers have up-to-date anti virus, but it hasn't detected anything. However I am convinced that something is going on, mainly evidenced by the thousands of failed login attempts for various accounts.
I also noticed that the root of our main file share had pretty open permissions, so I just restricted it to read+execute for normal users. The administrator has full access of course. I am also about to have the users update their passwords (to strong ones), and I am going to rename the Administrator account on the server and change its password.
I already took the machines off of the network, and one is being replaced by a new one, but I know these things can spread through networks so I still need to get to the bottom of this.
Also, the server has a NAT/Firewall setup with only certain ports open. I have yet to fully investigate some of the Windows-related services with ports open, as I am from a Linux background.
Now what?
So all the modern and up-to-date anti-virus hasn't detected anything, but I am absolutely convinced these computers have some sort of virus. I base this on the random restarts/instability of the XP machines combined with the thousands of login attempts originating from these machines.
What I plan on doing is backing up user files on the affected machines, then formatting the drives and reinstalling Windows. I am also taking a few measures to secure the common file shares that may have been used to spread to other machines.
Knowing all this, what can I do to ensure that this worm isn't somewhere else on the network, and how can I stop it from spreading?
I know this is a drawn-out question, but I am out of my depth here and could use some pointers.
Thanks for looking!
These are my general suggestions for this kind of process. I appreciate you'll have covered some of them already, but it's better to be told something twice than miss something important. These notes are orientated towards malware that's spreading on a LAN but could easily be scaled back to deal with more minor infections.
Stopping the rot, and finding the infection source.
Make sure you have an up to date backup of every system and every bit of data on this network that the business cares about. Make sure you note that this restore media may be compromised, so that people don't try and restore from it in 3 months time while your back is turned and infect the network again. If you have a backup from before the infection happened, put this safely to one side too.
Shut down the live network, if you possibly can (you will probably need to do this as part of the cleanup process, at least). At the very least, seriously consider keeping this network, including servers, off the internet until you know what is going on - what if this worm is stealing info?
Don't get ahead of yourself. It's tempting to just say clean build everything at this point, force everyone to change passwords, etc, and call that 'good enough'. While you will probably need to do this sooner or later, it's likely to leave you with pockets of infection if you don't understand what is happening on your LAN. (If you don't want to investigate the infection further go to step 6)
Copy an infected machine to a virtual environment of some kind, isolate this virtual environment from everything else including the host machine before you boot the compromised guest.
Create another couple of clean virtual guest machines for it to infect, then isolate that network. Use tools like Wireshark to monitor the network traffic (time to take advantage of that Linux background and create another guest on this virtual LAN that can watch all this traffic without being infected by any Windows worm!) and Process Monitor to monitor changes happening on all these machines. Also consider that the issue may be a well-hidden rootkit: try using a reputable tool for finding these, but remember that this is a bit of an uphill struggle, so finding nothing doesn't mean there is nothing there.
(Assuming you haven't / can't shut down the main LAN) Use Wireshark on the main LAN to look at traffic being sent to/from the infected machines. Treat any unexplainable traffic from any machine as potentially suspicious - absence of visible symptoms is not evidence of absence of compromise. You should be especially worried about servers and any workstations holding business-critical information.
Once you have isolated any infected processes on the virtual guests, you should be able to send a sample to the company that made the antivirus software you're using on these machines. They will be keen to examine samples and produce fixes for any new malware they see. In fact, if you have not done so already, you should contact them with your tale of woe as they might have some way of helping.
Try very hard to work out what the original infection vector was - this worm may be an exploit that was hidden inside a compromised website that someone visited, it may have been brought in from someone's home on a memory stick or received by email, to name but a few ways. Did the exploit compromise these machines via a user with admin rights? If so, don't give users admin rights in future. You need to try and make sure the infection source is fixed and you need to see if there is any procedural change you can make to make that infection route more difficult for exploits in the future.
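As a rough illustration of the Wireshark steps above, here is a minimal Python sketch that takes connection records from a capture exported as CSV and flags hosts contacting an unusually large number of distinct peers on the Windows RPC/NetBIOS/SMB ports, which is a common pattern for worms spreading over a LAN. The column names (`src`, `dst`, `dst_port`), the port list and the threshold are all assumptions made for the example, not what Wireshark exports by default, so adjust them to match your own export.

```python
import csv
import io
from collections import defaultdict

# Ports used by Windows RPC, NetBIOS session and SMB.
# Assumed watch list for this sketch; extend it to suit your network.
WATCHED_PORTS = {135, 139, 445}


def find_scanners(rows, threshold=10):
    """Return {source: peer_count} for every source that contacted at least
    `threshold` distinct destinations on a watched port."""
    peers = defaultdict(set)
    for row in rows:
        try:
            src, dst, port = row["src"], row["dst"], int(row["dst_port"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing columns or a non-numeric port
        if port in WATCHED_PORTS:
            peers[src].add(dst)
    return {src: len(dsts) for src, dsts in peers.items() if len(dsts) >= threshold}


def load_rows(text):
    """Parse CSV text (with a header line) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))
```

Feed it the exported text with `find_scanners(load_rows(csv_text))`. A host that shows up here is not proven to be infected (file servers and backup jobs talk to many peers too), but it is a good candidate for a closer look.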
Clean-up
Some of these steps will seem over the top. Heck some of them probably are over the top, especially if you determine that only a few machines are actually compromised, but they should guarantee your network is as clean as it can be. Bosses won't be keen on some of these steps either, but there's not much to be done about that.
Shut down all machines on the network. All workstations. All servers. Everything. Yes, even the boss's teenage son's laptop which the son uses to sneak onto the network while waiting for dad to finish work so the son can play 'dubious-javascript-exploit-Ville' on whatever the current social media site du jour is. In fact, thinking about it, shut this machine down especially. With a brick if that's what it takes.
Start up each server in turn. Apply any fix you've discovered for yourself or have been given by an AV company. Audit the users and groups for any unexplained accounts (both local accounts and AD accounts), audit installed software for anything unexpected, and use Wireshark on another system to watch traffic coming from this server (if you find any issues at this point then seriously consider rebuilding that server). Shut each system down before you start the next one, so that a compromised machine can't attack the others. Or unplug them from the network, so you can do several at once but they can't talk to each other; it's all good.
Once you're as sure as you can be that all your servers are clean, start them up and, using Wireshark, Process Monitor, etc., observe them again for any strange behaviour.
Reset every single user password. And if possible, service account passwords, too. Yeah, I know it's a pain. We're about to head into "possibly over the top" territory at this point. Your call.
Rebuild all the workstations. Do so one at a time, so that possibly infected machines aren't sitting there idle on the LAN attacking freshly rebuilt ones. Yes this will take a while, sorry about that.
If that's not possible then:
Carry out the steps I outlined above for servers on all the "hopefully clean" workstations.
Rebuild all the ones that showed any hint of suspicious activity, and do so while all the "hopefully clean" machines are powered off.
If you haven't already then consider centralised AV that will report problems back to a server where you can watch for problems, centralised event logging, network monitoring, etc. Obviously pick and choose which of these are right for this network's needs and budgets, but there's clearly a problem here, right?
Review user rights and software installs on these machines, and set up a periodic audit to make sure things are still how you expect them to be. Also make sure that users are encouraged to report things asap without being moaned at, encourage a business culture of fixing IT problems rather than shooting the messenger, etc.
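The account audit mentioned in the steps above mostly comes down to comparing what exists now against a list you trust. Here is a minimal Python sketch of that comparison. It assumes you have already exported the account names (local or AD) to plain lists by whatever means you prefer; the function name and the example names are hypothetical.

```python
def diff_accounts(baseline, current):
    """Compare two collections of account names.

    Names are compared case-insensitively and with surrounding whitespace
    removed, since Windows account names are not case-sensitive.
    Returns (unexpected, missing) as two sorted lists.
    """
    def normalise(names):
        return {name.strip().lower() for name in names if name.strip()}

    known = normalise(baseline)
    seen = normalise(current)
    return sorted(seen - known), sorted(known - seen)


# Hypothetical example data: 'admin' was never created by the sysadmin.
unexpected, missing = diff_accounts(
    baseline=["Administrator", "alice", "bob"],
    current=["administrator", "alice", "bob", "admin"],
)
print("Unexpected accounts:", unexpected)
print("Missing accounts:", missing)
```

The same comparison works for group memberships or installed software lists, which makes it easy to rerun as part of a periodic audit. Anything in the "unexpected" list needs an explanation before you trust that machine again.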
You've done all the things I would do (if I were still a Windows admin) -- The canonical steps are (or were, last time I was a Windows guy):
Run AV/Malware/etc. scans on the whole network
Note that there's always a chance the virus/worm/whatever is lurking in email (on your mail server), or inside a macro in a word/excel document -- If the problem comes back you may need to be more aggressive in your cleaning the next time around.
The first lesson to take from this is that AV solutions aren't perfect. Not even close.
If you are up to date with the AV software vendors, call them. All of them have support numbers for exactly this sort of thing. As a matter of fact they'll probably be very interested in what hit you.
As others have said, take each machine down, wipe it and reinstall. You might take this opportunity to get everyone off of XP anyway. It's been a dead OS for quite some time. At the very least this should involve destroying the HD partitions and reformatting them. Although, it sounds like there aren't that many machines involved, so buying completely new replacements might be a better option.
Also, let your boss(es) know that this just got expensive.
Finally, why in the world would you run all of that off of a single server? (Rhetorical, I know you "inherited" it) A DC should NEVER be accessible from the internet. Fix this by getting the appropriate hardware in place to take care of the functionality you need.
It's most likely a rootkit if your A/V programs turn up nothing. Try running TDSSKiller and see what you find. Also, this would be a perfect time to simply replace the archaic Windows XP computers with something less than a decade old. Aside from software like anti-virus programs, I've seen very little in the way of programs that couldn't be made to run on Windows 7 via a shim or by loosening a few NTFS/Registry permissions. There's really little excuse for continuing to run XP.