Recently (though it is also a recurring question) we saw three interesting threads about hacking and security:
How do I deal with a compromised server?.
Finding how a hacked server was hacked
File permissions question
The last one isn't directly related, but it highlights how easy it is to mess up web server administration.
Since there are several things that can be done before something bad happens, I'd like to have your suggestions on good practices that limit the ill effects of an attack and on how to react in the sad case that one does happen.
It's not just a matter of securing the server and the code, but also of auditing, logging and countermeasures.
Do you have a list of good practices, or do you prefer to rely on software or on experts who continuously analyze your web server(s) (or on nothing at all)?
If so, can you share your list and your ideas/opinions?
UPDATE
I have received several pieces of good and interesting feedback.
I'd like to end up with a simple list, one that is handy not only for IT security administrators but also for web factotums.
Even though everybody gave good and correct answers, at the moment I prefer Robert's, as it's the simplest, clearest and most concise, and sysadmin1138's, as it's the most complete and precise.
But nobody has considered the users' perspective and perception, which I think is the first thing that has to be considered.
What will users think when they visit my hacked site, all the more so if you hold sensitive data about them? It's not just a matter of where to store the data, but of how to calm angry users.
What about the data, the media, the authorities and competitors?
There are two big areas to focus on:
Making it hard to get in
This is a very complex topic, and a lot of it centers on making sure you have enough information to figure out WTF happened after the fact. The abstract bullet points, for simplicity:
Creating policies and procedures to calmly and efficiently handle the event of someone getting in
A security-event policy is a must-have for all organizations. It greatly reduces the "running around with our heads cut off" phase of response, as people tend to get irrational when faced with events such as these. Intrusions are big, scary affairs. Shame at suffering an intrusion can cause otherwise level-headed sysadmins to start reacting incorrectly.
All levels of the organization need to be aware of the policies. The larger the incident, the more likely upper management will get involved in some way, and having set procedures for handling things will greatly assist in fending off "help" from on high. It also gives a level of cover for the technicians directly involved in the incident response, in the form of procedures for middle-management to interface with the rest of the organization.
Ideally, your Disaster Recovery policy has already defined how long certain services may be unavailable before the DR policy kicks in. This will help incident response, as these kinds of events are disasters. If the event is of a type where the recovery window will NOT be met (for example: a hot-backup DR site gets a real-time feed of changed data, and the intruders deleted a bunch of data that was replicated to the DR site before they were noticed, so cold-recovery procedures will need to be used), then upper management will need to get involved for the risk-assessment talks.
Some components of any incident response plan:
Having policies and procedures in place before a compromise, and well known by the people who will be implementing them in the event of a compromise, is something that just needs doing. It provides everyone with a response framework at a time when people won't be thinking straight. Upper management can thunder and boom about lawsuits and criminal charges, but actually bringing a case together is an expensive process, and knowing that beforehand can help dampen the fury.
I also note that these sorts of events do need to be factored into the overall Disaster Recovery plan. A compromise will be very likely to trigger the 'lost hardware' response policy and also likely to trigger the 'data loss' response. Knowing your service recovery times helps set expectations for how long the security response team has to pore over the actual compromised system (if not keeping legal evidence) before it's needed for service recovery.
How proper helpdesk procedures can help
We need to consider how customers are dealt with here (this applies to both internal and external customers contacting a helpdesk).
First of all, communication is important; users will be angry about the disruption to business, and may also be concerned about the extent/consequences of any information breaches that may have taken place as part of an intrusion. Keeping these people informed will help manage their anger and concern, both from the point of view that sharing knowledge is good, and from the perhaps slightly less obvious point of view that one thing they will need to hear is that you are in control of the situation.
The helpdesk and IT management need to act as an "umbrella" at this point, sheltering the people doing the work to determine the extent of the intrusion and restore services from the countless enquiries that would disrupt that work.
How deployment standards can help
Deploying to a set template (or at least a checklist) helps too, along with practising change control/management over any customisations/upgrades to your deployment template. You can have several templates to account for servers doing different jobs (e.g. a mail server template, a web server template, etc).
A template should cover both the OS and the apps, and include not just security settings but all the settings you use. It should ideally be scripted (e.g. a template) rather than applied manually (e.g. a checklist), to eliminate human error as much as possible.
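As a rough illustration only of "scripted rather than manual" (in practice you would use a real configuration-management tool), here is a minimal sketch of a hypothetical baseline_template.py for a Debian-like host; the specific sysctl keys and package names are assumptions for the example, not recommendations from the answer above:

    #!/usr/bin/env python3
    """Hypothetical baseline_template.py: apply a small, repeatable server baseline.

    Sketch only; assumes a Debian-like host and root privileges.
    """
    import subprocess

    # Settings every server built from this template should share.
    SYSCTL_BASELINE = {
        "net.ipv4.ip_forward": "0",                    # routers only, not web servers
        "net.ipv4.conf.all.accept_redirects": "0",
    }
    UNWANTED_PACKAGES = ["telnetd", "rsh-server"]      # no cleartext remote shells

    def ensure_sysctl(key: str, value: str) -> None:
        """Set a kernel parameter only if it differs from the template value."""
        current = subprocess.run(["sysctl", "-n", key],
                                 capture_output=True, text=True).stdout.strip()
        if current != value:
            subprocess.run(["sysctl", "-w", f"{key}={value}"], check=True)

    def ensure_absent(package: str) -> None:
        """Remove a package the template forbids, if it is installed."""
        installed = subprocess.run(["dpkg", "-s", package],
                                   capture_output=True).returncode == 0
        if installed:
            subprocess.run(["apt-get", "-y", "remove", package], check=True)

    if __name__ == "__main__":
        for key, value in SYSCTL_BASELINE.items():
            ensure_sysctl(key, value)
        for pkg in UNWANTED_PACKAGES:
            ensure_absent(pkg)

Because the script only changes what differs from the template, it can be re-run safely after every change-controlled customisation.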
This helps in a number of ways:
For most of our servers we rely on host and network firewalls, antivirus/anti-spyware software, network IDS, and host IDS for the majority of our prevention. That is on top of all of the general guidelines such as minimum privileges, uninstalling non-essential programs, applying updates, etc. From there we use products such as Nagios, Cacti, and a SIEM solution for various baselining and notification when events occur. Our HIDS (OSSEC) does a lot of SIEM-type logging as well, which is nice. We basically try to block as much as possible, but then log centrally so that if something does happen we can analyze and correlate it.
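As a tiny, hedged illustration of the "log centrally" point (not part of the setup described above), an application can ship a copy of its log records to a central syslog collector; the hostname loghost below is a placeholder assumption:

    import logging
    import logging.handlers

    # Send every record to a central syslog collector so events from
    # many hosts can be correlated in one place. "loghost" is a placeholder.
    handler = logging.handlers.SysLogHandler(address=("loghost", 514))
    handler.setFormatter(
        logging.Formatter("webapp01 myapp: %(levelname)s %(message)s"))

    logger = logging.getLogger("myapp")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)

    logger.info("user login succeeded for account=%s", "alice")
    logger.warning("5 failed logins for account=%s in 60s", "admin")

The same idea applies whatever the transport: the host that gets compromised should not be the only place its logs live.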
What you really want falls into three basic areas:
If you have any information (assurance|security) staff available, then you should definitely talk to them. While Incident Response is often the sole purview of said office, the rest should be a joint development effort across all affected parties.
At the risk of self-pimping, this answer to a related question should index a lot of useful resources for you: Tips for Securing a LAMP Server.
Ideally, you should have the smallest number of supported OSes, and build each one using a base image. You should only deviate from the base as much as is required to provide whatever services that server provides. The deviations should be documented, and documenting them may be required if you have to meet PCI, HIPAA, or other compliance regimes. Using deployment and configuration management systems can help out a lot in this respect. The specifics will depend a lot on your OS: cobbler/puppet/Altiris/DeployStudio/SCCM, etc.
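As a hypothetical sketch of keeping such deviations visible (not any particular product's method), the following compares the packages installed on a Debian-like host against a documented base-image list; the file name base_image_packages.txt is an assumption:

    #!/usr/bin/env python3
    """Report packages that deviate from the documented base image.

    Sketch only: assumes a Debian-like host and a plain-text file
    base_image_packages.txt with one package name per line.
    """
    import subprocess

    def installed_packages() -> set[str]:
        # dpkg-query lists every installed package, one name per line.
        out = subprocess.run(
            ["dpkg-query", "-W", "-f=${Package}\n"],
            capture_output=True, text=True, check=True,
        ).stdout
        return set(out.split())

    def baseline_packages(path: str = "base_image_packages.txt") -> set[str]:
        with open(path) as f:
            return {line.strip() for line in f if line.strip()}

    if __name__ == "__main__":
        current, base = installed_packages(), baseline_packages()
        for pkg in sorted(current - base):
            print(f"ADDED (document or remove): {pkg}")
        for pkg in sorted(base - current):
            print(f"MISSING from base image:    {pkg}")

A real configuration-management system does this (and much more) for you, but even a report like this keeps "undocumented deviation" from quietly becoming the norm.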
You should definitely perform some kind of regular log review. Given the option, a SIEM can be very helpful, but SIEMs also have the downside of being expensive, both in purchase price and in build-out costs. Check out this question from the IT Security SE site for some comments on log analysis: How do you handle log analysis? If this is still too heavy, even common tools such as LogWatch can provide some good context for what's going on. The important piece, though, is just taking the time to look at the logs at all. This will help you get acquainted with what constitutes normal behavior, so that you can recognize the abnormal.
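If even LogWatch feels heavy, a few lines of scripting still build the "look at the logs at all" habit. A hedged sketch (the log path, message format and source are assumptions typical of a stock Debian/Ubuntu sshd) that summarizes failed SSH logins per source address:

    #!/usr/bin/env python3
    """Summarize failed SSH logins by source IP from a syslog-style auth log.

    Sketch only: assumes sshd writing to /var/log/auth.log.
    """
    import re
    from collections import Counter

    LOG = "/var/log/auth.log"
    FAILED = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")

    counts = Counter()
    with open(LOG, errors="replace") as f:
        for line in f:
            match = FAILED.search(line)
            if match:
                user, src = match.groups()
                counts[src] += 1

    # Knowing what "normal" looks like is the whole point; eyeball the top talkers.
    for src, n in counts.most_common(10):
        print(f"{n:6d} failed logins from {src}")

Run it daily (or from cron) and the baseline of normal versus abnormal starts to form on its own.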
In addition to log review, monitoring the state of the server is also important. Knowing when changes occur, whether planned or not, is crucial. Utilizing a local monitoring tool such as Tripwire can alert the admin to changes. Unfortunately, much like SIEMs and IDSes, it has the downside of being expensive to tune and/or purchase. Moreover, without good tuning, your alert thresholds will be so high that any good messages will be lost in the noise and become useless.
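To make the idea concrete, here is a toy sketch of the kind of check a tool like Tripwire automates: hash a set of files, store the result, and compare on the next run. The watched paths and baseline location are assumptions, and this is in no way a substitute for a real file-integrity product:

    #!/usr/bin/env python3
    """Toy file-integrity check: compare SHA-256 hashes against a stored baseline."""
    import hashlib
    import json
    import os
    import sys

    WATCHED = ["/etc/passwd", "/etc/ssh/sshd_config", "/var/www/html/index.php"]
    BASELINE = "/var/lib/fim/baseline.json"

    def digest(path: str) -> str:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    current = {p: digest(p) for p in WATCHED if os.path.exists(p)}

    if not os.path.exists(BASELINE):
        os.makedirs(os.path.dirname(BASELINE), exist_ok=True)
        with open(BASELINE, "w") as f:
            json.dump(current, f, indent=2)
        print("Baseline recorded; re-run later to detect changes.")
        sys.exit(0)

    with open(BASELINE) as f:
        baseline = json.load(f)

    for path, old in baseline.items():
        new = current.get(path)
        if new is None:
            print(f"MISSING: {path}")
        elif new != old:
            print(f"CHANGED: {path}")

Even this toy version shows where the tuning effort goes: deciding which paths are supposed to change and which ones never should.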
A proper Security Information and Event Management (SIEM) policy in place will go a long way towards making your security life easier.
I'm not a security expert, so I mainly defer to them; but starting with the Principle of Least Privilege almost always makes their job significantly easier. Applying this like a healing salve works well for many aspects of security: file permissions, runtime users, firewall rules, etc. KISS never hurts either.
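One small, concrete way to apply that salve to file permissions is a sketch like the following, which flags world-writable files under a web root (the path /var/www is an assumption; adjust for your layout):

    #!/usr/bin/env python3
    """Flag world-writable files under a web root; they rarely need to exist."""
    import os
    import stat

    WEB_ROOT = "/var/www"

    for dirpath, dirnames, filenames in os.walk(WEB_ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue
            if stat.S_ISREG(mode) and mode & stat.S_IWOTH:
                print(f"world-writable: {path}")

The same least-privilege question ("does anyone actually need this access?") applies equally to runtime users and firewall rules.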
Most of the solutions mentioned here are applicable at the host and network level, but we often forget about insecure web applications. Web applications are the most commonly overlooked security hole. Through a web application, an attacker can gain access to your database or host. No firewall or IDS can protect you against those. OWASP maintains a list of the Top 10 most critical vulnerabilities and offers fixes for them.
http://www.scribd.com/doc/19982/OWASP-Web-Security-Guide
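To illustrate the kind of fix OWASP recommends for the most famous item on that list (injection), here is a small hedged sketch contrasting string-built SQL with a parameterized query; the table and data are invented for the example:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

    user_input = "' OR '1'='1"   # classic injection payload

    # Vulnerable: the attacker's quotes become part of the SQL statement.
    vulnerable = f"SELECT * FROM users WHERE name = '{user_input}'"
    print("vulnerable query returns:", conn.execute(vulnerable).fetchall())

    # Safer: the driver treats the value as data, never as SQL.
    safe = "SELECT * FROM users WHERE name = ?"
    print("parameterized query returns:",
          conn.execute(safe, (user_input,)).fetchall())

The vulnerable version returns every row; the parameterized one returns nothing, because the payload is treated as a literal value. That is the level at which web-application holes have to be closed; no firewall or IDS in front of the server will do it for you.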