This is a Canonical Question about Fighting Spam.
There are so many techniques and so much to know about fighting spam. What widely used techniques and technologies are available to administrators, domain owners, and end users to help keep the junk out of our inboxes?
We're looking for an answer that covers different tech from various angles. The accepted answer should include a variety of technologies (eg SPF/SenderID, DomainKeys/DKIM, Graylisting, DNS RBLs, Reputation Services, Filtering Software [SpamAssassin, etc]); best practices (eg mail on Port 25 should never be allowed to relay, Port 587 should be used; etc), terminology (eg, Open Relay, Backscatter, MSA/MTA/MUA, Spam/Ham), and possibly other techniques.
To defeat your enemy, you must know your enemy.
What is spam?
For our purposes, spam is any unsolicited bulk electronic message. Spam these days is intended to lure unsuspecting users into visiting a (usually shady) web site where they will be asked to buy products, or have malware delivered to their computers, or both. Some spam will deliver malware directly.
It may surprise you to learn that the first spam was sent in 1864. It was an advertisement for dental services, sent via Western Union telegram. The word itself is a reference to a scene in Monty Python's Flying Circus.
Spam, in this case, does not refer to mailing list traffic a user subscribed to, even if they changed their minds later (or forgot about it) but have not actually unsubscribed yet.
Why is spam a problem?
Spam is a problem because it works for the spammers. Spam typically generates more than enough sales (or malware delivery, or both) to cover the costs -- to the spammer -- of sending it. The spammer does not consider the costs to the recipient, you and your users. Even when a tiny minority of users receiving spam respond to it, it's enough.
So you get to pay the bills for bandwidth, servers, and administrator time to deal with incoming spam.
We block spam for these reasons: we don't want to see it, to reduce our costs of handling email, and to make spamming more expensive for the spammers.
How does spam work?
Spam typically is delivered in different ways from normal, legitimate email.
Spammers almost always want to obscure the origin of the email, so a typical spam will contain fake header information. The From: address is usually fake. Some spam includes fake Received: lines in an attempt to disguise the trail. A lot of spam is delivered via open SMTP relays, open proxy servers and botnets. All of these methods make it more difficult to determine who originated the spam.
Once in the user's inbox, the purpose of the spam is to entice the user to visit the advertised web site. There, the user will be enticed to make a purchase, or the site will attempt to install malware on the user's computer, or both. Or, the spam will ask the user to open an attachment which contains malware.
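To make the point about Received: lines concrete, here is a minimal sketch using Python's standard email module. The sample message, hostnames and addresses are invented for illustration. Each server prepends its own Received: line, so the topmost entry is the one written by your own server; anything below it was supplied by the sending side and may be forged.

```python
from email import message_from_string

# Invented sample message: the lower Received: line is the kind a spammer might forge.
RAW = """\
Received: from mx.example.net (mx.example.net [192.0.2.10])
\tby mail.example.org with ESMTP; Mon, 1 Jan 2024 10:00:00 +0000
Received: from trusted.bank.example (unknown [203.0.113.5])
\tby mx.example.net with SMTP; Mon, 1 Jan 2024 09:59:00 +0000
From: "Your Bank" <security@bank.example>
To: user@example.org
Subject: Urgent

Click here.
"""

def received_chain(raw_message):
    """Return the Received: headers, newest (closest to the recipient) first."""
    msg = message_from_string(raw_message)
    # Collapse folded continuation lines so each hop is a single string.
    return [" ".join(h.split()) for h in msg.get_all("Received", [])]

for hop in received_chain(RAW):
    print(hop)
```

Only the first hop printed here was recorded by the receiving server itself, which is why tracing usually starts from the top and treats lower entries with suspicion.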
How do I stop spam?
As a system administrator of a mail server, you will configure your mail server and domain to make it more difficult for spammers to deliver their spam to your users.
I will be covering issues specifically focused on spam and may skip over things not directly related to spam (such as encryption).
Don't run an open relay
The big mail server sin is to run an open relay, a SMTP server which will accept mail for any destination and deliver it onward. Spammers love open relays because they virtually guarantee delivery. They take on the load of delivering messages (and retrying!) while the spammer does something else. They make spamming cheap.
Open relays also contribute to the problem of backscatter. These are messages which were accepted by the relay but then found to be undeliverable. The open relay will then send a bounce message, containing a copy of the spam, to the (usually forged) From: address.
To test whether your server is an open relay, try to send a message through it in which both the From: and To: addresses are not within your domain. The message should be rejected. (Or, use an online service like MX Toolbox to perform the test, but be aware that some online services will submit your IP address to blacklists if your mail server fails the test.)
Reject anything that looks too suspicious
Various misconfigurations and errors can be a tip-off that an incoming message is likely to be spam or otherwise illegitimate.
One place to look is the hostname the sending server gives in HELO/EHLO.
Authenticate your users
Mail arriving at your servers should be thought of in terms of inbound mail and outbound mail. Inbound mail is any mail arriving at your SMTP server which is ultimately destined for your domain; outbound mail is any mail arriving at your SMTP server which will be transferred elsewhere before being delivered (eg. it's going to another domain). Inbound mail can be handled by your spam filters, and may come from anywhere but must always be destined for your users. This mail can't be authenticated, because it is not possible to give credentials to every site which might send you mail.
Outbound mail, that is, mail which will be relayed, must be authenticated. This is the case whether it comes from the Internet or from inside your network (though you should restrict the IP address ranges allowed to use your mailserver if operationally possible); this is because spambots might be running inside your network. So, configure your SMTP server such that mail bound for other networks will be dropped (relay access will be denied) unless that mail is authenticated. Better still, use separate mail servers for inbound and outbound mail, allow no relaying at all for the inbound ones, and allow no unauthenticated access to the outbound ones.
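The inbound/outbound rule above can be sketched as a small decision function. This is an illustration of the policy only, not the configuration syntax of any particular mail server; `LOCAL_DOMAINS` and the function name are hypothetical.

```python
# Hypothetical: the domains this server accepts mail for.
LOCAL_DOMAINS = {"example.org"}

def relay_decision(rcpt_address, authenticated, local_domains=LOCAL_DOMAINS):
    """Decide what to do with a recipient at RCPT TO time."""
    domain = rcpt_address.rpartition("@")[2].lower()
    if domain in local_domains:
        # Inbound mail: cannot be authenticated, goes on to the spam filters.
        return "accept"
    if authenticated:
        # Outbound mail from a user who presented valid credentials.
        return "relay"
    # Outbound mail without authentication: this is what an open relay would allow.
    return "reject"

print(relay_decision("user@example.org", authenticated=False))
print(relay_decision("victim@elsewhere.example", authenticated=False))
print(relay_decision("friend@elsewhere.example", authenticated=True))
```

Note that the source IP address does not appear in the decision at all, which matches the advice that mail from inside your network must authenticate too.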
If your software allows this, you should also filter messages according to the authenticated user; if the from address of the mail does not match the user who authenticated, it should be rejected. Do not silently update the from address; the user should be aware of the configuration error.
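A rough sketch of the sender check, assuming you keep a mapping from each authenticated username to the addresses that user may send as. The mapping, the function name and the rejection text are all made up for illustration; real mail servers expose this through their own configuration.

```python
from email.utils import parseaddr

# Hypothetical mapping of login names to the addresses they may use in From:.
ALLOWED_SENDERS = {
    "alice": {"alice@example.org", "sales@example.org"},
}

def check_sender(auth_user, from_header, allowed=ALLOWED_SENDERS):
    """Return (ok, reply). Reject on mismatch rather than rewriting the address."""
    _, addr = parseaddr(from_header)
    addr = addr.lower()
    if addr in allowed.get(auth_user, set()):
        return True, "ok"
    # The user sees the error and can fix the client configuration.
    return False, f"550 sender address {addr!r} not owned by user {auth_user!r}"

print(check_sender("alice", "Alice <Alice@Example.org>"))
print(check_sender("alice", "CEO <ceo@example.org>"))
```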
You should also log the username which is used to send mail, or add an identifying header to it. This way, if abuse does occur, you have evidence and know which account was used to do it. This allows you to isolate compromised accounts and problem users, and is especially valuable for shared hosting providers.
Filter traffic
You want to be certain that mail leaving your network is actually being sent by your (authenticated) users, not by bots or people from outside. The specifics of how you do this depend on exactly what kind of system you are administering.
Generally, blocking egress traffic on ports 25, 465, and 587 (SMTP, SMTP/SSL, and Submission) for everything but your outbound mailservers is a good idea if you are a corporate network. This is so that malware-running bots on your network cannot send spam from your network either to open relays on the Internet or directly to the final MTA for an address.
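In practice this is a firewall rule, but the logic is simple enough to sketch. The server address below is a hypothetical placeholder; the port list is the one given above.

```python
import ipaddress

SMTP_PORTS = {25, 465, 587}  # SMTP, SMTP/SSL, Submission
# Hypothetical: the only internal hosts allowed to speak SMTP to the Internet.
OUTBOUND_MAIL_SERVERS = {ipaddress.ip_address("10.0.0.25")}

def egress_allowed(src_ip, dst_port, mail_servers=OUTBOUND_MAIL_SERVERS):
    """Return True if an outbound connection passes the SMTP egress policy."""
    if dst_port not in SMTP_PORTS:
        return True  # not mail traffic; other firewall rules decide
    return ipaddress.ip_address(src_ip) in mail_servers

print(egress_allowed("10.0.0.25", 25))    # the mail server itself
print(egress_allowed("10.0.3.117", 25))   # a desktop, possibly running a spambot
print(egress_allowed("10.0.3.117", 443))  # ordinary web traffic
```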
Hotspots are a special case because legitimate mail from them originates from many different domains, but (because of SPF, among other things) a "forced" mailserver is inappropriate and users should be using their own domain's SMTP server to submit mail. This case is much harder, but using a specific public IP or IP range for Internet traffic from these hosts (to protect your site's reputation), throttling SMTP traffic, and deep packet inspection are solutions to consider.
Historically, spambots have issued spam mainly on port 25, but nothing prevents them from using port 587 for the same purpose, so changing the port used for inbound mail is of dubious value. However, using port 587 for mail submission is recommended by RFC 6409 (the successor to RFC 2476 and RFC 4409), and allows for a separation between mail submission (to the first MTA) and mail transfer (between MTAs) where that is not obvious from network topology; if you require such separation, you should do this.
If you are an ISP, VPS host, colocation provider, or similar, or are providing a hotspot for use by visitors, blocking egress SMTP traffic can be problematic for users who are sending mail using their own domains. In all cases except a public hotspot, you should require users who need outbound SMTP access because they are running a mailserver to specifically request it. Let them know that abuse complaints will ultimately result in that access being terminated to protect your reputation.
Dynamic IPs, and those used for virtual desktop infrastructure, should never have outbound SMTP access except to the specific mailserver those nodes are expected to use. These types of IPs should also appear on blacklists and you should not attempt to build reputation for them. This is because they are extremely unlikely to be running a legitimate MTA.
Consider using SpamAssassin
SpamAssassin is a mail filter which can be used to identify spam based on the message headers and content. It uses a rules-based scoring system to determine the likelihood that a message is spam. The higher the score, the more likely the message is spam.
SpamAssassin also has a Bayesian engine which can analyze spam and ham (legitimate email) samples fed back into it.
Best practice for SpamAssassin is not to reject the mail, but to put it in a Junk or Spam folder. MUAs (mail user agents) such as Outlook and Thunderbird can be set up to recognize the headers that SpamAssassin adds to email messages and to file them appropriately. False positives can and do happen, and while they're rare, when it happens to the CEO, you will hear about it. That conversation will go much better if the message was simply delivered to the Junk folder rather than rejected outright.
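To illustrate the idea of rules-based scoring with a "file it, don't reject it" outcome, here is a toy scorer. The rule names, weights and threshold are invented for this sketch; they are not SpamAssassin's actual rules or code.

```python
import re

# Invented rules: (name, test function, points). Not real SpamAssassin rules.
RULES = [
    ("SUBJECT_ALL_CAPS", lambda m: m["subject"].isupper(), 1.5),
    ("MENTIONS_VIAGRA", lambda m: re.search(r"viagra", m["body"], re.I) is not None, 3.0),
    ("MISSING_DATE", lambda m: not m.get("date"), 1.0),
]
THRESHOLD = 5.0  # assumed cut-off for this sketch

def score_message(msg, rules=RULES):
    """Return (total score, names of the rules that fired)."""
    hits = [(name, points) for name, test, points in rules if test(msg)]
    return sum(points for _, points in hits), [name for name, _ in hits]

def disposition(msg, threshold=THRESHOLD):
    """File likely spam in Junk instead of rejecting it outright."""
    score, _ = score_message(msg)
    return "Junk" if score >= threshold else "Inbox"

spam = {"subject": "BUY NOW", "body": "Cheap Viagra here", "date": ""}
ham = {"subject": "Lunch?", "body": "See you at noon", "date": "Mon, 1 Jan 2024"}
print(score_message(spam), disposition(spam))
print(score_message(ham), disposition(ham))
```

Because no single rule reaches the threshold on its own, one unlucky match on a legitimate message is not enough to send it to Junk.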
SpamAssassin is almost one-of-a-kind, though a few alternatives exist.
Keep SpamAssassin's rules up to date with sa-update.
Consider using DNS-based blackhole lists and reputation services
DNSBLs (formerly known as RBLs, or realtime blackhole lists) provide lists of IP addresses associated with spam or other malicious activity. These are run by independent third parties based on their own criteria, so research carefully whether the listing and delisting criteria used by a DNSBL are compatible with your organization's need to receive email. For instance, a few DNSBLs have draconian delisting policies which make it very difficult for someone who was accidentally listed to be removed. Others automatically delist after the IP address has not sent spam for a period of time, which is safer. Most DNSBLs are free to use.
Reputation services are similar, but claim to provide better results by analyzing more data relevant to any given IP address. Most reputation services require a subscription payment or hardware purchase or both.
There are dozens of DNSBLs and reputation services available, though some of the better known and useful ones I use and recommend are:
Conservative lists:
Aggressive lists:
As mentioned before, many dozens of others are available and may suit your needs. One of my favorite tricks is to look up the IP address which delivered a spam that got through against multiple DNSBLs to see which of them would have rejected it.
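The lookup trick is easy to script. A DNSBL is queried by reversing the octets of the IPv4 address and appending the list's zone; an answer means "listed", NXDOMAIN means "not listed". The zone name used here is a placeholder, not a real list, and the sketch handles IPv4 only.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone):
    """Return True if the DNSBL answers for this IP (requires network access).

    Caveat: any resolution failure, including a timeout, counts as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False

# "dnsbl.example.org" is a hypothetical zone; substitute the lists you are evaluating.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Running `is_listed` against several zones for the IP that delivered a missed spam shows which lists would have caught it.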
Use SPF
SPF (Sender Policy Framework; RFC 4408, since obsoleted by RFC 7208, and RFC 6652) is a means to prevent email address spoofing by declaring which Internet hosts are authorized to deliver mail for a given domain name.
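As a rough illustration of how a receiver evaluates such a declaration, the sketch below checks a connecting IP against an SPF record string. It is deliberately incomplete: it understands only the ip4 and all mechanisms and skips everything else (a, mx, include, redirect and so on), and it takes the record as a string rather than fetching it from DNS. The record and addresses are examples.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def spf_ip4_result(record, client_ip):
    """Evaluate only the ip4 and all mechanisms of an SPF record."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # default qualifier when none is written
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        if name.lower() == "ip4" and ip in ipaddress.ip_network(value, strict=False):
            return RESULTS[qualifier]
        if name.lower() == "all":
            return RESULTS[qualifier]
        # Other mechanisms are ignored in this sketch.
    return "neutral"

record = "v=spf1 ip4:192.0.2.0/24 -all"
print(spf_ip4_result(record, "192.0.2.10"))    # inside the authorized range
print(spf_ip4_result(record, "198.51.100.7"))  # falls through to -all
```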
Publish an SPF record in DNS that lists your authorized mail servers and ends with -all to reject all others.
Investigate DKIM
DKIM (DomainKeys Identified Mail; RFC 6376) is a method of embedding digital signatures in mail messages which can be verified using public keys published in the DNS. It is patent-encumbered in the US, which has slowed its adoption. DKIM signatures can also break if a message is modified in transit (e.g. SMTP servers occasionally may repack MIME messages).
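The "public keys published in the DNS" part works like this: the DKIM-Signature header carries a signing domain (d=) and a selector (s=), and the verifier fetches a TXT record at selector._domainkey.domain. The sketch below only derives that DNS name from a header value; it does not verify any signature. The header content is an invented example.

```python
def dkim_key_name(signature_header):
    """Return the DNS name holding the public key for a DKIM-Signature value."""
    tags = {}
    for part in signature_header.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            # Tag values may be folded across lines; drop all whitespace.
            tags[key.strip()] = "".join(value.split())
    return f'{tags["s"]}._domainkey.{tags["d"]}'

# Invented example header value (bh= and b= are placeholders, not real data).
header = "v=1; a=rsa-sha256; d=example.org; s=mail2024; h=from:to:subject; bh=abc=; b=def="
print(dkim_key_name(header))
```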
Consider using greylisting
Greylisting is a technique where the SMTP server issues a temporary rejection for an incoming message, rather than a permanent rejection. When the delivery is retried in a few minutes or hours, the SMTP server will then accept the message.
Greylisting can stop some spam software which is not robust enough to differentiate between temporary and permanent rejections, but does not help with spam that was sent to an open relay or with more robust spam software. It also introduces delivery delays which users may not always tolerate.
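A minimal in-memory sketch of the mechanism. Keying on the (client IP, sender, recipient) triplet and using a 300-second minimum delay are assumptions of this sketch, not something the text above prescribes; real implementations also expire old entries and whitelist known-good senders.

```python
class Greylist:
    """Temporarily reject the first delivery attempt for each new triplet."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay  # seconds a sender must wait before retrying
        self.first_seen = {}        # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient, now):
        key = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 accept"
        # 4xx is a temporary failure: a proper MTA queues the message and retries.
        return "451 try again later"

g = Greylist()
print(g.check("192.0.2.1", "a@example.com", "b@example.org", now=0))    # first attempt
print(g.check("192.0.2.1", "a@example.com", "b@example.org", now=60))   # retried too soon
print(g.check("192.0.2.1", "a@example.com", "b@example.org", now=900))  # retried later
```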
Consider using nolisting
Nolisting is a method of configuring your MX records such that the highest priority (lowest preference number) record does not have a running SMTP server. This relies on the fact that a lot of spam software will only try the first MX record, while legitimate SMTP servers try all MX records in ascending order of preference. Some spam software also attempts to send directly to the lowest priority (highest preference number) MX record in violation of RFC 5321, so that could also be set to an IP address without an SMTP server. This is reported to be safe, though as with anything, you should test carefully first.
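The difference in behaviour can be modelled in a few lines. The hostnames and the `listening` set are hypothetical; the point is only that a sender which walks the MX list in preference order still reaches the real server, while one that tries just the first record does not.

```python
def delivery_target(mx_records, listening, tries_all=True):
    """Return the host a sender ends up delivering to, or None if it gives up.

    mx_records: list of (preference, hostname); lower preference is tried first.
    listening:  set of hostnames that actually run an SMTP server.
    tries_all:  True for a compliant MTA, False for spamware that tries one MX only.
    """
    ordered = sorted(mx_records)
    candidates = ordered if tries_all else ordered[:1]
    for _, host in candidates:
        if host in listening:
            return host
    return None

# Hypothetical zone: the best-preference MX points at a host with no SMTP server.
mx = [(20, "mail.example.org"), (10, "nolist.example.org")]
listening = {"mail.example.org"}
print(delivery_target(mx, listening))                   # compliant sender
print(delivery_target(mx, listening, tries_all=False))  # first-MX-only spamware
```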
Consider a spam filtering appliance
Place a spam filtering appliance such as Cisco IronPort or Barracuda Spam & Virus Firewall (or other similar appliances) in front of your existing SMTP server to take much of the work out of reducing the spam you receive. These appliances are pre-configured with DNSBLs, reputation services, Bayesian filters and the other features I've covered, and are updated regularly by their manufacturers.
Consider hosted email services
If it's all too much for you (or your overworked IT staff) you can always have a third party service provider handle your email for you. Services such as Google's Postini, Symantec MessageLabs Email Security (or others) will filter messages for you. Some of these services can also handle regulatory and legal requirements.
What guidance should sysadmins give to end users regarding fighting spam?
The absolute #1 thing that end users should do to fight spam is:
DO NOT RESPOND TO THE SPAM.
If it looks funny, don't click the website link and don't open the attachment, no matter how attractive the offer seems. That Viagra isn't that cheap, you aren't really going to get naked pictures of anybody, and there is no $15 million in Nigeria or elsewhere except for the money taken from people who did respond to the spam.
If you see a spam message, mark it as Junk or Spam depending on your mail client.
DO NOT mark a message as Junk/Spam if you actually signed up to receive the messages and just want to stop receiving them. Instead, unsubscribe from the mailing list using the unsubscribe method provided.
Check your Junk/Spam folder regularly to see if any legitimate messages ended up there. Mark these as Not Junk/Not Spam and add the sender to your contacts to prevent their messages from being marked as spam in the future.
I've managed over 100 separate mail environments over the years and have used numerous processes to reduce or help eliminate spam.
Technology has evolved over time, so this answer will walk through some of the things I've tried in the past and detail the current state of affairs.
A few thoughts about protection...
With regard to incoming spam...
My current approach:
I'm a strong advocate of appliance-based spam solutions. I want to reject at the perimeter of the network and save the CPU cycles at the mail server level. Using an appliance also provides some independence from the actual mail server (mail delivery agent) solution.
I recommend Barracuda Spam Filter appliances for a number of reasons. I've deployed several dozen units, and the web-driven interface, industry mindshare and set-and-forget appliance nature make it a winner. The backend technology incorporates many of the techniques listed above.
Barracuda Spam & Virus Firewall 300 status console
Newer approach:
I've been experimenting with Barracuda's Cloud-based Email Security Service over the past month. This is similar to other hosted solutions, but is well-suited to smaller sites, where an expensive appliance is cost-prohibitive. For a nominal yearly fee, this service provides about 85% of what the hardware appliance does. The service can also be run in tandem with an onsite appliance to reduce incoming bandwidth and provide another layer of security. It's also a nice buffer that can spool mail in the event of a server outage. The analytics are still useful, although not as detailed as a physical unit's.
Barracuda Cloud Email Security console
All in all, I've tried many solutions, but given the scale of certain environments, and the increasing demands of the user base, I want the most elegant solution(s) available. Taking the multi-pronged approach and "rolling your own" is certainly possible, but I've done well with some basic security and good use and monitoring of the Barracuda device. Users are very happy with the result.
Note: Cisco Ironport is great as well... Just costlier.
Partly, I endorse what others have said; partly, I don't.
Spamassassin
This works very well for me, but you need to spend some time training the Bayesian filter with both ham and spam.
Greylisting
ewwhite may feel its day has come and gone, but I can't agree. One of my clients asked how effective my various filters were, so here are approximate stats for July 2012 for my personal mailserver:
So about 44000 never made it through the greylisting; if I'd not had greylisting, and had accepted all those, they'd have all needed spam filtering, all using CPU and memory, and indeed bandwidth.
Edit: since this answer seems to have been useful to some people, I thought I'd bring the statistics up-to-date. So I re-ran the analysis on the mail logs from Jan 2015, 2.5 years later.
The numbers aren't directly comparable, because I no longer have a note of how I arrived at the 2012 figures, so I can't be sure the methodologies were identical. But I have confidence that I didn't have to run computationally-expensive spam filtering on an awful lot of content back then, and I still don't, because of greylisting.
SPF
This isn't really an anti-spam technique, but it can reduce the amount of backscatter you have to deal with, if you're joe-jobbed. You should use it both in and out, that is: You should check the SPF record of the sender for incoming email, and accept/reject accordingly. You should also publish your own SPF records, listing fully all machines that are approved to send mail as you, and lock out all others with -all. SPF records that don't end in -all are completely useless.
Blackhole lists
RBLs are problematic, since one can get onto them through no fault of one's own, and they can be hard to get off. Nevertheless, they have a legitimate use in spam-fighting but I would strongly suggest that no RBL should ever be used as a bright-line test for mail acceptance. The way spamassassin handles RBLs - by using many, each of which contributes towards a total score, and it's this score that makes the accept/reject decision - is much better.
Dropbox
I don't mean the commercial service. I mean that my mail server has one address which cuts through all my greylisting and spam-filtering, but which, instead of delivering to anyone's INBOX, goes to a world-writeable folder in /var, which is automatically pruned nightly of any emails over 14 days old.
I encourage all users to take advantage of it when eg filling out email forms that require a validatable email address, where you're going to receive one email that you need to keep, but from whom you never wish to hear again, or when buying from online vendors who will likely sell and/or spam their address (particularly those outside the reach of European privacy laws). Instead of giving her real address, a user can give the dropbox address, and look in the dropbox only when she expects something from a correspondent (usually a machine). When it arrives, she can pick it out and save it in her proper mail collection. No user need look in the dropbox at any other time.
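The nightly pruning step could be done with a short script run from cron. This is a sketch under assumptions: it treats each message as one file directly inside the folder and uses the file's modification time as its age. The folder path in the usage comment is hypothetical, since the original only says the folder lives under /var.

```python
import os
import time

def prune_old_mail(folder, max_age_days=14, now=None):
    """Delete files in folder older than max_age_days; return their names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Snapshot the directory listing first so deletion does not disturb iteration.
    for entry in list(os.scandir(folder)):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)

# Example nightly call (hypothetical path):
# prune_old_mail("/var/spool/dropbox")
```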
I am using a number of techniques which reduce spam to acceptable levels.
Delay accepting connections from incorrectly configured servers. A majority of the Spam I receive is from Spambots running on malware-infected systems. Almost all of these do not pass rDNS validation. Delaying for 30 seconds or so before each response causes most Spambots to give up before they have delivered their message. Applying this only to servers which fail rDNS avoids penalizing properly configured servers. Some incorrectly configured legitimate bulk or automated senders get penalized, but do deliver with minimal delay.
Configuring SPF for all your domains protects your domains. Most sub-domains should not be used to send email. The main exception is MX domains which must be able to send mail on their own. A number of legitimate senders delegate bulk and automated mail to servers that are not permitted by their policy. Deferring rather than rejecting based on SPF allows them to fix their SPF configuration, or you to whitelist them.
Requiring a FQDN (Fully Qualified Domain Name) in the HELO/EHLO command. Spam often uses an unqualified hostname, address literals, IP addresses, or an invalid TLD (Top Level Domain). Unfortunately some legitimate senders use invalid TLDs so it may be more appropriate to defer in this case. This can require monitoring and whitelisting to let the mail through.
DKIM helps with non-repudiation, but is otherwise not highly useful. My experience is that Spam is not likely to be signed. Ham is more likely to be signed so it has some value in Spam scoring. A number of legitimate senders don't publish their public keys, or otherwise improperly configure their system.
Greylisting is helpful for servers which show some signs of misconfiguration. Servers that are properly configured will get through eventually, so I tend to exclude them from greylisting. It is useful to greylist freemailers as they do tend to be used occasionally for Spam. The delay gives some of the Spam filter inputs time to catch the Spammer. It also tends to deflect Spambots as they usually don't retry.
Blacklists and Whitelists can help as well.
Spam filtering software is reasonably good at finding Spam although some will get through. It can be tricky getting the false negative to a reasonable level without increasing the false positive too much. I find Spamassassin catches most of the Spam that reaches it. I've added a few custom rules, that fit my needs.
Postmasters should configure the required abuse and postmaster addresses. Acknowledge the feedback you get to these addresses and act on it. This allows others to help you ensure your server is properly configured and not originating Spam.
If you are a developer, use the existing email services rather than setting up your own server. It is my experience that servers set up for automated mail senders are likely to be incorrectly configured. Review the RFCs and send properly formatted email from a legitimate address in your domain.
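The HELO/EHLO requirement described above can be sketched as a classifier that names the problem it finds, leaving the reject-or-defer decision to the caller. It covers unqualified names, address literals, bare IP addresses and bad label syntax; it does not check the TLD against a list of valid TLDs, and the category strings are made up for this sketch.

```python
import ipaddress
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")

def helo_problem(name):
    """Return a description of what is wrong with a HELO/EHLO name, or None."""
    if name.startswith("[") and name.endswith("]"):
        return "address literal"
    try:
        ipaddress.ip_address(name)
        return "bare IP address"
    except ValueError:
        pass  # not an IP address, so treat it as a hostname
    labels = name.rstrip(".").split(".")
    if len(labels) < 2:
        return "unqualified hostname"
    if not all(LABEL.match(label) for label in labels):
        return "invalid hostname syntax"
    return None

for helo in ("mail.example.org", "localhost", "[192.0.2.1]", "192.0.2.1"):
    print(helo, "->", helo_problem(helo))
```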
End users can do a number of things to help reduce Spam:
Domain owners / ISPs can help by limiting Internet access on port 25 (SMTP) to official e-mail servers. This will limit the ability of Spambots to send to the Internet. It also helps when dynamic addresses return names which do not pass rDNS validation. Even better is to verify that the PTR records for mail servers do pass rDNS validation. (Check for typographical errors when configuring PTR records for your clients.)
I have started classifying email in three categories:
The SINGLE most effective solution I have seen is to use one of the external mail filtering services.
I have experience with the following services at current clients. I am sure there are others. Each of these has done an excellent job in my experience. The cost is reasonable for all three.
The services have several huge advantages over local solutions.
They stop most (>99%) of the spam BEFORE it hits your internet connection and your email server. Given the volume of spam, this is a lot of data not on your bandwidth and not on your server. I have implemented one of these services a dozen times and every one resulted in a noticeable performance improvement to the email server.
They also do anti-virus filtering, typically in both directions. This mitigates the need to have a "mail anti-virus" solution on your server, and also keeps the viruses completely
They also do a great job at blocking spam. In 2 years working at a company using MXLogic, I have never had a false positive, and can count the legit spam messages that got through on one hand.
No two mail environments are the same. So building an effective solution will require a lot of trial and error around the many different techniques available because the content of email, traffic, software, networks, senders, recipients and a lot more will all vary hugely across different environments.
However I find the following block lists (RBLs) to be well suited for general filtering:
As already stated, SpamAssassin is a great solution when configured correctly. Just make sure to install as many of the add-on Perl modules from CPAN as possible, as well as Razor, Pyzor and DCC. Postfix works very well with SpamAssassin and it's a lot easier to manage and configure than Exim, for example.
Finally, blocking clients at the IP level using fail2ban and iptables or similar for short periods of time (say one day to a week) after certain events, such as an RBL hit for abusive behavior, can also be very effective. Why waste resources talking to a known virus-infected host, right?
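The core of a time-limited ban is just an expiry timestamp per IP. The sketch below models that idea in memory; it is not fail2ban's code or configuration, and the class and method names are hypothetical. In a real deployment the ban would be enforced by a firewall rule rather than a Python dictionary.

```python
class TempBanList:
    """Remember offending IPs for a limited time, then forget them."""

    def __init__(self, ban_seconds=86400):  # one day by default
        self.ban_seconds = ban_seconds
        self.banned_until = {}  # ip -> time at which the ban expires

    def ban(self, ip, now):
        self.banned_until[ip] = now + self.ban_seconds

    def is_banned(self, ip, now):
        expiry = self.banned_until.get(ip)
        if expiry is None:
            return False
        if now >= expiry:
            del self.banned_until[ip]  # ban has run out
            return False
        return True

bans = TempBanList()
bans.ban("203.0.113.9", now=0)                  # e.g. after an RBL hit
print(bans.is_banned("203.0.113.9", now=3600))  # an hour later
print(bans.is_banned("203.0.113.9", now=90000)) # more than a day later
```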