I have been using Sendmail together with milter-greylist for many years at several sites.
milter-greylist supports greylisting rules based on GeoIP database lookups. This is very convenient for companies that do not do business internationally, because almost all of their spam is sent from foreign IP addresses. It does not matter if legitimate (ham) e-mail from foreign addresses is slightly delayed, but local e-mail must arrive without delay, so greylisting is skipped for a couple of country codes. Greylisting is also skipped if the SPF record matches or the IP is on a whitelist. This is very simple to implement in greylist.conf with the milter hook in sendmail.cf. It is also good for the mail server's resources: most spam is dropped before it ever arrives on the server, so the system load caused by spamassassin and/or dspam based filtering further down the delivery path is much lower.
Now to the real question:
How can I implement similar (i.e. GeoIP based) greylisting with Exim?
I have a new responsibility to take care of yet another mail server, which happens to run Exim and receives a high volume of spam. I do not feel like re-implementing their e-mail delivery system from scratch, but I definitely need to do something about the load caused by their spam volumes. Unfortunately Exim does not seem to have a milter interface. I was also unable to locate greylisting solutions with GeoIP support for Exim. I am a complete noob with Exim (I can do everything with sendmail.cf and sendmail m4 macros).
I would be happy if this feature could be implemented using just Exim configuration file syntax. In that case I would take the effort to learn it and possibly start using Exim at other sites as well.
I am answering my own question now that I have a solution that I like myself.
Greylisting itself can be implemented purely with Exim access control lists, or an external greylisting helper can be hooked to the ACLs. There are several approaches to this, which are documented elsewhere.
Greylisting is typically implemented in access control lists, and therefore it is easy to add some external IP address lookup in the ACL to control greylisting behavior (for example to skip greylisting according to a country code lookup).
There are several alternatives for getting the country code, the last of which is a `dlfunc` library which implements the GeoIP lookup in the ACL.
I personally chose the last option as it is the most efficient and does not depend on external resources. I implemented a new `dlfunc` library for this purpose, as none of the several existing ones had IPv6 support. My implementation with simple examples is available at: http://dist.epipe.com/exim/. While implementing this I learned about Exim ACLs and found that they are extremely powerful for implementing any kind of mail acceptance policy.
Now it is easy to skip greylisting for certain countries by adding an ACL rule before the greylisting rules.
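The original rules are not reproduced here, so the following is only a minimal sketch of how such a pair of rules might look. The library path, the function name `geoip_country_code`, the variable name `acl_c_country_code` and the country codes are all hypothetical placeholders, not the names used by the library linked above. It also assumes an Exim binary built with `dlfunc` support.

```
# Place in the RCPT ACL after the relay and recipient checks and before the
# greylisting rules; an early "accept" skips every check that follows it.

# Rule 1: look up the country code of the connecting host.
# ASSUMPTION: path and function name are hypothetical; use the names
# documented by the dlfunc library you actually install.
warn    set acl_c_country_code = \
            ${dlfunc{/usr/local/lib/exim/geoip-dlfunc.so}\
            {geoip_country_code}{$sender_host_address}}

# Rule 2: skip greylisting for the listed (example) country codes.
accept  condition = ${if inlist{$acl_c_country_code}{FI:SE:NO}}

# Possible form of rule 2 for Exim releases without inlist (before 4.77):
# accept  condition = ${if match{$acl_c_country_code}{\N^(FI|SE|NO)$\N}}
```

The `\N ... \N` markers in the commented-out variant keep Exim from expanding the `$` inside the regular expression.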
Exim versions older than 4.77 do not have the `inlist{` syntax. The same can be achieved there by rewriting the condition in the second rule without `inlist`, for example with a `match` regular expression.

By far the best way to manage spam is Bayesian filtering. While you may get transient benefits from applying other approaches before the Bayesian filters, the success of Bayesian filtering depends on having a large volume of spam and ham to model, so if you start denying e-mails based on the IP address you will lose out on detection in the long run. OTOH it should be possible to flag the message rather than just deny it.
SPF and RBLs are also well-proven ways to prevent spam, and both are supported by SpamAssassin (along with Bayesian filtering and others).
Have you modelled your data to see whether adding a country lookup will improve spam detection compared with a well-configured SpamAssassin installation?
If you must go down this route....
Writing a milter is easy - but IIRC, Exim does not support milters.
Trying to map out all the allowable IP addresses as Exim ACLs would be very difficult.
So the most practical way to implement this would be to use an MDA which supports header-injecting filters (e.g. procmail) and which then feeds into SpamAssassin.
It only helps a little bit, but there are GeoIP services that are queryable via DNS (like DNSBLs). Maybe you can use it as a base for making decisions based on the result.
See for example http://www.netop.org/services/ip-geolocation
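If such a service behaves like a DNSBL, Exim's `dnslists` condition could be used to query it from an ACL. This is only a sketch under assumptions: the zone name below is a hypothetical placeholder, and it assumes the service answers with an A record plus a TXT record holding the two-letter country code, which may not match how any real service encodes its results.

```
# ASSUMPTION: geo.example.invalid is a hypothetical DNSBL-style zone whose
# TXT record contains the country code of the queried address.
# Check the chosen service's documentation for its real zone and format.
warn    dnslists = geo.example.invalid
        set acl_c_country_code = $dnslist_text
```

The variable set here (again a hypothetical name) could then be tested in a later rule to decide whether greylisting applies. Every lookup adds a DNS round trip, so this depends on an external resource in a way that a local database lookup does not.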