Everything I could find online about greylisting says that the following triplet is used to uniquely identify an incoming e-mail:
- Source IP
- Source e-mail address
- Destination e-mail address
The source IP can cause problems, because large mail services use multiple IP addresses (possibly from completely different IP ranges) to resend temporarily rejected e-mails.
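To make the triplet concrete, here is a minimal Python sketch of how I understand the classic check. The class name `Greylist`, the method `check`, and the 300-second delay are my own assumptions for illustration; they are not taken from any particular greylisting implementation.

```python
from typing import Dict, Tuple

# (source IP, source e-mail address, destination e-mail address)
Triplet = Tuple[str, str, str]


class Greylist:
    """Toy greylist keyed on the classic triplet (hypothetical sketch)."""

    def __init__(self, delay_seconds: float = 300.0) -> None:
        # Assumed retry delay; real deployments choose their own value.
        self.delay_seconds = delay_seconds
        # Maps each triplet to the time it was first seen.
        self.first_seen: Dict[Triplet, float] = {}

    def check(self, ip: str, sender: str, recipient: str, now: float) -> bool:
        """Return True to accept, False to answer with a temporary failure."""
        key = (ip, sender, recipient)
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        return now - first >= self.delay_seconds
```

In this sketch a retry from the same IP after the delay is accepted, while a retry from a different IP builds a different key and starts the waiting period over. That is exactly the situation I describe for large mail services.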
Question
Why is the source IP address relevant at all? Why not just use the source and destination e-mail addresses as the key to identify a given e-mail (sender-receiver link)?
And why not use the subject instead, to identify a specific e-mail more precisely?
Reasoning
Even after thinking at length about what problems could arise from simply ignoring the source IP, I could not find a case where the source IP is relevant.
- When the same IP sends twice from the same source to the same destination e-mail address (and waits the required time), the e-mail is delivered.
- When two different IPs send from the same source to the same destination e-mail address, the e-mail should also be delivered (e.g. large mail services).
- Some greylisting solutions allow a subnet mask for the source IP. But this is very imprecise and does not cover all situations, especially not very large mail services whose MTAs sit in completely different subnets.
- What about a legitimate sender who, on first contact, sends two different e-mails to the same destination e-mail address within the "try again later" period?
- Using source e-mail address, destination e-mail address, and subject as the triplet should in theory let greylisting treat each individual e-mail more accurately, even when several come from the same sender to the same recipient.
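The key variants from the list above can be sketched side by side in Python. All four function names are hypothetical, the /24 prefix is only an example value, and the subject variant assumes the subject is already known at the moment the greylisting decision is made.

```python
import ipaddress
from typing import Tuple


def key_classic(ip: str, sender: str, recipient: str) -> Tuple[str, str, str]:
    """Classic triplet: exact source IP, source address, destination address."""
    return (ip, sender, recipient)


def key_subnet(ip: str, sender: str, recipient: str,
               prefix_len: int = 24) -> Tuple[str, str, str]:
    """Same triplet, but the IP is reduced to its network address."""
    # strict=False lets us pass a host address and have the host bits masked off.
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return (str(net.network_address), sender, recipient)


def key_no_ip(sender: str, recipient: str) -> Tuple[str, str]:
    """Sender-receiver link only; the source IP is ignored."""
    return (sender, recipient)


def key_with_subject(sender: str, recipient: str,
                     subject: str) -> Tuple[str, str, str]:
    """Proposed triplet: the subject takes the place of the source IP."""
    return (sender, recipient, subject)
```

With these keys, two MTAs in the same /24 map to the same `key_subnet` entry, two MTAs in unrelated ranges do not, and only `key_no_ip` and `key_with_subject` are unaffected by which IP performs the retry.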
But my main question is: why include the source IP in the triplet at all? (The chance that two different external entities will send with the same source e-mail address to the same destination e-mail address seems extremely small to me.)