The Saltmine Chronicles

Behind the scenes at a web hosting company

A small primer on Greylisting email

by Sean Conner
on Monday, October 22, 2007

Mistah Wheelus requested I write an explanation we could give to our customers about greylisting. And here's my attempt at being lucid.

Greylisting is an easy and effective (so far) anti-spam technique (our current tests show an effective spam-stopping rate of over 97%) but before I explain how it works, I must first explain a bit how the email system works.

Once you click “send,” your computer connects to an outgoing email server (this is the “Outgoing SMTP Server” setting in the configuration of your email account). It identifies itself to the outgoing email server, sends your email address (technically, the “sender address”), then the email address of the person you are sending to (the “recipient address”), and finally the actual email (whose contents may have nothing to do with either the sender or recipient address, a fact that spammers often exploit for fun and profit).
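For the curious, here is a rough sketch in Python of the commands your computer sends during that conversation. It is a simplification under a few assumptions: the function name `smtp_dialogue` is made up for this example, the server's replies are left out, and real email programs usually do more (such as authenticating first). Note that the sender and recipient addresses are given separately from the email itself, which is why the two don't have to agree.

```python
def smtp_dialogue(client_name, sender, recipient, message):
    """Return the commands a client sends, in order, to hand one email to a server.

    A simplified sketch: server replies are omitted and nothing is sent
    over the network.
    """
    return [
        f"HELO {client_name}",      # identify ourselves to the server
        f"MAIL FROM:<{sender}>",    # the sender address
        f"RCPT TO:<{recipient}>",   # the recipient address
        "DATA",                     # announce that the email itself follows
        message + "\r\n.",          # the actual email, ended by a line holding a lone "."
        "QUIT",                     # end the conversation
    ]


# The "From:" line inside the email need not match the sender address above.
email_text = "From: someone.else@example.org\r\nSubject: Hello\r\n\r\nHi Sean!"
for command in smtp_dialogue("mypc.example.net", "fred@example.net",
                             "sean@pickint.net", email_text):
    print(command)
```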

Once the outgoing email server has accepted the email, it is then queued up for final delivery.

Technobabble

Actually, I'm describing the most commonly used configuration, where all outgoing emails for an organization (or an ISP) are funneled through a so-called “relay host” or “smart host” because of security issues or as a means of preventing outgoing spam.

Some ISPs go so far as to block all outgoing email traffic from their subscriber base, only allowing connections to their outgoing email server.

If an outgoing email server isn't required, then your computer may very well connect directly to the server responsible for the recipient's email and deliver the email directly. But then, your computer becomes responsible for redelivery in case the recipient's server can't accept the email at that time.

There are more details of this in the next Technobabble section.

The outgoing email server then looks up where to send the email based upon the recipient's domain name. Once this is done, it connects to an incoming email server that handles email for the recipient and, using SMTP, delivers the email to the recipient's email box. If for any reason the email can't be delivered, or there's an error during the delivery, the outgoing email server queues up the email for another attempt at a later time (anywhere from a few minutes to maybe an hour later). This is an important detail to remember.
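The retry behavior can be sketched in a few lines of Python. This is only an illustration, not how any particular email server does it: the names `run_queue` and `try_delivery` are made up, and the 30 minute retry delay is an assumed value picked from the "few minutes to an hour" range.

```python
import heapq

RETRY_DELAY = 30 * 60  # seconds; an assumed value, real servers vary


def run_queue(queue, try_delivery, now):
    """Attempt every queued email that is due at time `now` (in seconds).

    `queue` is a heap of (next_attempt_time, email) tuples.
    `try_delivery(email)` is a hypothetical callback returning True on success.
    Emails that fail are put back in the queue with a later attempt time.
    """
    delivered = []
    to_retry = []
    while queue and queue[0][0] <= now:
        _, email = heapq.heappop(queue)
        if try_delivery(email):
            delivered.append(email)
        else:
            to_retry.append((now + RETRY_DELAY, email))
    for item in to_retry:
        heapq.heappush(queue, item)
    return delivered
```

An email that fails now simply waits in the queue until its next attempt time comes around, with no action needed from the person who sent it.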

Technobabble

I glossed over quite a few details here. The computer sending the email to the recipient first does a DNS lookup for a special type of record, the MX record. This returns a list of servers that handle email for that domain. For instance, at this moment in time, the following servers handle incoming email for gmail.com:

Incoming email servers for Gmail

Server name                        Priority
gmail-smtp-in.l.google.com         5
alt1.gmail-smtp-in.l.google.com    10
alt2.gmail-smtp-in.l.google.com    10
gsmtp163.google.com                50
gsmtp183.google.com                50

The server (or servers) with the lowest priority number is tried first. If more than one server has the same priority, then one is picked randomly. So, in this case, if for some reason gmail-smtp-in.l.google.com is not responding, then the sending computer picks either alt1.gmail-smtp-in.l.google.com or alt2.gmail-smtp-in.l.google.com.

Oh, and what if there isn't an MX for the domain in question? Then the sending computer looks up the IP address associated with the domain and delivers the email to that machine.
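The selection rules in this section can be sketched in Python. This is a minimal illustration that works on a list of records already in hand; it does no actual DNS lookups, and the function name `delivery_order` is made up for the example. The data is the Gmail list quoted above.

```python
import random

# The Gmail MX records quoted above: (priority, server name).
GMAIL_MX = [
    (5, "gmail-smtp-in.l.google.com"),
    (10, "alt1.gmail-smtp-in.l.google.com"),
    (10, "alt2.gmail-smtp-in.l.google.com"),
    (50, "gsmtp163.google.com"),
    (50, "gsmtp183.google.com"),
]


def delivery_order(domain, mx_records):
    """Return the servers to try, in order, when sending email to `domain`."""
    if not mx_records:
        # No MX record: deliver to the machine the domain name itself points to.
        return [domain]
    # Shuffle first, then sort by priority only. Python's sort is stable, so
    # servers sharing a priority keep their shuffled (random) relative order.
    shuffled = random.sample(mx_records, k=len(mx_records))
    return [name for _, name in sorted(shuffled, key=lambda rec: rec[0])]


print(delivery_order("gmail.com", GMAIL_MX))
print(delivery_order("example.net", []))
```

The first server in the result is always gmail-smtp-in.l.google.com; the order of the two "alt" servers (and of the two priority 50 servers) changes from run to run.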

Once the email has been successfully delivered, it's then saved in the recipient's incoming email box, where it stays until the recipient retrieves it (which is beyond the scope of this entry).

Now, how does Greylisting fit into all of this?

Greylisting works on the recipient side of this. Send me an email, and eventually, some server from your end (“your server”) will contact the server on my end (“my server”) to deliver the email. My server then has three pieces of information: the IP address of your server, your email address (assuming it matches the sender address) and my email address. And for the sake of an example, let's say it's [ 3.4.5.6 , fred@example.net , sean@pickint.net ]. My server will see if it has seen that particular combination before, and if not, record it, and send back to your server “try again later.” And until it's been at least 25 minutes since my server first saw that particular combination, it will keep sending back “try again later.”

After the initial 25 minutes, any email from 3.4.5.6 with sender fred@example.net and recipient sean@pickint.net will be accepted. But other emails from 3.4.5.6 can still experience the delay if the sender email address, the recipient email address, or both, are different. It's the combination of all three pieces of information that has to match.
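Here is a minimal sketch of that logic in Python, using the example combination from above. It is an illustration under stated assumptions and not the code running on my server: the class name `Greylist` is made up, times are plain seconds passed in by the caller, and details a real implementation needs (such as expiring old entries and replying with an actual SMTP temporary-failure code) are left out.

```python
GREYLIST_DELAY = 25 * 60  # seconds: the 25 minutes used in this entry


class Greylist:
    def __init__(self):
        # (ip, sender, recipient) -> time the combination was first seen
        self.first_seen = {}

    def check(self, ip, sender, recipient, now):
        """Decide what to tell the sending server for an attempt at time `now`."""
        combination = (ip, sender, recipient)
        if combination not in self.first_seen:
            # Never seen before: record it and ask the sender to come back.
            self.first_seen[combination] = now
            return "try again later"
        if now - self.first_seen[combination] < GREYLIST_DELAY:
            # Seen before, but the delay period hasn't passed yet.
            return "try again later"
        return "accept"


greylist = Greylist()
# First attempt, a retry after 10 minutes, and a retry after 25 minutes.
for minutes in (0, 10, 25):
    print(minutes, greylist.check("3.4.5.6", "fred@example.net",
                                  "sean@pickint.net", now=minutes * 60))
```

Because the dictionary key is the full combination, a different sender or recipient coming from 3.4.5.6 starts its own 25 minute clock.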

Basically, greylisting delays an initial email by some period of time, only “whitelisting” the combination once the delay period has passed. And while it seems strange, that simple strategy filtered out over 97% of spam in our tests, since most spammers don't want to bother with redelivery of non-delivered email. They're trying to get their spam out as fast as possible. Attempting to redeliver their spam will only complicate things on their end.

And it's this delay that causes the biggest complaints. But first, the delay only applies to an initial email from an unknown source. Once whitelisted, no delay. Second, email is not (and never was) instant messaging, despite it appearing that way. And third … um … do not talk about Fight Club?

Copyright © 2006-2007 by Sean Conner. All Rights Reserved.