Blog - Dominik Grzelak // Articles

This tutorial focuses on creating a domain blacklist that can be used by SpamAssassin. We'll search stored emails, including their header information, with standard UNIX command-line tools and use the result to extract possible spam candidates.

I presume that postfix, dovecot, and SpamAssassin are installed and configured. At the end, you will have:

- postfix receives the mail
- postfix sends mail to SpamAssassin
- SpamAssassin sends the mail to dovecot for delivery
  - But only if the sender isn't in the blacklist. Otherwise the sender gets a notification mail that the mail couldn't be delivered

The blacklist of SpamAssassin will act like a global filter for all mail users on the system.

What you get

Once an address is blacklisted, its sender will receive a message like the following:

This message was created automatically by mail delivery software.

A message that you sent could not be delivered to one or more of
its recipients. This is a permanent error. The following address(es)
failed:

alice@example.com:
SMTP error from remote server for TEXT command, host: mail.example.com (127.0.0.1) reason: 550 message identified as spam

This gives the sender the opportunity to contact the system administrator if the sender's address was added to the blacklist by accident.

Mail Folder Location

Dovecot is set up in this example to use virtual users, but there isn't much of a difference if system users are used. The only thing we need to know is the path where the mails are stored for each domain and user.

First, open the configuration file of dovecot, which normally can be found at the location /etc/dovecot/dovecot.conf. Otherwise, find it by entering

find /etc -name dovecot.conf

at the command line (if dovecot is installed as a service, then don't use the config file at /etc/init/).

To find out where the mails for each domain and user are stored, look for the properties mail_home and mail_location. The property mail_home tells you the base path under which a user's mails are kept. mail_location tells you in which directory under mail_home the mails for the specific domain and user are stored. The wiki of Dovecot explains it very well (see Ways to set up home directory). In this example I have:

mail_home = /var/vmail/%d/%n
mail_location = maildir:~/

That tells me that all mails for alice@example.com are stored at /var/vmail/example.com/alice/.
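To make the variable expansion concrete, here is a minimal sketch that expands %d (domain) and %n (user part) for a given address. The function name maildir_for and the variable MAIL_HOME_TEMPLATE are made up for this illustration; the template simply mirrors the mail_home value from the example above, so adjust it to your own setting.

```shell
#!/bin/sh
# Hypothetical helper: expand Dovecot's %d and %n for one mail address.
# MAIL_HOME_TEMPLATE mirrors the mail_home value of this example setup.
MAIL_HOME_TEMPLATE='/var/vmail/%d/%n'

maildir_for() {
    address=$1
    user=${address%@*}      # everything before the last "@"
    domain=${address##*@}   # everything after the last "@"
    # Note: addresses containing "|" or "&" would confuse sed here.
    printf '%s\n' "$MAIL_HOME_TEMPLATE" |
        sed -e "s|%d|$domain|g" -e "s|%n|$user|g"
}
```

Calling maildir_for alice@example.com prints /var/vmail/example.com/alice, which is the directory inspected in the rest of this tutorial.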

Depending on your dovecot settings, you will find different folders in your mail directory (output truncated):

tree /var/vmail/example.com/alice/ -L 3 -a

/var/vmail/example.com/alice/
|-- .Archive
|   |-- cur
|   |-- new
|   `-- tmp
|-- cur
|   |-- 1504186870.M996138P8902.example.com,S=174137,W=176681:2,S
|   |-- ...
|-- .Drafts
|   |-- cur
|   |   |-- 1512731139.M568594P18171.example.com,S=10009212,W=10139212:2,S
|   |   |-- ...
|   |-- new
|   `-- tmp
|-- .Junk
|   |-- cur
|   |   |-- 1511683720.M263399P2587.example.com,S=3523,W=3589:2,S
|   |   |-- ...
|   |-- new
|   `-- tmp
|-- new
|-- .Sent
|   |-- cur
|   |   |-- 1512080393.M738158P32515.example.com,S=2414,W=2471:2,S
|   |   |-- ...
|   |-- new
|   `-- tmp
`-- .Trash
    |-- cur
    |   |-- 1507194573.M856241P3240.example.com,S=166495,W=168929:2,S
    |   |-- ...
    |-- new
    `-- tmp

Blacklist for SpamAssassin

Configuration

In this step, you will create a blacklist configuration file for SpamAssassin. So, locate the configuration file of SpamAssassin first:

find /etc -name local.cf
nano -w /etc/spamassassin/local.cf

Now insert the following line at the end of the file:

include blacklist.cf

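Save the file, then create the new file in the same directory as local.cf, because SpamAssassin resolves a relative include path against the directory of the including configuration file:

touch /etc/spamassassin/blacklist.cf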

In that file you're going to insert all addresses and domains that should be blacklisted by SpamAssassin.
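If you prefer to script this step, the following sketch adds the include line only when it is missing and creates the blacklist file next to local.cf. The function name ensure_blacklist_include is hypothetical, and the sketch assumes that local.cf already exists and ends with a newline.

```shell
#!/bin/sh
# Hypothetical helper: ensure_blacklist_include CONF_DIR
# CONF_DIR is the directory that contains local.cf, e.g. /etc/spamassassin.
ensure_blacklist_include() {
    conf_dir=$1
    local_cf="$conf_dir/local.cf"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF 'include blacklist.cf' "$local_cf"; then
        printf 'include blacklist.cf\n' >> "$local_cf"
    fi
    # touch only creates the file if it does not exist yet
    touch "$conf_dir/blacklist.cf"
}
```

Running ensure_blacklist_include /etc/spamassassin several times leaves exactly one include line in local.cf.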

Spam Detection for Mails

To create a valid blacklist of possible spam candidates automatically, we first have to search all emails of a user and then evaluate the results manually.

Most likely you'll find spam mails in the .Junk folder, so let's use it as an example. We're going to use grep to filter out the "From:" field of each email like this (output truncated):

> grep -rnw '/var/vmail/example.com/alice/.Junk/' -e 'From:'

Binary file /var/vmail/example.com/alice/.Junk/dovecot.index.cache matches
/var/vmail/example.com/alice/.Junk/cur/1512593241.M507525P18448.example.com,S=182232,W=184629:2,S:18:From: "Bitcoin Millions" <omzyysm@svensinc.biz.ua>
/var/vmail/example.com/alice/.Junk/cur/1515577697.M577156P7839.example.com,S=2571,W=2651:2,S:11:From: "Ralf Kanterheim" <vincentop@sheyingjun.com>
/var/vmail/example.com/alice/.Junk/cur/1515577773.M487457P7858.example.com,S=4018,W=4110:2,:10:From: "Konstanze Krantz" <konstanzewphmfdrkrantz@hisarkaplicalari.com>
/var/vmail/example.com/alice/.Junk/cur/1515577869.M759592P7858.example.com,S=826915,W=837727:2,:14:From: "Versandapotheke" <uwdilnc@limanaki.co.ua>
/var/vmail/example.com/alice/.Junk/cur/1515577953.M797451P7858.example.com,S=143228,W=145133:2,:13:From: "OneTwoSlim" <yyyujsx@svensinc.biz.ua>
/var/vmail/example.com/alice/.Junk/cur/1515577689.M553161P7839.example.com,S=2906,W=2977:2,S:10:From: "Gotthilf" <gotthilfzq9s8ca@globemg.com>

To process this output further, we add the -h parameter to suppress the filenames and drop -n to omit the line numbers:

> grep -rhw '/var/vmail/example.com/alice/.Junk/' -e 'From:'

From: "Bitcoin Millions" <omzyysm@svensinc.biz.ua>
From: "Ralf Kanterheim" <vincentop@sheyingjun.com>
From: "Konstanze Krantz" <konstanzewphmfdrkrantz@hisarkaplicalari.com>
From: "Versandapotheke" <uwdilnc@limanaki.co.ua>
From: "OneTwoSlim" <yyyujsx@svensinc.biz.ua>
From: "Gotthilf" <gotthilfzq9s8ca@globemg.com>

Next, we extract the part between "<" and ">":

> grep -rhw '/var/vmail/example.com/alice/.Junk/' -e 'From:' | grep -oP '(?<= <)[^>]+' > from-extracted.out
> cat from-extracted.out

omzyysm@svensinc.biz.ua
vincentop@sheyingjun.com
konstanzewphmfdrkrantz@hisarkaplicalari.com
uwdilnc@limanaki.co.ua
yyyujsx@svensinc.biz.ua
gotthilfzq9s8ca@globemg.com

A quick look at those email addresses is enough to tell that they belong to spam mails. We're doing a really simple spam detection here and nothing fancy.

You can incorporate other techniques to filter possible spam candidates and process further with the next step.
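The grep steps above can also be wrapped into one reusable function. This is only a sketch with a made-up name (extract_senders), and it deviates from the commands above in three ways: it anchors the pattern to the start of the line so that only lines beginning with "From:" match, it skips binary files such as the dovecot index cache with -I, and it uses sed instead of grep -P so that it does not depend on PCRE support.

```shell
#!/bin/sh
# Hypothetical helper: extract_senders MAILDIR
# Prints the unique sender addresses found in "From:" lines below MAILDIR.
extract_senders() {
    maildir=$1
    # -r: recurse, -h: no filenames, -I: ignore binary files
    grep -rhI -e '^From:' "$maildir" |
        # keep only the address between "<" and ">";
        # lines without angle brackets are dropped
        sed -n 's/^From:.*<\([^<>]*@[^<>]*\)>.*$/\1/p' |
        sort -u
}
```

For example, extract_senders /var/vmail/example.com/alice/.Junk/ > from-extracted.out produces the same kind of list as before, already sorted and without duplicates. Keep in mind that a "From:" line at the start of a line inside a message body (for instance in a forwarded mail) is matched as well, so the manual review is still needed.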

Update SpamAssassin

If you are happy with the list, we can append it to the blacklist configuration of SpamAssassin. The format of the blacklist command is described in the documentation:

e.g.

blacklist_from joe@example.com fred@example.com
blacklist_from *@example.com

Now it's easy to add the blacklist_from prefix to the beginning of each line of our generated output:

sed -e 's/^/blacklist_from /' -i from-extracted.out

In-place editing (-i) is used to add the prefix, so the file itself is modified by the command. The last step is to append our new list to the blacklist file created for SpamAssassin earlier and to throw out the duplicates:

cat from-extracted.out >> /etc/spamassassin/blacklist.cf
sort -u -o /etc/spamassassin/blacklist.cf /etc/spamassassin/blacklist.cf
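As an alternative to the separate sed, cat and sort calls, the whole update can be sketched as one function that can safely be run more than once. The name merge_blacklist is hypothetical. Note that it expects the raw address list, one address per line and without the blacklist_from prefix, and that it leaves that input file unchanged.

```shell
#!/bin/sh
# Hypothetical helper: merge_blacklist ADDRESS_FILE BLACKLIST_FILE
# ADDRESS_FILE: one plain address per line (no "blacklist_from" prefix).
merge_blacklist() {
    addresses=$1
    blacklist=$2
    tmp=$(mktemp) || return 1
    {
        # existing entries (the file may not exist yet)
        cat "$blacklist" 2>/dev/null
        # new entries: drop empty lines, then add the prefix
        sed -e '/^$/d' -e 's/^/blacklist_from /' "$addresses"
    } | sort -u > "$tmp"
    # write back with cat so the blacklist keeps its owner and permissions
    cat "$tmp" > "$blacklist"
    status=$?
    rm -f "$tmp"
    return $status
}
```

A call such as merge_blacklist from-extracted.out /etc/spamassassin/blacklist.cf then yields a sorted blacklist without duplicates, no matter how often it is repeated with the same input.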

Finally, you have to restart SpamAssassin so that the blacklist can be used:

/etc/init.d/spamassassin restart

Voilà!

Two more Ideas

With the following command you can run a lint check to verify that the configuration files of SpamAssassin are valid:

spamassassin -D --lint
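Since the lint check exits with a non-zero status when it finds errors, it can guard the restart. The sketch below uses a made-up function name (restart_if_valid) and two overridable variables, LINT_CMD and RESTART_CMD, which are assumptions of this example; their defaults are the commands used in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: restart SpamAssassin only if the lint check passes.
# Both commands can be overridden, e.g. to try the logic without a service.
: "${LINT_CMD:=spamassassin --lint}"
: "${RESTART_CMD:=/etc/init.d/spamassassin restart}"

restart_if_valid() {
    # Intentionally unquoted: each variable holds a command plus arguments.
    if $LINT_CMD; then
        $RESTART_CMD
    else
        echo "lint check failed, SpamAssassin not restarted" >&2
        return 1
    fi
}
```

This way a typo in blacklist.cf is reported before the running service is touched.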

Instead of using full email addresses, you can also use a wildcard like *@example.com.
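A list of full addresses can be turned into such wildcards with a small filter. The following is a sketch with a hypothetical name (to_domain_wildcards); it reads addresses on standard input and prints one wildcard per domain.

```shell
#!/bin/sh
# Hypothetical helper: read addresses on stdin, print "*@domain" once per domain.
to_domain_wildcards() {
    # replace everything up to the last "@" with "*@";
    # lines without "@" are dropped
    sed -n 's/^.*@\(..*\)$/*@\1/p' | sort -u
}
```

For the addresses extracted earlier, to_domain_wildcards < from-extracted.out collapses the two svensinc.biz.ua senders into the single entry *@svensinc.biz.ua. Be careful with this on large mail providers, because a wildcard blocks every sender of that domain.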

Further Alternatives

There are many other ways to fight spam, for example:

Send to Spam-Folder and Learn

SpamAssassin can add specific spam header flags, so that dovecot can trigger the sieve filter. Once spam is detected, the filter moves the mail into the spam folder. SpamAssassin can also be trained when a mail is moved within dovecot. For details, see the tutorial at https://www.christianroessler.net/tech/2015/spamassassin-dovecot-postfix.html.

Block IPs

It's not possible (or at least not easy) to block single IPs or ranges with SpamAssassin. This can be done in the MTA instead.
A better option is to add entries to /etc/hosts.deny.

Firewall rules are also easy to add (change the port and IP to match your setup):

iptables -A INPUT -s x.x.x.x -p tcp --dport 25 -j DROP
iptables -A INPUT -s x.x.x.x -j REJECT
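If you have a longer list of IPs, the first rule can be generated per entry. The sketch below only prints the commands so they can be reviewed before anything is executed. The function name print_drop_rules and the input format (one IPv4 address or CIDR range per line, "#" for comments) are assumptions of this example.

```shell
#!/bin/sh
# Hypothetical helper: read IPs or ranges on stdin and print one iptables
# command per entry for SMTP port 25. Nothing is executed here.
print_drop_rules() {
    while IFS= read -r ip; do
        case $ip in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        printf 'iptables -A INPUT -s %s -p tcp --dport 25 -j DROP\n' "$ip"
    done
}
```

After checking the output of print_drop_rules < blocked-ips.txt (a file name chosen for this example), the printed lines can be run as root.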


Create a Script-driven Domain Blacklist in SpamAssassin