
Re: A Question


Piet, (Richard excluded from Cc-list as requested ;-)

In your message of Tue, 17 Feb 1998 11:56:14 +0100 you wrote:

+ Just to give you an idea
+ of what my 'personalised' filter treats as spam (addresses
+ treated case-insensitive):

+ - header-from address identical to header-to address
+   (incidentally, I've had 2 cases where a 'vacation'
+   message was seen as spam, because the remote user
+   had pre-filled the To: line with his own address);

Some system administrators might want to add an exception to this rule,
i.e. only filter when the domain part is not 'local' (courtesy of
Steven Bakker). That way you won't miss crontab output (from root to root
;-) Richard has pointed out another problem with this rule.

+ - underscore in domainpart of from- or to-address;

Ah, bad-DNS-bashing ;-)

+ - empty from-address or to-address;

As Richard pointed out, this rule (for an empty To; From is obligatory)
might filter perfectly legal mail, e.g. Bcc's or mail with only a
Cc-recipient (silly but legal).

+ - space or tab in localpart;

Aha, the space-rule could filter out legal mail coming from, for example,
X.400 systems ;-). RFC 822 also doesn't object to spaces in the local part,
provided the string is quoted.
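
To illustrate the quoting point, here is a minimal sketch of a space-rule
that leaves quoted local parts alone. The function name is hypothetical, and
the check is a deliberate simplification: it only recognises a local part
that is quoted as a whole, not the full RFC 822 grammar of dot-separated
words.

```python
def has_unquoted_whitespace(localpart):
    """Return True if the local part contains a space or tab and is not
    wrapped as a whole in double quotes (simplified RFC 822 check)."""
    quoted = (
        len(localpart) >= 2
        and localpart[0] == '"'
        and localpart[-1] == '"'
    )
    if quoted:
        # A fully quoted local part may legally contain spaces.
        return False
    return any(ch in " \t" for ch in localpart)
```

With this check, a local part such as "john smith" without quotes would still
be flagged, while the same string inside double quotes would pass.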

+ - localpart consisting of 8 digits;

Students with their registration ID as local part ?

+ - localpart consisting of any of the names "everyone",
+   "friend" or "user".
+ The combination of the central and 'personalised' filters
+ is quite effective: I've seen up to 80 messages per day
+ blocked or discarded this way.

Personally I have my problems with header/content-based filtering. Although
it is probably effective in discarding spam, there is also a fair chance
that such filters mark perfectly valid mail as 'unwanted'. Besides logging
the rejections, one should probably file them in a separate folder and
inspect that folder from time to time for false positives.
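
As a rough sketch of that idea, the code below applies some of the rules
quoted above (leaving out the space-rule) and files any match in a
'suspect' folder instead of discarding it. All names here (matched_rules,
choose_folder, the folder names, the local_domains parameter) are
hypothetical and not taken from Piet's actual filter; local_domains is
assumed to hold lower-case domain names.

```python
import re

# Local parts that the quoted rules treat as suspicious.
SUSPECT_NAMES = {"everyone", "friend", "user"}


def split_addr(addr):
    """Split an address into (localpart, domain), lower-cased."""
    local, _, domain = addr.strip().lower().rpartition("@")
    return local, domain


def matched_rules(from_addr, to_addr, local_domains=()):
    """Return the names of all heuristic rules that this message trips."""
    hits = []
    f_local, f_domain = split_addr(from_addr)
    t_local, t_domain = split_addr(to_addr)

    if not from_addr.strip() or not to_addr.strip():
        hits.append("empty-address")
    elif (f_local, f_domain) == (t_local, t_domain):
        # Exception for local domains, so root-to-root cron mail survives.
        if f_domain not in local_domains:
            hits.append("from-equals-to")

    if "_" in f_domain or "_" in t_domain:
        hits.append("underscore-in-domain")

    for local in (f_local, t_local):
        if re.fullmatch(r"\d{8}", local):
            hits.append("eight-digit-localpart")
        if local in SUSPECT_NAMES:
            hits.append("generic-localpart")
    return hits


def choose_folder(from_addr, to_addr, local_domains=()):
    """File suspicious mail separately rather than throwing it away."""
    hits = matched_rules(from_addr, to_addr, local_domains)
    return ("suspect" if hits else "inbox"), hits
```

Returning the list of matched rules alongside the folder makes it easier to
see, when inspecting the 'suspect' folder, which rule produced a false
positive.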

When our users ask about spam-filters I always tell them that they should
take into account the risk of not seeing valid mail, and in most cases no
filters are installed after that ;-)

The problem is that valid mail can come in many different incarnations,
using all the possibilities defined in the RFCs (Bcc's, group-syntax,
quoted spaces, the works), or can come from the many broken systems in this
world, violating the specs where possible. This makes personal filtering a
tricky business. Not that it shouldn't be done, but the risks should be
taken into account.

And of course, spammers are not stupid: once they know certain things are
commonly filtered out, they find a new way of spreading their junk and
everyone needs to update their filters. This might not scale in the long run.

Xander




