Re: automated spam detection
- Date: Wed, 17 Feb 1999 15:30:26 +0100 (MET)
On Wed, 17 Feb 1999, Richard Kettlewell wrote:
> > E.g. site A receives 51 mail messages with identical message bodies
> > and is configured to automatically warn its 'friends' about messages
> > that appear more than 50 times but it might not consider a message
> > spam and start rejecting it before it appears 100 times. It warns
> > system B and tells B that "Message AABBCCDD has been seen here 51
> > times" which means that B can immediately increase *its* counter for
> > that message by 51, which might mean B feels obligated to warn C or
> > maybe the counter gets high enough that B starts thinking of that
> > message as pure spam. Message body checksums are of course only one
> > way of detecting (some) spam. There are lots of other variables that
> > could be used.
>
> That sounds like a good idea.
>
> I would say that received notifications should not be passed on - in
> the above scenario, A may or may not be in communication with C, and
> if B guesses wrong whether it is or not then C may receive the
> notification for those 51 messages twice or not at all.
>
Yeah, there's a risk of loops here, and of messages being counted more
than once, but that can be avoided by using e.g. Path: headers when
propagating informational messages between SMTP servers. (For anyone
not familiar with the Path: header used for NNTP transmission: it is a
header with fields, separated by '!' characters, each containing a
unique string that identifies a host/site that has already received the
article, so the servers know which sites have and have not seen it.
Every server that passes on an article adds its own identifier to the
front of the Path: header before doing so.)
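
A rough sketch of that propagation in Python, just to make the idea
concrete. Everything here is made up for illustration: the Notice and
Site names, the in-memory "peers" list standing in for real SMTP
traffic, and the notice_id field. The notice_id is an extra assumption
on my part: the Path: check alone stops a notice from looping back to a
site it has already visited, but when A talks to both B and C, C can
still get the same notice by two routes, so the sketch also remembers
IDs it has seen, the way news servers remember Message-IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notice:
    notice_id: str    # hypothetical unique ID, playing the role of a Message-ID
    checksum: str     # body checksum, e.g. "AABBCCDD"
    count: int        # how often the originating site saw this body
    path: tuple = ()  # site names, most recent first ("B!A" on the wire)


class Site:
    def __init__(self, name):
        self.name = name
        self.peers = []        # the 'friends' this site exchanges notices with
        self.counts = {}       # checksum -> total sightings known here
        self.seen_ids = set()  # notices already counted at this site

    def announce(self, notice_id, checksum):
        """Tell peers how often this site has seen a body locally."""
        self.seen_ids.add(notice_id)
        count = self.counts.get(checksum, 0)
        self._forward(Notice(notice_id, checksum, count, (self.name,)))

    def receive(self, notice):
        """Count a notice at most once, then pass it on.

        Returns True if the notice was counted, False if it was dropped.
        """
        if self.name in notice.path or notice.notice_id in self.seen_ids:
            return False
        self.seen_ids.add(notice.notice_id)
        self.counts[notice.checksum] = (
            self.counts.get(notice.checksum, 0) + notice.count
        )
        # Put our own name at the front of the path before forwarding.
        self._forward(Notice(notice.notice_id, notice.checksum,
                             notice.count, (self.name,) + notice.path))
        return True

    def _forward(self, notice):
        for peer in self.peers:
            # The path lists sites that already have this notice.
            if peer.name not in notice.path:
                peer.receive(notice)


# Three sites that all talk to each other; A has seen the body 51 times.
a, b, c = Site("A"), Site("B"), Site("C")
a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]
a.counts["AABBCCDD"] = 51
a.announce("A-1", "AABBCCDD")
print(b.counts, c.counts)
```

In this setup B and C each end up with a count of 51 for AABBCCDD, even
though C is offered the notice both by A directly and by B.
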
> The point about legitimate mailing lists is still a problem though...
>
Mailing lists, and other forms of approved bulk mailing, would have
to be specifically excluded. They are a problem, but perhaps not a big
one.
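
A minimal sketch of what such an exclusion could look like next to the
checksum counter, again in Python. The thresholds (warn above 50,
reject at 100) are the ones from the example earlier in the thread; the
rest is assumption: BodyCounter is a made-up name, approved bulk
mailers are recognised here simply by envelope sender address, and a
truncated SHA-256 stands in for whatever checksum a real system would
use.

```python
import hashlib
from collections import Counter

WARN_AT = 50     # warn peers once a body has been seen more than this often
REJECT_AT = 100  # treat the body as spam from this many sightings on


class BodyCounter:
    def __init__(self, approved_senders=()):
        # Assumption: approved bulk mailers are listed by envelope sender.
        self.approved = {s.lower() for s in approved_senders}
        self.counts = Counter()

    @staticmethod
    def checksum(body):
        # Illustrative choice of checksum, shortened for readability.
        return hashlib.sha256(body.encode("utf-8")).hexdigest()[:8].upper()

    def observe(self, envelope_sender, body):
        """Return 'accept', 'warn' or 'reject' for one incoming message."""
        if envelope_sender.lower() in self.approved:
            return "accept"  # excluded: never counted at all
        key = self.checksum(body)
        self.counts[key] += 1
        seen = self.counts[key]
        if seen >= REJECT_AT:
            return "reject"
        if seen > WARN_AT:
            return "warn"
        return "accept"


counter = BodyCounter(approved_senders=["announce@lists.example.org"])
results = [counter.observe("bulk@example.com", "Buy now!") for _ in range(100)]
print(results[49], results[50], results[99])
```

The 50th copy is still accepted, the 51st triggers a warning and the
100th is rejected, while mail from the approved list address is never
counted however often the same body shows up.
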
/Ragnar