
Re: Interesting spamming tool: a robot to grab e-mail addresses


Hi,

I'm doing exactly this kind of time slicing to protect my directory of
world-readable information from unwanted mass access via our TWEB
web-X.500 gateway. The only difference from what is suggested here is
that I protect the whole data area with this method. The tricky thing,
however, is deciding what counts as unwanted access as opposed to wanted
access: how large should the time slices be, and how many accesses do
you accept during that time? Most of the current robots and spiders have
adopted a policy of accessing the server only once or twice a minute.
That is surely below any usable threshold, but how long would it take
such robots to sniff your whole data set? That question, however, seems
to be of little concern to those guys, since I repeatedly see such
visits.


Kurt


BTW: My heuristic is: time slices of 150 secs, and 40 accesses during
     that time suspend that IP-domain for roughly half an hour. This
     prevents most human hackers from continuing. But as I said, it
     will no longer catch the slow sniffers; it can, however, protect
     you from a denial-of-service attack (a second important aspect of
     running a reliable service). A rough sketch of the heuristic is
     below.
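
For illustration, here is a minimal sketch of that heuristic in Python
(the parameter names and in-memory tables are mine, not the actual TWEB
code; in reality you would key the counters on the whole IP-domain
rather than a single address):

#!/usr/bin/env python
# Sketch of the time-slice heuristic: 40 or more accesses from one
# client within a 150-second slice suspend that client for about
# half an hour (1800 seconds).

import time

SLICE = 150        # length of one time slice in seconds
MAX_HITS = 40      # accesses tolerated within one slice
SUSPEND = 1800     # suspension time, roughly half an hour

hits = {}          # client -> (start of current slice, hit count)
blocked = {}       # client -> time at which the suspension ends

def allow(client):
    """Return True if a request from `client` may be served now."""
    now = time.time()
    if client in blocked:
        if now < blocked[client]:
            return False             # still suspended
        del blocked[client]
    start, count = hits.get(client, (now, 0))
    if now - start > SLICE:          # slice expired, start a new one
        start, count = now, 0
    count = count + 1
    hits[client] = (start, count)
    if count >= MAX_HITS:            # too many accesses in one slice
        blocked[client] = now + SUSPEND
        return False
    return True

The gateway would call allow() for each incoming request and refuse or
delay the answer whenever it returns False.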



On Mon, 30 Mar 1998, Simon Wilkinson wrote:

> Date: Mon, 30 Mar 1998 16:27:12 +0100
> From: Simon Wilkinson sxw@localhost
> To: Peter Valkenburg valkenburg@localhost, tf-chic@localhost
> Cc: anti-spam@localhost
> Subject: Re: Interesting spamming tool: a robot to grab e-mail addresses
> 
> > > Bull's Eye Gold is the PREMIER email address collection tool.
> > > ...
> > >     ...  All you need is your web browser and this program.
> > > Our software utilizes the latest in search technology called
> > > "spidering". By simply feeding the spider program a starting
> > > website it will collect for hours. The spider will go from website
> > > to targeted website providing you with thousands upon thousands of
> > > fresh TARGETED email addresses.
> > 
> > Hm, I suspect that this software does not care about robots.txt.
> > I can't think of an easy way to stop this sort of free enterprise
> > at the doorstep of a webserver..
> 
> Here's an idea (presuming it ignores robots.txt files, and forges its
> User-Agent to look like a popular browser). Use wpoison (or similar) to
> construct an infinitely deep area of your web tree. List this area in
> your robots.txt file.
> 
> If you see more than (x) hits to this area in a certain time from a
> certain IP address then set up access control measures to block that
> IP address from accessing your server. Perhaps return a message
> telling them why this has happened, and how to have their access
> re-enabled. You could do all of this automatically, so the admin
> wouldn't have to do anything about it.
> 
> <Flight of fantasy>
> You could even set up something similar to the RBL - where a method could
> exist for maintaining a global list of "bad" IP addresses, who would then
> be barred from accessing web servers that used that list.
> </Flight of fantasy>
> 
> Cheers,
> 
> Simon.
> 
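
Regarding Simon's wpoison suggestion above: such a trap could be
sketched as a plain CGI script along the following lines (the path
/cgi-bin/trap.py and the numbers are only illustrative, and the real
wpoison differs in detail). List the path under "Disallow:" in
robots.txt; well-behaved robots will then never see it, while a
harvester that ignores robots.txt keeps following links that lead only
deeper into the trap, where the time-slice counters above can catch it.

#!/usr/bin/env python
# Illustrative wpoison-style trap page as a CGI script: every request
# returns bogus addresses plus links pointing back into the trap, one
# level deeper each time.

import os
import random
import string

def word():
    """Return a random lowercase word of 4 to 10 letters."""
    return "".join(random.choice(string.ascii_lowercase)
                   for _ in range(random.randint(4, 10)))

print("Content-Type: text/html")
print("")
print("<html><body>")

# a handful of worthless addresses for the harvester to collect
for _ in range(10):
    print("<p>%s@%s.invalid</p>" % (word(), word()))

# links that point only deeper into the trap itself
depth = os.environ.get("PATH_INFO", "")
for _ in range(5):
    print('<a href="/cgi-bin/trap.py%s/%s">%s</a><br>' % (depth, word(), word()))

print("</body></html>")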


----------==========#########>>>>>ZDV<<<<<#########==========----------

X.500:                                              Tel.:
   Kurt Spanier, Zentrum fuer Datenverarbeitung,      +49 7071 29-70334
   Universitaet Tuebingen, DE
SMTP-Mail:                                          FAX.:
   kurt.spanier@localhost                   +49 7071 29-5912
Snail-Mail:
   Dr. Kurt Spanier, Zentrum fuer Datenverarbeitung,
   Universitaet Tuebingen, Waechterstrasse 76, D-72074 Tuebingen
PGP-Public-Key:
   finger "Kurt Spanier"@x500.uni-tuebingen.de

----------==========##########>>>>>@<<<<<##########==========----------






