robots.txt hits


aslefo
06-02-2001, 07:14 AM
Hi!

I've figured out that requests for a robots.txt file come from web spiders scanning the site. To shrink my error log, I added a blank robots.txt file. Last month, however, I got about 14,000 (!) hits on this file. I fear that this may affect the billing from my ISP. Looking at the access log, I see that the same IP addresses have requested the file hundreds of times in less than one minute.

Is there something I can do to make these spiders only look up the file once? What's the robots.txt for anyway?
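
To see which addresses are responsible for the repeated requests, the access log can be tallied per client. The following is only a rough sketch: it assumes the server writes the Common Log Format, and the sample lines and the function name robots_hits_per_ip are made up for illustration, not taken from the real log.

```python
import re
from collections import Counter

# Assumption: Common Log Format, e.g.
# IP ident user [timestamp] "METHOD path PROTOCOL" status size
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|HEAD) (\S+)[^"]*"')


def robots_hits_per_ip(lines):
    """Count requests for /robots.txt per client IP address."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(3) == "/robots.txt":
            hits[match.group(1)] += 1
    return hits


# Hypothetical sample lines; in practice, pass an open log file instead.
sample = [
    '10.0.0.1 - - [02/Jun/2001:07:14:01 +0000] "GET /robots.txt HTTP/1.0" 200 0',
    '10.0.0.1 - - [02/Jun/2001:07:14:02 +0000] "GET /robots.txt HTTP/1.0" 200 0',
    '10.0.0.2 - - [02/Jun/2001:07:15:00 +0000] "GET /index.html HTTP/1.0" 200 512',
]

for ip, count in robots_hits_per_ip(sample).most_common():
    print(ip, count)
```

Sorting by count puts the busiest spiders first, which makes it easier to decide whether one or two clients account for most of the 14,000 hits.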

Jason
06-02-2001, 01:19 PM
Hi Aslefo,

The search engines use the robots.txt file to determine what not to index, and to determine how other pages should be indexed.

This link should help; it is a page from one of the resources listed at 123webmaster.com

http://www.rietta.com/robogen/intro.shtml
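
As a small illustration of how a spider reads the file, here is a sketch using Python's standard urllib.robotparser module. The robot name "ExampleBot", the example.com URLs, and the helper name allowed are placeholders I made up; real spiders use their own parsers and may behave differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, url):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A blank file, like the one described in the first post: no restrictions.
blank = ""

# A file that asks every robot to stay out of one directory (hypothetical path).
rules = "User-agent: *\nDisallow: /private/\n"

print(allowed(blank, "ExampleBot", "http://example.com/private/page.html"))
print(allowed(rules, "ExampleBot", "http://example.com/private/page.html"))
print(allowed(rules, "ExampleBot", "http://example.com/index.html"))
```

Note that the file only says which paths a robot should stay away from; in this sketch nothing in it limits how often a robot asks for robots.txt itself.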

Jason
06-02-2001, 01:22 PM
Also very helpful:

http://www.w3.org/Search/9605-Indexing-Workshop/Papers/Frumkin@Excite.html

http://www.robotstxt.org/wc/robots.html

aslefo
06-03-2001, 08:39 AM
Thanks, Jason!