The Web Robots Pages can teach you more about web robots - the Good Ones

"Bad" robots ignore the robots page (/robots.txt) on your site, or else they go straight to the folders that it says they are not allowed in.
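
As a rough illustration of the check a well-behaved robot makes before fetching a page, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules, the `ExampleBot` name, the example.com URLs, and the `make_parser` and `may_fetch` helpers are all made up for this example. A bad robot simply skips this check, or reads the Disallow lines as a list of places to look.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = make_parser(ROBOTS_TXT)

# A good robot asks first and only fetches what is allowed.
for url in (
    "https://example.com/index.html",
    "https://example.com/private/notes.html",
):
    verdict = "fetch" if may_fetch(parser, "ExampleBot", url) else "skip"
    print(f"{verdict}: {url}")
```

Nothing in the file enforces these rules. It is a request that good robots honour, so it should not be relied on to protect sensitive folders.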


The Web Robots Pages

Web Robots (also known as Web Wanderers, Crawlers, or Spiders) are programs that traverse the Web automatically. Search engines such as Google use them to index web content, spammers use them to scan for email addresses, and they have many other uses.

On this site you can learn more about web robots.


The Web Robots Pages is an information resource dedicated to web robots. Initially hosted at WebCrawler in 1995, it moved to this dedicated, independently hosted site in 2000. It underwent a modernisation in 2007.

Other Sites


Many people end up on this site because they have questions about specific search engines and their robots. For such questions the best place is the relevant site's own help pages.

Extensions to the Robots Exclusion Protocol

Recently three major search engines have collaborated to support extensions to the /robots.txt directives and related mechanisms. See the search engines' joint announcements for details.
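
The text here does not say which extensions are meant. As a hedged sketch only, the example below uses `Allow` and `Sitemap`, two widely recognised directives that go beyond the original `Disallow` rule, and reads them with Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). The rules, the `ExampleBot` name, and the example.com URLs are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt using directives beyond plain Disallow.
EXTENDED_ROBOTS_TXT = """\
User-agent: *
Allow: /archive/public/
Disallow: /archive/
Sitemap: https://example.com/sitemap.xml
"""

ext_parser = RobotFileParser()
ext_parser.parse(EXTENDED_ROBOTS_TXT.splitlines())

# Allow carves an exception out of a broader Disallow. Python's parser
# applies the first matching rule, so the Allow line is listed first.
public_ok = ext_parser.can_fetch(
    "ExampleBot", "https://example.com/archive/public/a.html"
)
other_ok = ext_parser.can_fetch(
    "ExampleBot", "https://example.com/archive/old.html"
)

# Sitemap lines tell a robot where to find a list of the site's URLs.
sitemaps = ext_parser.site_maps()

print(public_ok, other_ok, sitemaps)
```

Robots differ in how they resolve overlapping rules and in which extensions they support, so each search engine's own documentation remains the authority on how its robot behaves.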


