If I want my main website to be indexed by search engines, but none of its subdomains to be, should I just put a "disallow all" robots.txt in the directory of each subdomain? If I do, will my main domain still be crawlable?
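By "disallow all" I mean a robots.txt along these lines in each subdomain's document root (assuming each subdomain answers its own /robots.txt request):
User-agent: *
Disallow: /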
We have an XAMPP Apache development web server set up with virtual hosts and want to stop search engines from crawling all our sites. This is easily done with a robots.txt file. However, we'd rather not put a disallow-all robots.txt in every vhost and then have to remove it when we go live with the site on another server.
Is there a way, with an Apache config file, to rewrite all requests for robots.txt on all vhosts to a single robots.txt file?
If so, could you give me an example? I think it would be something like this:
RewriteEngine On
RewriteRule ^/robots\.txt$ "C:/xampp/vhosts/override-robots.txt" [L]
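Or, if a rewrite isn't the right tool, maybe an alias in the global httpd.conf would do it. Here is a rough sketch of what I'm picturing, assuming Apache 2.4 access directives and that global aliases carry into each vhost; the path C:/xampp/vhosts/override-robots.txt is just where I imagine the shared file living:
# Global httpd.conf: map every vhost's /robots.txt to one shared file
Alias /robots.txt "C:/xampp/vhosts/override-robots.txt"
# Let Apache serve that one file (Apache 2.4 syntax)
<Directory "C:/xampp/vhosts">
    <Files "override-robots.txt">
        Require all granted
    </Files>
</Directory>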
Thanks!
I just updated the robots.txt file on a new site; Google Webmaster Tools reports that it last read my robots.txt 10 minutes before my most recent update.
Is there any way I can encourage Google to re-read my robots.txt as soon as possible?
UPDATE: Under Site Configuration | Crawler Access | Test robots.txt:
Home Page Access shows:
Googlebot is blocked from http://my.example.com/
FYI: The robots.txt that Google last read looks like this:
User-agent: *
Allow: /<a page>
Allow: /<a folder>
Disallow: /
Have I shot myself in the foot, or will it eventually read http://my.example.com/robots.txt again (as it did the last time)?
Any ideas on what I need to do?
In order to:
- Increase security of my website
- Reduce bandwidth requirements
- Prevent email address harvesting
I want to create a single robots.txt file and have it served for all sites on my IIS (7 in this case) instance.
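The file itself would just be a blanket disallow; something like this is what I have in mind:
User-agent: *
Disallow: /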
I do not want to have to configure anything on any individual site.
How can I do this?