If I want my main website to be indexed by search engines, but none of its subdomains to be, should I just put a "disallow all" robots.txt in the directory of each subdomain? If I do, will my main domain still be crawlable?
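By "disallow all" I mean a robots.txt along these lines in each subdomain's document root (assuming each subdomain answers its own /robots.txt request):
User-agent: *
Disallow: /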
We have an XAMPP Apache development web server set up with virtual hosts and want to stop search engines from crawling all our sites. This is easily done with a robots.txt file. However, we'd rather not put a disallow-all robots.txt in every vhost and then have to remove it when we go live with the site on another server.
Is there a way, with an Apache config file, to rewrite all requests for robots.txt on all vhosts to a single robots.txt file?
If so, could you give me an example? I think it would be something like this:
RewriteEngine On
RewriteRule ^/robots\.txt$ "C:/xampp/vhosts/override-robots.txt" [L]
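Or, if a rewrite isn't the right tool, maybe an alias in the global httpd.conf would do it. Here is a rough sketch of what I'm picturing, assuming Apache 2.4 access directives and that global aliases carry into each vhost; the path C:/xampp/vhosts/override-robots.txt is just where I imagine the shared file living:
# Global httpd.conf: map every vhost's /robots.txt to one shared file
Alias /robots.txt "C:/xampp/vhosts/override-robots.txt"
# Let Apache serve that one file (Apache 2.4 syntax)
<Directory "C:/xampp/vhosts">
    <Files "override-robots.txt">
        Require all granted
    </Files>
</Directory>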
Thanks!
I just updated the robots.txt file on a new site; Google Webmaster Tools reports that it last read my robots.txt 10 minutes before my most recent update.
Is there any way I can encourage Google to re-read my robots.txt as soon as possible?
UPDATE: Under Site Configuration | Crawler Access | Test robots.txt:
Home Page Access shows:
Googlebot is blocked from http://my.example.com/
FYI: The robots.txt that Google last read looks like this:
User-agent: *
Allow: /<a page>
Allow: /<a folder>
Disallow: /
Have I shot myself in the foot, or will it eventually read http://my.example.com/robots.txt again (as it did the last time)?
Any ideas on what I need to do?
In order to:
- Increase security of my website
- Reduce bandwidth requirements
- Prevent email address harvesting
I want to create a single robots.txt file and have it served for all sites on my IIS (7 in this case) instance.
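The file itself would just be a blanket disallow; something like this is what I have in mind:
User-agent: *
Disallow: /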
I do not want to have to configure anything on any individual site.
How can I do this?