I just updated my robots.txt file on a new site; Google Webmaster Tools reports it read my robots.txt 10 minutes before my last update.
Is there any way I can encourage Google to re-read my robots.txt as soon as possible?
UPDATE: Under Site Configuration | Crawler Access | Test robots.txt:
Home Page Access shows:
Googlebot is blocked from http://my.example.com/
FYI: The robots.txt that Google last read looks like this:
User-agent: *
Allow: /<a page>
Allow: /<a folder>
Disallow: /
Have I shot myself in the foot, or will it eventually read http://my.example.com/robots.txt (as it did the last time it read it)?
Any ideas on what I need to do?
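For what it's worth, here is a rough way to sanity-check those rules locally with Python's standard robots.txt parser. The paths below are hypothetical stand-ins for the <a page> and <a folder> placeholders above, and Google's longest-match handling of Allow/Disallow can differ from the standard parser's in-order matching, so treat this only as an approximation:

from urllib.robotparser import RobotFileParser

# Same structure as the robots.txt above, with made-up paths in place of
# the <a page> and <a folder> placeholders.
robots_txt = """\
User-agent: *
Allow: /some-page
Allow: /some-folder/
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# "/" should come back blocked, matching what Webmaster Tools reports;
# only the explicitly allowed paths should be fetchable.
for path in ("/", "/some-page", "/some-folder/index.html", "/other-page"):
    verdict = "allowed" if rp.can_fetch("Googlebot", path) else "blocked"
    print(path, "->", verdict)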
In case anyone else runs into this problem, there is a way to force Googlebot to re-download the robots.txt file.
Go to Health -> Fetch as Google [1] and have it fetch /robots.txt.
That will re-download the file, and Google will also re-parse it.
[1] In the previous Google UI this was 'Diagnostics -> Fetch as GoogleBot'.
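Before requesting the re-fetch, it may be worth confirming what is actually being served at /robots.txt, so Google picks up the version you expect. A minimal sketch in Python, using the placeholder host from the question:

from urllib.request import urlopen

# my.example.com is the placeholder host from the question above.
with urlopen("http://my.example.com/robots.txt", timeout=10) as resp:
    print("HTTP status:", resp.status)
    print(resp.read().decode("utf-8", errors="replace"))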
I know this is very old, but... if you uploaded the wrong robots.txt (disallowing all pages), you can try the following:
Submit an XML sitemap for the site. As Google tries to read the XML sitemap, it will check it against robots.txt, forcing Google to re-read your robots.txt.
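A minimal sketch of building such a sitemap, assuming a hypothetical page on the placeholder host from the question; save the output as sitemap.xml, upload it to the site, and submit it in Webmaster Tools:

from xml.sax.saxutils import escape

# Hypothetical example: a page that the bad robots.txt is currently blocking.
urls = ["http://my.example.com/some-page"]

# Build a minimal sitemap in the standard sitemaps.org format.
entries = "\n".join("  <url><loc>%s</loc></url>" % escape(u) for u in urls)
sitemap = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    + entries + "\n"
    "</urlset>\n"
)

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write(sitemap)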
After having the same problem, I successfully made Google re-read my robots.txt file by submitting it at this URL:
https://www.google.com/webmasters/tools/robots-testing-tool
OK. Here is what I did, and within a few hours, Google re-read my robots.txt files.
We have two hostnames for every site we run. Let's call them the canonical site (www.mysite.com) and the bare-domain site (mysite.com).
We have our sites set up so that mysite.com always returns a 301 redirect to www.mysite.com.
Once I set up both sites in Google Webmaster Tools and told it that www.mysite.com is the canonical site, it read the robots.txt file on the canonical site soon after.
I don't really know why, but that's what happened.
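If you want to double-check that the bare domain really answers with the 301 before setting up both sites, here is a small sketch using the placeholder hostnames from this answer:

from http.client import HTTPConnection

# mysite.com / www.mysite.com are the placeholder names used in this answer.
conn = HTTPConnection("mysite.com", timeout=10)
conn.request("HEAD", "/")
resp = conn.getresponse()
print("Status:", resp.status)                    # expecting 301
print("Location:", resp.getheader("Location"))   # expecting http://www.mysite.com/
conn.close()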
Shorten Google's crawl interval for a few days.
Also, I've seen a button there to verify your robots.txt; this might push it to Google, but I am not sure.