Question

kylex

Asked: 2012-03-22 08:11:58 +0800 CST

If I deny crawlers access to a directory via robots.txt, will they still index a file in that directory if I link to it directly?

I am blocking a folder called pdf from being indexed via robots.txt. However, I link directly to a few files in that directory.

Will search engines such as Google index those files, or ignore them because they reside in the pdf folder?

1 Answer

Ladadadada · Answer 1 · 2012-03-22T08:24:26+08:00

Best Answer

Short answer: No.

Crawlers that honor robots.txt will not fetch, and so cannot index the contents of, anything under a URL prefix you Disallow there, regardless of how they found the link.

Longer answer: It depends.

The Allow keyword is not part of the original robots.txt standard, but some robots honor it. You can use it to Allow a particular URL while Disallowing the entire subtree that contains that URL. Most bots work on a first-match-wins basis, so rule order matters for them. Google and Bing work on a longest-match-wins basis, regardless of the order of the Allow and Disallow lines.
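To make the difference between the two matching strategies concrete, here is a minimal Python sketch. It is not how any particular crawler is implemented: the helper names (`parse_rules`, `first_match`, `longest_match`) and the file name `brochure.pdf` are made up for illustration, the parser ignores User-agent grouping and wildcards, and it assumes that an Allow wins when two matching rules are the same length.

```python
# Illustrative robots.txt for the "pdf" folder from the question.
# "brochure.pdf" is a hypothetical file name.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
Allow: /pdf/brochure.pdf
"""


def parse_rules(robots_txt):
    """Return (allowed, path_prefix) pairs in file order.

    Simplification: User-agent groups and wildcards are ignored.
    """
    rules = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        field, sep, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if sep and field in ("allow", "disallow") and value:
            rules.append((field == "allow", value))
    return rules


def first_match(rules, path):
    """First rule whose prefix matches decides; no match means allowed."""
    for allowed, prefix in rules:
        if path.startswith(prefix):
            return allowed
    return True


def longest_match(rules, path):
    """Longest matching prefix decides, whatever the rule order.

    Assumption: on a tie in length, Allow wins (True sorts above False).
    """
    matches = [(len(prefix), allowed)
               for allowed, prefix in rules
               if path.startswith(prefix)]
    return max(matches)[1] if matches else True


RULES = parse_rules(ROBOTS_TXT)
print(first_match(RULES, "/pdf/brochure.pdf"))    # False: "Disallow: /pdf/" comes first
print(longest_match(RULES, "/pdf/brochure.pdf"))  # True: the Allow prefix is longer
```

With a first-match bot, putting the Allow line above the Disallow line would let the file through, which is why rule order matters for those bots and not for longest-match ones.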
