I'll begin by telling you what we do.
The measures we have implemented catch a lot of spiders, but we have no idea how many we are missing. Currently, we apply a set of measures that partially overlap:
monitor requests for our robots.txt file, then filter all other requests from the same IP address + user agent (a rough sketch of this filtering appears after this list)
compare user agents and IP addresses against published lists: iab.net and user-agents.org publish the two lists that seem to be the most widely used for this purpose (the second sketch below shows this kind of lookup)
pattern analysis: we certainly don't have pre-set thresholds for these metrics, but we still find them useful. We look at (i) page views as a function of time (i.e., clicking a lot of links with 200 msec spent on each page is probative); (ii) the path by which the 'user' traverses our site, i.e., whether it is systematic and complete or nearly so (like following a back-tracking algorithm); and (iii) precisely-timed visits (e.g., 3 am each day). The third sketch below computes these kinds of features.
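
To make the first measure concrete, here is a minimal sketch of the robots.txt filter. It assumes Apache-style combined log lines; the regex, field names, and function names are just illustrative, not our actual pipeline:

```python
import re

# Assumes Apache "combined" log lines; the regex and field names are illustrative.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def parse_line(line):
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

def flag_robots_txt_clients(log_lines):
    """Collect every (IP, user agent) pair that ever requested /robots.txt,
    then return all of those clients' other requests for filtering."""
    entries = [e for e in (parse_line(l) for l in log_lines) if e]
    robots_clients = {(e["ip"], e["ua"]) for e in entries if e["path"] == "/robots.txt"}
    flagged = [e for e in entries
               if (e["ip"], e["ua"]) in robots_clients and e["path"] != "/robots.txt"]
    return robots_clients, flagged
```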
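
The list comparison is essentially a lookup against locally cached copies of the published lists. The one-entry-per-line file format below is an assumption on my part, not something either site mandates:

```python
def load_published_lists(ua_list_path, ip_list_path):
    """Load locally cached copies of the published lists (e.g., exports from
    iab.net or user-agents.org). One entry per line is assumed here; the real
    downloads need their own parsing."""
    with open(ua_list_path) as f:
        ua_substrings = {line.strip().lower() for line in f if line.strip()}
    with open(ip_list_path) as f:
        listed_ips = {line.strip() for line in f if line.strip()}
    return ua_substrings, listed_ips

def is_listed_bot(entry, ua_substrings, listed_ips):
    """Flag a parsed request if its IP is listed or its user agent contains
    any listed substring (case-insensitive)."""
    ua = entry["ua"].lower()
    return entry["ip"] in listed_ips or any(s in ua for s in ua_substrings)
```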
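
Finally, a sketch of the pattern-analysis features, again with invented helper names and no hard thresholds; it reuses the entry dicts produced by parse_line above, and `site_paths` is assumed to be the set of our site's page URLs:

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

def traffic_features(entries, site_paths):
    """Per (IP, user agent) client, compute the features we eyeball:
    (i) median gap between requests, (ii) fraction of the site's known
    pages covered, (iii) how tightly visit hours cluster."""
    by_client = defaultdict(list)
    for e in entries:
        # Combined-log timestamp, e.g. "10/Oct/2000:13:55:36 -0700"
        ts = datetime.strptime(e["ts"].split()[0], "%d/%b/%Y:%H:%M:%S")
        by_client[(e["ip"], e["ua"])].append((ts, e["path"]))

    features = {}
    for client, hits in by_client.items():
        hits.sort()
        times = [t for t, _ in hits]
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        features[client] = {
            # (i) ~200 msec per page, over many pages, is probative
            "median_gap_s": median(gaps) if gaps else None,
            # (ii) systematic, (nearly) complete traversal of the site
            "coverage": len({p for _, p in hits}) / max(len(site_paths), 1),
            # (iii) precisely-timed visits, e.g. always around 3 am
            "distinct_hours": len({t.hour for t in times}),
        }
    return features
```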
Again, I am fairly sure we're only getting the low-hanging fruit, but I'm interested in hearing views from the community.
The newsletter posts tagged "Web Log Analysis" on the site of Nihuo (a commercial web log analyzer) could be useful reading.