I am parsing my httpd access logs to determine which of our Google appliance's crawlers are bombarding my web servers, and when. If I type the command:
grep google /path/to/access_log | awk '{print $4, $14}'
I get a very large result set (like I said, they're bombarding me), at least 4 hits per second. I want to group that result set by timestamp and print, on each line, the number of hits per second. So ideally, I'd like something similar to
04/Aug/2011:15:56:16 Crawler1 6
04/Aug/2011:15:56:16 Crawler2 10
04/Aug/2011:15:56:17 Crawler1 8
04/Aug/2011:15:56:18 Crawler1 12
where the first column is the timestamp, the second is the 14th field (the Google crawler's ID), and the third is the count. The order of the columns is irrelevant.
This can be done in a single awk command, counting hits that fall in the same second with arrays, but it is hard to test without a sample input. Let's guess:
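A sketch of that approach, assuming (as in the question's command) that field 4 is the timestamp, typically with a leading `[` in common/combined log format, and field 14 is the crawler ID:

```shell
grep google /path/to/access_log \
  | awk '{
      ts = $4
      sub(/^\[/, "", ts)        # strip a leading bracket from the timestamp, if present
      count[ts " " $14]++       # key: "timestamp crawlerID"
    }
    END {
      for (key in count) print key, count[key]
    }'
```

Note that awk's `for (key in count)` iterates in an unspecified order, so pipe the output through `sort` if you need the lines in chronological order.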