Question

Max

Asked: 2014-06-11 06:51:47 +0800 CST

Get IP network range after reverse DNS?

For analytics purposes, I'm looking at large sets of IP addresses in server log files. I'm trying to perform reverse-DNS lookups to understand where traffic is coming from - e.g. what percentage of IPs resolve to corporations, schools, government, international etc.

Despite a number of optimizations, running a reverse-DNS lookup for every individual IP address is still fairly expensive. So:

Is there any way to obtain an entire range of IPs from a reverse-DNS?

If yes, this could greatly reduce the number of actual reverse-DNS lookups.

Example (numbers slightly obfuscated):

The log file contains a request from the IP 128.151.162.17.
Reverse DNS resolves to: 11.142.152.128.in-addr.arpa 21599 IN PTR alamo.ceas.rochester.edu
(So this is a visitor from the University of Rochester, rochester.edu.)
Now, would it be safe to assume that at least all IPs from 128.151.162.* will also resolve to rochester.edu?
What about 128.151.*.*? Is there a way to get the exact IP range?

3 Answers

Shane Madden · Answer 1 · 2014-06-11T07:15:31+08:00

Is there any way to obtain an entire range of IPs from a reverse-DNS?

Not really, no; in extremely rare cases you might be able to do a DNS zone transfer query to get all the records in the zone (the whole /24, generally), but there's a very low chance that the name server you're querying will respond to this request. Expect one query per address for reverse DNS (sorry!).
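To make "one query per address" concrete, here is a minimal Python sketch of what a single reverse lookup involves. The function names `ptr_name` and `reverse_lookup` are made up for illustration; the sketch assumes the system resolver is used via the standard `socket` module.

```python
import ipaddress
import socket
from typing import Optional


def ptr_name(ip: str) -> str:
    """Return the in-addr.arpa (or ip6.arpa) name that a PTR query asks for."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip: str) -> Optional[str]:
    """Resolve one address to a hostname, or None if no PTR record is found."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None
```

Each address maps to its own PTR name (for example `ptr_name("128.151.162.17")` gives `17.162.151.128.in-addr.arpa`), which is why a reply for one address says nothing about its neighbours.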

Now, would it be safe to assume that at least all IPs from 128.151.162.* will also resolve to rochester.edu?

Generally speaking, probably: as a university, they're likely to own the whole /24. However, that's not a good rule to apply as a general case; a smaller school might not have a whole /24, or might not have it in reverse DNS.

The reverse DNS itself is going to be pretty hit-or-miss - in many cases it'll be just generated names under the ISP's hostnames or no records at all. For better data, we're going to make things even more expensive - you should also look at data from whois.

For example, here's the info from that Rochester IP - it shows the size of the allocation (the whole /16 range, so in this case that applies to 128.151.*.*) and the organization it's allocated to.

The whois info should provide a great source of truth for the info you want, and has the upside of being able to see what range that applies to. The downside is that for smaller allocations, a range will often just show as belonging to the ISP instead of the end customer. Combining both whois and reverse DNS should provide the best information (and be ridiculously slow).
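Once whois has told you the allocation for one address, every other log entry inside that block can be attributed without another lookup. A rough sketch, assuming you have already collected the blocks as CIDR strings (the helper name `group_by_block` is hypothetical):

```python
import ipaddress


def group_by_block(ips, blocks):
    """Split IPs into those covered by a known block and those still unknown."""
    networks = [ipaddress.ip_network(b) for b in blocks]
    grouped = {str(net): [] for net in networks}
    unknown = []
    for ip in ips:
        addr = ipaddress.ip_address(ip)
        for net in networks:
            if addr in net:
                grouped[str(net)].append(ip)
                break
        else:
            # No known block covers this address, so it still needs a lookup.
            unknown.append(ip)
    return grouped, unknown
```

With the /16 from the example, `group_by_block(log_ips, ["128.151.0.0/16"])` would leave only the addresses outside 128.151.*.* in the `unknown` list. A linear scan is fine for a few hundred blocks; a larger set would probably want a prefix tree or sorted lookup instead.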

mc0e · Answer 2 · 2014-06-11T07:55:29+08:00

You can generally get info about netblocks from whois (e.g. whois 128.151.162.17 reports CIDR: 128.151.0.0/16), but you'll probably find that the format of the responses varies depending on which registry is involved, and that whois servers are likely to cap the number of requests you can make. Also note that netblocks are typically nested, with smaller ones inside larger ones, so you may get info about multiple netblocks for one IP.
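As a sketch of handling that output: the code below pulls `CIDR:` fields out of whois text and, when several nested blocks match, picks the most specific one. It assumes the `CIDR:` field format shown above; other registries format their responses differently and would need their own parsing. The function names are made up for illustration.

```python
import ipaddress
import re

# Matches lines such as "CIDR:           128.151.0.0/16"
CIDR_LINE = re.compile(r"^CIDR:[ \t]*(.+)$", re.MULTILINE)


def parse_cidrs(whois_text):
    """Return every network listed in 'CIDR:' lines of a whois response."""
    networks = []
    for value in CIDR_LINE.findall(whois_text):
        # A single field may list several comma-separated blocks.
        for part in value.split(","):
            networks.append(ipaddress.ip_network(part.strip()))
    return networks


def most_specific(ip, networks):
    """Of the networks containing ip, return the one with the longest prefix."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in networks if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)
```

Choosing the longest prefix reflects the nesting described above: the smallest enclosing block is usually the closest to the actual end user.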

In principle a DNS request packet can carry multiple questions, but in practice most servers do not support this, so the main techniques you need are to parallelise requests and to cache responses.
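A minimal sketch of parallel lookups with an in-memory cache, using a thread pool from the Python standard library. The names `resolve_all` and `reverse_lookup` and the worker count of 20 are assumptions for illustration, not recommendations.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def reverse_lookup(ip):
    """Blocking PTR lookup through the system resolver; None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def resolve_all(ips, resolver=reverse_lookup, cache=None, max_workers=20):
    """Resolve each distinct IP once, in parallel, reusing cached answers."""
    cache = {} if cache is None else cache
    # dict.fromkeys de-duplicates while keeping first-seen order.
    pending = [ip for ip in dict.fromkeys(ips) if ip not in cache]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for ip, name in zip(pending, pool.map(resolver, pending)):
            cache[ip] = name
    return {ip: cache[ip] for ip in ips}
```

Passing the same `cache` dict to later calls means repeated addresses across log files cost nothing. Failed lookups are cached as `None` too, so addresses without PTR records are not retried.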

TomOnTime · Answer 3 · 2014-06-14T06:37:23+08:00

General advice about this kind of algorithm:

Generally you'll find the data is nearly infinitely cacheable. The data changes so rarely that you might as well do it in batch and save the data to an on-disk cache that all your code uses. The TTL on the data might be 1 hour, but when I was on the internet mapping project we found that, as far as domains changing goes, the data was stable for more than a year.

If you are doing a lot of DNS queries, rate-limit how many you send to any particular DNS server. Otherwise it is rude at best, and a DoS attack at worst.
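One simple way to rate-limit is to enforce a minimum gap between queries sent to the same server. The sketch below is only an illustration: the `Throttle` class is hypothetical, and it assumes you know which server each query goes to, which may not be the case when everything passes through a recursive resolver.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls that share the same key."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = {}  # server -> earliest time of the next query

    def wait(self, server):
        """Block until it is polite to send another query to this server."""
        now = self.clock()
        ready_at = self.next_allowed.get(server, now)
        if ready_at > now:
            self.sleep(ready_at - now)
            now = ready_at
        self.next_allowed[server] = now + self.interval
```

Calling `throttle.wait(server)` immediately before each query keeps any single server below `max_per_second`, while queries to different servers do not delay each other. The sketch is not thread-safe; with a thread pool it would need a lock around `wait`.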

If you can generate all the DNS queries you plan on doing ahead of time, and they'll fit in RAM or on disk, generate the list, randomize it, then do the lookups in random order.
A lazy way to do it, that doesn't require enumerating every query in advance, is to spread out the queries among all the CIDR blocks. That is, if you are doing 500 CIDR blocks, do the .1 address in all of them, then the .2 address in all of them, then the .3 address, etc. The result is that you put less stress on the individual DNS servers. (This works particularly well if you are doing thousands of lookups on millions of CIDR blocks.)
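Both orderings can be sketched in a few lines of Python. The function names are made up for illustration, and the blocks are assumed to be given as CIDR strings.

```python
import ipaddress
import itertools
import random


def interleaved(blocks):
    """Yield the first host of every block, then the second of every block, etc."""
    host_iters = [ipaddress.ip_network(b).hosts() for b in blocks]
    for round_ in itertools.zip_longest(*host_iters):
        for addr in round_:
            if addr is not None:  # smaller blocks run out of hosts earlier
                yield str(addr)


def shuffled(ips, seed=None):
    """Return the full list of addresses in random order."""
    ips = list(ips)
    random.Random(seed).shuffle(ips)
    return ips
```

`interleaved` is lazy, so it never holds the full address list in memory, whereas `shuffled` needs the whole list up front, matching the trade-off described above.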

If you are doing the lookups "on demand", just use some kind of write-through cache and you should be fine.
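A write-through cache here means every answer is stored at the moment it is fetched, so later requests never hit the network. A minimal on-disk sketch using SQLite from the Python standard library; the class name `PtrCache` and the table layout are assumptions, and `resolver` stands for whatever lookup function you already have.

```python
import sqlite3


class PtrCache:
    """On-disk cache of reverse-DNS answers, filled as lookups happen."""

    def __init__(self, path, resolver):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS ptr (ip TEXT PRIMARY KEY, hostname TEXT)"
        )
        self.resolver = resolver

    def lookup(self, ip):
        row = self.db.execute(
            "SELECT hostname FROM ptr WHERE ip = ?", (ip,)
        ).fetchone()
        if row is not None:
            return row[0]  # cached answer; may be None for "no PTR record"
        hostname = self.resolver(ip)
        # Write through: persist the answer before returning it.
        self.db.execute(
            "INSERT OR REPLACE INTO ptr (ip, hostname) VALUES (?, ?)",
            (ip, hostname),
        )
        self.db.commit()
        return hostname
```

This sketch never expires entries, which fits the observation that the data stays stable for a long time. If freshness matters, a timestamp column and a maximum age would be a natural extension.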
