I've come across a situation where a client needs to blacklist a set of just under 1 million individual IP addresses (no subnets), and network performance is a concern. While I would conjecture that IPTables rules would have less of a performance impact than routes, that's just conjecture.
Does anyone have any solid evidence or other justification for favoring either iptables or null routing as a solution for blacklisting long lists of IP addresses? In this case everything is automated, so ease of use isn't really a concern.
EDIT 26-Nov-11
After some testing and development, it appears that none of these options are workable. Both route lookups and iptables do linear scans through the ruleset, and simply take too long to process this many rules. On modern hardware, putting 1M entries in an iptables blacklist slows the server down to about two dozen packets per second. So iptables and null routes are out.
ipset, as recommended by Jimmy Hedman, would be great, except that it doesn't allow you to track more than 65536 addresses in a set, so I can't even try to use it unless someone has any ideas.
Apparently the only solution for blocking this many IPs is doing an indexed lookup in the application layer. Is that not so?
More Information:
The usage case in this instance is blocking a "known offenders" list of IP addresses from accessing static content on a web server. FWIW, blocking through Apache's Deny from directive is equally slow (if not more so), as it also does a linear scan.
FYI: the final working solution was to use Apache's mod_rewrite in conjunction with a Berkeley DB map to do lookups against the blacklist. The indexed nature of Berkeley DBs allowed the list to scale with O(log N) performance.
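For reference, a minimal sketch of what that mod_rewrite + DBM arrangement can look like. The paths and map name here are assumptions, not the original configuration; the DBM map would be compiled from a plain-text file with Apache's httxt2dbm utility:

```apache
# Sketch of the mod_rewrite + DBM lookup described above.  Assumes a text
# map with lines like "203.0.113.7 deny", compiled via:
#   httxt2dbm -i blacklist.txt -o /etc/apache2/blacklist.map
RewriteEngine On
RewriteMap ipblock dbm:/etc/apache2/blacklist.map

# Default to "allow" when the address is not in the map; 403 on "deny".
RewriteCond ${ipblock:%{REMOTE_ADDR}|allow} =deny
RewriteRule ^ - [F]
```

Because the map is consulted through mod_rewrite, the lookup only happens once a valid HTTP request has arrived, exactly as described above.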
Try using iptables and building a multi-level tree to decrease the number of lookups.
And so on, adding nesting levels. Obviously you'll need an automatic way of building the rules, and you should have chains only for the networks where you have one or more offenders. This way you can reduce the number of lookups that have to be done quite significantly, and I think it might actually work.
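As a concrete illustration of the multi-level idea (a sketch, not the exact rules from this answer): a generator that splits a flat blacklist into one subchain per first octet, so a packet traverses at most 256 /8 checks plus one small chain instead of the whole list. The chain names ("BLACKLIST", "bl-&lt;octet&gt;") are invented for this example.

```shell
#!/bin/sh
# Sketch: read IPs on stdin, print iptables commands that build a
# two-level tree keyed on the first octet.  Chain names are examples.
gen_tree() {
    sort -u | awk -F. '{
        if (!seen[$1]++) {
            # one subchain per first octet, reached via a /8 match
            print "iptables -N bl-" $1
            print "iptables -A BLACKLIST -s " $1 ".0.0.0/8 -j bl-" $1
        }
        print "iptables -A bl-" $1 " -s " $0 " -j DROP"
    }'
}

printf '10.0.0.1\n10.0.5.9\n192.168.1.1\n' | gen_tree
```

A deeper tree (second octet, third octet) follows the same pattern, at the cost of more chains to maintain.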
This is exactly what ipset is for. From its website http://ipset.netfilter.org/:
If you want to

- store multiple IP addresses or port numbers and match against the collection by iptables at one swoop;
- dynamically update iptables rules against IP addresses or ports without performance penalty;
- express complex IP address and ports based rulesets with one single iptables rule and benefit from the speed of IP sets

then ipset may be the proper tool for you.
It is written by netfilter core team member Jozsef Kadlecsik (who also wrote the REJECT target), so this is the best choice I can think of.
It is even included in recent kernels.
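For what it's worth, the in-kernel ipset that ships with recent kernels lets you raise the 65536 default via the maxelem option, and loading a large set goes much faster through an ipset restore file than through one ipset invocation per address. A sketch of generating such a file (the set name "blacklist" and the hashsize/maxelem values are example choices):

```shell
# Sketch: build an "ipset restore" file so a large set loads in one pass
# instead of one ipset call per address.  Names and sizes are examples.
make_ipset_restore() {
    echo 'create blacklist hash:ip family inet hashsize 1048576 maxelem 1048576'
    while read -r ip; do
        echo "add blacklist $ip"
    done
}

printf '203.0.113.7\n198.51.100.23\n' | make_ipset_restore > blacklist.restore
# Then, as root:
#   ipset restore < blacklist.restore
#   iptables -I INPUT -m set --match-set blacklist src -j DROP
```

The whole ruleset then stays at a single iptables rule regardless of how many addresses the set holds.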
I have not tested this myself, but when I heard your problem description I instantly thought "pf" (as from OpenBSD). pf has the concept of address tables, which may be just what you're looking for.

According to some very cursory research I did, it would seem that this has the potential to scale better than ipset. According to the PF FAQ's chapter on runtime options, out of the box, without tuning, pf supports a total of 1,000 tables, with a total of 200,000 entries across all tables by default (100,000 if the system has less than 100 MB of physical memory). This leads me to believe that it's at least worth testing to see if it works at any kind of useful level.

Of course, I'm assuming you're running your servers on Linux, so you'd have to have a separate firewall box running some OS with pf (like OpenBSD or FreeBSD). You might also be able to improve throughput by doing away with stateful packet filtering altogether.
Have you investigated using FIB_TRIE instead of FIB_HASH? FIB_TRIE should scale much better for your prefix counts (/32 null routes are still prefixes, just very specific ones). You might need to compile your own kernel to use it, but it helps.
FIB_TRIE Notes
For posterity: according to the ipset docs, the default size of a set is 65536; this can be changed via options. I'm putting this here since I can't comment yet.
Some helpful notes for anyone who stumbles across this problem in the future:
First of all, don't analyze any traffic that you don't need to. If you're blocking TCP traffic, for example, only filter the SYN packets; that way you only hit the filter once per connection. You can use -m state if you want, but connection tracking has its own overhead that you may want to avoid if performance is an issue.

Second, putting a million rules into iptables takes a long time: several minutes. If you need to track that many entities, you'd probably best keep it out of netfilter. The sheer size of the ruleset does make a difference.
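On the load-time point: much of those minutes goes to forking one iptables process per rule, and iptables-restore, which loads an entire table in one atomic commit, is markedly faster. A sketch of generating such a file (the BLACKLIST chain name and file name are example choices):

```shell
# Sketch: emit an iptables-restore file so the ruleset loads in one
# atomic commit rather than one iptables invocation per rule.  The
# --syn match limits checks to one per TCP connection, per the advice
# above; the chain name is an example.
make_restore_file() {
    echo '*filter'
    echo ':BLACKLIST - [0:0]'
    echo '-A INPUT -p tcp --syn -j BLACKLIST'
    while read -r ip; do
        echo "-A BLACKLIST -s $ip -j DROP"
    done
    echo 'COMMIT'
}

printf '203.0.113.7\n198.51.100.23\n' | make_restore_file > rules.v4
# Then, as root:  iptables-restore --noflush < rules.v4
```

Even loaded this way, the kernel still scans the chain linearly per packet, so this only fixes the load time, not the per-packet cost.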
Third, the goal is to avoid linear scans; but unfortunately, both iptables and iproute2 are inherently linear. You can divide your rules out binary-tree-style into a large number of chains which limits the number of rules you have to consult, but even still, iptables is not well suited for this size of problem. It will work, but it's a waste of valuable resources.
Fourth, and most importantly, pushing your workload to userspace is not a bad idea. This allows you to write your own tight code or use an off-the-shelf solution that is tuned to your problem set. My own solution to this problem, as mentioned, was to use BDB lookups triggered through Apache's mod_rewrite system. This had the added benefit of triggering only one lookup per session, and only after a valid request had been submitted. In this case, performance was extremely fast and the cost of the blocklist was almost negligible.
You can do similar userspace manipulation with iptables by using the -j QUEUE target in conjunction with libnetfilter_queue. This tool is powerful, simple, and poorly documented. I would recommend reading up as much as possible from as many sources as you can find, as there's a lot of interesting material scattered across the web that is not part of any official documentation.