Recently some clients have been issuing CONNECT requests or otherwise treating my server as a proxy, which really annoys me. My current method is to check access.log regularly and filter out such requests along with hacking attempts (trying to access /PMA/, etc.). The question is: how can I automate this process, so that the server blocks these IP addresses when such requests are detected?
Given that your concern is triggered by use of the CONNECT method, i.e. this is how you are determining "bad behavior," why don't you just block requests that use the CONNECT method? For example, a <LimitExcept> directive that lists only GET, HEAD and OPTIONS will block POST, PUT, DELETE, CONNECT, etc. while allowing only the specified methods.
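As a rough illustration of what such a directive might look like (a sketch, not necessarily the exact configuration the answer had in mind): the `<Location "/">` wrapper and the Apache 2.4 `Require` syntax are assumptions here, and the block would need adapting to your own virtual host layout.

```apache
# Assumes Apache 2.4 (mod_authz_core). On Apache 2.2 the body would instead be
# "Order deny,allow" followed by "Deny from all".
<Location "/">
    # Any method not listed here (POST, PUT, DELETE, CONNECT, ...) is refused.
    <LimitExcept GET HEAD OPTIONS>
        Require all denied
    </LimitExcept>
</Location>
```

If the site has forms or an API that relies on POST, add POST to the list of allowed methods, otherwise those requests will be refused as well.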
You can use tools such as Fail2Ban, ModSecurity, etc. to detect and deal with bad behavior, but be forewarned that these tools (1) will require you to determine what counts as "bad" behavior, and (2) will require (sometimes substantial) processing power on your server. As to the first issue, you run some risk of inadvertently blocking traffic that you actually want and, in any case, by chasing IP addresses you are playing a game of "whack-a-mole." As to the second, you need to weigh the server resources the "solution" will consume against how little a simple "hit" on your webpage costs, provided you have taken steps to ensure that the hit is harmless.
On the other hand, your server should not act as an open proxy, even though bad actors will always try to use it as a proxy, and many other things. Server hardening is the key here. For this, you might check out the Center for Internet Security (CIS) Security Benchmarks program, which will let you download free guidance for hardening many web servers and operating systems.
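One small piece of that hardening, assuming the server is Apache with mod_proxy available, is making sure forward proxying is switched off. The fragment below is only a sketch of that single setting and does not replace a full benchmark review.

```apache
# Only meaningful when mod_proxy is loaded; without it Apache cannot proxy at all.
<IfModule mod_proxy.c>
    # Disable forward (open) proxying. This is also the compiled-in default,
    # so the line mainly guards against it having been switched on elsewhere.
    ProxyRequests Off
</IfModule>
```

This setting does not affect reverse proxying configured with ProxyPass, so it should be safe to keep even if the server fronts a backend application.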
Try Fail2Ban and configure the Apache proxy filter: http://www.fail2ban.org/wiki/index.php/HOWTO_apache_proxy_filter
In addition to Fail2Ban, which I always deploy and have not found to be that resource intensive at all, you can also consider RepSheet:
http://getrepsheet.com/index.html
A bit like Fail2Ban, you can use it to say "This type of traffic is probably unwanted, and this type of traffic behaviour is what I expect and want to prioritize." Rather than banning outright or just throttling, you can then use the headers it adds to build logic at the application level, such as presenting a challenge like a Captcha or sending the traffic to secondary servers.