I was recently asked 'What causes a line like this in our access.log?'
59.56.109.181 - - [22/Feb/2010:16:03:35 -0800] "GET http://www.google.com/ HTTP/1.1" 200 295 "-" "Mozilla/5.0 (compatible; MSIE 5.01; Win2000)"
My immediate answer was that someone is exploring something a little devious.
But:
- How? Speculation: a short Perl or Python script could easily connect and ask for a URL with an invalid host.
- Vulnerabilities? What is someone looking for when they do this, what have they learned, and should we patch it?
- Do I need a tin-foil hat to keep them from reading my mind?
- And for me the real question: Shouldn't that be a 404 response, not a 200!?
This is on a standard LAMP server (Ubuntu).
You may want to read http://wiki.apache.org/httpd/ProxyAbuse
In particular, see the section "My server is properly configured not to proxy, so why is Apache returning a 200 (Success) status code?", which addresses your question "Shouldn't that be a 404 response, not a 200!?"
If the Apache configuration is fine, it is just sending the root page. That is why you get a 200 status code.
I think this would happen if someone tried to use the server as a proxy. That would make the http://... URL "normal" (as opposed to just the path portion that you would expect from a regular server request.)
As for the 200 status code, that... err.. well, my server does that too. It seems to ignore the http://hostname portion and returns the result from the local server using the remaining path. You'll probably have to dig through the RFCs to figure out why that makes sense; I don't know the answer offhand.
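On the RFC point: as far as I can tell, RFC 7230 (section 5.3.2) requires servers to accept a full URL as the request target even when they are not proxies, which fits the behavior described here. Below is a minimal Python sketch of that "ignore the http://hostname portion" behavior. It is not Apache's actual code, and `effective_path` is a hypothetical helper name used only for illustration.

```python
from urllib.parse import urlsplit


def effective_path(request_target: str) -> str:
    """Return the local path a non-proxy server would end up serving.

    Handles only the two forms discussed here: a plain path
    ("/index.html") and a full URL ("http://www.google.com/").
    """
    if request_target.startswith("/"):
        # Regular request: the target is already a local path.
        return request_target

    # Proxy-style request: drop scheme and host, keep path and query.
    parts = urlsplit(request_target)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return path
```

With this, the request from the log, `http://www.google.com/`, reduces to `/`, which is why the local index page comes back with a 200.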
Assuming you are not using your server as a proxy, these are likely attempts at proxy abuse, which are regularly seen on Internet-facing web servers.
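To see how often this happens, you could scan the access log for requests whose target is a full URL (or that use CONNECT). This is only a rough sketch, assuming the common/combined log format shown in the question; `find_proxy_probes` and the regular expression are my own illustration, not part of any Apache tooling.

```python
import re

# Matches the quoted request field and the status code that follows it
# in a common/combined-format access log line.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) (?P<proto>HTTP/[\d.]+)" (?P<status>\d{3})'
)


def find_proxy_probes(lines):
    """Yield (client_ip, method, target, status) for proxy-style requests."""
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # malformed or unrelated line
        method = match.group("method")
        target = match.group("target")
        # A full URL as the target, or a CONNECT request, is what a
        # client sends when it believes it is talking to a proxy.
        is_absolute = target.lower().startswith(("http://", "https://"))
        if is_absolute or method == "CONNECT":
            client_ip = line.split(" ", 1)[0]
            yield client_ip, method, target, int(match.group("status"))
```

For example, `list(find_proxy_probes(open("access.log")))` would collect every such request; the log path here is an assumption.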
The requests that received a status code of 200 probably returned your index page. You can check this using telnet or curl.
Suppose that:
- your server name is site.example.org;
- third parties are trying to connect to news.example.net and search.example.com;
- your /index.html file has contents you can recognize.
Using curl or telnet, you can reconstruct the requests you received by connecting to site.example.org and asking for the full URL of one of those third-party sites.
If you receive your index.html as a result, that means your server is not configured as a proxy and you should not worry about these requests.
If you actually receive the contents of news.example.net or search.example.com, your web server is configured as a proxy. You can deactivate this by commenting out any proxy on; lines in your Nginx configs or by disabling mod_proxy in your Apache configs.
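The same check can be scripted. Here is a minimal Python sketch of what you would type into telnet, assuming the example names above (site.example.org, news.example.net). The function names are hypothetical, and it uses HTTP/1.0 rather than the HTTP/1.1 seen in the log, so that the server closes the connection and the body is not chunked.

```python
import socket
from urllib.parse import urlsplit


def build_probe(url: str) -> bytes:
    """Build a proxy-style request whose target is the full URL."""
    host = urlsplit(url).netloc
    return (
        f"GET {url} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")


def send_probe(server: str, url: str, port: int = 80, timeout: float = 5.0) -> bytes:
    """Send the probe to `server` and return the raw response bytes."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_probe(url))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def split_response(raw: bytes):
    """Split a raw response into (status_line, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("latin-1")
    return status_line, body


# Example use (needs network access to your own server):
#   status, body = split_response(
#       send_probe("site.example.org", "http://news.example.net/"))
#   print(status)
#   print(body[:200])
```

If the printed body is your own index.html, the server is not proxying. If it is the third-party site's page, it is.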