My question is about using Nginx as a proxy behind another proxy. (Somewhat confusing.)
I want to set up Nginx so it acts as a caching proxy server to an npm mirror. Here is the link: http://eng.yammer.com/a-private-npm-cache/
On my local machine, which is not restricted by a firewall, the following configuration works fine:
proxy_cache_path /var/cache/npm/data levels=1:2 keys_zone=npm:20m max_size=1000m inactive=365d;
proxy_temp_path /var/cache/npm/tmp;
server {
    listen 80;
    server_name classen.abc.lan;
    location / {
        proxy_pass http://registry.npmjs.org/;
        proxy_cache npm;
        proxy_cache_valid 200 302 365d;
        proxy_cache_valid 404 1m;
        sub_filter 'registry.npmjs.org' 'classen.abc.lan';
        sub_filter_once off;
        sub_filter_types application/json;
    }
}
Now I want to apply it to a server that is behind an additional firewall. In the logs, I can confirm that it accesses the correct upstream IP, but the request fails because of the internal firewall.
We have one internal proxy, which I can use to bypass the firewall, for example:
$ curl http://registry.npmjs.org
curl: (7) couldn't connect to host
$ http_proxy=http://proxy.abc.lan:1234/ curl http://registry.npmjs.org
... succeeds ...
This trick does not work with Nginx, as it ignores the http_proxy environment variable. After reading the documentation, I still could not figure out how to modify the configuration so that it uses the proxy internally.
Is it possible to combine both solutions? It is important that caching still works; otherwise I could just use the external mirror registry.npmjs.org directly.
Maybe Nginx should use the internal proxy (proxy.abc.lan) as the proxy_pass target, but then how does the internal proxy know that the request should be sent to the external npm mirror (http://registry.npmjs.org)?
Update to Lukas's answer
I tried Lukas's solution:
rewrite ^(.*)$ "http://registry.npmjs.org$1" break;
proxy_pass http://proxy.abc.lan:1234;
The logs show that the URL is rewritten, but it results in a redirect (triggered by curl classen.abc.lan/test-url):
2014/03/24 11:31:16 [notice] 13827#0: *2 rewritten redirect: "http://registry.npmjs.org/test-url", client: 172.18.40.33, server: classen.abc.lan, request: "GET /test-url HTTP/1.1", host: "classen.abc.lan"
The result of the curl call is not the expected JSON string from http://registry.npmjs.org but an HTML page generated by Nginx:
$ curl classen.abc.lan/test-url
<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx/1.4.7</center>
</body>
</html>
The issue with Lukas's solution is the HttpRewriteModule: when the replacement string of a rewrite starts with http:// or https://, nginx stops processing and returns a redirect to the client, which is the 302 seen above.
If you instead do the rewrite in two stages, with only the second one flagged 'break', it should work, e.g.
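A minimal sketch of such a two-stage rewrite, assuming it goes inside the existing location / block and reuses the proxy address from the question. Neither replacement string starts with http:// or https://, so nginx should not treat either rewrite as a redirect. The exact patterns are an illustration, not a tested configuration:

```nginx
location / {
    # Stage 1: prepend everything except the scheme. The replacement starts
    # with "://", so the redirect rule for "http://" does not apply.
    rewrite ^(.*)$ "://registry.npmjs.org$1";
    # Stage 2: prepend the scheme and stop rewriting. The replacement is
    # "http$1", which again does not literally start with "http://".
    rewrite ^(.*)$ "http$1" break;
    # No URI part here, so nginx passes the rewritten (absolute) URI on.
    proxy_pass http://proxy.abc.lan:1234;
}
```

The caching and sub_filter directives from the question would stay in the same location block unchanged.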
I suspect there's a nicer way to do this, but it appears to work.
I think it may be simpler than either of the examples above. They use rewrite to change the URL; I think you can use proxy_pass on its own, pointing it at the proxy and setting the Host header to the site you actually want to reach, e.g.
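A minimal sketch of that idea, reusing the host names and cache zone from the question. It assumes the internal proxy is willing to route a request by its Host header alone, which not every forward proxy does, so this may or may not work against proxy.abc.lan:

```nginx
location / {
    # Send every request to the internal forward proxy.
    proxy_pass http://proxy.abc.lan:1234;
    # Name the real destination in the Host header; by default nginx would
    # send the host and port from proxy_pass instead.
    proxy_set_header Host registry.npmjs.org;

    # Caching directives unchanged from the question.
    proxy_cache npm;
    proxy_cache_valid 200 302 365d;
    proxy_cache_valid 404 1m;
}
```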
RFC 2616, Section 5.1.2 states: "The absoluteURI form is REQUIRED when the request is being made to a proxy."
So what you are supposed to do is pass the request to the proxy with those modified directives:
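A sketch of what those directives look like in context, assuming they are the two lines quoted in the question's update and that they sit inside the existing location / block. Note that the update reports nginx 1.4.7 answering this form with a 302, so read it as an illustration of the intent (sending an absolute URI to the proxy) rather than a verified configuration:

```nginx
location / {
    # Turn the request URI into an absolute URI, as a proxy expects.
    rewrite ^(.*)$ "http://registry.npmjs.org$1" break;
    # Hand the request to the internal forward proxy.
    proxy_pass http://proxy.abc.lan:1234;
}
```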
According to the nginx docs, using rewrite ... break; will force nginx to use the rewritten URI (now an absolute URI, as the protocol requires) instead of trying to build it from the proxy_pass directive.