I have an nginx cache proxy that gets content from an Apache origin server. I make requests from curl, wget and Chrome to verify the cache response. The problem is that, for the same URL, I always get a MISS the first time in each separate client.
I would expect that after I make one request from any client, the other clients would get a HIT, but instead I get a MISS. I only get a HIT when repeating the request in the same exact client.
It feels like the cache key might be related to the user agent, but it is not:
proxy_cache_key $scheme://$host$request_uri;
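To double-check what actually ends up in the key, the same variables can be echoed back in a debug header (a minimal sketch; X-Cache-Key is an arbitrary header name used only for debugging, not something nginx defines):
# debug only: expose the exact string used as the cache key
add_header X-Cache-Key "$scheme://$host$request_uri" always;
If the key depended on the user agent, $http_user_agent would have to appear in proxy_cache_key, and it does not.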
To rule out different HTTP versions and user agents, I specified them explicitly in the requests (wget uses HTTP/1.1 by default). Both requests show up as GET in the logs, so they are not HEAD requests.
wget --server-response --user-agent "foo" 'https://www.example.com/x.php?124'
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: nginx/1.16.1
Date: Tue, 03 Mar 2020 19:53:53 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/5.4.16
X-Accel-Expires: 3600
Vary: Accept-Encoding
X-Cache: MISS <<<<<<<<<<<<<<<<<<<<<<<<<< there
# repeating the same request with wget gets a HIT
wget --server-response --user-agent "foo" 'https://www.example.com/x.php?124'
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: nginx/1.16.1
Date: Tue, 03 Mar 2020 19:55:21 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/5.4.16
X-Accel-Expires: 3600
Vary: Accept-Encoding
X-Cache: HIT <<<<<<<<<<<<<<<<<<<<<<<<<<< there
# after the response should be cached, a curl request to the same URL still gets a MISS
curl -L -i --http1.1 --user-agent "foo" 'https://www.example.com/x.php?124'
HTTP/1.1 200 OK
Server: nginx/1.16.1
Date: Tue, 03 Mar 2020 19:56:37 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/5.4.16
X-Accel-Expires: 3600
Vary: Accept-Encoding
X-Cache: MISS <<<<<<<<<<<<<<<<<<<<<<<<<<< there
My config
http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    include /etc/nginx/conf.d/*.conf;

    # lower values might trigger the error "upstream sent too big header"
    proxy_buffer_size 128k;
    proxy_buffers 8 256k;
    proxy_busy_buffers_size 256k;

    # fixes "request entity too large" errors when uploading files
    client_max_body_size 256M;

    # main cache for images and some of the html pages
    proxy_cache_path /nginx_cache levels=1:2 keys_zone=nginx_cache:512m max_size=50g
                     inactive=90d use_temp_path=off;

    # deliver a cached copy in case of an error at the origin server
    proxy_cache_background_update on;
    proxy_cache_use_stale updating error timeout http_500 http_502 http_503 http_504;
    proxy_cache_key $scheme://$host$request_uri;

    # set the HTTP version between nginx and the origin server; you can check the version in the origin server log
    proxy_http_version 1.1;

    # enable gzip after we forced plain text between cache and origin with Accept-Encoding "" in some vhosts
    gzip_types text/plain text/css text/xml text/javascript application/javascript application/x-javascript application/xml image/jpeg image/png image/webp image/gif image/x-icon image/svg;
    gzip on;

    # security headers, iframe blocking, etc.
    add_header X-Frame-Options sameorigin;
    add_header X-Content-Type-Options nosniff;
    add_header Strict-Transport-Security max-age=2678400;

    # default server(s) that don't match any specified hosts
    server {
        server_name _;
        listen 80 default_server;
        listen 443 ssl http2 default_server;
        root /var/www/html;
    }

    # include all our custom vhosts
    include /etc/nginx/adr_vhosts/*.conf;
} # end of http
My vhost config
server {
    listen 443 ssl http2;
    server_name www.example.com;
    root /usr/share/nginx/html;

    location / {
        # use the alt port to bypass the other nginx cache at the origin server (and the X-Real-IP overwrite)
        proxy_pass http://xx.xx.xx.xx:81;
        proxy_cache nginx_cache;

        # ask directly for the right host (including www) to avoid mismatches and additional redirects
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # sub_filter only works on plain text, so disable gzipped communication with the origin server
        proxy_set_header Accept-Encoding "";

        # was it a hit or a miss
        add_header X-Cache $upstream_cache_status;

        # keep the X-Accel-Expires header for debugging purposes
        proxy_pass_header "X-Accel-Expires";
    }
}
I disabled gzip compression between the cache server and the origin server with proxy_set_header Accept-Encoding ""; in order to use sub_filter in some locations. Then I re-enabled gzip towards the clients with gzip_types and gzip on.
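For context, the sub_filter usage looks roughly like this (a simplified sketch with a placeholder path and replacement strings, not my exact config):
location /some-path/ {
    proxy_pass http://xx.xx.xx.xx:81;
    proxy_cache nginx_cache;
    # sub_filter cannot rewrite compressed bodies, so request plain text from the origin
    proxy_set_header Accept-Encoding "";
    # placeholder replacement, for illustration only
    sub_filter 'http://www.example.com' 'https://www.example.com';
    sub_filter_once off;
}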
Inside /nginx_cache there is a separate cache file saved for each client. Comparing two of these files side by side shows they are almost identical, except for the binary (or gzip?) data near the top of the file.
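The stored cache key can be read directly from the files, since each cache file starts with a small header that contains a plain-text KEY: line:
# print the stored cache key of every file in the cache directory
grep -ar "KEY:" /nginx_cache/
Both files should show the same KEY: line (the primary key built from proxy_cache_key), which is consistent with the difference being in the binary header at the start of the file.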
Edit: I get a HIT with all clients if I specify Accept-Encoding: gzip in the request! I will look into that...
Edit 2: wget sends the request header Accept-Encoding: identity, curl by default doesn't send one at all, while Chrome sends Accept-Encoding: gzip, deflate, br. The cache properly returns a HIT if I force the header to any value, as long as it is the same in every client. Is that a misconfiguration on my end or is it normal behavior? It acts like Accept-Encoding is part of the cache key.
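For example, the header can be forced like this in both clients (any identical value works), and the second request is then a HIT no matter which client made the first one:
curl -i --http1.1 -H 'Accept-Encoding: gzip' 'https://www.example.com/x.php?124'
wget --server-response --header='Accept-Encoding: gzip' 'https://www.example.com/x.php?124'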
I am answering my own question in order to clarify the long details above and to partially post the solution...
I found that different clients (curl vs wget vs Chrome) each get a MISS cache reply, one after another for the exact same URL, because of the Vary: Accept-Encoding header in the response (it creates a different cache variant for each Accept-Encoding value).
The Vary: Accept-Encoding seems to come from my origin server, and I confirmed that the cache always returns a HIT if I add a directive for it on the nginx side (sketched below). I am just not sure if this is safe to do, so I will open another question for that.
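A directive that gives this behavior, assuming the goal is simply to stop the cache from creating a separate variant per Accept-Encoding value, is proxy_ignore_headers in the location block (a sketch, not necessarily the exact line from my config):
# ignore the Vary response header for caching purposes,
# so all clients share a single cached copy per URL
proxy_ignore_headers Vary;
The trade-off is that the cache then serves whatever body it stored first, regardless of the Accept-Encoding of later clients, which is the part I am not sure is safe.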