I'm developing a caching system for an ecommerce platform that will use a reverse proxy for caching. I plan to handle invalidation by using proper HTTP/1.1 headers: I will set an ETag on first generation of the content and cache that ETag value in the application. The Cache-Control header will specify "must-revalidate", so the proxy should set an If-None-Match header with that ETag on subsequent requests. The application will look up the cached ETag value; if it matches, it will send a 304 response, otherwise it will generate a full 200 response.
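For concreteness, the origin-side behaviour I have in mind looks roughly like this (just a sketch in Node.js/TypeScript; the in-memory map, `renderPage`, and the ETag scheme are placeholders for whatever the application actually does):

```typescript
import * as http from "http";

const etagCache = new Map<string, string>(); // url -> ETag; stands in for the app's real cache

// Placeholder for the real page-generation code.
function renderPage(url: string): string {
  return `<html><body>Content for ${url}</body></html>`;
}

http.createServer((req, res) => {
  const url = req.url ?? "/";
  const cachedEtag = etagCache.get(url);
  const ifNoneMatch = req.headers["if-none-match"];

  if (cachedEtag && ifNoneMatch === cachedEtag) {
    // The proxy's copy is still current: tell it to reuse what it has.
    res.writeHead(304, { ETag: cachedEtag, "Cache-Control": "must-revalidate" });
    res.end();
    return;
  }

  // Generate the full response and remember its ETag for next time.
  const body = renderPage(url);
  const etag = `"v-${Date.now()}"`; // placeholder ETag scheme
  etagCache.set(url, etag);
  res.writeHead(200, {
    "Content-Type": "text/html",
    "Cache-Control": "must-revalidate",
    ETag: etag,
  });
  res.end(body);
}).listen(8080);
```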
I was hoping to use nginx, but I can't tell for sure whether it supports ETags (the docs indicate it doesn't, but maybe they are out of date?). Varnish is another option, but I'm not certain there either.
Which reverse proxy servers out there have full support for ETags? I'd like one that actually caches multiple versions of a resource, so I can do things like split testing without having to disable the cache. That is, HTTP/1.1 says a client can send If-None-Match with multiple ETag values, and the server responds indicating which ETag matched (if any). If the reverse proxy kept multiple copies rather than just the last-seen one, and let the server specify on each request which copy to use, that would be ideal.
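In other words, when the proxy offers several candidate ETags, I'd want the origin to be able to pick which stored variant matched, roughly like this (sketch only; `variantEtags` and the A/B bucket names are made-up placeholders):

```typescript
// Sketch: choose which of several cached variants (e.g. A/B test buckets) the
// proxy already holds, based on the ETags it offers in If-None-Match.
const variantEtags: Record<"A" | "B", string> = {
  A: '"home-variant-a-v3"',
  B: '"home-variant-b-v1"',
};

function matchingEtag(ifNoneMatch: string | undefined, bucket: "A" | "B"): string | null {
  if (!ifNoneMatch) return null;
  // If-None-Match may carry several comma-separated entity tags.
  const offered = ifNoneMatch.split(",").map((tag) => tag.trim());
  const wanted = variantEtags[bucket];
  return offered.includes(wanted) || offered.includes("*") ? wanted : null;
}

// If matchingEtag() returns a tag, the origin can answer 304 with that ETag so
// the proxy serves the matching stored variant; otherwise it sends a full 200.
```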
I just checked the Varnish source code, and even though it supports the `If-Modified-Since` and `If-None-Match` headers, it does not support `must-revalidate` in `Cache-Control`. The only supported attributes in `Cache-Control` are `max-age` and `s-maxage`.

References:

- `bin/varnishd/cache/cache_rfc2616.c`, in `RFC2616_Do_Cond()`
- `bin/varnishd/cache/cache_rfc2616.c`, in `RFC2616_Ttl()`
- `include/tbl/http_headers.h`
nginx requires a third-party module to support ETags, and there are two of them.
You can look at Apache TrafficServer, which seems to have what you need.
Correct me if I'm wrong, and I know this is an old post, but I'd like to comment for new passers-by: I believe a reverse proxy cache doesn't help as much as you'd like when using ETags.
Validation caching mechanisms use the origin server to check whether the validator in the request is still good: whether the ETag matches the resource's current ETag, or whether the resource has been modified since the date given in the request, depending on which header is used.
This means a reverse proxy cache such as Varnish still passes that request through to the origin server. The proxy may answer with its stored copy rather than have the origin rebuild the response, but you haven't saved the round trip to the origin server.
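To make that concrete, the revalidation step a validating cache performs looks roughly like this (a sketch using Node 18+'s built-in `fetch`; `cachedEntry` is a hypothetical stand-in for the cache's stored copy):

```typescript
// Sketch of what a validating cache does on every request: it still has to ask
// the origin whether its stored copy is good.
const cachedEntry = { etag: '"abc123"', body: "<html>cached copy</html>" };

async function serveWithRevalidation(originUrl: string): Promise<string> {
  // The round trip to the origin happens no matter what.
  const res = await fetch(originUrl, {
    headers: { "If-None-Match": cachedEntry.etag },
  });

  if (res.status === 304) {
    // Origin says the copy is still valid: bandwidth saved, round trip not.
    return cachedEntry.body;
  }

  // Otherwise store and serve the fresh response.
  cachedEntry.body = await res.text();
  cachedEntry.etag = res.headers.get("etag") ?? cachedEntry.etag;
  return cachedEntry.body;
}
```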
Browsers can cache responses and handle a 304 in any case, so the user's private cache may be better suited to this than a reverse proxy (YMMV, especially at scale and depending on your use case, of course; I don't want to make assumptions about your apps).
From the spec, see section 13.3 (the validation model), and in particular note section 13.3.4, on when to use entity tags versus last-modified dates.
So Varnish can return the response for you, but you still have a round trip to the origin server. If you can use an application-level cache such as APC or memcached to keep that validation cheap, it might still be worth it to you. Validation caching is generally better for saving bandwidth than for saving server resources, however.

Validation caching might best be left to the client (browser or API code).
Using the expiration model for caching is where a reverse-proxy cache really shines, because it lets you skip hitting the origin server altogether. Using `Expires`, `Cache-Control`, `Date`, etc. is (again, IMO) the best mechanism for a reverse proxy cache, since the cache can return the response, assuming it's not stale, without ever touching the origin server.

To date I believe there are still no proxies that fully support this part of the HTTP spec, so about a year ago I decided to write my own using Node.js and MongoDB:
https://github.com/colinmollenhour/node-caching-proxy
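For contrast with the validation model above, here's roughly what the expiration model asks of the origin: stamp a freshness lifetime once and let the proxy answer repeat requests on its own until the response goes stale. A minimal sketch (the lifetimes and port are arbitrary):

```typescript
// Sketch: under the expiration model the origin sets a freshness lifetime on
// the response, and a reverse proxy can serve repeat requests for that long
// without contacting the origin at all.
import * as http from "http";

http.createServer((_req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/html",
    // s-maxage governs shared caches (the reverse proxy); max-age the browser.
    // An Expires header could be used instead; Node adds the Date header itself.
    "Cache-Control": "public, max-age=60, s-maxage=300",
  });
  res.end("<html><body>Fresh at the proxy for five minutes</body></html>");
}).listen(8081);
```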