
Question

scones

Asked: 2020-06-26 04:43:42 +0800 CST

varnish delivers randomly empty pages

772

We ran into a problem with our setup where Varnish started to deliver empty pages. We terminate SSL in front of Varnish and run Apache 2.4 with php-fpm over FastCGI behind it.

At first glance the empty pages seemed to occur only in the legacy app, where we are stuck on PHP 5.6 and might therefore get white pages of doom. But the errors happened randomly, and the PHP 7.2 apps were affected as well.

The next guesses were about recent changes in apache on our end (type of mutex). That turned out to be wrong as well.

Turning off caching solved the problem, but is not a solution.

Every internet search hinted at a correlation between Transfer-Encoding: chunked and a wrong HTTP version, but when we checked, we were using HTTP/1.1 both in front of and behind Varnish. Besides, the problem occurred only randomly (~0.6% of pages).

(I am writing this as a question that I will answer myself, for the next poor soul who has to battle this oddity. Somehow this had never been asked before.)

2 Answers


scones · Answer 1 · 2020-06-26T04:43:42+08:00


The solution came to me during testing:

I tested on several machines using curl and less, but sometimes I used -I, which triggers a HEAD request. It turned out that a HEAD request is treated like a GET request in terms of caching (same cache key), but the backend responds to HEAD requests without a body. So you end up with cache objects without a body that trigger a HIT on GET requests as well.
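One way to make this kind of cache poisoning visible while testing is to expose the hit/miss state in a response header. The following is only a sketch: the header name X-Cache is an arbitrary choice and was not part of the original setup.

```vcl
sub vcl_deliver {
    # obj.hits counts how often the cached object has been delivered before;
    # 0 means this response was just fetched from the backend (or passed).
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";   # hypothetical debug header
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

With this in place, a GET that comes back with X-Cache: HIT and an empty body shortly after a `curl -I` on the same URL points at a body-less object sitting in the cache.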

I simply added this line to vcl_hash and the problem went away:

  hash_data(req.method); // cache HEAD requests separately
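For context, here is a minimal sketch of where that line could sit in a complete VCL file. The backend address and port are placeholders and not taken from the real setup, and the `vcl 4.1` marker assumes Varnish 6 or later.

```vcl
vcl 4.1;

# Placeholder backend; point this at the actual Apache instance.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_hash {
    # Make the request method part of the cache key, so HEAD and GET
    # responses are stored as separate objects.
    hash_data(req.method);
    # No return here on purpose: the built-in vcl_hash still runs
    # afterwards and adds the URL and the Host header to the key.
}
```

Leaving out the `return (lookup);` is a deliberate choice in this sketch, since it keeps the default URL and host hashing intact instead of replacing it.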

Hope this helps someone else skip two weeks of debugging.

2

Thijs Feryn · Answer 2 · 2020-07-01T07:28:59+08:00

How Varnish processes `HEAD` requests

Varnish's default behavior is to accept a HEAD request, and turn it into a GET request.

The response is stored in cache, but the payload is stripped off before being returned to the client.

This is done by design, for efficiency reasons, and does not violate the RFCs. The output should be identical.

Varnish does not add a cache variation based on the request method. Because HEAD is converted into GET, it doesn't need that variation.
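For reference, this is roughly what the built-in vcl_hash looks like in the builtin.vcl shipped with Varnish 4 through 6. It is reproduced from memory as an illustration, and newer releases structure it differently, so check the builtin.vcl of your own version.

```vcl
sub vcl_hash {
    # The cache key is built from the URL ...
    hash_data(req.url);
    # ... plus the Host header, or the server IP if no Host was sent.
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    # Note that req.method is not part of the key.
    return (lookup);
}
```

Because the method is absent from the key, a HEAD and a GET for the same URL and host resolve to the same cache object.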

How to circumvent this default behavior

In your specific case, you've done optimizations at the origin level to process HEAD requests. Apparently, receiving pure and unadulterated HEAD requests at the origin matters.

To achieve this, and to circumvent default behavior, there are 2 common ways:

You can explicitly set the request method to HEAD in vcl_backend_fetch:

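sub vcl_recv {
    # Remember the client's method; req.* is not readable on the backend side.
    set req.http.method = req.method;
}

sub vcl_hash {
    # Vary the cache key by method so HEAD and GET are stored separately.
    hash_data(req.http.method);
}

sub vcl_backend_fetch {
    # Varnish turns a cacheable HEAD into a GET for the fetch; restore HEAD.
    if (bereq.http.method == "HEAD") {
        set bereq.method = "HEAD";
    }
    # Do not leak the helper header to the origin.
    unset bereq.http.method;
}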

You can bypass the cache:

sub vcl_recv {
    if(req.method == "HEAD") {
        return(pass);
    }
}

The former solution is probably better than the latter in your case.

Something is going on in your VCL

Although you claim there's no VCL logic in place that keeps HEAD requests intact, I suspect something is going on in your VCL.

I'd love to see your full VCL file, and give it a try myself. Feel free to redact any sensitive data.

I do agree with the solution to your problem.

Although I have a hard time believing that your Varnish installation keeps HEAD requests intact without any special VCL, the solution you offer is a valid one.

Creating a cache variation per request method is a good way to tackle your issue.

Regardless of whether you have special VCL in place for HEAD requests, this is the behavior you need to satisfy your origin server. I'm not going to argue that.

The only reason I'm pursuing this topic is that I'm interested in what caused this issue. You already provided the solution; I just want to make sure we both understand why things happened the way they did.

Looking forward to that VCL file, and thanks for your input.
