My Varnish setup looks like this (simplified, obviously):
director default round-robin {
    { .backend = me! }
}
director peers random {
    { .backend = peer1 }
    { .backend = peer2 }
    { .backend = peer3 }
}
And the VCL I'm wondering about:
if (req.restarts == 0) {
    set req.backend = default;
} else {
    set req.backend = peers;
}
What does Varnish do when me! is sick (or, more generally, when every backend in a director is sick; in my case that is 1 of 1)? Does it go to vcl_error immediately and trigger a restart?
I want to know how this interacts with max restarts. Say that in this example I only want to try twice before giving up. I always want to try to get the page locally first and, if that fails, try one of my peers. But if I already know ahead of time that my local backend is sick, I would still like to be able to try 2 of my peers. Is there a way to set that up?
I've done my own testing, and it seems that an unhealthy director is treated as an error. Note that I never tested this with more than one server in the first pool.
When I set my target page.php to automatically return a 500 status (only on the me! server) and watched varnishlog, I saw the request to me! with X-Restarts = 0, which returned a 500. It was followed by a request to one of the peers with X-Restarts = 1, which successfully fetched page.php with a 200 status.
When I set my probe on me! to report it as unhealthy and made the same request for page.php, the first (and only) entry in the log was the request to one of the peers with X-Restarts = 1.
So it behaves as I guessed, but what I really need is a counter of the number of times Varnish has actually tried to pass the request to a backend. There is a big difference between a failure that comes from an actual attempt to retrieve the page and one where no attempt was made at all.
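One workaround I'm considering, as an untested sketch: rather than counting attempts, check the health of the local director in vcl_recv and go straight to the peers when it is sick, so no restart is spent on an attempt that is never made. This assumes Varnish 3.x syntax, where req.backend.healthy is readable in vcl_recv and return (restart) is allowed in vcl_fetch and vcl_error. The "req.restarts < 1" cap is my own choice to get two tries in total, and nothing stops the random director from picking the same peer twice.

```vcl
sub vcl_recv {
    if (req.restarts == 0) {
        set req.backend = default;
        # Assumption: Varnish 3.x. If the local director has no healthy
        # backend, switch to the peers now instead of burning a restart.
        if (!req.backend.healthy) {
            set req.backend = peers;
        }
    } else {
        set req.backend = peers;
    }
}

sub vcl_fetch {
    # A backend was actually tried and returned a server error:
    # allow exactly one more try (two tries in total).
    if (beresp.status >= 500 && req.restarts < 1) {
        return (restart);
    }
}

sub vcl_error {
    # Fetch failed outright (for example, no connection to the backend).
    if (obj.status == 503 && req.restarts < 1) {
        return (restart);
    }
}
```

With this, a healthy me! gives one local try plus one peer try, and a sick me! gives two peer tries, which is the behaviour I described above.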