Question

user3758232

Asked: 2020-08-08 09:44:54 +0800 CST

Scope and utility of Varnish bans

I need to invalidate the cache of a set of URLs related to a resource that was either deleted/unpublished or updated. E.g., if my resource UUID is 1234abcd, I want to invalidate all its cached derivatives under /resource/1234abcd/*.

I understand that there is no way in Varnish to do this with purge, but it can be done with bans. However, I have a hard time understanding exactly how bans work.

If, e.g., I update resource 1234abcd and ban all its derivatives, I assume that the next client request to /resource/1234abcd/derv1 will be a new backend fetch, and that new resource will be cached. Will I have two versions of the same derivative, one banned and one not (until the old one expires and eventually its ban expires too)? If my resources have a long expiration date, I may accumulate a lot of bans for cached resources that I would much rather have cleared right away.

On a software design level, what is the utility of leaving inaccessible resources around instead of implementing a regex-based purge, which seems more straightforward to manage?

Also, I implemented a ban in my development env, and I see that a ban only takes effect a minute or so after being requested. Does this have to do with the ban lurker or some other timing setting?

Thanks.

1 Answer

Thijs Feryn · Answer 1 · 2020-08-10T23:53:00+08:00

Bans & the ban list

Bans in Varnish work through the so-called ban list. Each item on the ban list is a set of criteria that ideally match properties of a cached object.

The ban lurker, a separate thread that monitors the ban list, is responsible for removing the matching objects.

Bans & regex url matches

In most cases, you'll want to match a URL pattern that needs to be invalidated.

An easy way would be to issue the following ban:

req.http.host == example.com && req.url ~ /resource/1234abcd/.*

The problem with this example is the request scope: the ban lurker only has access to the object and its properties. Request information is not part of that, because an object only contains response information.

In this case, the lurker won't match the item on the ban list, and the item will remain in cache until the next user hits a matching URL. This is not efficient.

Lurker-friendly bans

A trick we use to bypass these scoping limitations is to add host and URL information to the response.

Here's how to do this:

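sub vcl_backend_response {
  # Copy request details onto the object so bans can match on obj.http.*
  set beresp.http.url = bereq.url;
  set beresp.http.host = bereq.http.host;
}

sub vcl_deliver {
  # Hide the helper headers from clients
  unset resp.http.url;
  unset resp.http.host;
}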

You could then run the following ban:

obj.http.host == example.com && obj.http.url ~ /resource/1234abcd/.*

The lurker would be able to match these properties, and would remove the matching objects from cache without the need for a request to happen.
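As a rough sketch, the lurker-friendly expression for the /resource/1234abcd/* case from the question could be built with a small shell helper. The function name `resource_ban_expr` is made up for illustration, and the sketch assumes the VCL above has stored the URL and host in `obj.http.url` and `obj.http.host`.

```shell
#!/bin/sh
# Hypothetical helper: build a lurker-friendly ban expression that matches
# every cached derivative of one resource. It only builds the string; it
# does not talk to Varnish.
resource_ban_expr() {
    host="$1"   # value stored in obj.http.host, e.g. example.com
    uuid="$2"   # resource identifier, e.g. 1234abcd
    printf 'obj.http.host == %s && obj.http.url ~ ^/resource/%s/' "$host" "$uuid"
}

# Print the expression for the resource from the question.
resource_ban_expr example.com 1234abcd
echo
```

The result could then be passed on with something like `varnishadm ban "$(resource_ban_expr example.com 1234abcd)"`, and `varnishadm ban.list` can be used to check that the ban was registered.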

When does the lurker remove the objects from cache?

The varnishd binary has a couple of runtime settings that influence how bans are handled by the ban lurker:

ban_lurker_age: the ban lurker ignores bans until they are this old (60 seconds by default)
ban_lurker_sleep: how long the ban lurker sleeps after examining a batch of objects
ban_lurker_batch: the number of objects the lurker examines before going back to sleep
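To see what these settings are on a given instance, something along these lines may help. It is a minimal sketch that assumes `varnishadm` can reach the running `varnishd`; the function name `show_lurker_params` is only illustrative.

```shell
#!/bin/sh
# The three ban lurker parameters named above.
lurker_params="ban_lurker_age ban_lurker_sleep ban_lurker_batch"

# Show the current value and description of each parameter.
show_lurker_params() {
    for p in $lurker_params; do
        varnishadm param.show "$p"
    done
}

# Only talk to Varnish when varnishadm is actually installed.
if command -v varnishadm >/dev/null 2>&1; then
    show_lurker_params
fi
```

A value can be changed at runtime with `varnishadm param.set`, for example `varnishadm param.set ban_lurker_age 10` (the value 10 is only an illustration). A change made this way does not survive a restart of varnishd unless it is also passed as a startup parameter.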

How to issue bans

There are 3 ways you can issue a ban:

Via an HTTP call that is defined in VCL
Via the varnishadm binary locally
Via a remote CLI call over TCP/IP

Here's a varnishadm example:

varnishadm ban obj.http.host == example.com '&&' obj.http.url '~' '\\.png$'

For more information about a remote CLI call, please have a look at: http://varnish-cache.org/docs/6.0/reference/varnish-cli.html#varnish-command-line-interface

Here's an HTTP example:

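acl purge {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        # Only clients listed in the purge ACL may invalidate content.
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed."));
        }
        # With an x-purge-regex header, add a lurker-friendly ban for this host.
        if (req.http.x-purge-regex) {
            ban("obj.http.host == " + req.http.host + " && obj.http.url ~ " + req.http.x-purge-regex);
            return (synth(200, "Purged."));
        }
        # Otherwise purge only the exact URL that was requested.
        return (purge);
    }
}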

In this HTTP example, we combine purging and banning.

A regular purge issued by curl would look like this:

curl -XPURGE http://example.com/resource/1234abcd/abc

A more flexible regex purge using bans would be issued like this:

curl -XPURGE -H"x-purge-regex: /resource/1234abcd/.*" http://example.com
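For the case in the question, the call could be wrapped in a small function that an application or deploy script runs whenever a resource changes. This is only a sketch: `ban_resource_http` is a hypothetical name, and it assumes the PURGE handling from the VCL above is in place and that the caller's IP is in the `purge` ACL.

```shell
#!/bin/sh
# Hypothetical wrapper around the curl call above: bans every cached
# derivative of one resource via the x-purge-regex header.
ban_resource_http() {
    base_url="$1"   # e.g. http://example.com
    uuid="$2"       # e.g. 1234abcd
    # The regex must not contain spaces, because the VCL concatenates it
    # into a ban expression without quoting it.
    curl -sS -X PURGE -H "x-purge-regex: ^/resource/${uuid}/" "$base_url"
}
```

It would be called as `ban_resource_http http://example.com 1234abcd`. Anchoring the pattern with `^` keeps the ban from matching other URLs that merely contain the same path fragment.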

Issues with bans

Banning is not without issues.

The entire concept of bans revolves around matching patterns in a list with objects stored in cache.

The more items on the list, the more CPU cycles are required to process them all.
The more objects in the cache, the more CPU cycles are required to process them all.

So big ban lists combined with lots of objects can cause a lot of CPU overhead.

Tags over URLs

Another use case for bans is tag-based invalidation.

Sometimes it's quite hard to translate an entity change into matching URLs. Sometimes you don't even know which URLs are affected by an entity change.

In that case, it makes more sense to tag content and to invalidate objects that match one of these tags.

In your application, you would issue response headers like this:

X-Cache-Tags: type:resource, id:1234abcd, category:product

If you then want to remove all resources that are part of the product category, you could simply issue the following ban:

ban obj.http.x-cache-tags ~ category:product
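Keep in mind that `~` is an unanchored regular expression match, so a tag pattern also hits any header value that merely contains it. The sketch below simulates this locally with `grep -E`, which only approximates Varnish's PCRE matching. The function name `tag_matches` and the second and third tag lists are made up for illustration.

```shell
#!/bin/sh
# Rough local simulation of how a tag regex matches X-Cache-Tags values.
tag_matches() {
    header_value="$1"
    pattern="$2"
    printf '%s\n' "$header_value" | grep -Eq -- "$pattern"
}

# Check three example header values against the pattern from the ban above.
for tags in \
    'type:resource, id:1234abcd, category:product' \
    'type:page, category:products-archive' \
    'type:page, category:news'
do
    if tag_matches "$tags" 'category:product'; then
        echo "banned: $tags"
    else
        echo "kept:   $tags"
    fi
done
```

Here the `category:products-archive` object would be banned as well. If that is not intended, a stricter pattern such as `category:product(,|$)` narrows the match in this simulation; how such a pattern needs to be quoted when passed to Varnish is worth testing first.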

A better solution for tag-based invalidation

If you're planning to use bans for tag-based invalidation, you'll run into the same CPU issues if your ban list grows too quickly and you have too many objects in cache that need to be checked against it.

A better solution is the use of xkey, which is a Varnish module that comes with the Varnish modules collection. See https://github.com/varnish/varnish-modules/blob/master/src/vmod_xkey.vcc for more info.

You have to compile this module, but the API is more flexible, and the performance is a lot better.
