When the Varnish cache is empty and X client requests arrive for the same asset, Varnish queues the clients and issues a single backend fetch.
Is there a way to control how many requests are held until the fetch completes (or the size of the time window)? I would like backend fetches to happen more frequently so that clients wait for a shorter time.
When I test with return(pass), the flow is fine and clients see no long waits, but nothing is cached. Because caching is set by the backend, I would like to stay with "return(lookup)", which enables the anti-dogpile effect. That is usually good, but sometimes bad because requests are held.
EDIT: I have posted a partial solution in my comments :)
As far as I know, the request to the backend is fired immediately. If more requests for the same resource arrive before the first one is satisfied, they wait for that in-flight backend request and are served from its response.
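To make that behaviour concrete, here is a minimal Python sketch of request coalescing. It is a toy model under my own assumptions, not Varnish's actual implementation: `Coalescer`, `simulate`, the `/asset.js` key and the sleep-based backend are all hypothetical names invented for the illustration.

```python
import threading
import time


class Coalescer:
    """Toy model: at most one backend fetch per key is in flight at a time."""

    def __init__(self, fetch):
        self._fetch = fetch           # hypothetical backend call: key -> response
        self._lock = threading.Lock()
        self._cache = {}              # key -> cached response
        self._inflight = {}           # key -> Event set when the fetch finishes

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]           # cache hit, no waiting
            done = self._inflight.get(key)
            leader = done is None
            if leader:
                # First miss for this key: this caller performs the fetch.
                done = threading.Event()
                self._inflight[key] = done
        if leader:
            try:
                value = self._fetch(key)          # fired immediately, outside the lock
                with self._lock:
                    self._cache[key] = value
                return value
            finally:
                with self._lock:
                    del self._inflight[key]
                done.set()                        # wake every waiting caller
        done.wait()                               # later callers wait for the fetch
        return self.get(key)                      # then read the cache (or retry)


def simulate(clients=10, backend_delay=0.2):
    """Run `clients` concurrent requests for one asset against a slow backend."""
    calls = []

    def slow_backend(key):
        calls.append(key)
        time.sleep(backend_delay)                 # stands in for a slow origin
        return "body of " + key

    coalescer = Coalescer(slow_backend)
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(coalescer.get("/asset.js")))
        for _ in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(calls), results


if __name__ == "__main__":
    fetches, results = simulate()
    print(fetches, "backend fetch(es) for", len(results), "clients")
```

In this model the only thing the waiting clients wait for is the backend itself: the fetch starts as soon as the first miss arrives, so the time each client spends waiting is bounded by `backend_delay`, not by any batching window.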
If your cache misses feel slow, the most likely cause is a slow backend, or something else being misconfigured.