I'm trying to optimise a page, and I'm seeing some strange behaviour. Each time I click on a link to the page, all resources are fetched from the server, responding with 200s. However, when I refresh the page (specifically, F5 in Firefox), all resources return a 304 and - of course - the page loads much faster as a result.
The main page returns a 200 in both cases. In the refresh case, If-Modified-Since
headers are sent with the requests to the resources. However, in the 'clicking a link' case, they are not. What's the reason for that, and can I control it?
To determine exactly why this is happening, try the Firefox plugins LiveHTTPHeaders or Firebug to see what your server's response headers are - more specifically, how they permit caching by the browser. Without that information, it's hard for me to say precisely why you're getting this behaviour. But yes, there are a few ways you can control it.
If you know the resources will not change, you can explicitly tell the browser to cache the objects for a fixed amount of time. An excellent hack is to cache for years, then change the URL slightly to force a refresh (e.g. http://.../images/test.jpg?1 and replace 1 with 2, 3, 4, etc.). Some frameworks do this automatically by appending the last-modified timestamp.
1) .htaccess E-Tags (from http://stuntsnippets.com/etags-htaccess/)
Full documentation: http://httpd.apache.org/docs/2.0/mod/core.html
2) .htaccess Cache-Control (from http://www.askapache.com/htaccess/apache-speed-cache-control.html)
Full documentation: http://httpd.apache.org/docs/2.0/mod/mod_expires.html
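The URL-versioning trick mentioned above (appending the last-modified timestamp) could be sketched roughly like this in Python. The function name versioned_url and the getmtime parameter are made up for illustration; a real framework will have its own helper for this.

```python
import os
from typing import Callable


def versioned_url(url: str, file_path: str,
                  getmtime: Callable[[str], float] = os.path.getmtime) -> str:
    """Append the file's last-modified time (whole seconds) to the URL.

    The browser can then cache the URL "forever"; when the file changes,
    the timestamp changes, so the URL changes and the browser refetches.
    """
    version = int(getmtime(file_path))
    # Use '&' if the URL already carries a query string, '?' otherwise.
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{version}"
```

You would call it when rendering the page, e.g. versioned_url("/images/test.jpg", "/var/www/images/test.jpg"), where the filesystem path is again only an example.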
Why this is happening
The behaviour is probably down to browser caching, but the confusion is due to how the responses are displayed. This is a big assumption on which this whole answer hinges, so forgive me if this is incorrect.
I find Chrome (hit F12, 'Network' tab) is better at illustrating this than Firefox.
What is likely happening is that when you 'follow a link' (which should be the same as directly entering the URL) you are seeing 200 responses which are a mix of real requests and browser-cached responses. This is normal behaviour. Chrome illustrates 'from cache' responses by explicitly stating 'from cache' against each resource in the network tab. I believe FF illustrates this as greyed responses in the timeline.
In both browsers, items retrieved from cache still display a status code. That status comes from the last server response, but the item itself was still served from cache.
When you hit F5, you are forcing a request to be sent: not to disregard the cache entirely, but to check with the server whether the resource has changed. Your request headers contain If-Modified-Since because the resource is still available from cache. A 304 is returned, and your browser uses the cached version, confident that the server version has not changed.
Following a link = Use cached data.
F5 = Send requests again.
Ctrl+F5 = Send requests again but also disregard the local cache.
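As a very simplified model of the three cases above (not how any real browser is implemented), something like the following captures the idea. The action labels "link", "f5" and "ctrl_f5" and the function plan_request are invented for this sketch, and it assumes a cached copy is still considered fresh when you follow a link. Real browsers also add other request headers that are left out here.

```python
from typing import Optional


def plan_request(action: str, cached_last_modified: Optional[str]) -> dict:
    """Describe, roughly, what happens for one sub-resource.

    action: "link", "f5" or "ctrl_f5" (hypothetical labels).
    cached_last_modified: Last-Modified value of the cached copy,
        or None if nothing is cached.
    """
    if action not in ("link", "f5", "ctrl_f5"):
        raise ValueError(f"unknown action: {action}")
    if action == "ctrl_f5" or cached_last_modified is None:
        # Cache ignored or empty: plain request, full 200 expected.
        return {"send_request": True, "headers": {}}
    if action == "f5":
        # Revalidate: the server can answer 304 if nothing changed.
        return {"send_request": True,
                "headers": {"If-Modified-Since": cached_last_modified}}
    # Following a link: serve straight from cache, no request at all.
    return {"send_request": False, "headers": {}}
```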
Browser caching is likely disabled for the main page (perhaps all .html content by default) by way of the response headers returned for it. Because of this, the browser cannot send an If-Modified-Since in the refreshed request: the page doesn't exist locally in cache, so there is no date to compare content with. Because no If-Modified-Since is sent in the request, the server must respond with another 200 and the full page content.
Because the resource items are available from cache, so there is a date to send: normally the Last-Modified value stored with each item when it was added to the browser's cache.
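To make the server's side of this concrete, here is a rough Python sketch of the 200-versus-304 decision. It only illustrates the logic; status_for is a made-up name, and a real server (Apache, for instance) handles this for static files itself and also considers other validators such as ETags.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def status_for(if_modified_since: Optional[str], last_modified: str) -> int:
    """Return 304 if the client's copy is still current, otherwise 200."""
    if if_modified_since is None:
        # No conditional header: the full content must be sent.
        return 200
    try:
        client_date = parsedate_to_datetime(if_modified_since)
        resource_date = parsedate_to_datetime(last_modified)
        unchanged = resource_date <= client_date
    except (TypeError, ValueError):
        # Unparseable or incomparable dates: fall back to a full response.
        return 200
    return 304 if unchanged else 200
```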
Because the browser doesn't send If-Modified-Since headers when retrieving from local cache. The 200 "response" is a cached one. You don't need to control that as such.
None of this explains why it loads faster when you hit F5. Are you positive this is correct?
On a separate note, be aware of browser 'heuristic caching'. This is the behaviour the browser adopts when caching isn't explicitly defined by the response headers, and is essentially a 'best guess' behaviour. Naturally it differs for each browser.
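One commonly cited heuristic (the HTTP caching spec mentions it as a typical setting, but browsers are free to differ) is to treat a response as fresh for about 10% of the time between its Last-Modified and Date headers. A sketch of that guess, with heuristic_freshness as an illustrative name and the 0.1 fraction as an assumption:

```python
from datetime import datetime, timedelta


def heuristic_freshness(date: datetime, last_modified: datetime,
                        fraction: float = 0.1) -> timedelta:
    """Guess how long a response may be reused without revalidation."""
    time_since_change = date - last_modified
    if time_since_change < timedelta(0):
        # Last-Modified is later than Date: don't guess, treat as stale.
        return timedelta(0)
    return time_since_change * fraction
```

So a file last changed ten days before it was fetched would be reused for roughly a day under this rule, which may explain resources coming from cache even though you never set any caching headers.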
That sounds like a browser setting to me and, flicking through the options I can see in Firefox on my (Windows 7) netbook, I can't see anything that would allow you to control it.
So I think that's a no, I'm afraid.
It would have been helpful if you'd included examples of the request and response headers for both scenarios.
The explicit refresh causes the browser to include Pragma: no-cache (or an equivalent Cache-Control directive) in the request - meaning that all intermediate caches (including proxies) must forward the request to the origin server. When you click on a link, the content may come from your browser cache, an intermediate proxy, a reverse proxy or the origin server.
But you'll get an even faster response if you serve up the content with instructions allowing the browser to cache it (i.e. an Expires or Cache-Control: max-age directive). Conditional requests only really speed up access when dealing with very large files (PDFs, video etc). Note that most browsers will report a 200 status against such operations when fetched from the local cache - but this will be much faster than referring back to the origin.
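For illustration, the two caching instructions mentioned above could be generated like this in Python. caching_headers is a hypothetical helper, and how you actually attach the headers depends on your server or framework.

```python
from email.utils import formatdate


def caching_headers(max_age_seconds: int, now: float) -> dict:
    """Build response headers that let the browser cache for a fixed time.

    now is a Unix timestamp (e.g. time.time()) for the moment of the response.
    """
    return {
        # Modern directive: lifetime in seconds, relative to the response.
        "Cache-Control": f"max-age={max_age_seconds}",
        # Older equivalent: an absolute expiry date in HTTP date format.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }
```

With either header present, following a link should not hit the server at all until the lifetime runs out.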
Really? Are you sure?
This might happen if you are not including any max-age / expires caching instruction - it's such a silly thing to do that I've never tested out how it behaves.