Is it possible to set HSTS headers on an Amazon CloudFront distribution from an S3 origin?
We just migrated to AWS. We currently have an EC2 instance running Nginx in front and Apache on the back end, and it is working well. All sites launch properly and include the Cache-Control header for files served from the EC2 instance.
The problem is with all the static files we placed in Amazon S3, which are accessed through the CloudFront CDN. We can access the files fine (with no CORS issues), but CloudFront apparently doesn't serve them with a Cache-Control header. We want to leverage browser caching.
The way I see it, the EC2 instance doesn't play a role here: the static files are served directly by S3 + CloudFront, so the requests never reach the web server on EC2.
I'm at a complete loss.
Questions: 1) How do I set Cache-Control in this case? 2) Is it even possible to set it, and if so, from S3 or from CloudFront?
Note: I've found a few pages via Google that show how to set the header in S3 for individual objects. That's not a productive way to do it, especially since in my case we are talking about several objects.
Thanks!
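For what it's worth, the per-object header can be applied in bulk with a script rather than by hand. Below is a minimal sketch, assuming the boto3 S3 client (`get_paginator("list_objects_v2")`, `head_object`, and `copy_object` with `MetadataDirective="REPLACE"`). The helper names, bucket name, and Cache-Control value are placeholders I made up for illustration. It copies each object onto itself, so treat it as a starting point and try it on a test prefix first.

```python
from typing import Any, Dict, Iterator


def iter_keys(s3: Any, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key under `prefix`, following pagination."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield obj["Key"]


def build_copy_args(bucket: str, key: str, head: Dict[str, Any],
                    cache_control: str) -> Dict[str, Any]:
    """Build copy_object arguments that rewrite an object's metadata in place.

    REPLACE discards the existing metadata, so Content-Type and user metadata
    are carried over from the head_object response. Other headers (for example
    Content-Encoding) and ACLs are NOT carried over by this sketch.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        "MetadataDirective": "REPLACE",
        "CacheControl": cache_control,
        "ContentType": head.get("ContentType", "binary/octet-stream"),
        "Metadata": head.get("Metadata", {}),
    }


def set_cache_control(s3: Any, bucket: str, cache_control: str,
                      prefix: str = "") -> int:
    """Set Cache-Control on every object under `prefix`; return the count."""
    count = 0
    for key in iter_keys(s3, bucket, prefix):
        head = s3.head_object(Bucket=bucket, Key=key)
        s3.copy_object(**build_copy_args(bucket, key, head, cache_control))
        count += 1
    return count


# Hypothetical usage (bucket name and max-age are placeholders):
#   import boto3
#   set_cache_control(boto3.client("s3"), "my-static-bucket",
#                     "public, max-age=86400", prefix="assets/")
```

The S3 client is passed in as a parameter so the logic can be exercised against a stub before it touches a real bucket. CloudFront forwards the origin's Cache-Control header to the browser, so setting it on the S3 objects should cover both layers.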
Background
I'm hosting a static site on S3, with CloudFront over the top. The issue I have is with my HTML files.
According to CloudFront's FAQ:
Amazon CloudFront uses these cache control headers to determine how frequently it needs to check the origin for an updated version of that file
What I've done so far
With this in mind, I've configured the HTML files in my S3 bucket to send the following headers:
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Expires: Fri, 01 Jan 1990 00:00:00 GMT
On my first call to my samplefile.htm, I see the following response headers (I've excluded obvious headers, e.g. Content-Type, to keep to the point):
Cache-Control:no-cache, no-store, max-age=0, must-revalidate
Date:Sat, 10 Dec 2011 14:16:51 GMT
ETag:"a5890ace30a3e84d9118196c161aeec2"
Expires:Fri, 01 Jan 1990 00:00:00 GMT
Last-Modified:Sat, 10 Dec 2011 14:16:43 GMT
Server:AmazonS3
X-Cache:Miss from cloudfront
As you can see, my Cache-Control header is in there. The problem is that if I update this file and refresh, I get the cached content rather than the latest file, and I can see from the response headers that CloudFront is serving its cached version:
X-Cache:Hit from cloudfront
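To make explicit what those headers ask of a shared cache, here is a small sketch of the standard HTTP freshness calculation (s-maxage, then max-age, then Expires minus Date). The function names are mine, and this models only the generic HTTP caching rules. It does not model CloudFront's own minimum/default TTL settings, which can change what the edge actually does.

```python
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def explicit_freshness(headers: Dict[str, str]) -> Optional[int]:
    """Seconds a shared cache may reuse the response without revalidating.

    Returns 0 when the headers forbid reuse, and None when no explicit
    lifetime is given (a cache would then fall back to its own defaults).
    """
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc or "private" in cc:
        return 0
    for name in ("s-maxage", "max-age"):  # s-maxage wins for shared caches
        value = cc.get(name)
        if value is not None and value.isdigit():
            return int(value)
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(0, int((expires - date).total_seconds()))
    return None


# The headers from the first response above:
response = {
    "Cache-Control": "no-cache, no-store, max-age=0, must-revalidate",
    "Date": "Sat, 10 Dec 2011 14:16:51 GMT",
    "Expires": "Fri, 01 Jan 1990 00:00:00 GMT",
}
print(explicit_freshness(response))
```

By the generic rules these headers give the response no reusable lifetime, which is why the "Hit from cloudfront" on the refresh is surprising.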
Summary/question
With the above in mind, how can I achieve automatic retrieval of the latest HTML when using CloudFront?
As per its FAQ, I should be able to do this with Cache-Control headers, but I can't seem to get it working.
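One workaround that doesn't rely on the headers is to invalidate the changed paths explicitly after each upload. This is a rough sketch, assuming the boto3 CloudFront client and its `create_invalidation` call. The helper names and the distribution ID in the usage comment are placeholders. It isn't "automatic" in the sense asked for, and invalidation requests can incur charges.

```python
import time
from typing import Any, Dict, List, Optional


def build_invalidation_batch(paths: List[str],
                             caller_reference: Optional[str] = None) -> Dict[str, Any]:
    """Build the InvalidationBatch structure expected by create_invalidation.

    Paths must start with "/", so a leading slash is added where missing;
    duplicates are dropped and the result is sorted for stable output.
    """
    normalized = sorted({p if p.startswith("/") else "/" + p for p in paths})
    return {
        "Paths": {"Quantity": len(normalized), "Items": normalized},
        # CallerReference must be unique per request; a timestamp is one option.
        "CallerReference": caller_reference or f"invalidate-{time.time_ns()}",
    }


def invalidate(cloudfront: Any, distribution_id: str, paths: List[str]) -> str:
    """Submit an invalidation and return its ID."""
    response = cloudfront.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )
    return response["Invalidation"]["Id"]


# Hypothetical usage (the distribution ID is a placeholder):
#   import boto3
#   invalidate(boto3.client("cloudfront"), "EXAMPLEDISTID", ["samplefile.htm"])
```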
Following the answers below
In the end I decided to change my www CNAME to point to my S3 bucket directly, then added a new CNAME called "static", which points to CloudFront.
This means that the HTML is served directly from S3, and all of its CSS/JS/IMG references point to static.mydomain.com
I am new to CDNs and experimenting with CloudFront. I have set everything up and all appears to be working fine. I can create a static image on a page and access it through my CloudFront distribution. I am using a custom origin (i.e. not an S3 bucket).
I'm worried that I might be worse off from a performance point of view though. I have a test page that is loading up the same 20 or so images with and without the CDN. Looking at the net panel in Firebug, the first time I load this page the images that are loaded directly from the origin server come in much faster. On subsequent page loads the benefits of the CDN become obvious -- after 3-5 refreshes the CDN is doing better than the origin server.
So I can see that on a popular page on our site that is being hit all the time, this will be a benefit. And I should expect a benefit because I'm in Seattle (around the corner from Amazon) and my server is in CA.
The thing is that if I leave the page for a few minutes and then reload, things are back to square one, with CloudFront being worse than the origin server. Is this expected? Do things drop out of the CDN "cache" so quickly?
Is it possible that something in my setup is hurting performance? Or is the reality that the CDN will only be a net positive for content that is currently being accessed every few seconds on average?
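To put numbers on this rather than eyeballing the Firebug net panel, a script along these lines could time repeated fetches and record the X-Cache header each time. It is only a sketch using the Python standard library. The function names and the URL in the usage comment are placeholders, and the timings include DNS and connection setup, so they are not a pure measure of edge cache behaviour.

```python
import time
import urllib.request
from typing import Callable, List, Tuple


def fetch(url: str) -> str:
    """Download the URL fully and return its X-Cache header ('' if absent)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()
        return response.headers.get("X-Cache", "")


def sample(url: str, rounds: int, pause_s: float,
           fetcher: Callable[[str], str] = fetch,
           clock: Callable[[], float] = time.perf_counter,
           sleep: Callable[[float], None] = time.sleep) -> List[Tuple[float, str]]:
    """Fetch `url` `rounds` times, pausing `pause_s` seconds between fetches.

    Returns one (elapsed_seconds, x_cache_header) pair per fetch.
    """
    results: List[Tuple[float, str]] = []
    for i in range(rounds):
        start = clock()
        x_cache = fetcher(url)
        results.append((clock() - start, x_cache))
        if i < rounds - 1:
            sleep(pause_s)
    return results


# Hypothetical usage: five fetches, five minutes apart, to see whether the
# object falls out of the edge cache between requests.
#   for elapsed, x_cache in sample("http://dexample.cloudfront.net/img/a.png", 5, 300):
#       print(f"{elapsed:.3f}s  {x_cache}")
```

Running the same loop against the origin URL gives a baseline to compare against.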
(cross posted from the AWS forum because I've been spoiled forever by SO's turnaround times)
UPDATE:
There are two good answers below that are worth looking at if you have questions about CloudFront performance. I recently found one explanation for my specific problem that wasn't mentioned, though: I had left the TTL at 5 minutes as an oversight. Since I'm also using a custom origin, there is an additional round trip to the authoritative nameserver to resolve that to the actual Amazon CloudFront domain. Now that the TTL setting is back to 12 hours, the long loads seem to happen less often.
Amazon has recently added the ability to use any server as an origin server for CloudFront, removing the original S3-only restriction.
My question is: how do I set this up? The AWS web GUI seems to support only S3 buckets (still), and the EC2 command line tools don't appear to have anything for registering a CloudFront distribution.
Any thoughts much appreciated!
Thanks, Chris.
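For reference, with the CloudFront API as exposed by boto3, a custom origin is described by a `CustomOriginConfig` entry inside the distribution config. The sketch below builds such a config. The helper name, origin domain, and chosen policies are illustrative assumptions, it uses the older `ForwardedValues`/`MinTTL` style of cache settings, and the service may require further fields depending on the API version, so check it against the current API reference before relying on it.

```python
from typing import Any, Dict


def custom_origin_distribution_config(origin_domain: str,
                                      caller_reference: str,
                                      comment: str = "") -> Dict[str, Any]:
    """Build a minimal DistributionConfig with one non-S3 (custom) origin."""
    origin_id = f"custom-{origin_domain}"
    return {
        "CallerReference": caller_reference,  # must be unique per distribution
        "Comment": comment,
        "Enabled": True,
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": origin_id,
                "DomainName": origin_domain,
                "CustomOriginConfig": {
                    "HTTPPort": 80,
                    "HTTPSPort": 443,
                    # Talk to the origin over whichever protocol the viewer used.
                    "OriginProtocolPolicy": "match-viewer",
                },
            }],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": origin_id,
            "ViewerProtocolPolicy": "allow-all",
            # Legacy-style cache settings; newer setups use a cache policy instead.
            "ForwardedValues": {"QueryString": False, "Cookies": {"Forward": "none"}},
            "MinTTL": 0,
        },
    }


# Hypothetical usage (the domain and reference are placeholders):
#   import boto3
#   config = custom_origin_distribution_config("www.example.com", "my-first-distro")
#   boto3.client("cloudfront").create_distribution(DistributionConfig=config)
```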