Recently we had a problem with our CDN service: one of the edge locations cached an incomplete version of a JavaScript file, and our customers in the United Arab Emirates (and possibly some nearby countries) were unable to use our service. The issue is resolved now, but we've realized that monitoring our CDN network might be a good idea.
We need to monitor:
- Resources are available.
- They have the correct Content-Type, Content-Encoding (gzip), Expires, and some other headers.
- They have the same size in all geo locations.
- Resources often change names (we change the filename suffix when releasing a new version), so we need the ability to update the monitored resource URLs quickly, and ideally automatically during deploys.
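To make the first three checks concrete, here is a minimal sketch of what such a probe could look like. The URL, expected header values, and function names are placeholders, not our real assets:

```python
# Sketch: validate the response headers of a CDN-hosted resource.
# EXPECTED holds placeholder values; a real probe would load them per-asset.
from urllib.request import Request, urlopen

EXPECTED = {
    "Content-Type": "application/javascript",
    "Content-Encoding": "gzip",
}

def check_headers(headers, expected=EXPECTED):
    """Return a list of mismatches between actual and expected headers."""
    problems = []
    for name, want in expected.items():
        got = headers.get(name)
        if got != want:
            problems.append(f"{name}: expected {want!r}, got {got!r}")
    return problems

def fetch_headers(url):
    """Issue a HEAD request and return the response headers as a dict."""
    req = Request(url, method="HEAD")
    with urlopen(req, timeout=10) as resp:
        return dict(resp.headers)
```

Comparing sizes across locations would then reduce to collecting the Content-Length each probe sees and checking that all values agree.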
Since our CDN provider has many servers in different locations, we need to track this from many locations as well. I've seen a lot of services that can check a website from many locations, but all of them seem to be focused on measuring availability and speed, whereas for this issue we care mostly about consistency.
So I believe we have the following options:
- Find a hosted monitoring service that can do what we need. Does anyone know of one?
- Write a monitoring script that uses some list/network of proxy servers. Is there any reliable list or service for this?
- Write a script that uses Tor's exit node selection to check our resources from different locations.
- Use some other CDN that has this functionality out of the box and guarantees that all copies of our resources are consistent across all locations. But we'd still like to have some third-party monitoring.
Which approach is better? Can you suggest something else to solve our problem?
Thanks.
Sorry, but that's almost certainly impossible.
The best way would be to get a list of all edge servers' IP addresses from the CDN provider, and then script something to test the copies at the edge locations. But I'm 99% sure you will not find a CDN provider willing to give you detailed information on all edge locations -- the CDNs are in a fiercely competitive business.
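If you did somehow obtain such a list, the test itself is straightforward: connect to each edge IP directly while sending the public hostname in the Host header, then compare what each edge serves. A rough sketch, with placeholder IPs and hostname:

```python
# Sketch: fetch one resource from a specific edge server IP, sending the
# public hostname in the Host header so the edge serves the cached copy.
import http.client

def fetch_from_edge(edge_ip, host, path):
    """HEAD `path` on a single edge IP; return (status, headers)."""
    conn = http.client.HTTPConnection(edge_ip, 80, timeout=10)
    try:
        conn.request("HEAD", path, headers={"Host": host})
        resp = conn.getresponse()
        return resp.status, dict(resp.getheaders())
    finally:
        conn.close()

def sizes_consistent(results):
    """True when every edge reports the same Content-Length."""
    lengths = {headers.get("Content-Length") for _, headers in results}
    return len(lengths) == 1
```

You would loop `fetch_from_edge` over the known edge IPs and alert when `sizes_consistent` returns False.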
With the exception of the above known list of IPs, with a CDN you can't rely on DIY monitoring, or on HTTP monitoring done by Gomez, Pingdom, or any of the other HTTP monitoring services. CDNs have edge locations in many cities and use some kind of 'smart' traffic routing.
For this reason, your monitoring client would get directed to the 'nearest' CDN POP and would not be able to monitor the other CDN POPs. You would have to run monitoring agents in many, many locations around the world before you could be certain you had covered all the CDN POPs.
All good CDNs have extensive monitoring of their own network, so there really should be no reason to duplicate this yourself. Instead, I would suggest considering:
Switching to a better CDN provider?
Forcing the CDN POPs to evict their cached copies more frequently. Good CDNs can manage the cache TTL on their edges independently of what the HTTP Expires and Cache-Control headers say. That way, if there is an incomplete file, it will get flushed sooner.
1) Check out the monitoring locations of pingdom.com and websitepulse.com
2) Not reliable
3) Not reliable
4) Never heard of it.
Did you talk to your CDN provider about possible tools to suit your monitoring needs? A CDN might provide you with an interface to specifically choose a mirror site or region for your check requests. I would make this the number one route, because everything else is more of a hack than a real solution.
There are proxy directories, but as 3molo stated, these are unreliable and change often. You might consider running your own proxies on cloud services (e.g. AWS or Azure) as monitoring agents, but whether this reflects "real-world" behavior and gets your requests served by the "right" servers depends on the geolocation algorithms your CDN uses.
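If you go the self-hosted-proxy route, the client side is simple: run the same check through a proxy in each region and flag regions that disagree. A sketch with placeholder proxy addresses and region names:

```python
# Sketch: run one check through HTTP proxies hosted in different cloud
# regions. Proxy URLs below are placeholders for your own instances.
from collections import Counter
from urllib.request import ProxyHandler, build_opener

def fetch_length_via_proxy(url, proxy):
    """Fetch `url` through `proxy`; return the reported Content-Length."""
    opener = build_opener(ProxyHandler({"http": proxy, "https": proxy}))
    with opener.open(url, timeout=15) as resp:
        return resp.headers.get("Content-Length")

def find_outliers(results):
    """Given {region: content_length}, return the regions whose value
    disagrees with the most common one."""
    common, _ = Counter(results.values()).most_common(1)[0]
    return sorted(region for region, value in results.items() if value != common)
```

With a handful of small instances this gives you a crude but fully controlled multi-region consistency check.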
Relying on an unreliable peer-to-peer infrastructure like Tor for monitoring sounds like a bad idea and may only be worth a try if you run out of other options.