My question is about virtual machines and delivering their content over the server's connection to the internet.
I have an EC2 Windows instance, and its network connection appears to be 100 Mbps.
If I were delivering content from that EC2 instance, is THAT my potential bottleneck?
How does S3 differ? I am guessing there is no real potential outbound bottleneck with S3?
Note: I know S3 and their CDN would be better for static content, however I need to explore this situation for now. Our HTML pages need to access a server-side page via AJAX, and because there is no bombproof workaround for this at the moment, our content and our server need to be on exactly the same domain, which rules out using S3.
Bandwidth needed: I am not sure. We could have up to 100 users downloading videos at any time, probably no more. Videos can be up to 5 MB each, but each user would view up to 20.
I can't speak for Windows instances, but I will presume that their base characteristics are fairly similar to Linux instances.
Your estimate for bandwidth usage is 100 simultaneous video downloads (I am not sure if you mean downloading the file or streaming the video - I will assume the latter). If we take a stream rate of 512kbps, you need about 51Mbit/s or 6.5MB/s.
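The arithmetic above can be checked with a short sketch. The 512kbps stream rate is the assumption made in this answer, and the function name is purely illustrative. Note that the exact figure comes out at 6.4MB/s; the 6.5MB/s quoted above is a rounded value.

```python
def aggregate_bandwidth(streams, kbps_per_stream):
    """Return (Mbit/s, MB/s) needed to serve `streams` concurrent streams."""
    total_mbit = streams * kbps_per_stream / 1000  # kbit/s -> Mbit/s
    total_mbyte = total_mbit / 8                   # bits -> bytes
    return total_mbit, total_mbyte


# 100 simultaneous viewers at an assumed 512 kbit/s each
mbit, mbyte = aggregate_bandwidth(100, 512)
print(f"{mbit:.1f} Mbit/s = {mbyte:.1f} MB/s")
```

Changing the stream rate (for example to 1000kbps for higher quality video) scales the requirement linearly.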
EC2 instances differ in their I/O performance (which includes bandwidth). There are 3 levels of I/O performance: low, moderate, and high. Keep in mind, though, that disk I/O (i.e. from EBS volumes) also is bandwidth dependent. You can only really consider bandwidth within the EC2 network (as it will be completely variable over the Internet).
Some typical numbers to quantify 'low', 'moderate', and 'high' (different sources quote different numbers for theoretical values, so they might not be completely accurate):
High: Theoretical: 1Gbps = 125MB/s; Realistic (source): 750Mbps = 95MB/s
Moderate: Theoretical: 250Mbps; Realistic (source, p57): 80Mbps = 10MB/s
Low: Theoretical: 100Mbps; Realistic (from my own tests): 10-15Mbps = 1-2MB/s
(There is actually a 'very high' level as well, 10Gbps theoretical, but that applies only to cluster compute instances.)
A further point worth mentioning is the degree of variation. On smaller instances, there is more variability in performance as the physical components are shared between more virtual machines. Regardless, you can expect around +/-20% variation in your performance (sources: 1, 2, 3).
In your case (as per the assumptions/calculations at the top), you may need peak bandwidth of 13MB/s (double 6.5MB/s, since disk I/O is also network limited). If you are transferring lower bandwidth content, you should be able to use an instance with 'moderate' I/O performance (see the instance types page); if your calculations result in a higher bandwidth requirement, you will need an instance with 'high' I/O performance. Simply streaming the data should not be CPU or memory bound, but sustaining 100 simultaneous connections will probably require at least a medium sized instance, and if bandwidth is a concern, based on the above, a large instance would be a safer bet.
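As a rough sketch of the reasoning above, the tier choice can be written down as a small calculation. The throughput figures are the 'realistic' numbers quoted in this answer (with 1MB/s taken as a conservative value for the 'low' tier), the 20% margin is the variation mentioned above, and the function and constant names are hypothetical.

```python
# Realistic throughput per I/O tier in MB/s, from the figures quoted in this answer.
# 'low' uses the conservative end of the 1-2 MB/s range (an assumption).
REALISTIC_MB_S = {"low": 1.0, "moderate": 10.0, "high": 95.0}


def pick_tier(stream_mb_s, variation=0.20, disk_over_network=True):
    """Return (tier, required MB/s) for the smallest tier that still fits
    after subtracting the expected performance variation."""
    # Disk I/O (EBS) shares the network, so the requirement is doubled.
    required = stream_mb_s * (2 if disk_over_network else 1)
    for tier in ("low", "moderate", "high"):
        if REALISTIC_MB_S[tier] * (1 - variation) >= required:
            return tier, required
    return None, required  # nothing in the table is fast enough


print(pick_tier(6.5))  # the 100-stream estimate from this answer
print(pick_tier(3.0))  # a lighter workload, for comparison
```

With the 6.5MB/s estimate the doubled requirement of 13MB/s exceeds what a 'moderate' instance realistically delivers, which is why a 'high' I/O instance is the safer choice here.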
I would recommend benchmarking the servers you launch to see if they meet your (calculated) needs. Launch two instances (of the same type) and run iperf on each using the instances' private IP addresses (you will need to open port 5001 in your security group if you run it with the default settings). Additionally, most tests outside of the EC2 network show results of between 80 and 130Mbps (large instances), although such numbers are not necessarily meaningful.
A CDN would be better suited to your needs, if your setup permits it. S3 appears to have a limit of around 50MB/s for bandwidth (at least from a single instance) as per this article, but that is higher than what you should require (note that S3 does not support streaming). CloudFront would be better suited to your task (as it is designed as a CDN): it supports 1000Mbps = 125MB/s by default (source), with higher bandwidth available on request, and can stream content as well.
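For the iperf benchmark between two instances mentioned above, the commands might look like the following sketch. It only builds and prints the command lines for the classic iperf (version 2) client/server mode; the private IP address and the 30 second duration are placeholder assumptions, and the helper name is hypothetical.

```python
IPERF_DEFAULT_PORT = 5001  # iperf's default port; must be open in the security group


def iperf_commands(server_private_ip, seconds=30, port=IPERF_DEFAULT_PORT):
    """Build the iperf command lines for a two-instance throughput test."""
    # Run on the first instance: listen for incoming test connections.
    server = ["iperf", "-s", "-p", str(port)]
    # Run on the second instance: connect to the first one's private IP.
    client = ["iperf", "-c", server_private_ip, "-p", str(port), "-t", str(seconds)]
    return server, client


# 10.0.0.12 is a placeholder; substitute the first instance's private IP.
server_cmd, client_cmd = iperf_commands("10.0.0.12")
print("instance A:", " ".join(server_cmd))
print("instance B:", " ".join(client_cmd))
```

Running the test a few times at different times of day gives a feel for the variation discussed earlier.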
The numbers seem to change over time and as the number of different instance types proliferates. But a number of people post benchmarks. I've had some luck by googling "[instance category] ec2 network benchmark". For example, I wanted to know the bandwidth of an m4.xlarge instance, so I searched "ec2 m4 network benchmark". I found this test result from the Washington Post engineering blog: