I'm working on a web service which is hosted on EC2 and needs to have a varying number of instances running, depending on load. We have the basic service up and running, but one of the things we're struggling with is the time it takes to provision and launch a Windows instance (we are using some third-party tools that only run on Windows). I've seen this take anywhere from 10 minutes up to a pretty staggering 45 minutes.
Does anyone have any tips on how to speed up the launch of an EC2 instance? Since the AMIs for Windows servers are large compared to Linux AMIs, for example, I'm wondering if one thing might be to make sure that the S3 bucket containing the AMI is located in the same zone where the instance is launched, which would presumably make provisioning the new instance faster.
The Amazon Windows instances reboot on start because the default configuration of the "EC2 Config" Windows service is to rename your host to the internal DNS name of the instance, and renaming a host requires a reboot on Windows. If you do not need to use the internal DNS name of your instance, you might benefit from disabling the SetComputerName feature. Windows instances also have the advantage of not needing to initialize the startup drives, where you may already have bundled your configuration, which again saves some time at instance startup. All of this is possible via the EC2 Windows Configuration Service.
Windows Configuration Service: http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/appendix-windows-config.html
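As a rough sketch of what disabling SetComputerName could look like if you script it rather than use the EC2 Config settings dialog: the snippet below flips one plugin's state in an XML settings document. The XML layout (a list of Plugin elements with Name and State children), the plugin name Ec2SetComputerName, and the idea that the service reads this from a config.xml in its Settings folder are all assumptions here, so check them against the file on your own instance first.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: a cut-down stand-in for the EC2 Config settings file.
# The real file on your instance may differ; inspect it before editing.
SAMPLE_CONFIG = """<Ec2ConfigurationSettings>
  <Plugins>
    <Plugin>
      <Name>Ec2SetPassword</Name>
      <State>Enabled</State>
    </Plugin>
    <Plugin>
      <Name>Ec2SetComputerName</Name>
      <State>Enabled</State>
    </Plugin>
  </Plugins>
</Ec2ConfigurationSettings>
"""


def plugin_states(xml_text):
    """Return a {plugin name: state} dict for every Plugin element."""
    root = ET.fromstring(xml_text)
    return {p.findtext("Name"): p.findtext("State") for p in root.iter("Plugin")}


def disable_plugin(xml_text, plugin_name):
    """Return a copy of xml_text with the named plugin's State set to Disabled."""
    root = ET.fromstring(xml_text)
    found = False
    for plugin in root.iter("Plugin"):
        state = plugin.find("State")
        if plugin.findtext("Name") == plugin_name and state is not None:
            state.text = "Disabled"
            found = True
    if not found:
        # Fail loudly rather than silently writing back an unchanged file.
        raise KeyError("plugin not found: " + plugin_name)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    updated = disable_plugin(SAMPLE_CONFIG, "Ec2SetComputerName")
    print(plugin_states(updated))
```

You would make this change on the instance before bundling the AMI, so that instances launched from it skip the rename and the reboot that goes with it.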
My Windows small instances normally take 15-18 minutes to boot (larger ones are faster). Depending on your requirements, you might be able to bundle all your software within the AMI and have everything booted and running within that period. I understand the reservations about bundling everything into an AMI, but the improvement in startup time may make it worth having production AMIs with everything bundled in. If you want, keep the build scripts separate in your build environments.
Additionally, Amazon has now released EBS root volumes as an alternative to instance-store root volumes. Windows small images running on an EBS volume boot in almost 5 minutes, versus the almost 20 minutes they took before. You also don't need to terminate them; you can stop and start them instead. Depending on your setup, this potentially shaves off a few more minutes spent in some startup scripts.
Essentially, customizing your Windows EC2 Config service and your AMI, and potentially using an EBS boot volume, should reduce startup times to almost 5 minutes. Depending on your app, you can also avoid the sysprep that runs on EC2 instance startup, especially for development purposes. A non-sysprepped m1.large image that avoids a hostname change on startup can start up in about 2 minutes, which is not bad at all.
At this time, as far as I understand it, that's the best you can do with Windows on Amazon EC2, but that really is not too bad. If you are able to forecast close to 10 minutes into the future based on average usage patterns, you should be able to spin up extra instances in time to handle the additional load.
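To make the forecasting idea concrete, here is a minimal sketch of the arithmetic: extrapolate the load to the moment a new instance would finish booting, then launch enough instances to cover it. The straight-line forecast, the function names, and the notion of a fixed per-instance capacity are illustrative assumptions, not something EC2 provides.

```python
import math


def forecast_linear(samples, lead_minutes):
    """Extrapolate load lead_minutes ahead from (minute, load) samples.

    ASSUMPTION: a straight line through the first and last sample is a
    good enough forecast; real traffic may need something smarter.
    """
    (t0, load0), (t1, load1) = samples[0], samples[-1]
    if t1 == t0:
        return float(load1)
    slope = (load1 - load0) / (t1 - t0)
    return max(0.0, load1 + slope * lead_minutes)


def instances_to_launch(forecast_load, capacity_per_instance, running, pending=0):
    """Number of extra instances to start now to cover the forecast load.

    running: instances already serving traffic.
    pending: instances already launched but still booting.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(forecast_load / capacity_per_instance)
    return max(0, needed - running - pending)


if __name__ == "__main__":
    # Hypothetical numbers: load grew from 100 to 200 requests/s over the
    # last 10 minutes, and a new instance takes about 10 minutes to boot.
    load_at_ready = forecast_linear([(0, 100), (10, 200)], lead_minutes=10)
    print(instances_to_launch(load_at_ready, capacity_per_instance=50,
                              running=4, pending=1))
```

Counting instances that are still booting as pending keeps a scheduler that runs every minute from launching the same extra capacity over and over during the 5 to 10 minute boot window.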
I launched 3 instances of a vanilla Windows 2003 server last night. The first two took about 45 minutes; the 3rd, about an hour later, took a full 2 hours before it was ready!
Those had nothing in them at all, without any S3 usage. I doubt there is a way to speed up that fundamental step apart from waiting for Amazon to make deployment speed improvements over time. So, I would conclude that a certain delay is to be expected and Kurt's advice is good, which is to keep 1 or 2 in reserve already prepared.
Another thing you could do is to create a new instance of your AMI type a few times and time it. Then try a few times with your S3 storage and see how much time that adds to it. I assume that the availability zone should match between the image and S3, although I don't know how much time difference that will make.
Once you've determined the max time to provision, keep ahead of the load/usage by that many minutes.
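One way to run that timing experiment is a small harness like the sketch below. It is deliberately independent of any particular EC2 library: launch and is_ready are hypothetical callables you would supply yourself (for example, one that starts an instance and one that checks whether your service answers on it), and the safety factor is an arbitrary assumption.

```python
import time


def time_until_ready(launch, is_ready, poll_seconds=15,
                     timeout_seconds=3 * 60 * 60,
                     clock=time.monotonic, sleep=time.sleep):
    """Call launch(), poll is_ready(handle) until true, return elapsed seconds.

    launch and is_ready are caller-supplied (hypothetical) callables;
    clock and sleep are injectable so the harness can be tested quickly.
    """
    start = clock()
    handle = launch()
    while not is_ready(handle):
        if clock() - start > timeout_seconds:
            raise TimeoutError("instance not ready after %d s" % timeout_seconds)
        sleep(poll_seconds)
    return clock() - start


def provisioning_lead_time(durations, safety_factor=1.5):
    """Worst observed launch time, padded by an assumed safety factor."""
    if not durations:
        raise ValueError("need at least one measurement")
    return max(durations) * safety_factor


if __name__ == "__main__":
    # Dry run with a fake clock so nothing is actually launched.
    now = [0.0]
    polls = [0]

    def fake_ready(handle):
        polls[0] += 1
        return polls[0] >= 4

    def fake_sleep(seconds):
        now[0] += seconds

    elapsed = time_until_ready(lambda: "i-hypothetical", fake_ready,
                               poll_seconds=10, clock=lambda: now[0],
                               sleep=fake_sleep)
    print(elapsed, provisioning_lead_time([elapsed, 45.0]))
```

Given how much the launch times reported in this thread vary, it is probably worth collecting more than a handful of measurements, at different times of day, before settling on a lead time.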
Having a minimal system and keeping as much as possible in EBS might help. Or perhaps take an Apache-style approach and keep one or two instances running in reserve?
We've run into this exact problem, but in a very serious way -- our new startup extends Amazon EC2 into a Virtual Lab environment (multi-user, policies, sharing etc.) and so we needed to speed up the start time of Windows machines. Our biggest decision was to support only EBS based volumes in our application, because they're the only ones that can start up in 5-10 minutes. In our testing we found instance-store startup times to vary widely and sometimes take excessive amounts of time, which made them useless for us.
Simon @ LabSlice Virtual Lab Management on EC2