In AWS, for example, when I spin up a new EC2 instance, it loads up a new VM, then populates the VM with a container image. This is why spinning up a new EC2 instance takes 60-90 seconds.
Out of curiosity, what are the disadvantages to having AWS run the host machine as-is, and when a user wants to "spin up an EC2 instance", AWS just spins up a container with restricted permissions, and allows the user only access to that container?
The upside would be that the compute instance would spin up very quickly. I'm still learning about cloud technologies, so I was just wondering what the downsides are.
Perhaps it is harder to allocate CPU resources without using VMs, and as a result users would fight each other for the available CPU? Or perhaps there's some security concern? Would love to learn about this.
Containers typically run only a single application and are immutable, i.e. any changes are not preserved across restarts. Containers also don't have their own kernel.
VMs on the other hand run the whole Operating System, including the kernel, init scripts, system daemons, etc. And the storage is typically preserved across restarts.
VMs and containers serve different purposes - google something like "VMs vs Containers", there's plenty on the internet.
If you want to run containers as a service in AWS without having to worry about the underlying VMs, look at AWS Fargate - that does exactly what you want.
Hope that helps :)
Your question is, to some extent, looking at things backwards: EC2 isn't a general-purpose hosting solution that happens to use VMs; it is a service for hosting VMs. As such, there are a few ways to interpret your question.
Why wasn't EC2 designed to use containers?
The answer to this can be deduced from the timeline: EC2 was launched in beta in 2006, and full production in 2008; Docker wasn't publicly released until 2013, and Kubernetes was 2015.
Container technology was being developed at the time EC2 launched - BSD already had "jails", and Linux had some forms of namespace isolation - but it wasn't the mature ecosystem we're familiar with today. Virtual Private Servers, on the other hand, were a well-established concept - VMware explicitly marketed ESX for hosting services in 2002, the Xen hypervisor followed in 2003, and Linode was launched that same year. EC2's innovation was a system for launching virtual servers on demand using this established technology.
Why hasn't EC2 moved from VMs to containers?
Although containers can be thought of in some ways as "a light-weight VM", this is not a full description, and the two are not interchangeable. A VM is designed to give the user the illusion that they are accessing a physical server, with full control of the entire system; resources such as networking are presented as virtual hardware with which the user can directly interact if they wish. Containers present a more limited abstraction, and the application is generally much more closely bound to the configuration of the container itself, such as only forwarding specific network ports.
Amazon has added many services over the years, but is very conservative about retiring old ones which customers rely on. So, they do offer many services based around containers rather than VMs, such as ECS (Elastic Container Service, launched 2014), Fargate (launched 2017), and EKS (Elastic Kubernetes Service, launched 2018); but they are unlikely to retire EC2 if users are still using it.
Why haven't users moved to container services?
Given that container-based cloud hosting is available, why do people still opt to use VM-based services like EC2?
I think there are several reasons; a few that come to mind:
So, although containers continue to grow in popularity, they have not yet completely replaced virtual servers, and probably never will. As such, EC2, and similar VM-based cloud hosting services, are here to stay.
Security is definitely a reason. Containers share a single kernel with each other and with the host, so they are not considered 100% isolated.
Yet cloud providers do offer containers as well, AWS included. I suppose containers are cheaper than VMs, but I haven't checked.
In essence what you ask is a more general topic, VMs vs. containers; regardless of platform, the same pros and cons apply.
There are several different approaches to containers, and the current accepted answer only seems to account for the OCI-style (docker-like) containers. There are many other types of containers, such as LXC and BSD jails, which take different approaches.
LXC for example can easily contain several applications, and is mutable by default. It also has init scripts and system daemons (systemd etc).
Allocating CPU, RAM and disk space can be done just as easily with containers (on Linux, for example, through cgroups).
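To make the resource-allocation point a bit more concrete, here is a minimal sketch of how a container runtime on Linux might translate "1.5 CPUs and 512 MiB" into cgroup v2 settings. The `Limits` class and `cgroup_v2_settings` function are hypothetical names I made up for illustration; only the `cpu.max` and `memory.max` interface files and their value formats come from cgroup v2 itself. The code only computes the values and does not write anything under `/sys/fs/cgroup`.

```python
from dataclasses import dataclass

# Default cgroup v2 CPU scheduling period, in microseconds (100 ms).
CFS_PERIOD_US = 100_000


@dataclass(frozen=True)
class Limits:
    """Hypothetical description of what one tenant is allowed to use."""
    cpus: float      # e.g. 1.5 means one and a half cores' worth of CPU time
    memory_mib: int  # hard memory cap in MiB


def cgroup_v2_settings(limits: Limits) -> dict:
    """Map limits to the values a runtime would write into cgroup v2 files."""
    if limits.cpus <= 0 or limits.memory_mib <= 0:
        raise ValueError("limits must be positive")
    # cpu.max is "<quota> <period>": the group may run for <quota>
    # microseconds in every <period> microseconds.
    quota_us = round(limits.cpus * CFS_PERIOD_US)
    return {
        "cpu.max": f"{quota_us} {CFS_PERIOD_US}",
        # memory.max is a byte count.
        "memory.max": str(limits.memory_mib * 1024 * 1024),
    }


if __name__ == "__main__":
    print(cgroup_v2_settings(Limits(cpus=1.5, memory_mib=512)))
```

So the "users fighting over CPU" worry from the question is handled by the kernel scheduler enforcing these quotas, without needing a hypervisor.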
Provisioning containers is not an instant task (but can be faster than "60-90 seconds") as you still have to get an image, extract it and start it up.
Security is a major concern with all of the container solutions I mentioned, as they all share a kernel. While there are many security measures in place, vulnerabilities are still occasionally found. If you had a shared server with your friends and you all had containers on it, you'd probably be mostly safe, but at the scale of large providers such as Amazon (where tons of businesses use their services), it can be a significant security concern.
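If you want to see the shared kernel for yourself, a rough sketch like the following can be run on the host and again inside a container. The function names are my own, and the `/proc/self/ns/...` links are Linux-specific (on other systems the sketch just reports `None`). The assumption is that the kernel release will match in both places, while the namespace identifiers will usually differ, because the container gets its own namespaces but not its own kernel.

```python
import os
import platform


def kernel_release() -> str:
    """Return the release string of the kernel this process runs on."""
    return platform.release()


def namespace_ids(kinds=("pid", "net", "mnt")) -> dict:
    """Return the namespace identifiers of the current process.

    On Linux each entry looks like 'pid:[4026531836]'. Where /proc is
    unavailable, the value is None instead.
    """
    ids = {}
    for kind in kinds:
        try:
            ids[kind] = os.readlink(f"/proc/self/ns/{kind}")
        except OSError:
            ids[kind] = None
    return ids


if __name__ == "__main__":
    print("kernel:", kernel_release())
    for kind, ident in namespace_ids().items():
        print(f"{kind} namespace:", ident)
```

Inside a VM, by contrast, the kernel release is whatever the guest OS booted, independent of the host.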
If you check the AWS Fargate website for example, it states that many resources for their containers aren't shared, and from that aspect it is much closer to a VM than a traditional self-hosted container:
One final concern I'd like to note is compatibility. As your access to the kernel (and also potentially your syscalls) is limited, you can't do certain tasks like loading DKMS kernel modules or changing sysctl settings. Not every application will run under these restrictions, but those that don't tend to be the exception rather than the norm.
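As a rough illustration of the sysctl restriction, the sketch below checks whether the current process could write a given sysctl key. The helper names are hypothetical, and it assumes the usual Linux layout where a key such as `net.ipv4.ip_forward` lives at `/proc/sys/net/ipv4/ip_forward` (keys that contain dots inside a component, such as some interface names, are not handled). In many unprivileged container setups `/proc/sys` is mounted read-only, so this tends to return `False` there even for root.

```python
import os


def sysctl_path(name: str) -> str:
    """Translate a dotted sysctl name into its path under /proc/sys."""
    return "/proc/sys/" + name.replace(".", "/")


def can_change_sysctl(name: str) -> bool:
    """Return True if this process appears able to write the sysctl key.

    Returns False when the key does not exist, when permissions are
    missing, or when /proc/sys is mounted read-only.
    """
    return os.access(sysctl_path(name), os.W_OK)


if __name__ == "__main__":
    for key in ("net.ipv4.ip_forward", "vm.swappiness"):
        print(key, "writable:", can_change_sysctl(key))
```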
There are many valid use cases for containers (both OCI-like and LXC-like), and it's definitely not a "one solution fits all" thing. Not having to run a whole kernel and do other types of virtualization (graphics, audio, network etc) does result in a lot less overhead, but there are also drawbacks to weigh when using containers, some of which I've mentioned in my answer.