Why Containers Won't Beat VMs
Are LXC-style containers the alternative to using cloud-based virtual machines? There's been some hype suggesting that "the cloud" is moving away from virtual machines and toward containers. In a way, this is yet another misunderstanding of what the cloud is. These are not two solutions to the same problem. But even in the cases of overlap, claiming immediate and decisive victory for containerization over virtualization is premature at best.
Here are four reasons why the argument for containers is overblown.
Disk space and startup time are not expensive
I'll start with a negative claim. What I often hear cited as the big wins for containers are startup time and storage utilization. These may be conveniences, but neither is actually all that significant over the long term.
Storage remains cheap (and ephemeral storage is effectively "costless", since compute time is the billable metric). The difference between a few hundred megabytes and a couple of gigabytes is inconsequential.
In most (but not all) cases, the time required to spin up a VM (1-2 minutes) is not a significant obstacle. Are we really down to arguing whether developers are more willing to wait one minute for a full OS rather than opt for a container that only takes 30 seconds to provision?
While there are specific cases where these might matter, for normal use cases, neither the amount of storage nor the duration of startup time is a decisive differentiator.
Do you ever plan to upgrade your OS?
One aspect of software architecture that I often try to emphasize is long-term maintainability. Over the long term, you will need to update not just your application, but its dependencies.
Ideally, when you are working on a server, you want the application's needs to drive what the server runs. With virtualization, this is fairly easy.
But containerization can throw a wrench into things. In this environment, you use one OS (kernel plus libraries plus base container) and run many (container-bound) applications in isolated instances that are coupled non-trivially with the host OS.
What happens when you need to upgrade the OS (esp. the kernel, but also other parts)?
In a virtualized environment, there is a 1-to-1 relationship between your apps and the OS. You can be selective about when you update. If you have 30 VMs, you upgrade servers as needed. When one app requires a kernel upgrade, this does not trigger a mad rush to update all 30 applications.
In containerization, when the OS updates, potentially everything has to update -- all of the containers must go at the same time. When you have 30 containers on one system, you get to do 31 migrations at the same time.
(While it's easy to focus on just the kernel -- the most visibly shared runtime piece -- there are other pieces of container models whose workings impact the containers themselves.)
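To put numbers on the upgrade scenario above, here is a minimal Python sketch. The function name is hypothetical, and it rests on a simplifying assumption of mine: that every container on a host has to be migrated together with that host. Real setups may be able to stage some of this work.

```python
def migrations_for_kernel_upgrade(apps: int, model: str) -> int:
    """Count the migrations triggered when ONE app needs a newer kernel.

    Illustrative only: assumes every container on a host must move
    together with the host, which is a simplification.
    """
    if model == "vm":
        # Each app owns its guest OS, so only that one guest is upgraded.
        return 1
    if model == "container":
        # All containers share the host kernel: the host itself plus
        # every container on it are affected at the same time.
        return apps + 1
    raise ValueError(f"unknown model: {model!r}")


print(migrations_for_kernel_upgrade(30, "vm"))
print(migrations_for_kernel_upgrade(30, "container"))
```

With 30 apps, the VM case touches a single guest, while the container case touches the host and all 30 containers at once.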
Isolation and security
Virtual machines are not completely impenetrable from a security standpoint. But their isolation model makes them more resistant to attacks than containers. A wiki page on the Ubuntu site highlights how the potential for kernel exploits (among other things) remains a persistent threat for containerized solutions:
"[C]ontainers will always (by design) share the same kernel as the host. Therefore, any vulnerabilities in the kernel interface, unless the container is forbidden the use of that interface (i.e. using seccomp2) can be exploited by the container to harm the host."
In my own testing, I have found other cases where information leakage (usually on the network) about the container host or other containers allows some subtle, firewall-avoiding, network-based attacks against the host or other containers. These can be guarded against, but it's an attack vector that most users are not used to.
I'm not claiming that all containers are currently insecure or vulnerable, but as the quote above indicates, when you give up isolation, you're closing the distance that an attack must traverse.
Stability, scalability, and availability
A strong point of cloud virtualization is that virtual machines can be distributed across physical resources. I may, for example, have some of my frontline servers running in one physical location (presumably on many different bare-metal servers), and have others run in another location. Local hardware failures, and even larger datacenter outages, don't have a huge impact on the app.
Containerization by itself offers no such benefit. In fact, it has a very specific (very local) single point of failure: the host. And since all apps are interacting with the same kernel, a single app that triggers a kernel panic can take down every container on that host.
So it doesn't matter if you run your database in one container and your app server in another container: you're sharing the same resource pool, and both will fail at the same time if the host crashes.
The standard response is: "You should spread your hosts across multiple virtual machines. Then you can run some containers on one host, and other containers on another host." This is very true. You can also do this without the containers.
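The point about failure domains can be made concrete with a small Python sketch. The service names, host names, and the `services_lost` helper are all hypothetical; the sketch only models which host each service is placed on, and deliberately ignores whether the service is packaged as a container or a VM.

```python
def services_lost(placement: dict[str, str], failed_host: str) -> set[str]:
    """Return the services that go down when `failed_host` crashes.

    `placement` maps a service name to the host it runs on. How the
    service is packaged (container or VM) plays no role here.
    """
    return {svc for svc, host in placement.items() if host == failed_host}


# Everything in containers on a single host: one crash takes out both.
single_host = {"database": "host-a", "app-server": "host-a"}

# Same services spread across two hosts: one crash takes out only one.
spread = {"database": "host-a", "app-server": "host-b"}

print(sorted(services_lost(single_host, "host-a")))
print(sorted(services_lost(spread, "host-a")))
```

What limits the damage is the placement across hosts, not the container itself, which is why the same resilience is available without containers.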
Containers are good for many things. Heroku, Stackato, CloudFoundry and other PaaS solutions make good use of them. But they're not even designed to be a replacement for virtualization. Containers serve a different purpose at a different layer. Overstating their position as an alternative to VMs is playing for hype, not fact.
The proclamation of the death of virtualization has been somewhat overstated.