I had a wonderful conversation with some team members around Windows Containers generally, and they had some very cool analogies that I don’t think have been publicized enough. There’s some good technical detail, too, which I think is worth understanding as we move into this brave new world of containerization.
First, let’s just quickly review our oldest kind of “container,” the virtual machine. I’m going to generalize a bit here, so that what I’m writing is true for any kind of VM. Essentially, a virtual machine is a very rigidly scoped process. The host computer, via software known as a hypervisor, emulates all of the services that a real, physical computer would provide. The hypervisor pretends to be a network card, a BIOS, a CPU, RAM, and so on. Upon that virtual hardware runs a normal off-the-shelf operating system, which in turn runs whatever software you want.

Now, that description is true of an old-time virtualization situation. In reality, modern hypervisors take a variety of approaches to help improve performance. For example, CPUs are rarely emulated, per se; instead, the hypervisor manages thread scheduling on the physical CPUs, and more or less exposes them directly to the VM. Kinda; I’m simplifying a bit. Hyper-V also uses synthetic hardware, rather than emulated hardware, for many devices, which again reduces overhead and improves performance.
Now let’s move on to containers, which – for the purposes of a simple explanation – can be Linux containers or Windows Server containers. Notice that I’m not using the word “Docker” here, because Docker is really a container management solution, and it can manage both Linux containers and Windows containers. A container is not a virtual machine, in any way, shape, or form. There’s no emulated or synthesized hardware. Instead, a container is just a normal application running as a normal process. In the operating system’s process list, that application is essentially marked “this is a container.”

That special “marker” causes some bits of the operating system to behave a little differently. For example, when the application asks the OS to write to a file on disk, or (on Windows) to a registry key, the OS “intercepts” that call and instead writes the data to an area that belongs just to that application. Any read request checks that private area first, so the application gets the data it expects; read requests that can’t be fulfilled by the private area fall through to the “main” file system (or registry, or whatever), so the application “believes” it is running all by itself on a full computer. The practical upshot is that the application can’t change anything in the “common” OS, although the application doesn’t realize that. Deleting the container removes everything the application has done.

Honestly, this technique – in the form of read/write filters – has been around forever. Virtuozzo has been doing it in hosted environments for years, Windows Embedded had similar filtering functionality, and even Microsoft App-V works on largely similar principles. The difference with containers is that the filtering happens at the kernel level of the OS, so it’s much more efficient and better managed.
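To make that read/write filter idea concrete, here’s a minimal sketch in plain shell. The directories and file names are made up for illustration – real container runtimes do this transparently with kernel-level filter drivers or union file systems, not scripts – but the check-the-private-layer-first logic is the same:

```shell
# Simulate a container's read/write filter with two directories:
# "base" stands in for the shared OS file system, and "layer" for
# the container's private write area.
base=$(mktemp -d)
layer=$(mktemp -d)

echo "shared config" > "$base/app.conf"

# When the containerized app "writes" app.conf, the filter redirects
# the write into the private layer; the base copy is never touched.
echo "modified config" > "$layer/app.conf"

# Reads check the private layer first, then fall back to the base.
read_file() {
  if [ -f "$layer/$1" ]; then
    cat "$layer/$1"
  else
    cat "$base/$1"
  fi
}

read_file app.conf      # the app sees its own modified copy
cat "$base/app.conf"    # the shared file is untouched: "shared config"

# Deleting the container is just deleting its private layer; every
# change the app made disappears, and reads fall back to the base.
rm -rf "$layer"
```

Kernel-level implementations (Windows filter drivers, or overlay file systems on Linux) perform the same redirection underneath the file APIs, which is why the application never notices it’s been fenced off.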
But it isn’t foolproof. Containers do not represent an impenetrable barrier between processes, meaning it’s possible for one application to access another’s data, hog processing resources, and so on. So in cases where you’re dealing with super-sensitive data, for example, containers might not be acceptable.
Thus, Microsoft’s “Hyper-V container,” which sits somewhere between a full VM and a no-VM container. Basically, a Hyper-V container is a virtual machine, just like the Hyper-V VMs you know and love. The difference is that, when you ask Windows to spin up one of these containers, it inserts a trimmed-down version of the Windows OS and kernel, and many of the API calls within the container are handled by the host OS instead. The result is a “lighter” VM that imposes a bit less overhead, especially given Hyper-V’s use of synthesized hardware. But data remains within the VM, preserving the rigid boundary around the VM that we’re used to. One VM cannot access the contents of another (except via well-defined channels like file sharing or other port-based communications), so you get a stronger, more manageable security barrier. You also get faster spin-up time – not as fast as a “normal” container, but faster than a “normal” VM.
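If you’re managing this through Docker on a Windows container host, the choice between a Windows Server container and a Hyper-V container is just a run-time flag – `--isolation` is a real Docker option on Windows, though the exact image tag below is illustrative and these commands only work on a Windows host with the containers feature enabled:

```shell
# Run as a Windows Server container: a normal, filtered process
# sharing the host's kernel.
docker run --isolation=process mcr.microsoft.com/windows/nanoserver cmd /c echo hello

# Run the same image as a Hyper-V container: a lightweight utility VM
# with its own trimmed-down copy of the Windows kernel.
docker run --isolation=hyperv mcr.microsoft.com/windows/nanoserver cmd /c echo hello
```

Same image, same management tooling; only the isolation boundary changes.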
It’s worth understanding all of these different execution models. None of them are always right or wrong; they’re all tools in an increasingly varied tool belt, allowing us to right-size our execution environment to the task at hand.