really tenacious guy

Virtualization on Linux - Containers

Nearly two years ago I wrote down my preferences among virtualization technologies. Now there is a shiny new server standing nearby, and it looks like I need to revisit my list.

User Mode Linux (UML)

It turned out that the speed of processes running inside UML is awful. You may not notice it during regular usage, but when you need to do some I/O, UML performs far worse than fully virtualized KVM. I decided not to use UML on my server.

Kernel Virtual Machine (KVM)

I use it all the time now to test various Ubuntu releases and to run foreign OSes. Everything I need works, and it works pretty fast. If you notice that your mouse has started skipping in a Linux guest, see bug LP:553081 for an explanation. KVM works great under libvirt, and virt-manager is really helpful for managing the instances.

Linux Containers (LXC)

This is something really new for me. Imagine a chroot() that cannot be escaped (a plain chroot() can be, see this link), whose resources can be restricted, and which has its own network namespace. This is pretty much how it looks to me.
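
A container configuration along these lines can make the picture concrete. This is only a sketch, assuming the key names used by LXC releases of the 0.6.x era (later releases renamed several of them); the container name, the rootfs path and the bridge br0 are hypothetical placeholders, not taken from any real setup.

```
# hypothetical container name, visible as the hostname inside the container
lxc.utsname = storage
# the container's root is just a subtree of the host filesystem (placeholder path)
lxc.rootfs = /var/lib/lxc/storage/rootfs
# separate network namespace: a veth pair attached to a host bridge
lxc.network.type = veth
lxc.network.link = br0
lxc.network.flags = up
```

The veth setup assumes that a bridge named br0 already exists on the host.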

You can control resource allocation for the container, but I haven’t done anything in this direction yet. The advantage is tremendous: you don’t need to emulate anything, the processes run at native speed, everything is just wonderful…
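
For reference, resource limits are written as lxc.cgroup.* lines that map onto cgroup control files. A minimal sketch, assuming the corresponding cgroup controllers are enabled in the host kernel; the values are arbitrary examples, not tested recommendations:

```
# cap the container's memory at 512 MB (memory controller)
lxc.cgroup.memory.limit_in_bytes = 512M
# relative CPU weight; the cgroup default is 1024
lxc.cgroup.cpu.shares = 512
# pin the container to the first two CPUs
lxc.cgroup.cpuset.cpus = 0,1
```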

… Except that LXC does not support udev at the moment. Without udev, you need to tweak your containers so that they do not depend on it. See the awesome post by bodhi.zazen for step-by-step instructions.

Note that you might want to create dpkg Post-Invoke rules that "patch" the installation after each mountall upgrade, to prevent the container from creating a tmpfs-based /dev.

Bugs

Even if you follow the instructions precisely, you may get the following message:

lxc-start: Device or resource busy - could not unmount old rootfs

This was already discussed on the [lxc-users] mailing list. It turns out that 0.6.5 does not work properly if you have /var on a separate partition. But fear not: I have filed bug LP:566827 and packaged a patched version in my PPA, ppa:rye/ppa.

LXC needs more work to support container reboot properly. After all, you do not “reboot” a container: you just terminate all of its processes, including the container’s init, and then start everything again.

Be aware that with 0.6.5 a container can remount the partition it lives on read-only.

I also highly recommend including the following lines in your container configuration:

# no insmod/rmmod
lxc.cap.drop = sys_module
# no time adjusting
lxc.cap.drop = sys_time

See this mailing list message for an explanation of the capabilities(7) that a container can be denied.

It may be tempting to run your LXC containers under libvirt, but I was not very comfortable with that. It turns out that you cannot configure device access properly with libvirt, so your container will basically be able to do whatever it wants to the host devices. I will be glad to migrate to libvirt once I am able to attach to the physical console and deny capabilities using libvirt configs. For now I am staying with the simple lxc-start way of management.
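
With plain lxc-start, device access is restricted through the devices cgroup in the container configuration. A minimal sketch, assuming the devices controller is available on the host; the whitelist below is an illustrative subset, and a real container will likely need more entries:

```
# deny access to all devices by default
lxc.cgroup.devices.deny = a
# /dev/null and /dev/zero
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
# /dev/console and /dev/tty
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
# /dev/random and /dev/urandom
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
# pseudo-terminals: /dev/ptmx and /dev/pts/*
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:* rwm
```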

LXC can be seen as a replacement for the OpenVZ patches. LXC support is already included in recent kernels, so you might want to take a look at it if you are running a server farm with OpenVZ.

I am using LXC to run my storage server (remember, its filesystem is just a subtree of a regular one), so I am following the lxc-users threads closely in order not to miss the announcement that we can run unmodified Ubuntu in LXC :).
