Linux Container in Fatdog

Introduction

As noted in this and this post, I have been experimenting with Linux Containers (LXC or lxc for short) lately.

What is LXC? In short, it is a collection of technologies that allows processes to be compartmentalised. A process in one compartment (or "container", in LXC-speak) cannot interact with processes in other compartments. It's the equivalent (in terms of concept) of FreeBSD jail in Linux.

What is it good for? It's good to secure 'server' processes, as when these processes are broken, an attacker can only affect things inside the container (presumably, each server process is allocated one container to limit this kind of damage). This has been the basis for the so-called Operating System Virtualisation.

In this respect, LXC is late to the game. FreeBSD has its jail for years. Solaris as 'Solaris Container', and so is many other operating systems (except Windows, probably ). In Linux alone, it has been preceeded by Linux VServer and Virtuozzo/OpenVZ by years. They also have more features that LXC currently doesn't have. LXC however, has the benefit of being available from the vanilla kernel, not patches needed - just compile-time configuration.

More information about LXC: here (LXC manpage), here (IBM developerworks), here (Ubuntu community wiki) and many others you can find using your favorite search engine.


LXC support in Fatdog
In next version of Fatdog, the kernel will support LXC.

Many years in the making, the final component of LXC ("user namespaces") was merged with the kernel in Linux 3.8. Unfortunately, this component requires extensive changes to other kernel components that when many of them were not ready for this change; thus making 3.8 kernel less than ideal for running LXC. The situation is better in Linux 3.9 - only the XFS filesystem still doesn't work with user namespaces. I was hoping that 3.10 would fix this but as of today, at 3.10-rc4, I see don't see it happening, so if we ever need to ship Fatdog with one of these kernel, we will have to choose either to enable XFS filesystem support, or to enable user namespaces. I hope we don't have to choose by the time we release. LXC can still run without user namespaces feature, although it will be less secure.

In addition to the kernel, one needs to install LXC userspace utilities from the Package Manager (this model is similar to User Mode Linux support in Fatdog - the scripts are there, but you need to install the userspace utilities and the UML kernel first).

Now, when creating the Fatdog convenience interface to LXC, I have two choices:
a) follow the sandbox model (zero configuration)
b) follow the UML model (persistent configuration)

I have decided to follow the sandbox model (zero configuration). The scripts are called sandbox-lxc (and rw-sandbox-lxc for the one with persistent storage). The reasons I go with this model is:
a) it is slightly easier to use
b) if one needs to run an LXC container with persistent configuration, the LXC userspace tools themselves already supports this model (see lxc-create and format of lxc-template config-file, so there is no need to duplicate this excellent functionality.
c) lastly, with the sandbox model you can easily copy/transfer/backup the container root filesystem from the "host" (e.g. a tool like sb2dir.sh will work; if you need to copy files to the sandbox just copy it over to the sandbox's fakeroot).

How to use these scripts? They are identical with sandbox scripts, so instead of sandbox.sh you run sandbox-lxc.sh. Instead of rw-sandbox.sh you run rw-sandbox-lxc.sh.

When run, in addition to asking about the layers you want to use, it will also ask whether you want to be able to launch a desktop. If you said yes, the host's /tmp will be mapped inside the container - which means the processes inside the sandbox can change stuff in the host system and connect to processes in the host - but it is a requirement if you want to run Xnest from inside the container.

These scripts by default will run without user namespaces. You can tell them to run with user namespaces by putting IDMAP=yes to the environment before running them, like this: IDMAP=yes sandbox-lxc.sh. Support for kernel user namespaces is auto-detected, if the kernel doesn't support it IDMAP will have no effect.

When run without user namespaces, these capabilities are dropped: sys_module (load kernel module), sys_time (change system time), and syslog (change kernel logging parameters).

When run with user namespaces, all the UID are shifted by 10000 (ie, the container's root is actually a user with UID 10000 in the system).

In either case, the container cannot create block device nodes (only character devices) and will have its own hostname and network interface which you need to configure manually if you want to connect to the network from within the container.

These scripts, the LXC-enabled kernel will be in the next release of Fatdog (the LXC-userspace tools are already in Package Manager but they are useless without LXC-enabled kernel). Beware that they are still considered experimental, as is LXC itself.



Posted on 8 Jun 2013, 3:04 - Categories: Fatdog64 Linux
Edit - Delete


No comments posted yet.

Add Comment

Title
Author
 
Content
Show Smilies
Security Code 1492205
Mascot of Fatdog64
Password (to protect your identity)