Linux Containers (lxc) aims to use these new functionalities to provide a userspace container object that offers full resource isolation and resource control for an application or a system.
The first objective of this project is to make life easier for the kernel developers involved in the containers project, and especially to let them continue working on the new Checkpoint/Restart features. lxc is small enough to manage a container easily with simple command lines, and complete enough to be used for other purposes.
The following kernel features must be enabled to have a fully featured container:
* General setup
  * Control Group support
    -> Namespace cgroup subsystem
    -> Freezer cgroup subsystem
    -> Cpuset support
    -> Simple CPU accounting cgroup subsystem
    -> Resource counters
      -> Memory resource controllers for Control Groups
  * Group CPU scheduler
    -> Basis for grouping tasks (Control Groups)
  * Namespaces support
    -> UTS namespace
    -> IPC namespace
    -> User namespace
    -> Pid namespace
    -> Network namespace
* Device Drivers
  * Character devices
    -> Support multiple instances of devpts
  * Network device support
    -> MAC-VLAN support
    -> Virtual ethernet pair device
* Networking
  * Networking options
    -> 802.1d Ethernet Bridging
* Security options
  -> File POSIX Capabilities
Kernels >= 2.6.32 shipped with the distros will work with lxc; such a kernel offers fewer functionalities, but enough to be interesting. The helper script lxc-checkconfig will give you information about your kernel configuration.
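The version requirement above can be checked from a script before trying to create a container. The following is only a minimal sketch: version_ge is a hypothetical helper name, not part of lxc, and it compares only the first three numeric fields of the release string. It does not replace lxc-checkconfig, which inspects the actual kernel configuration.

```shell
# version_ge A B: succeed when dotted version A is >= version B.
# Hypothetical helper, not an lxc command. Anything after the first
# "-" is ignored, so "5.15.0-91-generic" is compared as 5.15.0.
version_ge() (
    a=${1%%-*}
    b=${2%%-*}
    for n in 1 2 3; do
        # take the leading field and keep only its leading digits
        x=${a%%.*}; x=${x%%[!0-9]*}; x=${x:-0}
        y=${b%%.*}; y=${y%%[!0-9]*}; y=${y:-0}
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
        # drop the field just compared
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
)

if version_ge "$(uname -r)" 2.6.32; then
    echo "kernel is recent enough for lxc"
else
    echo "kernel is older than 2.6.32"
fi
```

A passing check only says that the version is high enough; the individual features listed above may still be disabled in a given kernel build.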
The application or system is launched inside a container specified by a configuration that is either created initially or passed as a parameter to the start commands.
How to run an application in a container?
Before running an application, you should know which resources you want to isolate. The default configuration isolates the pids, the sysv ipc and the mount points. If you want to run a simple shell inside a container, a basic configuration is needed, especially if you want to share the rootfs. If you want to run an application like sshd, you should provide a new network stack and a new hostname. If you want to avoid conflicts with some files, e.g. /var/run/httpd.pid, you should remount /var/run with an empty directory. If you want to avoid conflicts in all cases, you can specify a rootfs for the container. The rootfs can be a directory tree, previously bind mounted with the initial rootfs, so you can still use your distro but with your own /etc and /home.
Here is an example of a directory tree for sshd:
[root@lxc sshd]$ tree -d rootfs
rootfs
|-- bin
|-- dev
|   |-- pts
|   `-- shm
|       `-- network
|-- etc
|   `-- ssh
|-- lib
|-- proc
|-- root
|-- sbin
|-- sys
|-- usr
`-- var
    |-- empty
    |   `-- sshd
    |-- lib
    |   `-- empty
    |       `-- sshd
    `-- run
        `-- sshd
and the mount points file associated with it:
[root@lxc sshd]$ cat fstab
/lib /home/root/sshd/rootfs/lib none ro,bind 0 0
/bin /home/root/sshd/rootfs/bin none ro,bind 0 0
/usr /home/root/sshd/rootfs/usr none ro,bind 0 0
/sbin /home/root/sshd/rootfs/sbin none ro,bind 0 0
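The mount points file above follows a regular pattern: each host directory is bind mounted read-only at the same path under the container rootfs. As a minimal sketch, such a file could be generated rather than written by hand. The helper name bind_fstab is hypothetical and not part of lxc.

```shell
# bind_fstab ROOTFS DIR...: print one read-only bind-mount line per
# host directory DIR, mounted at the same path below ROOTFS.
# Hypothetical helper, not an lxc command.
bind_fstab() (
    rootfs=${1%/}   # drop a trailing slash, if any
    shift
    for dir in "$@"; do
        printf '%s %s%s none ro,bind 0 0\n' "$dir" "$rootfs" "$dir"
    done
)

# Reproduces the sshd example from the text on standard output.
bind_fstab /home/root/sshd/rootfs /lib /bin /usr /sbin
```

Redirecting the output to a file named fstab gives the same content as the example.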
How to run a system in a container?
Running a system inside a container is paradoxically easier than running an application. Why? Because you don't have to care about which resources to isolate: everything needs to be isolated. The other resources are specified as being isolated but without configuration, because the container will set them up; e.g. the IPv4 address will be set up by the system container's init scripts. Here is an example of the mount points file:
[root@lxc debian]$ cat fstab
/dev /home/root/debian/rootfs/dev none bind 0 0
/dev/pts /home/root/debian/rootfs/dev/pts none bind 0 0
More information can be added to the container to facilitate the configuration. For example, to make the host's resolv.conf file accessible from the container:
/etc/resolv.conf /home/root/debian/rootfs/etc/resolv.conf none bind 0 0
In case of failure when the container is initialized, it will pass through the aborting state.
   ---------
  | STOPPED |<---------------
   ---------                 |
       |                     |
     start                   |
       |                     |
       V                     |
   ----------                |
  | STARTING |--error-       |
   ----------         |      |
       |              |      |
       V              V      |
   ---------    ----------   |
  | RUNNING |  | ABORTING |  |
   ---------    ----------   |
       |              |      |
  no process          |      |
       |              |      |
       V              |      |
   ----------         |      |
  | STOPPING |<-------       |
   ----------                |
       |                     |
        ---------------------
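The state diagram can be read as a small transition table. The sketch below encodes it as a shell function for illustration only. The function name next_state and the event names "ok" and "no-process" are assumptions made here: the diagram itself labels only "start", "error" and "no process", and "ok" stands in for its unlabeled arrows.

```shell
# next_state STATE EVENT: print the state that follows STATE when EVENT
# occurs, according to the diagram. Hypothetical helper; "ok" is an
# assumed name for the transitions the diagram leaves unlabeled.
next_state() {
    case "$1:$2" in
        STOPPED:start)       echo STARTING ;;
        STARTING:ok)         echo RUNNING ;;
        STARTING:error)      echo ABORTING ;;
        RUNNING:no-process)  echo STOPPING ;;
        ABORTING:ok)         echo STOPPING ;;
        STOPPING:ok)         echo STOPPED ;;
        *) echo "no transition from $1 on '$2'" >&2; return 1 ;;
    esac
}

# Walk the failure path: the container passes through ABORTING.
state=STOPPED
for event in start error ok ok; do
    state=$(next_state "$state" "$event") || break
    echo "$state"
done
```

The walk prints STARTING, ABORTING, STOPPING and STOPPED, which is the failure path described in the text.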
lxc-create -n foo
lxc-destroy -n foo
Running an application inside a container is not exactly the same thing as running a system. For this reason, there are two different commands to run an application in a container:
lxc-execute -n foo [-f config] /bin/bash
lxc-start -n foo [-f config] [/bin/bash]
The lxc-execute command runs the specified command in the container via an intermediate process, lxc-init. After launching the specified command, lxc-init waits for it and for all other reparented processes to end (to support daemons in the container). In other words, in the container, lxc-init has pid 1 and the first process of the application has pid 2.
The lxc-start command runs the specified command directly in the container. The pid of the first process is 1. If no command is specified, lxc-start runs the command defined in lxc.init_cmd or, if that is not set, /sbin/init.
To summarize, lxc-execute is for running an application and lxc-start is better suited for running a system.
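The fallback order that lxc-start uses to pick its command (explicit command, then lxc.init_cmd, then /sbin/init) can be sketched in shell. This is an illustration of the rule stated above, not lxc's own code. The helper name resolve_init_cmd is hypothetical, and the parsing assumes a plain "key = value" configuration file in which the last lxc.init_cmd line wins.

```shell
# resolve_init_cmd CONFIG [COMMAND...]: print the command that would be
# run as pid 1. Hypothetical helper; the "key = value" parsing and the
# "last entry wins" rule are assumptions made for this sketch.
resolve_init_cmd() (
    config=$1
    shift
    # 1. a command given on the command line takes precedence
    if [ "$#" -gt 0 ]; then
        printf '%s\n' "$*"
        return 0
    fi
    # 2. otherwise look for lxc.init_cmd in the configuration file
    cmd=
    if [ -n "$config" ] && [ -r "$config" ]; then
        cmd=$(sed -n 's/^[[:space:]]*lxc\.init_cmd[[:space:]]*=[[:space:]]*//p' "$config" | tail -n 1)
    fi
    # 3. fall back to /sbin/init
    printf '%s\n' "${cmd:-/sbin/init}"
)

resolve_init_cmd "" /bin/bash   # explicit command given
resolve_init_cmd ""             # no command, no configuration file
```

Pass an empty string as CONFIG when there is no configuration file.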
If the application is no longer responding, is inaccessible or is not able to finish by itself, a wild lxc-stop command will kill all the processes in the container without pity.
lxc-stop -n foo
lxc-console -n foo -t 3
lxc-freeze -n foo
will put all the processes in an uninterruptible state and
lxc-unfreeze -n foo
will resume them.
lxc-ls
lxc-info -n foo
lxc-ls lists the containers of the system.
lxc-info gives information for a specific container.
Here is an example on how the combination of these commands allows one to list all the containers and retrieve their state.
for i in $(lxc-ls -1); do
    lxc-info -n "$i"
done
The lxc-monitor command monitors one or several containers. Its parameter accepts a regular expression, for example:
lxc-monitor -n "foo|bar"
will monitor the states of containers named 'foo' and 'bar', and:
lxc-monitor -n ".*"
will monitor all the containers.
For a container 'foo' starting, doing some work and exiting, the output will be in the form:
'foo' changed state to [STARTING]
'foo' changed state to [RUNNING]
'foo' changed state to [STOPPING]
'foo' changed state to [STOPPED]
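Because the monitor output has a fixed shape, a script can split each line into a container name and a state. The following is a minimal sketch that assumes the output looks exactly like the lines shown above. The helper name parse_monitor_line is hypothetical, and the sample lines are fed from a here-document rather than from a live lxc-monitor.

```shell
# parse_monitor_line LINE: print "NAME STATE" for a line of the form
#   'NAME' changed state to [STATE]
# and fail for any other line. Hypothetical helper, not an lxc command.
parse_monitor_line() (
    out=$(printf '%s\n' "$1" |
        sed -n "s/^'\(.*\)' changed state to \[\([A-Z]*\)]\$/\1 \2/p")
    [ -n "$out" ] && printf '%s\n' "$out"
)

# Sample input taken from the text; a real script would read the
# output of lxc-monitor instead.
while IFS= read -r line; do
    parse_monitor_line "$line"
done <<'EOF'
'foo' changed state to [STARTING]
'foo' changed state to [RUNNING]
EOF
```

Lines that do not match the expected form produce no output and a non-zero status, so a caller can skip them.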
The lxc-wait command waits for a specific state change and exits. This is useful in scripts to synchronize with the launch or the end of a container. The parameter is an ORed combination of different states. The following example shows how to wait for a container that went to the background.
# launch lxc-wait in background
lxc-wait -n foo -s STOPPED &
LXC_WAIT_PID=$!

# this command goes in background
lxc-execute -n foo mydaemon &

# block until the lxc-wait exits
# and lxc-wait exits when the container
# is STOPPED
wait $LXC_WAIT_PID
echo "'foo' is finished"
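The example above waits for a single state. To wait for several, the states are joined with "|" and the result is quoted so that the shell does not treat it as a pipe. The sketch below builds such an argument. The helper name wait_states is hypothetical, and it accepts only the five states that appear in this document's state diagram.

```shell
# wait_states STATE...: check each state name and join them with "|",
# giving a value suitable for the -s option of lxc-wait.
# Hypothetical helper; the accepted list is limited to the states
# shown in the state diagram of this document.
wait_states() (
    joined=
    for s in "$@"; do
        case $s in
            STOPPED|STARTING|RUNNING|STOPPING|ABORTING) ;;
            *) echo "unknown state: $s" >&2; return 1 ;;
        esac
        joined=${joined:+"$joined|"}$s
    done
    [ -n "$joined" ] && printf '%s\n' "$joined"
)

wait_states STOPPED ABORTING
```

A script could then call lxc-wait -n foo -s "$(wait_states STOPPED ABORTING)" to return when the container either stops or aborts.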
The lxc-cgroup command is used to set or get a control group subsystem associated with a container. The subsystem name is handled by the user: the command does no syntax checking on the subsystem name, and if the subsystem name does not exist, the command will fail.
lxc-cgroup -n foo cpuset.cpus
will display the content of this subsystem.
lxc-cgroup -n foo cpu.shares 512