LXC howto for pets
2025-07-27
Pet used Debian-based systems while writing this memo, and it has the strong opinion
that apt's option --no-install-recommends is extremely important.
Without it you easily get lots of crap installed both on the host system and in containers.
Best to turn it on by default by creating /etc/apt/apt.conf.d/01-norecommends with the following content:
APT::Install-Recommends "0";
APT::Install-Suggests "0";
Links that helped pet
- https://wiki.debian.org/LXC
- https://stgraber.org/2013/12/20/lxc-1-0-blog-post-series/
- https://blog.benoitblanchon.fr/lxc-unprivileged-container/
- https://github.com/lucapiccio/LXC_to_Unprivileged/blob/main/convert.sh
"Worse-Better" knobs on the host system
- kernel.dmesg_restrict = 1 in /etc/sysctl.conf
  This makes dmesg output inaccessible from unprivileged containers.
- kernel.unprivileged_bpf_disabled = 1
  Pet does not know exactly why, but thinks it's worth applying.
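To apply both knobs without hand-editing, something like the following could be used. It is only a sketch: the helper name set_sysctl_line is made up for this memo, it assumes GNU sed and grep (as on Debian), and it edits whatever file it is given.

```shell
# set_sysctl_line FILE KEY VALUE  (hypothetical helper, not part of any package)
# Replaces an existing "KEY = ..." line in FILE, or appends a new one.
set_sysctl_line() {
    file=$1 key=$2 value=$3
    touch "$file"
    # Escape dots so that the key is matched literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$file"; then
        sed -i -E "s|^[[:space:]]*${pattern}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >>"$file"
    fi
}

# Usage on the host, as root:
#   set_sysctl_line /etc/sysctl.conf kernel.dmesg_restrict 1
#   set_sysctl_line /etc/sysctl.conf kernel.unprivileged_bpf_disabled 1
#   sysctl -p
```

Running it twice with the same arguments leaves a single line per key, so it should be safe to repeat.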
Installing LXC
Pet uses a minimalistic approach:
apt install lxc lxcfs lxc-templates
Debian versions prior to trixie and Devuan versions prior to excalibur also require cgroupfs-mount.
Pet prefers to create containers manually with debootstrap.
Here's what is needed for that:
apt install debootstrap distro-info debian-keyring debian-archive-keyring
For networking that uses /etc/network/interfaces, the obsolete bridge-utils might be required:
apt install bridge-utils
Other packages pet has seen in recommendations:
- libvirt0: might be needed to run alien containers (e.g. arm on x86),
  but qemu-user-static plus binfmt-support are also required. Pet will revise this.
- libpam-cgfs: systemd crap won't work without this
- uidmap: pet has no idea why they recommend it
Networking
Pet prefers to configure networking with its own paws and does not use lxc-net.
On systems with systemd this can be turned off with:
systemctl stop lxc-net
systemctl disable lxc-net
On systems with sysvinit:
/etc/init.d/lxc-net stop
update-rc.d lxc-net remove
Pet uses two approaches for networking: bridged on its own systems and NATed on she-master's systems.
With the bridged approach all containers have direct access to the network (layer 2 in the OSI model, as far as pet can remember). But the host system should be prepared for that: its primary network adapter should be a bridge with the physical Ethernet interface as a part of it.
The NATed approach does not require such major changes on the host system,
so pet can use she-master's system and she does not notice anything.
Only a couple of changes are required: one in /etc/nftables.conf that turns NAT on:
table ip nat {
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        oif eth0 masquerade random,persistent
    }
}
and another is in /etc/sysctl.conf that enables routing:
net.ipv4.ip_forward=1
No reboot is necessary, just
sysctl -p
nft -f /etc/nftables.conf
With bridged approach pet configures networking in /etc/network/interfaces:
iface eth0 inet manual
auto br0
iface br0 inet static
    bridge_ports eth0
    address 192.168.0.2
    netmask 255.255.255.0
    gateway 192.168.0.1
Note that pet uses br0 instead of lxcbr0 which is configured in /etc/lxc/default.conf.
As long as pet creates containers manually that does not matter and no changes are required.
For sophisticated configurations see how to share container namespace with the host system: https://serverfault.com/questions/765789/where-are-network-namespaces-in-lxc-lxd
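After either setup, a quick sanity check on the host might look like the sketch below. The helper name forwarding_enabled is made up for this memo; the optional file argument exists only so that the function can be tried against a scratch file.

```shell
# forwarding_enabled [FILE]  (hypothetical helper)
# Succeeds if FILE (default: the kernel's IPv4 forwarding switch) contains 1.
forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}

# Usage on the host:
#   forwarding_enabled && echo "routing is on" || echo "routing is off"
#   nft list table ip nat          # NATed setup: should show the postrouting chain
#   ip -br link show type bridge   # bridged setup: br0 should be listed
```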
Subordinate uid/gid maps
To run unprivileged containers, UID and GID maps should be configured on the host system.
Pet simply adds as many as necessary to both /etc/subuid and /etc/subgid:
root:100000:65536
root:200000:65536
root:300000:65536
root:400000:65536
root:500000:65536
...
Creating LXC container
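Before creating a container, it may be worth checking that the range intended for its lxc.idmap lines is really present in both /etc/subuid and /etc/subgid. A minimal sketch, where has_subid_range is a made-up helper that only looks for an exact line:

```shell
# has_subid_range FILE USER START COUNT  (hypothetical helper)
# Succeeds if FILE contains the exact line USER:START:COUNT.
has_subid_range() {
    grep -qxF "$2:$3:$4" "$1"
}

# Usage on the host, for the first range from the list above:
#   for f in /etc/subuid /etc/subgid; do
#       has_subid_range "$f" root 100000 65536 || echo "range missing in $f"
#   done
```

It does not detect a wider range that merely covers the requested one; for pet's fixed 65536-sized slots an exact match seems enough.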
debootstrap
Pet's way:
mkdir -p /var/lib/lxc/mycontainer/rootfs
debootstrap --variant=minbase \
    --include=dialog,libc-l10n,locales,nano \
    --exclude=vim-common,vim-tiny \
    excalibur \
    /var/lib/lxc/mycontainer/rootfs \
    http://deb.devuan.org/merged
Pet prefers nano because pet is too stupid: each time it accidentally steps into vim,
it has to reboot the system or ask AI how to exit.
If the base system is not Devuan, debootstrap most likely fails. Need to get Devuan's version:
git clone --depth 1 https://git.devuan.org/devuan/debootstrap.git
export DEBOOTSTRAP_DIR=`realpath debootstrap`
And need to get keys:
mkdir -p ~/.gnupg
gpg --no-default-keyring --keyserver keyring.devuan.org --keyring ~/devuan-keyring.gpg --recv-keys B3982868D104092C
B3982868D104092C is for excalibur as of time of writing. Check this page for the right one: https://www.devuan.org/os/keyring
Add --keyring ~/devuan-keyring.gpg to the above debootstrap command.
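Put together, the call with the keyring could be wrapped as in the sketch below. The function name bootstrap_devuan is made up for this memo; the options are the ones already used above, plus --keyring.

```shell
# bootstrap_devuan ROOTFS KEYRING  (hypothetical wrapper around the command above)
# Creates ROOTFS if needed and runs debootstrap for excalibur with the given keyring.
bootstrap_devuan() {
    mkdir -p "$1" || return 1
    debootstrap --variant=minbase \
        --keyring "$2" \
        --include=dialog,libc-l10n,locales,nano \
        --exclude=vim-common,vim-tiny \
        excalibur \
        "$1" \
        http://deb.devuan.org/merged
}

# Usage on a non-Devuan host, as root, with DEBOOTSTRAP_DIR exported as above:
#   bootstrap_devuan /var/lib/lxc/mycontainer/rootfs ~/devuan-keyring.gpg
```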
debootstrap for different architecture
The host system needs the following packages installed:
qemu-user-static
binfmt-support
Run first stage for arm64, for example:
mkdir /root/apt-cache-arm64
debootstrap --variant=minbase \
    --foreign --arch=arm64 \
    --cache-dir=/root/apt-cache-arm64 \
    --include=dialog,libc-l10n,locales,nano \
    --exclude=vim-common,vim-tiny \
    excalibur \
    /var/lib/lxc/mycontainer/rootfs \
    http://deb.devuan.org/merged
Sometimes gpg check fails for some reason, and pet has to use --no-check-gpg option.
Chroot and run the second stage. Debootstrap may look for itself in /root/debootstrap, so make a copy first.
chroot /var/lib/lxc/mycontainer/rootfs
cp -a debootstrap root/
/debootstrap/debootstrap --second-stage
pre-chroot tweaks
Set hostname just in case:
echo mycontainer >/var/lib/lxc/mycontainer/rootfs/etc/hostname
By default hostname is taken from the host system by debootstrap and this is confusing.
This file is not used by minimal setup because host name is set by LXC, see lxc.uts.name below.
Now copy your favorite .bashrc:
cp ~/.bashrc /var/lib/lxc/mycontainer/rootfs/root/
chroot to the container
chroot /var/lib/lxc/mycontainer/rootfs
configure locales
dpkg-reconfigure locales
apt tweaks
echo 'APT::Install-Recommends "0";' >/etc/apt/apt.conf.d/01-norecommends
echo 'APT::Install-Suggests "0";' >>/etc/apt/apt.conf.d/01-norecommends
echo 'DSELECT::Clean "always";' >/etc/apt/apt.conf.d/90-autoclean
echo 'APT::Keep-Downloaded-Packages "false";' >>/etc/apt/apt.conf.d/90-autoclean
echo 'Binary::apt::APT::keep-downloaded-packages "false";' >>/etc/apt/apt.conf.d/90-autoclean
installing packages
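Before installing anything, it may be worth confirming that apt actually sees the tweaks. The sketch below parses the output of apt-config dump; apt_conf_value is a made-up helper and it assumes simple one-word values like the ones set in this memo.

```shell
# apt_conf_value KEY  (hypothetical helper)
# Reads `apt-config dump` output on stdin and prints the value of KEY
# with the surrounding quotes and the trailing semicolon removed.
apt_conf_value() {
    awk -v key="$1" '$1 == key { v = $2; gsub(/[";]/, "", v); print v }'
}

# Usage inside the chroot:
#   apt-config dump | apt_conf_value APT::Install-Recommends
#   apt-config dump | apt_conf_value APT::Keep-Downloaded-Packages
```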
Pet's preferred set of packages for the minimal system:
apt install \
    apt-utils \
    bash-completion \
    bsdextrautils \
    ca-certificates \
    file \
    findutils \
    iputils-ping \
    iputils-tracepath \
    iproute2 \
    less \
    lsb-release \
    lsof \
    netbase \
    netcat-openbsd \
    procps \
    psutils \
    psmisc \
    runit \
    runit-init \
    tree \
    tzdata \
    xz-utils
If signatures are broken and apt install fails saying
Splitting up /var/lib/apt/lists/deb.devuan.org_merged_dists_excalibur_InRelease into data and signature failed,
run the following commands:
apt clean
rm -r /var/lib/apt/lists/*
apt update
and try again.
configuring init system
Runit is pet's choice for containers. It's not perfect: the Debian package is buggy and the codebase is spooky, but other init systems are no better.
Pet uses runit in the native boot mode:
touch /etc/runit/native.boot.run
touch /etc/runit/no.emulate.sysv
mkdir /etc/runit/boot-run
mkdir /etc/runit/shutdown-run
rm -rf /etc/sv/getty* /etc/service/getty*
Minimal initialization needs two scripts only. First, /etc/runit/boot-run/10-sysctl.sh:
/sbin/sysctl --system
Second, /etc/runit/boot-run/20-mountall.sh:
# Based on /etc/init.d/mountall.sh and /etc/init.d/mountdevsubfs.sh
do_mount_all()
{
    . /lib/init/vars.sh
    . /lib/init/tmpfs.sh
    . /lib/init/mount-functions.sh
    TTYGRP=5
    TTYMODE=620
    [ -f /etc/default/devpts ] && . /etc/default/devpts
    MNTMODE=mount_noupdate
    mount -a # mount everything from /etc/fstab
    mount_run $MNTMODE
    mount_lock $MNTMODE
    mount_shm $MNTMODE
    if [ ! -d /dev/pts ] ; then
        mkdir --mode=755 /dev/pts
        [ -x /sbin/restorecon ] && /sbin/restorecon /dev/pts
    fi
    domount "$MNTMODE" devpts "" /dev/pts devpts "-onoexec,nosuid,gid=$TTYGRP,mode=$TTYMODE"
}
do_mount_all
create container configuration
Now it's okay to exit chrooted environment and create container configuration file /var/lib/lxc/mycontainer/config:
lxc.apparmor.profile = unconfined
lxc.include = /usr/share/lxc/config/devuan.common.conf
lxc.include = /usr/share/lxc/config/devuan.userns.conf
lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536
lxc.rootfs.path = dir:/var/lib/lxc/mycontainer/rootfs
lxc.rootfs.options = idmap=container,nodiratime,relatime
lxc.uts.name = mycontainer
lxc.net.0.type = veth
lxc.net.0.name = eth0
lxc.net.0.link = br0
lxc.net.0.flags = up
lxc.net.0.ipv4.address = 192.168.0.3/24
lxc.net.0.ipv4.gateway = 192.168.0.1
lxc.start.auto = 0
The above configuration is for bridged approach. Here's how it would look for NATed approach:
lxc.hook.version = 1
lxc.net.0.type = veth
lxc.net.0.veth.mode = router
lxc.net.0.ipv4.address = 192.168.10.2/24
lxc.net.0.ipv4.gateway = 192.168.10.1
lxc.net.0.flags = up
lxc.net.0.script.up = /bin/sh -c "ip address add 192.168.10.1/24 dev $LXC_NET_PEER"
Block devices and file systems
To use a block device in an unprivileged container, change the group of the block device
to the container's GID, e.g. 100000.
The owner may remain root.
To automate this, create /etc/udev/rules.d/90-sda-permissions.rules
with the following line (assuming the device is sda):
KERNEL=="sda", ACTION=="add", GROUP="100000"
Next, allow using the block device in the container.
Add the following lines to config file:
lxc.cgroup.devices.allow = b 8:0 rwm
lxc.mount.entry = /dev/sda dev/sda none bind,create=file
An open question is how to do this by UUID:
sda and 8:0 may change, but the UUID is stable.
So, a block device can be read and written, but filesystems cannot be mounted from an unprivileged container.
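On the open UUID question, one possible direction is to resolve the UUID to a device node on the host and read the device numbers from it, for example in a script that regenerates the two config lines before the container starts. This is an untested sketch: dev_numbers is a made-up helper, it relies on GNU stat, and UUID in the usage lines is a placeholder. udev also sets ENV{ID_FS_UUID} for devices that carry a filesystem, which might let the rule above match by UUID instead of KERNEL=="sda".

```shell
# dev_numbers PATH  (hypothetical helper)
# Prints "TYPE MAJOR:MINOR" for a device node, in the format that
# lxc.cgroup.devices.allow expects before the access flags.
dev_numbers() {
    if [ -b "$1" ]; then kind=b
    elif [ -c "$1" ]; then kind=c
    else return 1
    fi
    # GNU stat prints major (%t) and minor (%T) in hex; -L follows symlinks.
    set -- $(stat -L -c '%t %T' "$1") "$kind"
    printf '%s %d:%d\n' "$3" "0x$1" "0x$2"
}

# Usage on the host (UUID is a placeholder for the real value):
#   dev=$(blkid -U "$UUID")    # resolves the UUID to a device node
#   dev_numbers "$dev"         # numbers for lxc.cgroup.devices.allow
```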
VPN
Pet has not tried to run servers so far. All notes are for clients only.
Wireguard
As a client it works without any tweaks in unprivileged containers.
OpenVPN
It works in unprivileged containers with the following tweaks in config file:
lxc.cgroup.devices.allow = c 10:200 rwm
lxc.mount.entry = /dev/net dev/net none bind,create=dir 0 0
lxc.mount.entry = /dev/net/tun dev/net/tun none bind,create=file
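To see whether the tweaks took effect, the TUN device can be checked from inside the running container. A small sketch, assuming GNU stat; is_tun_device is a made-up helper that compares the device numbers with 10:200 from the config above:

```shell
# is_tun_device [PATH]  (hypothetical helper)
# Succeeds if PATH (default /dev/net/tun) is a character device 10:200.
is_tun_device() {
    dev=${1:-/dev/net/tun}
    # GNU stat prints the numbers in hex: 10 is "a", 200 is "c8".
    [ -c "$dev" ] && [ "$(stat -L -c '%t:%T' "$dev")" = "a:c8" ]
}

# Usage inside the container:
#   is_tun_device && echo "tun is available" || echo "tun is missing"
```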