LXC and LXD: a different container story (lwn.net)
206 points by zekrioca on Sept 23, 2022 | 99 comments



I ran LXD for about a year in my home lab. Unfortunately, the easiest way to get it running is by installing Snap; which I didn't do. Instead I ran it on Alpine Edge (one of the few distros that actually has it in their package manager). LXD kept breaking after system updates and I got tired of troubleshooting. I suppose that's just part of the perils of running bleeding edge.

When LXD was running, I found that it feels like something between a VM and OCI container. In the whole pets vs cattle analogy, LXD containers definitely feel more like pets.

On the host, concepts are similar to OCI containers (networking, macvlan, port mapping, etc). I like its approach to building up containers with several commands instead of one long command. You can add and remove ports, volumes, and even GPUs to running containers. It doesn't have compose files, I just used bash scripts to create reproducible containers.
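For a flavour of what that looks like, here's a minimal sketch (container, device and path names are made up):

    lxc launch ubuntu:22.04 web                  # create and start a container
    lxc config device add web http proxy \
        listen=tcp:0.0.0.0:8080 connect=tcp:127.0.0.1:80
    lxc config device add web data disk source=/srv/web path=/srv
    lxc config device add web gpu0 gpu           # pass a GPU through
    lxc config device remove web gpu0            # ...and take it away again

Each command runs against a live container, which is why a plain bash script works fine as a poor man's compose file.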

Inside the LXD container is where things get different. You're containerizing a whole OS, not an application. So you end up administering your container like a VM. You jump inside it, update its packages, install whatever application and dependencies you need. It doesn't work like a glorified package manager (which is how most homelabbers use Docker). As a long time VM user I actually prefer installing things myself, but I can see this turning away many people in the selfhosting community.

I liked LXD and would've kept using it if it weren't so intimately linked to Snap. Last month it failed to start my containers after a system update...again. So I moved that box over to Rocky with Podman containers running as systemd units. I do kinda miss having fast, throwaway OSes on that server for experiments (sometimes it's nice to have pets).
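For anyone curious, the Podman-as-systemd-units setup tends to look roughly like this (image and names are just examples):

    podman create --name whoami -p 8080:80 docker.io/traefik/whoami
    podman generate systemd --new --name whoami \
        > ~/.config/systemd/user/container-whoami.service
    systemctl --user daemon-reload
    systemctl --user enable --now container-whoami.service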


> Unfortunately, the easiest way to get it running is by installing Snap;

The Debian package for LXD just entered the archive about 2 weeks ago:

https://tracker.debian.org/news/1361535/accepted-lxd-500-1-s...

This means the next Debian release will make for stable LXD hosts without needing snap.


This is most excellent news!!!


Finally!


> Unfortunately, the easiest way to get it running is by installing Snap; which I didn't do. Instead I ran it on Alpine Edge (one of the few distros that actually has it in their package manager).

Enabling LXD is a one-liner on NixOS:

    virtualisation.lxd.enable = true;
LXC and LXD are a somewhat common approach for NixOS users who want to quickly try something out in the environment of a traditional distro. :)
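A slightly fuller configuration.nix sketch, if it helps (the user name is hypothetical):

    virtualisation.lxd.enable = true;
    users.users.alice.extraGroups = [ "lxd" ];   # let a normal user talk to the daemon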


I use LXD like this on NixOS. It's great for Steam, which works best on Arch Linux.


Someone packages it for openSUSE. It has been stable for me on Tumbleweed for a couple of years.


Thanks for the info. Never used openSUSE, but next time my curiosity swings back around to LXD I'll check it out!


On the off chance you remember this conversation: the package does not require attr (even though lxd does), so you need to "zypper in attr" to get lxd running if another dependency didn't pull it in.


I've been using it in my homelab for a while on Arch with ZFS. Never had any issues with upgrades.

However, I once wanted to rename my zpool. Boy was I up for a world of hurt. In the end, I just reinitialized everything (the VMs themselves were easy to get back up).


I really love LXD/LXC conceptually, but I always found it broke in mysterious ways. Is there a suggested base distro for imbeciles like me? I end up retreating back to Podman as well.


According to other comments, openSUSE and Arch have it stable now, and the next release of Debian will have it.


Hi, author of the article, pleased to see it here! I'd really like to hear more from folks about how they're using LXC and/or LXD, and what they think their greatest strengths are compared to Docker or Kubernetes.


We use LXC on physically distributed Proxmox nodes. It lets us easily launch new VMs and migrate around when we need to. Nothing we have is very complicated but it works well. We use SaltStack to spin up special debug containers with a bunch of integrated tools. To update, SaltStack just downloads the new container image and creates a new container in Proxmox.

I guess you could technically replace what we built with pure Docker but the Proxmox UI is a godsend for our on the ground support engineers who aren't the most technically savvy types.

We started going down the Kubernetes route initially, but it quickly turned into an absolute nightmare and we gave up, even though we're all pretty strong with Kubernetes. I guess when all you need is a few stateful services running, the "dynamic scaling" of Kubernetes is kind of pointless.


Even though I am sure I am perfectly capable of figuring out the command line incantations to make these containers work, I still really appreciate the convenience of the Proxmox environment where it’s something I don’t have to think about.


I use it in a vanilla KVM host to have lightweight VMs for Remote Desktop environments (https://taoofmac.com/space/blog/2022/04/02/2130, https://taoofmac.com/space/blog/2022/04/12/2330). That way I can have full GNOME and XFCE desktops with different sets of developer tools installed, sharing the same projects tree on a bind mount, and to which I can remote via Xrdp (with audio, device sharing, the works). Like the two KVM Windows VMs running alongside, each LXC can have its own IP, outbound VPN connection and even Docker running as well, which is great.

A single i7 with 32GB of RAM can have at least 3 people working on it without hassle, like an old time-sharing system. Myself, I stuck a Raspberry Pi to a monitor (https://taoofmac.com/space/blog/2022/08/14/2030) and use it to access both GNOME and Windows desktops remotely.

All of this works solely with Ubuntu system packages, so I have no third party software to install (yes, ok, snap is a pain, but the host only runs KVM and LXD, nothing else, and all distros have xorgxrdp these days).

It’s like a lightweight Proxmox.

In the past, I used LXC to host several of my services on a single beefy Linode box, and I would go that way again today - I already have a similar setup inside Parallels on my M1 Mac, with one LXC per project.


I use Kubernetes within an LXD container with Btrfs backing storage. This isn't anything special. But it has two advantages. The first is that you could try out multi-node K8s clusters on a single system. The second is that the containers can be deleted and rebuilt easily without affecting the host system. Both are very useful when you're learning multi-node K8s. I plan to expand my homelab to a true multi-node setup. However, I will probably retain K8s inside LXD. It is useful to run certain applications that demand full control of the system - like PiHole or MailInABox. There are probably better ways of hosting them. However, LXD gives me a lot of flexibility to experiment and make mistakes.


> The first is that you could try out multi-node K8s clusters on a single system.

Why can't you do this without BTRFS/LXD and just with Docker/OCI?


Yes, that's certainly possible - especially with tools like kind. However, the reason I prefer LXD over Docker/OCI is that the former behaves more like a full OS. LXD containers are system containers which are like VMs except for the lack of guest kernel. So it allows me to experiment with some K8s deployment configurations that are needed on a proper cluster.
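For anyone who wants to try this, the knobs usually involved are roughly these (container name made up; exact requirements vary by K8s distribution and LXD version):

    lxc launch ubuntu:22.04 k8s-node-1
    lxc config set k8s-node-1 security.nesting true        # containers inside the container
    lxc config set k8s-node-1 security.privileged true     # often needed for kubelet
    lxc config set k8s-node-1 linux.kernel_modules overlay,br_netfilter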


> LXD containers are system containers which are like VMs except for the lack of guest kernel.

How is this different than a Docker/OCI container/cgroup thingie?


Do you have instructions (e.g., tutorial or documentation) on what needs to be done to have this setup? There are many configuration details that would be needed to make it work properly. Thanks!


I'm using LXC just to have a painless way to have a second SSH server (for SFTP) running on the computer that has different usernames and passwords.

Literally all that is installed is Alpine Linux and the SSH server. There's a symbolic link to get to the shared files.


Similar use case for me. But instead of SSH/SFTP I was running Caddy with the file_server directive!

I've since switched to podman because my installation on Alpine Edge kept breaking on updates and I don't want to use Snap. Someone mentioned openSUSE has it in their package manager and it's been stable. Maybe I'll check that out in the future.


I use lxd for running core services at home, such as samba/sftp/webdav or home automation. I find it easier to attach a vpn to the container, which combined with a nat traversing sdn/mesh gives me easy, private, encrypted access to data or services. Also easy to pass a usb device (say zwave controller) to a container. I run it over a btrfs array and have a workflow to snapshot containers and send to a backup host. Using unique namespaces for each unprivileged container, I am reasonably confident that a security oops in one container should be isolated.
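For reference, the LXD side of that snapshot-and-ship workflow can be as simple as this (names are placeholders; "backup" would be a remote added with "lxc remote add"):

    lxc snapshot files nightly
    lxc copy files/nightly backup:files-restore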

I also use lxd on vps hosts, often nesting docker to give more granular control over networking.


LXC is how Chromebooks let you install Debian or Ubuntu or Arch in a container, with graphical applications running in the container integrated into the ChromeOS UI.


I basically use LXC for light-weight VMs (a name server here, a simple web server there, pi-hole on my home router). Where I care about security separation (my containers are privileged) I use real VMs via KVM.

Though I'm an old fashioned pets-not-cattle type. Docker and friends have the disadvantage of being invented (or becoming practical/popular) in/after my mid 30s!


i introduced proxmox and saltstack to a customer who was running all sorts of applications together on a few linux servers. splitting up the services over multiple containers made maintenance easier because upgrading software in one container would not affect apps running in other containers.

on my own servers i use plain LXC without proxmox, but created and managed through saltstack for similar reasons. and also to allow me to use different linux distributions for different use cases, and to separate various development environments.


The best futuristic feature by far I've heard of is live-migration of workloads between hosts (read: your compute and memory gets transferred to another VM and resumes without having to restart). This would be amazing for, e.g. auto-scaling workloads when they reach their memory limit. Or to run them on a Spot instance and then seamlessly migrate to another, when the current one is on the verge of preemption.

It's too bad no one appears to be working on that in OCI/Kubernetes world.


Checkpoint/restore is supported by runc and there were demos of live migrating Docker containers several years ago. A better question is why Kubernetes isn't working on integrating it (though I suspect there's a KEP for it somewhere).
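For reference, the experimental Docker front-end to runc/CRIU checkpointing looks roughly like this (it needs CRIU installed and the daemon's experimental mode enabled; names are made up):

    docker run -d --name counter busybox \
        sh -c 'i=0; while true; do i=$((i+1)); echo $i; sleep 1; done'
    docker checkpoint create counter cp1         # freeze process state to disk
    docker start --checkpoint cp1 counter        # resume where it left off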


Could you share a link please? I've only seen similar demos of a Doom game migration on LXD (I believe).

EDIT: very interesting, thanks TIL! Here's at least someone from RH playing with it in the context of K8s: https://youtu.be/DDJxqV98b4U?t=1932

EDIT2: and apparently, at least its minimal implementation has been merged: https://github.com/kubernetes/kubernetes/pull/104907


Adrian is one of the CRIU folks and has been working on integrating CRIU and containers for a while.


I have a proxmox system that I host lxcs on to serve as development environments for various web apps or network services that have different Linux based tech stacks. I also run some services inside lxcs on my Linux based router. Local network name services, an NFS volume, etc. The strength is the simplicity. I found it all easy to set up and very low maintenance.


I've been considering building a system to spin up containers to run repeatable statistical analyses. LXC was the direction I decided to go.


I fucking love LXC! It allowed me to “virtualize” several physical servers on a single newer server, keeping all of the original setup (except some networking changes). Combined with ipvlan, I could even work around the MAC address restriction of my server provider, with both IPv4 and IPv6.

I also use it to host a Jitsi instance. Jitsi is rather picky about the underlying distribution.

All this without the overhead of an actual virtual machine of course.

The only pain I have with LXC is its behavior when errors occur. It’s not easy to tell what’s wrong, the error messages are worthless. Sometimes the debug log can help, sometimes not so much.


You are using Hetzner, right?

I'm interested to see how you bypassed the MAC address problem. Any blogs or guides?


Yes, it’s Hetzner. The magic ingredient is ipvlan, a special interface type loosely related to macvlan. An ipvlan interface is tied to a physical interface. Whether traffic goes to the virtual interface is decided based on the IP address alone. The ipvlan virtual interface can be moved to a different network namespace and will still work.

Note that Hetzner added support for multiple MAC addresses in the meantime. So at least for IPv4, you don’t need this.

There’s good info in the Linux kernel docs: https://www.kernel.org/doc/html/latest/networking/ipvlan.htm...

I have a Debian host. In my interfaces file, I have the following code for the Jitsi ipvlan link:

    auto ipvl-meet
    iface ipvl-meet inet manual
       pre-up ip link add link eth0 name ipvl-meet type ipvlan mode l2
       post-down ip link delete ipvl-meet
Then, in my LXC guest config, I have this:

    lxc.net.0.type = phys
    lxc.net.0.link = ipvl-meet
    lxc.net.0.ipv4.address = 192.0.2.46/26
    lxc.net.0.ipv4.gateway = 192.0.2.1
    lxc.net.0.ipv6.address = 2001:0DB8::4/128
    lxc.net.0.ipv6.gateway = fe80::1
A word of advice though: LXC likes to eat network interfaces when it fails to start a container. If you experiment, keep in mind that you may have to recreate the ipvlan interface after errors.


> Note that Hetzner added support for multiple MAC addresses in the meantime. So at least for IPv4, you don’t need this.

Does this apply to the IP supplied with the server? Because I can't find any MAC address option for it.


No, the primary IP address cannot have its MAC address changed.


I'm not sure what the problem is, but if it is stopping you from bridging the physical interface effectively, could you not define a virtual bridge with local-only addresses and have the host NAT to/from that and the physical interface? It adds a little extra latency to be sure, more so if you have two or more boxes on the same network with containers that would otherwise talk directly to each other on a private vlan, but if your physical box is acting as a firewall/router for the tasks in the containers anyway it is already practically doing the job.


They also provide a virtual bridge for dedicated servers.

I could not get it to work reliably myself, so I added another bridge on my servers which link via WireGuard. It works like a LAN for both the physical servers and VMs.


What MAC address restriction is there? One sets them up in the Robot interface so things will actually route. You get the assigned MAC and use that in your virtual IP or bridge settings.

Are we talking about something else?


The Robot interface thing is actually relatively new (some years). Before they introduced that, all your IP addresses (and maybe subnets, too?) were delivered to your physical MAC address, period. Wanted to use a VM? Maybe use proxy_arp or some 1:1 NAT. Bridging? Not possible.


I checked my emails. My initial server with them was provisioned at the end of 2009 and it's accompanied by my Robot details. I did have IP aliasing on this bare metal server, which also needed the assigned MAC from Robot.


I have been running a handful of Debian systems in containers since OpenVZ was state-of-the art. As the hardware expired, I moved to linux-vserver, then LXC, then LXD ... at which point it all shifted from straightforward text configuration files to SQLite mediated by CLI programs. When I could not figure out the incantation required for some gnarly configuration rotations ("computer says no"), I was forced to shut everything down, alter the SQLite files directly, and spin it all back up again. On the next hardware refresh, I switched back to LXC.

This sort of mediation makes sense and works great for ephemeral containers in Kubernetes. For full virtual servers, LXD is the worst of both worlds.


I miss linux-vserver. Easily the most simple containerization I ever worked with.


I run everything in LXD containers on my home server(s), with Debian as the host (manually compiled, not the snap). Most of the setup is automated with Ansible that I hacked to read Python scripts instead of YAML.

LXD is a treat to work with, and I feel its container model is perfect for home/small business servers that need to run lots of third-party software, much of which would work poorly in a "proper" stateless container.


I use LXD + ansible for my personal projects. IMO the main benefit I see is it requires very low cognitive overhead. It's insanely easy to set up a container; I only need to remember like 3 commands ever. These programs will never be clustered either, so there's no point in having a scheduler.


What would those 3 commands be? As a developer who has not used LXD, I’m curious for insight into the developer experience.


lxc launch and lxc start/stop

For the first one you supply the image you want (like ubuntu:20.04); the rest are self-explanatory.
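Concretely, something like this (container name made up):

    lxc launch ubuntu:20.04 mybox    # create and start from an image
    lxc stop mybox
    lxc start mybox
    lxc exec mybox -- bash           # and a shell whenever you need one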


Lxd is fantastic but it became unusable for me once it started being distributed only as a snap.

I still use lxd on an Ubuntu 18.04 host that still has the deb install, but looking forward I'm trying to get things migrated to systemd-nspawn.


These seem like cool tech and I know things like proxmox can run them pretty easily.

Still, Dockerfiles and all the magic that goes with them were just so much easier to pick up and convince others on the team to try.


My last company had hundreds possibly thousands of LXC containers, and we orchestrated everything via saltstack (which is similar to ansible or puppet if you aren't familiar).

The justification was that we needed our SaaS to also work on-prem for financial companies and government entities, and thus we could not count on kubernetes or any specific cloud vendor to be available, so we rolled our own orchestration built on ubuntu hosts and (for some stupid reasons) centos LXC containers running on top of those.

LXC is a good system and all, but what we were doing with it was a total nightmare, we were basically trying to reinvent kubernetes using traditional configuration management and LXC with two flavors of linux and a million dependencies.


> we needed our SaaS to also work on-prem for financial companies and government entities, and thus we could not count on kubernetes or any specific cloud vendor

....? kubernetes is not a cloud vendor, it runs on any Linux distribution, and it's FOSS.... ?!


I mean that bringing kubernetes to an on-premises customer, just to run our software, was (probably correctly) deemed to be too much complexity. In the end it was complex enough as it was. 5 years ago kubernetes on-prem was not particularly easy, it probably still isn't, and doing it for a customer who just wants to run our software is a lot of support burden.


k3OS makes on-prem quite nice these days


And that's why the sentence contains an or - because they are not a cloud provider, but you can't guarantee a local corp is gonna run kube.


A previous employer had the same issue, and solved it by writing the application to kubernetes and deploying minikube to the clients that didn't have their own kubernetes cluster.

To be fair, the situation ended up being: a bunch of clients running minikube, one major client with their own Kubernetes cluster, and a lot of other clients using the multi-tenant, cloud-hosted offering. Made it a bit of a pain to support that one client with a cluster we could not control, but it was one of our biggest clients, so, them's the breaks.


I can't speak to supporting these, but there are a few vendors that offer k8s in a box for this exact use case. Replicated is the first that comes to my mind. I've used this one as a customer. It worked fine but felt it necessary to do a bit of poking under the hood to help it understand things like our AZs in a private cloud:

https://www.replicated.com/kubernetes/


Ya, I'm sure there are some good options, especially nowadays. But now think about getting signoff to use that software at VISA, US Bank, or the US military, for extremely sensitive info. Sometimes it's just the policies that limit what tools you can use.


I hear you. I was at one of those banks and everything was miserable. For reference, MuleSoft uses (or used to use) Replicated for one of their products that is popular with banks.


systemd-nspawn is worth a look - I found it more intuitive, well designed and security conscious (within reason for containers) than the other container options. It must get some serious use behind closed doors somewhere, because the quality outstrips the amount of publicity it gets
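For anyone curious, the usual getting-started flow is roughly this (names and suite are just examples):

    debootstrap stable /var/lib/machines/deb12 http://deb.debian.org/debian
    systemd-nspawn -D /var/lib/machines/deb12 passwd   # set a root password once
    machinectl start deb12                             # boots it via systemd-nspawn@.service
    machinectl shell deb12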


Do either of them support the concept of “layers” like Docker does?

I think that feature combined with overlay2 is quite useful, despite my many criticisms of Docker.

It is sort of a middle ground between Nix like granularity which requires rewriting upstream, and big LXC blobs created with shell scripts.

Although I also think we need some kind of middle ground between docker and nix :)


> Do either of them support the concept of “layers” like Docker does?

Never used LXD, but LXC does not have layers per se.

I usually run LXC containers on a btrfs filesystem, which easily supports snapshots and sending containers to other container hosts via "btrfs send | ssh otherhost btrfs receive".
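A sketch of that, assuming the default /var/lib/lxc layout and a container called "web" (btrfs send needs a read-only snapshot):

    btrfs subvolume snapshot -r /var/lib/lxc/web /var/lib/lxc/web@backup
    btrfs send /var/lib/lxc/web@backup | ssh otherhost btrfs receive /var/lib/lxc/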

If you are treating your servers like pets (and not cattle) LXC is a very convenient means to consolidate servers onto fewer hardware systems.


> Never used LXD, but LXC does not have layers per se.

It's kind of a similar situation in LXD, where that sort of functionality being available depends on the storage backend/driver in use, for example LVM thin provisioning, various options via ZFS, or btrfs like you mentioned.


Hm yeah, in some sense I like things to be more orthogonal, i.e. container images vs. the file system

On the other hand I think storage and networking should be more unified, and the differential compression and caching you get from layers is important

I suspect that doing the layers at the logical level (files), like overlayfs, rather than the physical level (blocks) is also better for containers, but I'd be interested to read any comparisons


It depends on the workload. Copy-up is more expensive than snapshots. If it's ext4+overlayfs, then copy-up is really expensive (full file copy), whereas if it's btrfs or XFS then it's reflink copies. Much more efficient. But btrfs snapshots in effect are instant reflink copies. But then overlayfs has the benefit of shared pages.

Also, at least Btrfs snapshots are more logical, like files; than physical, like blocks. Whereas LVM snapshots are separate (logical) block devices, Btrfs snapshots are file b-trees on the same file system.

Maybe the overlay2 graph driver (the driver used in Docker or Podman to interface with the kernel) is getting more usage and thus maturing more quickly, compared to the btrfs graph driver? I think upstream Docker/Moby are defaulting to the btrfs graph driver if /var/lib/docker is on Btrfs. Meanwhile on Fedora where Btrfs is the default for desktops, and Cloud edition, Podman defaults to overlay2 graph driver no matter the underlying file system.


From memory, the early versions of Docker were basically just combining LXC with an image implementation built on a union filesystem (AUFS at first, overlayfs later). They then switched to spinning up their own container processes directly.

Storage with LXC is just a block device, so you could roll your own using overlayfs. Probably a fair amount of work though. And in the other direction, I have used btrfs instead of overlayfs with Docker.


Both overlays and snapshots work fine.


I tried to invent docker and kubernetes with LXC back when docker was super alpha and kubernetes was barely a fetus. I failed, obviously, but learned a lot. It's kinda the only reason I managed to pick up kubernetes later on, honestly. Sometimes, not knowing what you're doing is an asset.


I use both lxc and docker and have use cases for both. I think it really comes down to how stateful something is or how lazy I feel about writing a Dockerfile. I had a really hard time with learning lxd and really only got into using lxc without the daemon.

One trend with docker that I personally don't like is that a lot of projects prefer docker-compose over regular Dockerfiles (though some of them support both), and this leads to a lot of bloat with how many containers it takes to run one application and duplication where each app will need its own database, redis, webserver, etc. containers.

That's not a problem when you are at the scale to need that kind of orchestration and have the resources to run that many containers, but personally I would rather have one database host that all my containers can talk to and one reverse proxy/webserver to make them accessible to make better use of resources and get better density on one host.

One downside with lxc is that there's been some fragmentation with the image creation side of it, I had been used to using lxc-templates for years, which are just shell scripts for bootstrapping and configuring your container and pretty common between distros. I found a bug with creating containers for the latest version of Alpine that I fixed, and only then did I find out that lxc-templates are deprecated and essentially unmaintained, and that now distrobuilder is the preferred way to build lxc containers, at least per Canonical. That seems to be another Canonical-ism, since the distrobuilder docs say that snap is the only way to install it unless you build from source, so that's a huge no from me, and I also didn't feel like learning the yaml config for it.

I was actually considering moving my docker daemon into an unprivileged lxc container like the article mentions, but haven't gotten around to it.


> " [..] which are just shell scripts for bootstrapping and configuring your container and pretty common between distros. "

I find it a big advantage that this builds the containers basically from scratch, so you only have to trust the distro and not any other parties. I always feel uneasy with images which are hard to inspect and whose provenance is opaque and potentially involves multiple parties.

That, combined with running them unprivileged, should make a fairly secure system. Unfortunately I had great difficulty creating an unprivileged Debian LXC container with this method. If I remember correctly you have to create the privileged container first and then use a tool to fix up uid and gid. If anyone knows an easier way to do it, I would be grateful to know.

EDIT: I think I used the following to create the container:

    lxc-create -n <name> -t debian -- -r stretch
It uses debootstrap for the build.


I don't recall having to do any uid/gid fixup last time I made an unprivileged container. I did have to prepare the unprivileged host account, of course, by setting its subordinate uids/gids (/etc/sub?id) and virtual network interface limit (/etc/lxc/lxc-usernet).
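For the record, the host prep amounts to a couple of config lines like these ("alice" and the ranges are just examples):

    # /etc/subuid and /etc/subgid
    alice:100000:65536

    # /etc/lxc/lxc-usernet: user, interface type, bridge, max interfaces
    alice veth lxcbr0 10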

To create the container, I did this:

    lxc-create -t download -n <name> -- -d debian -r bullseye -a amd64

Note that this runs the 'download' template, which (IIRC) is better suited to unprivileged containers than the 'debian' template is. The 'download' template will list its available distros if you do this:

    lxc-create -t download -n <name> -- --list

Note that some versions of the 'download' template may fail with a keyserver error because sks-keyservers.net died somewhat recently. Workaround:

    DOWNLOAD_KEYSERVER=hkp://keyserver.ubuntu.com lxc-create ...

https://github.com/lxc/lxc/issues/3894


I use Docker in unprivileged LXC* for the mere benefit of separation of concerns [1]. I don't use VMs for this because I am limited memory wise, and LXC allow me to share all host resources. I use docker because I don't want to mess with the way authors expect their dependencies to be set up - which is easy to circumvent by settling on a docker-compose.yml. Nix [2] appears to be the best future successor for my setup, but it is not used as widely yet.

* Afaik, this only works on Proxmox, not LXC on bare Debian, because Proxmox uses a modified Debian kernel with some things taken from Ubuntu.

    [1]: https://du.nkel.dev/blog/2021-03-25_proxmox_docker/
    [2]: https://nixos.org/
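For the Proxmox-hosted setup described above, the relevant bits are a couple of lines in the container's config, something like this (container ID made up):

    # /etc/pve/lxc/101.conf
    unprivileged: 1
    features: keyctl=1,nesting=1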


lxc/lxd was the reason I went with ubuntu, but snap is why I migrated my containers to docker (and will soon move to debian)


My shop went from VMware to OpenShift around the time I transitioned from systems to whatever it is I do now. In my private life I've used kvm since it was first rolled out, but over the last few years have redeployed all my home services to docker. I did try lxc for a while, and think it could replace kvm for lab work. On the other hand, lxd never seemed worth the effort: especially with its snapd dependency. Podman seems like a good alternative to docker, but I'm not sure migrating to it would be worth the effort as long as docker sticks around.


I use LXD these days and I love it. It's easy enough to deploy without knowing too many internal details. Its CLI is also intuitive and well designed. It's true that upstream LXD doesn't support installation outside snap. However, many distros like Alpine, Arch and Gentoo support installing it on bare metal from their software repos.


I use an LXC container to run Docker and a bunch of containers on my Proxmox VE machine :). LXC is interesting, but you can't beat the ecosystem of Docker images available (and maintained, updated, etc.)

Have a few apps running in plain LXC containers (like AdguardHome) but maintenance is non-free (unlike a docker-compose stack with Watchtower keeping everything nice and fresh).


What do you mean by Watchguard? I did a search but didn't find anything. Did you mean Watchtower maybe?

https://github.com/containrrr/watchtower


Yes, watchtower, sorry.


I love using LXC. Finding documentation online, however, is an absolute pain in the ass: it's hard to figure out whether the LXC commands being discussed are the older deprecated syntax, LXD (which uses 'lxc' for its command line), or the modern version of LXC you are actually trying to use...


How do you all manage the security side for LXC containers? As I understand it, any local user allowed to run lxc containers can effectively spin up a root-level-access container on the host, for example by running a privileged container and mounting / inside the container. Is there any way to mitigate this?


Isn't that the same as docker?


Only root can start privileged containers.


Maybe I misunderstood the requirements. I thought that in order to run containers a user has to belong to the lxc/lxd group. Once they belong to this group they can spawn a privileged container, which will indeed be run as root, but this effectively makes the user root too. Is there any other way for a non-root user to be limited to only unprivileged containers, or have I misunderstood the requirements?


Back in 2016, I thought it'd be nice if LXC/D had something like a Dockerfile. It would only take a small layer or tool to do it, which I proved with a little Python script: https://github.com/jonatron/lxf


LXD containers are built up using commands. I have one bash script for every container.


Do LXC and LXD support the functional equivalent of a Dockerfile or docker-compose.yml? I appreciate how Docker has become widely accepted for defining reproducible environments but have always found it to be rather heavy on resources.


Not really. But LXD containers are built by layering commands. I use a bash script with the commands to create reproducible containers.

That said, LXD is good for containerizing OSes, not single applications. So when you say "defining reproducible environments" you can get reproducible containers, but not so much environments. In that sense, it behaves more like a VM.


You can use cloud-init with lxd to do something quite similar to a Dockerfile. I don't know of anything like compose besides just bash scripts.
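A sketch of what that looks like (profile and file names made up; on older LXD the key is user.user-data rather than cloud-init.user-data):

    # cloud-init.yml holds the usual #cloud-config packages/runcmd/etc.
    lxc profile create webdev
    lxc profile set webdev cloud-init.user-data - < cloud-init.yml
    lxc launch ubuntu:22.04 web --profile default --profile webdev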


And, newly, a licensing hassle.


If you are interested in this kind of lightweight containerization, the best tool to consider would be FreeBSD jails. They have been available in FreeBSD since version 4.0, released on March 14, 2000, and at this point are a very mature solution to this problem. You can even attach dedicated network cards to jails and prevent them from being visible on the host. Moreover their integration with the base OS and system utilities is seamless and very mature.

Linux folks love saying "use the right tool for the right job" when it comes to telling people why Kubernetes is a strong enough reason for any company to switch to Linux; however, when it comes to things such as lightweight containers (or ZFS-on-root for that matter), for some reason the same reasoning isn't applied.

LXC and LXD look like a patchwork compared to FreeBSD jails. If you are on the desktop and want to run graphical apps, maybe it can make some kind of sense if you are already using Linux for everything else. But if you want to have a lightweight container setup on a headless server, FreeBSD is by far the best option.

In FreeBSD land, no one boasts of running 5 virtualized environments on a single machine because this is just a common thing for us and part of every workflow. You want to work on some github repository? You create a new jail in 1 minute using a tool such as cbsd.

A server with 32gb of RAM can easily run 1000s of jails, even a small VPS with 1-2gb of RAM can run a dozen jails or more without problems. It is intriguing that so many people dismiss the power of that.

And it is intriguing that people easily accept: "there is no equivalent to dockerfiles but it's fine because things can be scripted easily with basic shell commands" when it comes to LXC/LXD but people dismiss this very argument when FreeBSD folks try to explain why there is no need for Kubernetes on FreeBSD.

Anyway, if you run a startup not operating resource-intensive products (like ML workloads requiring spinning up massive amounts of cores and RAM on an irregular basis), FreeBSD and jails can really save massive amounts of time and allow you to focus on the right thing. Just run everything from a single root VPS (usually hosting companies already handle things such as RAID replication and hardware redundancy transparently) and before you reach scalability problems, you'll already be making millions of dollars in revenue.

Even after that, running thousands of jails on a single host essentially means doing what AWS and other cloud providers are doing. This doesn't prevent you from building a small high-availability layer on top to ensure that critical apps run from at least two different physical hosts. This isn't very difficult to build with things like Consul, HAProxy, built-in system utilities and a small custom web interface.


I used LXC to virtualize Mastodon instances for a while. It works as intended. The illusion was that there were separate kernels all on the same machine.


I use LXD for most of my dev and testing environments, as well as the container for most network services.

Even my mail server runs in LXC, and it is a direct descendant of a Red Hat 5 server from ~20 years ago!

I've often seen LXD vs Docker described as LXD being better suited to running distros and distro-like environments, with Docker being better suited to single-application environments.

So that's what I do. For many of my dev and testing environments, there's an LXD container, which I use over SSH/rsync daily. Compiles, long-running tests, bandwidth-heavy network actions and so on are done in those containers, on a few fast servers in data centres. It's much faster than my laptop.

One motivation for running Linux in LXD containers instead of on bare metal was to allow me to decouple changes to the host OS version and networking from the containers in which I do most of my work. Previously I used bare metal, i.e. ran Linux on servers, but found it annoying that I couldn't update the host OS, and especially the kernel or distro major version, without shutting down everything in Screen sessions and long-lived networking tests that are not so easy to shut down and restart (without a rewrite anyway).

Being containers, they can use host filesystems and devices almost directly. So they are great for things like I/O performance tests, with confidence that it's basically testing host performance and behaviour.

Unfortunately, after getting deep into both those things, I found LXD wasn't quite as run-a-distro friendly as it first appeared, and it wasn't as reliable at replicating host I/O performance either.

Doing file I/O with "shift=true" host-filesystem mounts in LXD (the most sensible mode) turns out to have very much slower O_DIRECT performance than it should, so that screwed up my database storage tests unexpectedly until I realised. Now I run low-level storage performance tests outside LXD, because I don't trust it.

As for distro-like environments, I eventually found stateful snapshot+restore or migration of dev and test environments is permanently broken. No LXD container I've used over many years has ever successfully been able to live-snapshot/migrate without hitting an error which prevents it. This isn't some obscure bug, either. It's never worked, and my browsing of forums and issue trackers leads to the view that it's not actually expected to work for almost any distro-like environment, despite being one of the headline features.

As a result, one of the main factors motivating using LXD instead of bare metal for me turned out not to work. It's always possible to shut down a container, but that loses so much state, long-lived Screen sessions, running processes and so on that it's about as disruptive as updating bare metal. I.e. no real advantage.

I could switch back to VMs, which are excellent for snapshotting and migration, as I used to use (with libvirt+kvm) but they have a different problem for much of my work: Host filesystem/blockdev sharing is relatively slow. Not only could I not use them to measure performance against various kernels on real hardware (it would be measuring the VM as much as anything), I also run many data-intensive jobs, and the closer I can get to those host storage devices, the better.

The last reason it's felt buggy and rough: when removing a host-container filesystem mount from a container, it has often deleted the mount point on the host as well, disrupting other processes using it. Operations on the container are not supposed to change the host itself. You get used to working around this, but it's annoying.

Ah well, nothing's perfect. It's still a very useful tool, with some rough edges. I'd look into fixing the I/O performance and host-unmounting bug if I thought the snapshot/migration feature would be made to work someday, but because that looks unlikely, plus issues that came with Snap, I'm not as motivated and will live with hackish workarounds.

For things like my mail, web, SpamAssassin and other services, LXD has been great, and the issues I encountered aren't really a problem. This is the sort of thing which Docker is pretty good for as well. However, due to history I've tended to keep servers going for a long time (with my mail server winning the crown as it's an image that's been gradually modified and upgraded for about 20 years, across many different ways of running Linux in container-like environments, starting with chroot). LXC/LXD is better than Docker for this use case, though either can be made to work.


No layer caching mechanisms


LXC is particularly useful on shared web hosting.


Docker's easier and does all the same stuff. It's integrated with everything else, has an interface everyone is familiar with, uses OCI images, is extensible, is supported on basically every platform, has a simpler config file, tons of community support, and now runs rootless.

Does anyone have a use case where they couldn't use Docker? I'm sure they exist but the list must be tiny. Listed in the article is:

  - run systemd in a container
    - why?? does your system not have systemd or equivalent?
    - bad design; how will you monitor what's running and what's not, or upgrade an individual service in this container? restart everything?
    - apparently systemd can run in docker?? https://medium.com/swlh/docker-and-systemd-381dfd7e4628
  
  - run lxc/docker in a container
    - docker-in-docker


It doesn’t do the same thing. LXC is a godsend when you need to sandbox other users or workloads that spill over into the host (even Docker stacks need host resources, like mount points and storage).

I run different Docker Compose stacks inside different LXC containers to avoid collisions between some containers that are in both stacks.


I teach an introduction to computer security course where I show my students how to write some basic rootkits for Linux. I need them to be able to do that on the computer room machines (where they aren't root or sudoers), or on their own computer, but in both cases I prefer that it doesn't cause a kernel panic if their code is wrong at some point.

As far as I know you cannot load kernel modules in a container. You need an actual VM for that.


VMs are perfect for that teaching scenario.

That said, you're not supposed to be able to load kernel modules from a container.

Which is exactly the sort of idea an enterprising rootkit author might think to take advantage of ;-)

You might consider asking your students if they can figure out how they might go about getting a kernel module installed from in a container despite it not being allowed - and what advantages does that bring to the rootkit author. If necessary, you can test it using an LXD container in a VM.



