Is it me or is Ubuntu kind of like the Sony of server software? It always seems like they are developing concurrent solutions to fit into their model of doing things.
Where Sony pushes their own special formats like Memory Stick, Ubuntu pushes Upstart, Juju and now LXD. In the end I don't think this is entirely helpful to the ecosystem as a whole, when Ubuntu keeps pushing its own special formats of things while not bringing all that much more to the table. You could object that systemd complicates my Upstart comparison, but essentially all I see from Ubuntu is them pushing their own brand of tooling, and most of it is very Ubuntu-specific.
On the other hand, Red Hat usually releases software that can (and usually does) make it to other distributions and ecosystems. I'm not entirely sure what LXD is going to bring to the table beyond what Docker or similar utilities offer, and like many Ubuntu projects it currently seems to be VERY light on documentation. Actually, where do I even access the documentation for this project? Oversights like this are what killed Juju and MAAS for me, and yet Ubuntu pushes those projects like crazy at every conference I've seen them at (GopherCon, for instance).
Upstart seems like an odd example to give, given that it was the first sysvinit alternative that actually made it into major distributions. RHEL 6 (thus CentOS 6), ChromeOS and some of the earlier Fedora versions make use of it. This was after an attempt to port launchd failed, because of licensing issues. systemd came several years later.
Red Hat is a different business model. Their venturing into the cloud is more recent. Historically, they've been more into the support business, and this necessitated having a lot of people fix bugs in the Linux ecosystem. That, plus their acquisition of Cygnus Solutions, means they're the de facto gatekeepers of the Linux kernel and much of userspace.
Canonical is a more Apple-like company. They care about being internally consistent and formulating their own brand, interacting with the outside only where necessary.
Well, I'm aware it made it into CentOS, but that was kind of short-lived and it was never updated to newer releases of the codebase (making it kind of an init system floating around in limbo with compatibility issues... many of which I've personally encountered). I understand what you're saying about Red Hat vs Canonical, but tooling as critical as a containerization or init system doesn't work very well (IMHO) in a vacuum. It has to be more widely available, usable and used by the community, or not only can you not build community, you end up creating a fractured ecosystem that is hard to tool for, among other things.
Interestingly, upstart's first release was the same month that launchd's license problem went away. If Apple had been a little more responsive to the community, upstart may have never taken off.
Upstart is a bad example, for reasons that others have pointed out. It appears that LXD is being developed along with LXC, so I can't really see it as a "concurrent solution"; Juju is much the same, in that I don't think it's concurrent with any other project that does the same thing.
So I don't think the comparison with Sony holds any water.
It sounds to me like this is hardware-assisted Docker containers, which would be a good thing.
Intel processors have VT-x, which provides hardware to help speed up virtualization, isolate memory, etc. AMD has something similar. You can break out of a Docker container and get to the host OS and other containers. With a hardware-assisted hypervisor, it is possible to hide container memory from other containers at a level lower than the "host" OS.
If I understand Docker and VT-x correctly, hardware-assisted virtualization can be used to run N instances of a container while only having one instance in memory. VT-x can rewrite memory reads/writes transparently and deny writes to certain locations of memory.
Docker containers share the kernel with the host and depend on it for isolation. This would add the hardware-assisted isolation of containers without the overhead of another kernel per container, plus the other benefits of Docker.
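For anyone wondering whether their own hardware has the assistance being discussed, the CPU advertises it via feature flags in /proc/cpuinfo: "vmx" for Intel VT-x, "svm" for AMD-V. Here's a minimal, Linux-only Go sketch that checks for them (purely illustrative, nothing to do with LXD's code):

    // checkvirt.go: report whether the CPU advertises hardware
    // virtualization support ("vmx" = Intel VT-x, "svm" = AMD-V).
    package main

    import (
        "bufio"
        "fmt"
        "os"
        "strings"
    )

    func main() {
        f, err := os.Open("/proc/cpuinfo")
        if err != nil {
            fmt.Fprintln(os.Stderr, "cannot read /proc/cpuinfo:", err)
            os.Exit(1)
        }
        defer f.Close()

        scanner := bufio.NewScanner(f)
        for scanner.Scan() {
            line := scanner.Text()
            if !strings.HasPrefix(line, "flags") {
                continue // only the "flags" line lists CPU features
            }
            switch {
            case strings.Contains(line, " vmx"):
                fmt.Println("Intel VT-x available")
            case strings.Contains(line, " svm"):
                fmt.Println("AMD-V available")
            default:
                fmt.Println("no hardware virtualization flags found")
            }
            return
        }
        fmt.Println("could not find a flags line")
    }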
So what is the attack surface of LXD?
Can the host kernel be compromised from inside an LXD container, i.e. does it provide the same isolation you get when running processes as different users on the same kernel, or something more?
For example if there is a local kernel privilege escalation / DoS / etc. bug that can be triggered by a non-privileged user (or a root-inside-container user) will those exploits still run inside LXD?
A properly written VMM/hypervisor should have no attack surface. This won't be that, since there's a REST API on LXD, which is one attack surface.
DoS is still a problem, but containers should provide mitigation for that. You can make the VMM prevent DoS, but it's better to keep the VMM small and light.
As for local kernel privilege escalation, yes, it would still run, but it might not matter. In theory, the VMM can isolate all virtual machine resources such that rooting a VM only gives you that VM. I can't figure out how they extend that protection to containers yet since VT-x was made for full virtual machines and containers share a kernel.
Well -- 'no attack surface' might be simplifying things a bit too much, as you do need a way to interact with the hypervisor or the privileged host to actually get your data written to disk and your network packets out on the wire.
Each such interaction can contain bugs, some of which might be exploitable.
Even with Xen or KVM you do have an attack surface:
* guests can send network packets to the host, which interacts with the networking code on the host. If exploitable you get to execute code/DoS the host. Hopefully not because then so could any other remote machine.
* guests can execute instructions which get emulated / need extra privilege checks done in the hypervisor. See recent vulnerability regarding MSR registers in Xen.
* guests execute hypercalls, which obviously interact with the hypervisor. Bugs here, if exploitable, can be nasty.
* guests need to read/write their data to disk. Are we sure they can't read the data of a (possibly already deleted) other VM?
* guests read/write from memory ... was the memory of previously deleted/crashed/migrated guests properly scrubbed? Can any of the hypercalls/etc. be used to read another guest's memory, or access uninitialized memory containing pieces from old guests?
...
Of course the attack surface of a hypervisor is smaller than that of a full kernel (where you also have a lot of syscalls, disk formats, etc.), but that doesn't mean hypervisors are suddenly bulletproof.
The question is where LXD stands from a security POV among these simplified categories (no order implied):
- running multiple different processes as same user
- running processes in different LXC containers as root-in-container on same host
- running processes in different LXC containers as non-root on same host
- running multiple processes as different users
- running root processes in different KVM VMs on same host
- running non-root processes in different KVM VMs on same host
- running root processes in different Xen/domU VMs on same host
- running non-root processes in different Xen/domU VMs on same host
- ...
Or in other words if you get an account/container/VM on a shared machine from a hosting provider using technology X, how does that compare to getting an LXD container from a hosting provider?
(provided that other unknown users can run LXD containers on the same machine as yours).
"No attack surface" was definitely simplifying too much, especially for LXD. I think I was trying to say that it is not impossible to mitigate those attacks.
In the pure sense, a hypervisor doesn't need to do anything except create a virtual machine. It doesn't need a way to interact with a user or even the vm once it is created. I have written a bare metal, type 1 hypervisor that did nothing but key log. The guest never made a hypercall and wasn't aware that it was a guest at all. Side note, I'm not an expert. Hypervisor research is just for fun.
We know there is an attack surface on LXD immediately because of the REST API and its interaction with containers. Any resource mediation also exposes an attack surface. Resource mediation is difficult, but not impossible. The attack surface really depends on implementation.
With my limited knowledge of the Linux kernel, I can imagine a kernel running in its own VM, a VM for every container, and every container sharing read-only access to the single kernel. Each container could also be isolated via the same memory protection. I don't know enough to say that's possible. I think you're more knowledgeable than I am about LXC and the kernel in general. Any thoughts on this?
I'm not worried about memory protection, there is HW support for that and it can be done.
I'm slightly more worried about making sure that separate containers can't access each other's disks (via symlinks/hardlinks or overflowing some FS structures).
And I'm worried about the privileged kernel/hypervisor parsing/interpreting data from the unprivileged container.
In that sense the situation is not much different from a server: if you can exploit a bug in the server you can run/perform actions with the server's privileges.
Same situation with the kernel.
I'd wait until there are some more design/architecture docs about what LXD is exactly to say more though.
Upstart predates systemd by 4 years and had made it into a good number of other distros before systemd took over. Apart from upstart, I see your point.
There's a bit more to it than that, though. systemd as an init system was originally conceived as a way to improve on Upstart; systemd went its own way rather than contributing to Upstart because of Canonical's CLA.
That, I think, is a key point in this discussion: in order to contribute to an open-source project run by Canonical, you have to give them more rights to the code you contribute than they're willing to give you. Many people are understandably put off by that.
The fact that systemd comes after Upstart is I think less germane to the point that parent was trying to make about Sony than the fact that Canonical insists on being in control of their projects, and puts up rather high barriers to anyone who wants to contribute patches. I am sure that I will hear responses about various patches systemd maintainers have refused or said they wouldn't be receptive to, but that's a difference of kind (not just degree) from insisting upon assigning Canonical the ability to re-license the code outside of the GPL.
These aren't mutually exclusive thoughts; for GPL-licensed content, the ability to relicense the code is pretty much the only benefit that ownership confers over the GPL. It seems strange to assert that having this ability adds value for Canonical in an acquisition, if the acquirer isn't interested in using it.
Now, I have no problem with anyone who wants to sign the CLA and believes that Canonical is acting in good faith. But Canonical is asking for additional value from contributed code than what the GPL provides, and isn't compensating people for this value. Some people have a problem with that, and it makes it harder for Canonical-hosted projects to get community involvement or to be adopted by other distros, where maintainers have to choose between signing a CLA so patches get accepted upstream or continuing to maintain their patches themselves.
First of all, thanks for the comment. Down-votes don't add too much to the conversation.
I agree with you, but I remember reading those arguments in defense of the CLA (defense in court, increasing the value of the company in case they want to sell it, and the ability to close the code); I can't find a link, though.
I get your points, but I see it like this: Red Hat is like Ford. Solid, reliable, sometimes quite innovative. Yet the desktop Linux market slipped right by them for, like... years. One has to wonder what could cause such slippage. Ubuntu is more like Samsung to me than Sony. The innovation is off the charts; not everything hits, a lot doesn't, but what it does do is push others to strive, and that is leadership.
I don't think that's really a fair comment. Red Hat has always focused on the enterprise - home users didn't slip past them, they were never the focus.
The original success of Ubuntu was really incremental improvements - they took Debian as a stable base and improved on the installer and on the defaults, fixing a big bunch of the small obstacles that prevented non-technical people from using the system.
Say what you will about Sony these days - and there's much to be said - but there is no way Samsung could create something as brilliant as the PS4. Samsung are successful, but I can't think of anything they innovate and lead in, as far as their consumer side is concerned.
For me, it seems that Canonical is forever rewriting Ubuntu (the system tools and applications) from scratch.
Last time I looked, they were rewriting it from Python to Vala to fit mobile devices. Before that, they rewrote it from Perl to Python to fit the modern development ecosystem.
IMHO, Ubuntu is going round in circles.
They are not pushing their enterprise solutions in Latin America anymore. I don't know what they are doing.
I am, like many here, totally confused. Is this OS-based virtualization, HW-based virtualization, para-virtualization, or something completely different? On the one hand, there are clear indicators that this is OS-based virtualization ("there is a catch; however, LXD is only for Linux on Linux"). That's fine; that would essentially boil down to bringing the complete containment model of FreeBSD jails and illumos zones to Linux -- which I'm sure would be welcomed by Linux folks dealing with the structural problems of the current (piecemeal) Linux containment model.
But then there's the line about "working with silicon companies to ensure hardware-assisted security and isolation for these containers" -- WTF?! If using OS-based virtualization, why would you need hardware assistance for "security and isolation"?! And if that "hardware assistance" is being used for something so basic as proper containment, what happens if you don't have that assistance? Is LXD then vulnerable to privilege escalation? And who are the "silicon companies" we're talking about (like, is this Intel or is this not Intel?), what is the ISA, when does it tape out, how is it being validated, etc. etc. etc.
It's very frustrating for an announcement to be so putatively technical and yet provide so few answers; is there a deeper technical explanation of LXD somewhere?
> WTF?! If using OS-based virtualization, why would you need hardware assistance for "security and isolation"?!
I would guess that Canonical is talking about getting companies to contribute Linux kernel patches for cgroup interfaces to various northbridge-managed hardware virtualization tech (e.g. IOMMU tech like Intel's VT-d.)
VT-d in particular gives you the ability to expose one piece of hardware (that knows how to partition itself in some way) as multiple devices on the PCIe bus. With cgroup support, a container could be assigned one of the split devices, and act within the container as if it were the whole device. This is what regular hypervisors do, but they require a full set of virtualized devices (a virtual CPU, a virtual memory, etc.) while this approach allows you to virtualize only the resources your containers actually want to contend over.
So you could have, say, one virtual ethernet card per container (letting you run a container as a promiscuous-mode packet filter for its own VPC subnet, while still not being able to snoop on other VPCs' traffic) or one virtual GPU per container (allowing you to containerize OpenCL apps), while still having your containers acting like regular processes otherwise.
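For what it's worth, handing a whole host NIC (for example one SR-IOV virtual function that shows up as its own PCIe device) to a single container is already possible with plain LXC; the hardware assistance being discussed would make that partitioning safer and more general. A rough sketch using the go-lxc binding mentioned later in the thread -- the container name "c1", the interface name "eth1" and the LXC 1.x "lxc.network.*" keys are assumptions for illustration, not anything from the LXD announcement:

    // Sketch: move a host network interface wholesale into an existing
    // LXC container so it owns the device, instead of a bridged veth.
    package main

    import (
        "log"

        lxc "gopkg.in/lxc/go-lxc.v2"
    )

    func main() {
        // "c1" is a hypothetical, already-created container.
        c, err := lxc.NewContainer("c1", lxc.DefaultConfigPath())
        if err != nil {
            log.Fatal(err)
        }

        // LXC 1.x config: a "phys" network entry moves the named host
        // interface ("eth1" here is illustrative, e.g. an SR-IOV VF)
        // into the container's network namespace at start time.
        if err := c.SetConfigItem("lxc.network.type", "phys"); err != nil {
            log.Fatal(err)
        }
        if err := c.SetConfigItem("lxc.network.link", "eth1"); err != nil {
            log.Fatal(err)
        }

        if err := c.Start(); err != nil {
            log.Fatal(err)
        }
        log.Println("c1 started with eth1 as its own interface")
    }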
To get one virtual slice of a device per container, I just need the device's driver to support that, or a layer on top of the driver that partitions the device. Since the different cgroups have the same kernel and thus the same set of drivers, I see no advantage in splitting physical devices. What am I missing?
The mailing list post[1] doesn't mention any of the hypervisor stuff, except to say they want to make it feel more like a hypervisor. So I think maybe this has been exaggerated...
> If using OS-based virtualization, why would you need hardware assistance for "security and isolation"?!
Today most OS-based virtualization is using "hardware assistance". Those are often for memory and IO device management (even passthrough). Not sure if this is _the_ assistance they mention, but it's an example of how it could work.
No, actually OS virtualization doesn't generally use any hardware assistance. And my questions don't stem from ignorance; I have extensive experience with the implementation of both OS virtualization[1] and HW virtualization[2] -- which is why I find the LXD specifics so peculiar. (All the more so that they imply that the support is forthcoming, not current -- and that they are talking to "silicon companies" not microprocessor vendors.)
I think the best guess is what derefr posited, above: that they are using HW network virt as a way of avoiding building in proper network stack virtualization like that found in Crossbow.[3] Then again, given the degree to which LXD appears to be aspirational rather than actual, we might be overthinking it: perhaps the conversations with "silicon companies" are like LXD itself -- a daydream about what might be rather than a concrete reality.
> Yes. We’re working with silicon companies to ensure hardware-assisted security and isolation for these containers, just like virtual machines today. We’re working to ensure that the kernel security cross-section for individual containers can be tightened up for each specific workload.
Sorry, but WTF? Is it a hypervisor or not? From a security perspective, one kernel per container or LXC? If the latter, as the rest of the announcement seems to imply, what is the "work with silicon companies" about? Either compromising Linux allows you to get access to other containers on the machine, or it doesn't. It can't be both.
What? My first thought was "cool, Ubuntu backs LXC". My second was "waitaminute, Ubuntu actually wants to compete with Docker?". Docker, as you all know, is a fairly well established LXC management solution.
They then go on to state that LXC is a "real hypervisor" with live migrations and such. What? Did they take an established Linux household name, with wikipedia article and everything, and name their new semi-related project identically?
And if it's a para-virtualized solution they're pushing, are they really competing against Xen? I'm not sure it's wiser than competing with Docker.
Anyone from Canonical here and can explain what's going on?
And yet, oddly for something so young, it's seeing more usage than OpenStack. Of course that may have more to do with the problems within OpenStack, but Docker has its own attractions too.
OpenStack is just mind-bogglingly overcomplicated. They appear to try to get you to buy the kitchen sink and a space station when what you want is a sofa.
This could be a marketing problem, but it's the impression OpenStack gives me whenever I look at any of it.
Even "just" individual components like Swift makes me want to bang my head against a wall just from looking at an architecture diagram.
Of course, for large deployments you may end up needing all that complexity. The difference is that with OpenStack you need to figure out what you can disable. With the Docker ecosystem, you get to figure out what you need to add as you build. The latter approach is much more friendly.
They have waaaaaaay more marketing than OpenStack. They've probably written and sponsored more PR fluff than code for Docker. And it's easier to deploy.
I agree that Docker is easier to deploy than Openstack :) However...
It always astounds me how some people massively over-estimate the size and influence of Docker's marketing... Why yes, of course! The way we got Google, Microsoft, Amazon and IBM to integrate it in their products is by ghost-writing PR fluff. That's also how we got 600 people to contribute 9,000+ pull requests over 18 months [1] [2] [3]. Not bad for marketing monkeys!
Seriously - after seeing so many hackers work so hard to improve the project every day, the "it's all marketing fluff" crap always gets to me. It's just plain disrespectful. How much legitimate engineering work do you need to see before you start respecting other people's work?
Yes, and good engineering is also the art of simplicity; we should also respect engineering work for the amount of useless code that wasn't written.
Docker resonates well with people because it focuses on the aspects of virtualization that people care about: development and deployment. It alleviates the need for complicated configuration management tools by providing layered images, and encourages fine-grained containers by supporting first-class volume sharing.
The fact that it integrates well with other virtualization stacks is proof that for a good part it's orthogonal to them.
My logical opinion says that Docker is a useful tool which, though flawed in many ways, provides real value to a large number of users. I've even recommended Docker be used for new projects in my company, on the basis that it fits in well with what we're trying to do.
My emotional opinion is that Docker trades on trends in startup-world, systems engineering and the open source movement for the sole purpose of eventually generating revenue. This capitalistic perversion of what were before two idealistic and noble things (open source, engineering) is, quite honestly, abhorrent to me.
So to answer your question: while I might eventually respect its engineering accomplishments, I despise it on principle. I hope one day it turns into a simple useful tool that people can decide to use or not use without being cajoled by developer evangelists.
> My emotional opinion is that Docker trades on trends in startup-world, systems engineering and the open source movement for the sole purpose of eventually generating revenue. This capitalistic perversion of what were before two idealistic and noble things (open source, engineering) is, quite honestly, abhorrent to me.
Except that the last thing anyone with a monetary stake in your business will do is tell you to open source your main product. The Docker project has fought damn hard, and continues to, to make sure that a carefully curated line exists between the business and the open source project (see for instance the creation of the Docker Governance Advisory Board).
Docker's initial marketing consisted of posts on the blog of a fairly unpopular PaaS vendor, and some meetups in SV. I think they made some T-shirts at one point fairly early on too.
Compare that to the combined marketing budgets of HP, Dell, Rackspace, Red Hat etc. I've probably had more spent on me by OpenStack marketing (taking flights & lunches etc. into account) than the marketing budget of Docker prior to their recent funding round.
If you take "marketing" to mean random 3rd parties writing how they use Docker to solve actual problems, then yeah - I see a lot more of that than I do for OpenStack.
Well, the project for the kernel features is the same people, but I don't think it is really named "lxc"; it's just patches for Linux. In the kernel they are just called "namespaces".
LXD will not be competing with Docker. It's meant to complement it. As for the rest of your questions, I'm digging around to see what else I can find.
I'd like to read up on the technical docs. At first glance, it looks like a Docker rival. One of the concerns I have about LXC is that it isn't using hardware paravirtualization. It seems like that is what Ubuntu is trying to do with LXD (that is, LXD provides a hypervisor backend for LXC), but I am not sure. If so, then that's a much more compelling argument for me to use LXC over something like VirtualBox or ESX, or whatever.
Then again, maybe I don't understand how LXC works at all.
''"Published on 4 Nov 2014
Dustin Kirkland, Product Manager at Canonical introduces LXD (lex-dee), a new hypervisor that delivers capabilities to LXC containers that cloud users demand in scale out infrastructure. LXD is a persistent system daemon developed to enable the secure management and live migration of LXC (lex-cee) containers via an easy to use command line interface and REST API."''
The concept is relatively simple, it's a daemon exporting an authenticated REST API both locally over a unix socket and over the network using https. There are then two clients for this daemon, one is an openstack plugin, the other a standalone command line tool. ''
The main features (and I'm sure I'll be forgetting some) are:
- Secure by default (unprivileged containers, apparmor, seccomp, ...)
- Image based workflow (no more locally built rootfs)
- Support for online snapshotting, including running state (with CRIU)
- Support for live migration
- A simpler command line experience
This work will be done in Go, using the great go-lxc binding from S.Çağlar.
- Code to this project will be contributed under an Apache2 license, no CLA is required but we will require contributors to Sign-off on their commits as always (DCO).
- Discussions about lxd will happen on lxc-devel and lxc-users.
- Contributions to github.com/lxc/lxd will happen through github pull requests only and reviews will happen on github too.
This is kept separate from the main tree because, at least initially, I believe it best to have a separate release schedule for both of those, and because it tends to be easier for Go-only projects to live in their own branch.
...
In order to be a good hypervisor, we also need to make containers feel like they are their own system and so we'll be spending quite a bit of time figuring out how to improve the situation. Some of the work presented at Linux Plumbers is going to contribute to that, like cgmanagerfs to provide a reasonable view of /proc and a fake cgroupfs, Seth's unprivileged FUSE mounts and all the cool things mentioned in Serge's earlier post about
Now as for the next steps. We will be creating the repository on github over the next few hours with Serge and I as the initial maintainers. Once the project is properly started and active, we will promote some of the most active contributors to committers.
The first few commits in there will be text versions of the specifications we came up with until now. This should also serve as a good todo list for people who want to get involved.
Over the next few days/weeks, the existing code which was used for the demo at the OpenStack summit in Paris will be submitted through pull requests, reviewed and merged.
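To make the "daemon exporting a REST API over a unix socket" part of the announcement more concrete, here is a minimal Go sketch using only the standard library. This is not LXD's actual code: the socket path and the endpoint are invented for illustration, and the real daemon layers authentication, an HTTPS listener and the OpenStack/CLI clients on top of this shape.

    // Sketch of the announced shape: a daemon answering a REST-ish API
    // on a local unix socket. Paths and endpoints are made up.
    package main

    import (
        "encoding/json"
        "log"
        "net"
        "net/http"
    )

    func main() {
        mux := http.NewServeMux()
        mux.HandleFunc("/1.0/containers", func(w http.ResponseWriter, r *http.Request) {
            // A real daemon would enumerate containers; return a stub list.
            w.Header().Set("Content-Type", "application/json")
            _ = json.NewEncoder(w).Encode(map[string]interface{}{"containers": []string{}})
        })

        // Local-only listener; network access would go over HTTPS instead.
        l, err := net.Listen("unix", "/tmp/lxd-sketch.sock")
        if err != nil {
            log.Fatal(err)
        }
        defer l.Close()

        log.Println("listening on /tmp/lxd-sketch.sock")
        log.Fatal(http.Serve(l, mux))
    }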
Thank you; that's very helpful. And the tl;dr is that this whole announcement is describing something that doesn't really exist yet -- it's open source vaporware.
Can you try to get the juno repo up? It's pretty bad that the add-apt-repository command at http://www.ubuntu.com/cloud/tools/lxd doesn't even work on Ubuntu 14.04...
To me the most interesting things here are that the kernel (and CPUs) will be getting hardware-assisted process isolation, as well as advancements being made around CRIU to support live-migrating processes (and potentially entire cgroup trees?).
This is good for everyone. Docker doesn't even use liblxc anymore by default, it uses libcontainer. Wonder why Ubuntu isn't getting behind libcontainer.. In any case, the stuff being pushed to upstream projects, like the kernel, will flow back down to docker and everybody can enjoy new awesomeness.
> Docker doesn't even use liblxc anymore by default, it uses libcontainer. Wonder why Ubuntu isn't getting behind libcontainer.
Because not everyone wants what docker offers. Some people prefer and want the more VM-esque behaviour provided by LXC.
To me LXC is the real deal, while docker offers limited convenience at the cost of flexibility and platform lock-in. And I have zero interest in that.
I hope Ubuntu continues to offer good LXC support, and then docker (or whatever the other hip thing of the month is) can do whatever docker does, because it's external to whatever distro people are running.
> Because not everyone wants what docker offers...
Yes, I can understand this, but more specifically? That's hardly much in the way of reasons.
> To me LXC is the real deal, while docker offers limited convenience at the cost of flexibility and platform lock-in.
I'm not really sure what makes liblxc the "real deal" and "libcontainer" not? Would you care to expand on this though? The true flexibility you are alluding to is, I believe, provided by the kernel itself? Are you deriding libraries that abstract interfacing with these features? Where is the platform lock-in coming from? Docker has been making inroads into many non-linux platforms, even Windows recently.
> and then docker (or whatever the other hip thing of the month is)
Are you suggesting Docker is a "flavour of the month"? That's... a unique perspective. In any case, as a counterpoint, I'd like to offer up that Red Hat itself has partnered with Docker via OpenShift. If one were looking for the Linux flavour of the month, Red Hat would be the LAST place they would look.
Containers are far, far more efficient. The superficial benchmarks that suggest that the difference is small are misleading.
At a data center where I worked a while back I saw thousands of VZ containers on boxes that could only manage maybe sixty KVMs. If the issues around security and flexibility can be fixed, there is opportunity for orders of magnitude improvements in density and power utilization.
Sounds like the right mixture of security and usability (command line tools, like in Docker) and LXC.
Does anybody have any recommendations on how to manage LXC before LXD arrives? LXD can't be the first approach to making LXC usage and connection management easy. I played with LXC, liked the concept, and hated the iptables NAT thing... not really user friendly. So kudos to LXD!
I've got a Debian system with a couple of Debian VMs running on it under qemu-kvm. I don't want Docker-style containers for specific apps; I need "full system image" virtualization. I wonder if this would be helpful for my use case?
Running the same kernel/version of OS is fine across my host system and the guests.
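LXC on its own already covers the "full system image" case: the container boots its own init and full userland, it just shares the host kernel (which you say is fine). As a hedged sketch, not LXD and with the container name, distro and release chosen purely for illustration, creating and booting such a system container through the go-lxc binding might look something like this:

    // Sketch: create and boot a full Debian system container with LXC.
    // Roughly equivalent to:
    //   lxc-create -t download -n debian-guest -- -d debian -r wheezy -a amd64
    //   lxc-start -n debian-guest
    package main

    import (
        "log"

        lxc "gopkg.in/lxc/go-lxc.v2"
    )

    func main() {
        c, err := lxc.NewContainer("debian-guest", lxc.DefaultConfigPath())
        if err != nil {
            log.Fatal(err)
        }

        // The "download" template fetches a prebuilt root filesystem;
        // distro/release/arch values here are illustrative.
        if err := c.Create(lxc.TemplateOptions{
            Template: "download",
            Distro:   "debian",
            Release:  "wheezy",
            Arch:     "amd64",
        }); err != nil {
            log.Fatal(err)
        }

        if err := c.Start(); err != nil {
            log.Fatal(err)
        }
        log.Println("debian-guest is booting its own init on the shared kernel")
    }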
I'd be extremely interested to see if this brings greater security to the "container" model. Something resembling BSD jails is sorely lacking in the Linux world.