Docker - Way better than a VM (github.com/dotcloud)
256 points by jonny_eh on May 19, 2013 | 84 comments



For those that have thought about running something similar in production, I have a few questions:

1. Shared kernel - in this model all of the containers share one kernel, so any activity or even tuning (e.g., the I/O scheduler) would impact all of the containers, right?

2. Patching - when you need to apply a kernel security patch, all containers would need to agree on a change window unless you were using something like KSplice?

3. User space / kernel space dependencies - if we imagine even 5 years down the road, will containers running, for example, Red Hat Enterprise Linux 5 apps end up containerized but broken? I.e., the hosting provider will likely want to stay ahead of the curve and upgrade its kernel, but the app teams may not be as progressive, so when those upgrades occur the apps would break.


Those are real problems, but they're problems shared by all container/jail-based virtualization platforms like OpenVZ, FreeBSD jails, and Solaris containers.

The first question is a valid point. But you also get benefits from not having the overhead of N different kernels running. This is easiest to see when looking at VPS providers - a 512MB OpenVZ VPS means you have 512MB of memory for your application to use. Yeah, kernel overhead isn't that much, especially if you're running a few high-resource instances, but it can help if you have lots of low-memory instances. There's lots of discussion online about OpenVZ vs. Xen/KVM VPS hosts if you're curious.

As for patching - OpenVZ at least makes it very easy to do live migration of instances between servers (barring some weirdness if you have NFS mounts in the guest), although it appears that lxc (and therefore docker) can't do that [1]. In any event, it shouldn't be hard to shut down the guest, migrate, and restart it - especially if you're using shared storage of some sort.

As for your third point - backward compatibility with RHEL/CentOS is generally quite good (since that's kind of the point of RHEL). At work we're currently on CentOS 5, and our migration strategy to CentOS 6 is probably going to be to install CentOS 6 OpenVZ hosts, then move the guests and worry about upgrading the guests later. Forward compatibility is an issue, but I don't think there's an easy solution to that.

[1] http://en.wikipedia.org/wiki/Operating_system-level_virtuali...


KSM makes the overhead pretty low, memory-wise.


Is (3) a real problem? After Linus' strongly-worded[1] email, I assumed kernel backward compatibility was fairly robust. Isn't it?

[1]: https://lkml.org/lkml/2012/12/23/75


It doesn't appear to be a problem. We built something relatively similar to this where I work, but with generic Linux utilities like chroot. We have 11+ year old Linux environments running on RHEL6 just fine.

The other direction is not so easy: very recent Linux environments don't always run on RHEL6.



Not a dup of those - their URL and text content are different.

However, those links are useful as the conversation is on topic, even if the source material is different.


Right now, finding VPS providers which work well with Docker is a bit like walking through a mine field.

Currently it recommends kernel version 3.8 or greater. This means that if you prefer Ubuntu then you need 13.04 or the ability to upgrade the kernel.

It also currently requires AUFS, which means that you need the AUFS kernel module installed. So even if you have a supported kernel, you still might need the ability to modify it. They are working on supporting alternate implementations such as Btrfs, though.
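If you want to sanity-check a box before trying the install, something along these lines works (a rough sketch; the Ubuntu package name for the AUFS module is from memory, so treat it as an assumption):

    # kernel should be 3.8 or newer
    uname -r

    # is aufs available?
    grep aufs /proc/filesystems || lsmod | grep aufs

    # on Ubuntu the module usually lives in the "extra" kernel package
    sudo apt-get install linux-image-extra-$(uname -r)
    sudo modprobe aufs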

EC2 is a great option right now and it's what I'm using.

I agree with another comment mentioning this is the future. However, I wonder how long it will be before something like "Erlang On Xen" becomes more widespread, cutting out the OS completely.

ETA: I love watching this project, it has really taken off and the maintainers have been making fast progress. It seems that as soon as I run into a show stopping problem, it's fixed the next day. It's a bit inspiring and makes me look at the progress I have made on my own projects. ;)


I have been able to get Docker up and running on Amazon EC2 [1], Rackspace Cloud [2], and Linode. You can't get it working on DigitalOcean just yet [3], due to their inability to run custom kernels and their lack of Ubuntu 13.04 support. But I'm sure it will be available in the near future.

If there are any hosting providers that you would like to see directions on how to get Docker up and running, please submit an issue, and we will do our best to add it to the docs. I'm working my way down a list, picking the most popular ones first. If I have a lot of requests for a particular host, I'll do that one next.

[1] http://docs.docker.io/en/latest/installation/amazon.html [2] http://docs.docker.io/en/latest/installation/rackspace.html [3] https://github.com/dotcloud/docker/issues/424


There are plenty of VPS providers that offer KVM in such a way that you can upgrade your own kernel. DigitalOcean seems to be an exception.


You can use any VPS provider that supports an Arch Linux or Gentoo image. Both distros will have a Linux kernel version > 3.8.


What show-stopping problems have you been encountering? Also, which VPS providers have you had luck with so far?


The only provider I have been using so far is EC2. I tried Joyent (you can recompile your own kernel but it's more of a PITA than I'm willing to deal with) and DigitalOcean (no 3.8 kernel for Ubuntu.)

The show-stopping problems were early in development. The major one is that I ran into the kernel issues early on, and then they added the recommendation to use only 3.8 or higher. So that's not a fix, but it addressed my problems. I was also having problems running Docker in stand-alone mode per their own docs; they have since removed this, and daemon mode works great. I don't remember what the others were.


This is fantastic, I've been wondering how DotCloud made everything work so well. Can't wait to try this for deploying a Django application I work on — hopefully it will solve my dependency headaches.

One nitpick — not a big fan of the recommended installation method (curl get.docker.io | sh -x). Is it really that hard to ask people to download and run the script themselves?


Drop the sh, and inspect the script before you run it. It just checks a few dependencies and system items, and then retrieves the docker binary, and puts it in /usr/local/bin. All of this can be done by hand by any Linux sysadmin .. And of course, it wouldn't take much to make a .deb for it either.
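In other words, something like this (just a sketch of the manual route):

    # fetch the installer without executing it
    curl get.docker.io -o docker-install.sh

    # read it
    less docker-install.sh

    # then run it yourself once you're satisfied
    sh docker-install.sh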

However, I would like to discuss the docker design a little more in detail, on the basis of its ease of use. First of all, I too do not like to have random stuff piped into my shell, so I went looking for the Docker sources. It was darned easy to build from sources, and quick too. At the end, I had a single binary.

And the cool thing about this binary is that it's both the server and the client in the same package! So - the sysadmin of your Linux machine can (and should, manually, for now) build from sources, install in a local/ or ~/bin, and add the daemon to start up as needed.

Then, anyone else on the machine - not needing su rights - can run docker images, and so on.
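Roughly, the flow I'm describing looks like this (commands are illustrative and assume a working Go toolchain; check the repo's README for the exact build steps):

    # sysadmin: build the single binary from source
    git clone https://github.com/dotcloud/docker.git
    cd docker && make

    # install it and start the daemon (by hand for now)
    sudo cp docker /usr/local/bin/
    sudo /usr/local/bin/docker -d &

    # everyone else: use the same binary as the client
    docker pull base
    docker run base /bin/echo hello world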

This isolation, simplicity of install, and .. frankly .. rocking feature set .. is a beautiful thing. Can it be that golang is the secret sauce here? I say: yes.


Docker author here. Yes, the ability to produce a single binary that "just works" is one of the reasons we chose Golang. The operational simplicity is hard to beat, and you don't have to convince python hackers to install gems, java hackers to mess with virtualenv, and so on.

For the record, another reason we chose Golang is because it was not Python. Docker is a rewrite of a 2-year-old Python codebase which has seen a lot of action in production. We wanted the operational learnings but not the spaghetti code. So we made it impossible to cut corners by changing languages. No copy-pasting!


Well I for one was quite surprised that a) docker compiled so rapidly on my system, and b) it's a very sublime binary. I guess I'm learning another reason why golang ought to get more of my free-time attention, so that's enough HN for me; I'm off to spend the afternoon reading your code.. Cheers to you and what I'm about to learn from docker! :)


Yes it's written in go... though Docker could have been written to the same effect in something else (Haskell or whatever).

If there were a 'secret sauce' I'd say it's the kernel features it takes advantage of (cgroup, lxc, kernel namespaces, aufs, etc). :)


You're right about that! What I'm enjoying, perhaps a little meta-, is how easily that swell soup of jewels is ground up into a fine sonic-screwdriver of a tool. Seems to me right now, I've got yet another reason to enjoy the golang experience ..


> And of course, it wouldn't take much to make a .deb for it either.

It seems to be on its way: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=706060


If you are using Ubuntu 12.04 or 13.04 the recommended installation method is to use the Docker PPA.

More information available here: http://docs.docker.io/en/latest/installation/ubuntulinux.htm...
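For the impatient, the PPA route is roughly this (the PPA and package names are as I recall them from those docs, so double-check the link above):

    sudo apt-get install software-properties-common   # python-software-properties on 12.04
    sudo add-apt-repository ppa:dotcloud/lxc-docker
    sudo apt-get update
    sudo apt-get install lxc-docker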


Seems like it may solve one set of issues, but will create an entirely new class of potential problems that I'm not sure are any better. A lot of the teething issues seem to be getting some focus though and maybe this will get better over time.

What dependency issues with Django apps have you been having a hard time solving? You should be able to solve everything with a decent configuration management system like puppet/chef/saltstack/ansible.


> Is it really that hard to ask people to download and run the script themselves?

Why? Isn't that exactly what "curl get.docker.io | sh -x" does? It's the CLI equivalent of clicking a download link and then executing it. I think what you mean is "wouldn't it be better for the user to first read the script's code before executing it, or run it in a VM?"


No, with the "curl get.docker.io | sh -x" you have absolutely no reproducibility or accountability, as there is no way to find out which code was actually run during installation. This is the dream situation for attackers who manage to trojan the install script or MITM the connection (note that the curl call does not even use https) to target specific users.


The -x flag prints each line as your shell executes it. Not perfect, but a reasonable tradeoff between control and ease of use.


You could throw a "tee" into the pipe if you want to save the script as well as run it.
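i.e. something like:

    curl get.docker.io | tee docker-install.sh | sh -x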


just as secure as the containers! ;-)


Can you back up that claim?


well yes.. LXC uses containers.. it's as secure as what it uses. Heh.


Not yet... LXC is not even production ready yet in terms of security. Will take a year or two but it's getting there.


I guess it depends on what you mean by production ready. dotCloud, Heroku and others already use LXC in production, and they have been for a while.

You just need to know what the limitations are, and make sure you build your system around LXC so that you are protected.


stackato as well.

production ready doesn't mean much tho. you can use anything you like for prod. it doesn't make it better or worse.

a ton of things that are considered "production ready" today are crippled with bugs, design flaws, etc.

The major issue with Linux namespaces (or containers, or "lxc" if you will) is that they're generally used as a security feature but weren't designed primarily as a security feature. (It wouldn't have entered the kernel if it had been designed as such anyway.) VMs provide a better level of isolation so far, even though they're not perfect either.

And for what it's worth, FreeBSD, for example (among some others), provides similar namespacing that is much better security-wise. OpenVZ and VServer are doing similar things too. Oh, and RSBAC's jail as well (it might be the "strongest" of the list).


Pretty sure that was the point of what they said :)


Any recommended guides to getting a Django/RoR development environment set up with Docker? How does networking work with Docker? Currently I'm using Vagrant; the guest VM is sharing /Vagrant with the host (how does this work with Docker?), and the host has made Vagrant projects available over my LAN for my MacBook/Windows machines with IDEs. I also get the feeling Docker is meant to be complementary with VMs and not exclusive.

Also does it make sense using Chef/Puppet with Docker?


Docker is awesome.

It lets you use a Linode or AWS instance as a bunch of NATed containers. This makes it way easier to install just one thing in one container and not mess up the other ones. This is where configuration management is going.
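For example, you might run each service in its own container on one box (the image names here are made up, and the flags are the current docker CLI as I understand it, so adjust for your version):

    # each service gets its own filesystem and process space
    docker run -d -p 6379 my-redis-image
    docker run -d -p 5432 my-postgres-image

    # see which host ports were NATed to which containers
    docker ps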

Plug: CloudVolumes is awesome for Win apps.


My main ask is whether this provides the "hard" isolation types (I/O, mostly, and GPU). It sounds really cool.

Now that I'm looking at it, I don't think this provides any kind of isolation. That's not what it's for. It's for distributing packaged programs.


> Docker relies on a different sandboxing method known as containerization. ... Most modern operating system kernels now support the primitives necessary for containerization, including Linux with openvz, vserver and more recently lxc, Solaris with zones and FreeBSD with Jails.

I'd guess it's (at most) as secure as the underlying OS containerization support.


I was asking about how the OS controls access and allocation of shared resources.


Since Docker uses cgroups, and cgroups can limit CPU, memory, and I/O, I guess this blog post by the Docker guys gives a good overview: http://blog.dotcloud.com/kernel-secrets-from-the-paas-garage...
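If you want the raw numbers yourself, the container's cgroup is visible through the usual pseudo-filesystem. A sketch (the exact mount point and lxc path vary by distro, and the container ID is hypothetical):

    # memory usage and limit for one container
    cat /sys/fs/cgroup/memory/lxc/<container-id>/memory.usage_in_bytes
    cat /sys/fs/cgroup/memory/lxc/<container-id>/memory.limit_in_bytes

    # cumulative cpu time
    cat /sys/fs/cgroup/cpuacct/lxc/<container-id>/cpuacct.usage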



Way better than a VM, but only covers a subset of VM use cases.


It also covers some of the use cases for packages. So, it's halfway between a package and a VM.


Exactly. That is actually a bigger problem to solve, IMO, so suggesting that it's mostly a VM alternative is doing it a disservice.


One of my biggest challenges has been finding the right way to describe Docker, because it borrows properties from multiple categories: virtualization tools, package managers, build tools, process managers, and even operating systems :)

People have told me they use Docker as a "vmware alternative", a "make alternative", an "rpm alternative", a "vagrant alternative" and even a "tomcat alternative". But people also use Docker together with all of these tools!

In that way Docker reminds me of Redis: depending on what you want to do, you could use it as a replacement for memcache, rabbitmq, couchdb, mysql or even zeromq. But you could also use it together with all of these tools. Over time we're getting more comfortable with the fact that Redis might just be a tool of its own - useful in its own unique way, in combination with whatever other tool is best for the job at hand.

... but none of that matters if nobody uses it, and to get people to use software you need catchy titles like "better than VMs" :)


Isn't this .. old news?

Why keep posting this every two months or so? It would be better if there were improvements, a major site using it for rollouts, etc.


We (the docker authors) had nothing to do with this particular posting :)

That said, there have been a lot of improvements - see the changelog here: https://github.com/dotcloud/docker/blob/master/CHANGELOG.md


For more casual HN visitors, such as myself (I only skim the front page a couple of times a day), it might be the first time they've heard of Docker (which looks pretty cool).


It looks really nice. It would be great if you wrote more specifics on https://github.com/dotcloud/docker. So, what "size" can we expect, and what kind of elements are part of the container? What "interpreters" currently exist for the portable format you describe? But also meta-level information such as the size of the team, why Go was chosen, and in what situations NOT to use docker. I've been following docker.io for a while for the purpose of cloud services for robotics, so I think I'm a nice example of a person who would like to be convinced to use your solution. :-)


Depending on containers for security is absolute insanity. Kernel vulnerabilities are common. Own the kernel, break out of the docker containers, and the whole concept is rendered useless.

VMs are a much safer bet (though not perfect either).


Containers are nice for putting some distance between the compromised app and the OS. Especially useful when combined with a system for mandatory access controls (SELinux, Smack, etc). If you can attempt to limit the exposure of a compromise... you should!

Having said that... If someone has enough determination, they will manage to compromise your system regardless of how 'secure' it has been made. :)

I generally like to use containers in addition to a virtual machine. I do find it a bit shocking when I see a company offering up containers as an alternative to a VM though. I suppose it's a compromise some companies are willing to make for the additional performance.


> Containers are nice for putting some distance between the compromised app and the OS.

Some distance. But nowadays, when you can own the kernel, that distance shrinks to zero.

I just saw these a few seconds ago: http://grsecurity.net/~spender/logs2.txt https://twitter.com/grsecurity/status/335963659337601024


The real value of containers is not security (although lxc does include robust security mechanisms), it's the streamlining of application deployment.

If you don't trust lxc to sandbox untrusted code, don't! Just deploy 1 production container per VM, or even per physical machine. But maybe you don't mind running multiple containers on your local dev vm or on your staging machine - I know I don't.

What containers give you is a constant payload which can be moved across very different hardware setups, from the most shared to the most isolated.


> it's the streamlining of application deployment.

I always liked the idea of nicely integrating with the environment and utilizing the features of package managers, the file system, users, and all the rest of the niceties we have at our disposal. "I don't know how to organize a bunch of things together!" seems like a silly reason to containerize every component into a separate root fs.

But on the other hand, I can imagine this work flow does have some merit, and some folks save a lot of time and energy and potential headaches just popping things in containers.


> "I don't know how to organize a bunch of things together!" seems like a silly reason to containerize every component into a separate root fs.

One good reason to separate every component is that it facilitates moving them to separate machines down the road, or scaling them separately.

Another good reason is that it reduces the out-of-band requirements of the components. "all the niceties" you have at your disposal may very well be specific to your distro, or your site setup. By contrast, docker containers only require docker, which is a single binary file. A developer needs to know his component will run anywhere, not just on your particular setup.


> Some distance. But nowadays, when you can own the kernel, that distance shrinks to zero.

I am certainly not disagreeing :). I do wonder how well the exploit would have worked if the system were also locked down with SELinux/Smack. Not that MACs are bulletproof, but again... more distance.

In the end though, the only secure system is one that's not powered. :)


> I do wonder how well the exploit would have worked

"the"?

> locked down with SELinux/Smack

If you're executing in kernel mode, then you can just disable these. It might be more interesting to point to the efficacy of various exploits on a Grsec/PaX kernel, however.

> In the end though, the only secure system is one that's not powered.

That's silly. Sure, airgap your sensitive stuff. But there's a reasonable level of security you can achieve that's far beyond what docker provides, while still retaining a flexible and reasonable computing environment. For example, Xen's xl tool makes it very simple to create quick and cheap VM containers.


> "the"?

Now you are just nitpicking :) "the" meaning, you gave me an example.

> If you're executing in kernel mode,.....Grsec/PaX kernel, however.

SELinux has a lot of overlap with grsec, providing many of the same benefits while being more widely available. The flow of an exploit: 1) entry into a system (app compromise, shell access), 2) inserting/uploading executable data, 3) execution of said data granting further access (e.g., exploiting a kernel bug, adding a backdoor, manipulating the host system in some way).

The goals of grsec/SELinux, and of marking data memory as non-executable (NX bit, PaX, Exec Shield), are aimed at preventing #2 and #3. The idea is to prevent access escalation in the first place.

On PaX, the kernel supports utilizing the NX bit on x86-64 and has for quite a while now. Not using a system supporting the NX bit or at least PaX/Exec Shield is pretty stupid.

> But there's a reasonable level of security you can achieve that's far beyond what docker provides

I had already agreed with you on VMs... no reason to argue this point. :) Since you mention VMs again, however, I will also note that VMs are not entirely isolated from the host system. An example of a Xen exploit (to use your example): http://lists.xen.org/archives/html/xen-announce/2012-06/msg0...


> you gave me an example.

Ahh sorry. Well to continue along that example, evidently it breaks out of lots of things -- https://grsecurity.net/~spender/logs.txt

> On PaX, the kernel supports utilizing the NX bit on x86-64 and has for quite a while now. Not using a system supporting the NX bit or at least PaX/Exec Shield is pretty stupid.

Not going to try to parse this, but you appear to be very mistaken. Wikipedia PaX.

> VMs are not entirely isolated from the host system

Correct, hence the parenthetical in the OP. That sysret bug was a great one.


> Not going to try to parse this, but you appear to be very mistaken. Wikipedia PaX.

On PaX https://en.wikipedia.org/wiki/PaX#Executable_space_protectio...

"The major feature of PaX is the executable space protection it offers. These protections take advantage of the NX bit on certain processors to prevent the execution of arbitrary code. This staves off attacks involving code injection or shellcode. On IA-32 CPUs where there is no NX bit, PaX can emulate the functionality of one in various ways."

And then on NX bit support on Linux: https://en.wikipedia.org/wiki/NX_bit#Linux

"The Linux kernel currently supports the NX bit on x86-64 CPUs and on x86 processors that implement it, such as the current 64-bit CPUs of AMD, Intel, Transmeta and VIA.

The support for this feature in the 64-bit mode on x86-64 CPUs was added in 2004 by Andi Kleen, and later the same year, Ingo Molnar added support for it in 32-bit mode on 64-bit CPUs. These features have been in the stable Linux kernel since release 2.6.8 in August 2004."

PaX also provides a few other features but the big defining one has been the NX bit support. Not sure why you seem to think I am mistaken in what I said.


Also, in a private PaaS-ish environment, Docker's container abstraction is very useful.


A more relevant comparison is with Juju and the Heroku buildpacks support in Stackato.


Juju person here. Docker would be a nice enhancement for Juju. Currently we use LXC in our older version (0.7, in Python) for a local provider to deploy containers on your local laptop. The idea is being thrown around to instead use Docker containers in _every_ provider like EC2 and OpenStack. This would enable you to just move stuff around transparently, which is awesome.

The Juju team is keeping a close eye on Docker and it'll likely (but I can't promise) be in 13.10.


This looks very interesting.

I would love to be able to set up a workflow on my Windows machine that lets me do Rails dev in as close to a native way as possible, but using much of the same workflow I use on my MBP.

If I set up a container on my Windows machine, I would still have to SSH into some virtual environment to run my Rails app, right?

Ideally, I would love to be able to just go to my localhost in my browser and see my app - will this help me be able to do that, rather than going to a browser within a VM or some 'contained environment'?


Docker looks like it is *nix only. It CAN be run on Windows and Mac... inside a VM, which kind of eliminates all the advantages they talk about.

An option is Vagrant (which Docker uses on the above OSes) + chef/puppet.

It uses VMs but works well enough for me and both configuration engines have widespread support. http://docs-v1.vagrantup.com/v1/docs/getting-started/


Not *nix only, Linux only. Solaris and FreeBSD have totally different container mechanisms, which honestly are probably better understood and vetted at this point than the LXC approach.


Running Docker inside a VM doesn't eliminate all the advantages. For example, you can test a full stack of components (frontend, database, memcache, background workers...) on a single VM, instead of deploying 1 VM per component, which gets really heavy really fast.

Another advantage is that docker on a VM is still docker: the container running on your local VM is 100% identical, byte-for-byte, to the container you will run on an octo-core box with half a terabyte of RAM.


I hate the small delays of a VM environment though.

Rails already has some minor loading issues, adding a VM to that can be very frustrating - I imagine.


You want Vagrant[1]. It provides an easy way to control local headless VMs. It will mount local directories and port forward to your localhost.

[1]: http://www.vagrantup.com/


I have been looking at Vagrant for a while - but it seems like just another VM packaging system to me.

Am I mistaken, or will my Rails environment not sit in a VM in a Vagrant instance?


It will sit in a VM.

Vagrant creates a VirtualBox (default) VM, uses chef/puppet (preferably) to set the environment up, and then you link the host's Rails directory to a suitable location on the VM.

So yes, the environment is a VM, you can use "vagrant ssh" to access that VM easily enough but for coding you use your usual code directory on the host with your usual editor.
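Day to day that ends up looking roughly like this (a sketch; the Vagrantfile line shown is v1-era syntax, so check the docs for your version):

    vagrant init precise64     # generates a Vagrantfile for a base box
    # in the Vagrantfile, forward the Rails port, e.g.:
    #   config.vm.forward_port 3000, 3000
    # the project directory is shared into the VM at /vagrant by default
    vagrant up                 # boots the VM and runs chef/puppet
    vagrant ssh                # shell into the VM; keep editing code on the host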


I'm about to install and answer my own question, but what is the difference between this and, say, supermin + lxc? Or just supermin.


Is there any way to migrate running containers to another physical machine if you need to do maintenance on the underlying system?


Is this similar to RedHat's mock environment?

The advantage is that you're not limiting yourself to a RedHat distro. But conceptually is it similar?


If only something like this existed for Windows.


VMWare ThinApp

Enigma VirtualBox

Molebox

Thinstall Virtualization Suite

There are more...


Anything open source?


I really like the way the dependency format is basically a monad, but they don't call it a monad so devs won't get scared :)


How is it possible to limit resources (RAM, storage, CPU) and get resource usage stats?

Tnx


You can limit memory and cpu usage with "docker run -m" and "docker run -c", respectively.

Docker relies on cgroups for resource limits and accounting. So anything you can do with cgroups, you can do with Docker.
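For example (the values are arbitrary; -m is in bytes and -c is a relative CPU share, as I understand the docs, so check the docker run help for your version):

    # cap memory at ~512MB and give the container a cpu share of 2
    docker run -m 536870912 -c 2 base /bin/bash

The usage stats come from the underlying cgroup files (memory.usage_in_bytes, cpuacct.usage, and friends).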


Will this replace vagrant someday?


"Way better than a VM"

Why?


Crazily enough if you read the linked page you'll find out!


Probably a very snarky way of saying: if you're uninterested in the four big features of Docker, which are implied to be the only reasons for virtualization and aren't even accurate anyway, then it's not better than VMs; it might even be worse.


The README lays out their reasoning under the second heading, Better Than VMs[0].

[0]: https://github.com/dotcloud/docker/#better-than-vms



