The more I use Docker, the more I realize that GCE is going to smoke AWS and Azure in the future when it comes to deploying a containerized infrastructure.
This is mostly because turn-key kubernetes on GCE is light years ahead of what ECS provides. It almost makes amazon's offering seem like a joke.
This might level the playing field a bit by having Docker itself provide the infrastructure management software instead of it being tied to a particular service.
So then the question for me is do I want to deal with Google or Amazon? I pick Amazon every time because I'd never want to depend on Google, who hates customer service and sucks at it.
The most annoying thing about ECS is how it looks like docker-compose (v1) but is just different enough to be incompatible. At my last job, we were evaluating containerization, and we nearly gave up after seeing how awful ECS was to get working.
Having been running a large-scale production system on ECS for well over a year now, I'd say that ECS is hardly a bad choice; it's incredibly stable, integrates with proven AWS technologies (CloudFormation, ELB, etc.), and it just works without you having to manage clustering. All it needs is a layer of usability on top, which you can get from things like Empire and Convox.
While native Docker clustering is cool in theory, I think the vast amount of complexity around networking makes it unsuitable for production deployments at this time. If stability is more important than bells and whistles, ECS is a great choice. Docker's production story, outside of simply running containers, leaves a lot to be desired.
I'm assuming you're referring to instabilities in Docker networking. I can't point to anything specific, but that's because we've seen it as an unnecessary layer of complexity that's trying to solve a problem most companies don't/won't have.
DNS + a load balancer is proven and stable. We have 65,536 available ports on Linux, and doing port mapping to prevent conflicts across containers is trivial (Empire handles this for us). Client-side service discovery is usually unnecessary for most Docker use cases (stateless services), where a load balancer is a better fit (why push load balancing to clients? Would you expect your browser to have to load balance when visiting google.com?).
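To make that concrete, here's a minimal sketch of what that port mapping looks like by hand (the image name is made up and assumed to EXPOSE a port; Empire/ECS automate the equivalent):

    # let Docker pick a free host port for each container
    docker run -d -P --name api-1 myorg/api
    docker run -d -P --name api-2 myorg/api
    # see which host ports got assigned, e.g. 8080/tcp -> 0.0.0.0:32768
    docker port api-1
    docker port api-2
    # those host ports become targets behind an ELB; clients only ever
    # hit the load balancer's DNS name, no client-side discovery needed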
I'm sure there are cases where networking overlays are a good fit, I just haven't found one yet.
You don't have to use overlay. Docker networking is pluggable, and any networking configuration supported by Linux can easily be exposed to Docker via network plugins. The most common configurations (bridge, NAT, host networking, no networking, overlay) are built in.
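For example, with just the stock drivers ('mynet' is an arbitrary name):

    docker network create -d bridge mynet       # a plain Linux bridge, no overlay
    docker run -d --net mynet nginx             # attach a container to it
    docker run -d --net host nginx              # host networking, no NAT
    docker run -d --net none alpine sleep 600   # no networking at all
    docker network ls                           # shows the built-in drivers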
second on beanstalk/docker - beanstalk does simplify quite a bit about running a fleet of docker images (and non-docker stuff too), uses standard aws tools (cf, elb, rds, ecs, vpc), has zero downtime deployment built in, automated server upgrades, app monitoring, environment cloning etc...
one knock against beanstalk is that it could be faster doing all those things, but it's tolerable considering the alternatives.
I've been using Elastic Beanstalk for some time now, and while I find it quite useful for people who just want to get Docker to production without investing too much time into orchestration, it looks to me like there is a price to pay for that 'simplicity':
- Elastic Beanstalk does not provide a native Docker experience. You're forced to use AWS's own way of defining services and dependencies (Dockerrun.aws.json, which comes in two versions). This means you can no longer use docker-compose, and not all of the options from `docker run` are available (logging driver, privileged containers, though I think the ability to run privileged containers was recently added).
- From my personal experience, deployment times can be unacceptably long, ranging from 5 to 20 minutes (see the sketch below). They get longer as you add more instances to the environment or if the instances are not powerful enough.
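For reference, a typical deploy loop with the eb CLI looks roughly like this (a sketch; the app and environment names are made up), and each `eb deploy` is where those long waits show up:

    eb init -p docker my-app   # one-time setup, picking the Docker platform
    eb create my-docker-env    # provision the environment (the slowest step)
    eb deploy                  # ship a new version; this is the 5-20 minute wait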
I think Elastic Beanstalk is a good choice for starters or for less mission-critical applications.
As a result of the points explained above, I'm now looking into more Docker-centric solutions such as Rancher, Kubernetes, or Docker's UCP.
As you said, SNS is mostly covered with Pub/Sub, but SQS and any sort of RDS for something other than MySQL are absent (notably Postgres). I like GCE a lot and we use it at $work, but there's no excuse at this point not to have feature parity.
ECS wasn't too bad. The approach they took made sense when it first came out, but Amazon hasn't evolved the idea since then. Personally I think that GCE's edge won't come from their container support, since AWS and Azure both support k8s, but will come from their value-add analytics and data services like BigQuery, Dataflow and so on. If Azure and AWS can't compete on that front, then GCE will start to cannibalize the customers who are still using the respective cloud providers only for infrastructure. GCE still has a long way to go before it becomes a serious contender.
I'm overall pretty optimistic about AWS, but if they do end up losing to another platform I would expect it to be due to this general issue of not iterating on their offerings enough. I can't really recall them ever deprecating a service, e.g. forking something to a v2 and killing off v1 after a certain amount of time. So with everything they release, they only give themselves one chance to get the core of it right.
Right, but that's kind of the problem. If I were signing up for the first time and needed a simple db, I might see the SimpleDB service in the dashboard and try it out, not knowing it's been unofficially deprecated.
Similar, from what I can tell, with CloudSearch being superseded by Elasticsearch Service. Although their version of Elasticsearch is ~2 years old and hasn't been updated since they launched the service, so this may be a case where neither is getting any attention.
The worst part is, AWS support is so tight-lipped in the forums and over web support that there's no way to tell what is still fully supported and what's being silently put out to pasture. You have to look for hints based on what products they're doing webinars about, looking at the momentum of what products they're making tweaks to in the weekly update emails, etc.
What do you mean it has a long way to go? What exactly is missing in GCE compared to AWS? You just stated something you think would be an edge in the future, but it didn't seem like you gave any cons.
Interested to know because we're considering cloud providers.
GCE is not stable (a kind of forever beta) and misses a lot of the AWS features. It's basically a few years behind AWS. I've started to hate it, but unfortunately I've already invested too much in the platform to leave it now.
Pro: it's cheaper than AWS.
For example, there is no equivalent of AWS Certificate Manager. Someday a similar feature may land on GCP, but as usual it will be too little, too late. IAM was released only recently; before IAM, authentication & authorization on GCP was really a joke. GAE management is still not manageable via a public API: you need to use the SDK, which has private APIs, to deploy new apps or manage versions. What kind of cloud service is that, if it doesn't have an API?

The GAE email service required you (until recently) to create a paid Google Apps account for each email address you wanted to use as a sender (WTF??), and all of this was done manually (e.g. in the browser). Recently the service was effectively deprecated (i.e. you are limited to 100 emails per day). Compare that with Amazon SES, which was a stable and robust product from day one and still works flawlessly. This is really how GCP relates to AWS: at first glance it looks similar, but then you find out it's a stripped-down version with various restrictions and stability issues.
Last time I checked, the SDK was a mess. Two issues come to mind:
- I couldn't SSH into an instance because the username for a particular distro was different from what the docs said
- I couldn't deploy a Docker-based GAE Managed VM (now named Flexible VM; see 'stability' above). I understand it's labelled beta, but when not even the example app is deployable, that says a lot about the product's stability.
I could make a long list of such issues... If price is the most important factor, you need only basic cloud features, and you are fine with a beta label, GCP may work for you; otherwise you should look elsewhere.
It wasn't mentioned because we only launched Docker for AWS and Docker for Azure. We only had the bandwidth to ship 2 versions at the level of quality we wanted, so we picked the most frequently requested providers. I think GCE is a great candidate for the next batch of releases. It all depends on demand. Would you use it?
Docker does follow the Unix philosophy: it's just that it's a unix-like collection of tools rather than a single tool.
Under the hood Docker is made of many small components which you can use separately if you want: containerd, runC, libnetwork, SwarmKit, HyperKit, VPNKit, DataKit, Compose, Machine, Notary, etc.
The 'docker' cli gives you a nicely integrated interface to all these components but you can use them separately too.
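For example, runC can be driven directly, with no Docker daemon involved (a rough sketch following runC's own getting-started flow; 'demo' is an arbitrary container id):

    # build a rootfs by exporting an existing image
    mkdir -p mycontainer/rootfs
    docker export $(docker create busybox) | tar -C mycontainer/rootfs -xf -
    cd mycontainer
    runc spec            # generates a default config.json
    sudo runc run demo   # runs the container using runC alone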
I don't think splitting a project up into libraries is what the philosophy describes. If I filter the 1,600+ open issues in the Docker project right now, I can see many related to Swarm. Your users now inherit all of the bugs and necessary features of Swarm even if they're only interested in using Docker on a single host.

The Unix philosophy would also not commend the amount of duplicate effort that the Docker team is pouring into Swarm when the problem of clustering has already been tackled by other companies and teams. One Unix tool does not develop half of another Unix tool; it just uses those other tools and relies upon the experience and expertise of the community. If Docker grows a sub-tool to solve every problem, it no longer qualifies as meeting the requirements of the philosophy, and it certainly doesn't reap the benefits of composability at the user level. I vote for separate tools and freedom of choice any day over one giant, precomposed tool. I think it's also discouraging that we will now see Swarm slowly grow all the features of its competitors, when that engineering effort could have gone elsewhere.

To summarize: don't reinvent, prefer choice to lock-in, and composition belongs at the user level, not the tool level (libraries don't count). Plugins were a great step forward; built-in Swarm was a step backward.
It's the idea popularized by git: use one namespace and you have related commands that live under one roof. Compare that with PostgreSQL, which installs a bunch of generic names at the top level (e.g. createdb), which is annoying.
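To illustrate (the database name is made up, and the Postgres commands assume a local server is running):

    # one namespace, related commands under one roof:
    git log
    git branch
    # vs. generic top-level binaries scattered into $PATH:
    createdb mydb
    dropdb mydb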
The thing I would be concerned about is bloat. I enjoy and leverage Docker's fast boot-up and lightweight containers. I hope they stay focused on this.
Docker's trying to lock in enterprise customers. This has nothing to do with programming, and everything to do with encouraging lock-in. I've lost a lot of faith in Docker. I thought they really cared about bringing containers to developers and fostering an ecosystem, but here we see them trying to oust sysops competitors. Why compete in the container space when Docker wants to be the king of everything? Why put your trust in Docker as a developer when it's the enterprise they're after? There are still so many improvements to be made to the developer experience, still so many problems to be solved, but Docker is only interested in getting paid by corporations that couldn't care less. So no, the Docker engine shouldn't be defining services and clustering. They touted plugins last year as the answer to extensibility, but of course now we're locked into their idea of clustering, and their idea of what a service is. If I were the Kubernetes team, I would be warning others about Docker's voraciousness. No one is safe as long as there is money to be had.
You can either have a bunch of different tools that do their own little thing and do them well and need a bunch of perl scripts and sysadmins on call to actually do anything useful, or you can have one large tool that does one meaningful thing and, by virtue of being one tool, does it well.
The UNIX philosophy is fine, it's just slowly escaping from the people who misunderstood it. (Oddly enough UNIX kernels have always been fine, partly because of the bikeshed principle, and partly because experimentally, doing a bunch of tiny things well doesn't actually get you a working OS.)
That I can agree with, but I still feel like there is a reasonable line to be drawn between application containerization and coordination and load-balancing of networked services.
My understanding is it follows the 'plumbing and porcelain' pattern; behind the scenes there are lots of small commands that do one thing and one thing well (the plumbing) and one command that acts as a wrapper over all these commands to keep things simple (the porcelain).
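git itself is the classic example of that split; roughly:

    git log --oneline -1     # porcelain: the friendly, composed interface
    git rev-parse HEAD       # plumbing: resolve a ref to a commit hash
    git cat-file -p HEAD     # plumbing: print the raw commit object
    git ls-tree HEAD         # plumbing: list the tree behind it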
Slightly off topic. What are people doing about user data persistence on the cloud/Docker? Specifically, we are porting a desktop application to the cloud via application streaming technology with Docker, but we would like the user’s data and preferences to not go "poof" when the cloud instance disappears. Ideally, we would like some automagic way to attach, say, the user's dropbox account or the equivalent to the cloud instance. Is anyone working on that problem?
You can use something like GlusterFS and distribute your files, or it can hook into your cloud provider and create persistent storage automatically on something like EBS.
When I tried it way back when, the trouble I had with Gluster is that it didn't readily handle nodes randomly leaving and new ones joining. It was more hardware-centric, in that if a node left, you were expected to bring that specific node back online. Is that still the case?
I don't have direct experience with any of these, but the Docker approach here is the Volumes Plugin API[1], which allows third parties to implement portable volume plugins that let a volume persist across hosts and across containers.
A bunch of plugins have been implemented[2], but I haven't personally heard any real success stories of people using them, which doesn't necessarily mean such stories don't exist.
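For what it's worth, using one looks something like this (a sketch; 'somedriver' stands in for whichever plugin you install, e.g. an EBS- or Gluster-backed one):

    docker volume create --driver somedriver --name pgdata
    docker run -d -v pgdata:/var/lib/postgresql/data postgres
    # if the host dies, the same named volume can (in theory) be
    # attached to a replacement container on another host via the plugin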
On a separate note, many other teams run stateful services that can handle the complete loss of a node's data. For example, it seems popular to run Elasticsearch in Docker, though again, I'm still learning about this pattern, too.
I think this is where the idea of co-locating compute and storage really shines (e.g. Joyent/Manta, scaleableinformatics.com). Moving all state to NFS sounds like a great way to widen the performance gap between RAM and permanent storage... Not to mention that if you really want to spread your data and your application, you would require NFS over VPN (unless encrypted NFSv4 actually works now?).
Can you really run a transaction-oriented DB over NFS "in the cloud" with meaningful performance and guaranteed writes to disk in case of network/power failure?
They need to put a note on their page that Docker for Windows is not compatible with Windows containers. I've been playing around with Windows containers for a few days, then saw this and thought, "cool, an update." I installed it and discovered that while the client is compatible between the two, each daemon cannot see the containers created by the other. MS and Docker need to sort this stuff out, because right now Windows containers are nicer for the few images that have been released, but Docker for Windows allows for the full Docker ecosystem.
As usual with AWS (or anything I guess) there's more than one way to accomplish a particular goal. How much of Amazon's Elastic Container Service is replicated by Docker for AWS? I'm currently using ECS + Docker but this looks potentially simpler.
Our goal for this project was to keep it as simple as possible. We tried to only use standard AWS features (CloudFormation, EC2, Autoscaling groups, VPC, ELB, etc) and we are using Docker 1.12 out of the box, with no changes.
All of the scheduling is handled in the docker engine itself, so we didn't need to add anything outside of that.
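To give a flavor of what that looks like, these are the stock 1.12 swarm-mode commands underneath (a sketch, nothing Docker for AWS-specific):

    docker swarm init    # turn the first engine into a manager
    docker service create --name web --replicas 3 -p 80:80 nginx
    docker service ls    # the engine itself schedules and reschedules tasks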
This version is pretty much an MVP, and we hope that the beta testers will help us test it out, and guide the future direction of the project.
Seems like there's not much, at the moment, in the way of attaching EBS volumes or using your own custom AMI. For example, I want to mount an NFS share on the host to connect some container data volumes to directories inside. I've asked about this on the new forum https://forums.docker.com/c/docker-for-aws
What if your app was designed to work against AWS's data offerings, like RDS, S3, etc.? Unless you're self-hosting your DB, concerns about persistence are less of an issue. Also worth considering, if you are self-hosting, is the replication/redundancy in the platform; you should be able to preserve and restart instances.
Me, I'd rather use what's available than self-host more often than not, but depends on the use-case.
Mesos is proven at massive scale (Apple uses it for Siri, Twitter uses it for their data warehouse). You can run a cluster of 10k nodes if you want. I believe they have run simulations at higher scale than that.
Docker Swarm is not yet proven at such scale. They have run perf tests at 1k nodes, which is high enough for most use cases, but not for truly warehouse-scale computing. 1k nodes is also where Kubernetes tops out (though expect that to be higher in Kubernetes 1.3, which should be released in the coming weeks).
On the other hand, if you're trying to run Docker container workloads, it's going to be easier in a Docker-first cluster orchestrator like Docker Swarm or Kubernetes; Mesos has Docker support but it's not as tightly integrated, and doesn't implement the full Docker management API.
I don't think it does make a difference; these are all just resource/cluster managers. Swarm and Kubernetes are more focused on containers, whereas Mesos is more robust and can support running many other things on top of it.
IBM's Bluemix has supported Docker Containers for a while now, but hasn't gotten much limelight... probably as a result of the size of their community and Bluemix's sketchy [but improving] stability. Does anyone have any experience using their container service? And would this be a big improvement?
I've been using the container service for a few months. I have encountered a case where my running containers completely disappeared and was told by support that this was because of a "migration". They claimed to have notified me beforehand on several occasions, but I have no evidence of this. Hopefully this was a one time occurrence and will lead to greater stability.
One other general problem I've had is the time it takes to launch a container. I would expect containers to be an improvement over VMs in terms of how quickly you can launch, but that doesn't seem to be my experience so far.
I would wait a while longer before I would consider their container services for production workloads.
Not a fan of the new Windows version, as it requires you to enable Hyper-V, which stops any other virtualization (VirtualBox, VMware, etc.) from functioning. The only workaround I've found is rebooting to enable/disable it on demand (details below).
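For anyone else stuck doing this dance, the usual toggle is bcdedit from an admin prompt, with a reboot after each change:

    :: disable Hyper-V so VirtualBox/VMware can run again
    bcdedit /set hypervisorlaunchtype off
    :: re-enable it when you need Docker back
    bcdedit /set hypervisorlaunchtype auto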
What are you talking about? I had to enable Hyper-V to get VirtualBox or any other VM solution working. But yes, you're right that you can't use them at the same time. The new native Docker doesn't use a third-party VM (boot2docker) and interfaces with the host hypervisor directly. So I had to stop all running VirtualBox VMs to get Docker to start.
They never emailed me from the last "private beta" before announcing this. And now deploying this on AWS or Azure requires me to sign up for yet another "we'll get back to you..." private beta.
Love the tech, hate all this marketing runaround.
EDIT to add: either your beta is ready or it ain't. I understand a gentle initial seed to verify it's not an utter catastrophe but no need to play nanny with my bits across 20+ beta releases. I'm a grown up. Make a disclaimer and let me assess the risk. Your corporate logo looks like a 1970s Carvel ice cream cake--I know what I'm getting myself into.
> Love the tech, hate all this marketing runaround.
Just wanted to offer an alternate point of view, or maybe some context, on this. There were over 70,000 people who signed up for the Docker for Mac & Windows beta. Using an invite system was a self-preservation measure, and a way to avoid exposing tons of people to bugs as the beta was refined (through roughly twenty iterations). If a bad update rolls out across all those people, that would not be good, as you can imagine. So it was much more a self-preservation measure, and about catching things quickly in smaller batches, than about wanting to market to you.
Check your spam folder; some of the emails from the last private beta went to people's spam folders, so they never saw them. Either way, it doesn't matter anymore, since Docker for Mac and Windows is now public to use without the need for a code.
I'm not sure there's much overlap between people who can get Docker Machine limping along on VirtualBox with storage and networking flowing properly on Windows 10 fast ring, and those who don't know how to check their spam folders, but I appreciate the tip.
No messages (and I applied via two accounts for two different platforms). No message telling me "go get the bits!" today. And nothing in spam folders.
But yet another blog post asking me to sign up for yet another private beta... ugh.
EDIT to add: As polite feedback--two separate requests, two separate Docker accounts, one Gmail and one corporate email address, and I did not receive any invite during the private beta nor notification today when it went public.
You would be surprised how many people didn't see it in their spam folder. Hence the reason why I mentioned it. :)
Not sure what you mean about no message telling me to go get the bits today. There was no email saying that docker for mac/windows was made public, it was announced during the keynote at DockerCon (this morning), and a blog post went out.
Before we made it public, everyone who had signed up for the beta had an invite sent to them, and there was no one left in the queue.
You should try it. The toolbar is just there to see the status of the daemon and adjust VM settings. You still use the CLI to use Docker, just like on Linux.
Yea, the GUI is just there to manage updates and to let you know it's up and running; you use the actual docker CLI commands for everything, just like on Linux. The hypervisor is completely abstracted away. For example, here's the output of me running `docker info`:
    Kernel Version: 4.4.13-moby
    Operating System: Alpine Linux v3.4
    OSType: linux
    Architecture: x86_64
    CPUs: 2
    Total Memory: 1.954 GiB
Can it fall back to Virtualbox/vmware/parallels/etc if Hypervisor.framework isn't available? Hypervisor.framework doesn't run on somewhat dated x86 machines.