I started with ECS (because I wanted to avoid the complexity of K8s) and regret it. I feel I wasted a lot of time there.
In ECS, service updates would take 15 min or more (vs basically instant in K8s).
ECS has weird limits on how many containers you can run on one instance [0]. And in the network mode that lets you run more containers per host (bridge mode with dynamic host ports), DNS is a mess: you have to look up SRV records to find out which port each task is on.
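To make the DNS part concrete: with Cloud Map service discovery in bridge mode, resolving the service name only gives you the host's IP; the dynamically assigned port is only in the SRV record. A minimal Node sketch (the service name here is a hypothetical example):

    import { promises as dns } from 'node:dns';

    // Hypothetical Cloud Map name. In bridge mode with dynamic host ports,
    // the A record only tells you which instance; the task's actual port
    // comes back in the SRV record.
    const records = await dns.resolveSrv('myservice.local.example');
    for (const { name, port } of records) {
      console.log(`task at http://${name}:${port}`); // port is e.g. 32768
    }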
Using ECS with CDK/CloudFormation is very painful. They don't support everything (especially around Blue/Green deployments), and sometimes they can't apply changes you make to a service. When I was initially setting everything up, I had to recreate the whole cluster from scratch several times. You can argue that's because I didn't know enough, but if that ever happened to me in prod I'd be screwed.
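One concrete example of the Blue/Green gap: switching a service's deployment controller to CODE_DEPLOY is the kind of change CloudFormation can't apply in place; it replaces the service. A minimal CDK sketch (placeholder names and image, not my actual stack):

    import * as cdk from 'aws-cdk-lib';
    import * as ec2 from 'aws-cdk-lib/aws-ec2';
    import * as ecs from 'aws-cdk-lib/aws-ecs';

    const app = new cdk.App();
    const stack = new cdk.Stack(app, 'EcsStack');

    const cluster = new ecs.Cluster(stack, 'Cluster', {
      vpc: new ec2.Vpc(stack, 'Vpc', { maxAzs: 2 }),
    });

    const taskDef = new ecs.FargateTaskDefinition(stack, 'TaskDef');
    taskDef.addContainer('app', {
      image: ecs.ContainerImage.fromRegistry('nginx'), // placeholder image
    });

    new ecs.FargateService(stack, 'Service', {
      cluster,
      taskDefinition: taskDef,
      // Adding or changing this on a live service is a replacement for
      // CloudFormation, not an update. And the CodeDeploy side (deployment
      // groups, listener wiring) mostly lives outside the template anyway.
      deploymentController: { type: ecs.DeploymentControllerType.CODE_DEPLOY },
    });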
I haven't used EKS (I switched to Azure), so maybe EKS has its own complex pain points. I'm trying to keep my K8s as vanilla as possible to avoid cloud lock-in.
Interesting that you worry about re-creating the cluster from scratch, because I've experienced exactly the opposite. Our EKS cluster required so many operations outside CloudFormation to configure: access control, add-ons, the metrics server, ENABLE_PREFIX_DELEGATION, ENABLE_POD_ENI... Rebuilding the EKS cluster would be a huge risk, and because of all that hand configuration the applications hosted there aren't independent of each other. It makes me very anxious to work on that cluster. Yes, you can pay an extra $70/month for a dev cluster, but it will never truly match prod.
On the other hand, I was able to spin up an entire ECS cluster in a few minutes with no manual operations, entirely within CloudFormation. ECS costs nothing extra, so creating multiple clusters is very reasonable (though separate clusters do hurt packing efficiency), and the applications can be fully independent.
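For scale, this is roughly the entire stack. A CDK sketch that synthesizes to a single CloudFormation template (image and names are placeholders):

    import * as cdk from 'aws-cdk-lib';
    import * as ecs from 'aws-cdk-lib/aws-ecs';
    import * as ecsPatterns from 'aws-cdk-lib/aws-ecs-patterns';

    const app = new cdk.App();
    const stack = new cdk.Stack(app, 'OneShotEcs');

    // One construct creates the VPC, cluster, ALB, task definition, and
    // service. Deleting the stack tears it all down just as cleanly.
    new ecsPatterns.ApplicationLoadBalancedFargateService(stack, 'Web', {
      taskImageOptions: { image: ecs.ContainerImage.fromRegistry('nginx') }, // placeholder
      desiredCount: 2,
    });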
> ECS has weird limits on how many containers you can run on one instance
Interesting. The ECS docs say that for a c5.large the task limit is 2 without ENI trunking and 10 with it.
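If I understand the mechanics, in awsvpc mode each task takes its own ENI and one ENI is reserved for the instance itself, so the limit is the instance's ENI cap minus one. You can pull the cap from the EC2 API rather than trusting the table (sketch with the AWS SDK v3 EC2 client):

    import { EC2Client, DescribeInstanceTypesCommand } from '@aws-sdk/client-ec2';

    const ec2 = new EC2Client({});
    const out = await ec2.send(
      new DescribeInstanceTypesCommand({ InstanceTypes: ['c5.large'] }),
    );
    // c5.large reports 3 ENIs; one belongs to the instance, leaving 2 for
    // awsvpc tasks, which matches the documented limit without trunking.
    const maxEnis = out.InstanceTypes?.[0]?.NetworkInfo?.MaximumNetworkInterfaces ?? 0;
    console.log(`tasks without trunking: ${maxEnis - 1}`);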
In ECS I had to recreate the cluster from scratch because CDK/CloudFormation couldn't apply some of the changes I wanted to make.
My approach on Azure has been to rely as little as possible on their infra-as-code, and to do everything I can to set up the cluster using K8s-native tooling. Add-ons, RBAC, metrics: I try to handle all of that with Helm. That way, if I ever need to change K8s providers, it "should" be easy.
[0] https://docs.aws.amazon.com/AmazonECS/latest/bestpracticesgu...