
Hmm - have you considered a tool like Terraform or Deployment Manager for creating the clusters? In general, it's best practice to capture that configuration as code.


Managing clusters in Terraform is not enough to "treat clusters like cattle". Changing a cluster from zonal to regional in the Terraform configuration will, upon a terraform apply, first destroy the cluster and then re-create it. All workloads will be lost.
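As a rough illustration (a minimal sketch against the Terraform google provider; names and values are placeholders): the cluster's location argument is what decides zonal vs. regional, and the provider treats any change to it as a forced replacement, which terraform plan reports as "must be replaced".

    # Sketch only: switching "location" from a zone to a region forces
    # Terraform to destroy and re-create the cluster.
    resource "google_container_cluster" "primary" {
      name               = "example-cluster"  # placeholder name
      initial_node_count = 3

      # location = "us-central1-a"   # zonal control plane (before)
      location = "us-central1"       # regional control plane (after) -> forces replacement
    }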

I'm sure there are tools out there to help with cluster migrations, but it is far from trivial.


I think it’s a bit optimistic to assume that your customers will just change their deployment model because you introduced a fee.

You provide a web interface, so it’s reasonable to assume people will use it.


Cloud providers assume everyone is just like them, or like Netflix: Load balancers in clusters balancing groups of clusters. Clusters of clusters. Many regions. Many availability zones. Anything can be lost, because an individual data centre is just 5% of the total, right?

Meanwhile most of my large government customers have a couple of tiny VMs for every website. That's it. That's already massive overkill because they see 10% max load, so they're wasting money on the rest of the resources that are there only for redundancy. Taking things to the next level would be almost absurd, but turning things off unnecessarily is still an outage.

This is why I don't believe any of the Cloud providers are ready for enterprise customers. None of you get it.


I think you're wrong -- containers aren't ready for legacy enterprise, VMs are a better choice of abstraction for an initial move to cloud.

Get your data centers all running VMware, then VMDK-import to AWS AMIs, then wrap them all in autoscaling groups, figure out where the SPOFs have moved to, and only then start moving to containers.
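The VMDK-to-AMI step looks roughly like this with the AWS CLI (a sketch, not an exact workflow; bucket, key, and description are placeholders, and the vmimport service role is assumed to exist already):

    # Import an uploaded VMDK as an AMI.
    aws ec2 import-image \
      --description "legacy-app web server" \
      --disk-containers "Format=vmdk,UserBucket={S3Bucket=my-import-bucket,S3Key=legacy-app.vmdk}"

    # Poll the import task; on completion it reports the new AMI ID,
    # which then goes into a launch template / autoscaling group.
    aws ec2 describe-import-image-tasks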

In the meantime, all new development happens on serverless.

Don't let anything new connect to a legacy database directly; at worst go through an API, preferably via events.


Have you considered Google App Engine on GCP in the standard environment? That seems like a good fit based on your explanation, but I could be wrong.


I had a similar conversation with a government customer, saying that they should pool their web applications into a single shared Azure App Service plan, because then instead of a bunch of small "Basic" plans they could get a "Premium" plan and save money.

They rejected it because it's "too complex to do internal chargebacks" in a shared cluster model.

This is what I mean: The cloud is for orgs with one main application, like Netflix. It's not ready for enterprises where the biggest concern is internal bureaucracy.


Why would one want lots of little GKE clusters, anyway? Google itself doesn't carve up its clusters this way, AFAIK. I don't want a cluster of underutilized instances per application tier per project; I want a private Borg to run my instances on—a way to achieve the economies of scale of pod packing, with merely OS-policy-level isolation between my workloads, because they're all my workloads anyway.

(Or, really, I'd rather just run my workloads directly on some scale-free multitenant k8s cluster that resembled Borg itself—giving me something resembling a PaaS, but with my own custom resource controllers running in it. Y'know, the k8s equivalent to BigTable.)


We run lots of small clusters in our projects and identical infrastructure/projects for each of our environments.

Multiple clusters let us easily firewall off communication to compute instances running in our account based on the allocated IP ranges for our various clusters (all our traffic is default-deny and has to be whitelisted). Multiple clusters also let us have a separate cluster for untrusted workloads that have no secrets/privileges/service accounts with access to gcloud.

Starting in June our monthly bill is going to go up by thousands. All regional clusters.


Namespaces handle most of these issues. A NetworkPolicy can prevent pods within a namespace from initiating or receiving connections from other namespaces, forcing all traffic through an egress gateway (which can have a well-known IP address, but you probably want mTLS which the ingress gateway on the other side can validate; Istio automates this and I believe comes set up for free in GKE.) Namespaces also isolate pods from the control plane; just run the pod with a service account that is missing the permissions that worry you, or prevent communication with the API server.
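As a sketch of the namespace-isolation piece (namespace and policy names are hypothetical): a NetworkPolicy that only admits traffic from pods in the same namespace, so anything from another namespace has to come through whatever gateway you designate.

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-same-namespace-only   # hypothetical name
      namespace: team-a                 # hypothetical namespace
    spec:
      podSelector: {}        # applies to every pod in the namespace
      policyTypes:
      - Ingress
      ingress:
      - from:
        - podSelector: {}    # same-namespace pods only; cross-namespace traffic is dropped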

GKE has the ability to run pods under gVisor, which intercepts the pod's system calls in a user-space kernel so it has essentially no direct contact with the host kernel, even when behaving maliciously. (I think they call these sandboxes.)
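Opting a pod in looks roughly like this (a sketch assuming a GKE node pool created with sandboxing enabled, which is what provides the gvisor RuntimeClass; names and image are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: untrusted-workload            # placeholder name
    spec:
      runtimeClassName: gvisor            # run under gVisor on sandbox-enabled nodes
      containers:
      - name: app
        image: example.com/untrusted:1.0  # placeholder image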

The only reasons to use multiple clusters are if you want CPU isolation without the drawbacks of cgroup limits (i.e., awful 99th-percentile latency when an app is being throttled), or if you suspect bugs in the Linux kernel, gVisor, or the CNI. (Remember that you're in the cloud, and someone can easily have a hypervisor 0-day, and then you have no isolation from untrusted workloads.)
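For context on the throttling point: a CPU limit is enforced as a cgroup CFS quota, so a bursty container that uses up its quota is paused for the rest of the enforcement period, which is where the tail-latency pain comes from. A sketch (names and numbers are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: latency-sensitive-app     # placeholder name
    spec:
      containers:
      - name: app
        image: example.com/app:1.0    # placeholder image
        resources:
          requests:
            cpu: "500m"   # what the scheduler bin-packs against
          limits:
            cpu: "500m"   # enforced as a CFS quota -> throttling under bursts

A common compromise is to keep the request (for bin-packing) and drop the CPU limit, trading throttling for ordinary contention.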

Cluster-scoped (non-namespaced) resources are also a problem, though not too prevalent.

Overall, the biggest problem I see with using multiple clusters is that you end up wasting a lot of resources because you can't pack pods as efficiently.


We're aware of all of this, but we need to run things more or less identically across GKE/EKS/AKS, and gVisor can't be run in EKS, for example.

We're okay with the waste as long as our software & deployment practices can treat any hosted Kubernetes service as essentially the same.



For those that didn't click through, I believe the parent is demonstrating that it is a best practice to have many clusters for a variety of reasons such as: "Create one cluster per project to reduce the risk of project-level configurations"


For the most robust configuration, yes. However, one can certainly collapse/shrink if having multiple clusters is going to be a burden cost-wise and operations-wise. These best practices were modeled on the most robust architecture.


This is it exactly.

Thank you.


Namespaces are not always well suited to hermetically isolating workloads.


It's probably not worth $75/month to prevent developer A's pod from interfering with developer B's pod due to an exploit in gVisor, the Linux kernel, the hypervisor, or the CPU microcode. Those exploits do exist (remember Spectre and Meltdown), but they probably aren't relevant to 99% of workloads.

Ultimately, all isolation has its limits. Traditional VMs suffer from hypervisor exploits. Dedicated machines suffer from network-level exploits (network card firmware bugs, ARP floods, malicious BGP "misconfigurations"), etc. You can spend an infinite amount of money while still not bringing the risk to zero, so you have to deploy your resources wisely.

Engineering is about balancing cost and benefit. It's not worth paying a team of CPU engineers to develop a new CPU for you because you're worried about Apache interfering with MySQL; the benefit is near-zero and the cost is astronomical. Similarly, it doesn't make sense to run the two applications in two separate Kubernetes clusters. It's going to cost you thousands of dollars a month in wasted CPUs sitting around, control plane costs, and management, while only protecting you against the very rare case of someone compromising Apache because they found a bug in MySQL that lets them escape the sandbox.

Meanwhile, people are sitting around writing IP whitelists for separate virtual machines because they haven't bothered to read the documentation for Istio or Linkerd, which they get for free and which actually add security, observability, and protection against misconfiguration.
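For what it's worth, the kind of mesh-level policy being alluded to is short. For example, mesh-wide strict mTLS in Istio (a sketch, assuming Istio is installed with istio-system as the root namespace):

    apiVersion: security.istio.io/v1beta1
    kind: PeerAuthentication
    metadata:
      name: default
      namespace: istio-system   # root namespace => applies mesh-wide
    spec:
      mtls:
        mode: STRICT            # reject plaintext workload-to-workload traffic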

Everyone on Hacker News is that 1% with an uncommon workload and an unlimited budget, but 99% of people are going to have a more enjoyable experience by just sharing a pool of machines and enforcing policy at the Kubernetes level.


It doesn't have to be malicious. File descriptors aren't part of the isolation offered by cgroups: a misconfigured pod can exhaust FDs on the entire underlying node and severely impact all other pods running on that node. Network isn't isolated either; you can saturate a node's network by downloading a large amount of data from, say, GCS/S3 and impact all pods on the node.
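This is easy to see from inside any pod, because these sysctls aren't namespaced per cgroup (pod name is a placeholder):

    # Both report node-wide numbers, not per-pod numbers.
    kubectl exec some-pod -- cat /proc/sys/fs/file-nr    # allocated / unused / max file handles on the node
    kubectl exec some-pod -- cat /proc/sys/fs/file-max   # the node-wide ceiling a single pod can exhaust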

I agree with most of what you've said about gVisor providing sufficient security, but it's not just about security; noisy neighbors are a big issue in large clusters.


IOPS and disk bandwidth aren't currently well protected either.


RLIMIT_NOFILE seems to limit FDs, or am I missing something?


CRDs can't be safely namespaced at the moment, as I understand it.


We use Skaffold and it’s great. I’m talking about very minor unforeseen stuff that causes outages, not that we do it manually.


This is an interesting exchange if only for the thread developing instead of a single reply from a rep; it’s nice to see that level of engagement.

More importantly, this dialogue speaks volumes to Google’s stubbornness. Seth’s/Google’s position is: do it the Google way, sorry-not-sorry to all those that don’t fit into our model.

Like we haven’t heard of infrastructure as code? That can’t paper over basics like being unable to change your K8s cluster control plane. This is precisely the attitude that lands GCP as a distant #3 behind AWS and Azure.


Google stubbornly resists the idea that their platforms have actual users who depend on things not being broken for them constantly. It's cultural.

AWS has the complete opposite model.


Even still, it's not like it's trivial to just bring up and drop clusters. Just setting up peering with Cloud SQL or HTTPS certs with GKE Ingress can be fraught with timing issues that can torpedo the whole provisioning process.


How is this helpful at all? Are they supposed to implement everything in Terraform or similar? Is that your suggestion? Why don't you completely remove the editable UI then, if whoever is using it is doing it wrong? What a typically arrogant, out-of-touch-with-customers Google response.


It could be totally unrelated, but having an "equivalent Terraform" option alongside the REST and CLI options could dramatically speed up the configuration process.



