
Yes! I think this is a really under-reported issue. It's basically caused by Kubernetes starting pod termination without confirming that everything consuming the endpoint update (ingress controllers, cloud load balancers) has actually stopped sending traffic to the pod. It affects every ingress controller, it also affects Services of type LoadBalancer, and there isn't a real fix. Even if you add a delay in the preStop hook, that still might not handle it 100% of the time. IMO it is a design flaw in Kubernetes.
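
(For context, the "delay in the preStop hook" mitigation is usually just a sleep that keeps the pod serving while the endpoint removal propagates to ingress controllers / load balancers. A minimal sketch; the name, image, and 15s value are illustrative, and the image needs a sleep binary:)

    apiVersion: v1
    kind: Pod
    metadata:
      name: demo                  # illustrative
    spec:
      containers:
        - name: app
          image: nginx            # any image that ships a sleep binary
          lifecycle:
            preStop:
              exec:
                # Keep serving while endpoint removal propagates to the
                # ingress controller / load balancer.
                command: ["sleep", "15"]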



Not defending the situation, but with a preStop hook, at least in the case of APIs, k8s can handle it 100%; it's just messy.

We have a preStop hook of 62s: 60s timeouts are set in our apps, 61s on the ALBs (ensuring the ALB is never the cause of the hangup), and 62s on the preStop to make sure nothing new has come into the container in the last 62s.

Then we set a terminationGracePeriodSeconds of 60 just to make sure it doesn't pop off too fast. This gives us 120s where nothing happens and anything in flight can get to where it's going.


> Then we set a terminationGracePeriodSeconds of 60 just to make sure it doesn't pop off too fast.

I think the grace period includes the preStop duration, doesn't it?


The grace period does include the time spent executing the preStop hook, yes.
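
To make the interaction concrete, here is a rough sketch of the setup described above (numbers from that comment, names illustrative). Because the grace period clock starts at the same moment as the preStop hook, it has to cover the 62s sleep plus whatever the app needs to drain after SIGTERM, so a value larger than 62 is what actually buys extra time:

    spec:
      # Starts counting when the preStop hook starts, so it must cover
      # the 62s sleep plus the app's own shutdown after SIGTERM.
      terminationGracePeriodSeconds: 90    # illustrative value > 62
      containers:
        - name: api                        # illustrative
          image: example/api:1.0           # illustrative
          lifecycle:
            preStop:
              exec:
                # 62s: app timeouts are 60s, the ALB is 61s, so after 62s
                # nothing new should be in flight to this pod.
                command: ["sleep", "62"]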


Yep, same configuration here other than we use 60/65/70 for (admittedly) completely unscientific reasons.



