
Perhaps not all tech companies care to push past human error in each postmortem (or even have a proper formal process at all), but some are known to do just that. Etsy and Google are among the well-documented cases.

> This idea of digging deeper into the circumstance and environment that an engineer found themselves in is called looking for the “Second Story”. In Post-Mortem meetings, we want to find Second Stories to help understand what went wrong.

— Blameless PostMortems and a Just Culture,

https://codeascraft.com/2012/05/22/blameless-postmortems/

> Blameless postmortems are a tenet of SRE culture. … Blameless culture originated in the healthcare and avionics industries where mistakes can be fatal. These industries nurture an environment where every "mistake" is seen as an opportunity to strengthen the system. When postmortems shift from allocating blame to investigating the systematic reasons why an individual or team had incomplete or incorrect information, effective prevention plans can be put in place. You can’t "fix" people, but you can fix systems and processes to better support people making the right choices when designing and maintaining complex systems.

— Site Reliability Engineering — Postmortem Culture: Learning from Failure

https://landing.google.com/sre/book/chapters/postmortem-cult...



Amazon is pretty strong on this front too; the most recent public example is the S3 outage and its postmortem [1].

> Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended. … We have modified this tool to remove capacity more slowly and added safeguards to prevent capacity from being removed when it will take any subsystem below its minimum required capacity level. This will prevent an incorrect input from triggering a similar event in the future.

[1] https://aws.amazon.com/message/41926/
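To make the two safeguards in that quote concrete, here is a rough Python sketch of what "remove capacity more slowly" and "never go below the minimum" could look like in a tool. This is only an illustration and not Amazon's actual implementation: the `Subsystem` type, the `remove_capacity` function, and the `max_per_call` limit are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


class CapacityGuardError(Exception):
    """Raised when a removal request would violate a safety limit."""


@dataclass
class Subsystem:
    # All fields are hypothetical; they only model the idea in the quote.
    name: str
    active_servers: int
    min_required: int  # floor below which the subsystem cannot operate


def remove_capacity(subsystem: Subsystem, count: int, max_per_call: int = 2) -> int:
    """Remove `count` servers, refusing requests that look like a bad input.

    Returns the number of servers still active after the removal.
    """
    if count <= 0:
        raise ValueError("count must be positive")

    # Safeguard 1: rate limit, so a mistyped large number cannot
    # take out a big slice of the fleet in a single command.
    if count > max_per_call:
        raise CapacityGuardError(
            f"refusing to remove {count} servers at once (limit {max_per_call})"
        )

    # Safeguard 2: never drop below the subsystem's minimum capacity.
    remaining = subsystem.active_servers - count
    if remaining < subsystem.min_required:
        raise CapacityGuardError(
            f"{subsystem.name}: {remaining} servers would be below "
            f"the minimum of {subsystem.min_required}"
        )

    subsystem.active_servers = remaining
    return remaining
```

The point is that the operator can still make the same typo, but the tool rejects it instead of acting on it, which is the "fix the system, not the person" idea from the quotes above.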


This was addressed very well by a video posted on the Piper Alpha story yesterday.

https://news.ycombinator.com/item?id=14732404

https://www.youtube.com/watch?v=S9h8MKG88_U



