"I think this is another good example of how we as an industry are still unable to adequately assess risk properly."
It is likely that what you mean by "properly" is impossible. At large enough scales, what you end up with is a Gaussian distribution of errors in accordance with the Central Limit Theorem... except that there's a Black Swan spike in the low-probability, high-consequence events, and you basically can't spend enough money to ever get rid of them. Ever. Even if you try, you just end up piling equipment and people and procedures which will, themselves, create the black swan when they fail.
I think you're trying to imply that if only they'd understood better, this could absolutely have been prevented. No. Some specific action would probably have been able to avert this but you simply don't have a 100% chance of calling those actions in advance, no matter how good you are.
The state space of these systems is incomprehensibly enormous and there is no feasible way in which you can get all the failures out of it, neither in theory nor in practice.
Living in terror of the absolute certainty of eventual failure is left as an exercise for the reader.
I didn't imply at all that all failures can be prevented. I'm saying that most people's assessments of risk are usually wrong. And a single point of failure taking down an entire system that was deemed low-risk seems to happen an awful lot.
It occurs not only in the technology industry but also in things like financial risk analysis. For example, people could mitigate the risk of a bond defaulting by buying a credit default swap. However, most people failed to assess the risk of their counterparty, like AIG or Lehman, going belly up. This failure in risk assessment is in large part why the financial crisis was so widespread.
Another striking example is the failure of Fukushima I and II. Basically, what happened was that they had a power failure (of the external line)! They thought that was somehow too unlikely to account for, which I really have trouble understanding. Isn't it obvious that multiple systems can fail because of some unaccounted-for external event? One that affects both on-site and external power? And in the Japanese case, the trigger was not even something you'd need much imagination for: an earthquake. Japan is sitting directly on one of the biggest geological faults on earth, and they don't account for a simultaneous power outage!
So, yes, many if not most people are very poor risk assessors.
Then again, this might rather be a capacity (or the lack thereof) of the organization within which the risk is assessed. This is what the software engineering quip "Most problems are people problems" means, I think. In some environments it is hard to bring up the unlikely, catastrophic scenarios without being seen as overly pessimistic and somehow not sufficiently committed to the success of the undertaking as a whole. So you assess risk over-optimistically in order to further your career, not in order to assess it accurately.
Pedantic footnote: the Fukushima Daiichi plant failure wasn't just a power failure: they had multiple backup diesel generators and batteries, and the diesels kicked in after the earthquake hit, the reactors tripped, and they lost the grid connection. The problem was that all the diesel generators and fuel were at ground level and the sea wall wasn't high enough to keep the tsunami from flooding them a few minutes later.
If they'd had a couple of gennies on the rooftops, they ... well, they wouldn't have been fine but they'd have had a fighting chance to keep the scrammed reactors from melting down. Or if they'd had a higher sea wall (like Onagawa) they'd have been fine.
So: not one, not two, but four power systems failed in order to result in the meltdowns -- and one of them would have worked if they had been located slightly differently.
(Otherwise, your point about people being poor risk assessors is spot-on. And worse: even if some people are acutely conscious of risk, once decision-making responsibility devolves to a committee, the risk-aware folks may be overruled by those who Just Don't See The Problem.)
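To put rough numbers on the "four power systems" point, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the probabilities are invented, not taken from any real plant data, and the function names are made up. It only compares what you get if you treat the backups as independent with what you get once a single shared event (say, a flood reaching everything at ground level) can take all of them out together.

```python
def p_all_fail_independent(p_each: float, n: int) -> float:
    """Probability that all n backups fail, assuming independent failures."""
    return p_each ** n


def p_all_fail_common_cause(p_each: float, n: int, p_common: float) -> float:
    """Probability that all n backups fail when one shared event can
    disable every backup at once.

    Either the common-cause event happens (probability p_common), or it
    does not and the backups still all fail independently.
    """
    return p_common + (1.0 - p_common) * p_each ** n


# Hypothetical numbers, chosen only to show the shape of the problem.
P_EACH = 0.01     # assumed chance that one backup fails on demand
N_BACKUPS = 4     # grid, diesels, batteries, ... as in the comment above
P_COMMON = 0.001  # assumed chance of an event that disables all of them

independent = p_all_fail_independent(P_EACH, N_BACKUPS)
with_common = p_all_fail_common_cause(P_EACH, N_BACKUPS, P_COMMON)

print(f"independent model:  {independent:.3e}")
print(f"common-cause model: {with_common:.3e}")
print(f"ratio:              {with_common / independent:.0f}x")
```

With these made-up inputs the common-cause term dominates completely: adding a fifth or sixth backup barely moves the result unless it is also protected from the shared event, which is the "located slightly differently" point.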
Yes, I know the full scenario was a bit more involved.
My main beef with their failure handling is actually this: you need to be able to face a situation where _all_ your smart emergency systems fail. In the case of an NPP this can mean an almost global environmental crisis, the need to relocate millions of people, hundreds of square kilometers made uninhabitable, etc. In that case you don't really want to rely on five generators on some roof, which may or may not work on that day.
And this is not something I'm making up here now. I remember discussing nuclear safety in high school, and the bottom line was: NPPs are OK, since they become subcritical when _everything_ fails, because the control rods slide down into the reactor vessel.
But after Fukushima I read that the situation there, with that specific model, is unfortunately different. Tough luck. That model still needs some cooling, because even the fully shut-down reactor still produces about 1% of its total power, and that is enough to bring the reactor into an 'undefined' state, iirc. And it is easy to imagine what that means for a station that has just been struck by an earthquake anyway.
My whole point is: your comment makes it appear as if the safety layers actually were plenty, and I would (respectfully, of course) disagree with that. I think they were poor.
That point is important: if NPPs aren't built safely, what is built safely then? My guess is: nothing.
So what to do? Design for failure. (Politically, technically, economically, can be applied everywhere.)
"So what to do? Design for failure. (Politically, technically, economically, can be applied everywhere.)"
Yup.
On a similar note, the French response to Fukushima Daiichi is rather interesting (France relies on nuclear generation for over 80% of its electricity):
"The ASN has also come up with an elegant technical solution to get around the (universal) dilemma of how to protect a plant from external threats, such as natural disasters. The report recommends that all reactors, irrespective of their perceived vulnerability, should add a 'hard core' layer of safety systems, with control rooms, generators and pumps housed in bunkers able to withstand physical threats far beyond those that the plants themselves are designed to resist."
(And a mobile emergency force that can move in and stabilize a reactor after an unforeseen catastrophic disaster that kills everyone on-site and destroys most of the safety systems.)
In other words, they now expect unpredictable Bad Things to happen and are trying to build a flexible framework for dealing with them, rather than simply relying on procedures for addressing the known problems.
": you need to be able to face a situation where _all_ your smart emergency systems fail. "
Totally agree with you here. Until recently, no nuclear power plant was designed to survive the failure of all its emergency systems. Hopefully, with the negative repercussions of Fukushima on the industry, engineers are rethinking their approach to nuclear power.