A Little Background
To fully understand what happened at Three Mile Island in March, 1979, we must go back to 1977. On September 24, 1977, the new Davis Besse nuclear plant near Toledo was going through its initial power escalation. Davis Besse used a pressurized water reactor supplied by Babcock & Wilcox, nearly identical to the Three Mile Island 2 reactor.
Figure 1. Davis Besse/TMI2 Schematic. Inverted loop in pressurizer line not shown.
In a pressurized water reactor, there are two main loops (Figure 1).
1) The primary loop (shown in green) circulates water through the reactor core, where it is heated up. This water is maintained at a very high pressure, 150 atmospheres, so it does not boil, which would deprive the reactor core of cooling.
2) The heated primary loop water is pumped through one or more (in this case, two) boilers, called steam generators, which transfer the heat to a secondary loop. This heat turns the incoming secondary loop feedwater (shown in blue) into steam (shown in red). This steam is then expanded through a turbine to produce electricity. The cooled primary loop water is pumped back to the reactor core.
The primary loop is equipped with a pressurizer, which is a large vertical surge tank. The tank is half filled with primary loop water, but it has a heater, which maintains a steam bubble above the water. This bubble serves as a cushion, absorbing changes in water level in a manner that won't overstress the plumbing. The usual reason for a change in pressurizer water level is a change in water temperature, since water expands/contracts as it heats/cools.
Here's a critically important detail, not shown in Figure 1. The line from the primary loop to the pressurizer dipped down and then up, like a sink drain. So any steam produced by boiling in the reactor pressure vessel (RPV) (which was never supposed to happen) could not get to the pressurizer. Instead the expanding steam would push water into the pressurizer.
The water level in the pressurizer performed another important function. There was no direct measurement of the water level in the core; but as long as the water level in the pressurizer was above a certain level, you could be sure that the core was completely full of water. At least that's what the manuals said and the training said and the simulator said. The operators were also told that, as the primary loop pressure increases, the water level in the pressurizer will go up, as the extra pressure squeezes down the pressurizer steam bubble. Pressure and water level move in the same direction. The simulator they trained on was programmed so that's what always happened.\cite{kemeny-1979}[p 50]
One final point. In the 1970's, reactor operators were drilled over and over: Do NOT let the pressurizer become completely filled with water. This is called ``going solid". If you let the pressurizer go solid while on the simulator, the simulator simply stopped.\cite{rogovin-1980}[p 104] Game over. You lose. And you can forget about that Reactor Operator license.
The Davis-Besse Potential Meltdown
Back to the Davis-Besse startup. At 9:34 PM, a valve used only during startup fails closed, depriving one of the two steam generators (SG2) of secondary loop water. The feedwater level in SG2 drops to a very low level. The control system senses something is wrong and starts a backup feedwater pump. But that pump fails to produce the pressure needed to inject water into SG2. SG2 is no longer properly cooling the primary loop water.
The primary loop starts to heat up. Primary loop water expands, increasing the loop pressure and pushing the pressurizer water level up. The relief valve on top of the pressurizer, called the pilot-operated relief valve (PORV), opens to relieve the extra pressure.
About 2 minutes into the event, the reactor operators don't like what they are seeing and scram the reactor. The reactor is now producing much less heat, the primary loop temperature drops, and the primary loop pressure drops. So far so good.
But the PORV fails to close as it should when the pressure goes down. The operators have no direct indication of this. They assume it is closed. Primary loop pressure drops quickly, since water and steam are blowing out through the open PORV. The control system correctly concludes this is not good: if the pressure drops too far, the primary loop will start to boil, and the core can no longer be properly cooled. About 3 minutes into the event, it starts up the emergency high pressure injection (HPI) pumps to push water into the primary loop.
This works, but the pressurizer water level rises quickly to nearly the top. The pressurizer is going solid. At 6 minutes in, the reactor operators stop the HPI pumps, as they were trained to do.
But something weird is happening. The pressurizer water level is indicating full and the primary loop pressure is way low. This is not supposed to happen. On the simulator, it can't happen. Worse, the pressure keeps dropping, but the water level does not. Then at 9 minutes in, the pressure suddenly stops dropping. Why? They've done nothing.
At about 10 minutes in, the shift supervisor, Mike Derivan, has a eureka moment. The pressure has stopped dropping because the water in the core has started to boil, and all those bubbles are pushing water into the pressurizer. Despite what he has been told over and over again, it is possible to have high pressurizer water level and low pressure. It is possible to have high pressurizer water level and boiling in the core. But why is this impossibility happening? The crew has no idea what to do.
At 23 minutes in, Derivan gets word that the pressure in the containment is increasing. The only way that can happen is if steam is blowing into the containment, and the only way that can happen is if he has a major leak. In a flash, Derivan knows where that leak is coming from. ``Close the PORV block valve", he shouts, which stops the leakage.\cite{gray-1982}[p 27]
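The contradiction Derivan spotted can be written down as a simple consistency check. The sketch below is illustrative only: the `Reading` fields, the function name, and the sample numbers are assumptions made up for this example, not plant data and not anything taken from the actual procedures or simulator. It encodes the trained rule (level and pressure move in the same direction) and flags a pair of readings that breaks it.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One snapshot of two control-room indications (hypothetical fields)."""
    minute: float
    pressurizer_level_pct: float   # 0 = empty, 100 = solid
    primary_pressure_atm: float


def violates_trained_rule(earlier: Reading, later: Reading) -> bool:
    """Return True when level and pressure moved in opposite directions.

    The operators were taught the two always move together, so True means
    the trained mental model no longer describes what the plant is doing.
    """
    d_level = later.pressurizer_level_pct - earlier.pressurizer_level_pct
    d_pressure = later.primary_pressure_atm - earlier.primary_pressure_atm
    return d_level * d_pressure < 0


# Made-up numbers that only mimic the shape of the event:
# level climbing toward full while pressure keeps falling.
before = Reading(minute=3, pressurizer_level_pct=50.0, primary_pressure_atm=110.0)
after = Reading(minute=6, pressurizer_level_pct=95.0, primary_pressure_atm=90.0)

print(violates_trained_rule(before, after))  # True
```

A check like this does not diagnose the cause. It only says the two indications disagree with the rule the crew was taught, which is the point at which the rule itself should be doubted.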
Finally they are back in control of the situation and can bring the plant to a safe shutdown. But their training is so strong that they still don't turn on the HPI pumps, which would have made the process quicker and safer. It turns out the water level in the core never got low enough to drastically overheat the fuel. No harm. No foul.
But clearly their instructions for how to handle such a situation, known as the Emergency Operating Procedures, need changing. They should not have shut down the HPI pumps. Everybody needs to know that the manuals are wrong; the training is wrong; and the simulator is lying. Time to get the word out. But that did not happen.
Don't Rock the Boat
Toledo Edison, the plant's owner, was not interested in publicizing a near meltdown. After all, the party line was that a meltdown and a large release are ``virtually inconceivable".\cite{walker-2004}[p 241] They settled for a minor-looking change in the Emergency Procedures which said something like ``Make sure you don't have a pressurizer leak like a stuck open PORV before stopping the high pressure injection pumps."\cite{derivan-2014}
B&W Response
One concerned engineer at B&W, Joe Kelly, distributed a polite memo, dated November 1, 1977, saying perhaps we should tell our customers about this. Kelly was in Engineering. The responsibility for operator training and writing the procedures was in Nuclear Services, headed by Don Hallman. Hallman blew him off. A Hallman underling, Frank Walters, sent Kelly a sarcastic memo saying the operators did the right thing; after all, everything came out OK.\cite{derivan-2014}[Ch 3] The memo displays a willful lack of understanding of what happened. There's a bit of a turf war going on.
In February, 1978, Engineering tried again. Bert Dunn, head of emergency core cooling, sent out a much stronger memo. Davis Besse was starting up at the time and operating at only 9% full power. Dunn pointed out, if the reactor had been at full power ``it is quite possible, indeed probable, that core uncovery and possible fuel damage would have occurred."\cite{kemeny-1979}[p 29] Dunn's memo included wording for the changes in the instructions that were needed, and ended with ``I believe this is a very serious matter and deserves our prompt attention and correction". A week later Dunn and his group followed up with another memo with a set of revised emergency procedures which they wanted to send out to all B&W plants.\cite{rogovin-1980}[p 94]
This time Hallman kicked the issue up to Bruce Karrasch, Manager, Plant Integration. Karrasch later testified that he thought the Hallman memo raised ``rather routine issues" and delegated someone in his unit to ``follow up and take any appropriate action". There the matter died. This sequence blows my mind. Forget about a basic human concern for safety. Karrasch, Hallman, and Walters were putting B&W in an untenable liability position.
NRC Response
Davis Besse was the responsibility of the NRC Chicago office. James Creswell, an inspector in this office, was truly concerned. He persistently pestered his bosses and NRC Washington about the issue, but got no response.\cite{rogovin-1980}[p 94] The official explanation from the ever polite Kemeny Report was a failure to communicate between the inspection side of NRC and the licensing side. But Creswell would not have been surprised. The first rule of bureaucratic advancement is ``Do not let shit flow uphill." Nobody was prepared to rock the boat. In a bureaucracy, you do not make yourself popular by saying there's something wrong with the system; and there's no penalty for remaining silent.
NRC had other evidence that their understanding of a leak in the pressurizer steam space was completely wrong. In 1977, a TVA engineer, Carlyle Michelson, on his own initiative, had done a study of what would happen in a B&W reactor if there was a small break in the top of the pressurizer.[1] He found the level would go up and the pressure would go down, which he pointed out could mislead operators to turn off the emergency pumps. His prescient report was sent to the NRC, which wrote a memo on it and filed it.\cite{rogovin-1980}[p 95]
Creswell persisted. He finally went around everybody. Without telling his bosses, he flew to Washington on his day off, paying for the ticket himself, and took his case directly to two of the NRC commissioners. These commissioners considered ``his complaint serious enough to merit further consideration".\cite{kemeny-1979}[p 55] This was March 22, 1979.
The Three Mile Island Repeat
Six days later, on March 28, 1979, the Three Mile Island, Unit 2, turbine went off line unexpectedly. During the reactor shut down, the TMI2 PORV likewise failed open; but there was no direct indication of this. The water level in the pressurizer went off-scale high. Following their training, the operators shut down the HPI pumps, just as Derivan's crew had. The difference was that TMI2 was running at nearly full power and had built up a complete inventory of fission products. By the time they recognized their mistake, the core had melted down. TMI2 was a multi-billion dollar write off.
Some say Creswell was too late; but I'm quite confident that, if TMI had not happened, Creswell's warning would have had the same fate as Michelson's; and Creswell's career at NRC would have been over, as Creswell himself recognized.
Bureaucracies like the NRC are artificial constructs that develop their own reward systems. These incentives need not have anything to do with societal welfare. And when this disconnect is revealed and the system screws up, there is no accountability; there are no penalties, as we will see in the next post in this series.
[1] As early as 1971, a Belgian engineer, H. Dopchie, figured out that a leak in the steam space of the pressurizer would result in both a water level rise and low pressure. He sent a letter to the AEC, asking if they had ``investigated the consequences of the event". The AEC sent a polite reply thanking him for his concern. In August, 1974, the Dopchie failure happened at a Westinghouse reactor at Beznau, Switzerland: loss of feedwater, primary loop temperature and pressure up. The PORV opens but fails to close, and pressure drops precipitously while the pressurizer water level remains way high. Fortunately, 3 minutes in, the operators figured out the PORV had failed open, and were able to get things under control. Westinghouse's internal investigation found that protection systems had ``performed properly". Since this was a European reactor, Westinghouse decided there was no need to inform the NRC of the event.
If anyone is curious why the Beznau operators were able to determine their PORV had failed open in only about 3 minutes (footnote 1), note that the Beznau PORV had a mechanical position indication in the control room. So all they had to do was look at it. That info is in the Rogovin Report. (Specific reference can be cited if needed).
In my experience as an Automation Engineer, PORVs tend to have a massive hysteresis. In other words, once they open at their setpoint pressure, they do not close until a much lower pressure is reached. Personally, I've seen a steam PORV open at 420 kPa and not close again until around 160 kPa.
And given that in that era it was not common for valves of any type to have position feedback sensors showing actual valve state in the control room, it was almost inevitable that at some point this scenario would happen.
Frankly, this design flaw should have been picked up in a decent HAZOP process. That it wasn't, and that the NRC then proceeded to try and pretend otherwise, should have ended some careers.
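For readers who have not met valve hysteresis before, the behavior described above can be sketched as a two-state model. This is a toy illustration, not a model of any real valve: the class and parameter names are made up for the example, and the only numbers used are the 420 kPa opening and roughly 160 kPa reseating pressures quoted above. It models normal hysteresis only, not a valve that sticks open as at Davis-Besse and TMI.

```python
class ReliefValve:
    """Relief valve with hysteresis: opens at one pressure, reseats at a lower one."""

    def __init__(self, open_kpa: float, reseat_kpa: float) -> None:
        if reseat_kpa >= open_kpa:
            raise ValueError("reseat pressure must be below opening pressure")
        self.open_kpa = open_kpa
        self.reseat_kpa = reseat_kpa
        self.is_open = False

    def update(self, pressure_kpa: float) -> bool:
        """Advance the valve state for a new pressure and return is_open."""
        if not self.is_open and pressure_kpa >= self.open_kpa:
            self.is_open = True
        elif self.is_open and pressure_kpa <= self.reseat_kpa:
            self.is_open = False
        return self.is_open


valve = ReliefValve(open_kpa=420.0, reseat_kpa=160.0)
trace = [(p, valve.update(p)) for p in (300, 420, 300, 200, 160, 300)]
print(trace)
# [(300, False), (420, True), (300, True), (200, True), (160, False), (300, False)]
```

Note that at 300 kPa the valve is closed on the way up and open on the way down. Pressure alone does not tell you the valve state, which is why a position indication in the control room matters.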