

Space Station Incident Demands Independent Investigation

A space expert warns NASA's safety culture may be eroding again


Russia's "Nauka" Multipurpose Laboratory Module is pictured shortly after docking to the Zvezda service module's Earth-facing port on the International Space Station, with the Brazilian coast 263 miles below. In the foreground is the Soyuz MS-18 crew ship docked to the Rassvet module on 29 July 2021.

NASA

This is a guest post. The views expressed here are solely those of the author and do not represent positions of IEEE Spectrum or the IEEE.

In a major International Space Station milestone more than fifteen years in the making, a long-delayed Russian science laboratory named Nauka automatically docked to the station on 29 July, prompting sighs of relief in the Mission Control Centers in Houston and Moscow. But within a few hours, it became shockingly obvious that the celebrations were premature, and that the ISS was coming closer to disaster than at any time in its nearly 25 years in orbit.

While the proximate cause of the incident is still being unravelled, there are worrisome signs that NASA may be repeating some of the lapses that led to the loss of the Challenger and Columbia space shuttles and their crews. And because political pressures seem to be driving much of the problem, only an independent investigation with serious political heft can reverse any erosion in safety culture.

Let's step back and look at what we know happened: In a cyber-logical process still not entirely clear, while passing northwest to southeast over Indonesia, the Nauka module's autopilot apparently decided it was supposed to fly away from the station. Although it was actually attached, with the latches on the station side closed, the module began using an attitude adjustment thruster to line itself up in preparation to fire its main engines. As the thruster fired, the entire station was slowly dragged askew as well.

Since the ISS was well beyond the coverage of Russian ground stations, and since the Soviet-era worldwide fleet of tracking ships and globe-circling network of "Luch" relay comsats had long since been scrapped, with replacements slow in coming, nobody even knew Nauka was firing its thruster until NASA finally detected a slight but growing shift in the ISS's orientation.

Nauka approaches the space station, preparing to dock on 29 July 2021. NASA

Within minutes, the Flight Director in Houston declared a "spacecraft emergency," the first in the station's lifetime, and his team tried to figure out what could be done to keep the ISS from spinning up so fast that structural damage could result. The football-field-sized array of pressurized modules, support girders, solar arrays, radiator panels, robotic arms, and other mechanisms was designed to operate in a weightless environment. But it was also built to handle stresses both from directional thrusting (used to boost the altitude periodically) and rotational torques (usually to maintain a horizon-level orientation, or to turn to a specific different orientation to facilitate arrival or departure of visiting vehicles). The juncture latches that held the ISS's modules together had been sized to accommodate these forces with a comfortable safety margin, but a maneuver of this scale had never been expected.

Meanwhile, the station's automated attitude control system had also noted the deviation and began firing other thrusters to counteract it. These too were on the Russian half of the station. The only US orientation-control system is a set of spinning flywheels that gently turn the structure without the need for thruster propellant, but they would have been unable to cope with the unrelenting push of Nauka's thruster. Later mass-media scenarios depicted teams of specialists manually directing on-board systems into action, but the exact actions taken in response remain unclear, and they were probably mostly if not entirely automatic. The drama continued as the station crossed the Pacific, then South America and the mid-Atlantic, finally entering Russian radio contact over central Europe an hour after the crisis had begun. By then the thrusting had stopped, probably when the guilty thruster exhausted its fuel supply. The sane half of the Russian segment then restored the desired station orientation.

The first private attempts to visualize the station's tumble from telemetry data, posted online, looked bizarre, with enormous rapid gyrations in different directions. Mercifully, the truth is that the ISS went through a simple long-axis spin of one and a half full turns, and then a half turn back to the starting alignment. The jumps and zig-zags were computational artifacts of the representational schemes used by NASA, which relate to the concept of "gimbal lock" in gyroscopes.
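For readers who want to see how a smooth turn can look like wild gyrations on paper, here is a minimal sketch. It is not NASA's telemetry format or the station's actual data: it assumes, purely for illustration, a yaw-pitch-roll (ZYX Euler angle) representation and a steady turn about a single axis chosen to pass through that representation's singularity. The function names are hypothetical.

```python
import math


def rot_y(theta_deg):
    """Rotation matrix for a turn of theta_deg about the y axis."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def euler_zyx(R):
    """Yaw, pitch, roll in degrees, assuming R = Rz(yaw) Ry(pitch) Rx(roll)."""
    # Clamp guards asin against tiny floating-point overshoot past +/-1.
    pitch = math.asin(max(-1.0, min(1.0, -R[2][0])))
    yaw = math.atan2(R[1][0], R[0][0])
    roll = math.atan2(R[2][1], R[2][2])
    return tuple(math.degrees(a) for a in (yaw, pitch, roll))


def angle_between(Ra, Rb):
    """Angle in degrees of the single rotation that takes attitude Ra to Rb."""
    # trace(Ra^T Rb) equals the sum of elementwise products of Ra and Rb.
    tr = sum(Ra[i][j] * Rb[i][j] for i in range(3) for j in range(3))
    return math.degrees(math.acos(max(-1.0, min(1.0, (tr - 1.0) / 2.0))))


if __name__ == "__main__":
    # Steady turn in 20-degree steps; the physical motion per step never changes.
    prev = rot_y(0.0)
    for theta in range(20, 181, 20):
        cur = rot_y(float(theta))
        yaw, pitch, roll = euler_zyx(cur)
        step = angle_between(prev, cur)
        print(f"turn={theta:3d}  yaw={yaw:7.1f}  pitch={pitch:6.1f}  "
              f"roll={roll:7.1f}  actual step={step:5.1f}")
        prev = cur
```

In this sketch the attitude at an 80-degree turn reads as roughly (yaw 0, pitch 80, roll 0), while at 100 degrees it reads as roughly (yaw 180, pitch 80, roll 180). Two of the three plotted angles leap by half a turn even though the object itself moved only 20 degrees, which is the kind of artifact a naive plot of such angles would show as a violent zig-zag.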

How close the station came to disaster is an open question. The flight director alluded to it humorously in a later tweet, saying he had never been so happy as when he saw on external TV cameras that the solar arrays and radiators were still standing straight in place. And any excessive bending stress along docking interfaces between the Russian and American segments would have demanded quick leak checks. But even if the rotation was "simple," the undeniably dramatic event has both short-term and long-term significance for the future of the space station. And it has antecedents dating back to the very birth of the ISS in 1998.


This, unfortunately, is when the human misjudgments began to surface. To calm things down, official NASA spokesmen provided very preliminary underestimates of how big and how fast the station's spin had been. These were presented without any caveat that the numbers were unverified, and the real figures turned out to be much worse. The Russian side, for its part, dismissed the attitude deviation as a routine bump in a normal process of automatic docking and proclaimed there would be no formal incident investigation, especially not one that would involve their American partners. Indeed, both sides seemed to agree that the sooner the incident was forgotten, the better. As of now, the US side is deep into analysis of induced stresses on critical ISS structures, starting with the most important ones, such as the solar arrays. Another standard procedure after this kind of event is to assess potential indicators of stress-induced damage, especially in terms of air leaks, and where best to monitor cabin pressure and other parameters to detect any such leaks.

The bureaucratic instinct to play down the potential severity of the event needs cold-blooded assessment. Sadly, past experience shows that this mindset of complacency and hoping for the best results from the natural human mental drift that sets in during long periods of apparent normalcy. Even if there is a slowly emerging problem, as long as everything looks okay from day to day, the tendency is to ignore warning signals as minor perturbations. The safety of the system is assumed rather than verified, and consequently managers are led into missing clues, or making careless choices, that lead to disaster. So these recent indications of this mental attitude about the station's attitude are worrisome. The NASA team has experienced that same slow cultural rot of assuming safety several times over the past decades, with hideous consequences. Team members in the year leading up to the 1986 Challenger disaster (and I was deep within the Mission Control operations then) had noticed and begun voicing concerns, without effect, over growing carelessness and even humorous reactions to occasional "stupid mistakes." Then, after imprudent management decisions, seven people died.

The same drift was noticed in the late 1990s, especially in the joint US/Russian operations on Mir and on early ISS flights. It led to the forced departure of a number of top NASA officials, who had objected to the trend that was being imposed by the White House's post-Cold War diplomatic goals, implemented by NASA Administrator Dan Goldin. Safety took a decidedly secondary priority to international diplomatic value. Legendary Mission Control leader Gene Kranz described the decisions that were made in the mid-1990s over his own objections, objections that led to his sudden departure from NASA. "Russia was subsequently assigned partnership responsibilities for critical in-line tasks with minimal concern for the political and technical difficulties as well as the cost and schedule risks," he wrote in 1999. "This was the first time in the history of US manned space flight that NASA assigned critical path, in-line tasks with little or no backup." By 2001-'02, the results were as Kranz and his colleagues had warned. "Today's problems with the space station are the product of a program driven by an overriding political objective and developed by an ad hoc committee, which bypassed NASA's proven management and engineering teams," he concluded.


By then the warped NASA management culture that soon enabled the Columbia disaster in 2003 was fully in place. Some of the wording in current management proclamations regarding the Nauka docking has an eerie ring of familiarity. "Space cooperation continues to be a hallmark of U.S.-Russian relations and I have no doubt that our joint work reinforces the ties that have bound our collaborative efforts over the many years," wrote NASA Administrator Bill Nelson to Dmitry Rogozin, head of the Russian space agency, on July 31. There was no mention of the ISS's first declared spacecraft emergency, nor of any dissatisfaction with the Russian contribution to it.

To reverse the apparent new cultural drift, and thus potentially forestall the same kind of dismal results as before, NASA headquarters or some even higher office is going to have to intervene. The causes of the Nauka-induced "space sumo match" of massive cross-pushing bodies need to be determined and verified. And somebody needs to expose the decision process that allowed NASA to approve the ISS docking of a powerful thruster-equipped module without the on-site real-time capability to quickly disarm that system in an emergency. Because the apparent sloppiness of NASA's safety oversight on visiting vehicles looks to be directly associated with maintaining good relations with Moscow, the driving factor seems to be White House diplomatic goals, and that is the level where a corrective impetus must originate. With a long-time U.S. Senate colleague, Nelson, recently named head of NASA, President Biden is well positioned to issue such guidance for a thorough investigation by an independent commission, followed by implementation of needed reforms. The buck stops with him.

As for Nauka's role in this process of safety-culture repair, it turns out that, by bizarre coincidence, a similar pattern was played out by the very first Russian launch that inaugurated the ISS program, that of the 'Zarya' module [called the 'FGB'] in late 1998. Nauka turns out to be the repeatedly rebuilt and upgraded backup module for that very launch, and the parallels are remarkable. The day the FGB was launched, on 20 November 1998, the mission faced disaster when the module refused to accept ground commands to raise its original atmosphere-skimming parking orbit. As it crossed over Russian ground sites, controllers in Moscow sent commands, and the spacecraft didn't answer. Meanwhile, NASA guests at a nearby facility were celebrating with Russian colleagues, and nobody told them of the crisis. Finally, on the last available in-range pass, controllers tried a new command format that the onboard computer did recognize and acknowledge. The mission, and with it the entire ISS project, was saved, and the American side never knew. Only years later did the story appear in Russian newspapers.

Still, for all its messy difficulties and frustrating disappointments, the U.S./Russian partnership turned out to be a remarkably robust "mutual co-dependence" arrangement, when managed with "tough love." Neither side really had practical alternatives if it wanted a permanent human presence in space, and they still don't—so both teams were devoted to making it work. And it could still work—if NASA keeps faith with its traditional safety culture and with the lives of those astronauts who died in the past because NASA had failed them.

Postscript: As this story was going to press, a NASA spokesperson responded to queries about the incident saying:

As shared by NASA's Kathy Lueders and Joel Montalbano in the media telecon following the event, Roscosmos regularly updated NASA and the rest of the international partners on MLM's progress during the approach to station. We continue to have confidence in our partnership with Roscosmos to operate the International Space Station. When the unexpected thruster firings occurred, flight control teams were able to enact contingency procedures and return the station to normal operations within an hour. We would point you to Roscosmos for any specifics on Russian systems/performance/procedures.
The Conversation (8)
Thomas Vasiloff, 08 Sep, 2021

Oberg, in his usual thorough and analytical manner, has sounded the alarm on another round of potential complacency on the part of those entrusted with the lives of our astronauts- on all sides. We MUST conduct a complete investigation into the incident to avoid any repetition of the Challenger/Columbia disasters. Before we even think about a return to the Moon and eventually Mars, the attitude and work ethic to achieve total success must be all pervasive.

Brian Bixby, 17 Aug, 2021

And why do the Russians have the **only** ability to communicate with the module, if they only can connect to it when it's overhead? Why were they unable to be routed through a NASA antenna somewhere else on Earth, or even on the ISS? I realize that it's more complex than downloading a file from MSDN, but if that capability doesn't exist today it damn well better before they try something like this again.

Which brings up the question: Can the automatically docking SpaceX modules be overridden?

John Mcbain, 12 Aug, 2021

As a long-time product safety engineer I know how difficult it can be to convince others (R&D engineers, marketeers, executive management) that safety and hazard mitigation are essential. Sometimes the phrase "A dead customer is not a repeat customer" works. However, a common mindset seems to be "So far, so good" (as the man said while falling off the Empire State Building).

Do we really need a disaster before we take action and implement appropriate safety processes - and even more important ensure all personnel really understand and internalize those processes? Perhaps we can point to all the disasters that did NOT occur to keep that safety culture alive, although proving a negative is problematic. An excellent and timely article, but now (not after the disaster) is the time for action!