DOS Glitch Nearly Killed Mars Rover
A software glitch that paralyzed the Mars "Spirit" rover earlier this year was caused by an unanticipated characteristic of a DOS file system, a NASA scientist said Monday.
The flaw, since fixed, was only discovered after days of agonizingly slow tests complicated by the limited "windows" of communication allowed by the rotation of Mars, said Robert Denise, a member of the Flight Software Development Team at NASA's Jet Propulsion Laboratory.
On Jan. 21, the Spirit rover stopped communicating with the teams on Earth and began rebooting itself over and over. After days of tests, the team discovered on Jan. 26 that the issue was tied to what was originally reported as corruption inside the rover's onboard flash memory.
In a presentation at the Hot Chips conference here, Denise said that the real issue was an embedded DOS file system whose directory structure kept growing. When the rover's embedded operating system then tried to mirror that structure from flash memory into RAM, the unexpectedly large file caused a fatal error and an almost continuous reboot cycle, he said.
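The failure mode Denise described can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the rover's flight software: the 32-byte figure is the size of a classic DOS/FAT directory entry, while the RAM budget and the names `boot`, `run` and `MirrorTooLarge` are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only; this is not the rover's flight software.
DIR_ENTRY_BYTES = 32       # size of one classic DOS/FAT directory entry
RAM_MIRROR_BUDGET = 4096   # hypothetical RAM reserved for the mirror, in bytes


class MirrorTooLarge(Exception):
    """Raised when the directory mirror does not fit in the RAM budget."""


def mirror_size(entry_count):
    """Bytes needed to mirror a directory holding entry_count entries."""
    return entry_count * DIR_ENTRY_BYTES


def boot(entry_count):
    """Simulate one boot: the directory mirror must fit in the RAM budget."""
    needed = mirror_size(entry_count)
    if needed > RAM_MIRROR_BUDGET:
        raise MirrorTooLarge(f"need {needed} bytes, have {RAM_MIRROR_BUDGET}")
    return needed


def run(entry_count, max_boots=5):
    """Retry booting up to max_boots times; return the number of failed boots.

    Nothing shrinks the directory between attempts, so once one boot fails,
    every later boot fails the same way. That is the reboot loop.
    """
    failures = 0
    for _ in range(max_boots):
        try:
            boot(entry_count)
            break  # boot succeeded, normal operation resumes
        except MirrorTooLarge:
            failures += 1  # the only response to the fatal error is a reboot
    return failures
```

With these assumed numbers, a directory of 128 entries boots cleanly, while one more entry pushes the mirror past the budget and every retry fails. Rebooting alone cannot break the cycle, because the oversized directory is still in flash on the next attempt.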
Aside from the flash memory error, the recent voyages of Spirit and Opportunity have gone far better than expected. The mission was originally funded to last 90 sols, the equivalent of 90 Mars days, and to end last April. (One sol equals 24.65 hours.) Both rovers have managed to stay "alive" far longer than anticipated, Denise said, and the current funding will run out on Sept. 13, the beginning of the "solar conjunction," when Mars disappears behind the Sun and out of radio range. How long either rover will last is not known, he said.
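For a sense of scale, the 90-sol figure can be converted to Earth days using the 24.65-hour sol quoted above. The helper name `sols_to_earth_days` is made up for this sketch.

```python
HOURS_PER_SOL = 24.65        # length of one sol, as quoted in the article
HOURS_PER_EARTH_DAY = 24.0


def sols_to_earth_days(sols):
    """Convert a duration in Mars sols to Earth days."""
    return sols * HOURS_PER_SOL / HOURS_PER_EARTH_DAY


# The originally funded mission length, expressed in Earth days.
primary_mission_days = sols_to_earth_days(90)
```

By this arithmetic, the original 90-sol mission works out to roughly 92.4 Earth days.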