Had a similar problem to this many years ago. Happened every 24 days approximately and lost one user setting. Had a logic analyser connected to it for days trying to reproduce the issue in some way. Went to go for a piss and get a coffee one afternoon and came back and there it was triggered!
What happened? Well it turns out there was a timer that no one used that overflowed and caused an interrupt which wasn’t handled any more, the interrupt handler fell through, caused a halt and the WDT fired fire rebooting it and some idiot hadn’t stored that one setting in the NVRAM.
So then we had more problems. 5000 things with EPROMs in that were rebooting every 24 days which were spread all over the planet. Many questions to ask over how the hell it ended up like that.
I hope people are asking these sorts of questions at Boeing.
Edit: also the source code we had did not match what was on the devices. Turned out the engineer who provided the hex file hadn’t copied that code to the file server and had left a year before hand. We didn’t find that until the WDT fired and piqued our interest and could reproduce it on the dev board because the software was different (should have checked that past the label on the ROM which was wrong!)
What happened? Well it turns out there was a timer that no one used that overflowed and caused an interrupt which wasn’t handled any more, the interrupt handler fell through, caused a halt and the WDT fired fire rebooting it and some idiot hadn’t stored that one setting in the NVRAM.
So then we had more problems. 5000 things with EPROMs in that were rebooting every 24 days which were spread all over the planet. Many questions to ask over how the hell it ended up like that.
I hope people are asking these sorts of questions at Boeing.
Edit: also the source code we had did not match what was on the devices. Turned out the engineer who provided the hex file hadn’t copied that code to the file server and had left a year before hand. We didn’t find that until the WDT fired and piqued our interest and could reproduce it on the dev board because the software was different (should have checked that past the label on the ROM which was wrong!)