I wrote assembler code for the 9900 when I worked at TI. I freaking loved that architecture: there was only one real register, and it was a pointer to the memory where the actual registers for the current thread were stored. This meant that a context switch only required changing that single master register. It was an absolutely glorious idea for writing multithreaded code.
And also slow as frozen dog crap because every register access was a RAM access. Sigh.
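For anyone who hasn't seen the trick, here's a rough toy sketch in Python of the idea as I described it. It is not an emulator of the real chip: the names (`ToyCPU`, `read_reg`, `write_reg`, `switch_context`) are made up for illustration, memory is word-addressed to keep it short, and I'm leaving out the program counter, status register, and the return info the real context-switch instruction also saved.

```python
NUM_REGS = 16  # registers per workspace in this toy model


class ToyCPU:
    """Toy model: the 'registers' are just a window into RAM."""

    def __init__(self, mem_words):
        self.mem = [0] * mem_words  # main RAM (word-addressed, a simplification)
        self.wp = 0                 # the single pointer to the current register block
        self.ram_accesses = 0       # counts RAM touches caused by register use

    def read_reg(self, n):
        # Reading register n is really a RAM read at wp + n.
        self.ram_accesses += 1
        return self.mem[self.wp + n]

    def write_reg(self, n, value):
        # Writing register n is really a RAM write at wp + n.
        self.ram_accesses += 1
        self.mem[self.wp + n] = value & 0xFFFF

    def switch_context(self, new_wp):
        # The whole register set "changes" by moving one pointer.
        old_wp = self.wp
        self.wp = new_wp
        return old_wp


cpu = ToyCPU(mem_words=64)
cpu.write_reg(0, 111)                   # thread A sets its R0
wp_a = cpu.switch_context(NUM_REGS)     # switch to thread B's workspace
cpu.write_reg(0, 222)                   # thread B sets its own R0
cpu.switch_context(wp_a)                # back to thread A
print(cpu.read_reg(0), cpu.ram_accesses)
```

Thread A's R0 survives the round trip without anything being copied, which is the glorious part. The access counter ticking up on every register read and write is the slow part.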