Sorry, I should have responded to this comment, but I wrote a separate response in the parent thread. I didn't feel the pdf / paper was really trying to mimic spiking biological networks in anything but the loosest sense (there is a sequence of activations and layers of "neurons"). I think the major contribution is just using the dot product of the output with its own transpose; the rest is just diffusion / attention on inputs. It's conceptually a combination of "input attention" and "output attention" using a kind of stepped recursive model.
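To make the idea concrete, here's a minimal sketch of what such a stepped model might look like: alternate a standard "input attention" pass with an "output attention" step that mixes features through the output's Gram matrix (an O-transpose-O dot product). This is purely my own illustration of the shape of the idea, not the paper's actual architecture; all names (`stepped_attention`, the weight matrices, the number of steps) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def stepped_attention(X, W_q, W_k, W_v, steps=3):
    """Hypothetical sketch: alternate standard input attention with an
    output Gram-matrix (O^T O) mixing step, repeated recursively."""
    O = X
    for _ in range(steps):
        # "input attention": scaled dot-product attention over the inputs
        Q, K, V = O @ W_q, O @ W_k, O @ W_v
        A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
        O = A @ V
        # "output attention": mix feature dimensions via the Gram matrix O^T O
        G = softmax(O.T @ O / np.sqrt(O.shape[0]))
        O = O @ G
    return O
```

The Gram-matrix step attends over feature dimensions rather than token positions, which is one plausible reading of "dot product on output transpose output."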
