"What if the plasticity of the connections was under the control of the network itself, as it seems to be in biological brains through the influence of neuromodulators?"
Anyone who wishes to explore this idea would do well to go back to the basics of neural nets and read Warren McCulloch's seminal papers on neural nets, from the 40s:
http://www.cse.chalmers.se/~coquand/AUTOMATA/mcp.pdf A Logical Calculus of the ideas immanent in nervous activity
http://vordenker.de/ggphilosophy/mcculloch_heterarchy.pdf A heterarchy of values determined by the topology of neural nets
(After having read those two papers, one can then try to make sense of Heinz von Förster's masterpiece, http://www.univie.ac.at/constructivism/archive/fulltexts/127..., Objects: Tokens for (Eigen-)Behaviors, which also bears some relevance to this matter. However, most people find it incomprehensible.)
Try anything citing von Förster's paper that was written by:
• Louis H. Kauffman, if you're mostly interested in the pure math aspect of it. He wrote numerous papers on the topic of von Förster's paper; you can find one of them here: https://arxiv.org/abs/1109.1892
• JM Stern and CAB Pereira if you're mostly interested in the application to statistics and fundamental questions of the epistemology of statistics. I wrote a thread on their works on my Twitter account at some point: https://twitter.com/no_identd/status/877883663014400000
Also, try this paper by Heinz von Förster (first order author) and Karl H. Müller (second order editor), where Müller basically took a lot of old papers by von Förster and rearranged them to give a more coherent view for certain types of readers:
Don't bother. Reading through the third link reveals that it's more akin to thoughts from an opiate-induced dream dressed in flowery language than actual science. Disappointing, really.
Very cool. It's interesting how powerful the recurrent network becomes with the addition of the learned Hebbian term. For context, even without the Hebbian term, recurrent networks can learn to learn to do quite interesting things (Hochreiter et al. 2001).
Shameless plug -- our lab recently ported LSTMs to spiking networks without a significant loss in performance, and showed that learning to learn works quite well even with spiking networks (Bellec et al. 2018).
So it seems like this method of learning to learn could provide an extremely biologically realistic and fundamental paradigm for fast learning. The addition of the Hebbian term fits neatly into this paradigm too.
It'd be interesting to compare this approach against a simpler baseline: setting a different (10-100 times higher?) learning rate for a fraction (10%?) of neurons in an LSTM.
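For concreteness, a minimal PyTorch sketch of that baseline. Everything here is illustrative, not from the paper: the 50x factor, the 10% mask, and the names fast and scale_rows are made up, and scaling gradients only matches a per-row learning rate exactly for plain SGD (Adam would largely normalize the scale away).

    import torch
    import torch.nn as nn

    hidden = 128
    lstm = nn.LSTM(input_size=32, hidden_size=hidden)

    # Pick a random 10% of hidden units to be "fast" learners.
    fast = torch.zeros(hidden, dtype=torch.bool)
    fast[torch.randperm(hidden)[: hidden // 10]] = True

    # LSTM weights stack the 4 gates along dim 0, so unit i owns rows
    # i, i + hidden, i + 2*hidden, i + 3*hidden.
    row_mask = fast.repeat(4)  # shape (4*hidden,)

    def scale_rows(factor):
        def hook(grad):
            g = grad.clone()
            g[row_mask] *= factor  # boost gradients for the fast units' rows
            return g
        return hook

    # Every LSTM weight/bias has 4*hidden rows/elements, so one mask fits all.
    for p in lstm.parameters():
        p.register_hook(scale_rows(50.0))

    # With plain SGD, a 50x gradient scale is exactly a 50x learning rate.
    opt = torch.optim.SGD(lstm.parameters(), lr=1e-3)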
Interesting... I mean the knockout experiment already does this (sort of), by setting the learning rate for 10% of the units artificially low. I'm not sure if inverting that manipulation is useful, but it might be fun to try.
Is the plasticity update guaranteed to reach equilibrium if the network is run on iid data (as in, do the H_ij values reach a fixed point)?
Edit: It seems like it should be reached eventually, since the equilibrium point is H_ij = y_i * y_j and the update keeps taking a weighted average of the former with the latter (this is not a proof, of course, as y_i * y_j keeps changing with each sample).
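To make that concrete, a quick numerical sketch of the argument, assuming the decaying-trace update H_ij <- (1 - eta) * H_ij + eta * y_i * y_j; the eta value and the toy iid activations are made up:

    import numpy as np

    rng = np.random.default_rng(0)
    eta = 0.05
    H = 0.0
    trace = []
    for _ in range(2000):
        y_i, y_j = rng.normal(size=2)        # iid activations
        H = (1 - eta) * H + eta * y_i * y_j  # weighted average, as noted above
        trace.append(H)

    # H fluctuates around E[y_i * y_j] (0 for independent zero-mean
    # activations), with a spread controlled by eta, rather than
    # settling at an exact fixed point.
    print(np.mean(trace[500:]), np.std(trace[500:]))

So under iid data the trace behaves like an exponentially weighted running estimate of E[y_i * y_j]: it converges in distribution around that mean, not to a single fixed point.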
So the "plastic component" of a connection strength is a thing which decays away exponentially, but is replenished whenever the two endpoints do the same thing.
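In code, that mechanism as I understand it would look something like this one-step sketch (the names w, alpha, H, eta are mine, and tanh stands in for whatever nonlinearity the paper actually uses):

    import numpy as np

    # One step of a toy "plastic" layer: each connection has a fixed weight w,
    # a learned plasticity coefficient alpha, and a Hebbian trace H that decays
    # exponentially and is replenished by coincident pre/post activity.
    # w, alpha, eta would be trained by backprop; H evolves within an episode.
    def plastic_step(x, w, alpha, H, eta):
        # x: (n,) presynaptic activity; w, alpha, H: (n, m); 0 < eta < 1
        y = np.tanh(x @ (w + alpha * H))          # effective weight = fixed + plastic
        H = (1 - eta) * H + eta * np.outer(x, y)  # decay, then replenish on co-activity
        return y, H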
I have heard that neuroscientists have an adage: "fire together, wire together". Is that all that ML people mean by "plasticity"?
But that's only because our edge cases and the machine vision edge cases don't match, since the underlying concepts are different.
And frankly, we probably don't want them to match. The main goal should be to make the cars drive better than humans on average, even if they're not perfect. And we know that there will always be edge cases.
Agreed, they should not match; however, I think it is also important that people trust the tech, which in turn requires it not to make any mistakes we would consider obvious. People are really bad at estimating risk, a problem which I think will be easier to overcome this way.
A system which makes exactly the same mistakes as a competent human but never gets tired, never uses the phone, has 360-degree vision with no blind spots… even that would be a huge improvement over the status quo. On the other hand, a system which crashes because it missed something Joe Average calls obvious when they see the black box pictures on the news… that system will never be trusted enough to replace human drivers, not even when it has a tenth of the fatality rate.