> Is there a need for a sort of middleman like a one arm bandit kind of thing th...

fancy_pantser · on April 5, 2021

I wonder if you thought of it as a type of optimal stopping problem locally on each node and an explore-exploit (multi-armed bandit) problem globally? For example, if each node knows to halt when it hits a [probably local] minimum, the results can be shared at that point and the best-performing models can be cross-pollinated, or whatever the mechanism is at that point. Since copying the models and continuing without gaining ground are both wastes of time, you want to dial in that local halting point precisely. An overseeing scheduler would record epoch-level results and make the decisions, of course.
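A rough sketch of how those two pieces could fit together, purely for illustration: I'm assuming patience-based early stopping as the local halting rule and UCB1 as the global bandit policy. The names `should_halt` and `ucb1_pick`, the thresholds, and the idea of using something like negative validation loss as the reward are all made up here, not taken from any real scheduler.

```python
import math

def should_halt(losses, patience=3, min_delta=1e-3):
    """Local rule (assumed): halt once the last `patience` epochs failed to
    beat the earlier best loss by at least `min_delta`."""
    if len(losses) <= patience:
        return False  # not enough history to judge a plateau
    best_before = min(losses[:-patience])
    best_recent = min(losses[-patience:])
    return best_recent > best_before - min_delta

def ucb1_pick(counts, mean_rewards, c=2.0):
    """Global rule (assumed): UCB1 over arms, where an arm is a model lineage
    and its reward is some score such as negative validation loss."""
    # Try every arm once before trusting the confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [
        mean + math.sqrt(c * math.log(total) / n)
        for mean, n in zip(mean_rewards, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

The scheduler would call `should_halt` on each node's epoch-level losses, and only when it returns `True` pay the cost of copying models, using `ucb1_pick` to decide which lineage gets the freed-up node next.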

klmadfejno · on April 6, 2021

Haha sorry, I meant multi-armed bandit, which I'd presume you're familiar with.

Although I guess a single-armed bandit would be something akin to the secretary problem.
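For reference, the classic secretary problem has a well-known stopping rule: look at roughly the first n/e candidates without committing, then take the first one that beats everything seen so far. A minimal sketch of that rule, where `secretary_pick` is just an illustrative name and the fallback to the last candidate is the usual convention when nothing beats the benchmark:

```python
import math

def secretary_pick(scores):
    """Return the index chosen by the 1/e stopping rule on a sequence of scores."""
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one candidate")
    cutoff = int(n / math.e)  # size of the observe-only phase
    benchmark = max(scores[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if scores[i] > benchmark:
            return i  # first candidate better than everything observed
    return n - 1  # forced to accept the last candidate
```

Whether that maps cleanly onto a training node deciding when to stop is a separate question, since the secretary setup assumes you can't go back to an earlier candidate.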