> Is there a need for a sort of middleman like a one arm bandit kind of thing that makes a decision to spawn and despawn child nodes to explore the space more?
What's the one-armed bandit? (Besides a slot machine.)
My knowledge of this field is rusty, but I actually wrote my MSc thesis on novel ways to get Genetic Algorithms to more efficiently explore the space without getting stuck, so it sounds up my alley.
I wonder if you thought of it as a type of optimal stopping problem locally on each node and explore-exploit (multi-armed bandit) globally? For example, if each node knows when to halt when it hits a [probably local] minima, the results can be shared at that point and the best-performing models can be cross-pollinated or whatever the mechanism is at that point. Since both copying the models and continuing without gaining ground are both wastes of time, you want to dial in that local halting point precisely. An overseeing scheduler would record epoch-level results and make the decisions, of course.
What's the one-armed bandit? (Besides a slot machine.)
My knowledge of this field is rusty, but I actually wrote my MSc thesis on novel ways to get Genetic Algorithms to more efficiently explore the space without getting stuck, so it sounds up my alley.