
In general this is of course an active area of research, but yes, you can do something like that, and people have done it successfully[1] by adding extra layers to an existing model and then continuing to train it.

You have to be careful about the "same data" part, though; ideally you want to train only once on unique data[2], as excessive duplication can harm the model's performance[3], although if you have limited data, a couple of training epochs might be safe and can actually improve performance[4].
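For the layer-adding part, here's a minimal sketch of SOLAR-style depth up-scaling[1], assuming a Hugging Face LLaMA-style model; the model id and the overlap size m are illustrative, not prescribed by the paper:

  import copy
  import torch
  from transformers import AutoModelForCausalLM

  model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
  layers = model.model.layers        # nn.ModuleList of decoder blocks
  n = len(layers)                    # e.g. 32
  m = 8                              # layers dropped from each slice

  # Concatenate two overlapping slices of the stack: the first n - m layers
  # and fresh copies of the last n - m layers, for 2 * (n - m) layers total.
  front = list(layers[: n - m])
  back = [copy.deepcopy(layer) for layer in layers[m:]]
  model.model.layers = torch.nn.ModuleList(front + back)
  model.config.num_hidden_layers = len(model.model.layers)

  # ...then continue pretraining the grown model as usual.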

[1] -- https://arxiv.org/abs/2312.15166

[2] -- https://arxiv.org/abs/1906.06669

[3] -- https://arxiv.org/abs/2205.10487

[4] -- https://galactica.org/static/paper.pdf



In addition to increasing the number of layers, you can also grow the weight matrices and initialize them by tiling with the smaller model's weights: https://neurips.cc/media/neurips-2023/Slides/83968_5GxuY2z.p...
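A sketch of what that tiling might look like, assuming the simplest scheme where the larger matrix just repeats the smaller model's weights along each dimension (function-preserving variants also rescale the copies, which is omitted here):

  import torch

  def grow_by_tiling(w_small: torch.Tensor, d_out: int, d_in: int) -> torch.Tensor:
      reps_out = -(-d_out // w_small.shape[0])   # ceiling division
      reps_in = -(-d_in // w_small.shape[1])
      return w_small.tile((reps_out, reps_in))[:d_out, :d_in].clone()

  w_small = torch.randn(512, 512)   # a weight matrix from the smaller model
  w_big = grow_by_tiling(w_small, 1024, 1024)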


Thank you for taking the time to provide me with all this reading.


This might be obvious, but just to state it explicitly for everyone: you can freeze the weights of the existing layers if you want to train only the new layers and leave the existing ones untouched.
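In PyTorch that's just a requires_grad toggle; a toy sketch, where the layer index is illustrative:

  import torch
  from torch import nn

  # Toy model: two "existing" layers plus one newly added layer.
  model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8), nn.Linear(8, 8))
  new_indices = {2}   # position of the layer we just added

  for i, layer in enumerate(model):
      for p in layer.parameters():
          p.requires_grad_(i in new_indices)

  # Only the unfrozen parameters go to the optimizer.
  opt = torch.optim.AdamW(
      (p for p in model.parameters() if p.requires_grad), lr=1e-4)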



