StackOverflow's making their own competing LLM for all this stuff.
IMO, one of the biggest problems with the way people use LLMs right now is that each model is treated as a single oracle: for it to know Java, it must be trained on examples of Java.
It would be much better if their natural-language comprehension were kept separate from their knowledge (and there are development efforts in this direction). In this example, the model would be trained to be able to read a Java tutorial rather than trained by actually reading one. Then, when the overall system is asked to write something in Java, the language model within it decides to open https://learnxinyminutes.com and combine the user query with the webpage.
I think this would help make the models more compact, which is a benefit all by itself, but it would also mean that knowledge could be updated much more easily.
Someone would have to actually build this to see whether those benefits are worth the extra cost of loading a potentially huge tutorial into the context window, and likewise how much a more compact training set degrades language comprehension.
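
To make that concrete, here's a rough Python sketch of the "combine the user query with the webpage" step. Everything named here (`fetch_reference`, `build_prompt`, the 8000-character budget) is made up for illustration, and a character cap is only a crude stand-in for a real token limit. No actual LLM API is called; the sketch only assembles the text that would be handed to one.

```python
from urllib.request import urlopen


def fetch_reference(url: str, timeout: float = 10.0) -> str:
    """Download a reference page as raw text.

    Hypothetical helper: it does no HTML cleanup, which a real system would need.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def build_prompt(user_query: str, reference_text: str,
                 max_reference_chars: int = 8000) -> str:
    """Combine the user query with as much reference text as the budget allows."""
    # Crude truncation: keep only the first max_reference_chars characters.
    trimmed = reference_text[:max_reference_chars]
    return (
        "Use the reference material below to answer.\n\n"
        f"--- reference ---\n{trimmed}\n--- end reference ---\n\n"
        f"Task: {user_query}"
    )
```

Usage would look something like `build_prompt("Write a Java method that reverses a string", fetch_reference("https://learnxinyminutes.com"))`. The truncation line is where the context-window cost shows up: anything past the budget is simply dropped, so a huge tutorial either gets cut off or needs a smarter selection step.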