How will ChatGPT handle new languages and technologies that don’t have a wealth of old human-generated content to train on? Is it smart enough to read the docs and figure it all out?
This is a use-case where Stack Overflow fails as well - oh so often I have a problem where Stack Overflow has a question whose answers were true some years ago for an earlier version but aren't true anymore, so the existing answers are worse than useless (as in, they're actively misleading and waste time), yet re-asking the question gets closed as a duplicate.
Being unable to clearly mark the bounds of an answer's relevance (in a structured way that affects search) is a major weak spot of SO.
I assume we'll eventually begin building languages and technologies around what chatbots can use. Like how we rebuilt American cities for cars. "How can Ford's cars work if there are no roads?" Build roads.
Short answer: yes. And if a library doesn't have any docs, you can just paste the header file into the chat box and ask ChatGPT to write example code for you.
Since LLMs came out, I’ve mostly been using ChatGPT to write AWS SDK-based Python scripts and infrastructure as code.
If it wasn’t trained on a specific newer API or CloudFormation/Terraform/CDK construct, then with 4/4o I just give it a link to the relevant documentation and tell it to use the link to help it create the right code.
I tried giving it a link once to a PDF full of cars manufactured back in the 80s and asked it to find the cheapest ones. It gave me answers, but I was able to manually find some cheaper ones in the list. So at the end of the day, I couldn’t trust it any more than my own eyes. And what I was asking was far more basic than writing code: order the list by price (lowest to highest) and give me the top 5 results.
The PDF had multiple columns, and while ChatGPT seemed to figure that part out, it couldn’t do the logic part. Had the data been in an easier format to deal with, I would have just used a spreadsheet.
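For what it's worth, the logic part that tripped ChatGPT up is a one-liner once the data is structured. A minimal sketch, using made-up listings since the original PDF isn't available:

```python
# Hypothetical car listings standing in for the PDF data.
cars = [
    {"model": "Civic", "price": 4200},
    {"model": "Corolla", "price": 3900},
    {"model": "Escort", "price": 2500},
    {"model": "Sentra", "price": 3100},
    {"model": "Camry", "price": 5600},
    {"model": "Tercel", "price": 2200},
    {"model": "Accord", "price": 4800},
]

# Sort lowest to highest by price, then slice off the top 5.
cheapest = sorted(cars, key=lambda c: c["price"])[:5]
for car in cheapest:
    print(car["model"], car["price"])
```

Which is exactly why "just have it extract the table, then sort it yourself" tends to beat asking the model to do the arithmetic in its head.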
ChatGPT struggles mightily with the simple task of ordering the presidents by the year they were born. It got the order wrong, the years wrong, and there were duplicates. I had to explicitly tell it to verify its sources on the web and to use Python.
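The "use Python" fix works because sorting in code is deterministic: wrong ordering and duplicates simply can't happen. A sketch with just the first five presidents as an illustration:

```python
# Birth years of the first five US presidents.
presidents = {
    "George Washington": 1732,
    "John Adams": 1735,
    "Thomas Jefferson": 1743,
    "James Madison": 1751,
    "James Monroe": 1758,
}

# Sort by birth year; a dict guarantees no duplicate names.
for name, year in sorted(presidents.items(), key=lambda kv: kv[1]):
    print(year, name)
```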
Yes, it only works within the context window - i.e. your session; it isn’t part of its permanent training data.
I purposefully chose something obscure that I knew was a new feature and wouldn’t be in the training data. Even then, I had to force it to search the web.
One of the ingredients that made GenAI possible was a massive quantity of public, relatively high-quality data (the internet).
We crowdsourced that over decades. If consumers transition to only interacting with AI agents in private, the corpus will fail to grow and update.