How will ChatGPT handle new languages and technologies that don’t have a wealth of old human-generated content to train on? Is it smart enough to read the docs and figure it all out?
This is a use-case where Stack Overflow fails as well - oh so often I have a problem where Stack Overflow has a question whose answers were true some years ago for an earlier version but aren't true anymore, so the existing answers are worse than useless (as in, they're actively misleading and waste time), yet re-asking the question gets closed as a duplicate.
Being unable to clearly mark the bounds of an answer's relevance (in a structured way that affects search) is a major weak spot of SO.
I assume we'll eventually begin building languages and technologies around what chatbots can use. Like how we rebuilt American cities for cars. "How can Ford's cars work if there are no roads?" Build roads.
Short answer: yes. And if a library doesn't have any docs, you can just paste the header file into the chat box and ask ChatGPT to write example code for you.
Since LLMs came out, I’ve mostly been using ChatGPT to write AWS SDK-based Python scripts and infrastructure as code.
If it wasn’t trained on a specific newer API or CloudFormation/Terraform/CDK construct, then with 4/4o I just give it a link to the relevant documentation and tell it to use the link to help it create the right code.
I tried giving it a link once to a PDF full of cars manufactured back in the 80s and asked it to find the cheapest ones. It gave me answers, but I was able to manually find some cheaper ones in the list. So at the end of the day, I couldn’t trust it any more than my own eyes. And what I was asking was far more basic than writing code: order the list by price (lowest to highest) and give me the top 5 results.
The PDF had multiple columns, and while ChatGPT seemed to figure that part out, it couldn’t do the logic part. Had the data been in an easier format to deal with, I would have just used a spreadsheet.
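For what it's worth, the logic part that tripped ChatGPT up is a one-liner once the data is structured. A minimal sketch, using made-up listings since the original PDF isn't available:

```python
# Hypothetical car listings standing in for the PDF data.
cars = [
    {"model": "Civic", "price": 4200},
    {"model": "Corolla", "price": 3900},
    {"model": "Escort", "price": 2500},
    {"model": "Sentra", "price": 3100},
    {"model": "Camry", "price": 5600},
    {"model": "Tercel", "price": 2200},
    {"model": "Accord", "price": 4800},
]

# Sort lowest to highest by price, then slice off the top 5.
cheapest = sorted(cars, key=lambda c: c["price"])[:5]
for car in cheapest:
    print(car["model"], car["price"])
```

Which is exactly why "just have it extract the table, then sort it yourself" tends to beat asking the model to do the arithmetic in its head.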
ChatGPT struggles mightily with the simple task of ordering the presidents by the year they were born. It got the order wrong, the years wrong, and there were duplicates. I had to explicitly tell it to verify its sources on the web and to use Python.
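The "use Python" fix works because sorting in code is deterministic: wrong ordering and duplicates simply can't happen. A sketch with just the first five presidents as an illustration:

```python
# Birth years of the first five US presidents.
presidents = {
    "George Washington": 1732,
    "John Adams": 1735,
    "Thomas Jefferson": 1743,
    "James Madison": 1751,
    "James Monroe": 1758,
}

# Sort by birth year; a dict guarantees no duplicate names.
for name, year in sorted(presidents.items(), key=lambda kv: kv[1]):
    print(year, name)
```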
Yes, it only works within the context window - i.e. your session; it isn’t part of its permanent training data.
I purposefully chose something obscure that I knew was a new feature and wouldn’t be in the training data. Even then, I had to force it to search the web.
One of the ingredients that made GenAI possible was a massive quantity of public, relatively high-quality data (the internet).
We crowdsourced that over decades. If consumers transition to only interacting with AI agents in private, the corpus will fail to grow and update.