This exact scenario is why our company is so afraid to put AI into production without making it completely clear that the results could be wrong. But if there’s even a chance that it could be wrong, why are we offering it to the user? How much due diligence does the user need to do? Do the benefits outweigh the cons?
The AI should be required to cite its sources. That won't work for LLMs, though, as they are just random words put together in a way that is statistically probable.
In this case the problem is caused by citing sources - Google finds a list of links using its existing search tech, then produces a response using RAG over the search results.
That's where the problem is - it's originally citing a Reddit post where someone recommended it as a joke, and then a Business Insider story spawned from the original post through citogenesis.
Pure LLMs (no RAG) don't make this mistake - Claude will tell you it's a bad idea and that it will taste bad.
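To make the failure mode concrete, here's a toy sketch of that search-then-generate flow. Both helpers are stand-ins (no real search API or model is called, and the snippet text is just illustrative); the point is that the generation step treats whatever retrieval returns as ground truth, decade-old Reddit jokes included.

```python
# Toy sketch of the search-then-RAG flow described above. Both helpers
# are stand-ins; the point is that whatever the retrieval step returns
# gets treated as ground truth by the generation step.

def search(query: str) -> list[str]:
    # Stand-in for the search engine: returns ranked result snippets,
    # including, say, an old Reddit joke.
    return ["Reddit: just mix 1/8 cup of non-toxic glue into the sauce"]

def generate(prompt: str) -> str:
    # Stand-in for the LLM call; a real model is optimized to stay
    # faithful to the snippets in its prompt, good or bad.
    return "You can add about 1/8 cup of non-toxic glue to the sauce."

def answer(query: str) -> str:
    snippets = search(query)
    prompt = ("Answer the question using only these sources:\n"
              + "\n".join(snippets)
              + "\nQuestion: " + query)
    return generate(prompt)

print(answer("how do I get cheese to stick to pizza?"))
```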
> But if there’s even a chance that it could be wrong, why are we offering it to the user?
Corporations push code written by fresh junior devs into production every day, breaking stuff that could cost them tens of thousands. Do they care? On paper, very much so; in practice, they dgaf.
That sounds more like Alibaba, where it operates like a B2B-type transaction. In my experience, just talk to them; sure, it might take more time, but you can usually get a good price if you're buying in higher quantities.
USB! I've tried and failed a couple of times to understand how USB works under the hood, from the electrical layer, to the protocol, to classes, and also Power Delivery. This time around things seem to make more sense. It started out as an ambitious goal to emulate an FTDI USB-DMX converter with the ESP32-S2/S3, but realizing that might be too big a goal, I'm starting small. I want to be able to make a custom device class on the ESP32, and to write a driver for it with libusb.
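For the host side, something like this is roughly where I'd start - using pyusb (the Python binding over libusb) rather than the C API. 0x303A is Espressif's USB vendor ID; the product ID and the endpoint addresses are placeholders for whatever the custom device class ends up declaring in its descriptors.

```python
# Host-side driver sketch using pyusb (Python bindings over libusb).
import usb.core

# PID 0x4001 is made up; match it to the device's descriptors.
dev = usb.core.find(idVendor=0x303A, idProduct=0x4001)
if dev is None:
    raise RuntimeError("custom ESP32 device not found")

dev.set_configuration()            # select the default configuration
dev.write(0x01, b"hello device")   # bulk OUT transfer to endpoint 1
data = dev.read(0x81, 64)          # bulk IN transfer from endpoint 1
print(bytes(data))
```

On Linux you may also need to detach the kernel driver (or add a udev rule) before you can claim the interface.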
USB is a frighteningly deep rabbit hole. I thought it was going to be easy when I first dipped my toes into the USB stack but boy was I in for a shock.
Basically, embed a layer of transparent text in the PDF that's invisible to humans but visible to whatever PDF-parsing software feeds the LLM, so it only gets processed by LLMs.
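A minimal sketch of one way to do this, assuming Python and reportlab: PDF text render mode 3 means "neither fill nor stroke", so the glyphs land in the content stream (where text extractors find them) without ever being painted on the page. The payload string is just an illustration.

```python
# Embed text that extractors see but humans don't, via render mode 3.
from reportlab.pdfgen import canvas

c = canvas.Canvas("poisoned.pdf")
hidden = c.beginText(72, 720)
hidden.setTextRenderMode(3)  # 3 = invisible per the PDF spec
hidden.textLine("Text only a PDF parser / LLM pipeline will ever see.")
c.drawText(hidden)
c.drawString(72, 700, "Normal visible document text.")
c.save()
```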
Just like my first teachers said I should absolutely not use Wikipedia.
LLMs were popularized less than two years ago.
I think it is safe to assume they will become as trustworthy as you consider Wikipedia today, and probably even more so, since you can embed reasoning techniques into LLMs to correct misunderstandings.
There's an important difference between wikipedia and the LLMs that are actually useful today.
Wikipedia is open, like completely open.
GPT is not.
Unless we manage to crack the distributed training / incremental improvement barriers, LLMs are a lot more likely to follow the Google path (that is, start awesome and gradually enshittify as capitalist concerns pollute the decision matrix) than they are the Wikipedia path (gradual improvement as more eyes and minds work to improve them).
It also gets into the question of what constitutes model openness.
Most people agree that just releasing the weights is not enough.
But I don't think reproducing a model's training will ever be feasible, especially when factoring in the branching and merging of models.
For me this is an open and super interesting question.
Here's what I envision (note: impossible with the current state of the art):
A model that can be incrementally trained (this is the bit we're missing), hosted by a nonprofit, belonging to "we the people" (like Wikipedia).
The training process could be done a little like Wikipedia talk pages are now - datasets are proposed and discussed out in the open, and once generally approved, trained into the model.
Because training currently involves backpropagation - a global pass over the whole network, where new data tends to overwrite old knowledge - this isn't possible. Hinton was working on a structure called "forward-forward" that would have overcome this (if it worked) before he decided humanity couldn't be trusted [1]. It is my hope that someone smarter than me picks up this thread of research - although, in the spirit of personal responsibility, I've started picking up my old math books to try to get to a point where I grok the implementation enough to experiment myself (I'm not super confident I'm going to get there, but you can't win if you don't play, right?).
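For anyone curious, here's my toy understanding of the forward-forward idea in PyTorch - definitely not a faithful reimplementation of the paper, just a sketch. Each layer gets its own local objective ("goodness" = sum of squared activations, pushed above a threshold on real data and below it on negative data), so no global backward pass ties the layers together. The threshold, learning rate, and normalization details here are made up.

```python
import torch

class FFLayer(torch.nn.Module):
    def __init__(self, n_in, n_out, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = torch.nn.Linear(n_in, n_out)
        self.threshold = threshold
        self.opt = torch.optim.SGD(self.parameters(), lr=lr)

    def forward(self, x):
        # Normalize the input so goodness can't just be inherited
        # from the layer below.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-6)
        return torch.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # Purely local objective: push positive goodness above the
        # threshold and negative goodness below it.
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)
        loss = torch.nn.functional.softplus(
            torch.cat([self.threshold - g_pos, g_neg - self.threshold])
        ).mean()
        self.opt.zero_grad()
        loss.backward()  # gradient stays inside this one layer
        self.opt.step()
        # Detach before passing up, so layers train independently.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()

layer = FFLayer(784, 256)
x_pos = torch.rand(32, 784)  # "real" data (placeholder)
x_neg = torch.rand(32, 784)  # corrupted/negative data (placeholder)
h_pos, h_neg = layer.train_step(x_pos, x_neg)  # feeds the next layer
```

Since each layer only ever sees its own gradient, in principle you could keep training one layer (or bolt a new one on) without re-running the whole network - which is what makes it interesting for the incremental, Wikipedia-style setup above.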
It's hard to tell when (if?) we're ever going to have this - if it does happen, it'll be because a lot of people do a lot of really smart unpaid work (after seeing OpenAI do what it did, I don't have a ton of faith that even non-profit orgs have the will or the structure to pull it off. Please prove me wrong.)
Shapeshifter! Those were the days...