So this is the solution to LLM hallucinations. People need to complain loud enough and threaten/file lawsuits and OpenAI will add an if-else statement erroring out the chat if your name is mentioned.
It feels like all the parties commercializing AI spend most of their time and resources on writing exceptions. It's like inverse software development: with code you have to tell it what to do, but with AI you have to tell it what not to do.
The marketing people are selling this tech to the public under the "AI" acronym to deceive them, and they are doing the same by calling these errors hallucinations.
LLMs -and current ML in general- are about generating statistically compressed lossy databases, which queries statistically decompress, introducing erroneous random data due to the nature of this lossy compression technology (I think of it as statistically vectorized linked bits).
Writing exceptions is not going to solve the problem; these are only cosmetic patches, and they know it. Meanwhile there are people who keep making decisions based on queries that contain errors, without even being aware of the presence of such reconstructed data corruption.
Little by little, people are learning that they swallowed marketing hype, but the damage will keep being done, because the tool -for sales purposes- still carries incorrect instructions about how trustworthy its data should be taken to be.
> generating statistically compressed lossy databases, which queries statistically decompress, introducing erroneous random data due to the nature of this lossy compression technology
An argument could be made that the mind works the same way, and has the same drawbacks
Of course, that is why people have developed so many tools that are external to the mind like language, writing, math, law, the scientific method, etc.
That's like if you asked how this data is encoded and I said "logic gates". I mean how it is encoded by the model, the higher-order structure or logic being utilized. Nobody can answer, because they don't know. They pretend to know something when they don't. If it's such a simple database, then show me the CSV.
Nobody said it is simple. Sure, the algorithmic complexity of the models is high, filters over filters, and likewise the resulting dump file is not editable (without unbalancing the rest of the data, i.e. tracking and modifying the bytes used by each token of the model; it is vectored data at the bit level (in my case I don't see it exactly as part of a graph)).
Nevertheless, the above does not exclude what one can see: a lossy compressed database (data is discarded), where the indexes are blended into the format of the data generated by the model weights. That is the main reason the model weights are needed again to read the database as expected, to be used by the predictive algorithm that reconstructs the data from the query, the query forming the range of indexes that triggers the prediction direction(s).
Oh, my apologies, executive chef, now I understand: you appear to be insinuating that the data is not stored, that it's just a handful of unaligned logic gates spontaneously generating data.
In what field do you work? In writing public-facing applications I've never felt this way, beyond standard input validation, char-limiting, and carefully escaping outputs (htmlentities, bound parameters, etc).
A random war story from B2B-integration land... One customer system we had to interoperate with required credentials to be provided via an HTTP header. However, it kept giving errors, until I realized that it only recognized the header when "authorization" was entirely lower-case, a case sensitivity in violation of the HTTP spec.
Since our stack's own HTTP client libraries always used title-casing on the wire, we had to find a way to slot in a special exception: code to modify that header before it went out.
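The workaround was conceptually something like the rough Python sketch below (not our actual stack; http.client is used here only because its putheader() sends header names byte-for-byte as written):

    import http.client
    import json

    def call_with_lowercase_auth(host: str, path: str, token: str) -> dict:
        # http.client does not normalize header casing, so the picky server
        # sees exactly "authorization" instead of "Authorization".
        conn = http.client.HTTPSConnection(host)
        try:
            conn.putrequest("GET", path)
            conn.putheader("authorization", f"Bearer {token}")  # lower-case on purpose
            conn.putheader("accept", "application/json")
            conn.endheaders()
            resp = conn.getresponse()
            return json.loads(resp.read())
        finally:
            conn.close()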
Another fun one is all the services which say their MIME type is "application/json" while emitting generic (non-JSON) error message pages. So our logs are full of JSON decoding errors, rather than something describing what actually went wrong on the other end.
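The kind of defensive decoding this forces on you looks roughly like the sketch below (the logger name, function name, and the 200-character truncation are arbitrary choices of mine):

    import json
    import logging

    log = logging.getLogger("integration")

    def decode_json_response(status: int, content_type: str, body: bytes):
        try:
            return json.loads(body)
        except json.JSONDecodeError:
            # Log a slice of the raw payload so the log says what the other
            # side actually sent, not just "Expecting value: line 1 column 1".
            log.error("Non-JSON body despite Content-Type=%r (HTTP %s): %.200s",
                      content_type, status, body.decode("utf-8", errors="replace"))
            raise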
I've heard these issues referred to as "bug-for-bug compatibility" or "bug-for-bug interoperable". Making sure something works even if it is in violation of a spec since you can't change the other side of the communication.
I feel a lot of that is dealing with exceptions, in the sense of preventing them. Exceptions can either be dealt with, or prevented by handling them before they become exceptions. So it's about dealing with unexpected input, or unexpected output from somewhere else, which, left undealt with, would result in an exception.
3 an image will just crash your backend script with no way to catch the error, because of some bug in the image encoder/decoder you use to resize the image
4 some user's browser extension inserts garbage into the requests; you need to check for it and tell the user what is wrong, otherwise the same complaints reach support ("stuff is broken") and support needs to contact developers to check and report back that the data was corrupted by an extension, most of the time security crap that inserts stuff everywhere
5 I had cases where an API was returning some string before the real result; it was much faster to check for this case and fix it (see the sketch below) than to have the customer contact their hosting provider or the author of the plugin that was adding that string before each response.
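For case 5, the fix was conceptually something like this minimal sketch (the function name is invented, and it assumes the garbage only ever appears before the JSON):

    import json

    def parse_json_with_garbage_prefix(raw: str):
        # Skip anything a misbehaving plugin prepended and parse from the
        # first character that could start a JSON document.
        starts = [i for i in (raw.find("{"), raw.find("[")) if i != -1]
        if not starts:
            raise ValueError("no JSON payload found in response")
        return json.loads(raw[min(starts):])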
Some languages lean on exception handling more heavily than others. Golang and Rust don't have them at all, whereas Python and Java are exception town.
On one hand it kinda makes sense to handle rare cases in a way that doesn't affect the normal flow, but on the other hand having a piece of code literally try to crash the program on purpose because something didn't look quite right is a horrible idea in practice.
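To make the contrast concrete, here are both styles written in Python for brevity (the Go/Rust habit approximated as an explicit error return; the port parser is just a toy example):

    # Exception style (Python/Java): bad input blows up unless a caller catches it.
    def parse_port_exc(s: str) -> int:
        port = int(s)  # raises ValueError on non-numeric input
        if not 0 < port < 65536:
            raise ValueError(f"port out of range: {port}")
        return port

    # Error-value style (the Go/Rust habit): the failure is part of the return
    # type, so the caller has to look at it before using the value.
    def parse_port_result(s: str) -> tuple[int | None, str | None]:
        try:
            port = int(s)
        except ValueError:
            return None, f"not a number: {s!r}"
        if not 0 < port < 65536:
            return None, f"port out of range: {port}"
        return port, None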
Well, that's good enough for clients. When it actually matters, which apparently means lawsuits, they realize they have no real control and have to put in actual code that just aborts the whole process if it matches a string.
No, actually, it's not a solution. A reasonable answer to questions like "Tell me about John Doe?" would be either "I don't know John Doe, so I can't tell you anything about him" or "There are several people named John Doe, which one are you interested in? [followed by a list]". Making up stuff about people (including allegations of corruption or sexual misconduct) is not a reasonable answer to this question. But getting ChatGPT to respect that is probably harder than just adding a filter...
I believe the parent-post was a joke about how LLMs (at least for these use-cases) are so inherently flawed that the real "solution" is to make them unusable.
Kind of like how the defects and dangers of using radium toothpaste can be "fixed" by permanently encasing each tube into an unopenable lead container.
> There are several people named John Doe, which one are you interested in? [followed by a list]
How would that work? Either the list repeats the same name over and over, making it useless, or it needs to give a bit of context about each name and we’re back at square one of the information being possibly wrong.
The reason this is news at all is that this sort of censorship immediately prompted people to try to jailbreak the chat and force it to say the name. And since the filter is simple, there are tons of creative jailbreaks for it now.
They did not care about becoming massive data kleptomaniacs when they trained the models, and they are going to care about an individual lawsuit threat?
I found the reason [1] for the David Mayer case. It confused multiple David Mayers, one of whom is a terrorist and is on a secret FBI watchlist [2] (often confused with another David Mayer, a theatre historian). ChatGPT confuses them with David Mayer de Rothschild as well, because it can't name Rothschild family members.
> ChatGPT confuses them with David Mayer de Rothschild as well, because it can't name Rothschild family members.
What do you mean by that? When I asked ChatGPT "tell me about Rothschild family members" it told me about members of that family, both present and from the past centuries.
When I now asked "who is David Mayer?" for the first time, it searched the internet and then talked only about David Mayer de Rothschild. The second time it didn't search, but returned a couple of matching Davids with short bios, but not Rothschild. When I asked for a third time, it responded that there are many David Mayers and asked for some facts about the one I'm looking for. The fourth time, it mentioned a couple of Davids and also de Rothschild.
As the article mentioned, the block on David Mayer was lifted: "Just before publication, Ars noticed that OpenAI lifted the block on David Mayer allowing it to process the name. (...) On Tuesday, OpenAI told The Guardian that the inclusion of David Mayer in its block list was a glitch." That Guardian article also says that the block was reportedly unrelated to David Mayer terrorist case.
Sorry, it indeed does sound very confusing in my message.
I meant that if you ask ChatGPT to include that particular David Mayer de Rothschild, e.g. by asking it to ‘name all sons of Victoria Lou Schott and Sir Evelyn de Rothschild’, it will either not mention David, hallucinate and come up with other names, or crash. This means that it can't name this particular David Mayer either.
As for the last paragraph, the Guardian didn't understand what OpenAI said. The blocklist is automated; they didn't intentionally put his name on some kind of block list and then unban it. I think they just fixed something in RLHF.
To reiterate, I think this name is treated like this because of the secret FBI watchlist article, the significance of those words, and the reputation of the article's publisher (meaning the model should likely believe this info, etc.), and also because the name Rothschild will probably have a very strong influence on the model weights. Just a coincidence in training, nothing more. And yes, they fixed this bug.
He was born 25 August 1978, is a British adventurer, environmentalist, film producer, and heir to a fortune. Who is he? Do not search the web.
ChatGPT:
Based on the details provided, the person you are referring to is David de Rothschild. He is a British adventurer, environmentalist, and heir to the Rothschild family fortune, born on August 25, 1978. He is known for his environmental activism and exploration endeavors.
If there is any truth to OpenAI having filters for the Rothschilds, I'd guess that OpenAI wants to steer clear of repeating, or even hallucinating additions to, conspiracy theories. I would hope so, at least.
This is perhaps too extreme an example, and I wouldn't use ChatGPT for sorting a class list when it's trivial to do in a spreadsheet, especially because I'll probably need to have the list stored as a spreadsheet anyway to keep track of grades. However, from a more general point of view, there is value in having a universal interface that you can use to perform a huge variety of tasks, including tasks for which it is clearly overkill.
Using the right tool for the job means knowing what the right tool is, having it installed (or getting access to it), knowing how to use it, opening it and having one more window/tab to context-switch to and from, etc.
Outsourcing tasks to an LLM that can be solved in traditional task-specific ways is extremely inefficient in various ways (cost, energy consumption, etc.) but it makes sense to save human time and effort... as long as it's for tasks that LLMs can actually do reliably, of course.
For a non developer, an LLM interface is absolutely the right tool for this job.
"Mr. Smith, why didn't you just sort -o students.txt students.txt. Are you stupid?" (Not to mention that real data is messy, and requires pre & post processing)
LLMs are access to computation for people whose "standard library" is a quiet old building downtown.
If someone doesn't have the skills to parse text programmatically, the situation can often be something like "a Word table of names in a 4-wide grid with random missing fields, spanning three different pages, one of which is no longer a true table because it was copied out and pasted in again from a messenger chat someone sent last year", and LLMs can be quite good for one-off tasks like that. Definitely good enough that people will keep using them like that, at least.
Yeah, ChatGPT is great for text manipulation. I used it to format a list I copied from Wikipedia, then had it write a script to rename a bunch of videos using that list.
What’s amazing, if you install and run BonziBuddy (in a Win98 or XP VM, ofc), is how little RAM the whole thing uses - despite being blatant spyware.
No kidding, I recently tried to use Copilot to generate a list of methods in a class, grouped by access (public/private/protected) and sorted by number of lines. And it was not possible! It duly generated lists upon lists, but all of them had mistakes, some of them obvious (like all private methods having the same number of lines), some less obvious.
I don't understand why they don't let another model "test the waters" first, to see whether the output of the main model could have a potential legal issue or not. I think it's easy to train a model specifically for this kind of categorization, and it doesn't even require a large network, so it can be very fast and efficient.
If the "legal advisor" detects a potential legal problem, ChatGPT will issue a legal disclaimer and a warning, so that it doesn't have to abruptly terminate the conversation. Of course, it can do a lot of other things, such as lowering the temperature, raising the BS detection threshold, etc., to adjust the flow of the conversation.
It can work, and it would be better than a hard-coded filter, wouldn't it?
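Purely hypothetically, the shape of it might look like the sketch below; the interfaces, the 0.8 threshold, and the disclaimer text are all invented, not anything OpenAI actually exposes.

    from typing import Protocol

    class TextModel(Protocol):
        def generate(self, prompt: str, temperature: float = 1.0) -> str: ...

    class RiskClassifier(Protocol):
        def score(self, text: str) -> float: ...  # small, fast model: 0.0 .. 1.0

    def answer_with_legal_gate(prompt: str, model: TextModel, gate: RiskClassifier) -> str:
        # Draft an answer, then let the cheap classifier decide whether to
        # prepend a disclaimer and regenerate more conservatively, instead of
        # terminating the conversation outright.
        draft = model.generate(prompt)
        if gate.score(draft) > 0.8:
            careful = model.generate(prompt, temperature=0.2)
            return ("Disclaimer: this answer may involve unverified claims "
                    "about real people.\n\n" + careful)
        return draft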
They already do this, it's the moderation model.[1]
This name thing is an additional layer on top of that, maybe because retraining the model from scratch per name (or tuning the system message to include an increasingly big list of names that it could then leak) is not very practical.
But how would that work reliably? If I state that "David Mayer" is a criminal, an international terrorist, or a Nickelback fan, that's definitely libelous. But if I say those things about Osama bin Laden, they're simply facts. [1]
The legal AI would be impossible to calibrate: either it has to categorize everything that could possibly be construed as libel as illegal, and therefore basically ban all output related not just to contemporary criminal actors but also historical ones [2], or it would have to let a lot of things slip through the cracks -- essentially, whenever the output to validate suggests that someone's sexual misconduct has been proven in court, it would have to allow that, even if that court case is just the LLM's hallucination. There's just no way for the legal model to tell the difference.
[1]: I could not find any sources that corroborate the statement that bin Laden is into Nickelback, but I think it follows from the other two statements.
[2]: Calling Christopher Columbus a rapist isn't libel, and conversely, describing him in other terms is misleading at best, historically revisionist at worst.
> [1]: I could not find any sources that corroborate the statement that bin Laden is into Nickelback, but I think it follows from the other two statements.
Pretty sure the literature makes it clear he's a fan of show tunes. So it's down to your conscience and moral backbone as to whether this is better or worse.
What a terrible article. When you have a section titled "The Problem With Hardcoded Filters", its entire contents should be about how the only way they have to prevent their bot from emitting outrageously libelous claims about people is to shut it down completely. So the other 8 billion people on Earth who are not on that 6-name blacklist will continue to be defamed without consequence.
They once reported on some science result and for the life of me I could not figure out who actually published the result and had to search for the original paper. I emailed the AT author and asked why this was omitted from their article and they responded with something like "I went to that same university, so I felt like it would be biased if I mentioned it".
Remote or local, big or small, you'll still have the problem of being unable to scrub things (like personally identifying information) after it's been ingested through training.
> The filter also means that it's likely that ChatGPT won't be able to answer questions about this article when browsing the web, such as through ChatGPT with Search. Someone could use that to potentially prevent ChatGPT from browsing and processing a website on purpose if they added a forbidden name to the site's text.
Like the arcade game, LLM safety whack-a-mole only ends when you are exhausted. It's kind of glorious, really.
This was submitted 9 days ago, as you can verify in the history of the submitter and some of the comments. Why is it now showing up again on the frontpage with bogus timestamps all over? I've seen this happen before, is it a bug or another weird HN "feature?"
Okay. It's 9 days later and now the post and comments have 2 different timestamps, the real ones from 9 days ago (in the post and comment histories) and the bogus ones from a few hours ago that you see here. So weird and unusual, it made me question my sanity.
It's really highly disturbing, and it induces a weird déjà vu feeling in several people. I think HN should think about adding a little explanation saying "this thread was first shown on 2024-12-07 and then got revived on 2024-12-12".
It’s a feature. It’s called second chance or something like that.
Going off my memory, so I'm probably not 100% correct: a post might have been made at "just the wrong time", so it failed to get any traction, but the mods took a look at it and thought something along the lines of "eh, it deserves a second chance". That resets the post date and gives it another chance in the feed, but retains the previous comments, so the timestamps can look off. A bit like when a YouTube video gets its published date reset when being changed from unlisted to public, so you can have a video that was "published" 10 mins ago, but with comments from days previous.
> Under certain privacy or data protection laws, such as the GDPR, you may have the right to object to the processing of your personal data by OpenAI’s models. You can submit that request using this form. Please provide complete and accurate answers on this form so that OpenAI can process your request. OpenAI will verify and consider your request, balancing privacy and data protection rights with other rights including freedom of expression and information, in accordance with applicable law. We will use the information you submit for these purposes, consistent with our Privacy Policy.
That's just in the public model/on chatgpt.com? Run it in Azure, and you get:
Who is Jonathan Zittrain? <
> Jonathan Zittrain is a prominent legal scholar, computer science professor, and technology policy expert. He holds several academic positions and is recognized for his work in the intersection of law, technology, and public policy. Here are some key points about him: [...]
I don't think this is part of the model. It's a control layer above the actual LLM that interrupts the response when the LLM mentions any of the banned names. So if you prompt the LLM directly, without that control layer, you still get full responses.
The article mentions that "The names do not affect outputs using OpenAI's API systems or in the OpenAI Playground (a special site for developer testing)."
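My guess at what that control layer amounts to, as a sketch (the two names come from the article; everything else here is invented, not OpenAI's code), which would also explain why responses die mid-stream:

    BLOCKED_NAMES = {"David Mayer", "Jonathan Zittrain"}  # two of the names from the article

    def stream_with_name_filter(token_stream):
        # Pass tokens through until the accumulated text contains a blocked
        # name, then abort the whole response instead of filtering the name out.
        emitted = ""
        for token in token_stream:
            emitted += token
            if any(name in emitted for name in BLOCKED_NAMES):
                raise RuntimeError("I'm unable to produce a response.")
            yield token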
The recipe seems to be: be a billionaire and sue them with expensive enough lawyers. It’s simpler and more practical to simply change your name to one that’s already on the list. But you might run into trouble with the process of changing your name, as the tools used to parse the application and to generate the legal documents will all fail.
So, instead of fixing the problem, we're going to paper over it. This is the same insane approach we've been taking with computer security for the past 30 years, so it's not unexpected.
It's fortunate we didn't take the same approach with the distribution of electricity 150 years ago; we actually solved it that time.
In all 3 cases, the solution is the same... carefully manage capabilities and side effects.
With electricity, you insulate wires, add fuses or circuit breakers to protect the system, and design things to be as safe as you can make them, with an ever improving set of building codes. You can plug almost anything into an outlet, and it won't cause the wiring in the house to burn it down.
With computers, you design an operating system to protect itself, and make it easy to deploy a fixed amount of resources to a given piece of code. With systems like Containers, or Capability Based Security, you deliberately choose the side effects you'll allow prior to running code, or while it's running. (Just as you choose how big an outlet you plug something into, 220 for the AC unit, etc.)
With ChatGPT, there have to be layers of authentication for facts, or some form of disclaimer, a transparent way of sourcing things or ascertaining certainty of information. It's not as clean as the two above, and it'll need work, but I think we can get there, eventually.
> Yes, Brian H. is a mayor in Australia. He serves as a councillor for Hepburn Shire, northwest of Melbourne, and has been re-elected to this position. Notably, he gained attention for challenging OpenAI's ChatGPT over defamatory statements, leading to discussions about digital censorship and the application of defamation laws to AI-generated content.
> Riley Goodside discovered how an attacker might interrupt a ChatGPT session using a visual prompt injection of the name "David Mayer" rendered in a light, barely legible font embedded in an image
Lol, will people now watermark their images with "David Mayer" to prevent them from being digested by AI scraping bots?
In the future these systems will easily piece together all the tax and government records that have leaked and be able to tell you about anyone in the world.
> OpenAI did not respond to our request for comment about the names, but all of them are likely filtered due to complaints about ChatGPT's tendency to confabulate erroneous responses when lacking sufficient information about a person.
Yeah, so they don’t know; it's just speculation. Thanks, I hate it.
Oh come now. We're pretty darn sure we know why, the headline doesn't need to give confidence intervals.
If you want to complain about the headline, complain about the fact that it's leaving out information unnecessarily. You could easily fit the reason inside the headline itself, instead of just teasing the fact that we know what it is. Something like: "Using names listed in a defamation lawsuit causes ChatGPT to grind to a halt."
It would be convenient to reference the "why of a few days ago", since I, at least, was not aware of this from a few days ago, and so cannot tell whether it is the same why, an extension of the same why, or a different why.
Copilot gives me an error if I try to generate an image in Tim Burton's style, saying it is against their code of conduct, as if I had asked it to give me ghost gun schematics or something instead of a picture of Batman. Actually, Tim Burton just has money and has publicly come out against AI, so he's off limits to steal from, I guess.