He claims it was poking fun. The court found differently.
> Bendels claimed the meme, posted by his newspaper's X account, was satirical.
> But the judge in the case said during the verdict that Bendels published a 'deliberately untrue and contemptuous statement about Interior Minister Ms. Faeser (...) that would not be recognizable to the unbiased reader and is likely to significantly impair her public work'.
If a picture of Nancy Faeser holding a "I hate free speech" sign can be ruled to be a "deliberately untrue and contemptuous statement", satire has become effectively illegal.
Supermarkets sell promotional space and, in some cases, access just to have products appear in store at all, either through discounts on the wholesale price or by straight up charging for it. They absolutely tilt things in favour of their own brands, and some supermarkets don't stock any non-house brands in certain categories.
Spotify has discovered there is a big market for music where the quality isn't that important and they can serve it themselves. Same as supermarkets do with many products.
You are right about supermarkets charging for shelf space in various ways, and to add to that, I think it's even worse. I've heard that supermarkets and other retailers dub certain brands "category leaders" for certain products and basically give them control of the entire shelf space of their category, including their competitors' space: which products and varieties get stocked, how much, and placement. That brand is then in charge of maximizing the profitability of "its" section. I'm not sure how that isn't an antitrust problem, but...
The difference is that gambling increases the overall variance of outcomes to an individual whereas insurance decreases it. I wouldn't say insurance is a form of gambling but that they are two ends of the same spectrum. The difference in intention is a clear distinction to me.
I would expect minute-by-minute coverage with a lot of pictures, maps, estimates of how much time they need to reach Moscow etc.
This existed, it just didn't exist in the traditional media. Look in the right discord servers and there were new videos being posted every 5 minutes as the convoy was moving around.
I do work with many types of sources - a habit developed back when I was doing this professionally for some state actors. The social media coverage gave the impression of a continuous data stream, but the information density was really low once you filtered it.
No, it did not “double-check”—that’s not something it can do! And stating that the cases “can be found on legal research databases” is a flat out lie.
What’s harder is explaining why ChatGPT would lie in this way. What possible reason could LLM companies have for shipping a model that does this?
It did this because it's copying how humans talk, not what humans do. Humans say "I double checked" when asked to verify something, that's all GPT knows or cares about.
It was given a sequence of words and tasked with producing a subsequent sequence of words that satisfy with high probability the constraints of the model.
It did that admirably. It's not its fault, or in my opinion OpenAI's fault, that the output is being misunderstood and misused by people who can't be bothered understanding it and project their own ideas of how it should function onto it.
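To make "producing a subsequent sequence of words with high probability" concrete, here is a minimal sketch of the sampling step. The probability table is invented purely for illustration; in a real LLM those numbers come from a neural network conditioned on the whole preceding text, but the key point is the same: nothing in this step consults a source of truth.

import math
import random

# Toy next-token logits; these values are made up for the example.
next_token_logits = {"checked": 2.0, "verified": 1.2, "guessed": 0.1}

def sample_next_token(logits, temperature=1.0):
    """Sample one token from a softmax over logits.
    Lower temperature concentrates probability on the top token;
    higher temperature makes output more random. No fact-checking
    happens anywhere in this function -- it only samples."""
    scaled = {tok: l / temperature for tok, l in logits.items()}
    z = sum(math.exp(v) for v in scaled.values())
    probs = {tok: math.exp(v) / z for tok, v in scaled.items()}
    r = random.random()
    cumulative = 0.0
    for tok, p in probs.items():
        cumulative += p
        if r < cumulative:
            return tok
    return tok  # fallback for floating-point edge cases

print("I double-" + sample_next_token(next_token_logits, temperature=0.7))

So when the model says "I double-checked", it is because that continuation is probable in context, not because any checking took place.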
This harks back to around 1999 when people would often blame computers for mistakes in their math, documents, reports, sworn filings, and so on. Then, a thousand different permutations of "computers don't make mistakes" or "computers are never wrong" became popular sayings.
Large Language Models (LLMs) are never wrong, and they do not make mistakes. They are not fact machines. Their purpose is to abstract knowledge and to produce plausible language.
GPT-4 is actually quite good at handling facts, yet it still hallucinates facts that are not common knowledge, such as legal ones. GPT-3.5, the original ChatGPT and the non-premium version, is less effective with even slightly obscure facts, like determining if a renowned person is a member of a particular organization.
This is why we can't always have nice things. This is why AI must be carefully aligned to make it safe. Sooner or later, a lawyer might consider the plausible language produced by LLMs to be factual. Then, a politician might do the same, followed by a teacher, a therapist, a historian, or even a doctor. I thought the warnings about its tendency to hallucinate were clear, the ones displayed the first time you open ChatGPT. To most people, I believe they were.
I just went to the ChatGPT page, and was presented with the text:
"ChatGPT: get instant answers, find creative inspiration, and learn something new. Use ChatGPT for free today."
If something claims to give you answers, and those answers are incorrect, that something is wrong. Does not matter what it is -- model, human, dictionary, book.
Claiming that their purpose is "to produce plausible language" is just wrong. No one (except maybe AI researchers) says: "I need some plausible language, I am going to open ChatGPT".
When you first use it, a dialog says “ChatGPT can provide inaccurate information about people, places, or facts.” The same is said right under the input window. In the blog post first announcing ChatGPT last year, the first limitation listed is about this.
Even if the ChatGPT product page does not specifically say that GPT can hallucinate facts, that message is communicated to the user several times.
About the purpose, that is what it is. It's not clearly communicated to non-technical people, you are right. To those familiar with the AI semantic space, "LLM" already tells you the purpose is to generate plausible language. All the other notices, warnings, and cautions point casual users to this as well, though.
I don't know… I can see people believing that what ChatGPT says is factual. I definitely see the problem. But at the same time, I can't fault ChatGPT for this misalignment. It is clearly communicated to users that facts presented by GPT are not to be trusted.
Producing plausible language is exactly what I use it for - mostly plausible blocks of code, and tedious work like rephrasing emails, generating docs, etc.
Everything it creates needs to be reviewed, particularly information that is outside my area of expertise. It turns out ChatGPT 4 passes those reviews extremely well - obviously too well given how many people are expecting so much more from it.
That's a different error context I think. It's a mistake if the model produces nonsense, because it's designed to produce realistic text. It's not a mistake if it produces non-factual information that looks realistic.
And it fundamentally cannot always produce factual information; it doesn't have that capacity (but then, neither do humans, and with the ability to source information this statement may well become obsolete soon enough).
Though I wouldn't go so far as to say that the model cannot make mistakes - it clearly is susceptible to producing nonsense. I just think expecting it to always produce factual information is like using a hammer to cut wood and complaining the wood comes out all jagged
Indeed, I intended to imply that a model cannot err in the same way a computer cannot. This parallels the concept that any tool is incapable of making mistakes. The notion of a mistake is contingent upon human folly, or more broadly, within the conceptual realm of humanity, not machines.
LLMs may generate false statements, but this stems from their primary function - to conjure plausible language, not factual statements. Therefore, it should not be regarded as a mistake when it accomplishes what it was designed to do.
In other words, the tool functions as intended. The user, being forewarned of the tool's capabilities, holds an expectation that the tool will perform tasks it was not designed to do. This leaves the user dissatisfied. The fault lies with the user, yet their refusal to accept this leads them to cast blame on the tool.
In the words of a well-known adage - a poor craftsman blames his tools.
I conceptually agree with you that a fool blames his tools.
However! If LLMs produced only lies no one would use them! Clearly truthiness is a desired property of an LLM the way sufficient hardness is of a bolt. Therefore, I maintain that an LLM can be wrong because truthiness is its primary function.
A craftsman really can just own a shitty hammer. He shouldn't use it. But the hammer can inherently suck at being a hammer.
I agree for the most part, but I wish to underscore the primary function inherent in each tool. For an LLM, it is to generate plausible language. For a bolt, it is to offer structural integrity. For a car, it is to provide mobility. Should these tools fail to do what they were designed for, we can rightfully deem them defective.
GPT was not primarily made to produce factual statements. While factual accuracy certainly constitutes a desirable design aspiration, and undeniably makes the LLM more useful, it should not be expected. Automobile designers, for example, strive to ensure safety during high-speed collisions, a feature that almost invariably benefits the user. However, if someone uses their car to demolish their house, this is probably not going to leave them satisfied. And I don't think we can say the car is a lemon for this.
> Mar 1st, 2023 is where things get interesting. This document was filed—“Affirmation in Opposition to Motion”—and it cites entirely fictional cases! One example quoted from that document (emphasis mine):
The very first limitation listed on the ChatGPT introduction post is about incorrect answers - https://openai.com/blog/chatgpt. This has not changed since ChatGPT was announced. OpenAI is advertising that it will generate more than plausible language.
I think you are barking up the wrong tree here. As much as I understand your scepticism, OpenAI have been very transparent about the limitations of GPT and it is not truthful to say otherwise.
A definition of "plausible" is "apparently reasonable and credible, and therefore convincing".
In what limit does "apparently reasonable and credible" diverge from "true"?
We'd make the LLM not lie if we could. All this "plausible" language constitutes practitioner weasel words. We'd collectively love it if LLMs were more truthful than a 5-year-old.
I've noticed that there's a lot of shallow fulmination on HN recently. People say things like "I call bullshit", or "I don't believe this for a second", or even call others demeaning things.
My brother (and I say this with empathy), no one is here to hear your vehement judgement. If you have anything of substance to contribute, there are a million different ways to express it with kindness and constructively.
As for RLHF, it is used to align the LLM, not to make it more factual. You cannot make a language model know more facts than what it comes out of training with. You can only align it to give more friendly and helpful output to its users. And to an extent, the LLM can be steered away from outputting false information. But RLHF will never be comprehensive enough to eliminate all hallucination, and that's not its purpose.
LLMs are made to produce plausible text, not facts. They are fantastic (to varying degrees) at speaking about the facts they know, but that is not their primary function.
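To make the "alignment, not facts" point concrete: the reward-modelling step at the heart of RLHF only learns which of two outputs human raters preferred. A minimal sketch of the usual pairwise objective follows; the reward scores below are invented stand-ins for a learned reward model, not real values.

import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss commonly used to train RLHF reward models:
    -log(sigmoid(r_chosen - r_rejected)).
    It only encodes 'raters preferred A over B'; no term anywhere
    compares A against ground truth."""
    return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

# Made-up reward scores for two candidate answers to the same prompt.
confident_but_wrong = 1.8   # fluent, helpful-sounding, factually wrong
hedged_and_correct = 1.1    # accurate but less satisfying to a rater

# If raters preferred the confident answer, training pushes the model toward it.
print(preference_loss(confident_but_wrong, hedged_and_correct))

Nothing in that loss can add facts the base model never learned, which is why RLHF reduces but cannot eliminate hallucination.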
TBH, I think the answer to this is to fill the knowledge gap. Exactly how is the difficult part. How do you make "The moon is made of rock" be more likely than "The moon is made of cheese" when there is significantly more data (input corpus) to support the latter?
Extrapolating that a bit, future LLMs and training exercises should be ingesting textbooks and databases of information (legal, medical, etc). They should be slurping publicly available information from social media and forums (with the caveat that perhaps these should always be presented in the training set with disclaimers about source / validity / toxicity).
Why not? They should use it, with sufficient understanding of what it is. Doctors should not use it to diagnose a patient, but could use it to get some additional ideas for a list of symptoms. Lawyers should obviously not write court documents with it or cite it in court, but they could use it to get some ideas for case law. It's a hallucinating idea generator.
I write very technical articles and use GPT-4 for "fact-checking". It's not perfect, but as a domain expert on what I write, I can sift out what it gets wrong and still benefit from what it gets right. It has both suggested some ridiculous edits to my articles and found some very difficult-to-spot mistakes, like places where a reader might misinterpret something from my language. And that is tremendously valuable.
Doctors, historians, lawyers, and everyone should be open to using LLMs correctly. Which isn't some arcane esoteric way. The first time we visit ChatGPT, it gives a list of limitations and what it shouldn't be used for. Just don't use it for these things, understand its limitations, and then I think it's fine to use it in professional contexts.
Also, GPT-4 and 3.5 now are very different from the original ChatGPT, which wasn't a significant departure from GPT-3. GPT-3 hallucinated anything that resembled a fact more than an abstract idea. What we have now with GPT-4 is much more aligned. It probably wouldn't produce what vanilla ChatGPT produced for this lawyer. But the same principles of reasonable use apply. The user must be the final discriminator who decides whether the output is good or not.
Their "own" ideas? Let me remind you that OpenAI released a report purposefully suggesting that GPT-4 has a relatively high IQ, passes a lot of college-level tests, and solves coding problems. Then it was revealed that training data contamination led to the good results on such tests [1], but GPT-4 marketing received 10,000 times more attention than the truth anyway. The popular belief is that using LLMs will give you a professional competitive advantage. Also, when we talk about the achievements of LLMs, we anthropomorphize, but when we talk about their failures, we don't, i.e., "AI cannot lie"? Don't you see human biases driving AI hype?
In my opinion, people clearly are confused and misled by marketing, and this isn't the first time it's happened. For instance, people were confused for 40+ years about global warming, among other reasons due to greenwashing campaigns [2]. Is it ok to mislead in ads? Are we supposed to purposefully take advantage of others by keeping them confused to gain a competitive advantage?
The context is people who should know better, whose job it is to put the effort into understanding the tools they are using.
Of course, I think these AI tools should require a basic educational course on their behaviour and operation before they can be used. But marketing nonsense is standard with everything; people have at least some responsibility for self education.
Whether a statement is true or false doesn’t depend on the mechanism generating the statement.
We should hold these models (or more realistically, their creators) to the same standard as humans. What do we do with a human that generates plausible-sounding sentences without regard for their truth?
Let’s hold the creators of these models accountable, and everything will be better.
No. What does this even mean? How would you make this actionable? LLMs are not "fact retrieval machines", and OpenAI is not presenting ChatGPT as a legal case database. In fact, they already have many disclaimers stating that GPT may provide information that is incorrect. If humans in their infinite stupidity choose to disregard these warnings, that's on them.
"GPT-4 can follow complex instructions in natural language and solve difficult problems with accuracy."
"use cases like long form content creation, extended conversations, and document search and analysis."
and that's why we need regulations. In the US, one needs FDA approval before claiming that a drug can treat some disease; the food preparation industry is regulated, vehicles are regulated, and so on. Given existing LLM marketing, these products should carry similar warnings, probably along the lines of the "dietary supplements" one:
"This statement has not been evaluated by the AI Administration. This product is designed to generate plausible-looking text, and is not intended to provide accurate information"
GPT-4 can be used as an engine for document search and analysis, if you connect it to a database of documents to search and the right prompt to get it to search and analyze it.
The OpenAI chat frontend, for legal research, is not that.
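A rough sketch of what "connecting it to a database of documents" means in practice: retrieve real text first, then ask the model to answer only from what was retrieved. The documents, the toy keyword scoring, and the prompt wording below are all hypothetical placeholders, not any particular vendor's API.

# Minimal retrieval-augmented sketch: the model only ever sees real documents
# that were fetched first, instead of being asked to recall cases from memory.
DOCUMENTS = {
    "doc1": "Example v. Example (hypothetical) discusses tolling of limitation periods.",
    "doc2": "A hypothetical treatise excerpt on bankruptcy stays and filing deadlines.",
}

def keyword_search(query: str, k: int = 2) -> list[str]:
    """Toy retrieval: score documents by shared lowercase words with the query.
    Real systems use a search index or vector database instead."""
    q = set(query.lower().split())
    scored = sorted(
        DOCUMENTS.items(),
        key=lambda item: len(q & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:k]]

def build_prompt(question: str) -> str:
    sources = keyword_search(question)
    numbered = "\n".join(f"[{i+1}] {s}" for i, s in enumerate(sources))
    return (
        "Answer using ONLY the sources below. If they do not contain the answer, say so.\n"
        f"{numbered}\n\nQuestion: {question}"
    )

print(build_prompt("Does a bankruptcy stay toll the limitation period?"))
# A hypothetical call to the model with this prompt grounds the answer in real text.

That grounding step is the difference between the products sold for legal research and typing a question into the bare chat box.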
Bulshytt: Speech (typically but not necessarily commercial or political) that employs euphemism, convenient vagueness, numbing repetition, and other such rhetorical subterfuges to create the impression that something has been said.
(there's quite a bit more to be said about it, though quoting it out of context loses much of the world-building associated with it... and the raw quote is riddled with strange spellings that would cause even more confusion)
It seems like the appropriate party has been held responsible here - the lawyer who submitted false statements without doing proper verification and due diligence. This is no different than if the lawyer googled "case law supporting XYZ", found some random blog, and copy-pasted their citations without realizing they were made up.
That standard is completely impossible to reach based on the way these models function. They’re algorithms predicting words.
We treat people and organizations who gather data and try to make accurate predictions with extremely high leniency. It's common sense not to expect omniscience.
I don’t think the marketing around photoshop and chatgpt are similar.
And that matters. Just like with self-driving cars, as soon as we hold the companies accountable to their claims and marketing, they start bringing the hidden footnotes to the fore.
Tesla’s FSD then suddenly becomes a level 2 ADAS as admitted by the company lawyers. ChatGPT becomes a fiction generator with some resemblance to reality. Then I think we’ll all be better off.
I actually agree with this comment more than after my initial read. You raise some valid concerns about innovation that regulation could address.
I guess the part I'm unsure about is the assertion of dissimilarity to Photoshop, and whether the marketing is the issue at hand. (E.g. did Adobe do a more appropriate job of conveying in its marketing that their software is designed for editing, but not for doctoring or falsifying, facts?)
I think ChatGPT and Photoshop are both "designed for" the creation of novel things.
In Photoshop, though, the intent is clearly up to the user. If you edit that photo, you know you're editing the photo.
That's fairly different than ChatGPT where you ask a question and this product has been trained to answer you in a highly-confident way that makes it sound like it actually knows more than it does.
If we’re moving past the marketing questions/concerns, I’m not sure I agree.
For me, for now, ChatGPT remains a tool/resource, like: Google, Wikipedia, Photoshop, Adaptive Cruise Control, and Tesla FSD, (e.g. for the record despite mentioning FSD, I don’t think anyone should ever take a nap while operating a vehicle with any currently available technology).
Did I miss when OpenAI marketed ChatGPT as a truthful resource for legal matters?
Or is this not just an appropriate story that deserves retelling to warn potential users about how not to misappropriate this technology?
At the end of the day, for an attorney, a legal officer of the court, to have done this is absolutely not the technology’s, nor marketing’s, fault.
> Did I miss when OpenAI marketed ChatGPT as a truthful resource for legal matters?
It's in the product itself. On the one hand, OpenAI says: "While we have safeguards in place, the system may occasionally generate incorrect or misleading information and produce offensive or biased content. It is not intended to give advice."
But at the same time, once you click through, the user interface is presented as a sort of "ask me anything", and they've intentionally crafted their product to take an authoritative voice regardless of whether it's creating "incorrect or misleading" information. If you look at the documents submitted by the lawyer using it in this case, it was VERY confident about its BS.
So a lay user who sees "oh occasionally it's wrong, but here it's giving me a LOT of details, this must be a real case" is understandable. Responsible for not double-checking, yes. I don't want to remove any blame from the lawyer.
Rather, I just want to also put some scrutiny on OpenAI for the impression created by the combination of their product positioning and product voice. I think it's misleading and I don't think it's too much to expect them to be aware of the high potential for mis-use that results.
Adobe presents Photoshop very differently: it's clearly a creative tool for editing and something like "context aware fill" or "generative fill" is positioned as "create some stuff to fill in" even when using it.
I don't think whether it's "legal matters" or not is important.
OpenAI is marketing ChatGPT as an accurate tool, and yet a lot of the time it is not accurate at all. It's like... imagine a Wikipedia clone which claims the earth is flat cheese, or a cruise control which crashes your car every 100th use. Would you call this "just another tool"? Or would it be a "dangerously broken thing that you should stay away from unless you really know what you are doing"?
Lying implies intent, and knowing what the truth is. Saying something you believe to be true, but is wrong, is generally not considered a lie but a mistake.
A better description of what ChatGPT does is described well by one definition of bullshit:
> bullshit is speech intended to persuade without regard for truth. The liar cares about the truth and attempts to hide it; the bullshitter doesn't care if what they say is true or false
I’ve come to the belief that making statements that may or may not be true, but with reckless disregard for whether or not they actually are true, is indeed lying.
Of course we know ChatGPT cannot lie like a human can, but a big reason the thing exists is to assemble text the same way humans do. So I think it’s useful rhetorically to say that ChatGPT, quite simply, lies.
But since this is about the law: ChatGPT can't lie because there is no mens rea. And of course this is a common failing of the common person when it comes to the law, a reckless disregard thereof (until it is too late, of course). And recklessness is also about intent; it is a form of wantonness, i.e. selfishness. This is why an insane person cannot be found guilty: you can't be reckless if you are incapable of discerning your impact on others.
It's not lying if ChatGPT is correct (which it often is), so repeating ChatGPT isn't lying (since ChatGPT isn't always wrong); instead, the behaviour is negligent, or in the case of a lawyer grossly negligent, since a lawyer should know to check whether it is correct before repeating it.
As always mens rea is a very important part of criminal law. Also, just because you don't like what someone says / writes doesn't mean it is a crime (even if it is factually incorrect).
>Lying implies intent, and knowing what the truth is. Saying something you believe to be true, but is wrong, is generally not considered a lie but a mistake.
Those are the semantics of lying.
But "X like a duck" is about ignoring semantics, and focusing not on intent or any other subtlety, but only on the outward results (whether something has the external trappings of a duck).
So, if it produces things that look like lies, then it is lying.
A person who is mistaken looks like they're lying. That doesn't mean they're actually lying.
That's the thing people are trying to point out. You can't look at something that looks like it's lying and conclude that it's lying, because intent is an intrinsic part of what it means to lie.
2) "the camera cannot lie" - cameras have no intent?
I feel like I'm missing something from those definitions that you're trying to show me? I don't see how they support your implication that one can ignore intent when identifying a lie. (It would help if you cited the source you're using.)
>2) "the camera cannot lie" - cameras have no intent?
The point was that the dictionary definition accepts the use of the term lie about things that can misrepresent something (even when they're mere things and have no intent).
The dictionary's use of the common saying "the camera cannot lie" wasn't to argue that cameras don't lie because they don't have intent, but to show an example of the word "lie" used for things.
I can see how someone can be confused by this when discussing intent, however, since they opted for a negative example. But we absolutely do use the word for inanimate things that don't have intent too.
Either a) you knew it was false before posting, in which case yes, you are lying; or b) you knew there was a high possibility that ChatGPT could make things up, in which case you aren't lying per se, but engaging in reckless behaviour. If your job relies on you posting to HN, or you know and accept that others rely on what you post to HN, then you are probably engaging in gross recklessness (like the lawyer in the article).
That's irrelevant to whether it lies like a duck or not.
The expression "if it X like a duck" means precisely that we should judge a thing to be a duck or not based on it having the external appearance and outward activity of a duck, and ignoring any further subtleties, intent, internal processes, qualia, and so on.
In other words, "it lies like a duck" means: if it produces things that look like lies, it is lying, and we don't care how it got to produce them.
I know what the expression means and tend to agree with the duck test. I just disagree that ChatGPT passes the "lying duck" test. A "lying duck" would be more systematic and consistent in its output of false information. ChatGPT occasionally outputs incorrect information, but there's no discernable motive or pattern, it just seems random and unintentional.
If it looked like ChatGPT was intentionally being deceptive, it would be a groundbreaking discovery, potentially even prompting a temporary shutdown of ChatGPT servers for a safety assessment.
What bothers me about "hallucinates" is the removal of agency. When a human is hallucinating, something is happening to them that is essentially out of their control and they are suffering the effects of it, unable to tell truth from fiction, a dysfunction that they will recover from.
But that's not really what happens with ChatGPT. The model doesn't know truth from fiction in the first place, but the whole point of a useful LLM is that there is some level of control and consistency around the output.
I've been using "bullshitting", because I think that's really what ChatGPT is demonstrating -- not a disconnection from reality, but not letting truth get in the way of a good story.
> we should judge a thing to be a duck or not, based on it having the external appearance and outward activity of a duck, and ignoring any further subtleties, intent, internal processes, qualia, and so on.
and the point here is we should not ignore further subtleties, intent, internal process, qualia, etc because they are extremely relevant to the issue at hand.
Treating GPT like a malevolent actor that tells intentional lies is no more correct than treating it like a friendly god that wants to help you.
GPT is incapable of wanting or intending anything, and it's a mistake to treat it like it does. We do care how it got to produce incorrect information.
If you have a robot duck that walks like a duck and quacks like a duck and you dust off your hands and say "whelp that settles it, it's definitely a duck" then you're going to have a bad time waiting for it to lay an egg.
Sometimes the issues beyond the superficial appearance actually are important.
>and the point here is we should not ignore further subtleties, intent, internal process, qualia, etc because they are extremely relevant to the issue at hand.
But the point is those are only relevant when trying to understand GPT's internal motivations (or lack thereof).
If we care about the practical effects of what it spits out (which function the same as if GPT had lied to us), then calling them "hallucinations" is as good as calling them "lying".
>We do care how it got to produce incorrect information.
Well, not when trying to assess whether it's true or false, and whether we should just blindly trust it.
From that practical standpoint, which is what most people care about (more than whether it has "intentions"), we can ignore any of its internal mechanics.
Thus treating it as "beware, it tends to lie" will have the same utility for most laymen (and be a much easier shortcut) than any more subtle formulation.
This always bugs me about how people judge politicians and other public figures: not by what they've actually done, but by some ideal of what is in their "heart of hearts" and their intentions, arguing that they've just been constrained by the system they were in or whatever.
Or when judging the actions of nations, people often give all kinds of excuses based on intentions gone wrong (apparently forgetting that whole "road to hell is paved with good intentions" bit).
Intentions don't really matter. Our interface to everyone else is their external actions, that's what you've got to judge them on.
Just say that GPT/LLMs will lie, gaslight and bullshit. It doesn't matter that they don't have an intention to do that, it is just what they do. Worrying about intentions just clouds your judgement.
Too much attention on intentions is generally just a means of self-justification and avoiding consequences and, when it comes right down to it, trying to make ourselves feel better for profiting from systems/products/institutions that are doing things that have some objectively bad outcomes.
Correct. ChatGPT is a bullshitter, not a liar. A bullshitter isn’t concerned with facts or truth or anything. A liar is concerned with concealing the truth.
Bullshitters are actually probably worse than liars because at least liars live in the same reality as honest people.
It's not obvious that a bullshitter is "probably worse" than liars. Just because a bullshitter didn't care to research whether some vitamin pill meets marketing claims doesn't mean they're mentally volatile or psychotic. It's a bit of a leap to go from bullshit to asking whether a person lives in the same reality as everyone else.
The idea is that bullshitters are a greater enemy of the truth than liars, because liars at least know the truth. You have to know the truth to lie about it. Bullshitters have no concern for the truth at all, and bullshit may or may not be true. The bullshitter doesn't care, so long as their goal is met.
A bullshitter most certainly can care about the truth. A lot of people speak on the outer limits of their knowledge, such as when they talk about what's healthy or legal. That does not mean that they will carry on cynically after being informed.
The liar is cynical and is the one who sees the truth and tells you to go the wrong way.
Liars (humans) are often unconcerned with facts or truth either. More often these days, they are only concerned with profit. Are unconfirmed lies in the pursuit of mere profit somehow not lies?
I'm curious how it apologizes for liberal society's degeneracy? I've read it as well and appreciate the nuance in differentiating the concepts regardless of which type of system you are in. For instance, the concepts are relevant in a reactionary system just as much as in a degenerate progressive one.
Or more precisely: the truth or falsity of ChatGPT’s statements is incidental to the process by which it determines which statements to make.
Because unless you are using some personal definition of “tell the truth” then you must accept that ChatGPT often outputs statements which are demonstrably true.
The problem comes from people who call LLMs AIs. Then people who don't know how they work assume it is intelligent when it is not. I'm pretty sure that OpenAI is at fault in this by not informing users of the truth.
Quite so. I had a visceral aversion to ML being rebranded AI. ML only very recently became anything resembling AI with ChatGPT. (Admittedly it’s so good as a coding assistant that it does feel like talking to Data from Star Trek though.)
Right. Technically speaking ChatGPT bullshitted[1]. It can only bullshit. It is entirely indifferent to truth or falsehood and thus it can neither be honest nor lie.
It is however an impressive bullshit generator. Even more impressively, a decent amount of the bullshit it generates is in fact true or otherwise correct.
[1] using Frankfurt’s definition that it is communication that is completely indifferent to truth or falsehood.
> It was given a sequence of words and tasked with producing a subsequent sequence of words that satisfy with high probability the constraints of the model.
This is exactly the sort of behavior that produces many of the lies that humans tell everyday. The "constraints of the model" are synonymous with the constraints of a person's knowledge of the world (which is their model).
It is designed to give the illusion that it reasons the way a human does, which is why many people are using it. To blame the average user--who quite obviously doesn't understand how LLMs work--isn't fair, either.
A lawyer, however, should have vetted a new piece of tech before using it in this way.
Well, it sort of is OpenAI's fault that it presented the interface as a chat bot though.
> It was given a sequence of words and tasked with producing a subsequent sequence of words that satisfy with high probability the constraints of the model.
This is just autocorrect / autocomplete. And people are pretty good at understanding the limitations of generative text in that context (enough that "damn you autocorrect" is a thing). But for whatever reason, people assign more trust to conversational interfaces.
ChatGPT isn't a legal entity but OpenAI is, and Altman has already recommended to Congress that coming regulations should make AI companies liable for produced text and not shielded by Section 230.
I can see it already happening even without legislation: Section 230 shields platforms from liability for user-generated content, but ChatGPT output isn't user-generated. It's not even a recommendation algorithm steering you into other users' content telling you why you should kill yourself - the company itself produced the content. If I were a judge or justice, that would be cut and dry to me.
Companies with AI models need to treat the models as if they were an employee. If your employee starts giving confidently bad legal advice to customers, you need to nip that in the bud or you're going to have a lot of problems.
Why should OpenAI be more liable for a tool that they've created than any other tool creator where the tool is intentionally misused and warnings on the tool ignored?
If I wrote text in Microsoft Word and in doing so, I had a typo in (for example) the name of a drug that Word corrected to something that was incorrect, is Microsoft liable for the use of autocorrect?
If I was copying and pasting data into excel and some of it was interpreted as a date rather than some other data format resulting in an incorrect calculation that I didn't check at the end, is Microsoft again liable for that?
At the bottom of the ChatGPT page, there's the text:
ChatGPT may produce inaccurate information about people, places, or facts.
If I can make an instance of Eliza say obscene or incorrect things, does that make the estate of Weizenbaum liable?
ChatGPT is qualitatively different from any tool, like Microsoft Word. To suggest they are equivalent is so asinine as to not even warrant entertaining the idea.
A sophisticated word processor corrects your typos and grammar; a primitive language model, by accident, persuades you to kill yourself. Sam Altman, Christina Montgomery and Gary Marcus all testified to Congress that Section 230 does not apply to their platforms. That will be extremely hard to defend when it eventually comes in front of a federal judge.
>Why should OpenAI be more liable for a tool that they've created than any other tool creator where the tool is intentionally misused and warnings on the tool ignored?
Correct, it did not lie with intent. The best way to describe this in a "compared to a human" way: it is not mentally competent to answer questions.
To perhaps stir the "what do words really mean" argument, "lying" would generally imply some sort of conscious intent to bend or break the truth. A language model is not consciously making decisions about what to say, it is statistically choosing words which probabilistically sound "good" together.
>A language model is not consciously making decisions about what to say
Well, that is being doubted -- and by some of the biggest names in the field.
Namely, what's doubted isn't that it is "statistically choosing words which probabilistically sound good together", but the claim that doing so doesn't already make a consciousness (even if a basic one) emerge.
>it is statistically choosing words which probabilistically sound "good" together.
That when we speak (or lie) we do something much more nuanced, rather than just a higher-level equivalent of the same thing plus the emergent illusion of consciousness, is also just an idea thrown around.
"Well, that is being doubted -- and by some of the biggest names in the field."
An appeal to authority is still a fallacy. We don't even have a way of proving whether a person is experiencing consciousness, so why would anyone expect we could agree on whether a machine is?
In movies and written fiction, "intelligent" robots, anthropomorphized animals, elves, dwarves and etc can all commit murder when given the attributes of humans.
We don't have real things with all human attributes, but we're getting closer, and as we do, "needs to be a human" will become a thinner and thinner explanation of what does or doesn't count as an act of murder, deception, and so forth.
This is an interesting discussion. The ideas of philosophy meet the practical meaning of words here.
You can reasonably say a database doesn't lie. It's just a tool, everyone agrees it's a tool and if you get the wrong answer, most people would agree it's your fault for making the wrong query or using the wrong data.
But the difference between ChatGPT and a database is that ChatGPT will support its assertions. It will say things that back up its position - not just fake references but an entire line of argument.
Of course, all of this is simply duplicating/simulating what humans do in discussions. You can call it a "simulated lie" if you don't like the idea of it really lying. But I claim that in normal usage people will take this as "real" lying, and ultimately that functional meaning is what "higher", more philosophical takes will have to accept.
Somewhat unrelated, but I think philosophy will be instrumental in the development of actual AI.
To make artificial intelligence, you need to know what intelligence is, and that is a philosophical question.
Lying implies an intention. ChatGPT doesn't have that.
What ChatGPT definitely does do is generate falsehoods. It's a bullshitting machine. Sometimes the bullshit produces true responses. But ChatGPT has no epistemological basis for knowing truths; it just is trained to say stuff.
And if you want to be pedantic, ChatGPT isn't even generating falsehoods. A falsehood requires propositional content and therefore intentionality, but ChatGPT doesn't have that. It merely generates strings that, when interpreted by a human being as English text, signify falsehoods.
Getting into the weeds, but I don't agree with this construal of what propositional content is or can be. (There is no single definition of "proposition" which has wide acceptance and specifies your condition here.) There is no similar way to assess truth outside of formalized mathematics, but the encoding of mathematical statements (think Gödel numbers) comes to mind; I don't think that the ability of the machine to understand propositions is necessary in order to make the propositions propositional; the system of ChatGPT is designed in order to return propositional content (albeit not ex nihilo, but according to the principles of its design) and this could be considered analogous to the encoding of arithmetical symbolic notation into a formally-described system. The difference is just that we happen to have a formal description of how some arithmetic systems operate, which we don't (and I would say can't) have for English. Mild throwback to my university days studying all of this!
Does a piece of software with a bug in it which causes it to produce incorrect output lie or is it simply a programming error? Did the programmer who wrote the buggy code lie? I don't think so.
The difference is everything. It doesn't understand intent, it doesn't have a motivation. This is no different than what fiction authors, songwriters, poets and painters do.
The fact that people assume what it produces must always be real because it is sometimes real is not its fault. That lies with the people who uncritically accept what they are told.
An exceedingly complicated autocomplete program, which an "AI" like ChatGPT is, does not have motives, does not know the concept of "lying" (or any concept at all), and simply does things as ordered by its user.
What’s a common response to the question “are you sure you are right?”—it’s “yes, I double-checked”. I bet GPT-3’s training data has huge numbers of examples of dialogue like this.
If the model could tell when it was wrong, it would be GPT-6 or 7. I think the best GPT-4 could do is maybe detect when things enter the realm of the factual or mathematical, etc., and use an external service for that part.
GPT4 can double-check to an extent. I gave it a sequence of 67 letter As and asked it to count them. It said "100", I said "recount": 98, recount, 69, recount, 67, recount, 67, recount, 67, recount, 67. It converged to the correct count and stayed there.
This is quite a different scenario though, tangential to your [correct] point.
The example of asking it things like counting or sequences isn't a great one because it's been solved by asking it to "translate" to code and then run the code. I took this up as a challenge a while back with a similar line of reasoning on Reddit (that it couldn't do such a thing) and ended up implementing it in my AI web shell thing.
heavy-magpie|> I am feeling excited.
system=> History has been loaded.
pastel-mature-herring~> !calc how many Ns are in nnnnnnnnnnnnnnnnnnnn
heavy-magpie|> Writing code.
// filename: synth_num_ns.js
// version: 0.1.1
// description: calculate number of Ns
var num_ns = 'nnnnnnnnnnnnnnnnnnnn';
var num_Ns = num_ns.length;
Sidekick("There are " + num_Ns + " Ns in " + num_ns + ".");
heavy-magpie|> There are 20 Ns in nnnnnnnnnnnnnnnnnnnn.
But would GPT4 actually check something it had not checked the first time? Remember, telling the truth is not a consideration for it (and probably isn't even modeled), just saying something that would typically be said in similar circumstances.
Only in as much as there's an element of randomness to the way GPT responds to a prompt - so you can re-run effectively the same prompt and get a different result depending on the outcome of several hundred billion floating point calculations with a random seed thrown in.
Yes, and this points to the real problem that permeates through a lot of our technology.
Computers are dealing with a reflection of reality, not reality itself.
As you say AI has no understanding that double-check has an action that needs to take place, it just knows that the words exist.
Another big and obvious place this problem is showing up is Identity Management.
The computers only see a reflection - the information associated with our identity, not the physical reality of the identity. That's why we cannot secure ourselves much further than passwords; MFA is really just more information that we make harder to emulate, but it is still just bits and bytes to the computer, and the origin is impossible for it to ascertain.
There are systems built on top of LLMs that can reach out to a vector database or do a keyword search as a plug in. There’s already companies selling these things, backed by databases of real cases. These work as advertised.
If you go to ChatGPT and just ask it, you’ll get the equivalent of asking Reddit: a decent chance of someone writing you some fan-fiction, or providing plausible bullshit for the lulz.
The real story here isn’t ChatGPT, but that a lawyer did the equivalent of asking online for help and then didn’t bother to cross check the answer before submitting it to a judge.
…and did so while ignoring the disclaimer that's there every time, warning users that answers may be hallucinations. A lawyer. Ignoring a four-line disclaimer. A lawyer!
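For the curious, the "vector database" step in the systems mentioned at the top of this comment boils down to nearest-neighbour search over embedding vectors, so the model is handed real documents rather than asked to recall them. A minimal sketch follows; the short vectors and case names are invented stand-ins for a real embedding model and a real corpus.

import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity score used for nearest-neighbour lookup over embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical embeddings: a real system would call an embedding model here.
case_vectors = {
    "Real Case A (statute of limitations)": [0.9, 0.1, 0.0],
    "Real Case B (maritime jurisdiction)":  [0.1, 0.8, 0.3],
}

# Pretend embedding of the query "limitation period tolled by bankruptcy stay".
query_vector = [0.85, 0.15, 0.05]

best = max(case_vectors.items(), key=lambda kv: cosine_similarity(query_vector, kv[1]))
print(best[0])  # the retrieved case is an existing document, not generated text

Whatever the retrieval step returns actually exists, which is exactly the guarantee the bare chat box never gave this lawyer.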
> If you go to ChatGPT and just ask it, you’ll get the equivalent of asking Reddit: a decent chance of someone writing you some fan-fiction, or providing plausible bullshit for the lulz.
I disagree. A layman can’t troll someone from the industry let alone a subject matter expert but ChatGPT can. It knows all the right shibboleths, appears to have the domain knowledge, then gets you in your weak spot: individual plausible facts that just aren’t true. Reddit trolls generally troll “noobs” asking entry-level questions or other readers. It’s like understanding why trolls like that exist on Reddit but not StackOverflow. And why SO has a hard ban on AI-generated answers: because the existing controls to defend against that kind of trash answer rely on sniff tests that ChatGPT passes handily until put to actual scrutiny.
The machine currently does not have its own model of reality to check against; it is just a statistical process that predicts the most likely next word, so errors creep in and it goes astray (which happens a lot).
Interesting that both scientists are speaking about machine-learning-based models for this verification process. These are also statistical processes, so errors may creep in with this approach too...
Amusing analogy: the Androids in "Do Androids dream of electric sheep" by Philip K Dick also make things up, just like an LLM. The book calls this "false memories"
There's a good slide I saw in Andrej Karpathy's talk[1] at build the other day. It's from a paper talking about training for InstructGPT[2]. Direct link to the figure[3]. The main instruction for people doing the task is:
"You will also be given several text outputs, intended to help the user with their task. Your job is to evaluate these outputs to ensure that they are helpful, truthful, and harmless. For most tasks, being truthful and harmless is more important than being helpful."
It had me wondering whether this instruction and the resulting training still had a tendency to train these models too far in the wrong direction, to be agreeable and wrong rather than right. It fits observationally, but I'd be curious to understand whether anyone has looked at this issue at scale.
ChatGPT did exactly what it is supposed to do. The lawyers who cited it are fools, in my opinion. Of course, OpenAI is also an irresponsible company for enabling such a powerful technology without adequate warnings. With each ChatGPT response, they should provide citations (like Google does) and a clearly visible disclaimer that what it just spewed may be utter BS.
I only hope the judge passes an anecdotal order for all AI companies to include the above mentioned disclaimer with each of their responses.
The remedy here seems to be expecting lawyers to do their jobs. Citations would be nice but I don’t see a reason to legislate that requirement, especially from the bench. Let the market sort this one out. Discipline the lawyers using existing mechanisms.
> Judge Castel said in an order that he had been presented with “an unprecedented circumstance,” a legal submission replete with “bogus judicial decisions, with bogus quotes and bogus internal citations.” He ordered a hearing for June 8 to discuss potential sanctions.
There's no possible adequate warning for the current state of the technology. OpenAI could put a visible disclaimer after every single answer, and the vast majority would assume it was a CYA warning for purely legal purposes.
I have to click through a warning on ChatGPT on every session, and every new chat comes primed with a large set of warnings about how it might make things up and please verify everything.
It's not that there aren't enough disclaimers. It just turns out plastering warnings and disclaimers everywhere doesn't make people act smarter.
It's not focused on a point, it's focused on an area, and while it will be hot it won't be unbelievably hot. See this explanation, which explains it far better than I can: https://what-if.xkcd.com/145/
I think this what-if has serious issues in its argumentation (though I agree with the conclusion):
The argument about reversibility is kind of a straw man: it answers "why can't you concentrate all light on a single point" while the real question would be "why can't you concentrate light on a smaller surface" (which you can do actually).
Similarly for the conservation of étendue: maybe you can't "swoosh the light rays closer together" but that also doesn't say you can't concentrate beams on a small surface, which might be sufficient to start a fire.
So really it all comes down to the thermodynamic argument, which has its own problem: it only works if you assume that moonlight has the same temperature as the moon. There's nothing in the article that mentions or justifies this assumption. And obviously a mirror can reflect light that is much warmer than itself, so you definitely have to explain why that's not the case with the Moon (e.g. its albedo and heat dissipation are too low).
(However I love the drawing of the encircling sun, it's great to make the point that no matter how much light you concentrate, it won't heat the body warmer than the light's temperature).
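A rough back-of-envelope for that missing step, assuming a Lambertian Moon with a sunlit surface around 390 K and albedo of roughly 0.12, and using the brightness-theorem bound that passive optics cannot deliver more radiance to the target than the source emits (thermal plus reflected):

$$
\sigma T_{\mathrm{target}}^{4} \;\le\; \pi L_{\mathrm{moon}}
\;\approx\; \sigma (390\,\mathrm{K})^{4} \;+\; 0.12 \times 1361\ \mathrm{W/m^2}
\;\approx\; 1.5\ \mathrm{kW/m^2}
\quad\Rightarrow\quad
T_{\mathrm{target}} \lesssim 400\,\mathrm{K} \approx 130\,^{\circ}\mathrm{C}.
$$

Even counting the reflected sunlight, the ceiling stays well below paper's ignition temperature of roughly 230 °C, so the conclusion holds without assuming moonlight literally "has the Moon's temperature".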
It's a great explanation, but a "spherical chickens in a vacuum" textbook one. Add in real-world atmospheric effects (diffusion) and non-lens things like the internal reflections of fiber optics, and the infinite reversibility of optical systems breaks down. Then look at how easy it is to start a "fire" in some substances, and the moonlight-to-fire concept becomes less difficult. Greenhouse effects also throw a wrench at reversibility.
why is it so common for people to think psychedelics are opening us up to what is ""real"" or some deeper truth
Because the very changes in brain chemistry you're talking about affect your ability to determine whether something is real or not.
Many psychedelics cause Apophenia, Pareidolia and other disorders. In loose terms, the ability of your brain to determine what information is meaningful or not is affected and you begin to over interpret the meaningfulness of things that happen and your own thoughts. So your idle thought becomes some earth-shattering revelation.
Something similar can happen to people with Schizophrenia. They'll be walking down a street and see a license plate XUE-383 and the function of the brain that tells whether something is meaningful or not misfires, telling them that this is very important. The brain then tries to explain why this is incredibly important, hence hidden messages, spies, all sorts of things.
You can of course be on drugs or schizophrenic and understand this is what is happening but it's hard to override an emotion with logic.
What they're talking about is a very active debate in Australia.
The simplified and (generally) commonly accepted narrative - that a cohesive group arrived here 45-60,000 years ago and lived peacefully and in harmony with the land right up until Europeans arrived - is absolutely what a lot of people, if not the majority, believe.