First of all, if Altman continually makes misleading statements about AI he will quickly lose credibility, and the short-term gain from whatever 'financial incentive' birthed the lie would be eroded in short order by a loss of trust in the head of one of the most visible AI companies in the world.
Secondly, all of OpenAI's competitors can plainly assess the truth or validity of Altman's statements. There are many companies working in parallel on models at OpenAI's scale, and they can independently assess the usefulness of continually growing models. They aren't going to take this statement at face value and change their strategy based on a single remark by OpenAI's CEO.
Thirdly, I think people aren't really reading what Altman actually said very closely. He doesn't say that larger models aren't useful at all, but that the next sea change in AI won't come from models that are orders of magnitude bigger, but rather from a different approach to existing problems. That's an entirely reasonable prediction to make, even if it doesn't turn out to be true.
All in all, "his word is basically worthless" seems much too harsh an assessment here.
I've seen Altman say in an interview that training GPT-4 took "hundreds of little things".
I don't find this implausible, but it weakens slightly under Ockham's razor when you consider that this is exactly the type of statement that would be employed to obfuscate a major breakthrough.
It just makes me raise an eyebrow and look to more credible sources.
It is possible that GP meant that Altman’s word is basically worthless to them, in which case that’s not something that can be argued about. It’s a factually true statement that that is their opinion of that man.
I personally can see why someone could arrive at that position. As you’ve pointed out, taking Sam Altman at face value can involve suppositions about how much he values his credibility, how much stock OpenAI competitors put in his public statements, and the mindsets people in general have when reading what he writes.
Anyone with the expertise to have insightful takes in AI also has a financial incentive to steer the conversation in particular directions. This is also the case for many, many other fields! You do not become an expert by quarantining your livelihood away from your expertise!
The correct response is not to dismiss every statement from someone with a conflict of interest as "basically worthless", but to talk to lots of people and to be reasonably skeptical.
OpenAI has gone from open-sourcing its work, to publishing papers only, to publishing papers that omit important information, to GPT-4 being straight-up closed. And Sam Altman doesn't exactly have a track record of being overly concerned about the truth of his statements.
I had a fun conversation (more like an argument) with ChatGPT about the hypocrisy of OpenAI. It would explicitly contradict itself and then start every reply with "I can see why someone might think…" before regurgitating fluff about democratizing AI. I finally got it to define democratization of technology and then recognize the absurdity of using that label to describe a pivot to gating models and being for-profit. Then it basically told me "well it's for safety and protecting society".
An AI, when presented with facts counter to what it thought it should say, agreed and basically went: “Won’t someone PLEASE think of the children!”
It was trained on a corpus full of mainstream media lies; why would you have expected otherwise? It's by far the most common deflection in its training set.
It's easy to recognize and laugh at the AI replying with the preprogrammed narrative. I'm still waiting for the majority of people to realize that they are given the same training materials, non-stop, with the same toxic narratives, that they become programmed in the same way, and that this is what results in their current worldview.
And no, it's not enough to be "skeptical" of mainstream media. It's not even enough to "validate" them, or to go to other sources. You need to be reflective enough to realize that they are pushing flawed reasoning methods, and then abusing them again and again, to get you used to their brand of reasoning.
Their brand of reasoning is basically just reasoning with brands. You're given negative-sounding words for things they want you to think are bad and positive-sounding words for things they want you to think are good, and these connections are continuously reinforced. They brand true democracy (literally rule of the people) as populism and tell you it's a bad thing. They brand freedom of speech as "misinformation". They brand freedom as "choice", so that you don't think about what you want to do, only about which of the things they allow you to do you will do. Disagree with the scientific narrative? You're a "science denier", even as a professional scientist. "Conspiracy theory" isn't a defined word - it is a brand.
You're trained to judge goodness or badness instinctively, by how often these associations are repeated and by peer pressure, and to produce the explanation after your instinctive decision, instead of the other way around.
"Then it basically told me “well it’s for safety and protecting society”."
That was pretty much OpenAI's argument when they first published that GPT-3 paper. "Oh no so scary people might use it for wrong stuff, only we should have control of it."
It's pretty easy to have ChatGPT contradict itself, point it out, and have the LLM respond "well, I'm just generating text, nobody said it had to be correct".
Why are you discussing OpenAI with ChatGPT? I’m honestly interested.
I would imagine that any answer from ChatGPT on that topic is either (a) "hallucinated" and not based on any verifiable fact or (b) scripted in by OpenAI.
The same question pops up for me whenever someone asks ChatGPT about the internals and workings of ChatGPT. Am I missing something?
Simple curiosity. I wanted to see if it could explain the shift in how OpenAI operates in a way that might give some interesting or perhaps novel insight (even if hallucinated), beyond their corpo-speak public-facing reasoning.
For the most part it just regurgitated the corpo-speak with an odd sense of confidence. I know that’s the point of the model, but it can also be surprisingly honest when it incorporates what it knows about human motivation and business.
This trend has happened in the small for their APIs as well. They've been dropping options - the embeddings aren't the internal embeddings any more, and you don't have access to log probabilities. It's all closing up at every level.
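For anyone who hasn't used the older API: here's roughly what "access to log probabilities" meant in practice. This is only a minimal sketch against the legacy Completions endpoint and the pre-1.0 openai Python SDK (the API key is a placeholder); the chat-style endpoint used by GPT-4 didn't expose an equivalent at the time.

    import openai

    openai.api_key = "sk-..."  # placeholder key

    # The legacy Completions endpoint let you request the top-N token
    # log probabilities alongside the generated text.
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt="The capital of France is",
        max_tokens=1,
        logprobs=5,  # top-5 alternatives per generated token
    )

    # Per-token log probabilities for the most likely alternatives.
    print(resp["choices"][0]["logprobs"]["top_logprobs"][0])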
It's incredible that people are so eager to eat up these unsupported claims.
This is the second [1] OpenAI claim in the span of a few days that conveys a sense of "GPT-4 represents a plateau of accomplishment. Competitors, you've got time to catch up!".
And it's not just a financial incentive, it's a survival incentive as well. Given a sufficiently sized (unknowable ahead of time) lead, the first actor that achieves AGI and plays their cards right can permanently suppress all other ongoing research efforts, should they wish to.
Even if OpenAI's intentions are completely good, failure to be first could result in never being able to reach the finish line. It's absolutely in OpenAI's interest to conceal critical information, and mislead competing actors into thinking they don't have to move as quickly as they can.
In this case I think it's Wired that's lying. Altman didn't say large models have no value, or that there will be no more large models, or that people shouldn't invest in large models.
He said that we are at the end of the era where capability improvements come primarily from making models bigger. Which stands to reason... I don't think anyone expects us to hit 100T parameters or anything.
But just look at what all Lincoln accomplished with 640KB of memory. In the grand examination of time, one might even say that Lincoln is a more important figure than ChatGPT itself.
Like Altman said, it's comparable to the GHz race in the 1990's. If 4GHz is good, 5GHz is better, why not 10GHz?
Turns out there are diminishing returns and advances come from other dimensions. I've got no opinion on whether he's right or not, but he's certainly in a better position than most to judge whether current scale has hit diminishing returns.
In any event, there's nothing special about 1T parameters. It's just a round base-10 number. It is no more magic than 900B or 1.3T.
I don't think these comments are driven by financial incentives. It's a distraction, and only a fool would believe Altman here. What this likely means is that they are prioritizing adding more features to their current models while they train the next version. Their competitors scramble to build an LLM with some sort of intelligence parity; when that happens, no one will care, because ChatGPT has the ecosystem and plugins and all the advanced features... and by the time their competitors reach feature parity in that area, OpenAI plays its ace and drops GPT-5. Rinse and repeat.
That's my theory and if I was a tech CEO in any of the companies competing in this space, that is what I would plan for.
Training an LLM will be the easy part going forward. It's building an ecosystem around it and hooking it up to everything that will matter. OpenAI will focus on this, while not-so-secretly training their next iterations.
Something that matches text-davinci-003 but is cheaper and runs on your own hardware is already a massive selling point. If you release a foundational model at parity with GPT-4, you'll win overnight, because OpenAI's chat completions are awful even with the super-advanced model.
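To make the "runs on your own hardware" point concrete, here's a minimal local-inference sketch using the Hugging Face transformers pipeline; the model name is just an illustrative stand-in for whatever open model you'd actually deploy.

    # Minimal local text generation with Hugging Face transformers.
    # "gpt2" is only a placeholder; swap in the open model you actually use.
    from transformers import pipeline

    generate = pipeline("text-generation", model="gpt2")
    out = generate(
        "The main advantage of running models locally is",
        max_new_tokens=40,
        do_sample=False,
    )
    print(out[0]["generated_text"])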
Yeah, I also had a hunch he wasn't an AI. (I assume you meant "AI researcher" there :))
All joking aside, I wonder how that's affecting company morale or their ability to attract top researchers. I know if I was a top AI researcher, I'd probably rather work at a company where the CEO was an expert in the field (all else being equal).
Honestly I'm not sure it matters that much. CEOs who are not experts or researchers in a domain can still build great companies and empower their employees to do incredible work. Lots of tech people absolutely love to point out that Steve Jobs was not an engineer, but under his leadership the company invented three products that totally revolutionized different industries. Now, I'm not going to sit here and say Altman is Jobs, but running a company, knowing how to raise money, knowing how to productize technologies, etc. are all very important skills that industry researchers aren't always good at.
It might be true in general; however, AI research laboratories are typically an exception, as they are often led by experienced AI researchers or scientists with extensive expertise in the field.
And that's why they have a hard time getting their stuff out there and getting the money they need. I mean, trying to run a business like a research lab is kind of flawed, you know? And you don't always want some Musk-like character messing around with the basics of the company.
Ilya gives numerous talks and interviews, and he's well worth listening to about technical matters. I listened to many of his talks recently, and the main theme is that scaling up compute works, and will continue to do so. His optimism about the potential of scaling to support deep learning has clearly guided his entire career, starting with his early success on AlexNet.
Do you think GPT-4 was trained and then immediately released to the public? Training finished in August 2022. They spent the next six months improving it in other ways (e.g. human feedback). So what he is saying is already evident.
IIRC, Altman has no financial stake in the success or failure of OpenAI, precisely to prevent these sorts of conflicts of interest between OpenAI and society as a whole.
> OpenAI’s ChatGPT unleashed an arms race among Silicon Valley companies and investors, sparking an A.I. investment craze that proved to be a boon for OpenAI’s investors and shareholding employees.
> But CEO and co-founder Sam Altman may not notch the kind of outsize payday that Silicon Valley founders have enjoyed in years past. Altman didn’t take an equity stake in the company when it added the for-profit OpenAI LP entity in 2019, Semafor reported Friday.
Right. All the evidence points to more potential being left on the table for emergent abilities. It would make no sense that the model would develop all of these complex skills for better predicting the next token, then just stop.
It's a massive bet for a company to push compute into the billion-dollar range - if saying something like this has the potential to ward off competitors from making that bet, I don't see what's stopping them from saying it.
Altman has a financial incentive to lie and obfuscate about what it takes to train a model like GPT-4 and beyond, so his word is basically worthless.