"Doesn't pass my sniff test" is not the purpose of the flag button. Furthermore, it passes my personal sniff test: hundreds of people upvoting it while the top comment is saying it's worthless. Usually the real alpha is in the comments under such things.
Lots of folks working on open-source reasoning models trained with reinforcement learning right now. The best one atm appears to be Alibaba's 32B-parameter QwQ: https://qwenlm.github.io/blog/qwq-32b-preview/
I also recently wrote a blog explaining how reinforcement fine-tuning works, which is likely at least part of the pipeline used to train o1: https://openpipe.ai/blog/openai-rft
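To make that concrete, here's a toy sketch of the grader-based idea in Python (my own illustration of the general loop, not OpenAI's actual pipeline): sample an answer, score it with a grader, and reinforce answers whose reward beats a baseline.

    import random

    # Toy reinforcement fine-tuning loop. The "policy" is a stand-in
    # for the model: a distribution over two canned answers.
    ANSWERS = ["three", "two"]
    policy = [0.5, 0.5]  # P("three"), P("two")
    lr, baseline = 0.05, 0.5

    def grade(answer):
        # Grader: 1.0 for the correct answer, 0.0 otherwise.
        return 1.0 if answer == "three" else 0.0

    for step in range(500):
        idx = random.choices([0, 1], weights=policy)[0]
        reward = grade(ANSWERS[idx])
        # REINFORCE-style update: scale the sampled answer's
        # probability by its advantage, then renormalize.
        policy[idx] *= 1 + lr * (reward - baseline)
        policy = [p / sum(policy) for p in policy]

    print(policy)  # P("three") drifts toward 1.0

A real setup replaces the canned answers with sampled chains of thought and the two-line grader with task-specific verifiers, but the shape of the loop is the same.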
I don't know if I would call it "the best one" when it has "How many r in strawberry" as one of its example questions and, when tried, it arrives at the answer "two".
> Purchasing a .open domain name isn't available to the general public. This particular extension is owned by American Express and currently isn't for sale or open to registration, limiting its use to only selected entities associated with American Express. It's primarily designed to serve the interests of the corporation and its customers' claims or needs.
This is a feature of arxiv that automatically converts text that looks like a link into "this http url". The submitter missed the space after the "." in "...strong reasoning ability. OpenAI has claimed...".
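You can reproduce the failure mode with a naive linkifier (my own sketch; I don't know arxiv's actual pattern). The kicker is that ".open" really is a gTLD, so with the space gone, "ability.OpenAI" contains a plausible-looking domain:

    import re

    # Naive "looks like a link" rule: word.word where the part after
    # the dot is on a TLD list. Since "open" is a real gTLD, the
    # missing space yields the match "ability.Open".
    TLDS = "(?:com|org|net|io|ai|open)"
    pattern = re.compile(r"\b[\w-]+\." + TLDS, re.IGNORECASE)

    text = "...strong reasoning ability.OpenAI has claimed..."
    print(pattern.findall(text))  # -> ['ability.Open']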
whois says no, and it seems there's a close one at https://tld-list.com/tld/open that's owned by (strangely enough) American Express Travel. It could be yet another typo for foo.open.ai, which would work today, no new gTLD required (I mean, they have damn near unlimited money, just buy out whoever owns it now)
In a mathematical conversation, someone suggested to Grothendieck that they should consider a particular prime number. “You mean an actual number?” Grothendieck asked. The other person replied, yes, an actual prime number. Grothendieck suggested, “All right, take 57.”
This paper has been available for a few weeks, and I wrote an article [1] exploring how to apply its inner workings to the design of multi-agent systems. If you can design "reasoning" at the model level, you can also design "reasoning" in larger, more complex systems using the same principles.
- creating more efficient models such as MoE based DeepSeek
- getting their hands on cutting edge GPUs all the same
I think it was Dylan Patel (from semianalysis) on Dwarkesh that mentioned one scam is for a Chinese source to arrange for a SOTA NVidia cluster to be bought/installed in some non-embargoed country, then dismantled and shipped to China.
"HarmonyOS NEXT (Chinese: 鸿蒙星河版; pinyin: Hóngméng Xīnghébǎn) is a proprietary distributed operating system and a major iteration of HarmonyOS, developed by Huawei to support only HarmonyOS native apps."
HarmonyOS NEXT is based on an open source core, OpenHarmony [0], with proprietary additions.
So, not hugely dissimilar from iOS (lots of bits of which are open source, most significantly the core of its XNU kernel) or Android (considering that the proprietary Google Mobile Services is de facto a mandatory component).
They spent a lot of money to find a lot of shallow gradients. Everyone else can climb those same gradients by putting in a little bit of money. Every single funded vertical and research org is proving this. Players in third place and below are incentivized to release their weights to develop an ecosystem around them. Meta and Tencent get to ensure the technology doesn't evolve beyond them by commoditizing their complement and releasing stuff like Llama and Hunyuan for free.
Furthermore, OpenAI hasn't stumbled across a defensible moat. There's zero switching cost to move to another product, and they don't control any major panes of glass to stay as a default.
If OpenAI doesn't find a moat soon, they're gonna be cooked. The value of foundation models will plummet.
They won’t, because they drove away all their actual talent to Anthropic and elsewhere in pursuit of the dumbest version of SV product dev, and are now forced to do benchmark hacking in a paper-thin ruse to convince the market they still have the talent to compete. The o1 series models are unusable in practice, while Claude and the new MCP protocol work are becoming the basis of a bunch of actually functional applications.
I wish HN would stop devolving into Reddit. This comment is the same boring "joke" that has been repeated 100 times on every platform, and keeps being posted for karma. It adds nothing to the conversation.
If you joined 8 months ago it might be hard to recognize. I've been on HN for more than a decade, and the quality of discourse has dropped drastically, especially in the last 3-4 years. This is a problem with the broader web, not just HN. Tech / startups is now a mainstream topic that attracts a lot of people who are not really in the weeds and are only able to write surface-level comments.
Regarding the name, open is just a word. Apple doesn't sell apples. The company never promised to open source every model, only to make them accessible to the public, so you're arguing semantics that lead to no improvement in the technical conversation.
100% agree. If you’ve been here for any length of time you’ve seen it, and nothing is added by the repetition.
Perhaps we should just string-sub to IAnepO or some such, so we can engage with the models and company as it is, without dealing with the (empty) semantics of the name.
I think this is both a harmful and irrational attitude. Why focus on some trivial mechanical errors and disparage the authors for it instead of the thing that is much more important, i.e., the substance of the work? And in dismissing work for such trivial reasons, you risk ignoring things you might have otherwise found interesting.
In an ideal world would second-language speakers of English proofread assiduously? Of course, yes. But time is finite, and in cases like this, so long as a threshold of comprehensibility is cleared, I always give the benefit of the doubt to the authors and surmise that they spent their limited resources focusing on what's more important. (I'd have a much different opinion if this were marketing copy instead of a research paper, of course.)
>in dismissing work for such trivial reasons, you risk ignoring things you might have otherwise found interesting
Not dismissing work for trivially avoidable mistakes risks wasting your precious, limited lifespan investing effort into nonsense. These signals are useful and important. If they couldn't be bothered to proofread, what else couldn't they be bothered to do?
>spent their limited resources focusing on what's more important
Showing that you give a crap is important, and it takes seconds to run through a spell checker.
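E.g., one pip install and a few lines (pyspellchecker is just the first tool to hand; aspell or Word would do the same job):

    # pip install pyspellchecker
    from spellchecker import SpellChecker

    spell = SpellChecker()
    # made-up sentence with two typos
    words = "We evaluate the modle on severl benchmark datasets".split()
    for word in spell.unknown(words):
        print(word, "->", spell.correction(word))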
Well, it's not exactly a research paper, more an overview of the problem and suggested techniques, but it'd still be interesting to hear some criticism based on the content rather than the (admittedly odd) failure to run it through a spell checker. I do wonder why it was written in English, apparently targeting a western audience.
Two of the authors are from "Shanghai AI Labs" rather than students, so one might hope it had at least been proofread and passed some sort of muster.
I guess the strategy of OpenAI now would be to keep a small edge at all times, integrate with businesses fast, possibly kickstart new businesses by supporting them, and try to be synonymous with the best in AI (maybe along with DeepMind). I cannot think of any other moat, unless somehow they have a lot of proprietary and useful data (like in-company data) that others cannot replicate.
But that edge will become more and more expensive, while the competition will cover more and more of the task space and make it less profitable for OAI.
Many people are dismissing this paper because it has errors in spelling and grammar.

This is a terrible heuristic for evaluating AI papers. If you use it, you will miss a lot of good work by very strong researchers with below-average English writing skills.
I have not read this paper carefully so claim nothing one way or the other about its quality. It superficially seems like a pleasant and timely survey although a little flag-planty.
I can't help feeling that it sounds like a bad strategy to claim this is a good reason "not to worry" about something.

If I were "OpenAI", I would rather read the content than evaluate the form of such articles to decide whether I should "worry" or not.

It seems like the most intelligent method.
Have they released anything on how it works other than "test time compute"? I wonder how similar it is to what's being proposed on this roadmap, that sounds close to what I imagine OpenAI are doing. I guess we'll see when they open source it.
It seems to me that for the most capable and useful models, openness almost exclusively benefits businesses, or maybe academic organizations with money for serious hardware. I know what I can run on my 4090 at home but the results pale in comparison to the commercial services. I see why people consider these matters important from a theoretical standpoint but from a practical standpoint it doesn’t seem particularly consequential. I self-host a few FOSS server applications that are primarily sold as SaaS subscriptions, and folks are often very critical of those businesses benefitting from the “open source” label because they’re often seemingly deliberately difficult to self-host. This seems to be an order of magnitude less open than that. Is there some use case for people with reasonable hardware that I’m just not aware of?
I look at this as being for the reasonable hardware of the future. This is starting to look like actual AGI, and I don't think actual AGI is going to run on a 4090. But an H100 starts to sound like a mass-market product even with the $50k price tag if it actually can run an AGI.
You can get cheaper hosting of open weights from a commercial provider than you can closed weights from the same company. So even if you’re not hosting yourself, openness is a major factor for price competitiveness.
"Now AI has made everything more complex!" "AI is embedded in everything we do"...
Sounds like marketing gibberish and obfuscation, combined with self promotion.
That's just my read at first sniff.