People have been saying that GPUs randomly happened to be good at ML since at le...

vlovich123 · 2024-06-06T18:44:06 1717699446

The history page for CUDA is pretty accessible [1]. It originated from experiments at Stanford in 2000, with Ian Buck taking on leadership of CUDA development in 2004.

> In pushing for CUDA, Jensen Huang aimed for the Nvidia GPUs to become a general hardware for scientific computing. CUDA was released in 2006. Around 2015, the focus of CUDA changed to neural networks.[8]

Credit to Jensen for pivoting, but I recall hearing about CUDA networks from Google tech talks in 2009 and realizing they would be huge. It wasn't anything unique to realize NNs were a huge innovation but it did take another 5 years for it to mature enough and for it to become clear that GPUs could be useful for training and whatnot. Additionally, it's important to remember that Google had a huge early lead here & worked closely with Nvidia since CUDA was much more mature than OpenCL (due to intentional sabotage or otherwise) and Nvidia's chips satisfied the compute needs of that early development.

So it was more like Google leading Nvidia to the drinking well and Nvidia eventually realizing it was potentially an untapped ocean and investing some resources. Remember, they also put resources behind cryptocurrency when that bubble was inflating. They're good at opportunistically taking advantage of those bubbles. It was also around this time period that Google realized they should start investing in dedicated accelerators with their TPUs because Nvidia could not meet their needs due to lack of focus (+ dedicated accelerators could outperform) leading to the first TPU being used internally by 2015 [2].

Pretending like Jensen is some unique visionary seeing something no one else in the industry did is insane. It was a confluence of factors and Jensen was adept at navigating his resources to take advantage of it. You can appreciate Nvidia's excellence here without pretending like Jensen is some kind of AI messiah.

[1] https://en.wikipedia.org/wiki/CUDA

[2] https://en.wikipedia.org/wiki/Tensor_Processing_Unit

bcatanzaro · 2024-06-06T23:54:40 1717718080

I was at NVIDIA at that time working on ML on GPUs. Jensen is indeed a visionary. It’s true as you point out that NVIDIA paid attention to what its customers were doing. It’s also true that Ian Buck published a paper using the GPU for neural networks in 2005 [1], and I published a paper using the GPU for ML in 2008 while I did my first internship at NVIDIA [2]. It’s just not true that NVIDIA’s success in AI is all random chance.

[1] https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1575717

[2] https://dl.acm.org/doi/pdf/10.1145/1390156.1390170

talldayo · 2024-06-06T17:59:28 1717696768

Yeah, you really get this sense when you page through their research history on the topic: https://research.nvidia.com/publications?f%5B0%5D=research_a...

CUDA won big because it made a big bet. If OpenCL were ubiquitous and at feature parity with CUDA, maybe there would be more than one player dealt in at the table today. But everyone else folded while Nvidia ran the dealer for 10 long years.