Unfortunately there's not really any competition in the GPU space for AI use. AMD hasn't stepped up, and Intel is barely getting its footing for basic gaming functionality. Nvidia has an established monopoly on the GPGPU space, and forcing their hand, if it led them to withdraw Linux compatibility, would push everyone who needs Nvidia off Linux. At least until an alternative became available, at which point those customers would ditch Nvidia for having screwed them over once already.
This is where WSL pulls its mask off. Windows will become (even more so) the platform of choice by being a "more compatible" version of Linux than Linux. And AI developers don't seem to have much affinity for open-source. Why would they care about being stuck on Microsoft and Nvidia's proprietary platforms when they're building their own proprietary models?
Great point. WSL has always been a wolf in sheep's clothing, in my eyes.
I'm not saying it's not useful. It certainly fills a role people clearly wanted filled. But you'd be naive to think Microsoft isn't taking on an "embrace, extend, extinguish" mentality. Again.
Also, I'm sure AI devs would be all for making it more difficult to run models on the desktop so you'd be forced to use a cloud service. They could even keep using a non-GPL patched Linux on their servers because it's not like they're distributing anything.
With Apple's significant work on AI, from the hardware on up, there's a very near future where many AI applications will train models on a Mac[0] directly in Xcode[1], bundle them with the iOS/macOS application (and browsers), and run inference directly on the device. Apple will also be deploying more and more of their own foundational models shipped with the hardware/OS. Most user AI applications will be as simple as "call this Core ML API". With the links provided it could be said we're already there and they've been laying the groundwork for years.
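To make "call this Core ML API" concrete, here's a rough sketch of that workflow using coremltools from Python (my choice of tooling, not something from the links above); the model, input shape, and feature names are all made up for illustration:

    # Hypothetical sketch: convert a small PyTorch model to Core ML so it can be
    # bundled with an iOS/macOS app, then sanity-check a prediction on the Mac.
    import torch
    import coremltools as ct

    class TinyClassifier(torch.nn.Module):        # stand-in for "your model"
        def __init__(self):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 4))
        def forward(self, x):
            return self.net(x)

    model = TinyClassifier().eval()
    example = torch.rand(1, 16)
    traced = torch.jit.trace(model, example)      # coremltools wants a traced/scripted model

    mlmodel = ct.convert(
        traced,
        convert_to="mlprogram",
        inputs=[ct.TensorType(name="features", shape=(1, 16))],
    )
    mlmodel.save("TinyClassifier.mlpackage")      # this is what gets dropped into Xcode

    # On the Mac, inference is one call into the Core ML runtime:
    print(mlmodel.predict({"features": example.numpy()}))

The resulting .mlpackage gets added to the Xcode project and ships inside the app bundle, and the app calls the same model through the Core ML framework on device.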
This hits Nvidia and the big GPU cloud providers twofold:
1) Training. We already see with the unified memory architecture of Apple Silicon that Apple can provide accelerated training with enough RAM to train/finetune very large models[2] (see the sketch below). 800GB/s memory bandwidth (192GB!!!) is very close to the RTX 4090 (1TB/s). Also note how many PCIe x16 slots the Mac Pro has... You can be certain that follow-up Apple Silicon will eat closer and closer into Nvidia datacenter GPU performance/capability, and with Apple's resources potentially even surpass it for all but things like training massive LLMs from scratch a la Google, Meta, OpenAI, etc.
2) Inference. Why pay AWS or whoever to run your inference when users have already spent their money on hardware that does it faster (no network round trip) and more privately, on the devices they already have? iOS 17 already does some AI tuning (FINALLY) on device with auto-correct - expect to see this show up everywhere in iOS and iOS applications.
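To put something behind the training point in 1), this is roughly what accelerated training on Apple Silicon looks like today through PyTorch's MPS backend; the model and data are toy placeholders, and the point is that the GPU works out of the machine's unified memory rather than a fixed slab of VRAM:

    # Minimal sketch: train/finetune on the Apple Silicon GPU via PyTorch's MPS backend.
    # Model and data are toy placeholders; .to("mps") puts everything in unified memory.
    import torch

    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 4096), torch.nn.ReLU(), torch.nn.Linear(4096, 10)
    ).to(device)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.CrossEntropyLoss()

    for step in range(100):
        x = torch.randn(64, 1024, device=device)        # fake batch
        y = torch.randint(0, 10, (64,), device=device)  # fake labels
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()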
I don't make many predictions but this one is obvious. I'm convinced the next generations of Apple Silicon and the Apple software ecosystem, developer experience, etc will be the first very real threat to Nvidia. When this really gets going Intel and AMD will be permanently left at the starting line (where they are now) for all but very bespoke use cases and deployments like Frontier[3].
For people who take issue with macOS, look at the progress of Asahi...
Then, once Google/Android catches up things will get REALLY interesting. This is the future and the current grab for Nvidia GPUs and clouds will eventually look like crypto mining - a few great years to be capitalized on when the getting was good.
> 800GB/s bandwidth memory (192 GB!!!) is very close to the RTX 4090
The 40xx is not the competition. A DGX H100 has 2TB of system memory, the H100 NVL has 2x ~4TB/s of memory bandwidth, and they are selling as fast as they can be made at any price Nvidia asks.
I quoted the 4090 because this is the Mac Pro, on only the second generation of Apple Silicon, in a desktop/workstation. In a few years Apple has nearly caught up to the flagship desktop GPU from Nvidia - who has been at this formerly niche market for 15 years. 192GB is already enough RAM to finetune very large LLMs, and we will be seeing more memory from Apple down the road.
192GB is four 48GB RTX A6000s or roughly two 80GB A100/H100s ($20k just in GPUs)... A completely maxed out Mac Pro (with wheels and fancy mouse) is under $13k. Oh yeah, and you can actually buy them.
The M3, M4, and beyond will have TFLOPS and RAM jumping by leaps and bounds. CoreML will start chipping away at CUDA developer mindshare from academia on up - you can already run LLMs at more than reasonable speed on a $2000 MacBook you already have lying around. Expect to see Mac Pros showing up in AI labs at universities, and coursework where every student just buys a Mac. Of course at this level it's mostly higher-level frameworks anyway, so the knowledge transfers to CUDA-powered whatever.
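If you want to try the "LLMs on the MacBook you already have" claim, this is roughly what it looks like with llama-cpp-python and its Metal backend (my example, not from any of the links; the model filename is a placeholder for whatever quantized model you download):

    # Rough sketch: run a quantized LLM locally with llama.cpp's Metal backend
    # via llama-cpp-python. The model path is a placeholder.
    from llama_cpp import Llama

    llm = Llama(
        model_path="model.gguf",  # placeholder: any quantized GGUF model file
        n_gpu_layers=-1,          # offload all layers to the Apple GPU (Metal)
        n_ctx=2048,
    )

    out = llm("Q: Why does unified memory matter for local inference?\nA:", max_tokens=128)
    print(out["choices"][0]["text"])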
As I noted, the big stuff from Nvidia will own the very high end (especially for training) for the foreseeable future and there's plenty of money to be made there as you noted.
But Apple is going to take big chunks out of Nvidia on the hardware side, and out of a host of startups on the software/SaaS side. For hardware that means less demand at the low-to-mid end and less demand for inference hosting, even for the big players.
For software, leveraging Apple's foundational models, dev tools, etc. will make it so that any app developer can drop data into Xcode to tune/build a model for their app. Apps will be able to further finetune with user data on device.
See the links I provided - this is here (in beta) today.
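On the on-device finetuning piece specifically: Core ML already has an "updatable model" mechanism an app can drive with MLUpdateTask. Here's a hedged sketch of marking a model updatable with coremltools, assuming a classifier in the older NeuralNetwork format (the layer and feature names are made up; ML Program models take a different path):

    # Hedged sketch: mark the last layer of an existing Core ML classifier as updatable
    # so an app can finetune it on device with user data (via MLUpdateTask on the Swift side).
    # Assumes the older NeuralNetwork format; all names here are illustrative only.
    import coremltools as ct
    from coremltools.models.neural_network import NeuralNetworkBuilder, SgdParams

    spec = ct.utils.load_spec("Classifier.mlmodel")   # placeholder model
    builder = NeuralNetworkBuilder(spec=spec)

    builder.make_updatable(["dense_out"])             # layer name is an assumption
    builder.set_categorical_cross_entropy_loss(name="loss", input="probabilities")
    builder.set_sgd_optimizer(SgdParams(lr=0.01, batch=8))
    builder.set_epochs(5)

    ct.utils.save_spec(builder.spec, "UpdatableClassifier.mlmodel")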