
To me it makes sense to define a common interface that can be implemented separately for AMD, Metal, etc., and then leave it up to the individual manufacturers to provide those implementations.
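As a rough sketch of what such an interface could look like (the backend names and the single matmul operation are purely illustrative, not any real vendor API):

    # Hypothetical vendor-neutral compute interface; nothing here maps to a real SDK.
    from abc import ABC, abstractmethod


    class ComputeBackend(ABC):
        """Interface each hardware vendor (NVIDIA, AMD, Apple/Metal, ...) would implement."""

        @abstractmethod
        def allocate(self, num_bytes: int) -> object:
            """Allocate a device buffer and return an opaque handle."""

        @abstractmethod
        def matmul(self, a: object, b: object) -> object:
            """Multiply two device buffers and return a handle to the result."""


    class MetalBackend(ComputeBackend):
        """Apple's implementation would wrap Metal kernels here."""

        def allocate(self, num_bytes: int) -> object:
            raise NotImplementedError("left to the vendor")

        def matmul(self, a: object, b: object) -> object:
            raise NotImplementedError("left to the vendor")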

I'm sitting in an office with a massive number of MacBook Pro (M-series Max) laptops that usually sit idle, and I wish Apple would realize the coup they could pull off if I could also run the typically-NVIDIA workloads on these hefty, yet underutilized, Mx machines.




Apple could unlock so much compute if they gave customers a sort of "Apple@Home" deal: allow Apple to run distributed AI workloads on your mostly idle, extremely overpowered Word/Excel/VSCode machine, and you get compensation dropped straight onto the credit card linked to your Apple account.


BTW, at our day job we've been running a "cluster" of M1 Pro/Max machines with Ollama and local LLMs. Corporate rules prevent remote access to the machines, so we built a quick-and-dirty pull system where individual developers pull jobs from a central queue, run the LLM workloads against the local Ollama service, and push the results back centrally.

Sounds kludgy, but introduce enough constraints and this ends up being the best solution.
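A minimal sketch of that kind of pull worker, assuming a hypothetical central queue (QUEUE_URL and its /next and /result endpoints are placeholders) and using Ollama's standard local HTTP API:

    # Pull-style worker: outbound pulls only, no inbound access to the laptop.
    import time
    import requests

    QUEUE_URL = "https://internal-queue.example.com"    # hypothetical central queue
    OLLAMA_URL = "http://localhost:11434/api/generate"  # local Ollama service

    def run_worker(model: str = "llama3") -> None:
        while True:
            # Pull the next job from the central queue.
            job = requests.get(f"{QUEUE_URL}/next", timeout=30).json()
            if not job:
                time.sleep(10)
                continue

            # Run the prompt against the local Ollama service.
            resp = requests.post(
                OLLAMA_URL,
                json={"model": model, "prompt": job["prompt"], "stream": False},
                timeout=600,
            ).json()

            # Push the completion back centrally.
            requests.post(
                f"{QUEUE_URL}/result",
                json={"job_id": job["id"], "output": resp["response"]},
                timeout=30,
            )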


Do you have price-performance numbers you can share on that? Like compared against local or cloud machines with RTX and A100 GPUs?


>> Do you have price-performance numbers you can share on that? Like compared against local or cloud machines with RTX and A100 GPUs?

Good question; the accounting is muddy --

1. Electricity is the parent company's responsibility, so while it is a factor in OpEx, it isn't a factor for us. I don't think it even gets submetered. Obviously, one wouldn't want to abuse this, but maxing out MacBooks doesn't seem close to abuse territory.

2. The M1/M2/M3 machines are already purchased, so while that is major CapEx, it is a sunk cost and also an underutilized resource for most of the day. We assume no wear and tear from maxing out the cores; not sure that's a perfect assumption, but it's good enough.

3. Local servers are out of the question at a big company outside of the infra groups; it would take years to provision them, and I don't think there's even a process for it anymore.

The real question is cloud. Cloud with RTX/A100 would be far more expensive, though I'm sure more performant (TPM calculation left to the reader :-)). I'd leave those for fine-tuning, not for inference workloads. Non-production inference is particularly bad because you can't easily justify reserved capacity without some constant throughput. If we could mix environments, it might make sense to go all cloud on NVIDIA, but having separate environments with separate compliance requirements makes that hard.

Jokes aside, I think a TPM calculation would be worthwhile and perhaps I can do a quick writeup on this and submit to HN.
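The rough shape of that tokens-per-minute cost comparison would be something like the sketch below; the formula is the point, and the numbers in the example calls are placeholders rather than measurements from our setup:

    # Back-of-envelope USD per 1M tokens, given an hourly machine cost and measured throughput.
    def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
        tokens_per_hour = tokens_per_second * 3600
        return hourly_cost_usd / tokens_per_hour * 1_000_000

    # Placeholder inputs only: a rented cloud GPU vs. an already-paid-for MacBook
    # whose marginal cost is electricity (which, per point 1, isn't even billed to us).
    print(cost_per_million_tokens(hourly_cost_usd=3.00, tokens_per_second=50))  # cloud GPU (placeholder)
    print(cost_per_million_tokens(hourly_cost_usd=0.02, tokens_per_second=20))  # idle MacBook (placeholder)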


If Apple were doing an Apple@Home kind of deal they might actually want to give away some machines for free or super cheap (I realize that doesn't fit their brand) and then retain perpetual rights to run compute on them. Kind of like advertising, but it might actually be doing something helpful for someone else.


>> If Apple were doing an Apple@Home kind of deal they might actually want to give away some machines for free or super cheap

In such a case, my guess is that the value of the free machine would be outweighed by the increased cost of electricity.


I think in the future it's possible for homes to have a "compute wall," similar to Tesla's Powerwall. Every home has a Wi-Fi router; why not a compute wall for its needs?



