SMT allows concurrent execution of both threads (thus an independent front end, especially for fetch and decode), and certain core resources are statically partitioned, unlike warps being scheduled on an SM.
I'm not a graphics expert, but warps seem closer to run-time/dynamic VLIW than to SMT.
In actual implementation they are very much like very wide SIMD on a CPU core.
Each HW thread is a different warp, as each warp can execute different instructions.
This mapping is so close that translating GPU code to the CPU is relatively easy and performant.
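A minimal sketch of that mapping, assuming a 32-lane warp and an AVX-512 CPU (the function name and the FMA workload are illustrative, not taken from any real translator): each GPU "thread" becomes one SIMD lane, and the warp's lockstep instruction becomes one vector instruction.

    /* GPU kernel body, written per "thread" (one warp lane each):
           out[i] = a[i] * b[i] + c[i];
       On a CPU the same warp maps onto wide SIMD: each lane of the warp
       becomes a lane of a vector register. With AVX-512 (16 float lanes),
       two vector iterations cover one 32-wide warp. Assumes n is a
       multiple of 16. */
    #include <immintrin.h>

    void warp_as_simd(const float *a, const float *b, const float *c,
                      float *out, int n)
    {
        for (int i = 0; i < n; i += 16) {
            __m512 va = _mm512_loadu_ps(a + i);
            __m512 vb = _mm512_loadu_ps(b + i);
            __m512 vc = _mm512_loadu_ps(c + i);
            /* One vector FMA = the whole "warp" executing the same
               instruction in lockstep. */
            _mm512_storeu_ps(out + i, _mm512_fmadd_ps(va, vb, vc));
        }
    }

Branch divergence translates in much the same way, with per-lane mask registers standing in for the warp's execution mask, which is part of why GPU-to-CPU translation can stay reasonably performant.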
You are correct. I was just illustrating what kinds of processes fall under the umbrella term "packaging" in the context of semiconductor manufacturing. I was not talking about which particular processes are missing from the Arizona facility.
But you are right that it is CoWoS which is the missing ingredient.
Or the 90% are charged an absurd markup, because clearly they can deliver the hardware for the 10% use case at $1,000 and still make money on top, but they would rather charge the data centers $50k for the same product.
There are investors lined up around the block to hand over $$$$$ for SFHs on the market, while people delay starting families because they cannot afford housing or end up on the streets.
So, is the market really absurd?
Just because some billionaires, desperate for growth, are outbidding each other to turn their hundreds of billions into trillions does not mean that 90% of humanity cannot make use of ML running locally on cheaper GPUs.
Different markets, similar semantics. Both are artificially supply restricted.
In fact, you can argue that something is really wrong with our governance if housing is a human right and yet there are people profiteering from how unaffordable it has become.
I am more appalled at how long it has taken for big tech other than Google to standardize ML workloads and not be bound by CUDA.
Or they can't afford to sell the cards at consumer prices. If they take a loss in the consumer segment, they can recoup it by overcharging the datacenter customers.
That's how this scheme works. The card is most likely not profitable at consumer price points. Without this segmentation, consumer cards would trail many years behind the performance of datacenter cards.
You can theorize a million scenarios, but clearly no one here will know what really transpired for Nvidia to hobble their consumer chips. I really don't think consumer is a loss leader; GPUs for AI are a fairly recent market, while Nvidia has existed churning out consumer GPUs since the 90s.
But clearly, lack of competition is one thing that supports whatever rent Nvidia seeks.
Typical EVs are about 10-15% heavier than a comparable ICE vehicle. Yes, it makes a difference, but only marginally. Normal drivers without a lead foot get tens of thousands of miles from their tires, just like ICE vehicles.
Also, few EVs that are not pickups have 100 kWh batteries; 60-75 kWh is more typical.
The Mazda 6 is mid-sized and the Mazda 3 is compact; whether it has a hatchback or not is irrelevant. You are basically comparing a Honda Civic to a Honda Accord and saying they are the same size.
I don’t think so. Weight is a rule-of-thumb proxy for damage to the road; aggressive acceleration has a much bigger impact on wear.
All the people I know with EVs love the feeling of gunning it from a red light or when getting onto freeways. Yeah, they don’t peel out, but that’s only because the control system stops them.
There may be some kernel interface to allow userspace to toggle that, but that's not the same as being a userspace-accessible SCR (and I also wouldn't expect it to be passed through to a VM - you'd likely need a hypercall to toggle it, unless the hypervisor emulated that), though admittedly I'm not quite as deep in the weeds on ARMv8 virtualization as I would prefer at the moment.
Hmm, you’re right - maybe my memory doesn’t serve me correctly, but yeah, it seems it is privileged access, though the interface is open for any process to toggle the bit.
Without that kernel support, all processes in the VM (not just Rosetta-translated ones) are opted-in to TSO:
> Without selective enablement, the system opts all processes into this memory mode [TSO], which degrades performance for native ARM processes that don’t need it.
Before Sequoia, a Linux VM using Rosetta would have TSO enabled all the time.
With Sequoia, TSO is not enabled for Linux VMs, and that kernel patch (posted in the last few weeks) is required for Rosetta to be able to enable TSO for itself. If the kernel patch isn't present, Rosetta has a non-TSO fallback mode.
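A hypothetical sketch of what that per-process opt-in could look like from Rosetta's side, assuming the prctl() memory-model interface from the posted patch series; the PR_SET_MEM_MODEL / PR_SET_MEM_MODEL_TSO names follow that series as I recall it and may still change before merging, so the #ifdef keeps this buildable against unpatched headers.

    /* Per-process TSO opt-in, guarded so it compiles even without the
       patched kernel headers that define the new prctl constants. */
    #include <stdio.h>
    #include <sys/prctl.h>

    int main(void)
    {
    #if defined(PR_SET_MEM_MODEL) && defined(PR_SET_MEM_MODEL_TSO)
        /* Ask the kernel to run this process with TSO memory ordering. */
        if (prctl(PR_SET_MEM_MODEL, PR_SET_MEM_MODEL_TSO, 0, 0, 0) != 0) {
            perror("prctl(PR_SET_MEM_MODEL)");
            return 1;  /* stay on the default (weaker) ARM memory model,
                          i.e. the non-TSO fallback path */
        }
        puts("TSO enabled for this process only");
    #else
        puts("headers lack the memory-model prctl; kernel patch not present");
    #endif
        return 0;
    }

Everything else in the VM keeps the default memory model, which is exactly the per-process selectivity the quoted passage is asking for.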