Indeed - the 'dynamic' comes from 'dynamic logic'. Wikipedia: "It is distinguished from the so-called static logic by exploiting temporary storage of information in stray and gate capacitances." What Dennard realised was that you don't actually need a separate capacitor to hold the bit value - the bit is just held on the stray and gate capacitance of the transistor that switches on when that bit's row and column are selected, at which point the stray capacitance discharges through the output line.
Because of that, the act of reading a bit destroys the data. Therefore one of the jobs of the sense amplifier circuit - which converts the tiny voltage from the bit cell into a full external logic level - is to recharge the bit after every read.
But that stray capacitance is so small that it naturally discharges through the high, but not infinite, resistance of the transistor when it's 'off'. Hence you have to refresh DRAM, by reading every bit frequently enough that it hasn't discharged before you get to it. In practice you only need to read every row that frequently, because there's a sense amplifier for each column: a single row access reads (and restores) all the bits in that row, with the column address strobe just selecting which bit gets output.
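To make the sense-and-restore cycle concrete, here's a toy software model of a DRAM array (everything here - the names, the sizes, the sense threshold, the decay factor - is made up for illustration; the real thing is analog circuitry, not code):

    #include <stdbool.h>
    #include <stdio.h>

    #define ROWS 128
    #define COLS 64

    /* Toy model: each cell's charge leaks away over time, and a row
     * access senses every cell in the row (destructively), then
     * writes the whole row back at full strength. */
    typedef struct {
        double charge[ROWS][COLS]; /* 1.0 = fully charged '1', 0.0 = '0' */
    } dram_t;

    /* Reading is destructive: the sense amplifiers latch whatever
     * charge is left in the whole row, then restore it. The column
     * address (CAS) just picks which latched bit goes to the output. */
    static bool read_bit(dram_t *d, int row, int col) {
        bool out = false;
        for (int c = 0; c < COLS; c++) {
            bool v = d->charge[row][c] > 0.5;  /* sense threshold */
            d->charge[row][c] = v ? 1.0 : 0.0; /* write-back = refresh */
            if (c == col)
                out = v;
        }
        return out;
    }

    /* Leakage between accesses: every cell drifts toward discharged. */
    static void leak(dram_t *d, double decay) {
        for (int r = 0; r < ROWS; r++)
            for (int c = 0; c < COLS; c++)
                d->charge[r][c] *= 1.0 - decay;
    }

    int main(void) {
        dram_t d = {0};
        d.charge[3][7] = 1.0;                /* store a '1' */
        leak(&d, 0.3);                       /* some time passes */
        printf("%d\n", read_bit(&d, 3, 7));  /* prints 1; the read restored the row */
        return 0;
    }

The point of the model: read_bit touches the whole row, and restoring the charge is a side effect of every read - which is exactly why periodically reading each row doubles as the refresh.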
Yes, it totally misses the crucial and non-obvious trade-off that unlocked the benefits: the rest of the system has to take care of periodically rewriting every memory cell so that the charge doesn't dissipate.
In fact it took a while for CPUs or memory controllers to do it automatically, i.e. without the programmer having to explicitly code the refresh.
Static RAM (as it is typically used) never needs to be refreshed within typical computer power-on times (hours or days). Current DRAM must be refreshed at very much faster rates to be useful.
Because SRAM is essentially a flip-flop. It takes at least four transistors to store a single bit in SRAM; some designs use six. And current must continuously flow to keep the transistors in their state, so it's rather power-hungry.
One bit of DRAM is just one transistor and one capacitor. Massive density improvements; all the complexity is in the row/column circuitry at the edges of the array. And it only burns power during accesses or refreshes. If you don't need to refresh very often, you can get the power very low: if the array isn't being accessed, the refresh interval can be double-digit milliseconds, perhaps triple-digit.
Which of course leads to problems like rowhammer, where rows disturbed by frequent accesses to adjacent rows don't get the additional refreshes they'd need (because extra refreshes have a performance cost: any cycle spent refreshing is a cycle not spent accessing), and you end up with the RAM reading out different bits than were put in. That's the most fundamental defect conceivable for a storage device, but the industry is too addicted to performance to tap the brakes and address correctness. Every DDR3/DDR4 chip ever manufactured is defective by design.
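For the curious, the hammering itself is trivial; the published rowhammer demonstrations boil down to a loop like this sketch (x86-specific; the addresses must be chosen to map to different rows of the same bank, which this sketch doesn't attempt):

    #include <stdint.h>
    #include <emmintrin.h>  /* _mm_clflush (SSE2) */

    /* Alternately read two addresses, flushing them from the cache
     * each time so every read actually re-activates a DRAM row.
     * Enough activations within one refresh interval can flip bits
     * in the rows in between. Finding addresses with the right
     * row/bank mapping is the hard part and is not shown here. */
    void hammer(volatile uint8_t *a, volatile uint8_t *b, long iters) {
        for (long i = 0; i < iters; i++) {
            (void)*a;
            (void)*b;
            _mm_clflush((const void *)a);
            _mm_clflush((const void *)b);
        }
    }

The cache flush is the essential part: without it, the second and later reads would be served from cache and the DRAM rows would never be re-activated.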
A nitpick: if the chip is manufactured in CMOS technology (as is typical), then no, current does not have to flow to keep the transistors' state (it's sufficient that a potential difference is maintained), only to change it. There is, however, a tiny leakage current, which over a few billion transistors adds up.
The key point is that the refreshes do not need to happen very often. Something like once every 20 ms for each row was doable even with an explicit loop that the CPU had to execute periodically.
And this task soon moved to memory controllers, or at least got done by CPUs automatically, without the need for explicit coding.
I have always had some questions about these low-level details.
Back when it needed to be explicit code, what exactly was that code doing? I tried to find an example of what it might look like online, but search results are so muddy.
DRAM has destructive reads and is arranged in pages. When you read from a page, the entire contents of the page are read into an SRAM buffer inside the memory chip, the selected bit(s) are written out to the pins, and then the entire contents of the SRAM buffer are written back into DRAM.
For old DRAM, usually half the bits in an address selected the page, and the other half selected the word within the page (actually, often a single bit, extended to a full word by accessing multiple chips in parallel). Wire your address lines so that the page address is in the low-order bits, and any linear read of 2^(log2(DRAM chip size)/2) addresses is sufficient to refresh all RAM. Many early computers made use of this to do the refresh as a side effect; IIRC the Apple II was set up so that the circuitry updating the screen would also refresh the RAM.
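I don't have period code to hand, but in spirit the explicit refresh was something like this sketch, hung off a periodic timer interrupt (DRAM_BASE and ROW_COUNT are made-up placeholders, not from any particular machine):

    #include <stdint.h>

    /* Hypothetical mapping; on a real machine these would come from
     * the schematic, not a header. */
    #define DRAM_BASE ((volatile uint8_t *)0x8000)
    #define ROW_COUNT 128  /* e.g. a 16K x 1 part with 128 rows */

    /* Run often enough that every row gets touched within the chip's
     * specified refresh interval. With the address lines wired so the
     * row address is in the low-order bits, reading ROW_COUNT
     * consecutive locations activates every row, and the row
     * activation itself is what refreshes the cells. */
    void refresh_all_rows(void) {
        for (int row = 0; row < ROW_COUNT; row++)
            (void)DRAM_BASE[row];  /* the read is the refresh */
    }

And on some CPUs it was free: the Z80 had a built-in refresh counter (the R register) that put a fresh row address on the bus during every opcode fetch, precisely so designers could skip all of this.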
The inventor of DRAM, Robert Heath Dennard, died just a few months ago, and I was reading his obituary and his history.
I think the long and short of it is that DRAM is cheap. DRAM needs one transistor per data bit; competing technologies needed far more. SRAM, for example, needed six transistors per bit.
Dennard figured out how to vastly cut down complexity, and thus cost.