More

dragontamer · 2025-09-07T03:54:08 1757217248

RDRAND and RDSEED are both using quantum principles (aka: heat and temperature / truly quantumly random noise at the microscopic level in the CPU's transistors) to generate random numbers.

Well... a seed at least. And then they are expanded using AES encryption IIRC (which "shouldn't" be breakable, and even if it were breakable it'd probably be very difficult to follow). I think RDSEED takes hundreds (or nearly a thousand) cycles to complete, but we're still talking millions-of-bits of entropy per second. More than enough to shuffle a deck even if you're taking a fresh RDSEED every single card.

Every few months, it feels like "someone effed up RNG" becomes an article. But in practice, RDRAND / RDSEED are the primitives you need. And you should be getting that for free with Linux's /dev/urandom on modern platforms.

----------

I think RDSEED / RDRAND cannot be "proven secure" because of all the VMs we are running in practice though. So its something you need to be running on physical hardware to be 100% sure of security. So its still harder than it looks.

But its not "impossible" or anything. Just work to cover all the little issues that could go wrong. After all, these RDRAND/RDSEED instructions were created so that we can send our credit card numbers securely across the internet. They're solid because they _HAVE_ to be solid. And if anyone figures out a problem with these instructions, virtually everyone in the cryptographic community will be notified of it immediately.

---------

EDIT: I should probably add that using the shot-noise found in a pn-junction (be it a diode or npn transistor) is a fun student-level EE project if anyone wants to actually play with the principles here.

You are basically applying an amplifier of some kind (be it 3x inverters, or an OpAmp, or another NPN transistor) to a known quantum-source of noise. Reverse-avalanche noise from a Zener Diode is often chosen but there's many, many sources of true white-noise that you could amplify.

thijsr · 2025-09-07T07:41:47 1757230907

When you can modify the microcode of a CPU, you can modify the behaviour of the RDRAND/RDSEED instructions. For example, using EntrySign [1] on AMD, you can make RDRAND to always return 4 (chosen by a fair dice roll, guaranteed to be random)

[1] https://bughunters.google.com/blog/5424842357473280/zen-and-...

dragontamer · 2025-09-07T21:25:37 1757280337

I don't mean to say that RDSEED is sufficient for security. But a "correctly implemented and properly secured" RDSEED is indeed, quantum random.

IE: While not "all" RDSEED implementations (ie: microcode vulnerabilities, virtual machine emulation, etc. etc.) are correct... it is possible to build a true RNG for cryptographic-level security with "correct" RDSEED implementations.

------

This is an important factoid because a lot of people still think you need geiger counters and/or crazy radio antenna to find sufficient sources of true entropy. Nope!! The easiest source of true quantum entropy is heat, and that's inside of every chip. A good implementation can tap into that heat and provide perfect randomness.

Just yeah: microcode vulnerabilities, VM vulnerabilities, etc. etc. There's a whole line of other stuff you also need to keep secure. But those are "Tractable" problems and within the skills of a typical IT Team / Programming team. The overall correct strategy is that... I guess "pn-junction shot noise" is a sufficient source of randomness. And that exists in every single transistor of your ~billion transistor chips/CPUs. You do need to build out the correct amplifiers to see this noise but that's called RDSEED in practice.

klodolph · 2025-09-07T04:40:53 1757220053

What I’m impressed by is getting noise of a consistent level out of a circuit. That’s a nice second layer of difficulty to the “make some noise” EE project.

blobbers · 2025-09-07T05:52:25 1757224345

I think the noise has to be random... so its inherently inconsistent ;) .. maybe?

dragontamer · 2025-09-07T06:55:05 1757228105

Its easy to think if you can see it in both frequency and time domains.

So the fourier-transform of white noise is still.... white noise. Random is random as you say. But this has implications. That means the "wattage" of noise (ie: Voltage * Current == Watts aka its power) is a somewhat predictable value. If you have 0.5 Watts of noise, it will be 0.5 Watts of noise in the frequency-domain (after a fourier transform, across all frequencies).

The hard part of amplification is keeping it consistent across all specifications. I assume the previous post was talking about keeping white noise (which is "flat" across all frequency domains), truly flat. IE: It means your OpAmps (or whatever other amplifer you use) CANNOT distort the value.

Which is still student level (you cannot be a good EE / Analog engineer if you're carelessly introducing distortions). Any distortion of white-noise is easily seen because your noise profile weakens over frequency (or strengthens over frequency), rather than being consistent.

DHRicoF · 2025-09-08T13:04:59 1757336699

But most common noises are not white. you had to decolor it before.

dragontamer · 2025-09-08T15:32:45 1757345565

Alternatively, you can choose a proven source of white noise.

Such as the reverse-bias shot and/or avalanche noise at the pn junction of a reverse bias'ed Zener Diode. Which is white-noise into the hundreds-of-MHz. Maybe not good enough for RDSEED, but certainly good enough and fast-enough for most hobbyist projects who are experimenting with this for the first time.

kittikitti · 2025-09-07T07:32:40 1757230360

These are methods to generate cryptographically secure pseudo random numbers using a truly random seed.

dragontamer · 2025-09-07T19:12:32 1757272352

RDSEED returns a truly random seed (as long as you aren't getting hacked while running in a VM or something)

RDRAND is allowed to stretch the true random seed a little bit inside of hardware, which is sufficient for most purposes.

dragontamer · 2025-09-03T04:09:00 1756872540

There's many ways CPU utilization fails to work as expected.

I didn't expect an article on this style. I was expecting the normal Linux/Windows utilization but wtf it's all RAM bottlenecked and the CPU is actually quiet and possibly down clocking thing.

CPU Utilization is only how many cores are given threads to run by the OS (be it Windows or Linux). Those threads could be 100% blocked on memcpy but that's still CPU utilization.

-------

Hyperthreads help: if one thread is truly CPU bound (or even more specifically: AVX / Vector unit bound), while a 2nd thread is hyperthreaded together that's memcpy / RAM bound, you'll magically get more performance due to higher utilization of resources. (Load/store units are separate from AVX compute units).

In any case, this is a perennial subject with always new discoveries about how CPU Utilization is far less intuitive than many think. Still kinda fun to learn about new perspectives on this matter in any case.

dragontamer · 2025-09-02T23:05:14 1756854314

My mom was looking up church times in the Philippines. Google AI was wrong pretty much every time.

Why is an LLM unable to read a table of church times across a sampling of ~5 Filipino churches?

Google LLM (Gemini??) was clearly finding the correct page. I just grabbed my mom's phone after another bad mass time and clicked on the hyperlink. The LLM was seemingly unable to parse the table at all.

etherealG · 2025-09-05T05:22:19 1757049739

Because google search and llm teams are different, with different incentives. Search is the cash cow they keep squeezing for more cash at the expense of good quality since at least 2018, as revealed in court documents showing they did that on purpose to keep people searching more to have more ads and more revenue. Google AI embedded in search has the same goals, keep you clicking on ads… my guess would be Gemini doesn’t have any of the bad part of enshitification yet… but it will come. If you think hallucinations are bad now, just you wait until tech companies start tuning them up on purpose to get you to make more prompts so they can inject more ads!

dragontamer · 2025-09-02T09:26:26 1756805186

I don't think software emulation is very important.

Let's look at the lowest end chip in the discussion. Almost certainly the SAM9x60.... it is a $5 ARMv5 MMU chip supporting DDR2/LPDDR/DDR3/LPDDR3/PSRAM, a variety of embedded RAM and 'old desktop RAM' and mobile RAM.

Yes it's 32-bit but at 600MHz and GBits of RAM support. But you can seriously mass produce a computer under $10 with the chip (so long as you support 4-layer PCBs that can breakout the 0.75mm pitch BGA). As in, the reference design with DDR2 RAM is a 4-layer design.

There are a few Rockchips and such that are (rather large) TQFP that are arguably easier. But since DDR RAM is BGA I think it's safe to assume BGA level PCB layout as a point of simplicity.

---------

Everything smaller than this category of 32-bit / ARMv5 chips (be it Microchip SAM9x60, or competing Rockchips or AllWinner) is a microcontroller wholly unsuitable for running Linux as we know it.

If you cannot reach 64MBs of RAM, Linux is simply unusable. Even for embedded purposes. You really should be using like FreeRTOS or something else at that point.

---------

Linux drawing the line at 64MB hardware built within the last 20 years is.... reasonable? Maybe too reasonable. I mean I love the fact that the SAM9x60 is still usable for modern and new designs but somewhere you have to draw the line.

ARMv5 is too old to compile even like Node.js. I'm serious when I say this stuff is old. It's an environment already alien to typical Linux users.

zgs · 2025-09-02T09:52:38 1756806758

Out by a factor of five or more.

A $1 Linux capable ARM: https://www.eevblog.com/forum/microcontrollers/the-$1-linux-...

I'd expect that there were even cheaper processors now since it's eight years later.

dragontamer · 2025-09-02T23:25:02 1756855502

Microchip is always more expensive than the Chinese stuff, but Microchip contributions to Linux are mainline (!!!!), and is often worth the extra few $$$$.

Fully open hardware, with mainline Linux open source drivers. It's hard to beat SAM9x60 in openness, documentation and overall usability. It's specs are weaker but keeping up with mainline Linux is very, very relevant. Especially in this discussion

dragontamer · 2025-09-02T00:22:59 1756772579

> How is this the OS's sole role?

Embedded Design.

A PWM driver (or a hardware timer) will handle the nanosecond-to-nanosecond wait states and counts, but the OS has to still setup the hardware timer to send the right PWM wave down to the system.

Besides, the OS should have some degree of custom fan controls for any modern computer, embedded or not. My PC can control all of my fans for example.

bri3d · 2025-09-02T02:16:56 1756779416

Sure, I think the OP knows this, but another (arguably much more common) way to do fan control is to have a secondary control system (be it a separate management processor, fan IC, management core on the same SoC, whatever) know about temperature curves/thresholds and have that IC handle sensor input to set the PWM.

This is the usual way things are done on x86 with ACPI, for example - unless the OS or some userland fan manager elects to take over via the OSPM fan objects, the fans control is delegated to the BIOS/platform firmware. If I boot an OS with no notion of a fan on a common x86 motherboard, it will still cool reasonably well (usually). Same deal for Macs with SMC - unless the OS tells the SMC explicitly to quit handling the fan, the SMC deals with all the thermals with no intervention.

ggm · 2025-09-02T03:57:35 1756785455

Not wanting to tell on them, my intel SBC super lightweight cigarette-box board has non-PWM risers. you can add a fan, it's always-on. The BIOS doesn't do anything smart it just volts the fan.

I think it's not that unusual for people to delete things they hoped they didn't need, the device targets passive cooling deployments: Turns out a lot of us run them in hot locations.

snvzz · 2025-09-02T05:57:51 1756792671

Most importantly, even if control was delegated to the CPU, it could still take over in the event of temperature exceeding some safety threshold.

dragontamer · 2025-09-02T00:11:45 1756771905

Nit: Trees/Grass are even more of this than Crabs.

The two strategies for plants are to grow super tall to absorb the sun, or super wide (and small) to.... absorb the sun.

Tall needs wood or other 'strong' polymer to support height. Short and wide is perhaps weak from an individual level but far more efficient.

And trees and grass respectively have such genetic diversity that it's clear that none of these damn plants are of the same genetic line.

thethethethe · 2025-09-02T16:47:01 1756831621

Nit: grasses are a distinct genetic lineage, the Poaceae family. There are a few other linages outside of Poaceae that have convergently evolved to look like grasses, sedges and rushes, but they all fall in the same clade, Monocots.

Trees, on the other hand, are a growth habit, exhibited by species in a wide variety of plant families, even grasses (e.g palm trees).

dragontamer · 2025-08-27T04:21:35 1756268495

If you are disposing of the corn anyway, why not turn it into Ethanol and then burn it as car fuel?

The only real issue with Ethanol IMO is that corn Ethanol is preventing progress in advanced synthesis made out of, ex: switchgrass cellulose. There are better sources of ethanol if we invest into them.

chairmansteve · 2025-08-30T01:43:29 1756518209

"If you are disposing of the corn anyway, why not turn it into Ethanol and then burn it as car fuel?"

It costs more money and has a higher carbon footprint than simply using gasoline.

dragontamer · 2025-09-01T10:57:55 1756724275

More money yes.

The carbon footprint thing doesn't past review of the overall literature. There's one outspoken guy who has to bend over backwards and publishes media articles rather than keeping things academic who tries to make the public believe what you say, but I'm not convinced he's arguing in any serious manner.

dragontamer · 2025-08-19T16:08:22 1755619702

To clarify, I believe the issue is C++ unordered map iterators and when / where they are allowed to go invalid.

OpenAddressing means that an address of map[thing] could change on insert. Which means iterators and pointer invalidation concepts can go stale on insert.

C++11 standard for unordered_map guarantees this won't happen. But that forces slower implementations.

And now people rely upon the standard so we can't change it. At best we do fast_unordered_map or unordered_map2 with different guarantees.

tialaramex · 2025-08-19T17:59:04 1755626344

There are numerous factors where std::unordered_map makes API design choices that you would not make today and the invalidation of iterators (or lack thereof) is just one. A particularly frustrating issue is that std::unordered_map promises some things we suspect nobody cares about, but whereas Rust routinely does "crater runs" in which they discover to a good approximation whether their changes break the ecosystem (via hidden ABI dependencies, Hyrum's law etc.) there is little equivalent for C++ and much scare mongering about claimed large codebases hidden from sight which may use absolutely anything.

The most stupid thing about std::unordered_map is that it was standardized in 2011, so it isn't from 1998 like much of the C++ standard library containers, it's newer and yet apparently nothing was learned.

gpderetta · 2025-08-19T18:57:41 1755629861

It was standardized in c++11, but it was standardizing hash_map that was in the original STL pre-C++98 and was available as an extension in most STL derived standard libraries.

For me the remaining reason I still reach for unordered_map is if I need reference stability as most faster hash tables don't provide it (and I don't care enough about performance to build reference stability on top of a better hash map).

tialaramex · 2025-08-19T20:12:24 1755634344

When you want reference stability, do you need references to stay working even where something moved to a different hash table?

That is suppose we have two std::unordered_map<Goose> labelled A and B, and we put a particular Goose in A, later we get a reference to it, and we might, or might not, move it to B, but it definitely still exists. Do you need that? (As I understand it this is in fact what you get in std::unordered_map today)

Or for your purposes is it enough that it works only so long as the goose we're referring to is in A still and if we moved it out of A then it's fine that the reference is invalidated ?

gpderetta · 2025-08-19T21:14:27 1755638067

If I need the lifetime of an object to outlive it's presence in the map then I use boost intrusive or simply a map of (smart) pointers.

For me it is usually when I need a multi indexed map where I use a fast map of pointers for the fast access and unordered_map both to keep the object alive and as a secondary access key.

Really I should use boost intrusive or boost multiindex, but these days I value the implementation simplicity in many cases, even if I have to keep the indices in sync myself

senderista · 2025-08-19T23:07:32 1755644852

Can you use the CPython approach: an open-addressed hashtable mapping keys to indices into a dynamic array holding (pointers to) the actual keys and values? (Reference stability of course precludes implementing deletes by moving the last key/value entry into the vacated slot; instead you must track empty slots for reuse.)

masklinn · 2025-08-20T04:56:34 1755665794

As soon as the hashmap needs to resize the array will be realloc’d, so it won’t be stable unless you add virtual memory tricks or an indirection via some sort of segmented array.

dragontamer · 2025-08-20T16:01:37 1755705697

Of course. That's just map<obj*>. The pointer-to-pointers could move but the pointers are now staying put. But now you have a bit of a lifetime worry.

Map<shared_ptr<obj>> with weak_ptr probably is the best general solution. If the object decided to delete itself the weakptr will be set to null.

dragontamer · 2025-08-19T13:22:26 1755609746

Possible? Yes.

Legally? No. But it's totally possible.

There are micro-towers for cell phone testing you can buy and cell phones think it's a real tower. Law enforcement uses one of them to track criminals IIRC.

But it's very tightly regulated.

dragontamer · 2025-08-14T20:03:32 1755201812

I don't believe it's possible to vectorize the classic heap.

I've seen vectorized and SIMD heap implementations. They are a different data structure entirely. You basically work with 16-sorted items and then sort your working set (16-items) with any node, allowing you to generalize the push down or push up operations of a heap. (Sort both lists. Top16 make a node and stay here. Bottom16 push down and recurse).

This is very very similar to a classic heap but you need a lot of operations so that you have enough work to SIMD.

Sorting is after all, a highly parallelized operation (Bionic Sort, MergePath, etc) and is a good basis for generalizing single threaded data structures into a multi thread or SIMD version.