> I wonder why it doesn't come like this by default. It runs faster too because it's no longer frequently thermal throttled.
Because pumping up the voltage also allows them to increase the base clock frequency without causing instability. Consumers learned to compare the frequencies during the CPU wars so that's what they juice for marketing purposes.
Newer Thinkpads are notorious for this. Many of them can operate fanless 99% of the time if they just undervolted them by 100mV like you did. It's the first thing I do with a new laptop after installing a clean OS.
Cause the CPU can get unstable under certain loads. So Intel plays it safe and ramps up the voltage.
And no.. there is no easy way to measure this. Also every CPU is slightly different.
I ultimately gave up playing this game after spending hours testing and tuning. Each core is slightly different (on AMD; I’m sure Intel has something similar), and every tweak means a reboot plus a 1-24h stress test. Then some new workload comes along and it's a kernel panic/BSOD.
Stock settings just work, maybe I lost the silicon lottery but too tired to check anymore.
It's not just unstable, it's a security issue because control over CPU voltage allows someone to use that instability to compromise computations that are supposed to be performed securely (e.g. look up "Plundervolt").
If plundervolt is a viable attack you’ve got much bigger problems (the attacker must already have gained full privileges). Not sure how this is relevant to just using throttled or similar to conditionally undervolt the CPU with fixed levels that can be proven reasonably stable.
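For context on what "fixed levels" means in practice: tools in this space are usually described as writing one 64-bit value per voltage plane to MSR 0x150. Here is a minimal Python sketch of that encoding. It follows the publicly reverse-engineered layout (Intel does not document it, so every constant here is an assumption), the function and table names are made up for illustration, and it only computes the value; nothing is written to the CPU.

```python
# Sketch of the commonly described, reverse-engineered encoding for the
# voltage-offset mailbox at MSR 0x150. Not an official Intel interface:
# treat the constants and plane numbering as assumptions.
VOLTAGE_PLANES = {"CORE": 0, "GPU": 1, "CACHE": 2, "UNCORE": 3, "ANALOGIO": 4}


def encode_undervolt(plane: str, offset_mv: float) -> int:
    """Return the 64-bit value for a negative (or zero) offset in millivolts."""
    if offset_mv > 0:
        raise ValueError("this sketch only handles undervolting (offset <= 0)")
    # The offset is expressed in steps of 1/1.024 mV ...
    steps = int(round(offset_mv * 1.024))
    # ... stored as an 11-bit two's-complement field in bits 21..31.
    field = ((steps & 0xFFF) << 21) & 0xFFE00000
    # The fixed upper bits select the "write offset" command; the plane index
    # sits at bit 40.
    return 0x8000001100000000 | (VOLTAGE_PLANES[plane] << 40) | field


if __name__ == "__main__":
    for plane in ("CORE", "CACHE"):
        print(plane, hex(encode_undervolt(plane, -100)))
```

The point is that a fixed offset is just a constant. The Plundervolt concern is about untrusted privileged code choosing that constant, not about an owner applying a tested one.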
At the time, I ran a CPU stress test which put all cores at full load for a couple of hours.
Then I ran RAM tests.
Then I ran NVMe stress tests.
Then I ran GPU stress tests (NVIDIA GPU is underclocked).
Then I ran all of these tests together.
Worked butter smooth. Not a crash to this day.
To be fair, -100mV is on the safe side according to the articles I read at the time. Some folks run at -200mV or less. I don't need that kind of tweak.
chips do increasingly come like that as intel gets better at binning, they don't want to leave money on the table and are creating more SKUs in order to mark up good chips like yours (this is why silicon lottery is no longer in business)
the reason your chip didn't come like that is because intel plays it very safe when it comes to stability, and their margins for error are likely broader than yours. the stochastic nature of the failures means that the voltage margin between one crash for every 2 hours of stress test and one crash for every day of stress test can be hundreds of millivolts. but they're definitely working on it because pat doesn't want you getting free real estate
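To put rough numbers on the "stochastic failures" point, here is a small Python sketch. It assumes crashes arrive as a Poisson process with a constant average rate, which is my simplification and not something measured; the function name is illustrative.

```python
import math


def pass_probability(mean_hours_between_crashes: float, test_hours: float) -> float:
    """Chance that a stress test of `test_hours` sees zero crashes,
    assuming crashes follow a Poisson process (an assumption)."""
    return math.exp(-test_hours / mean_hours_between_crashes)


if __name__ == "__main__":
    # Compare a setting that crashes about every 2 hours with one that
    # crashes about once a day, for a 2-hour and a 24-hour stress test.
    for mtbf in (2, 24):
        for test in (2, 24):
            p = pass_probability(mtbf, test)
            print(f"crash every ~{mtbf}h, {test}h test: passes {p:.0%} of the time")
```

Under that model a chip that crashes about once a day still passes a 2-hour stress test more than 90% of the time, which is why a short clean run says little about long-term stability.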
When undervolting with an offset, most of the crashes I had were when the CPU state changed. Opening a browser and playing a video from idle, starting a transcode, stuff like that.
The only way to test is, unfortunately, to use it until it crashes.
It was sad seeing AMDGPU limit undervolting semi-recently. A couple of complaints or bits of damage ruined a practice that a lot of people were using just fine to save significant wattage: https://www.phoronix.com/news/AMDGPU-Lower-Power-Limit
However, this is not good, as it cuts the under-powering range too far. I was getting only about 7% less performance but 90W(!) less consumption when set to my 115W before. Also, I wonder whether we, as an OS of options and freedom, have to stick to such a very high reference for min values without the ability to override them through some sys ctrls. The commit was done by an AMD guy, and I wonder if it was maybe because of this post that I made a few months ago (business strategy?)
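A quick back-of-the-envelope check of those numbers, as a Python sketch. I am assuming the quote means the card was capped at 115W and drew 90W less than stock, so stock would be roughly 205W; that reading is a guess, and the function name is illustrative.

```python
def efficiency_gain(perf_ratio: float, limited_watts: float, saved_watts: float) -> float:
    """Performance-per-watt at the lower limit relative to stock.

    Assumes stock power = limited_watts + saved_watts (my reading of the quote).
    """
    stock_watts = limited_watts + saved_watts
    return perf_ratio * stock_watts / limited_watts


if __name__ == "__main__":
    gain = efficiency_gain(perf_ratio=0.93, limited_watts=115, saved_watts=90)
    print(f"perf per watt vs stock: {gain:.2f}x")
```

If that reading is right, it works out to roughly 1.66x the performance per watt for a 7% performance loss.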
I have certainly crossed a threshold where hardware seems good enough. I do not need the 99th percentile performance if it comes with a non-linear increase in power consumption.
>I have certainly crossed a threshold where hardware seems good enough. I do not need the 99th percentile performance if it comes with a non-linear increase in power consumption.
But this is how AMD, Intel and Nvidia have been pushing performance improvements to consumers every generation. To quote Darth Sidious: "UNLIMITED POWER!"
Not just AMD. I recently bought an Nvidia 4060 card and I can't set the power limit lower than something like 75-90 W, at least in Linux.
Of course, it's possible that the lower limit has always been there for nvidia, but I just didn't know because I went without a dedicated video card for like 4 years.
For AMD aren't the drivers (partially) open source? Can't the change be reverted?
Undervolting can make your CPU use less power and produce less heat. It’s popular with people attempting to overclock because of the thermal headroom it gives. Essentially it's the same concept as overclocking, in that the silicon lottery means CPUs can be better than their factory tuning, because Intel tunes the stock config for the lowest common denominator.
I have used this to moderate success on a 8th generation mobile Xeon to drop the temps ~7C under load, and get the system idle from 6w to 4w (going lower would have been really hard as at that point the CPU was responsible for very little of what remained).
Undervolting is not the same as downclocking; it is supplying less voltage, which has a surprisingly large effect on power and heat with no performance loss. However, your system can be much less stable.
There are multiple tiers of Xeon. The “base” tier (I am very much over-simplifying) is the same silicon as consumer desktop/mobile with some features turned on like ECC. Mobile workstations and the like get Mobile Xeons. It’s less crazy than Desktop Replacements with socketed Desktop CPUs.
It’s a pretty fuzzy barrier anyway. Like the original Skylake i9 desktop chips were basically just Xeons, right? It isn’t like there’s anything magic in a Xeon, it’s just a bunch of cores, why not pop one in a laptop… just don’t hurt your back carrying the power brick.
Why would you not want this, a properly undervolted system loses no perf and gains lifespan and draws less power. Amortized over the now increased lifespan of the processor you could end up with hundreds of dollars of savings, especially in a laptop where a dead CPU or battery often results in a full system replacement. Undervolting is why my 4 year old HP laptop still gets 6 hours of battery life even with heavy mobile use every day.
Undervolting is not underclocking. With undervolting, you keep performance the same, but use less voltage. It requires trial and error to find the "magic value" that you can reach without experiencing system instability.
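A rough way to see why a small voltage drop matters: dynamic switching power scales approximately with C·V²·f, so at a fixed clock the power goes with the square of the voltage. Here is a minimal Python sketch; the 1.2 V stock voltage is a made-up example value, and leakage power is ignored.

```python
def dynamic_power_ratio(v_old: float, v_new: float,
                        f_old: float = 1.0, f_new: float = 1.0) -> float:
    """Ratio of new to old dynamic power using the P ~ C * V^2 * f approximation.

    Capacitance cancels out; leakage (static) power is ignored.
    """
    return (v_new / v_old) ** 2 * (f_new / f_old)


if __name__ == "__main__":
    # Hypothetical example: 1.2 V stock, undervolted by 100 mV, same clock.
    ratio = dynamic_power_ratio(v_old=1.2, v_new=1.1)
    print(f"dynamic power at same frequency: {ratio:.0%} of stock")
```

With those example values a -100mV offset gives about 84% of the stock dynamic power at the same clock, i.e. the same performance for roughly 16% less switching power.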
To add to this, undervolting isn't about idle clocks, but temperature and power control under load. I typically only apply them to "turbo" boost clocks, which are a frequent cause of thermal throttling even with water cooling.
Intel's 13/14th gen CPUs draw significantly (~70-100W) more power on boost, which can stress the VRMs in otherwise correct pairings and lead to current/EDP or power limit throttling. Lowering the voltage at certain frequencies can allow the system to sustain higher clocks without performance degradation.
iirc a lot of Intel chips (12th gen+ esp) can be slightly undervolted to achieve better power efficiency and lower temperatures with negligible impact to clock speeds. particularly useful on laptops, mini PCs, and desktops that spend a lot of time at idle.
Because the CPU is never idle, there’s always some JS jockey’s piece-of-shit Electron app redrawing in a hot loop, some manic search tool re-indexing all the contents of all mounted drives etc
You can undervolt most laptop CPUs by 50-100mV without losing any performance. They’re usually tuned so far into diminishing returns out of the factory that it’s free efficiency.
In 2006, I got my first laptop, right before the release of Windows Vista. Unfortunately, running Windows Vista made my laptop hot, and the fan kicked in and made a lot of noise. I cannot remember the name of the software, but it could reduce the core voltage of my single-core Centrino processor. I managed to undervolt the CPU to the lowest possible voltage. That cooled the CPU down by a large margin, and the laptop became quiet again (and stayed stable). I was happy. The functionality disappeared with newer generations of Intel CPUs, and I have not been able to do anything similar since.
Back then, Intel had the best semiconductor process, and they had a wide margin on undervolting the CPU.
This reminds me of a funny/annoying thing I had happen to me.
A few years back, I bought a Dell laptop from their "workstation" line, a Dell Precision 7520. The default config when ordering these had a power-hungry Nvidia GPU on a dedicated card. I customized the laptop upon ordering to remove the GPU since I wasn't going to be doing anything that needed a GPU. (I just wanted lots of ports, a nice keyboard, and a touchpad with three buttons. Thinkpad was not an option at the time for reasons.)
Unfortunately, the 7520 firmware was hardcoded with power requirements for a fully-loaded system. It came with a big 180-watt brick. I knew that the laptop wasn't going to need that much power, so I would occasionally use a 90W Dell power brick that I had laying around. I turned off the boot-time warning about an undersized power brick, reasoning that if the battery started draining while plugged in, then I would give up and switch to the bigger brick. The battery never drained.
What did happen, though, is that sometimes I would notice that certain UI things were really, really slow. I always ran a lightweight Linux desktop on the laptop, so generally the web browser was the only thing that would cause any serious CPU or memory usage. For the longest time, I just put up with the occasional slowness on big heavy drunk-with-UX-power websites.
Eventually, I got to wondering what was wrong with this laptop that made it _feel_ so much slower _sometimes_ than any other Linux system I dealt with on a daily basis. One day, after working on the couch for a bit in the morning (on battery), I went back to my desk and plugged into the AC and noticed that Slack got _very_ slow. And so did a few other things. Undock the laptop, and things were fast again. Huh, I thought. That's weird. Nothing in the kernel logs or system journal but when I happened to look at the CPU, I saw it was reporting as an 800 MHz Intel Core i5-7300HQ! Okay, that's not right!
It turns out that Dell hard-coded the firmware to throttle the CPU when a lower-than-expected wattage power brick was used, even in configurations where that wasn't necessary or didn't make sense. This is silly. I would have much rather had my battery start discharging if the power became an issue than have the CPU silently throttled. (It's very common to monitor the battery! Not the CPU frequency!) (And no, the firmware warning message that I disabled didn't say that it would underclock the CPU if a lower-spec power brick was used. Just that it might lead to system instability or discharging the battery, or something to that effect.)
Anyway, there is a command you can run in Linux to force the CPU back to full speed. Once I did that, I never had any problems afterward.
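Since nothing showed up in the logs, the only way to spot this kind of throttling is to look at the CPU frequency itself. Below is a minimal Python sketch of such a check. It assumes a Linux system that exposes the usual cpufreq files (`scaling_cur_freq`, `cpuinfo_max_freq`) under `/sys/devices/system/cpu`; the function names and the 50% threshold are arbitrary choices of mine. It is not the command mentioned above, it only reports what it sees.

```python
from pathlib import Path


def read_khz(path: Path):
    """Read an integer kHz value from a sysfs file, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def throttled_cpus(sysfs_root: str = "/sys/devices/system/cpu",
                   threshold: float = 0.5) -> dict:
    """Return {cpu_name: (cur_khz, max_khz)} for CPUs currently running
    below `threshold` times their maximum frequency."""
    slow = {}
    for cpufreq in sorted(Path(sysfs_root).glob("cpu[0-9]*/cpufreq")):
        cur = read_khz(cpufreq / "scaling_cur_freq")
        top = read_khz(cpufreq / "cpuinfo_max_freq")
        if cur is None or not top:
            continue  # file missing or unreadable on this system
        if cur < threshold * top:
            slow[cpufreq.parent.name] = (cur, top)
    return slow


if __name__ == "__main__":
    for cpu, (cur, top) in throttled_cpus().items():
        print(f"{cpu}: {cur / 1000:.0f} MHz of {top / 1000:.0f} MHz max")
```

An idle CPU will legitimately sit at a low clock, so a check like this only means something while the machine is under load.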
> It turns out that Dell hard-coded the firmware to throttle the CPU when a lower-than-expected wattage power brick was used.
My laptop throttles the CPU to 800MHz when it thinks it detects a power brick that doesn't have the correct sense wire. Or, more commonly, when the power brick is not all the way plugged in (even if it is working fine).
Did I get that right? A laptop that needs a 230W power brick and wasn't really intended to run on 65W, though it does? 800MHz, that was what we had in the 90s, I assume that's the era? Meanwhile, my laptop is idling at ~9W playing YouTube. Crazy.
How long did the battery last? Less than 30m I assume.
The laptop is designed to accept power over some terrible barrel jack that isn't actually made of metal. The conductive coating can scrape off after just a couple months of use and render it completely inert. That had happened again so I was stuck with USB-C charging, and that particular laptop only supported up to 65W via its single Thunderbolt port. I eventually fixed this for good by just stripping the barrel off and hard soldering the power cable directly to the motherboard.
Anyway, indeed, that laptop was not designed to run off 65W. Mainly because it had a dedicated Nvidia GPU, but also because it had a bunch of other stuff (couple 6W fans, bright 4K panel, etc.).
The battery basically didn't exist from the very day I got the laptop brand new from Best Buy. I'm not sure if it's possible to get good battery life out of a 99Wh battery while pulling 230W. Of course, you could simply increase the battery capacity, but a not-insignificant fraction of laptop buyers want to bring them on planes so that'd be a pretty dumb idea.
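The arithmetic behind that, as a tiny Python sketch. It assumes a constant draw and that the full rated capacity is usable, both of which are optimistic; the function name is illustrative.

```python
def runtime_minutes(battery_wh: float, draw_w: float) -> float:
    """Idealized runtime: capacity divided by a constant power draw."""
    return battery_wh / draw_w * 60


if __name__ == "__main__":
    for draw in (230, 65):
        print(f"99 Wh at {draw} W: about {runtime_minutes(99, draw):.0f} minutes")
```

At the full 230W that is under half an hour even in the ideal case, which lines up with the "less than 30m" guess.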
Oh no, portable workstations are a whole nother thing. This was 100% marketed and sold as a laptop. But gaming laptops have never been good with battery life.
They even tried to make it thin and light, which makes me cringe every time. Gamers don't need thin and light, wtf.
Those can, but 10nm CPUs released after Plundervolt can't undervolt: it's disabled in UEFI, and not only is there no option to re-enable it in the UI, the EFI variable itself is write-locked. I have an X1 Gen 4 (Intel 11th gen) and you just can't do it.
On my Lenovo Legion 5 Pro with i7-13700HX I can disable undervolting protection in stock UEFI. The only problem is that Windows virtualization features get disabled when undervolting is allowed, and I kind of need those.
Perhaps it's just your laptop manufacturer. But people were able to undervolt on gen 11 by editing the EFI variables and then turning off virtualization.
I remember some boards, you would have to paint a cpu pin with nail polish to stop it from conducting, or carefully jump one of the several hundred pins on the socket…
That'd be a reason to take the re-enable option out of the UEFI UI, but write-protecting the EFI variable, when changing it already requires booting into an EFI editor, is just being dicks, pardon my French. Just let me enable it, I am aware of the risks.
I think the trouble is that it is not possible to distinguish (through reliable technical means) a responsible overclocker such as yourself versus an unscrupulous actor editing EFI variables in flash. So, because some bad guys/gals found a way to damage the system, all the good guys/gals suffer as a result, because the system cannot tell them apart. Reminds me of airport security...
Aren't you vulnerable to this regardless of whether you're using this tool? The vulnerability in question relies on untrusted code being able to lower voltages to very low levels, causing the CPU to malfunction. Using this tool or having it installed isn't a relevant factor. If you have untrusted code running on your PC, it's already game over, and any malicious tool can use the same API this tool uses to control voltages.
Not exactly. The promise of SGX and secure hardware enclaves is that the code that executes there should run with access to protected encrypted memory pages (enforced by the CPU VMM), and the state of the enclave can be remotely attested. Basically, it's designed to run a secure application in an untrusted computing environment as long as you trust the hardware to implement the features correctly.
But that's something that Signal implements on their own backend, not something that runs on consumer devices, so it's not really relevant to a discussion about the risks of undervolting your CPU.
I was directly replying to the parent's question of whether there were any uses of SGX that were not anti-consumer. Signal's use of it, is very much in line with my thinking of what constitutes pro-consumer.
I agree though, we're all getting slightly off topic
SGX is actually deprecated on client devices like PCs, so it is rather difficult to use it in anti-consumer ways now (and as mentioned in a sibling thread, makes this rather irrelevant to the topic of undervolting your own PC).
In my experience (working in the field at Anjuna), SGX and other Confidential Computing are quietly used on the server-side in enterprises a lot. It's a part of defense-in-depth, often to protect critical secrets and cryptographic keys, or the systems that manage them.
> We were able to corrupt the integrity of Intel SGX on Intel Core processors by controlling the voltage when executing enclave computations
> If you are not using SGX, no actions are required. If you are using SGX, it suffices to apply the microcode update provided by Intel to mitigate Plundervolt.
It's not nothing, but that seems minor to irrelevant to most people.
In all likelihood this tool does not work for most users, specifically in response to this vulnerability. If you're on the latest microcode, undervolting is no longer possible due to Intel's mitigation: https://www.intel.com/content/www/us/en/security-center/advi...
Which is a pity because my i7 Lenovo laptop is acoustically and thermally some kind of jet turbine in a case, because I was foolish enough to believe a review, and I really wish I could undervolt it so it can make it to lunchtime on a charge.
I was actually wrong about that, it turned out to be possible on my 11th gen Intel CPU but it was definitely not as easy as it should've been.
I used https://github.com/datasone/setup_var.efi to modify the UEFI variables. The README has all the info you'd need. It turns out that both a BIOS update and a microcode update are required to kill off this feature, and you can just configure the BIOS to not lock it.
Wow, I never considered a power attack from software of an untrusted OS. Ring -1 and SGX and the like lead to some very harsh security environments for modern processors. IMO if you want cryptographic security, you should probably use an external component that you control, but that isn't always possible and is never the cheaper option.
That resulted in a -10C average CPU temperature. Massive!
I don't remember last time I heard the fans. Not even with Docker + Jetbrains IDEs.
I wonder why it doesn't come like this by default. It runs faster too because it's no longer frequently thermal throttled.