You don't really have much of a choice in the high-end laptop world. Maybe this will be enough to push manufacturers to put AMD CPUs into high-end workstations. I'd kill for a ThinkPad P1 with AMD.
I stopped in 2007 and haven't looked back, and I advise friends and family to do the same. This is just more ammo for the "but why" rebuttal speech, and baby, "wanting your CPU to not die" is an awfully juicy bullet.
Watching Intel fuck themselves the last decade has been an absolute delight, but this, I could almost fap to this news.
Moore's Law is Dead shared an interesting video yesterday about these chips. Supposedly, leaks from his sources at Intel say that high voltages being pushed through the ring bus cause degradation. The leaks claim the ring bus shares the same power rail as the P and E cores, meaning its voltage is influenced by the voltage requested by the cores.
For context, the ring bus is responsible for communication between cores, peripherals, and the platform. This includes memory accesses, which means that if the ring bus fails and does something incorrectly, it could appear normal but result in errors far down the line.
Going beyond the video specifically, and considering what others have suggested as workarounds, it seems like ring bus degradation might be a decent candidate for the actual root cause of these issues.
Some observations around chips degrading were:
High memory pressure exacerbates the issue.
Chips with more cores deteriorate faster.
Some of the suggestions to work around the issue were:
Lower the memory speed.
Lower the voltage and clock speeds.
Disable the E cores.
All of those can be related to stress being put on the ring bus:
Higher voltage being put through the bus -> higher likelihood of physical damage
More memory pressure -> more usage of the bus, more opportunity for damage to accumulate
More cores -> more memory pressure
Slower memory speeds -> less maximum throughput -> less stress
I'm not claiming anything definitive, but I think my money is on this one.
The scariest part of this whole problem is that there is no way for owners of 13th/14th gen CPUs to figure out how badly their CPU is damaged. It's like holding a ticking bomb without knowing when it will go off!
100%. Whatever Intel does at this point, I don't trust it to be a fix so much as a mitigation or attempt to delay the inevitable until a few years after the warranty period.
If it's possible for people to return their 13th/14th gen processor and trade up for a 12th gen, that would be the safest solution.
I've heard speculation that this is exacerbated by a feature where the CPU increases the voltage to boost clocks when running single-core workloads at low temperatures. If that's true, having less load or better cooling may be detrimental to the life of the processor.
If the product has issues, it should be legally required to have a warranty extension, a recall, or both. Heck, they shouldn't be selling more units until it's figured out and patched.
It's absurd to say: "it might have problems but we'll keep selling it as is".
We have safety recalls. There should be product degradation recalls.
Class action lawsuit, but demand that the entire company be disqualified from operating for some time, instead of just asking for money that will amount to you getting like 10€.
They don't need a recall. If your processor ain't broke yet then the patch will (supposedly) prevent it from breaking and if it's ALREADY broke then Intel will (supposedly) replace it via RMA.
So what's the big fuggin' problem here? That Intel won't use the term "recall"?
The "problem" is that the more you understand the engineering, the less you believe Intel when they say they can fix it in microcode. Without writing an entire essay, the TL;DR is that the instability gets worse over time, and the only way that happens is if applied voltages are breaking down dielectric barriers within the chip. This damage is irreparable: 100% of chips in the wild are irreparably damaging themselves over time.
Even if Intel can slow the bleeding with microcode, they can't repair the damage, and every chip that has ever run under the bad code will have a measurably shorter lifespan. For the average gamer, that lifespan sometimes hasn't even covered the average warranty period.
+1. Lots of people are also likely to not have any idea about the situation and just think their PC crashes or acts up more. More of these issues can pop up over time.
A recall forces them to notify customers of the issue so the customer can act on it.
They can most likely prevent further breakdown through software. If the meters and controls are functioning correctly, they can undervolt the CPU. But it's not really a fix if that comes with a performance penalty. If it's a bug where the CPU maxes out the voltage when idle so it can do nothing faster, that could be fixed with no performance penalty, but that seems unlikely.
I’ve only recently become aware of the issue and that’s the way it feels.
But in the absence of a definitive test I think folks are concerned that they will be stuck with a CPU that continues to degrade prematurely. That seems like a valid concern.
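To be clear about what is and isn't possible here: a script can't tell you how much a chip has degraded, but it can at least catch the moment it starts returning wrong answers. A rough sketch in Python, assuming a compress/decompress round trip plus a hash is a reasonable stand-in for a real workload. The function names, payload size, and round count are all made up for illustration, and a clean run proves nothing about the health of the silicon.

```python
import hashlib
import random
import zlib


def make_payload(size: int, seed: int = 0) -> bytes:
    """Build the same pseudo-random bytes on every call (fixed seed)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def round_trip_digest(payload: bytes) -> str:
    """Compress, decompress, and hash the result of both steps."""
    packed = zlib.compress(payload, 6)
    unpacked = zlib.decompress(packed)
    return hashlib.sha256(packed + unpacked).hexdigest()


def count_mismatches(rounds: int = 50, size: int = 1 << 16) -> int:
    """Repeat a deterministic workload and count results that differ.

    On a stable machine every round yields the same digest, so any
    mismatch or decompression error points at instability somewhere
    (CPU, memory, or otherwise). This is NOT a measure of degradation.
    """
    payload = make_payload(size)
    reference = round_trip_digest(payload)
    mismatches = 0
    for _ in range(rounds):
        try:
            if round_trip_digest(payload) != reference:
                mismatches += 1
        except zlib.error:
            # A failed decompression of data we just compressed is
            # itself a sign that something computed incorrectly.
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print("mismatches:", count_mismatches())
```

Anything above zero means the machine gave two different answers to the same question, which is worth acting on. Zero just means it didn't fail this time, which is exactly the concern above.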
So what's the big fuggin' problem here? That Intel won't use the term "recall"?
Would you say the same thing about a car?
"We know the door might fall off but it has not fallen off yet so we are good."
The chances of that door hurting someone are low and yet we still replace all of them because it's the right thing to do.
These processors might fail any minute and you have no way of knowing. There are people who depend on these for work, and systems that are running essential services. Even worse, they might fail silently and corrupt something in the process, or cause unnecessary debugging effort.
If I were running those processors in a company I would expect Intel to replace every single one of them at their cost, before they fail or show signs of failing.
Those things are supposed to be reliable, not a liability.