Intel's CPUs Are Failing, ft. Wendell of Level1 Techs
(www.youtube.com)
The last generation has been a total mess for both Intel and AMD.
AMD had motherboards frying CPUs, crazy stupid POST issues due to DDR5 memory training (my personal build fails to POST like 25% of the time due to this exact same stupid shit), and just generally a less reliable experience than previous gens.
Intel has much the same set of problems on their 13th/14th gen stuff: dead chips, memory training issues, instability.
Wonder if it's just a fluke that both x86 vendors are having a shitty generation at the same time, or if something else is at play.
Because they are pushing their chips even harder. AMD literally pegs them at the maximum temperature these days. It's basically factory overclocking for both companies. Of course it's going to run into issues: high voltage plus high temperature fries chips.
Factory overclocking is a marketing term. Overclocking means running a processor above its specified speed, but if it intentionally ships that way from the factory it is by definition operating within specification.
Fair point, though factory overclocking has been a thing for years with base and boost speeds on Intel and AMD CPUs. I guess they're just pushing them a little too much.
Sure, but the spec is not in line with what the silicon can take, leading to degradation and stability issues.
Memory training:
https://www.crucial.com/support/articles-faq-memory/ddr5-memory-training
Have you tried this?
If you don't let it finish, the system will continue to POST with unstable values.
The UX for this seems to be absolute shit. The system seems to hang, and give no indication of something going on? And in the end, the system may need a reboot to complete the process? It better give some indication when it's complete then, or else.
Absolutely, this is decidedly user hostile design.
It's just the easiest way to do this. Memory training is a very early step in the boot process. Firmware only has the CPU cache available as memory, and most hardware in the system isn't initialized yet. Most of this isn't even done by the UEFI firmware itself, but by calling a binary blob provided by the CPU manufacturer: for Intel it is called FSP, and for AMD I believe it is AGESA. I'd have to check, but I believe that at the point memory training is running, the PCIe bus has not even been brought up and scanned, so video output in this phase would require extensive reengineering of the early boot process by the CPU manufacturer, the firmware vendors, and the board manufacturer. PCIe has DMA, so making that work without memory might be a challenge. There are three easy-to-implement solutions though: POST codes if your mainboard has a display for them, serial output if the board has a serial port (though this needs another device to read the messages), and, cheapest of all, a flashing LED on the board labeled "memory training in progress".
Flashing LED would be great IMO. And a HUGE improvement.
Not to mention hearing about it through word of mouth... Just 🤦‍♂️
Holy crap. Never heard of this. Thanks!
My biggest complaint is that there should be a visual indication of this process. Many users are utterly unaware it is going on.
Maybe x86/x64 has reached the end of its development lifecycle, and both companies are at the point where they simply can’t squeeze any more out of it, so every trick they try results in these abnormalities?
I dunno.
Look at the heat sinks on Gen 5 SSDs. To me the marginal speed benefit of the platform introduces a lot of problems you have to deal with, like heat. I would've preferred they just focus on bandwidth to allow users to get the same performance as Gen 4 with half the PCIe lanes.
In regards to the memory training: have you double-checked how much RAM your CPU actually supports, and at what frequencies? For example, even the 7950X3D supports only DDR5-3600 when you put in more than 2 sticks of RAM, leading to issues with memory training taking long, not POSTing, or instability if you enable any form of overclocking in that scenario. I had that problem before, and switching from 4 sticks to 2 fixed everything. Just in case this might be your issue as well.
It's a pair of 16 GB 6000 MT/s sticks that I just run at stock 4800 MT/s, primarily because the BIOS fails to POST every 3rd or so time, shits itself, and resets to defaults. I've quit fucking with it because, frankly, it's fast enough, and going into the BIOS requires a 2nd reboot and memory retrain, which will fail 50% of the time and lead to the BIOS resetting itself, which leads to needing to reconfigure it, which....
When the system is up, it's perfectly stable, and stays fine through sleep states and whatever else until I have to reboot it for whatever reason (updates, mostly).
But honestly, if the memory controller can't handle dual-channel 4800 MT/s RAM, then it's really really fucked, because that's the bare minimum in terms of support.
I'd also add I have 3 mobile AMD based devices with DDR5, none of which exhibit ANY of this nonsense. Makes me think their desktop platform may well be legitimately defective, given how many people have this issue, and how it doesn't seem to be universal across even their own product stacks.
(And, yes, two of the mobile devices have removable RAM, so it's not some soldered vs DIMM thing.)
To be fair, I joined a couple of months after the release, but I have 0 of these issues. Maybe time to update your BIOS?