20-year-old Linux workaround is still slowing down AMD systems - Ars Technica


Proper slumber —

A small fix for CPUs that didn't properly sleep had decades-long consequences.

- Sep 26, 2022 8:51 p.m. UTC

A second-generation Epyc server chip from AMD, one that may have been running 2002-era Linux code slowing it down.

Getty Images

AMD has come a long way since 2002, but the Linux kernel still treats modern Threadrippers like Athlon-era systems, at least in one potentially lag-inducing respect.

AMD engineer Prateek Nayak recently submitted a patch to Linux's processor idle drivers that would "skip dummy wait for processors based on the Zen microarchitecture." When ACPI support was added to the Linux kernel in 2002 (written by Andy Grover, committed by Linus Torvalds), it included a "dummy wait op." The system essentially read data with no purpose other than delaying the next instruction until the CPU could fully stop with the STPCLK# command. This allowed for some power saving and compatibility during the early days of ACPI implementation, when some chipsets wouldn't move to an idle state when one would expect them to.
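
The sequence described above can be modeled in a few lines of Python. This is only a toy sketch, not the kernel's code: the `FakePorts` class, the `enter_c2` function, and both port addresses are hypothetical, and the choice of a power-management timer register as the target of the throwaway read is an assumption made for illustration.

```python
# Toy model of an I/O-port based C-state entry with a "dummy wait".
# Every name and port number here is hypothetical; nothing touches hardware.

class FakePorts:
    """Records port reads instead of performing real I/O."""

    def __init__(self):
        self.reads = []

    def read(self, port):
        self.reads.append(port)
        return 0  # the value is irrelevant to the caller


C2_PORT = 0x414        # hypothetical address of the C2 entry register
PM_TIMER_PORT = 0x408  # hypothetical address of a timer register


def enter_c2(ports):
    # Step 1: the read that requests the idle state.
    ports.read(C2_PORT)
    # Step 2: the dummy wait. The result is discarded; the read exists
    # only to burn time until the CPU has actually stopped.
    ports.read(PM_TIMER_PORT)


ports = FakePorts()
enter_c2(ports)
print([hex(p) for p in ports.reads])
```

On hardware that stops promptly after the first read, the second read is pure overhead, which is the cost Nayak's patch targets.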

But today's Zen-based AMD chips don't need this workaround, and, as Nayak writes, it's hurting them, at least in specific workloads on Linux. Testing with instruction-based sampling (IBS) workloads shows that "a significant amount of time is spent in the dummy op, which incorrectly gets accounted as C-State residency." The CPU, seeing all this low-effort dummy work, can push into a deeper, slower C-State, which then makes the CPU take longer to "wake up," especially on jobs that require lots of switching between busy and idle states.

Nayak ran tests in tbench on a dual-socket Zen3 system against the baseline Linux kernel, a kernel with the C2 state entirely disabled, and a kernel with the dummy wait operation patched out. His patched version saw a 1,390 percent increase in minimum MB/s throughput and a 51 percent increase in mean MB/s over the baseline kernel, often just a little behind having C2 disabled entirely.
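
To put those percentages in perspective, a percent increase can be converted into a plain multiplier over the baseline. The short sketch below does only that arithmetic; the helper name `multiplier_from_percent_increase` is made up for this example, and no raw throughput figures are assumed.

```python
def multiplier_from_percent_increase(percent):
    """Convert an 'X percent increase' into a factor over the baseline."""
    return 1 + percent / 100


min_gain = multiplier_from_percent_increase(1390)   # minimum MB/s
mean_gain = multiplier_from_percent_increase(51)    # mean MB/s

print(f"minimum throughput: {min_gain:.2f}x baseline")
print(f"mean throughput:    {mean_gain:.2f}x baseline")
```

In other words, the worst-case throughput improved by roughly a factor of 15, while the average improved by about half.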

Intel systems have avoided AMD's legacy curse, as they have used an MWAIT-based system for at least a decade, per the Phoronix blog. That led to an urgent patch submitted by Dave Hansen of Intel. His solution was to limit "dummy wait" to Intel systems, where it would not affect "remotely modern Intel systems," and add comments to the kernel's idle drivers that spell out what's happening, and that encourage those reading to "consider moving your system to a more modern idle mechanism."
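
The shape of that approach, gating the extra read on the CPU vendor, might look something like the following sketch. It is a simplified model rather than the actual patch: the vendor strings, the function names, and the port numbers are all hypothetical.

```python
# Toy model of limiting the dummy wait to one CPU vendor.
# Vendor strings, function names, and port numbers are hypothetical.

def needs_dummy_wait(vendor):
    # Only the vendor whose old chipsets needed the delay keeps it.
    return vendor == "intel"


def enter_io_cstate(vendor, read_port, cstate_port, timer_port):
    read_port(cstate_port)      # request the idle state
    if needs_dummy_wait(vendor):
        read_port(timer_port)   # throwaway read, kept for legacy hardware


def count_reads(vendor):
    reads = []
    enter_io_cstate(vendor, reads.append, 0x414, 0x408)
    return len(reads)


print("intel reads:", count_reads("intel"))
print("amd reads:  ", count_reads("amd"))
```

A vendor check like this is a narrow change: it leaves behavior on the legacy path untouched while removing the extra read everywhere else.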

If an urgent patch removing or limiting "dummy wait" is submitted this week, it could likely make the Linux 6.0 kernel, which Torvalds expects to ship next week.
