Re: Thoughts on current Xen EDAC/MCE situation
On Wed, Jan 24, 2024 at 07:20:56AM -0800, Elliott Mitchell wrote:
> On Wed, Jan 24, 2024 at 08:23:15AM +0100, Jan Beulich wrote:
> >
> > Third, as to Dom0's purposes of having the address: If all it is to use
> > it for is to pass it back to Xen, paths in the respective drivers will
> > necessarily be entirely different for the Xen vs the native cases.
>
> I'm less than certain of the best place for Xen to intercept MCE events.
> For UE memory events, the simplest approach on Linux might be to wrap the
> memory_failure() function.  Yet for Linux/x86,
> mce_register_decode_chain() also looks like a very good candidate.

I did hope to get some response.

It really does look like, aside from being x86-only,
mce_register_decode_chain() is the ideal hook point.  Xen could forward
NMIs to Domain 0, then intercept them from the decode chain.  For UEs,
Xen would mark the event handled, then create a new event for whichever
domain (if any) was affected.

Right now my main concern is that several of the Linux MCE/EDAC drivers
are growing
`if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) return -ENODEV;`
checks.  This approach is being poisoned and will become quite difficult
if this isn't stopped.  The justification found for one instance was that
it "removed one message", with no useful information.

I cannot help suspecting it involved a hypervisor from Redmond, WA and
their engineers are encouraged to poison interfaces used by others.


-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sigmsg@xxxxxxx  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
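
To make the decode-chain idea above a bit more concrete, here is a rough
userspace C sketch.  It is not kernel code and does not use the real
mce_register_decode_chain() or struct notifier_block; every sim_* name,
the priority values and the owner lookup are made up for illustration.
It only models the general shape: handlers run in priority order, a
high-priority interceptor claims UEs, marks them handled, records which
domain should get a new event, and stops the chain, while other events
fall through to an ordinary decoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Return codes loosely modelled on notifier-style "continue"/"stop". */
enum { SIM_NOTIFY_OK = 0, SIM_NOTIFY_STOP = 1 };

/* Hypothetical, heavily simplified stand-in for a machine-check record. */
struct sim_mce {
    uint64_t addr;     /* faulting machine address */
    int uncorrected;   /* non-zero for an uncorrected error (UE) */
    int handled;       /* set once a handler has claimed the event */
};

struct sim_notifier {
    int (*call)(struct sim_notifier *nb, struct sim_mce *m);
    int priority;                  /* higher priority runs first */
    struct sim_notifier *next;
};

struct sim_notifier *sim_chain;    /* head of the simulated decode chain */
int sim_forwarded_domid = -1;      /* last domain an event was created for */
int sim_edac_logged;               /* events seen by the ordinary decoder */

/* Insert a handler, keeping the chain sorted by descending priority. */
void sim_register_decode_chain(struct sim_notifier *nb)
{
    struct sim_notifier **p = &sim_chain;

    while (*p && (*p)->priority >= nb->priority)
        p = &(*p)->next;
    nb->next = *p;
    *p = nb;
}

/* Walk the chain until a handler stops it; return how many handlers ran. */
int sim_dispatch(struct sim_mce *m)
{
    int called = 0;

    for (struct sim_notifier *nb = sim_chain; nb; nb = nb->next) {
        called++;
        if (nb->call(nb, m) == SIM_NOTIFY_STOP)
            break;
    }
    return called;
}

/* Assumption: a made-up address-to-domain lookup; -1 means "no owner". */
int sim_owner_of(uint64_t addr)
{
    return addr >= 0x100000 ? 1 : -1;
}

/* Interceptor: claim UEs, note the affected domain, stop the chain. */
int sim_xen_intercept(struct sim_notifier *nb, struct sim_mce *m)
{
    (void)nb;
    if (!m->uncorrected)
        return SIM_NOTIFY_OK;      /* leave other events to later handlers */
    m->handled = 1;
    sim_forwarded_domid = sim_owner_of(m->addr);
    return SIM_NOTIFY_STOP;
}

/* Ordinary decoder: only counts the events that reach it. */
int sim_edac_decode(struct sim_notifier *nb, struct sim_mce *m)
{
    (void)nb;
    (void)m;
    sim_edac_logged++;
    return SIM_NOTIFY_OK;
}
```

Registering sim_xen_intercept with a higher priority than sim_edac_decode
and dispatching a UE should leave the decoder untouched, while a
non-UE event passes through both handlers.  Whether the real decode
chain can be used this way under Xen is exactly the open question.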