
[PATCH v2 2/2] xen/x86: introduce MCE_NONFATAL


  • To: <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Stefano Stabellini <stefano.stabellini@xxxxxxx>
  • Date: Tue, 8 Jul 2025 11:32:38 -0700
  • Cc: <jbeulich@xxxxxxxx>, <andrew.cooper3@xxxxxxxxxx>, <roger.pau@xxxxxxxxxx>, <stefano.stabellini@xxxxxxx>, <Xenia.Ragiadakou@xxxxxxx>, <alejandro.garciavallejo@xxxxxxx>, <Jason.Andryuk@xxxxxxx>
  • Delivery-date: Tue, 08 Jul 2025 18:32:55 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Today, checking for non-fatal MCE errors on AMD is very invasive: it
involves a periodic timer interrupting the physical CPU execution at
regular intervals. Moreover, when the timer fires, the handler sends an
IPI to all physical CPUs.

Both actions hurt latency and execution-time determinism for real-time
workloads, which might miss a deadline because of one of these IPIs.
Make it possible to disable non-fatal MCE checking with a new Kconfig
option (MCE_NONFATAL).

Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxx>
---
Changes in v2:
- generalize the approach and remove the code when MCE_NONFATAL is not
  set
- move the new kconfig option to xen/arch/x86/Kconfig
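
Note on the $(filter ...) idiom used in the mcheck Makefile hunk: it
acts as a logical AND of two Kconfig symbols. The standalone sketch
below is not part of the patch; the file name filter-demo.mk and the
default values are assumptions made purely for illustration. It relies
only on the fact that an enabled bool symbol expands to "y" and a
disabled one expands to nothing.

```make
# filter-demo.mk: standalone illustration, not part of the patch.
# Try:
#   make -f filter-demo.mk
#   make -f filter-demo.mk CONFIG_MCE_NONFATAL=
#   make -f filter-demo.mk CONFIG_AMD=

# Assumed defaults for the demo: both symbols enabled.
CONFIG_AMD ?= y
CONFIG_MCE_NONFATAL ?= y

# $(filter PATTERN,TEXT) keeps the words of TEXT that match PATTERN.
# Both symbols "y"  -> expands to "y", so the object goes to obj-y.
# Either one empty  -> expands to nothing, so the object goes to the
#                      "obj-" list, which is conventionally never built.
obj-$(filter $(CONFIG_AMD),$(CONFIG_MCE_NONFATAL)) += amd_nonfatal.o

# Print both lists at parse time so the effect is visible.
$(info obj-y = $(obj-y))
$(info obj-  = $(obj-))

# Empty default target so make has something to do.
all: ;
```

With both symbols enabled the object should show up under obj-y; when
either one is cleared on the command line it should move to obj-
instead. Since the new option's prompt is guarded by "if EXPERT", it
can only be turned off in a configuration that also enables EXPERT.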
---
 xen/arch/x86/Kconfig             | 14 ++++++++++++++
 xen/arch/x86/cpu/mcheck/Makefile |  6 +++---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 752d5141bb..9ec0fb0bed 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -248,6 +248,20 @@ config X2APIC_MIXED
 
 endchoice
 
+config MCE_NONFATAL
+       bool "Check for non-fatal MCEs" if EXPERT
+       default y
+       help
+         Check for non-fatal MCE errors.
+       
+         When this option is on (default), Xen regularly checks for
+         non-fatal MCEs potentially occurring on all physical CPUs. The
+         checking is done via timers and IPIs, which is acceptable in
+         most configurations, but not for real-time workloads.
+       
+         Turn this option off if you plan on deploying real-time workloads
+         on Xen.
+
 config GUEST
        bool
 
diff --git a/xen/arch/x86/cpu/mcheck/Makefile b/xen/arch/x86/cpu/mcheck/Makefile
index e6cb4dd503..c70b441888 100644
--- a/xen/arch/x86/cpu/mcheck/Makefile
+++ b/xen/arch/x86/cpu/mcheck/Makefile
@@ -1,12 +1,12 @@
-obj-$(CONFIG_AMD) += amd_nonfatal.o
+obj-$(filter $(CONFIG_AMD),$(CONFIG_MCE_NONFATAL)) += amd_nonfatal.o
 obj-$(CONFIG_AMD) += mce_amd.o
 obj-y += mcaction.o
 obj-y += barrier.o
-obj-$(CONFIG_INTEL) += intel-nonfatal.o
+obj-$(filter $(CONFIG_INTEL),$(CONFIG_MCE_NONFATAL)) += intel-nonfatal.o
 obj-y += mctelem.o
 obj-y += mce.o
 obj-y += mce-apei.o
 obj-$(CONFIG_INTEL) += mce_intel.o
-obj-y += non-fatal.o
+obj-$(CONFIG_MCE_NONFATAL) += non-fatal.o
 obj-y += util.o
 obj-y += vmce.o
-- 
2.25.1




 

