[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0



>>> "Ke, Liping" <liping.ke@xxxxxxxxx> 10.06.09 08:52 >>>
>According to the discussion result, I modified the patch
>1)  use Kconfig to identify when mce_dom.c and mce_XXX.c will be built. 
>mce_dom.c
>     will only be build when it is under XEN MCE environment.
>2) virq bind function will only be called when mce_dom.c is used.
>
>I test the compiling both under native/dom0 with all possible combinations.

I think that's not covering all the issue I pointed out:

>--- a/arch/x86_64/Kconfig      Tue Jun 09 09:58:11 2009 +0800
>+++ b/arch/x86_64/Kconfig      Wed Jun 10 14:12:55 2009 +0800
>@@ -472,7 +472,6 @@
> 
> config X86_MCE
>       bool "Machine check support" if EMBEDDED
>-      depends on !X86_64_XEN
>       default y
>       help
>          Include a machine check error handler to report hardware errors.

Shouldn't this rather change the dependency to XEN_PRIVILEGED_GUEST?

>@@ -483,7 +482,7 @@
> config X86_MCE_INTEL
>       bool "Intel MCE features"
>       depends on X86_MCE && X86_LOCAL_APIC
>-      default y
>+      default n
>       help
>          Additional support for intel specific MCE features such as
>          the thermal monitor.

This hunk should go.

>@@ -491,11 +490,15 @@
> config X86_MCE_AMD
>       bool "AMD MCE features"
>       depends on X86_MCE && X86_LOCAL_APIC
>-      default y
>+      default n
>       help
>          Additional support for AMD specific MCE features such as
>          the DRAM Error Threshold.

And likewise this part.

> 
>+config X86_64_XEN_MCE
>+      def_bool y
>+      depends on X86_64_XEN && (X86_MCE_INTEL || X86_MCE_AMD)
>+

I wonder if this wouldn't better be named X86_XEN_MCE (for consistency
with a potential 32-bit implementation).

> config KEXEC
>       bool "kexec system call (EXPERIMENTAL)"
>       depends on EXPERIMENTAL && !XEN_UNPRIVILEGED_GUEST
>...
>--- a/arch/x86_64/kernel/apic-xen.c    Tue Jun 09 09:58:11 2009 +0800
>+++ b/arch/x86_64/kernel/apic-xen.c    Wed Jun 10 14:12:55 2009 +0800
>@@ -195,3 +195,13 @@
> 
>       return 1;
> }
>+
>+/* Pull back for compiling arch/x86_64/kernel/mce_amd.c */
>+void setup_APIC_extened_lvt(unsigned char lvt_off, unsigned char vector,
>+                          unsigned char msg_type, unsigned char mask)
>+{
>+      unsigned long reg = (lvt_off << 4) + K8_APIC_EXT_LVT_BASE;
>+      unsigned int  v   = (mask << 16) | (msg_type << 8) | vector;
>+      apic_write(reg, v);
>+}

This continues to be wrong (yes, I realize arch/x86_64/kernel/apic-xen.c is
full of such broken code, but that doesn't mean more should be added).

>...
>--- a/arch/x86_64/kernel/mce.c Tue Jun 09 09:58:11 2009 +0800
>+++ b/arch/x86_64/kernel/mce.c Wed Jun 10 14:12:55 2009 +0800
>@@ -165,7 +165,7 @@
>  * The actual machine check handler
>  */
> 
>-void do_machine_check(struct pt_regs * regs, long error_code)
>+asmlinkage void do_machine_check(struct pt_regs * regs, long error_code)
> {
>       struct mce m, panicm;
>       int nowayout = (tolerant < 1); 

Why?

>@@ -276,9 +276,16 @@
> 
> /*
>  * Periodic polling timer for "silent" machine check errors.
>- */
>+ * We will disable polling in DOM0 since all CMCI/Polling
>+ * mechanism will be done in XEN for Intel CPUs
>+*/
> 
>+#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL)
>+static int check_interval = 0; /* disable polling */
>+#else
> static int check_interval = 5 * 60; /* 5 minutes */
>+#endif
>+
> static void mcheck_timer(void *data);
> static DECLARE_WORK(mcheck_work, mcheck_timer, NULL);
 
Shouldn't that now simply be #ifdef CONFIG_X86_64_XEN_MCE?

>...
>--- a/include/asm-x86_64/mach-xen/asm/hw_irq.h Tue Jun 09 09:58:11 2009 +0800
>+++ b/include/asm-x86_64/mach-xen/asm/hw_irq.h Wed Jun 10 14:12:55 2009 +0800
>@@ -60,6 +60,9 @@
> #define NUM_INVALIDATE_TLB_VECTORS    8
> #endif
> 
>+/* Pull back for compiling arch/x86_64/kernel/mce_amd.c */
>+#define THRESHOLD_APIC_VECTOR   0xf9
>+#define THERMAL_APIC_VECTOR   0xfa
> /*
>  * Local APIC timer IRQ vector is on a different priority level,
>  * to work around the 'lost local interrupt if more than 2 IRQ

These vectors mean nothing under Xen.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.