
[Xen-devel] [PATCH] Fix cpu offline bug: add clflush inside dead loop



Fix cpu offline bug: add clflush inside dead loop

On some platforms (such as the Xeon 7400 series), when hyperthreading is
enabled, an offlined thread may be spuriously woken up by its sibling thread
and go around the dead loop again.
This patch explicitly flushes the monitored cache line with clflush, a
lightweight operation, to work around the potential issue.
Unlike wbinvd, clflush is not a serializing instruction, so a memory fence is
needed to make sure all load/store operations are visible before the cache
line is flushed.

Signed-off-by: Liu, Jinsong <jinsong.liu@xxxxxxxxx>
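
The fence-then-flush ordering described in the commit message can be tried out in user space with a minimal sketch like the one below. It is an illustration only, not the Xen code: it uses the SSE2 compiler intrinsics _mm_mfence() and _mm_clflush() in place of Xen's mb() and clflush() helpers, it leaves out MONITOR/MWAIT entirely because those normally require ring 0, and the names wakeup_flag, fence_then_flush and store_flush_reload are hypothetical. On non-x86 builds it falls back to a plain compiler/CPU barrier, which is an assumption made only so the sketch compiles everywhere.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>          /* _mm_mfence(), _mm_clflush() */
#endif

/* Hypothetical stand-in for the per-CPU word that MONITOR would watch. */
volatile uint32_t wakeup_flag __attribute__((aligned(64)));

/*
 * CLFLUSH is not a serializing instruction, so fence first to make prior
 * loads/stores globally visible, then flush the line.  The trailing fence
 * mirrors the second mb() that the patch places before __mwait().
 */
void fence_then_flush(const volatile void *p)
{
#if defined(__SSE2__)
    _mm_mfence();
    _mm_clflush((const void *)p);
    _mm_mfence();
#else
    (void)p;
    __sync_synchronize();       /* assumption: barrier-only fallback */
#endif
}

/* Store a value, fence and flush its cache line, then read it back. */
uint32_t store_flush_reload(uint32_t value)
{
    wakeup_flag = value;
    fence_then_flush(&wakeup_flag);
    return wakeup_flag;         /* reloaded from memory after the flush */
}
```

Flushing a line writes back any dirty data before invalidating it, so the value read back is the value stored; the sketch only demonstrates the instruction ordering, not the erratum itself.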

diff -r 0cf5e30f1697 xen/arch/x86/acpi/cpu_idle.c
--- a/xen/arch/x86/acpi/cpu_idle.c      Fri Mar 11 03:45:32 2022 +0800
+++ b/xen/arch/x86/acpi/cpu_idle.c      Fri Mar 11 04:58:47 2022 +0800
@@ -554,6 +554,7 @@ static void acpi_dead_idle(void)
 {
     struct acpi_processor_power *power;
     struct acpi_processor_cx *cx;
+    void *mwait_ptr;
 
     if ( (power = processor_powers[smp_processor_id()]) == NULL )
         goto default_halt;
@@ -561,23 +562,33 @@ static void acpi_dead_idle(void)
     if ( (cx = &power->states[power->count-1]) == NULL )
         goto default_halt;
 
-    /*
-     * cache must be flashed as the last ops before cpu going into dead,
-     * otherwise, cpu may dead with dirty data breaking cache coherency,
-     * leading to strange errors.
-     */
-    wbinvd();
-    for ( ; ; )
+    mwait_ptr = (void *)&mwait_wakeup(smp_processor_id());
+
+    if ( cx->entry_method == ACPI_CSTATE_EM_FFH )
     {
-        switch ( cx->entry_method )
+        /*
+         * The cache must be flushed as the last operation before the CPU
+         * goes dead; otherwise the CPU may die with dirty data, breaking
+         * cache coherency and leading to strange errors.
+         */
+        wbinvd();
+
+        while ( 1 )
         {
-            case ACPI_CSTATE_EM_FFH:
-                /* Not treat interrupt as break event */
-                __monitor((void *)&mwait_wakeup(smp_processor_id()), 0, 0);
-                __mwait(cx->address, 0);
-                break;
-            default:
-                goto default_halt;
+            /*
+             * 1. The CLFLUSH is a workaround for erratum AAI65 for
+             * the Xeon 7400 series.
+             * 2. The WBINVD is insufficient due to the spurious-wakeup
+             * case where we return around the loop.
+             * 3. Unlike WBINVD, CLFLUSH is lightweight but not serializing,
+             * hence a memory fence is necessary to make sure all loads and
+             * stores are visible before the cache line is flushed.
+             */
+            mb();
+            clflush(mwait_ptr);
+            __monitor(mwait_ptr, 0, 0);
+            mb();
+            __mwait(cx->address, 0);
         }
     }
 

Attachment: cpu_offline_3.patch
Description: cpu_offline_3.patch
