
[Xen-changelog] [xen stable-4.5] x86 / cpupool: clear the proper cpu_valid bit on pCPU teardown



commit de8b55032455a4b591f6a853fae24e474cd3835d
Author:     Dario Faggioli <dario.faggioli@xxxxxxxxxx>
AuthorDate: Tue Jul 21 11:09:39 2015 +0200
Commit:     Jan Beulich <jbeulich@xxxxxxxx>
CommitDate: Tue Jul 21 11:09:39 2015 +0200

    x86 / cpupool: clear the proper cpu_valid bit on pCPU teardown
    
    When a pCPU goes down, its bit must be cleared in the
    cpu_valid mask of the cpupool it actually belongs to,
    rather than always in cpupool0's mask.
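
The difference can be illustrated with a toy model. This is only a sketch and not part of the patch: plain 32-bit integers stand in for Xen's cpumasks, and `struct toy_pool`, `teardown_old` and `teardown_new` are hypothetical names invented for the example.

```c
#include <stdint.h>

#define NR_TOY_POOLS 2

/* Hypothetical stand-in for a cpupool: one bit per pCPU in each mask. */
struct toy_pool {
    uint32_t cpu_valid;
    uint32_t cpu_suspended;
};

/* Old behaviour (simplified): the pCPU is marked suspended in the pool
 * that owns it, but its valid bit is always cleared in pool 0. */
static void teardown_old(struct toy_pool *pools, unsigned int cpu)
{
    uint32_t bit = UINT32_C(1) << cpu;

    for (int i = 0; i < NR_TOY_POOLS; i++) {
        if (pools[i].cpu_valid & bit) {
            pools[i].cpu_suspended |= bit;
            break;
        }
    }
    pools[0].cpu_valid &= ~bit;
}

/* New behaviour (simplified): the valid bit is cleared in the same pool
 * where the pCPU is marked suspended. */
static void teardown_new(struct toy_pool *pools, unsigned int cpu)
{
    uint32_t bit = UINT32_C(1) << cpu;

    for (int i = 0; i < NR_TOY_POOLS; i++) {
        if (pools[i].cpu_valid & bit) {
            pools[i].cpu_suspended |= bit;
            pools[i].cpu_valid &= ~bit;
            break;
        }
    }
}
```

With pCPUs 8-15 in the second pool, `teardown_old` leaves bit 8 set in that pool's valid mask after pCPU 8 goes down, whereas `teardown_new` clears it.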
    
    Before this commit, all the pCPUs in non-default pools
    are considered valid immediately during system resume,
    even the ones that have not been brought up yet. As a
    result, the (Credit1) scheduler attempts to run its load
    balancing logic on them, causing the following Oops:
    
    # xl cpupool-cpu-remove Pool-0 8-15
    # xl cpupool-create name=\"Pool-1\"
    # xl cpupool-cpu-add Pool-1 8-15
    --> suspend
    --> resume
    (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
    (XEN) CPU:    8
    (XEN) RIP:    e008:[<ffff82d080123078>] csched_schedule+0x4be/0xb97
    (XEN) RFLAGS: 0000000000010087   CONTEXT: hypervisor
    (XEN) rax: 80007d2f7fccb780   rbx: 0000000000000009   rcx: 0000000000000000
    (XEN) rdx: ffff82d08031ed40   rsi: ffff82d080334980   rdi: 0000000000000000
    (XEN) rbp: ffff83010000fe20   rsp: ffff83010000fd40   r8:  0000000000000004
    (XEN) r9:  0000ffff0000ffff   r10: 00ff00ff00ff00ff   r11: 0f0f0f0f0f0f0f0f
    (XEN) r12: ffff8303191ea870   r13: ffff8303226aadf0   r14: 0000000000000009
    (XEN) r15: 0000000000000008   cr0: 000000008005003b   cr4: 00000000000026f0
    (XEN) cr3: 00000000dba9d000   cr2: 0000000000000000
    (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
    (XEN) ... ... ...
    (XEN) Xen call trace:
    (XEN)    [<ffff82d080123078>] csched_schedule+0x4be/0xb97
    (XEN)    [<ffff82d08012c732>] schedule+0x12a/0x63c
    (XEN)    [<ffff82d08012f8c8>] __do_softirq+0x82/0x8d
    (XEN)    [<ffff82d08012f920>] do_softirq+0x13/0x15
    (XEN)    [<ffff82d080164791>] idle_loop+0x5b/0x6b
    (XEN)
    (XEN) ****************************************
    (XEN) Panic on CPU 8:
    (XEN) GENERAL PROTECTION FAULT
    (XEN) [error_code=0000]
    (XEN) ****************************************
    
    The error is a #GP fault because, without this commit,
    we try to access the per-cpu area of a pCPU that has not
    yet been allocated and initialized. Indeed, %rax, which
    is used as the pointer, is 80007d2f7fccb780, a
    non-canonical address, and we also have this:
    
    #define INVALID_PERCPU_AREA (0x8000000000000000L - (long)__per_cpu_start)
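
To see why such a pointer faults with #GP rather than with a page fault, the %rax value can be checked against the x86-64 canonical-address rule. The helper below is a minimal sketch, assuming 48-bit virtual addressing; `is_canonical_48` is a hypothetical name for this example and not a Xen function.

```c
#include <stdint.h>

/* With 48-bit virtual addresses, an address is canonical only when
 * bits 63..47 are all zeros or all ones. */
static int is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;   /* keeps the 17 bits 63..47 */

    return top == 0 || top == UINT64_C(0x1ffff);
}
```

Under that assumption, `is_canonical_48(0x80007d2f7fccb780)` is false (bit 63 is set while bits 62..47 are clear), whereas the RIP from the trace, `0xffff82d080123078`, is canonical. Dereferencing a non-canonical address raises a general protection fault.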
    
    Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
    Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
    Acked-by: Juergen Gross <jgross@xxxxxxxx>
    master commit: 8022b05284dea80e24813d03180788ec7277a0bd
    master date: 2015-07-07 14:29:39 +0200
---
 xen/arch/x86/smpboot.c |    1 -
 xen/common/cpupool.c   |    2 ++
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index c54be7e..fe376f0 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -816,7 +816,6 @@ void __cpu_disable(void)
     remove_siblinginfo(cpu);
 
     /* It's now safe to remove this processor from the online map */
-    cpumask_clear_cpu(cpu, cpupool0->cpu_valid);
     cpumask_clear_cpu(cpu, &cpu_online_map);
     fixup_irqs();
 
diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index 2a557f3..045499e 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -529,6 +529,7 @@ static int cpupool_cpu_remove(unsigned int cpu)
             if ( cpumask_test_cpu(cpu, (*c)->cpu_valid ) )
             {
                 cpumask_set_cpu(cpu, (*c)->cpu_suspended);
+                cpumask_clear_cpu(cpu, (*c)->cpu_valid);
                 break;
             }
         }
@@ -551,6 +552,7 @@ static int cpupool_cpu_remove(unsigned int cpu)
          * If we are not suspending, we are hot-unplugging cpu, and that is
          * allowed only for CPUs in pool0.
          */
+        cpumask_clear_cpu(cpu, cpupool0->cpu_valid);
         ret = 0;
     }
 
--
generated by git-patchbot for /home/xen/git/xen.git#stable-4.5

_______________________________________________
Xen-changelog mailing list
Xen-changelog@xxxxxxxxxxxxx
http://lists.xensource.com/xen-changelog


 

