
Re: [Xen-devel] [PATCH 2/4] xen: x86 / cpupool: clear the proper cpu_valid bit on pCPU teardown



On 06/25/2015 02:15 PM, Dario Faggioli wrote:
If a pCPU belonging to a pool other than cpupool0
goes down, we want to clear the relevant bit from
its actual pool, rather than always from cpupool0.

Before this commit, all the pCPUs in the non-default
pool(s) are considered valid immediately during system
resume, even the ones that have not been brought up
yet. As a result, the (Credit1) scheduler attempts
to run its load balancing logic on them, causing the
following Oops:

# xl cpupool-cpu-remove Pool-0 8-15
# xl cpupool-create name=\"Pool-1\"
# xl cpupool-cpu-add Pool-1 8-15
--> suspend
--> resume
(XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    8
(XEN) RIP:    e008:[<ffff82d080123078>] csched_schedule+0x4be/0xb97
(XEN) RFLAGS: 0000000000010087   CONTEXT: hypervisor
(XEN) rax: 80007d2f7fccb780   rbx: 0000000000000009   rcx: 0000000000000000
(XEN) rdx: ffff82d08031ed40   rsi: ffff82d080334980   rdi: 0000000000000000
(XEN) rbp: ffff83010000fe20   rsp: ffff83010000fd40   r8:  0000000000000004
(XEN) r9:  0000ffff0000ffff   r10: 00ff00ff00ff00ff   r11: 0f0f0f0f0f0f0f0f
(XEN) r12: ffff8303191ea870   r13: ffff8303226aadf0   r14: 0000000000000009
(XEN) r15: 0000000000000008   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 00000000dba9d000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) ... ... ...
(XEN) Xen call trace:
(XEN)    [<ffff82d080123078>] csched_schedule+0x4be/0xb97
(XEN)    [<ffff82d08012c732>] schedule+0x12a/0x63c
(XEN)    [<ffff82d08012f8c8>] __do_softirq+0x82/0x8d
(XEN)    [<ffff82d08012f920>] do_softirq+0x13/0x15
(XEN)    [<ffff82d080164791>] idle_loop+0x5b/0x6b
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 8:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=0000]
(XEN) ****************************************

Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>

Acked-by: Juergen Gross <jgross@xxxxxxxx>

---
Cc: Juergen Gross <jgross@xxxxxxxx>
Cc: Jan Beulich <JBeulich@xxxxxxxx>
Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
  xen/arch/x86/smpboot.c |    1 -
  xen/common/cpupool.c   |    2 ++
  2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 2289284..a4ec396 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -887,7 +887,6 @@ void __cpu_disable(void)
      remove_siblinginfo(cpu);

      /* It's now safe to remove this processor from the online map */
-    cpumask_clear_cpu(cpu, cpupool0->cpu_valid);
      cpumask_clear_cpu(cpu, &cpu_online_map);
      fixup_irqs();

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index 5471f93..b48ae17 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -530,6 +530,7 @@ static int cpupool_cpu_remove(unsigned int cpu)
              if ( cpumask_test_cpu(cpu, (*c)->cpu_valid ) )
              {
                  cpumask_set_cpu(cpu, (*c)->cpu_suspended);
+                cpumask_clear_cpu(cpu, (*c)->cpu_valid);
                  break;
              }
          }
@@ -552,6 +553,7 @@ static int cpupool_cpu_remove(unsigned int cpu)
           * If we are not suspending, we are hot-unplugging cpu, and that is
           * allowed only for CPUs in pool0.
           */
+        cpumask_clear_cpu(cpu, cpupool0->cpu_valid);
          ret = 0;
      }
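
For readers following the thread, the effect of the two hunks on the
suspend path can be sketched with a toy model. This is only an
illustration, not Xen code: struct toy_pool, pools[], NR_POOLS,
toy_test_cpu(), toy_remove_old() and toy_remove_new() are hypothetical
names, the masks are plain 32-bit bitmaps rather than cpumask_t, and the
hot-unplug branch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_POOLS 2   /* hypothetical: index 0 plays the role of cpupool0 */
#define NR_CPUS  16

/* Toy stand-in for a cpupool; each mask is a plain bitmap. */
struct toy_pool {
    uint32_t cpu_valid;
    uint32_t cpu_suspended;
};

static struct toy_pool pools[NR_POOLS];

static bool toy_test_cpu(unsigned int cpu, uint32_t mask)
{
    assert(cpu < NR_CPUS);
    return (mask >> cpu) & 1u;
}

/* Pre-patch behaviour: mark the CPU suspended in its own pool, but
 * clear the valid bit from pool 0 only, as __cpu_disable() used to. */
static void toy_remove_old(unsigned int cpu)
{
    for (int i = 0; i < NR_POOLS; i++) {
        if (toy_test_cpu(cpu, pools[i].cpu_valid)) {
            pools[i].cpu_suspended |= UINT32_C(1) << cpu;
            break;
        }
    }
    pools[0].cpu_valid &= ~(UINT32_C(1) << cpu);
}

/* Patched behaviour: clear the valid bit in the pool that owns the CPU. */
static void toy_remove_new(unsigned int cpu)
{
    for (int i = 0; i < NR_POOLS; i++) {
        if (toy_test_cpu(cpu, pools[i].cpu_valid)) {
            pools[i].cpu_suspended |= UINT32_C(1) << cpu;
            pools[i].cpu_valid &= ~(UINT32_C(1) << cpu);
            break;
        }
    }
}
```

With CPUs 8-15 valid in the second pool, toy_remove_old(8) leaves bit 8
set in that pool's cpu_valid, which corresponds to the stale state the
scheduler trips over on resume; toy_remove_new() clears it.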







 

