
Re: [Xen-devel] [PATCH v3 2/4] xen/sched: remove cpu from pool0 before removing it

On 13.09.19 19:27, Dario Faggioli wrote:
On Mon, 2019-09-09 at 11:33 +0200, Juergen Gross wrote:
Today a cpu which is removed from the system is taken directly from
Pool0 to the offline state. This will conflict with the new idle
scheduler, so remove it from Pool0 first. Additionally accept
a free cpu instead of requiring it to be in Pool0.

For the resume failed case we need to call the scheduler code for that
situation after the cpupool handling, so move the scheduler code into
a function, call it from cpupool_cpu_remove_forced() and remove the
CPU_RESUME_FAILED case from cpu_schedule_callback().

Note that we are now calling schedule_cpu_switch() in stop_machine
context, so we need to switch from spinlock_irq to spinlock_irqsave.
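
To illustrate the locking note above: unlocking with the plain _irq variant
re-enables interrupts unconditionally, which is wrong when the caller
entered with interrupts already disabled, as in stop_machine context. The
_irqsave variant records the previous state and restores it. The following
is only a toy Python model of that difference, not Xen code; the class and
all function names are hypothetical stand-ins for the semantics of the two
lock families.

```python
class Cpu:
    """Toy model of one CPU's interrupt-enable flag (hypothetical, not Xen code)."""

    def __init__(self, irqs_enabled):
        self.irqs_enabled = irqs_enabled


def lock_irq(cpu):
    # Models the plain _irq lock: disable interrupts, remember nothing.
    cpu.irqs_enabled = False


def unlock_irq(cpu):
    # Models the plain _irq unlock: enable interrupts unconditionally.
    cpu.irqs_enabled = True


def lock_irqsave(cpu):
    # Models the _irqsave lock: remember the previous state, then disable.
    flags = cpu.irqs_enabled
    cpu.irqs_enabled = False
    return flags


def unlock_irqrestore(cpu, flags):
    # Models the _irqrestore unlock: put back whatever the caller had.
    cpu.irqs_enabled = flags


def critical_section_irq(cpu):
    """Run an empty critical section with the _irq pair; return the final flag."""
    lock_irq(cpu)
    unlock_irq(cpu)
    return cpu.irqs_enabled


def critical_section_irqsave(cpu):
    """Run an empty critical section with the _irqsave pair; return the final flag."""
    flags = lock_irqsave(cpu)
    unlock_irqrestore(cpu, flags)
    return cpu.irqs_enabled
```

In this model, a caller that starts with interrupts disabled comes out of
critical_section_irq() with them enabled, while critical_section_irqsave()
leaves the caller's state untouched in both cases.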

So, I was looking at this patch, and while doing that, also trying it
out.

I've done the following:

# echo 0 > /sys/devices/system/xen_cpu/xen_cpu7/online

And CPU 7 went offline, and was listed among the free CPUs:

(XEN) Online Cpus: 0-6
(XEN) Free Cpus: 7
(XEN) Cpupool 0:
(XEN) Cpus: 0-6
(XEN) Scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Active queues: 1
(XEN)   default-weight     = 256
(XEN) Runqueue 0:
(XEN)   ncpus              = 7
(XEN)   cpus               = 0-6
(XEN)   max_weight         = 256
(XEN)   pick_bias          = 1
(XEN)   instload           = 1
(XEN)   aveload            = 3992 (~1%)
(XEN)   idlers: 0000006f
(XEN)   tickled: 00000000
(XEN)   fully idle cores: 0000004f

Then, I did:

# echo 1 > /sys/devices/system/xen_cpu/xen_cpu7/online

And again it appears to have worked, i.e., the CPU is back online and in
Pool-0:

(XEN) Online Cpus: 0-7
(XEN) Cpupool 0:
(XEN) Cpus: 0-7
(XEN) Scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Active queues: 1
(XEN)   default-weight     = 256
(XEN) Runqueue 0:
(XEN)   ncpus              = 8
(XEN)   cpus               = 0-7
(XEN)   max_weight         = 256
(XEN)   pick_bias          = 1
(XEN)   instload           = 2
(XEN)   aveload            = 271474 (~103%)
(XEN)   idlers: 000000af
(XEN)   tickled: 00000000
(XEN)   fully idle cores: 0000008f
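
For readers following the dumps above, a small Python sketch for reading
two of the fields may help. It rests on two assumptions: that the idlers,
tickled and fully idle cores values are plain hexadecimal CPU bitmaps with
bit n standing for CPU n, and that aveload is a fixed-point value with 18
fractional bits. The latter is, as far as I recall, the credit2 default
load precision shift, and it is consistent with the percentages printed in
both dumps. The helper names are hypothetical.

```python
# Assumption: credit2's default load precision shift is 18 bits.
LOAD_PRECISION_SHIFT = 18


def cpus_in_mask(hex_mask):
    """Return the list of CPU numbers whose bit is set in a hex bitmap string."""
    value = int(hex_mask, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def aveload_percent(aveload, shift=LOAD_PRECISION_SHIFT):
    """Convert a fixed-point aveload value to a whole percentage, rounding down."""
    return (aveload * 100) >> shift


if __name__ == "__main__":
    print(cpus_in_mask("0000006f"))   # idlers while CPU 7 was offline
    print(cpus_in_mask("000000af"))   # idlers after CPU 7 came back
    print(aveload_percent(3992))      # first dump
    print(aveload_percent(271474))    # second dump
```

Under these assumptions, the idlers mask 0000006f covers CPUs 0-3, 5 and 6,
and 000000af covers CPUs 0-3, 5 and 7, so CPU 7 does show up as an idler
once it is back online.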

Then I did:

# echo 0 > /sys/devices/system/xen_cpu/xen_cpu7/online

And, after that:

# xl cpupool-cpu-remove Pool-0 7

And the system hung.

I don't have a working serial console on that testbox, unfortunately,
so I can't poke at debug keys, etc.

Is this anything that you've seen or that you can reproduce?

I can reproduce it and already have found the bug.

