Re: [Xen-devel] [PATCH v3] xen/sched: fix cpu offlining with core scheduling
On Tue, 2020-03-10 at 09:09 +0100, Juergen Gross wrote:
> Offlining a cpu with core scheduling active can result in a hanging
> system. The reason is that the scheduling resource and unit of the
> to-be-removed cpus need to be split in order to remove the cpu from
> its cpupool and move it to the idle scheduler. In case one of the
> involved cpus happens to have received a sched slave event due to a
> vcpu formerly having been running on that cpu being woken up again,
> it can happen that this cpu will enter sched_wait_rendezvous_in()
> while its scheduling resource is just about to be split. It might
> wait forever for the other sibling to join, which will never happen
> due to the resources already being modified.
>
> This can easily be avoided by:
> - resetting the rendezvous counters of the idle unit which is kept
> - checking for a new scheduling resource in
>   sched_wait_rendezvous_in() after reacquiring the scheduling lock
>   and resetting the counters in that case without scheduling another
>   vcpu
> - moving schedule resource modifications (in schedule_cpu_rm()) and
>   retrieving (schedule(), sched_slave() is fine already, others are
>   not critical) into locked regions
>
> Reported-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
>
Reviewed-by: Dario Faggioli <dfaggioli@xxxxxxxx>

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
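
For readers following the thread, the second bullet of the quoted description (re-checking the scheduling resource after reacquiring the lock) can be illustrated with a small sketch. This is not the code from the patch: the struct layouts, the `_sketch` names and the helper `recheck_after_relock()` are hypothetical simplifications, and locking itself is left out. It only shows the idea that a waiter compares the resource it saw before dropping the lock with the one it finds afterwards, and resets the rendezvous counters instead of continuing to wait if the two differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT Xen's real data structures. */
struct sched_res_sketch {
    int id;
};

struct unit_sketch {
    struct sched_res_sketch *res;     /* resource the unit currently belongs to */
    unsigned int rendezvous_in_cnt;   /* siblings still expected to join */
    unsigned int rendezvous_out_cnt;  /* siblings still expected to leave */
};

/* Drop any pending rendezvous so nobody waits for a sibling that is gone. */
static void reset_rendezvous(struct unit_sketch *u)
{
    u->rendezvous_in_cnt = 0;
    u->rendezvous_out_cnt = 0;
}

/*
 * Meant to be called right after the scheduling lock has been reacquired
 * (locking is omitted in this sketch).
 *
 * seen_before is the resource the caller observed before it dropped the
 * lock. Returns true if the resource is unchanged and waiting may go on;
 * returns false if the resource was replaced in the meantime, in which
 * case the counters have been reset and the caller should stop waiting.
 */
static bool recheck_after_relock(struct unit_sketch *u,
                                 const struct sched_res_sketch *seen_before)
{
    assert(u != NULL);

    if (u->res != seen_before) {
        reset_rendezvous(u);
        return false;
    }
    return true;
}
```

The point of the comparison is that a pointer read before the lock was dropped may be stale once the lock is held again, so any decision based on it has to be revalidated under the lock.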