Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin




> On Apr 11, 2018, at 10:31 PM, Dario Faggioli <raistlin@xxxxxxxx> wrote:
> 
> On Wed 11 Apr 2018, 22:48 Olaf Hering <olaf@xxxxxxxxx> wrote:
> On Wed, Apr 11, Dario Faggioli wrote:
> 
> > It will crash, again, possibly with the same stack trace, but I think
> > it's worth a try.
> 
>     BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
> 
> (XEN) Xen BUG at sched_credit.c:876
> (XEN) ----[ Xen-4.11.20180410T125709.50f8ba84a5-7.bug1087289_411  x86_64  debug=y   Not tainted ]----
> (XEN) CPU:    108
> (XEN) RIP:    e008:[<ffff82d080229ab4>] sched_credit.c#csched_vcpu_migrate+0x27/0x54
> (XEN) RFLAGS: 0000000000010006   CONTEXT: hypervisor
> ...
> (XEN) Xen call trace:
> (XEN)    [<ffff82d080229ab4>] sched_credit.c#csched_vcpu_migrate+0x27/0x54
> (XEN)    [<ffff82d080236348>] schedule.c#vcpu_move_locked+0xbb/0xc2
> (XEN)    [<ffff82d08023764c>] schedule.c#vcpu_migrate+0x226/0x25b
> (XEN)    [<ffff82d080239367>] context_saved+0x95/0x9c
> (XEN)    [<ffff82d08027797d>] context_switch+0xe66/0xeb0
> (XEN)    [<ffff82d080236943>] schedule.c#schedule+0x5f4/0x627
> (XEN)    [<ffff82d080239f15>] softirq.c#__do_softirq+0x85/0x90
> (XEN)    [<ffff82d080239f6a>] do_softirq+0x13/0x15
> (XEN)    [<ffff82d08031f5db>] vmx_asm_do_vmentry+0x2b/0x30
> 
> So, really *exactly* the same. Ok, thanks.

But this doesn’t make any sense.  If you applied Dario’s ‘fix’ patch, then 
context_saved() should have *just* called vcpu_sleep_nosync() before calling 
vcpu_migrate().  The VPF_migrating flag should still be set, so it should have 
called csched_vcpu_sleep(); and sd->curr should have been changed to be != prev 
way back in schedule(), so csched_vcpu_sleep() should have called runq_remove().
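
To make the expected sequence concrete, here is a toy model of it in C. It is 
only a sketch of the reasoning above, not the Xen source: every type and every 
model_* name is hypothetical, and the three conditions (flag set, curr != prev, 
vcpu on the runqueue) are reduced to plain booleans.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not Xen's. */
struct model_vcpu {
    bool migrating;  /* stands in for the VPF_migrating flag */
    bool on_runq;    /* stands in for __vcpu_on_runq() */
};

struct model_cpu {
    struct model_vcpu *curr;  /* stands in for sd->curr */
};

/* Stands in for runq_remove(). */
static void model_runq_remove(struct model_vcpu *v)
{
    v->on_runq = false;
}

/* Stands in for csched_vcpu_sleep(): dequeue only if v is not running. */
static void model_sched_sleep(struct model_cpu *cpu, struct model_vcpu *v)
{
    if (cpu->curr == v)
        return;  /* assumption: the real hook would ask for a reschedule */
    if (v->on_runq)
        model_runq_remove(v);
}

/* Stands in for vcpu_sleep_nosync(): call the hook only if v is not runnable. */
static void model_vcpu_sleep_nosync(struct model_cpu *cpu, struct model_vcpu *v)
{
    if (v->migrating)
        model_sched_sleep(cpu, v);
}

/*
 * Stands in for the patched context_saved() path. Returns true if the
 * BUG_ON(__vcpu_on_runq(...)) in the migrate hook would still fire.
 */
static bool model_context_saved(struct model_cpu *cpu, struct model_vcpu *prev)
{
    model_vcpu_sleep_nosync(cpu, prev);
    return prev->on_runq;
}
```

In this model the BUG_ON can only fire if the flag is not set or curr is still 
prev when the sleep is called, which is why the trace above is surprising.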

It’s probably worth asking the obvious question: Are you sure the “fix” patch 
is actually applied (in addition to the new “debug” patch)? :-)

If so, then maybe it’s time to open-code vcpu_sleep_nosync() there in 
context_saved(), to try to figure out where our understanding of what *should* 
happen is incorrect.

 -George
