
Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin



On Tue, 2018-04-10 at 17:59 +0200, Olaf Hering wrote:
> On Tue, Apr 10, Olaf Hering wrote:
> 
> > (XEN) Xen BUG at sched_credit.c:1694
> 
> And another one with debug=y and this config:
>
Wow...

> memory=4444
> vcpus=36
> cpu="nodes:1,^node:0"
> cpu_soft="nodes:1,^node:0"
>
As said, it's cpus= and cpus_soft=, and you probably just need

cpus="node:1"
cpus_soft="node:1"

Or, even just:

cpus="node:1"

as, if soft-affinity is set to be equal to hard, it is just ignored.

> (nodes=1 cycles between 1-3 for each following domU).
> 
> (XEN) Assertion 'CSCHED_PCPU(cpu)->nr_runnable >= 1' failed at sched_credit.c:269
> (XEN) ----[ Xen-4.11.20180407T144959.e62e140daa-4.bug1087289_411  x86_64  debug=y   Not tainted ]----
> (XEN) CPU:    18
> (XEN) RIP:    e008:[<ffff82d08022b2e8>] sched_credit.c#csched_schedule+0x8fe/0xd42
> (XEN) RFLAGS: 0000000000010046   CONTEXT: hypervisor (d0v18)
> ...
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08022b2e8>] sched_credit.c#csched_schedule+0x8fe/0xd42
> (XEN)    [<ffff82d080236406>] schedule.c#schedule+0x107/0x627
> (XEN)    [<ffff82d080239ec5>] softirq.c#__do_softirq+0x85/0x90
> (XEN)    [<ffff82d080239f1a>] do_softirq+0x13/0x15
> (XEN)    [<ffff82d08036e566>] x86_64/entry.S#process_softirqs+0x6/0x10
>
Yeah, thanks for trying with debugging on. Unfortunately, stack traces
in these cases are not very helpful, as they only tell us that
schedule() is being called by do_softirq()... :-P

Still...

> (XEN) ****************************************
> (XEN) Panic on CPU 18:
> (XEN) Assertion 'CSCHED_PCPU(cpu)->nr_runnable >= 1' failed at
> sched_credit.c:269
>
...it is another, different, assertion this time: it triggers when
removing the vcpu from the runqueue (or when not reinserting it).

What would be helpful is to catch the other side of the race, i.e.,
the point when the vcpu is being re-inserted in the runqueue, or when
v->processor of a vcpu in the runqueue is changed. Let's see if the
debug patch will help with this.
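
To make the invariant behind that assertion concrete, here is a toy
model in C. It is not Xen code: toy_pcpu, toy_vcpu, toy_runq_insert()
and toy_runq_remove() are all made-up names, and the only assumption
is that each pcpu keeps a count of the vcpus in its runqueue, and that
removal decrements the counter of whatever pcpu v->processor points
to at that moment.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PCPUS 4  /* hypothetical, kept small for illustration */

/* Toy stand-ins; these are NOT the real Xen structures. */
struct toy_pcpu { unsigned int nr_runnable; };
struct toy_vcpu { unsigned int processor; bool on_runq; };

static struct toy_pcpu pcpus[NR_PCPUS];

/* Insert v in the runqueue of the pcpu it currently points to. */
static void toy_runq_insert(struct toy_vcpu *v)
{
    assert(v->processor < NR_PCPUS);
    pcpus[v->processor].nr_runnable++;
    v->on_runq = true;
}

/*
 * Remove v from the runqueue of the pcpu it currently points to.
 * Returns false in the situation where an assertion of the form
 * 'nr_runnable >= 1' would fail, instead of crashing.
 */
static bool toy_runq_remove(struct toy_vcpu *v)
{
    struct toy_pcpu *p = &pcpus[v->processor];

    if (p->nr_runnable < 1)
        return false;
    p->nr_runnable--;
    v->on_runq = false;
    return true;
}
```

In this model, if v is inserted while v->processor is 1, and
v->processor is then changed to 2 while v is still queued, a later
toy_runq_remove(v) looks at pcpu 2, whose counter is still 0, and
fails. The counter of pcpu 1 stays at 1, which is the kind of
inconsistency the other side of the race would leave behind.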

Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

