
Re: [Xen-devel] credit2 / cpupool crash in xen-unstable



On Fri, Jul 15, 2016 at 3:48 PM, George Dunlap <dunlapg@xxxxxxxxx> wrote:
> Hey Dario,
>
> Working on my scheduler benchmarking, and managing occasionally to
> crash Xen running the following command:
>
> xl cpupool-create cpupool.cfg && ./schedbench plan && (xentrace -S 32
> -e 0x21000 -D /tmp/05.schedbench-w16-cpupool-credit2.trace &) &&
> ./schedbench run && killall xentrace && xl cpupool-destroy schedbench
>
> The "schedbench run" command will create 16 VMs, each with one vcpu,
> and run them in the "schedbench" cpupool; it's sometime during this
> that I get the following dump (full boot log attached):
>
> (XEN) Initializing Credit2 scheduler
> (XEN)  WARNING: This is experimental software in development.
> (XEN)  Use at your own risk.
> (XEN)  load_window_shift: 18
> (XEN)  underload_balance_tolerance: 0
> (XEN)  overload_balance_tolerance: -3
> (XEN)  runqueues arrangement: core
> (XEN) Adding cpu 12 to runqueue 0
> (XEN)  First cpu on runqueue, activating
> (XEN) Adding cpu 13 to runqueue 0
> (XEN) Adding cpu 14 to runqueue 1
> (XEN)  First cpu on runqueue, activating
> (XEN) Adding cpu 15 to runqueue 1
> (XEN) csched2_vcpu_insert: Inserting d161v0
> (XEN) csched2_vcpu_insert: Inserting d162v0
> (XEN) csched2_vcpu_insert: Inserting d163v0
> (XEN) csched2_vcpu_insert: Inserting d164v0
> (XEN) csched2_vcpu_insert: Inserting d165v0
> (XEN) csched2_vcpu_insert: Inserting d166v0
> (XEN) csched2_vcpu_insert: Inserting d167v0
> (XEN) ----[ Xen-4.8-unstable  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82d0801269a1>] sched_credit2.c#__runq_assign+0x28/0x8f
> (XEN) RFLAGS: 0000000000010086   CONTEXT: hypervisor (d0v10)
> (XEN) rax: 0000000000000000   rbx: ffff83083eb6f8e0   rcx: 0000000000000001
> (XEN) rdx: ffff83083ead03e0   rsi: ffff83083ead03a0   rdi: ffff83083eb6f8e0
> (XEN) rbp: ffff8300bf507cf8   rsp: ffff8300bf507cd8   r8:  ffff83082a3e0000
> (XEN) r9:  0000000000000001   r10: 0000000000000000   r11: 0000000000000001
> (XEN) r12: ffff83083ead03a0   r13: ffff83082a3dc148   r14: ffff83082a3dc148
> (XEN) r15: ffff83083eb6f790   cr0: 0000000080050033   cr4: 00000000000026e0
> (XEN) cr3: 000000080a10d000   cr2: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen code around <ffff82d0801269a1> (sched_credit2.c#__runq_assign+0x28/0x8f):
> (XEN)  40 48 89 17 48 89 47 08 <48> 89 38 8b 77 38 48 8b 7f 20 ba 00 00 00 00 e8
> (XEN) Xen stack trace from rsp=ffff8300bf507cd8:
> (XEN)    0000000000000002 ffff8300bf507cf8 ffff8300bf2d6000 ffff82d08033bcc0
> (XEN)    ffff8300bf507d08 ffff82d080126a40 ffff8300bf507d58 ffff82d080126ffd
> (XEN)    ffff8300bf507d28 ffff83083d452340 ffff8300bf507d58 ffff8300bf2d6000
> (XEN)    ffff830828510000 0000000000000000 0000000000000001 ffff82d08033cbc8
> (XEN)    ffff8300bf507d78 ffff82d08012c709 ffff8300bf2d6000 ffff830828510000
> (XEN)    ffff8300bf507da8 ffff82d080105edd 0000000000000001 ffff830828510000
> (XEN)    0000000000000020 ffff83083eb6f7c0 ffff8300bf507ef8 ffff82d080103aee
> (XEN)    ffff82d08033bba8 ffff82d08033cbc8 ffff82d08033cbc8 ffff82d08033cbc8
> (XEN)    00007fe113c9e004 0000000f00000000 0000000000000000 ffff83083d449a90
> (XEN)    0000000000000000 0000000000000001 000000000000f003 0000000000000000
> (XEN)    0000000000000000 0000000000000000 ffff82d08012c34b ffff83083d4c0de0
> (XEN)    0000000c0000000f 00007fe113a900a7 00007ffd00000001 00007fe1135f280a
> (XEN)    00000000000001bc 00007fe1135a5d3f 00007fe11359ea38 00007ffd892c0648
> (XEN)    000000008f1bd153 00000000023c6f45 00000000011d4310 00000000000000a7
> (XEN)    0000000000000001 00000000000000a7 00000000011d4310 00000000011d4090
> (XEN)    0000000000000000 00007fe113ca09a8 00007fe113c9e004 ffff8300bf2f0000
> (XEN)    ffff880037682a00 0000000000305000 00007ffd892c04a0 0000000000000000
> (XEN)    00007cff40af80c7 ffff82d0802401fd ffffffff8100148a 0000000000000024
> (XEN)    00007ffd892c05f0 00007ffd892c08b8 00007fe11364496a 00007ffd892c08b8
> (XEN)    00007ffd892c04a0 ffff880002dcdac0 0000000000000282 0000000000000000
> (XEN) Xen call trace:
> (XEN)    [<ffff82d0801269a1>] sched_credit2.c#__runq_assign+0x28/0x8f
> (XEN)    [<ffff82d080126a40>] sched_credit2.c#runq_assign+0x38/0x3a
> (XEN)    [<ffff82d080126ffd>] sched_credit2.c#csched2_vcpu_insert+0x8a/0xd9
> (XEN)    [<ffff82d08012c709>] sched_init_vcpu+0x1b8/0x1fe
> (XEN)    [<ffff82d080105edd>] alloc_vcpu+0x1be/0x2b7
> (XEN)    [<ffff82d080103aee>] do_domctl+0x9e7/0x1d66
> (XEN)    [<ffff82d0802401fd>] lstar_enter+0xdd/0x137

It appears that csched2_vcpu_insert() is being called with
vc->processor set to '1', which is not in the cpupool in question.
This happens because of a bug in
xen/common/domctl.c:default_vcpu0_location() that allows it, in some
circumstances, to return a cpu that is not inside the mask passed to it.
Looking at a solution.
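
To illustrate the invariant that is being violated (this is only a sketch, not
the actual Xen code and not necessarily the eventual fix): whatever cpu a
placement heuristic picks, it has to be a member of the mask it was given.
Everything below is hypothetical, including the cpu_mask_t typedef (a plain
64-bit bitmap standing in for a cpumask) and the helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means cpu n is in the pool. */
typedef uint64_t cpu_mask_t;

/* Return non-zero if 'cpu' is a member of 'mask'. */
static int mask_test_cpu(unsigned int cpu, cpu_mask_t mask)
{
    return cpu < 64 && ((mask >> cpu) & 1);
}

/* Return the lowest-numbered cpu in 'mask'; the mask must not be empty. */
static unsigned int mask_first_cpu(cpu_mask_t mask)
{
    unsigned int cpu = 0;

    assert(mask != 0);
    while (!((mask >> cpu) & 1))
        cpu++;
    return cpu;
}

/*
 * Hypothetical wrapper: 'preferred' is whatever the placement heuristic
 * chose. If it lies outside 'mask', fall back to a cpu that is inside it,
 * so the caller never sees a cpu from another pool.
 */
static unsigned int clamp_cpu_to_mask(unsigned int preferred, cpu_mask_t mask)
{
    if (mask_test_cpu(preferred, mask))
        return preferred;
    return mask_first_cpu(mask);
}
```

With the pool from the log above (cpus 12-15, i.e. a mask of 0xF000 in this
toy representation), a preferred cpu of 1 is rejected and cpu 12 is returned
instead, while a preferred cpu of 14 is passed through unchanged.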

 -George
