
Re: [Xen-devel] [xen-unstable test] 106580: regressions - trouble: blocked/broken/fail/pass



>>> On 10.03.17 at 08:20, <osstest-admin@xxxxxxxxxxxxxx> wrote:
> flight 106580 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/106580/ 
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-armhf-armhf-xl-arndale   3 host-install(3)        broken REGR. vs. 106534
>  test-amd64-amd64-migrupgrade 10 xen-boot/dst_host        fail REGR. vs. 106534

The NMI watchdog fired on CPU1 while the guest EOI timer handler was
waiting to be able to send an IPI:

Mar 10 00:09:32.745677 (XEN) Xen call trace:
Mar 10 00:09:32.745727 (XEN)    [<ffff82d080134083>] _spin_lock+0x2c/0x4f
Mar 10 00:09:32.745779 (XEN)    [<ffff82d080133e34>] on_selected_cpus+0x2c/0xc6
Mar 10 00:09:32.753699 (XEN)    [<ffff82d080177101>] irq.c#irq_guest_eoi_timer_fn+0x142/0x165
Mar 10 00:09:32.761711 (XEN)    [<ffff82d080136ddc>] timer.c#execute_timer+0x47/0x62
Mar 10 00:09:32.769683 (XEN)    [<ffff82d080136ed2>] timer.c#timer_softirq_action+0xdb/0x22c
Mar 10 00:09:32.769744 (XEN)    [<ffff82d0801337e1>] softirq.c#__do_softirq+0x7f/0x8a
Mar 10 00:09:32.777697 (XEN)    [<ffff82d080133836>] do_softirq+0x13/0x15
Mar 10 00:09:32.785792 (XEN)    [<ffff82d080255081>] entry.o#process_softirqs+0x21/0x30

That lock is being held by CPU2:

Mar 10 00:15:25.133639 (XEN) Xen call trace:
Mar 10 00:15:25.133655 (XEN)    [<ffff82d080102389>] __bitmap_empty+0x54/0x96
Mar 10 00:15:25.141636 (XEN)    [<ffff82d080133eb5>] on_selected_cpus+0xad/0xc6
Mar 10 00:15:25.149635 (XEN)    [<ffff82d0801ca640>] powernow.c#powernow_cpufreq_cpu_init+0x20d/0x372
Mar 10 00:15:25.157633 (XEN)    [<ffff82d08014c476>] cpufreq_add_cpu+0x1d6/0x5d3
Mar 10 00:15:25.157654 (XEN)    [<ffff82d0801ca173>] cpufreq_cpu_init+0x17/0x1a
Mar 10 00:15:25.165658 (XEN)    [<ffff82d08014cd8d>] set_px_pminfo+0x2b6/0x2f7
Mar 10 00:15:25.165679 (XEN)    [<ffff82d0801956dd>] do_platform_op+0xe69/0x1959
Mar 10 00:15:25.173667 (XEN)    [<ffff82d080251485>] pv_hypercall+0x1ef/0x42d
Mar 10 00:15:25.181678 (XEN)    [<ffff82d080254ff6>] entry.o#test_all_events+0/0x30

The register state tells us that it is CPU5 which is not responding.
The only piece of information we have about CPU5 is

Mar 10 00:09:32.809709 (XEN) CPU5 @ e008:ffff82d080134083 (0000000000000000)

which is also in _spin_lock(), but which I'm afraid is too little to
diagnose the issue. I'm therefore wondering whether we shouldn't
default "async-show-all" to true in debug builds.
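
To make the dependency chain explicit, here is a minimal sketch that
writes the wait-for relationships read off the traces above as a small
graph and follows them from CPU1. It is only a model of the reasoning
in this mail, not Xen code: the waits_on table and the blocking_chain()
helper are hypothetical names introduced for illustration, and what
CPU5 itself is waiting for is left out because the logs do not tell us.

```python
# Wait-for edges as read from the two call traces in this mail.
# This is an illustrative model, not anything taken from the Xen sources.
waits_on = {
    # CPU1 spins in _spin_lock(), called from on_selected_cpus()
    "CPU1": ("CPU2", "lock taken in on_selected_cpus()"),
    # CPU2 holds that lock and polls until CPU5 has responded to the IPI
    "CPU2": ("CPU5", "IPI not yet handled"),
    # CPU5 is in _spin_lock() too, but the lock and its holder are unknown,
    # so no edge is recorded for it.
}


def blocking_chain(start, edges):
    """Follow wait-for edges from `start` and return the CPUs visited in order."""
    chain = [start]
    seen = {start}
    while chain[-1] in edges:
        nxt = edges[chain[-1]][0]
        chain.append(nxt)
        if nxt in seen:  # revisiting a CPU would mean a true deadlock cycle
            break
        seen.add(nxt)
    return chain


print(" -> ".join(blocking_chain("CPU1", waits_on)))
```

The chain ends at CPU5, which is exactly the CPU for which only a single
instruction pointer was logged; full state from all CPUs would be needed
to extend it any further.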

What I'm also puzzled by is that the system is still partly alive after
the panic: There's Dom0 output, and it is also reacting to debug
key input. I would have expected a panic to bring down the system
right away...

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

