
Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL



On Fri, 9 Jun 2017, Jan Beulich wrote:
> >>> On 07.06.17 at 10:12, <JBeulich@xxxxxxxx> wrote:
> >>>> On 06.06.17 at 21:19, <sstabellini@xxxxxxxxxx> wrote:
> >> On Tue, 6 Jun 2017, Jan Beulich wrote:
> >>> >>> On 06.06.17 at 16:00, <ian.jackson@xxxxxxxxxxxxx> wrote:
> >>> > Looking at the serial logs for that and comparing them with 10009,
> >>> > it's not terribly easy to see what's going on because the kernel
> >>> > versions are different and so produce different messages about xenbr0
> >>> > (and I think may have a different bridge port management algorithm).
> >>> > 
> >>> > But the messages about promiscuous mode seem the same, and of course
> >>> > promiscuous mode is controlled by userspace, rather than by the kernel
> >>> > (so should be the same in both).
> >>> > 
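[For reference on the "controlled by userspace" point: the kernel prints the "left promiscuous mode" lines when whoever held promiscuous mode on the interface releases it; from userspace the usual way to set or clear it is the SIOCGIFFLAGS/SIOCSIFFLAGS ioctl pair. A rough sketch of that path only, with the interface name taken from the log below and error handling trimmed:

    /* Sketch only: clear IFF_PROMISC on an interface the way a
     * userspace tool might; the kernel then logs
     * "device <name> left promiscuous mode". */
    #include <string.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <net/if.h>

    static int clear_promisc(const char *ifname)      /* e.g. "vif7.0" */
    {
        struct ifreq ifr;
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        if (fd < 0)
            return -1;

        memset(&ifr, 0, sizeof(ifr));
        strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

        if (ioctl(fd, SIOCGIFFLAGS, &ifr) == 0) {      /* read current flags */
            ifr.ifr_flags &= ~IFF_PROMISC;             /* drop promiscuous mode */
            if (ioctl(fd, SIOCSIFFLAGS, &ifr) == 0) {  /* write them back */
                close(fd);
                return 0;
            }
        }

        close(fd);
        return -1;
    }

Note the bridge code can also take and release promiscuity internally when ports are added or removed, so the messages below need not come from a flags ioctl directly.]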
> >>> > However, in the failed test we see extra messages about promiscuous mode:
> >>> > 
> >>> >   Jun  5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
> >>> >   ...
> >>> >   Jun  5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode
> >>> 
> >>> Wouldn't those be another result of the guest shutting down /
> >>> being shut down?
> >>> 
> >>> > Also, the qemu log for the guest in the failure case says this:
> >>> > 
> >>> >   Log-dirty command enable
> >>> >   Log-dirty: no command yet.
> >>> >   reset requested in cpu_handle_ioreq.
> >>> 
> >>> So this would seem to call for instrumentation on the qemu side
> >>> then, as the only path via which this can be initiated is - afaics -
> >>> qemu_system_reset_request(), which doesn't have very many
> >>> callers that could possibly be of interest here. Adding Stefano ...
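[The instrumentation could be as small as logging the call site at that single funnel point. A rough sketch of the idea only, not the actual qemu-traditional code, whose body and helper names may differ:

    /* Sketch only: record which caller asked for the reset, so the
     * guest's qemu log shows whether it came from the ACPI PM write,
     * the keyboard controller reset line, or somewhere else. */
    #include <stdio.h>

    void qemu_system_reset_request(void)
    {
        fprintf(stderr, "qemu_system_reset_request: caller %p\n",
                __builtin_return_address(0));  /* resolve offline with addr2line */
        /* ... existing body: flag the reset and kick the main loop ... */
    }
]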
> >> 
> >> I am pretty sure that those messages come from qemu traditional: "reset
> >> requested in cpu_handle_ioreq" is not printed by qemu-xen.
> > 
> > Oh, indeed - I didn't pay attention to this being a *-qemut-*
> > test. I'm sorry.
> > 
> >> In any case, the request comes from qemu_system_reset_request, which is
> >> called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
> >> initiated the reset (or resume)?
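[For reference, the guest would get there simply by writing the PM1a control register: per the ACPI spec, setting SLP_EN (bit 13) together with a SLP_TYP value (bits 12:10) requests a sleep/soft-off transition, and that port write is presumably how hw/acpi.c:pm_ioport_writew gets invoked here. A guest-side sketch in C, with the port number a placeholder rather than the real value, which comes from the FADT:

    #include <sys/io.h>               /* outw(); assumes iopl(3) for a userspace test */

    #define PM1A_CNT_PORT  0xb004     /* placeholder: read the real port from the FADT */
    #define SLP_TYP(x)     ((unsigned)(x) << 10)
    #define SLP_EN         (1u << 13)

    /* Ask the platform for the sleep state described by slp_typ; in
     * the emulated machine this write is what the ACPI PM device
     * model handles. */
    static void acpi_request_sleep(unsigned int slp_typ)
    {
        outw(SLP_TYP(slp_typ) | SLP_EN, PM1A_CNT_PORT);
    }
]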
> > 
> > Right, this and hw/pckbd.c look to be the only possible
> > sources. Yet then it's still unclear what makes the guest go
> > down.
> 
> So with all of the above in mind I wonder whether we shouldn't
> revert 933f966bcd then - that debugging code is unlikely to help
> with any further analysis of the issue, as reaching that code
> for a dying domain is only a symptom as far as we understand it
> now, not anywhere near the cause.

Makes sense to me

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
