
Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL



On Fri, 9 Jun 2017, Jan Beulich wrote:
> >>> On 07.06.17 at 10:12, <JBeulich@xxxxxxxx> wrote:
> >>>> On 06.06.17 at 21:19, <sstabellini@xxxxxxxxxx> wrote:
> >> On Tue, 6 Jun 2017, Jan Beulich wrote:
> >>> >>> On 06.06.17 at 16:00, <ian.jackson@xxxxxxxxxxxxx> wrote:
> >>> > Looking at the serial logs for that and comparing them with 10009,
> >>> > it's not terribly easy to see what's going on because the kernel
> >>> > versions are different and so produce different messages about xenbr0
> >>> > (and I think may have a different bridge port management algorithm).
> >>> > 
> >>> > But the messages about promiscuous mode seem the same, and of course
> >>> > promiscuous mode is controlled by userspace, rather than by the kernel
> >>> > (so should be the same in both).
> >>> > 
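[For reference on the "controlled by userspace" point: the kernel prints the "left promiscuous mode" lines when whoever held promiscuous mode on the interface releases it; from userspace the usual way to set or clear it is the SIOCGIFFLAGS/SIOCSIFFLAGS ioctl pair. A rough sketch of that path only, with the interface name taken from the log below and error handling trimmed:

    /* Sketch only: clear IFF_PROMISC on an interface the way a
     * userspace tool might; the kernel then logs
     * "device <name> left promiscuous mode". */
    #include <string.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <net/if.h>

    static int clear_promisc(const char *ifname)      /* e.g. "vif7.0" */
    {
        struct ifreq ifr;
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        if (fd < 0)
            return -1;

        memset(&ifr, 0, sizeof(ifr));
        strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

        if (ioctl(fd, SIOCGIFFLAGS, &ifr) == 0) {      /* read current flags */
            ifr.ifr_flags &= ~IFF_PROMISC;             /* drop promiscuous mode */
            if (ioctl(fd, SIOCSIFFLAGS, &ifr) == 0) {  /* write them back */
                close(fd);
                return 0;
            }
        }

        close(fd);
        return -1;
    }

Note the bridge code can also take and release promiscuity internally when ports are added or removed, so the messages below need not come from a flags ioctl directly.]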
> >>> > However, in the failed test we see extra messages about promiscuous mode:
> >>> > 
> >>> >   Jun  5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
> >>> >   ...
> >>> >   Jun  5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode
> >>> 
> >>> Wouldn't those be another result of the guest shutting down /
> >>> being shut down?
> >>> 
> >>> > Also, the qemu log for the guest in the failure case says this:
> >>> > 
> >>> >   Log-dirty command enable
> >>> >   Log-dirty: no command yet.
> >>> >   reset requested in cpu_handle_ioreq.
> >>> 
> >>> So this would seem to call for instrumentation on the qemu side
> >>> then, as the only path via which this can be initiated is - afaics -
> >>> qemu_system_reset_request(), which doesn't have very many
> >>> callers that could possibly be of interest here. Adding Stefano ...
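[The instrumentation could be as small as logging the call site at that single funnel point. A rough sketch of the idea only, not the actual qemu-traditional code, whose body and helper names may differ:

    /* Sketch only: record which caller asked for the reset, so the
     * guest's qemu log shows whether it came from the ACPI PM write,
     * the keyboard controller reset line, or somewhere else. */
    #include <stdio.h>

    void qemu_system_reset_request(void)
    {
        fprintf(stderr, "qemu_system_reset_request: caller %p\n",
                __builtin_return_address(0));  /* resolve offline with addr2line */
        /* ... existing body: flag the reset and kick the main loop ... */
    }
]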
> >> 
> >> I am pretty sure that those messages come from qemu traditional: "reset
> >> requested in cpu_handle_ioreq" is not printed by qemu-xen.
> > 
> > Oh, indeed - I didn't pay attention to this being a *-qemut-*
> > test. I'm sorry.
> > 
> >> In any case, the request comes from qemu_system_reset_request, which is
> >> called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
> >> initiated the reset (or resume)?
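[For reference, the guest would get there simply by writing the PM1a control register: per the ACPI spec, setting SLP_EN (bit 13) together with a SLP_TYP value (bits 12:10) requests a sleep/soft-off transition, and that port write is presumably how hw/acpi.c:pm_ioport_writew gets invoked here. A guest-side sketch in C, with the port number a placeholder rather than the real value, which comes from the FADT:

    #include <sys/io.h>               /* outw(); assumes iopl(3) for a userspace test */

    #define PM1A_CNT_PORT  0xb004     /* placeholder: read the real port from the FADT */
    #define SLP_TYP(x)     ((unsigned)(x) << 10)
    #define SLP_EN         (1u << 13)

    /* Ask the platform for the sleep state described by slp_typ; in
     * the emulated machine this write is what the ACPI PM device
     * model handles. */
    static void acpi_request_sleep(unsigned int slp_typ)
    {
        outw(SLP_TYP(slp_typ) | SLP_EN, PM1A_CNT_PORT);
    }
]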
> > 
> > Right, this and hw/pckbd.c look to be the only possible
> > sources. Yet then it's still unclear what makes the guest go
> > down.
> 
> So with all of the above in mind I wonder whether we shouldn't
> revert 933f966bcd then - that debugging code is unlikely to help
> with any further analysis of the issue, as reaching that code
> for a dying domain is only a symptom as far as we understand it
> now, not anywhere near the cause.

Makes sense to me

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
