Re: [Xen-devel] [linux-4.1 test] 63030: regressions - FAIL
On Wed, 2015-10-21 at 18:34 +0100, Wei Liu wrote:
> On Wed, Oct 21, 2015 at 05:47:06PM +0100, Ian Campbell wrote:
> > On Tue, 2015-10-20 at 16:34 +0100, Ian Jackson wrote:
> > > Wei Liu writes ("Re: [Xen-devel] [linux-4.1 test] 63030: regressions
> > > - FAIL"):
> > > > From mere code inspection and document of lwip 1.3.0 I think
> > > > mini-os does send gratuitous ARP.
> > >
> > > The guest is using the PVHVM drivers at this point, with the backend
> > > directly in dom0, so it is the guest's gratuitous arp which is
> > > needed, I think.
> >
> > It would be worth investigating whether mini-os's gratuitous ARP might
> > also be occurring and confusing things, e.g. by coming after and
> > therefore taking precedence over the one coming from the guest.
> >
>
> Several observations:
>
> 1. The guest doesn't always send gratuitous arp -- but this might not be
>    the cause of this failure. Guest works fine when using qemu-trad
>    only.

As in it always sends the arp when using qemu-trad, or that it is fine
irrespective of not always sending it?

> 2. Guest only sends one gratuitous arp at most.

This is as expected, but does the stubdom also send one?

> 3. When using stubdom, guest is a lot less responsive. See two
>    experiments and analysis below.

Less responsive in use or only while migrating, or to ssh after
migration, or to something else?

> Scenario 1:
>   xl shows "Migration successful."
>   ...30s...
>   xenbr0 receives gratuitous arp
>   ...1s...
>   ssh date command comes back
>
> Scenario 2:
>   xenbr0 receives gratuitous arp
>   ...1s...
>   xl shows "Migration successful."
>   ssh date command comes back
>
> When stubdom was not present I never saw scenario 1.

It would be worth looking at the possibility of a delay between
"Migration successful" and the target domain actually running. A 30s
delay between the guest restarting and it sending the ARP would be
pretty strange IMHO.

> Note that my machine is relative old (>6 years). It would never pass
> the test in osstest because in osstest the timeout is 10s.
>
> The slowness in osstest seems to be host specific because all failures
> in guest migrate test failed on merlot*. It's not only linux-4.1 is
> failing, other branches fail the same test step on merlot*, too.

This could be a factor in common with the other qemu timeout on merlot
which led to 9acfbe14d726. It might be worth prodding AMD over that
issue again.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
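For readers following the thread: the question of whose gratuitous ARP reaches xenbr0, and when, comes down to spotting ARP frames whose sender and target protocol addresses are equal and reading the sender hardware address. The following is only a minimal sketch of that check in Python, not anything taken from osstest, mini-os or lwip. The function names, the MAC address and the IP addresses are all hypothetical, and capturing the raw frames from the bridge is assumed to happen elsewhere.

```python
import struct

ETH_P_ARP = 0x0806          # EtherType for ARP
BROADCAST = b"\xff" * 6     # Ethernet broadcast address


def build_gratuitous_arp(mac: bytes, ip: bytes) -> bytes:
    """Build a 42-byte Ethernet frame announcing `ip` at `mac` (hypothetical helper)."""
    eth = BROADCAST + mac + struct.pack("!H", ETH_P_ARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, op=1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # sender MAC, sender IP, target MAC (zeroed), target IP == sender IP
    arp += mac + ip + b"\x00" * 6 + ip
    return eth + arp


def parse_gratuitous_arp(frame: bytes):
    """Return (sender_mac, sender_ip) if `frame` is a gratuitous ARP, else None."""
    if len(frame) < 42:
        return None
    if struct.unpack("!H", frame[12:14])[0] != ETH_P_ARP:
        return None
    htype, ptype, hlen, plen, _op = struct.unpack("!HHBBH", frame[14:22])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None
    sender_mac, sender_ip = frame[22:28], frame[28:32]
    target_ip = frame[38:42]
    # A gratuitous ARP announces the sender's own address.
    if sender_ip != target_ip:
        return None
    return sender_mac, sender_ip


if __name__ == "__main__":
    mac = bytes.fromhex("00163e000001")   # illustrative guest MAC
    ip = bytes([10, 0, 0, 5])             # illustrative guest IP
    print(parse_gratuitous_arp(build_gratuitous_arp(mac, ip)))
```

Logging the sender MAC together with a timestamp for each match would show whether more than one announcement arrives after a migration and in which order, which is the information the questions above are asking for.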