Re: [Xen-devel] [linux-4.1 test] 63030: regressions - FAIL
On Wed, 2015-10-21 at 18:34 +0100, Wei Liu wrote:
> On Wed, Oct 21, 2015 at 05:47:06PM +0100, Ian Campbell wrote:
> > On Tue, 2015-10-20 at 16:34 +0100, Ian Jackson wrote:
> > > Wei Liu writes ("Re: [Xen-devel] [linux-4.1 test] 63030: regressions
> > > - FAIL"):
> > > > From mere code inspection and the lwip 1.3.0 documentation, I think
> > > > mini-os does send gratuitous ARP.
> > >
> > > The guest is using the PVHVM drivers at this point, with the backend
> > > directly in dom0, so it is the guest's gratuitous ARP which is
> > > needed, I think.
> >
> > It would be worth investigating whether mini-os's gratuitous ARP might
> > also be occurring and confusing things, e.g. by coming after and
> > therefore taking precedence over the one coming from the guest.
> >
>
> Several observations:
>
> 1. The guest doesn't always send a gratuitous ARP, but this might not
> be the cause of this failure. The guest works fine when using
> qemu-trad only.
As in it always sends the ARP when using qemu-trad, or that it works fine
regardless of whether it sends one?
> 2. Guest only sends one gratuitous arp at most.
This is as expected, but does the stubdom also send one?
> 3. When using stubdom, guest is a lot less responsive. See two
> experiments and analysis below.
Less responsive in use or only while migrating, or to ssh after migration,
or to something else?
> Scenario 1:
> xl shows "Migration successful."
> ...30s...
> xenbr0 receives gratuitous arp
> ...1s...
> ssh date command comes back
>
> Scenario 2:
> xenbr0 receives gratuitous arp
> ...1s...
> xl shows "Migration successful."
> ssh date command comes back
>
> When stubdom was not present I never saw scenario 1.
It would be worth looking at the possibility of a delay between "Migration
successful" and the target domain actually running. A 30s delay between the
guest restarting and it sending the ARP would be pretty strange IMHO.
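
To narrow down where the delay sits, it may help to timestamp the gratuitous ARP as it arrives on the bridge and compare that with the time xl prints "Migration successful". What follows is only a rough sketch, not something taken from osstest: the helper names is_gratuitous_arp and build_arp_frame are made up for illustration, and it assumes untagged Ethernet frames carrying IPv4 ARP. It treats an ARP packet as gratuitous when the sender and target protocol addresses are equal.

```python
import struct

ETH_P_ARP = 0x0806      # EtherType for ARP
ETH_HDR_LEN = 14        # dst MAC (6) + src MAC (6) + EtherType (2)
ARP_IPV4_LEN = 28       # ARP payload size for Ethernet/IPv4


def is_gratuitous_arp(frame: bytes) -> bool:
    """Return True if an untagged Ethernet frame is a gratuitous IPv4 ARP.

    Hypothetical helper: "gratuitous" here means sender IP == target IP.
    VLAN-tagged frames are not handled.
    """
    if len(frame) < ETH_HDR_LEN + ARP_IPV4_LEN:
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETH_P_ARP:
        return False
    htype, ptype, hlen, plen, _oper = struct.unpack("!HHBBH", frame[14:22])
    # Only Ethernet (htype 1) carrying IPv4 (ptype 0x0800) is considered.
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return False
    sender_ip = frame[28:32]   # ARP offset 14..18
    target_ip = frame[38:42]   # ARP offset 24..28
    return sender_ip == target_ip


def build_arp_frame(sender_mac: bytes, sender_ip: bytes,
                    target_ip: bytes, oper: int = 1) -> bytes:
    """Build a broadcast ARP frame, used here only to exercise the check."""
    eth = b"\xff" * 6 + sender_mac + struct.pack("!H", ETH_P_ARP)
    arp = (struct.pack("!HHBBH", 1, 0x0800, 6, 4, oper)
           + sender_mac + sender_ip + b"\x00" * 6 + target_ip)
    return eth + arp


if __name__ == "__main__":
    mac = bytes.fromhex("00163e000001")
    ip = bytes([10, 0, 0, 5])
    print(is_gratuitous_arp(build_arp_frame(mac, ip, ip)))
    print(is_gratuitous_arp(build_arp_frame(mac, ip, bytes([10, 0, 0, 1]))))
```

On Linux the frames could be read from an AF_PACKET raw socket bound to xenbr0 (needs root), logging a monotonic timestamp whenever the check returns True. Logging the sender MAC as well should show whether the ARP came from the guest or from the stubdom.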
> Note that my machine is relatively old (>6 years). It would never pass
> the test in osstest because in osstest the timeout is 10s.
>
> The slowness in osstest seems to be host specific because all failures
> in the guest migrate test occurred on merlot*. It's not only linux-4.1
> that is failing; other branches fail the same test step on merlot*, too.
This could be a factor in common with the other qemu timeout on merlot which
led to 9acfbe14d726.
It might be worth prodding AMD over that issue again.
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel