
Re: [Xen-devel] [linux-4.1 test] 63030: regressions - FAIL



On Thu, Oct 22, 2015 at 10:50:54AM +0100, Ian Campbell wrote:
> On Wed, 2015-10-21 at 18:34 +0100, Wei Liu wrote:
> > On Wed, Oct 21, 2015 at 05:47:06PM +0100, Ian Campbell wrote:
> > > On Tue, 2015-10-20 at 16:34 +0100, Ian Jackson wrote:
> > > > Wei Liu writes ("Re: [Xen-devel] [linux-4.1 test] 63030: regressions - FAIL"):
> > > > > From mere code inspection and document of lwip 1.3.0 I think
> > > > > mini-os does send gratuitous ARP.
> > > > 
> > > > The guest is using the PVHVM drivers at this point, with the backend
> > > > directly in dom0, so it is the guest's gratuitous arp which is
> > > > needed, I think.
> > > 
> > > It would be worth investigating whether mini-os's gratuitous ARP might
> > > also be occurring and confusing things, e.g. by coming after and
> > > therefore taking precedence over the one coming from the guest.
> > > 
> > 
> > Several observations:
> > 
> > 1. The guest doesn't always send gratuitous arp -- but this might not be
> >    the cause of this failure. Guest works fine when using qemu-trad
> >    only.
> 
> As in it always sends the arp when using qemu-trad, or that it is fine
> irrespective of not always sending it?
> 

Whether or not stubdom is in use, the guest behaves the same -- it
doesn't always send gratuitous arp.

When using qemu-trad alone, it's always fine even when the guest doesn't
send gratuitous arp, because either a cache entry in dom0 already has the
guest's MAC address or the guest responds instantly to dom0's arp request.

So the responsiveness of the guest is the key.

> > 2. Guest only sends one gratuitous arp at most.
> 
> This is as expected, but does the stubdom also send one?
> 

There is at most one gratuitous arp request per migration. I think it's
from the guest, not the stubdom. Identifying the exact interface the arp
packet comes from requires a bit of gymnastics with tcpdump that I didn't
manage to do yesterday.
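
For reference, a minimal sketch of how a captured ARP payload could be
classified as gratuitous (sender IP equal to target IP). This is not taken
from osstest or from the capture above; the function names parse_arp and
is_gratuitous are hypothetical, and capturing the frames on each interface
separately (e.g. the guest vif versus xenbr0) is assumed to happen elsewhere.

```python
import socket
import struct

# Layout of an Ethernet/IPv4 ARP payload: 28 bytes, network byte order.
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_LENGTH = struct.calcsize(ARP_FORMAT)


def parse_arp(payload: bytes) -> dict:
    """Parse the ARP payload that follows the Ethernet header (hypothetical helper)."""
    if len(payload) < ARP_LENGTH:
        raise ValueError("ARP payload too short")
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, payload[:ARP_LENGTH]
    )
    return {
        "operation": oper,  # 1 = request, 2 = reply
        "sender_mac": ":".join(f"{b:02x}" for b in sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": ":".join(f"{b:02x}" for b in tha),
        "target_ip": socket.inet_ntoa(tpa),
    }


def is_gratuitous(arp: dict) -> bool:
    """A gratuitous ARP announces the sender's own address: sender IP == target IP."""
    return arp["sender_ip"] == arp["target_ip"]
```

Running this over the ARP frames seen on each interface in turn would show
which side the single gratuitous arp originates from, by comparing the
sender MAC against the guest's and the stubdom's MAC addresses.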

> > 3. When using stubdom, guest is a lot less responsive. See two
> >    experiments and analysis below.
> 
> Less responsive in use or only while migrating, or to ssh after migration,
> or to something else?
> 

Less responsive to every activity for a period of time after migration,
including both arp request / reply and ssh connections.

> > Scenario 1:
> >   xl shows "Migration successful."
> >   ...30s...
> >   xenbr0 receives gratuitous arp
> >   ...1s...
> >   ssh date command comes back
> > 
> > Scenario 2:
> >   xenbr0 receives gratuitous arp
> >   ...1s...
> >   xl shows "Migration successful."
> >   ssh date command comes back
> > 
> > When stubdom was not present I never saw scenario 1.
> 
> It would be worth looking at the possibility of a delay between "Migration
> successful" and the target domain actually running. A 30s delay between the
> guest restarting and it sending the ARP would be pretty strange IMHO
> 

The guest is in a weird state.

xl list shows the stubdom in the "b" state while the guest has no state
at all, heh.

Wei.

> > Note that my machine is relatively old (>6 years). It would never pass
> > the test in osstest because in osstest the timeout is 10s.
> > 
> > The slowness in osstest seems to be host specific because all failures
> > in the guest migrate test occurred on merlot*. It's not only linux-4.1
> > that is failing; other branches fail the same test step on merlot*, too.
> 
> This could be a factor in common with the other qemu timeout on merlot which
> led to 9acfbe14d726.
> 
> It might be worth prodding AMD over that issue again.
> 
> Ian.


