
Re: [Xen-devel] [OSSTEST Nested PATCH 2/6] Add and expose some testsupport APIs



On Fri, 2015-03-20 at 12:59 +0000, Pang, LongtaoX wrote:
> 
> 
> > -----Original Message-----
> > From: Ian Campbell [mailto:ian.campbell@xxxxxxxxxx]
> > Sent: Friday, March 20, 2015 8:20 PM
> > To: Pang, LongtaoX
> > Cc: xen-devel@xxxxxxxxxxxxx; Ian.Jackson@xxxxxxxxxxxxx; wei.liu2@xxxxxxxxxx;
> > Hu, Robert
> > Subject: Re: [OSSTEST Nested PATCH 2/6] Add and expose some testsupport
> > APIs
> > 
> > On Fri, 2015-03-20 at 12:09 +0000, Pang, LongtaoX wrote:
> > > Add xen-devel in mail loop.
> > 
> > Here is what I wrote in reply to the private version without noticing that
> > it was private.
> > 
> > On Fri, 2015-03-20 at 11:59 +0000, Pang, LongtaoX wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Ian Campbell [mailto:ian.campbell@xxxxxxxxxx]
> > > > Sent: Friday, March 20, 2015 12:27 AM
> > > > To: Pang, LongtaoX
> > > > Cc: xen-devel@xxxxxxxxxxxxx; Ian.Jackson@xxxxxxxxxxxxx;
> > > > wei.liu2@xxxxxxxxxx; Hu, Robert
> > > > Subject: Re: [OSSTEST Nested PATCH 2/6] Add and expose some
> > > > testsupport APIs
> > > >
> > > > On Tue, 2015-03-17 at 14:16 -0400, longtao.pang wrote:
> > > > > From: "longtao.pang" <longtaox.pang@xxxxxxxxx>
> > > > >
> > > > > 1. Designate vif model to 'e1000', otherwise, with default device
> > > > > model, the L1 eth0 interface disappears, hence xenbridge cannot work.
> > > > > Maybe this limitation can be removed later after someone fixes it. For
> > > > > now, we have to accommodate it.
> > > >
> > > > You have done this unconditionally, which means it affects all guests.
> > > > You need to make this configurable by the caller, probably by
> > > > plumbing it through in $xopts (a hash of extra options).
> > > >
> > > > I see now you were told this last time around by Ian J, please don't
> > > > just resend such things without change: either fix them, make an
> > > > argument for doing it your way or ask for clarification if you don't
> > > > understand the requested change.
> > > >
> > >
> > > Thanks for your advice, I will try it. But, do you have any idea about the
> > > issue below that confused me?
> > > After the L1 Debian hvm guest boots into the XEN kernel, it fails to load
> > > the 8139cp driver (Realtek RTL-8139), which makes the L1 guest's network
> > > unavailable, and I have to specify 'model=e1000' to make L1's network
> > > available.
> > > The issue does not exist in RHEL6u5 OS(L0 and L1 are both RHEL6u5 OS).
> > 
> > Could just be a bug in Debian's kernel. Without more information it's rather
> > hard to say.
> > 
> > >
> > > > > 2. Since reboot L1 guest VM will take more time to boot up, we
> > > > > increase multi-times for reboot-confirm-booted if test nested job,
> > > > > and the multi value is stored as a runvar in 'ts-nested-setup' script.
> > > > > Added another function 'guest_editconfig_cd' and expose it, this
> > > > > function basically changes guest boot device sequence, alters its
> > > > > on_reboot behavior to restart and enables nestedhvm feature.
> > > >
> > > > This looks like two items run together?
> > > >
> > > > The multi_reboot_time thing sounds ok, but it should be called
> > > > reboot_time_factor or something like that. In fact I see that Ian
> > > > suggested previously that it should have the host ident in it, that
> > > > makes sense to me.
> > > >
> > > I will try it. Also, how do you handle below question after reboot
> > > host OS during running OSSTest job?
> > > After finishing L0 and L1 host installation, the OSs will take a lot
> > > time(about 150s) to start MTA service and NTP service. I know that,
> > > the poll_loop timeout is 40s of 'reboot-confirm-booted', that's why
> > > timeout happened when calling 'host_reboot' function after reboot host OS.
> > 
> > I'm afraid I don't know what you are asking here.
> > 
> When rebooting the Debian L0 host or the Debian L1 guest, it takes a
> long time (about 150s) during boot to start the MTA and NTP services
> before it finishes booting into Debian.

That normally suggests that there is something wrong with your network
setup, perhaps forward or reverse DNS. IMHO it shouldn't ever take 150s
to do this for a native/L0 Debian.

For example
http://www.chiark.greenend.org.uk/~xensrcts/logs/36514/test-amd64-amd64-xl/serial-gall-mite.log
 shows:
        Mar 18 21:42:39.935554 Using makefile-style concurrent boot in runlevel 2.
        Mar 18 21:42:39.951952 Starting rpcbind daemon...Already running..
        Mar 18 21:42:39.995954 Starting NFS common utilities: statd idmapd.
        Mar 18 21:42:40.019569 Starting enhanced syslogd: rsyslogd.
        Mar 18 21:42:40.191582 Starting ACPI services....
        Mar 18 21:42:40.315569 Starting web server: apache2.
        Mar 18 21:42:40.539998 Starting deferred execution scheduler: atd.
        Mar 18 21:42:40.708028 Starting periodic command scheduler: cron.
        Mar 18 21:42:40.876030 Starting NTP server: ntpd.
        Mar 18 21:42:40.876078 Starting OpenBSD Secure Shell server: sshd.
        Mar 18 21:42:41.052041 Starting MTA: exim4.
        Mar 18 21:42:41.275639 
        Mar 18 21:42:42.351734 Debian GNU/Linux 7 gall-mite ttyS0
        Mar 18 21:42:42.363737 
        Mar 18 21:42:42.363769 gall-mite login: 
i.e. a few seconds. Even a really slow machine shouldn't be taking 150s.
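
To put a number on your own serial log, something like the following
might help. It is only a rough sketch, not part of osstest: the function
name boot_span_seconds and its year parameter are made up for
illustration, and it assumes every interesting line starts with a
"Mon DD HH:MM:SS.ffffff" stamp like the excerpt above. The stamps carry
no year, so one is supplied purely to make them parseable.

```python
import re
from datetime import datetime

# Matches the "Mar 18 21:42:39.935554" prefix seen in the serial log excerpt.
STAMP = re.compile(r"^\s*([A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d+)")


def boot_span_seconds(log_text, year=2015):
    """Seconds between the first and last timestamped lines, or None.

    `year` is an assumption: the log stamps omit it, so it is only
    used to build a complete date for parsing.
    """
    stamps = []
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if match:
            stamps.append(
                datetime.strptime(f"{year} {match.group(1)}",
                                  "%Y %b %d %H:%M:%S.%f"))
    if len(stamps) < 2:
        return None
    return (stamps[-1] - stamps[0]).total_seconds()


# Three lines copied from the gall-mite excerpt above.
SAMPLE = """\
        Mar 18 21:42:39.935554 Using makefile-style concurrent boot in runlevel 2.
        Mar 18 21:42:41.052041 Starting MTA: exim4.
        Mar 18 21:42:42.363769 gall-mite login:
"""

if __name__ == "__main__":
    print(boot_span_seconds(SAMPLE))
```

On those three lines it reports roughly 2.4 seconds. Running it over the
same stretch of your L0 and L1 logs should show which service start-up
the 150s is actually being spent in.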

I doubt L1 should be noticeably different from L0 in this (L2 might be a
different matter but AFAICT you aren't talking about L2 here).

You need to figure out what is in your environment that makes it take so
long to start up these services, not work around the issue by fudging
the timeouts in osstest.
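
If you want a quick way to test the DNS theory, a sketch along these
lines could be run on the slow host. It is not osstest code, and the
names time_lookup and check_dns are hypothetical; it only uses the
standard socket.gethostbyname and socket.gethostbyaddr calls and times
them. The forward/reverse parameters exist only so that the resolvers
can be swapped out when trying the function without a network.

```python
import socket
import time


def time_lookup(fn, arg):
    """Call fn(arg) and return (elapsed_seconds, result, error)."""
    start = time.monotonic()
    try:
        return time.monotonic() - start, fn(arg), None
    except OSError as err:  # socket.gaierror and socket.herror derive from OSError
        return time.monotonic() - start, None, err


def check_dns(hostname,
              forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Time a forward lookup and, if it succeeds, the matching reverse lookup."""
    start = time.monotonic()
    _, addr, fwd_err = time_lookup(forward, hostname)
    report = {
        "forward_secs": time.monotonic() - start,
        "address": addr,
        "forward_error": fwd_err,
    }
    if addr is not None:
        start = time.monotonic()
        _, rev, rev_err = time_lookup(reverse, addr)
        report["reverse_secs"] = time.monotonic() - start
        report["reverse_name"] = rev[0] if rev else None
        report["reverse_error"] = rev_err
    return report


if __name__ == "__main__":
    # Check the name this machine believes it has.
    print(check_dns(socket.gethostname()))
```

A lookup that takes many seconds, or fails only after a long timeout,
in either direction would be consistent with the kind of network setup
problem I mean. It would not prove that DNS is the cause, but it would
tell you where to look next.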

Ian.

