
Re: [Xen-devel] [linux-4.1 test] 63030: regressions - FAIL



On Wed, 2015-10-21 at 11:35 +0100, Wei Liu wrote:
> On Wed, Oct 21, 2015 at 10:44:48AM +0100, Ian Campbell wrote:
> > On Wed, 2015-10-21 at 10:24 +0100, Wei Liu wrote:
> > > On Wed, Oct 21, 2015 at 10:04:14AM +0100, Ian Campbell wrote:
> > > > On Tue, 2015-10-20 at 16:24 +0100, Wei Liu wrote:
> > > > > But this is only code inspection,  so I'm not very confident whether
> > > > > everything does what it says it does.
> > > > 
> > > > Right. I think this one probably needs someone to set up a system in a
> > > > similar configuration and play with it.
> > > > 
> > > 
> > > Is there an easy way to do that? Say, give me some runes so that I can
> > > lock a machine in Cambridge instance, run the failing test case.
> > 
> > I could[0] but, why can't you just set things up on your existing test
> > hosts, either using standalone mode or by just installing the guest by
> > hand?
> > 
> > That's what I would do (probably the latter) in the first instance. It's
> > very likely IME that you are going to need to poke at this interactively
> > while debugging and to run repeated migrations etc to trigger the issue.
> > IMHO trying to use osstest for such manual debugging is just going to get
> > in the way.
> > 
> 
> I could do all these manually, but not without paying much attention:
> allocating a new test box (all my test boxes are in use at the moment),
> run standalone mode, use standalone mode to install the test box, grab
> various tarballs from osstest website if I don't want to build them
> again, put them in suitable location and use standalone script to fiddle
> with standalone mode database, manually install a guest etc etc,  let
> alone the bug we're hunting might not be reproducible on the new test
> box due to different hardware and external environment (as we've already
> witnessed in production osstest system), then I'm left in dilemma
> wondering whether I should repeat all these things (well, part of) again
> or just give up.
> 
> This looks like a list of endless tedious tasks and it could go wrong
> many places in between. If I can get OSSTest to lock a box and run up to
> the point that it reproduces the issue that would be of great help.

This seems to me to be making a mountain out of a molehill; installing a
Xen host should be bread and butter for most of us.

However, since you insist, I recently added some explanation to the README of
how to make an ad hoc job, including cloning a previous flight and forcing it
to run on a given machine (useful if you think the failure might be machine
specific).

There is no mechanical way to then lock a host on failure. What I usually
do is run the mg-allocate command I mentioned in my previous mail after the
test case has already started. Since mg-allocate has a higher priority than
regular jobs, but with -U waits for the current job to finish, you are
basically guaranteed that your mg-allocate will get the host next.

> Furthermore, I can write down all the runes I use so that other people
> can do the same to reproduce bugs discovered in osstest. That would
> certainly help lower the barrier for people who want to help triaging
> bugs.

This sort of thing is of no help with triage. It might be useful for
debugging and reproducing an issue, but triage does not involve doing such
things; it is the step before.

I'm being pedantic here because I don't think it is helpful to overstate
what triage involves, since that will put people off doing useful triage
activities.

Ian.




 

