[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [XTF PATCH v2] xtf-runner: use xl create -Fc directly



On Tue, Aug 09, 2016 at 02:11:27PM +0100, Andrew Cooper wrote:
> On 08/08/16 14:13, Wei Liu wrote:
> > On Mon, Aug 08, 2016 at 02:06:37PM +0100, Andrew Cooper wrote:
> >> On 08/08/16 12:24, Wei Liu wrote:
> >>> Now that xl create -c is fixed in xen-unstable, there is no need to keep
> >>> the hack to get guest console output anymore.
> >>>
> >>> Use xl create -Fc directly, then wait for the xl process to exit.  Print
> >>> any error as it occurs.
> >>>
> >>> Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
> >> Sadly, now I think about this further, it does re-introduce the
> >> serialisation problem I was trying specifically trying to avoid.
> >>
> > Can you give an example of the race you wanted to avoid?
> >
> > I thought with the xenconsole work in place I had solved all races I was
> > aware of, but maybe I missed something obvious.
> 
> It isn't a race.  I do not want synchronously wait for the taredown of
> domains to complete, as it adds unnecessary latency when running more
> than a single test.
> 
> >> You need to run `xl create -F` so you can sensibly wait on the create
> >> list to avoid tripping up the leak detection.
> >>
> >> However, the guest.communicate() call will wait for the guest process to
> >> terminate, which includes all output.
> >>
> > Is there a problem with that?
> 
> It makes your "for child in child_list" useless, as all processes have
> guareteed to exit already.
> 
> >
> >> Therefore, I think we still need the `xl create -Fp`, `xl console`, `xl
> >> unpause` dance, where the create process gets put on the create_list,

There are several issues with this.

Let's go back to the original `xl create -p`, `xl console` and `xl
unpause` first.  This sequence is not race-free. 

xl create -p can race with xl console. That, however, is mitigated by
xl console waiting for 5s for tty node to show up and the use of
subproc_call in runner.

xl unpause can race with xl console (Popen doesn't wait for process to
finish). The domain can be unpaused and exit before xl console gets a
chance to attach. This would cause random failure in osstest.

See

http://osstest.xs.citrite.net/~osstest/testlogs/logs/66926/test-xtf-amd64-amd64-2/info.html
http://osstest.xs.citrite.net/~osstest/testlogs/logs/66926/test-xtf-amd64-amd64-2/gall-mite---var-log-xen-console-guest-test-hvm32-fep.log

My theory is that the failure in fep test is due to the second race.
Unfortunately I can't reproduce that locally at the moment.

Then let's look at `xl create -Fp`, `xl console` and `xl unpause`
sequence. `xl create -Fp` doesn't return until the guest vanishes, so
that means we need to use Popen to run it. The leaves us even less clue
to tell when to execute `xl console`. The first race can only become
worse. The second race still exists.

If you want to retain this trick. I'm afraid what you need is more or
less my very first patch.

> >> and it is the console process which gets communicated with.
> >>
> >> This also has the advantage that it doesn't cause ./xtf-runner to break
> >> against all non-staging trees.
> >>
> > I thought we decided to grep log file for that?
> 
> Right, but until that happens, this patch constitutes a functional
> regression.
> 

Hmm... I think we're talking past each other. Let me clarify.

There are two distinct use cases: used by a human vs used by an
automated test system like osstest.

For osstest this is not a functional regression because that function is
not reliable in the first place. We would need grepping log file to fix
the issue.

For a human, because he or she doesn't care about 1 in 50 random
failure, this is a functional regression. It's still something that
should be fixed (and can be fixed) by grepping log file, but obviously
less important.

I don't want to mix these two use cases. This patch is indeed biased
towards the automated test system. I think I can retain the original
function and test the version of Xen it runs on to decide which version
to use.

FAOD my impression is that you have strong opinions on how log grepping
should be implemented. That's why I didn't do it.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.