
Re: [Xen-devel] [PATCH v3 0/2][XTF] xtf/vpmu VPMU tests



On 28/04/17 20:35, Mohit Gambhir wrote:
>
>
> On 04/25/2017 02:50 PM, Andrew Cooper wrote:
>> On 24/04/17 18:54, Mohit Gambhir wrote:
>>> Mohit Gambhir (2):
>>>    xtf/vpmu: Add Intel PMU MSR addresses
>>>    xtf/vpmu: MSR read/write tests for VPMU
>>>
>>>   arch/x86/include/arch/msr-index.h |  11 +
>>>   tests/vpmu/Makefile               |   9 +
>>>   tests/vpmu/main.c                 | 504
>>> ++++++++++++++++++++++++++++++++++++++
>>>   3 files changed, 524 insertions(+)
>>>   create mode 100644 tests/vpmu/Makefile
>>>   create mode 100644 tests/vpmu/main.c
>>>
>> Independently of the content of this series, how have you found XTF
>> to use?  Any issues or unexpected corner cases which could be
>> improved?
> I think overall it was fairly easy to use and was very useful for the
> kind of testing we wanted to do for VPMU.
> It helped me find 3 bugs - one hypervisor crash, one potential state
> leak and one piece of missing functionality.
> I didn't find any bugs/corner cases in XTF, but here are a few
> nice-to-have features that you could consider adding:
>
> 1. As said in responses to one of the comments, I think it would be
> nice to have some kind of "expected failure" return type that
> indicates that it is a known bug.  That's particularly useful if the
> person writing the tests is not a Xen developer.  You seem to have
> indicated that xtf_error() means that it is a known error, but that
> wasn't intuitive to me.

I clearly need to improve the documentation in this area.  The
terminology is inherited from XenRT (our internal test system) and
reflects 10 years of lessons learnt about how to sensibly report results
in a clear and concise fashion.

SUCCESS means "everything worked correctly".
FAILURE means "the test positively identified that the subject under
test behaved incorrectly".
ERROR means "something unexpected went wrong".
SKIP means "the test didn't run", usually because the preconditions
were not met.

The distinction between failure and error is important when triaging
results (and indeed, the priority order of investigation), but in the
end, still means that a human is going to have to look at the logs and
either fix a product bug/regression, or fix whatever is causing the
unexpected nature of the problem (which itself is either a testcase fix,
or more likely a testcase and product fix).

As for "expected failure", there is no such thing.  If your test case
has identified that Xen is malfunctioning, then that is a straight
failure.  Having any other kind of "its broken, but pretend everything
went successfully" is a sure-fire way for the problem to get ignored
forever more if it becomes the status quo in regression tests.

My method thus far is to fix all parts of the problem in Xen, then
introduce the complete XTF test which passes on master (but typically
will fail on older branches, and OSSTest takes care of relegating those
to its idea of "expected failure".)

I accept that this is harder to do for larger areas of functionality,
but there is no point running a regression test if we are expecting it
not to pass, because it won't identify other regressions which might
slip in.  However, this is predicated on getting master sorted before
the test gets run in a regression capacity.  (I am facing a similar
dilemma with the nested-virt tests, which is why I am leaning towards
having a new category for tests or areas of functionality which are in
development and aren't ready to be part of the regression tests yet.)

>
> 2. It's useful to print the __LINE__ number from which an
> error/failure is reported.

Is it?  I'm an advocate of the principle that if the message text
contains sufficient information to be considered useful, a
__func__/__FILE__/__LINE__ reference isn't necessary.  (Quality of
error message text is a frequent review point of mine, with a
maintainer's hat on.)

Nothing stops you formatting __LINE__ into the message, but I would
suggest that probably means your text could be better.
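
To illustrate the contrast (a sketch only, assuming xtf_failure() takes
a printf-style format; the MSR name and the val variable are
illustrative):

    /* Location-based: says where the check lives, not what went
     * wrong. */
    xtf_failure("Fail: %s:%d\n", __FILE__, __LINE__);

    /* Message-based: says what actually went wrong, which is what
     * matters when triaging a log. */
    xtf_failure("Fail: GLOBAL_CTRL expected 0, got 0x%lx\n",
                (unsigned long)val);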

>
> 3. Is test_main() considered 1 test?  If so, it would be useful to
> define a test at a finer granularity.  For instance, in the VPMU
> case, each MSR test should really be a separate test.  It would
> require adding some semantics to the names of the functions that
> define a test.  That way, it's easier to keep track of the health of
> the software, I think.

There is a tradeoff with the granularity of reporting.  It is a
conscious design choice that a single microkernel's worth of test logic
has a single net result (again, inherited from XenRT's eventual lessons
learnt about how to effectively deal with tens of thousands of
machine-hours of tests per night, and hundreds of thousands of logical
tests).

The common case for automation is "run all $N thousand XTF tests, and
raise tickets for any failures".  It is not useful for the automation
test framework to have to deal in terms of individual test steps worth
of success/failure inside a single logical test.

On the other hand, when any non-success is encountered, a developer is
going to have to intervene (at which point the quality of error
information is important), but the test is designed to be easy and quick
to run in isolation, so investigating a single failure inside something
like this larger vpmu test shouldn't be difficult.
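
In practice that tends to mean one test_main() driving a set of
finer-grained checks, each of which reports its own descriptive
xtf_failure(), with the framework folding the individual reports into
the single net result.  A rough sketch, in which the check_msr()
helper, the MSR list and the expected values are all illustrative:

    #include <xtf.h>

    /* Each check reports its own descriptive failure, but the test as
     * a whole still yields one net result (the framework is assumed
     * to report the most severe status seen). */
    static void check_msr(uint32_t msr, uint64_t expected)
    {
        uint64_t val = rdmsr(msr);

        if ( val != expected )
            xtf_failure("Fail: MSR 0x%x: expected 0x%lx, got 0x%lx\n",
                        msr, (unsigned long)expected,
                        (unsigned long)val);
    }

    void test_main(void)
    {
        check_msr(MSR_CORE_PERF_GLOBAL_CTRL,   0);
        check_msr(MSR_CORE_PERF_GLOBAL_STATUS, 0);

        xtf_success("Success: vPMU MSR checks complete\n");
    }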

>
> 4. Is there a way to build just one test? Something like make
> test-hvm64-vpmu?

No, but I've been meaning to rewrite the build system to be less bad
than it currently is.  For now, the fact that total compilation of
everything is only a few seconds means that it isn't terribly high on my
priority list.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

