
Re: [Xen-devel] [Notes for xen summit 2018 design session] Process changes: is the 6 monthly release Cadence too short, Security Process, ...

>>> On 05.07.18 at 20:13, <cardoe@xxxxxxxxxx> wrote:
> On Thu, Jul 05, 2018 at 12:16:09PM +0100, Ian Jackson wrote:
>> Juergen Gross writes ("Re: [Xen-devel] [Notes for xen summit 2018 design
>> session] Process changes: is the 6 monthly release Cadence too short,
>> Security Process, ..."):
>> > We didn't look at the sporadic failing tests thoroughly enough. The
>> > hypercall buffer failure has been there for ages, a newer kernel just
>> > made it more probable. This would have saved us some weeks.
>> In general, as a community, we are very bad at this kind of thing.
>> In my experience, the development community is not really interested
>> in fixing bugs which aren't directly in their way.
>> You can observe this easily in the way that regressions in Linux,
>> spotted by osstest, are handled.  Linux 4.9 has been broken for 43
>> days.  Linux mainline is broken too.
>> We do not have a team of people reading these test reports, and
>> chasing developers to fix them.  I certainly do not have time to do
>> this triage.  On trees where osstest failures do not block
>> development, things go unfixed for weeks, sometimes months.
> Honestly, this is where we need some kind of metrics with output that my
> 5-year-old could decipher. The osstest emails are large and overwhelming,
> and require a real time commitment to digest the volume and amount of
> data.

I don't understand this: All that's really relevant in those mails for
an initial check is the topmost section, "Tests which did not succeed
and are blocking". Everything beyond that requires looking into one or
more of the logs and auxiliary files linked to at the very top of those
mails.

> Jenkins uses weather icons to attempt to convey whether a test is
> trending worse or better, or is consistently successful or broken. If it
> fails, but not every time, and the number of failures is increasing over
> time, then it gets storm clouds. If the number of failures is
> decreasing, there's a little bit of sun peeking out.
> Just some kind of dashboard which would tell me what would provide the
> most value to drill into would likely go a long way. But again, this is
> just an assumption and could be a time waste.
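
For illustration, such a "weather" indicator boils down to comparing the
failure rate of the most recent runs against the preceding window. The
sketch below is hypothetical — the function name, thresholds, and labels
are made up for this example, not taken from Jenkins or osstest:

```python
# Hypothetical sketch of a Jenkins-style "weather" indicator for one test:
# compare the failure rate of the most recent runs against the earlier
# history and map the trend to a label. All names/thresholds are
# illustrative assumptions, not real Jenkins or osstest behaviour.

def weather(results, window=5):
    """results: list of booleans (True = pass), oldest first."""
    recent = results[-window:]
    earlier = results[:-window] or recent  # fall back if history is short

    def fail_rate(runs):
        return sum(not r for r in runs) / len(runs)

    recent_rate = fail_rate(recent)
    earlier_rate = fail_rate(earlier)

    if recent_rate == 0:
        return "sunny"       # all recent runs passed
    if recent_rate > earlier_rate:
        return "storm"       # failures increasing over time
    if recent_rate < earlier_rate:
        return "clearing"    # failures decreasing, sun peeking out
    return "overcast"        # flaky, but no clear trend
```

A dashboard row per test with such a label would point at the flights
most worth drilling into, which is what the paragraph above is asking for.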

I think every test failure warrants looking into. It's just that, after
having seen a certain "uninteresting" case a number of times, I for
instance draw conclusions from that on later flight reports. Maybe I
shouldn't, but I also can't afford to spend endless hours looking at all
the details of all the flights.

