
Re: [Xen-devel] S3 is broken again in xen-unstable



On Wed, May 1, 2013 at 7:01 AM, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote:
> Ben Guthro writes ("Re: [Xen-devel] S3 is broken again in xen-unstable"):
> ...
>> That said, when things go wrong, the machine does need to be power
>> cycled...so if you are not physically located near the machine under
>> test, you would need a PDU as a recovery mechanism, I suppose.
>
> Ah this makes matters a bit more complicated.  The code which
> implements the test schedule would need to know to power cycle the
> host after a failure.  Could we be confident that after a failed test
> of this kind we wouldn't see filesystem corruption ?

If you are using a journaled filesystem, I think the confidence level
is raised...but there are no guarantees when you just yank the power
cord.
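
To make the recovery step concrete, here is a minimal sketch of a test
wrapper that power cycles the host after a failed S3 attempt and then
checks the filesystem. Every name in it (S3Result, run_s3_test, and the
three callbacks) is hypothetical and not taken from any existing test
harness; how the PDU is driven and how the filesystem is checked are
left to the caller.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class S3Result:
    passed: bool        # did the host come back from S3?
    power_cycled: bool  # did we have to pull power to recover?
    fs_clean: bool      # filesystem state after the run (see below)


def run_s3_test(suspend_resume: Callable[[], bool],
                power_cycle: Callable[[], None],
                fs_check: Callable[[], bool]) -> S3Result:
    """Run one suspend/resume cycle; on failure, power cycle and check the fs.

    All three callbacks are assumptions supplied by the caller:
      suspend_resume -- returns True if the host resumed, False otherwise
      power_cycle    -- hard power cycle, e.g. via a PDU
      fs_check       -- returns True if the filesystem looks sane afterwards
    """
    try:
        ok = suspend_resume()
    except Exception:
        # Treat any error while suspending/resuming as a failed test.
        ok = False

    if ok:
        # No hard power-off happened, so the filesystem is not re-checked
        # and is reported as clean.
        return S3Result(passed=True, power_cycled=False, fs_clean=True)

    # The host is presumed hung: recover it, then see what the yank cost us.
    power_cycle()
    return S3Result(passed=False, power_cycled=True, fs_clean=fs_check())
```

A scheduler could then refuse to run further tests on a host whose
result has fs_clean set to False, until it has been reinstalled.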

>
> Also, looking at your test script, you seem to be testing using dom0
> only.  We're ignoring guests then.  Perhaps this should be a separate
> test column.  (That might be a way to fudge the recovery question
> too.)

I'm going for baby steps here.
The vast majority of the S3 failures we have encountered have been
dom0 related, so I thought that would be a decent starting place.

>
>> >       * How hardware specific are the s3 failures -- we obviously can't
>> >         have one of every laptop ever ;-)
>>
>> Clearly. I'm just looking to get a foot in the door here, so there is
>> a chance of catching gross regressions.
>> The hardware differences seem to be more timing related, due to
>> speed... ie, you are likely to uncover new failures when new, faster
>> hardware comes out for laptops.
>> Since typically server hardware is faster than laptop hardware, that
>> would theoretically catch problems at a higher frequency.
>
> If the hardware/BIOS is likely to be buggy, that's a bit of a pain.
> We'd have to at least figure out which machines worked and flag them
> so that the test was only run on those.

I think testing a known good configuration for regression seems
appropriate, yes.
They all *should* work...but I'm just being conservative here.

>
>> > Once we have a test case in the standard flights then we can consider
>> > the options around new flights testing other trees.
>>
>> I'm not sure I understand this point.
>> Are you saying you want to see a test that fails in the standard test
>> flight first...because without Konrad's patches, it will be guaranteed
>> not to work.
>
> As Ian says, there is no problem with deploying the test first and
> fixing the actual code later...
>
> Ian.
