
Re: [Xen-devel] Prepping a bugfix push



On 12/04/09 07:50, Ian Campbell wrote:
On Fri, 2009-12-04 at 07:46 +0000, Ian Campbell wrote:
I've been doing regular suspend/resumes, not checkpoint ones as Brendan
is doing. I did try a couple of checkpointed ones yesterday and they
failed, IIRC with a similar softlockup to this one.
So what is happening is that the device event channels are getting torn
down by the resume handler and never completely reinstated in the
cancelled suspend (aka checkpoint) case.

Hm.

In 2.6.18 there was a separate ->suspend_cancel() callback for each
driver, called instead of the ->resume() callback in exactly these
circumstances. The cancel callback doesn't do any of the teardown, in
fact for blkfront it doesn't even exist.
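
To make the distinction concrete, here is a minimal userspace sketch of the dispatch described above: a full resume tears down and rebuilds the connection, while a cancelled suspend calls an optional cancel hook and otherwise leaves the device alone. Every name here (toy_dev, toy_driver, toy_resume, toy_after_suspend) is hypothetical and invented for illustration; this is not the 2.6.18 xenbus code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a frontend device; not the real xenbus types. */
struct toy_dev {
    int evtchn_bound;  /* 1 while the event channel is connected */
    int reconnects;    /* how many times the channel has been rebuilt */
};

/* Hypothetical per-driver callbacks, modelled on the description above. */
struct toy_driver {
    void (*resume)(struct toy_dev *dev);
    void (*suspend_cancel)(struct toy_dev *dev);  /* may be NULL */
};

/* Full resume: tear the event channel down, then reconnect it. */
static void toy_resume(struct toy_dev *dev)
{
    dev->evtchn_bound = 0;  /* teardown */
    dev->evtchn_bound = 1;  /* reconnect to the backend */
    dev->reconnects++;
}

/*
 * Called once the suspend hypercall has returned. 'cancelled' is nonzero
 * when the domain kept running in place (the checkpoint case).
 */
static void toy_after_suspend(const struct toy_driver *drv,
                              struct toy_dev *dev, int cancelled)
{
    assert(drv != NULL && dev != NULL);

    if (cancelled) {
        /* No teardown here; a driver without a cancel hook is left alone. */
        if (drv->suspend_cancel)
            drv->suspend_cancel(dev);
    } else if (drv->resume) {
        drv->resume(dev);
    }
}
```

With a driver that provides no cancel hook, a cancelled suspend leaves evtchn_bound and reconnects untouched, which is the property the checkpoint case depends on.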

(As a proof of concept, commenting out the entire contents of
blkfront_resume and netfront_resume makes checkpointing work OK for me,
at the cost of breaking regular resume, of course)

pv-ops uses the generic power management infrastructure which does not
have a concept of cancelling a suspend. Perhaps it should? Otherwise a
different solution will be required; I'm not sure what that might be
yet.

Well, the obvious one is to treat it as a full suspend followed by immediate resume. That is, just remove all the special case handling for checkpoint, and let it do the normal resume stuff when the hypercall returns.

I think the PM core can fail to suspend; it just resumes anything that has been suspended so far.
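
As a rough illustration of that unwinding behaviour, here is a toy userspace model: devices are suspended in order, and if one fails, those already suspended are resumed in reverse order. All names (toy_pm_dev, toy_suspend_all, and so on) are made up for this sketch, and it assumes the rollback works as described above; it is not the kernel's PM core.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device record for the sketch. */
struct toy_pm_dev {
    int fail_suspend;  /* set to make this device's suspend fail */
    int suspended;     /* 1 while the device is suspended */
};

static int toy_suspend_one(struct toy_pm_dev *d)
{
    if (d->fail_suspend)
        return -1;
    d->suspended = 1;
    return 0;
}

static void toy_resume_one(struct toy_pm_dev *d)
{
    d->suspended = 0;
}

/*
 * Suspend devices in order. On the first failure, resume everything that
 * was already suspended, in reverse order, and report the failure.
 * Returns 0 if every device suspended, -1 if the attempt was rolled back.
 */
static int toy_suspend_all(struct toy_pm_dev *devs, size_t n)
{
    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (toy_suspend_one(&devs[i]) != 0) {
            while (i > 0)
                toy_resume_one(&devs[--i]);
            return -1;
        }
    }
    return 0;
}
```

In this model a cancelled suspend would simply be a suspend that completes and is then followed by the ordinary resume path, with no special case for checkpointing.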

    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel