
Re: [Xen-devel] pv 2.6.31 (kernel.org) and save/migrate fails, domU BUG



Hello,

Jeremy: Here's a summary of these save/restore problems
using an upstream Linux 2.6.31.5 PV guest.

For me:
        - I can "xm save" + "xm restore" a UP guest, but I get a non-fatal
          BUG in the guest kernel, see [1].
        - "xm save" fails for an SMP guest with "failed to get the suspend
          evtchn port", see [2].

For Dan:
        - "xm save" works for a UP guest, but "xm restore" doesn't: the guest
          kernel prints an endless stream of xen_sched_clock warnings, see [3].
        - "xm save" for an SMP guest fails: it never completes. I suspect this
          is the same problem I'm seeing.


[1] non-fatal BUG in the guest kernel after "xm restore":
http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86_64-saverestore.txt

[2] "xm log" contains:
[2009-11-09 23:44:38 1353] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib64/xen/bin/xc_save 28 2 0 0 0
[2009-11-09 23:44:38 1353] INFO (XendCheckpoint:417) xc_save: failed to get the suspend evtchn port
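
To pull the xc_save entries out of a longer "xm log" dump, something like the
following Python sketch might help. It is only an illustration: the regular
expression is inferred from the two log lines quoted here, the function name
xc_save_messages is made up for this example, and it assumes each log entry
sits on a single line, so any wrapping introduced by a mail client would need
to be undone first.

```python
import re

# Layout inferred from the quoted xend log lines (an assumption, not a spec):
# [YYYY-MM-DD HH:MM:SS pid] LEVEL (Module:line) message
LOG_LINE = re.compile(
    r"^\[(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<pid>\d+)\] "
    r"(?P<level>[A-Z]+) \((?P<module>\w+):(?P<lineno>\d+)\) (?P<msg>.*)$"
)


def xc_save_messages(lines):
    """Return (timestamp, level, message) for log entries mentioning xc_save."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.rstrip("\n"))
        # Skip lines that do not follow the assumed layout (e.g. tracebacks).
        if m and "xc_save" in m.group("msg"):
            hits.append((m.group("date") + " " + m.group("time"),
                         m.group("level"),
                         m.group("msg")))
    return hits


# The two entries from this mail, unwrapped to one line each.
sample = [
    "[2009-11-09 23:44:38 1353] DEBUG (XendCheckpoint:110) [xc_save]: "
    "/usr/lib64/xen/bin/xc_save 28 2 0 0 0",
    "[2009-11-09 23:44:38 1353] INFO (XendCheckpoint:417) xc_save: "
    "failed to get the suspend evtchn port",
]

for stamp, level, msg in xc_save_messages(sample):
    print(stamp, level, msg)
```

Feeding it the full log (for example, the lines read from a file holding the
saved "xm log" output) instead of the sample list would show every xc_save
entry in order, which makes it easier to compare the UP and SMP attempts.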

[3] See the attachment in this email:
http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00391.html


Any tips on how to debug these?

-- Pasi


On Sun, Nov 08, 2009 at 07:27:47PM +0200, Pasi Kärkkäinen wrote:
> On Sun, Nov 08, 2009 at 08:54:23AM -0800, Dan Magenheimer wrote:
> > > > Ok, so it appears there is something problematic with
> > > > saving an upstream kernel.  It might be (partially) fixed
> > > > in Fedora 12 or maybe there is some other environmental
> > > > difference which makes save fail entirely on my system.
> > > > 
> > > 
> > > Yeah, fedora kernel has some patches, but it should be pretty 
> > > close to upstream kernel..
> > > 
> > > btw was your guest UP or SMP? Mine was UP..
> > 
> > Mine was SMP... switching to UP I can now save.  BUT...
> > restore doesn't seem to quite work.  The restore completes
> > but I get no response from the VNC console.  When I
> > use a tty console, after restore, I am getting
> > an infinite dump of
> > 
> > WARNING: at arch/x86/time.c:180 xen_sched_clock+0x2b
> > 
> > (see attached).
> > 
> > Did you try restore on Fedora 12?
> >  
> 
> Yeah. save+restore for UP F12 guest works for me 
> (except I get that non-fatal BUG on the guest).
> 
> SMP guest doesn't work.. save crashes it.
> 
> > > > > > The results explain why I can get it to run on
> > > > > > one machine (an older laptop) and not run on another
> > > > > > machine (a Nehalem system)... looks like this is maybe
> > > > > > related to the cpuid-extended-topology-leaf bug that Jeremy
> > > > > > sent a fix for upstream recently.
> > > > > 
> > > > > Did you try with that patch applied? 
> > > > 
> > > > No, the patch wasn't posted, just a pull request to Linus,
> > > > so I don't have the patch (and am not a git expert so
> > > > am not sure how to get it).
> > > > 
> > > > http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00182.html
> > > >
> > > > So I'll try it again when .6 or .7 is available.
> > > 
> > > See here for changelog:
> > > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=bugfix
> > > 
> > > You can get the diffs/patches from there using the links..
> > 
> > Thanks.  Yes, Jeremy's patch allows 2.6.31.5 (in a PV domain)
> > to completely boot on my Nehalem box.
> 
> Ok. But I guess that doesn't help with the save+restore problem..
> 
> -- Pasi
> 

