
Re: [Xen-devel] pv 2.6.31 (kernel.org) and save/migrate, domU BUG



On Sun, Nov 08, 2009 at 04:17:43PM +0200, Pasi Kärkkäinen wrote:
> On Sat, Nov 07, 2009 at 07:32:49AM -0800, Dan Magenheimer wrote:
> > > > Well, first, I got 2.6.31.5 to boot in a PV guest in another
> > > > machine and it fails to save also.  Are you able to save
> > > > 2.6.31{,.5} successfully?  On latest xen-unstable?
> > > > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > > > know if that is important.)
> > > 
> > > I'll have to try it later today..
> > 
> > Let me know.
> > 
> 
> Ok. I just tried with a Fedora 12 (rawhide) PV guest. I was able to 
> "xm save" and "xm restore" it without problems. 
> 
> But I noticed there was a BUG printed on the guest console:
> http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86_64-saverestore.txt
> 
> BUG: sleeping function called from invalid context at kernel/mutex.c:94
> in_atomic(): 0, irqs_disabled(): 1, pid: 1052, name: kstop/0
> Pid: 1052, comm: kstop/0 Not tainted 2.6.31.5-122.fc12.x86_64 #1
> Call Trace:
>  [<ffffffff8104021f>] __might_sleep+0xe6/0xe8
>  [<ffffffff81419c84>] mutex_lock+0x22/0x4e
>  [<ffffffff812afdce>] dpm_resume_noirq+0x21/0x11f
>  [<ffffffff81272b05>] xen_suspend+0xca/0xd1
>  [<ffffffff8108c172>] stop_cpu+0x8c/0xd2
>  [<ffffffff8106350c>] worker_thread+0x18a/0x224
>  [<ffffffff81067ae7>] ? autoremove_wake_function+0x0/0x39
>  [<ffffffff8141ab29>] ? _spin_unlock_irqrestore+0x19/0x1b
>  [<ffffffff81063382>] ? worker_thread+0x0/0x224
>  [<ffffffff81067765>] kthread+0x91/0x99
>  [<ffffffff81012daa>] child_rip+0xa/0x20
>  [<ffffffff81011f97>] ? int_ret_from_sys_call+0x7/0x1b
>  [<ffffffff8101271d>] ? retint_restore_args+0x5/0x6
>  [<ffffffff81012da0>] ? child_rip+0x0/0x20
> 

Oh, I forgot to mention that this BUG is non-fatal. The guest still
works after it is printed.

-- Pasi

> 
> More information about my setup:
> 
> Host/dom0: Fedora 12 (latest rawhide) with included Xen 3.4.1-5 and
> custom 2.6.31.5 x86_64 pv_ops dom0 kernel (a couple of days old).
> 
> Guest/domU: Fedora 12 (latest rawhide) with the included/default
> 2.6.31.5-122.fc12.x86_64 kernel.
> 
> > > > (On the machine I couldn't boot 2.6.31.5 as a PV guest, there
> > > > was absolutely no console output.  However, I think tools
> > > > are out-of-date on that machine so ignore that.)
> > > 
> > > Did you have "console=hvc0 earlyprintk=xen" in the domU kernel
> > > parameters?
> > 
> > No, but that didn't work either.
> > 
> 
> Ok.. then it crashes really early.
> 
> > > You might also change the Xen guest config file to set
> > > on_crash=preserve, and then, once the PV guest has crashed, run this:
> > > 
> > > /usr/lib/xen/bin/xenctx -s System.map-domUkernelversion <domid>
> > > 
> > > (if you have a 64-bit host, the xenctx binary might be under /usr/lib64/)
> > > 
> > > to get a stack trace.
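
For reference, the two debugging suggestions quoted above (the
"console=hvc0 earlyprintk=xen" kernel parameters and on_crash=preserve)
might look roughly like this in the guest config file. This is only a
minimal sketch, assuming the guest is started with xm from a config file
(those files use Python syntax). It shows just the two relevant settings;
the rest of a real config (name, kernel, memory, disk and so on) is left
out on purpose.

```python
# Sketch of the two debugging settings discussed in this thread, as they
# might appear in an xm guest config file. Not a complete guest config.

# Keep the crashed domain around instead of destroying it, so that
# xenctx can still be pointed at its domid afterwards.
on_crash = "preserve"

# Extra arguments appended to the domU kernel command line: use the Xen
# PV console and enable early printk output through Xen.
extra = "console=hvc0 earlyprintk=xen"
```

With on_crash set this way the crashed domain should still be listed by
"xm list", and its domid is what goes in place of <domid> in the xenctx
command above.
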
> > 
> > Very interesting and useful!  I was completely unaware of
> > xenctx and could have used it many times in tmem development!
> > 
> > The results explain why I can get it to run on
> > one machine (an older laptop) and not run on another
> > machine (a Nehalem system)... looks like this is maybe
> > related to the cpuid-extended-topology-leaf bug that Jeremy
> > sent a fix for upstream recently.
> > 
> 
> Did you try with that patch applied? 
> 
> -- Pasi
> 
> > cs:eip: e019:c040342d xen_cpuid+0x46 
> > flags: 00001206 i nz p
> > ss:esp: e021:c0779ee4
> > eax: 00000001       ebx: 00000002   ecx: 00000100   edx: 00000001
> > esi: c0779f1c       edi: c0779f18   ebp: c0779f24
> >  ds:     e021        es:     e021    fs:     00d8    gs:     0000
> > Code (instr addr c040342d)
> > 24 04 8b 15 a4 02 7c c0 89 54 24 08 8b 0e 0f 0b 78 65 6e 0f a2 <89> 45 00 
> > 8b 04 24 89 18 89 0e 89 
> > 
> > 
> > Stack:
> >  c0779f20 ffffffff ffffffff c07c0360 c0779f18 c0779f1c c0779f20 c066fd0f
> >  c0779f18 c0779f24 00000002 16aee301 00000001 00000001 16aee301 00000002
> >  0000000b c07c03cc c07c0360 c07c0360 c07c03d8 c0670ed8 c0779f58 00000001
> >  c07c0360 c0779f60 c066fe6a c0779f60 c0779f60 00000003 00000001 00000000
> > 
> > Call Trace:
> >   [<c040342d>] xen_cpuid+0x46  <--
> >   [<c066fd0f>] detect_extended_topology+0xae 
> >   [<c0670ed8>] init_intel+0x140 
> >   [<c066fe6a>] init_scattered_cpuid_features+0x82 
> >   [<c06705e2>] identify_cpu+0x22d 
> >   [<c040584c>] xen_force_evtchn_callback+0xc 
> >   [<c0405e78>] check_events+0x8 
> >   [<c07c9dec>] identify_boot_cpu+0xa 
> >   [<c07c9e9a>] check_bugs+0x8 
> >   [<c07c27bd>] start_kernel+0x2a0 
> >   [<c07c5206>] xen_start_kernel+0x340 
> 
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel



 

