
Re: [Xen-devel] 2.6.29-rc8 pv_ops dom0 BUG / unable to handle kernel paging request



Pasi Kärkkäinen wrote:
On Sun, Mar 22, 2009 at 07:04:23PM +0200, Pasi Kärkkäinen wrote:
On Sun, Mar 22, 2009 at 01:51:51PM +0200, Pasi Kärkkäinen wrote:
On Sat, Mar 21, 2009 at 09:28:55PM -0700, Jeremy Fitzhardinge wrote:
Pasi Kärkkäinen wrote:
On Sun, Mar 22, 2009 at 12:50:31AM +0200, Pasi Kärkkäinen wrote:
On Sat, Mar 21, 2009 at 10:16:52PM +0200, Pasi Kärkkäinen wrote:
Also, do you see this problem before you've started any other domains? Or does it only happen once you've run a domU (or only while a domU is running)?

I'm not running any other domains.. Only dom0 is running.

Steps to reproduce this BUG on my pv_ops dom0 testbox:

1) Reboot the box to pv_ops dom0 kernel
2) Login to dom0 via ssh
3) Start kernel compilation on dom0 (make bzImage && make modules)
4) Wait some minutes and pv_ops dom0 kernel BUGs

So no other domains have been or are running when this happens..

I'll try disabling CONFIG_HIGHPTE now, and see if that makes any difference.

With CONFIG_HIGHPTE=y, the pv_ops dom0 stays up for maybe 30 mins and then
BUGs (during kernel compilation):
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-22-xen331-linux-2.6.29-rc8-bug-with-highpte.txt


With CONFIG_HIGHPTE=n, I get a BUG during system startup when udev is started:
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt

Starting udev: BUG: unable to handle kernel paging request at 70007823
IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]
*pdpt = 000000005f781001
Oops: 0002 [#1] SMP

So yeah.. with CONFIG_HIGHPTE=n it seems to happen when sata_promise is loaded.. What should I try next?

Actually it's not only sata_promise. I tried 2 more times with the
CONFIG_HIGHPTE=n pv_ops dom0 kernel:

http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-2.txt
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-3.txt

BUG: unable to handle kernel paging request at a536462c
IP: [<e30f4278>] classes+0x688/0xfffffa30 [parport]
*pdpt = 000000005f759001
Oops: 0002 [#1] SMP

Hm, OK.  Something is clearly drastically amiss.  I'll try to repro.
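
For comparing the oops excerpts above across several boot logs, here is a minimal Python sketch that pulls the faulting address, instruction pointer, symbol and module name out of the two line shapes quoted in this thread. The function name parse_oops and the regular expressions are illustrative assumptions, not part of any kernel tooling, and they only cover the lines shown here.

```python
import re

# Matches: "BUG: unable to handle kernel paging request at 70007823"
FAULT_RE = re.compile(
    r"BUG: unable to handle kernel paging request at ([0-9a-f]+)"
)

# Matches: "IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]"
# The trailing "[module]" part is optional.
IP_RE = re.compile(
    r"IP: \[<([0-9a-f]+)>\] (\S+?)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?"
)


def parse_oops(text):
    """Return the fields found in an oops excerpt; missing fields are None."""
    fault = FAULT_RE.search(text)
    ip = IP_RE.search(text)
    return {
        "fault_addr": int(fault.group(1), 16) if fault else None,
        "ip": int(ip.group(1), 16) if ip else None,
        "symbol": ip.group(2) if ip else None,
        "offset": int(ip.group(3), 16) if ip else None,
        "size": int(ip.group(4), 16) if ip else None,
        "module": ip.group(5) if ip else None,
    }


# The two excerpts quoted in this thread.
sample_1 = (
    "Starting udev: BUG: unable to handle kernel paging request at 70007823\n"
    "IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]\n"
)
sample_2 = (
    "BUG: unable to handle kernel paging request at a536462c\n"
    "IP: [<e30f4278>] classes+0x688/0xfffffa30 [parport]\n"
)

for sample in (sample_1, sample_2):
    fields = parse_oops(sample)
    print(fields["module"], fields["symbol"], hex(fields["fault_addr"]))
```

Running it over each saved boot log makes it easier to see at a glance that the faulting module differs from run to run.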

Actually, it seems the CONFIG_HIGHPTE=n kernel also fails on baremetal:
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-24-baremetal-2.6.29-rc8-bug-no-highpte.txt

Starting udev: invalid opcode: 0000 [#1] SMP

Summary:
CONFIG_HIGHPTE=n: both dom0 and baremetal fail during system startup when udev 
is started
CONFIG_HIGHPTE=y: baremetal works OK, dom0 fails with BUG after around 30 mins 
of kernel compilation

Please ignore this summary; something was wrong with my kernel builds.
I'll post a new summary soon, once I've finished testing.

OK, I did fresh kernel+modules builds and re-tested everything.

New summary:
CONFIG_HIGHPTE=n: both dom0 and baremetal work OK, both survive kernel 
compilation.
CONFIG_HIGHPTE=y: baremetal works OK and survives kernel compilation, but dom0 
fails with BUG after around 20-30 mins of kernel compilation
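
Since the earlier summary had to be withdrawn because of a problem with the kernel builds, one way to double-check which variant a given build tree represents is to read the option straight out of its .config. A minimal Python sketch, assuming the usual .config syntax (OPTION=value for options that are set, "# OPTION is not set" for disabled ones); the helper name config_option_state and the sample config text are hypothetical.

```python
import re


def config_option_state(config_text, option):
    """Return the option's value (e.g. 'y' or 'm'), or None if unset or absent."""
    # Anchor on the start of a line and require '=' right after the name,
    # so commented-out "# OPTION is not set" lines and longer option names
    # that merely share a prefix do not match.
    pattern = re.compile(r"^" + re.escape(option) + r"=(.*)$", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# Made-up .config fragments for illustration only.
with_highpte = "CONFIG_HIGHMEM=y\nCONFIG_HIGHPTE=y\n"
without_highpte = "CONFIG_HIGHMEM=y\n# CONFIG_HIGHPTE is not set\n"

print(config_option_state(with_highpte, "CONFIG_HIGHPTE"))
print(config_option_state(without_highpte, "CONFIG_HIGHPTE"))
```

In practice the text would come from the .config in the build tree that produced the kernel under test, read with open(path).read().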

Thanks for getting a consistent test result; the other reports looked, frankly, scary and I wouldn't want to be on that wild goose chase.

These ones look much more tractable, though I don't really have a theory for them. I'll have a look next week sometime.

   J



 

