[Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub
This is using Debian Jessie, grub 2.02~beta2-22 (with Debian patches applied) and Xen 4.4.1.

I originally posted a bug report with Debian but got the suggestion to file bugs with upstream as well. Debian bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=799480

Note that my original thought was that this bug is probably within GRUB, but Ian asked me to file a bug with Xen as well, so you will have to live with the fact that the report is centered around GRUB.

Here's the information from my original bug report:

With a 64-bit dom0 and a 32-bit domU, PV (para-virtualized) grub sometimes fails when chainloading the domU's grub. A 64-bit domU seems to work 100% of the time.

My understanding of the process:

* dom0 launches the domU with a grub that is loaded from dom0's disk.
* That grub reads its config file from memdisk, and then looks for a grub binary in the domU filesystem.
* If grub is found in the domU, it chainloads (multiboot) that grub binary, and the domU grub reads grub.cfg and continues booting.
* If grub is not found in the domU, the first grub reads grub.cfg itself and continues with the boot.

It fails at step 3 in my list of the boot process, but sometimes it does work, so it may be something like a race condition that causes the problem.

A workaround is to not install, or to rename, /boot/xen in the domU so that the first grub (the one loaded from dom0's disk) does not find the grub binary in the domU filesystem and hence goes on to read grub.cfg and boot. The drawback is of course that the two grub versions can't differ too much, as one setup creates grub.cfg and a different one reads and parses it at boot time.

I am not sure at this point whether this is a problem in Xen or in grub, but I compiled the legacy pvgrub, which uses some minios from Xen (I don't really know much more about it), and when that legacy pvgrub chainloads the domU grub it seems to work 100% of the time. The legacy pvgrub is not a real alternative, though, as it's not packaged for Debian.
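To make the boot flow in the list above concrete, here is a minimal Python sketch of the decision as I understand it. It is only a model of the description, not grub's actual logic: the function name `plan_boot`, the step strings, and the file name `/boot/xen/grub.elf` used in the example are hypothetical; only the `/boot/xen` directory comes from the report.

```python
def plan_boot(domu_paths, domu_grub_dir="/boot/xen"):
    """Return the boot steps the first-stage (dom0-provided) grub would take.

    domu_paths: iterable of paths present in the domU filesystem.
    domu_grub_dir: directory where the domU grub binary is expected.
    """
    steps = [
        "dom0 launches domU with grub loaded from dom0's disk",
        "first-stage grub reads its config from (memdisk)",
    ]
    # The domU grub counts as "found" if anything lives under domu_grub_dir.
    has_domu_grub = any(
        p == domu_grub_dir or p.startswith(domu_grub_dir + "/")
        for p in domu_paths
    )
    if has_domu_grub:
        # Step 3 in the list: this is where the intermittent crash happens.
        steps.append("chainload (multiboot) the domU grub binary")
        steps.append("domU grub reads grub.cfg and boots")
    else:
        # Workaround path: /boot/xen missing or renamed in the domU.
        steps.append("first-stage grub reads grub.cfg and boots")
    return steps


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    for step in plan_boot(["/boot/xen/grub.elf", "/boot/grub/grub.cfg"]):
        print(step)
```

Renaming or removing `/boot/xen` corresponds to calling `plan_boot` with no path under that directory, which takes the second branch and never reaches the chainload step.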
When it fails, "xl create vm -c" outputs this:

Parsing config from /etc/xen/vm
libxl: error: libxl_dom.c:35:libxl__domain_type: unable to get domain type for domid=16
Unable to attach console
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: console child [0] exited with error status 1

And "xl dmesg" shows errors like this:

(XEN) traps.c:2514:d15 Domain attempted WRMSR 00000000c0010201 from 0x0000000000000000 to 0x000000000000ffff.
(XEN) d16:v0: unhandled page fault (ec=0010)
(XEN) Pagetable walk from 0000000000000000:
(XEN) L4[0x000] = 0000000200256027 000000000000049c
(XEN) L3[0x000] = 0000000200255027 000000000000049d
(XEN) L2[0x000] = 0000000200251023 00000000000004a1
(XEN) L1[0x000] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S: fault at ffff82d08021feb0 compat_create_bounce_frame+0xc6/0xde
(XEN) Domain 16 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.4.1 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e019:[<0000000000000000>]
(XEN) RFLAGS: 0000000000000246 EM: 1 CONTEXT: pv guest
(XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
(XEN) rdx: 0000000000000000 rsi: 0000000000499000 rdi: 0000000000800000
(XEN) rbp: 000000000000000a rsp: 00000000005a5ff0 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: ffff83023e9b9000 r11: ffff83023e9b9000
(XEN) r12: 0000033f3d335bfb r13: ffff82d080300800 r14: ffff82d0802ea940
(XEN) r15: ffff83005e819000 cr0: 000000008005003b cr4: 00000000000506f0
(XEN) cr3: 0000000200b7a000 cr2: 0000000000000000
(XEN) ds: e021 es: e021 fs: e021 gs: e021 ss: e021 cs: e019
(XEN) Guest stack trace from esp=005a5ff0:
(XEN) 00000010 00000000 0001e019 00010046 0016b38b 0016b38a 0016b389 0016b388
(XEN) 0016b387 0016b386 0016b385 0016b384 0016b383 0016b382 0016b381 0016b380
(XEN) 0016b37f 0016b37e 0016b37d 0016b37c 0016b37b 0016b37a 0016b379 0016b378
(XEN) 0016b377 0016b376 0016b375 0016b374 0016b373 0016b372 0016b371 0016b370
(XEN) 0016b36f 0016b36e 0016b36d 0016b36c 0016b36b 0016b36a 0016b369 0016b368
(XEN) 0016b367 0016b366 0016b365 0016b364 0016b363 0016b362 0016b361 0016b360
(XEN) 0016b35f 0016b35e 0016b35d 0016b35c 0016b35b 0016b35a 0016b359 0016b358
(XEN) 0016b357 0016b356 0016b355 0016b354 0016b353 0016b352 0016b351 0016b350
(XEN) 0016b34f 0016b34e 0016b34d 0016b34c 0016b34b 0016b34a 0016b349 0016b348
(XEN) 0016b347 0016b346 0016b345 0016b344 0016b343 0016b342 0016b341 0016b340
(XEN) 0016b33f 0016b33e 0016b33d 0016b33c 0016b33b 0016b33a 0016b339 0016b338
(XEN) 0016b337 0016b336 0016b335 0016b334 0016b333 0016b332 0016b331 0016b330
(XEN) 0016b32f 0016b32e 0016b32d 0016b32c 0016b32b 0016b32a 0016b329 0016b328
(XEN) 0016b327 0016b326 0016b325 0016b324 0016b323 0016b322 0016b321 0016b320
(XEN) 0016b31f 0016b31e 0016b31d 0016b31c 0016b31b 0016b31a 0016b319 0016b318
(XEN) 0016b317 0016b316 0016b315 0016b314 0016b313 0016b312 0016b311 0016b310
(XEN) 0016b30f 0016b30e 0016b30d 0016b30c 0016b30b 0016b30a 0016b309 0016b308
(XEN) 0016b307 0016b306 0016b305 0016b304 0016b303 0016b302 0016b301 0016b300
(XEN) 0016b2ff 0016b2fe 0016b2fd 0016b2fc 0016b2fb 0016b2fa 0016b2f9 0016b2f8
(XEN) 0016b2f7 0016b2f6 0016b2f5 0016b2f4 0016b2f3 0016b2f2 0016b2f1 0016b2f0

An easy way to find out which grub you are in, if the machine boots, is to hit 'c' and type 'ls'; only the grub from dom0 will know about (memdisk). So when trying to replicate the issue (and the domU actually starts) you can hit 'c', type 'ls' (check for memdisk), then type 'halt' and relaunch the domU. Usually I can't launch more than 4-5 times in a row before it fails; often it fails on my first try.

For information, I have reproduced this on two different AMD desktop processor machines; I'm not sure whether Intel would be any different.

I'm pretty sure I did tests with grub from unstable with the same result at some point, but I can test again if that is likely to work.

The package that is installed on the domU side is "grub-xen".
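For anyone reading the "xl dmesg" output above, the following Python sketch decodes the page-fault error code and pulls out the faulting RIP and cr2. It is an illustration only: the bit meanings are the architectural x86 page-fault error code bits, while the assumption that Xen prints `ec=` in hexadecimal, the regular expressions, and the names `decode_pf_error_code` and `parse_fault` are mine, not anything taken from Xen.

```python
import re

# Architectural x86 page-fault error code bits:
# (mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation on present page", "page not present"),
    (0x02, "write", "read or fetch"),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error_code(ec):
    """Return a list of human-readable flags for a page-fault error code."""
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if ec & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


def parse_fault(log):
    """Extract ec, RIP and cr2 from an 'xl dmesg' excerpt (assumed hex)."""
    patterns = {
        "ec": r"unhandled page fault \(ec=([0-9a-fA-F]+)\)",
        "rip": r"RIP:\s+[0-9a-fA-F]+:\[<([0-9a-fA-F]+)>\]",
        "cr2": r"cr2:\s+([0-9a-fA-F]+)",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, log)
        result[key] = int(match.group(1), 16) if match else None
    return result


SAMPLE = """(XEN) d16:v0: unhandled page fault (ec=0010)
(XEN) RIP: e019:[<0000000000000000>]
(XEN) cr3: 0000000200b7a000 cr2: 0000000000000000
"""

if __name__ == "__main__":
    fault = parse_fault(SAMPLE)
    print(fault)
    print(decode_pf_error_code(fault["ec"]))
```

If the hexadecimal assumption holds, ec=0010 has only the instruction-fetch bit set, and both RIP and cr2 are zero. That would be consistent with the guest jumping to address 0, but this is my reading of the log and not something the log itself states.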
I am unable to work out how to debug grub further on my own. I have printed out text from grub, so I know that it is the chainload that fails. I see no output from the domU grub (except when it works as it should, of course).

I can help with further testing if needed.

/Andreas

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel