Re: dom0 PV looping on search_pre_exception_table()
On 09/12/2020 13:59, Manuel Bouyer wrote:
> On Wed, Dec 09, 2020 at 01:28:54PM +0000, Andrew Cooper wrote:
>> Pagefaults on IRET come either from stack accesses for operands (not the
>> case here as Xen is otherwise working fine), or from segment selector
>> loads for %cs and %ss.
>>
>> In this example, %ss is in the LDT, which specifically does use
>> pagefaults to promote the frame to PGT_segdesc.
>>
>> I suspect that what is happening is that handle_ldt_mapping_fault() is
>> failing to promote the page (for some reason), and we're taking the "In
>> hypervisor mode? Leave it to the #PF handler to fix up." path due to the
>> confusion in context, and Xen's #PF handler is concluding "nothing else
>> to do".
>>
>> The older behaviour of escalating to the failsafe callback would have
>> broken this cycle by rewriting %ss and re-entering the kernel.
>>
>>
>> Please try the attached debugging patch, which is an extension of what I
>> gave you yesterday.  First, it ought to print %cr2, which I expect will
>> point to Xen's virtual mapping of the vcpu's LDT.  The logic ought to
>> loop a few times so we can inspect the hypervisor codepaths which are
>> effectively livelocked in this state, and I've also instrumented
>> check_descriptor() failures because I've got a gut feeling that is the
>> root cause of the problem.
> here's the output:
> (XEN) IRET fault: #PF[0000]
> [23/1999]
> (XEN) %cr2 ffff820000010040
>
> (XEN) IRET fault: #PF[0000]
>
> (XEN) %cr2 ffff820000010040
> (XEN) IRET fault: #PF[0000]
> (XEN) %cr2 ffff820000010040
> (XEN) IRET fault: #PF[0000]
> (XEN) %cr2 ffff820000010040
> (XEN) domain_crash called from extable.c:216
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-4.15-unstable  x86_64  debug=y   Tainted:   C   ]----
> (XEN) CPU:    0
> (XEN) RIP:    0047:[<00007f7ff60007d0>]
> (XEN) RFLAGS: 0000000000000202   EM: 0   CONTEXT: pv guest (d0v0)
> (XEN) rax: ffff82d04038c309   rbx: 0000000000000000   rcx: 000000000000e008
> (XEN) rdx: 0000000000010086   rsi: ffff83007fcb7f78   rdi: 000000000000e010
> (XEN) rbp: 0000000000000000   rsp: 00007f7fff4876c0   r8:  0000000e00000000
> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
> (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
> (XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 0000000000002660
> (XEN) cr3: 0000000079cdb000   cr2: ffffa1000000a040
> (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: ffffffff80cf2dc0
> (XEN) ds: 0023   es: 0023   fs: 0000   gs: 0000   ss: 003f   cs: 0047
> (XEN) Guest stack trace from rsp=00007f7fff4876c0:
> (XEN)    0000000000000001 00007f7fff487bd8 0000000000000000 0000000000000000
> (XEN)    0000000000000003 00000000aee00040 0000000000000004 0000000000000038
> (XEN)    0000000000000005 0000000000000008 0000000000000006 0000000000001000
> (XEN)    0000000000000007 00007f7ff6000000 0000000000000008 0000000000000000
> (XEN)    0000000000000009 00000000aee01cd0 00000000000007d0 0000000000000000
> (XEN)    00000000000007d1 0000000000000000 00000000000007d2 0000000000000000
> (XEN)    00000000000007d3 0000000000000000 000000000000000d 00007f7fff488000
> (XEN)    00000000000007de 00007f7fff4877c0 0000000000000000 0000000000000000
> (XEN)    6e692f6e6962732f 0000000000007469 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.

Huh, so it is the LDT, but we're not getting as far as inspecting the
target frame.

I wonder if the LDT is set up correctly.  How about this incremental delta?

~Andrew

diff --git a/xen/arch/x86/extable.c b/xen/arch/x86/extable.c
index 88b05bef38..be59a3e216 100644
--- a/xen/arch/x86/extable.c
+++ b/xen/arch/x86/extable.c
@@ -203,13 +203,16 @@ search_pre_exception_table(struct cpu_user_regs *regs)
         __start___pre_ex_table, __stop___pre_ex_table-1, addr);
     if ( fixup )
     {
+        struct vcpu *curr = current;
         static int count;
 
         printk(XENLOG_ERR "IRET fault: %s[%04x]\n",
                vec_name(regs->entry_vector), regs->error_code);
 
         if ( regs->entry_vector == X86_EXC_PF )
-            printk(XENLOG_ERR "%%cr2 %016lx\n", read_cr2());
+            printk(XENLOG_ERR "%%cr2 %016lx, LDT base %016lx, limit %04x\n",
+                   read_cr2(), curr->arch.pv.ldt_base,
+                   (curr->arch.pv.ldt_ents << 3) | 7);
 
         if ( count++ > 2 )
         {
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 1059f3ce66..3ac07a84c3 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1233,6 +1233,8 @@ static int handle_ldt_mapping_fault(unsigned int offset,
     }
     else
     {
+        printk(XENLOG_ERR "*** pv_map_ldt_shadow_page(%#x) failed\n", offset);
+
         /* In hypervisor mode? Leave it to the #PF handler to fix up. */
         if ( !guest_mode(regs) )
             return 0;
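
For anyone reading the register dump against the %cr2 value, the selector arithmetic can be sketched in a few lines of C. This is only an illustration, not code from Xen: struct selector_fields, decode_selector() and descriptor_offset() are hypothetical names, and the only thing relied upon is the architectural x86 selector layout (index in bits 15..3, table indicator in bit 2, RPL in bits 1..0) with 8-byte descriptors.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an x86 segment selector (architectural layout).
 * Hypothetical helper type for illustration; not a Xen structure. */
struct selector_fields {
    unsigned int index;   /* descriptor index, selector bits 15..3 */
    bool in_ldt;          /* table indicator, bit 2: 1 = LDT, 0 = GDT */
    unsigned int rpl;     /* requested privilege level, bits 1..0 */
};

/* Split a 16-bit selector into its three architectural fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f = {
        .index  = sel >> 3,
        .in_ldt = (sel >> 2) & 1,
        .rpl    = sel & 3,
    };
    return f;
}

/* Byte offset of the selector's 8-byte descriptor within its table. */
static unsigned int descriptor_offset(uint16_t sel)
{
    return (unsigned int)(sel >> 3) << 3;
}
```

Applied to the dump above, ss=003f and cs=0047 both have the table-indicator bit set (so both live in the LDT) with RPL 3, at descriptor offsets 0x38 and 0x40 respectively. If the low bits of %cr2 (ffff820000010040) are taken as the offset into the LDT mapping, which is an assumption on my part, then 0x40 would line up with the %cs descriptor rather than %ss.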