[Xen-devel] possible I/O emulation state machine issue
Paul,

our PV driver person has found a reproducible crash with ws2k8, triggered by one of the WHQL tests. The guest gets crashed because the re-issue check of an ioreq close to the top of hvmemul_do_io() fails. I've handed him a first debugging patch, the output of which suggests that we're dealing with a completely new request, which in turn would mean that we've run into stale STATE_IORESP_READY state:

(XEN) d2v3: t=0/1 a=3c4/fed000f0 s=2/4 c=1/1 d=0/1 f=0/0 p=0/0 v=100/ffff831873f27a30
(XEN) ----[ Xen-4.10.0_15-0 x86_64 debug=n Tainted: C ]----
(XEN) CPU: 39
(XEN) RIP: e008:[<ffff82d0802d4b91>] emulate.c#hvmemul_do_io+0x1b1/0x640
(XEN) RFLAGS: 0000000000010292 CONTEXT: hypervisor (d2v3)
(XEN) rax: ffff8308797d802c rbx: 0000000000000004 rcx: 0000000000000000
(XEN) rdx: ffff831873f27fff rsi: 000000000000000a rdi: ffff82d0804433b8
(XEN) rbp: ffff830007d28000 rsp: ffff831873f27728 r8: 0000000000000027
(XEN) r9: 0000000000100000 r10: 0000000000000400 r11: ffff82d08035bd40
(XEN) r12: 0000000000000001 r13: 0000000000000000 r14: 0000000000000001
(XEN) r15: ffff831873f278e0 cr0: 0000000080050033 cr4: 00000000000026e0
(XEN) cr3: 0000003794f02000 cr2: fffffa6000fae10e
(XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 000007fffffdd000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen code around <ffff82d0802d4b91> (emulate.c#hvmemul_do_io+0x1b1/0x640):
(XEN) 54 24 70 e8 cf 87 f7 ff <0f> 0b 48 8d 3d 16 b6 0b 00 48 8d 35 88 f8 0c 00
(XEN) Xen stack trace from rsp=ffff831873f27728:
(XEN) 0000000000000002 0000000000000004 0000000000000001 0000000000000001
(XEN) 0000000000000000 0000000000000001 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000100 ffff831873f27a30
(XEN) ffff83283fe74010 ffff83284ad22000 0000000000000000 0000000100000000
(XEN) ffff831873f277d8 ffff831873f277e0 00000000000003c4 0000000000000100
(XEN) 0000000200000001 0000000000000000 ffff8317f8e5b000 0000000000000004
(XEN) 0000000000000001 ffff831873f27a30 ffff831873f27a30 00000000fed000f0
(XEN) ffff830007d289c8 ffff82d0802d578e 0000000000000000 ffff831873f27a30
(XEN) 0000000000000000 0000000000000004 0000000000000004 0000000000000000
(XEN) ffff831873f27a30 ffff82d0802d64dd ffff831873f27a30 ffff831873f27d10
(XEN) 00000000fed000f0 ffff831873f27a30 0100000000000003 0000000000000000
(XEN) ffff831873f278e0 ffffffffffd070f0 0000000400000004 0000000000000004
(XEN) 0000000100000000 ffff831873f27c78 ffff831873f278d8 ffff831873f278d0
(XEN) ffff831873f27938 00000000fed000f0 0000000000000001 0000000000000001
(XEN) ffff82d080350ecb 0000000000000004 0000000000000001 ffff831873f27c78
(XEN) ffff831873f27a30 0000000000000002 ffff830007d28000 ffff82d0802d69f1
(XEN) 0000000000000001 ffff82d0802a313d ffffffffffd070f0 0000000000000001
(XEN) 0000000000000000 00000000000000f0 ffff82d080350ecb ffff831873f27aa0
(XEN) 0000000000000000 ffff831873f27c78 ffff831873f27a28 ffff830007d28a60
(XEN) ffff82d0803a7620 ffff82d0802a4aad ffff831873f279c8 ffff831873f27ac0
(XEN) Xen call trace:
(XEN) [<ffff82d0802d4b91>] emulate.c#hvmemul_do_io+0x1b1/0x640
(XEN) [<ffff82d0802d578e>] emulate.c#hvmemul_do_io_buffer+0x2e/0x70
(XEN) [<ffff82d0802d64dd>] emulate.c#hvmemul_linear_mmio_access+0x24d/0x540
(XEN) [<ffff82d080350ecb>] common_interrupt+0x9b/0x120
(XEN) [<ffff82d0802d69f1>] emulate.c#__hvmemul_read+0x221/0x230
(XEN) [<ffff82d0802a313d>] x86_emulate.c#x86_decode+0xe2d/0x1e50
(XEN) [<ffff82d080350ecb>] common_interrupt+0x9b/0x120
(XEN) [<ffff82d0802a4aad>] x86_emulate+0x94d/0x19150
(XEN) [<ffff82d08030ebd1>] __get_gfn_type_access+0x101/0x290
(XEN) [<ffff82d0802d7c0a>] emulate.c#_hvm_emulate_one+0x4a/0x1e0
(XEN) [<ffff82d0803006e0>] vmx.c#vmx_get_interrupt_shadow+0/0x10
(XEN) [<ffff82d0802d7a2e>] hvm_emulate_init_once+0x7e/0xb0
(XEN) [<ffff82d0802e394b>] hvm_emulate_one_insn+0x3b/0x120
(XEN) [<ffff82d0802bd3a0>] x86_insn_is_mem_access+0/0xc0
(XEN) [<ffff82d0802dc5b8>] hvm_hap_nested_page_fault+0x138/0x710
(XEN) [<ffff82d08023bdc0>] timer.c#add_entry+0x50/0xc0
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030517e>] vmx_vmexit_handler+0x8ae/0x1960
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN) [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN) [<ffff82d08030b5e2>] vmx_asm_vmexit_handler+0xe2/0x240
(XEN)
(XEN) domain_crash called from emulate.c:171
(XEN) Domain 2 (vcpu#3) crashed on cpu#39:
(XEN) ----[ Xen-4.10.0_15-0 x86_64 debug=n Tainted: C ]----
(XEN) CPU: 39
(XEN) RIP: 0010:[<fffff8000162411e>]
(XEN) RFLAGS: 0000000000010286 CONTEXT: hvm guest (d2v3)
(XEN) rax: ffffffffffd07000 rbx: 0000000000000003 rcx: 0000000a00005036
(XEN) rdx: 0000000002549700 rsi: fffffa80044b8990 rdi: 00000001adfbbe88
(XEN) rbp: fffffa6001145128 rsp: fffffa60019ffb58 r8: 00000000b57e152b
(XEN) r9: 0000000001d3c1ec r10: fffff6fb7e980038 r11: 0000000000000003
(XEN) r12: fffffa80044b8990 r13: 0000000000000004 r14: 0000000001d3c1ec
(XEN) r15: fffffa60019dbc00 cr0: 0000000080050031 cr4: 00000000000006f8
(XEN) cr3: 0000000000124000 cr2: fffffa6000fae10e
(XEN) fsb: 00000000fffdf000 gsb: fffffa60019d8000 gss: 000007fffffae000
(XEN) ds: 002b es: 002b fs: 0053 gs: 002b ss: 0018 cs: 0010

The elements in the first line are the recorded / actual values for each of the elements the if() checks, in that same order (patch below for reference). The stack trace also suggests to me that we're not in the context of a re-issue (which iirc would always originate from hvm_do_resume()).

I'd appreciate any thoughts on the matter,
Jan

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -164,7 +164,12 @@ static int hvmemul_do_io(
              (p.dir != dir) ||
              (p.df != df) ||
              (p.data_is_ptr != data_is_addr) )
+{//temp
+ printk("%pv: t=%d/%d a=%lx/%lx s=%x/%x c=%x/%lx d=%d/%d f=%d/%d p=%d/%d v=%lx/%lx\n", curr,
+        p.type, is_mmio, p.addr, addr, p.size, size, p.count, *reps, p.dir, dir, p.df, df, p.data_is_ptr, data_is_addr, p.data, data);
+ dump_execution_state();
             domain_crash(currd);
+}

         if ( data_is_addr )
             return X86EMUL_UNHANDLEABLE;

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
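
For anyone wanting to read the first log line mechanically, here is a minimal, stand-alone sketch (user-space C, not Xen code) that splits the "key=recorded/actual" pairs printed by the debugging patch and reports which keys differ. The names `struct ioreq_pair`, `parse_pairs()` and `mismatched_keys()` are made up for illustration. The sketch assumes the caller passes the text starting at "t=", i.e. with the "(XEN) d2v3: " prefix already stripped. It treats both numbers of a pair as directly comparable, which is a simplification for "t" (the real check maps is_mmio to an IOREQ_TYPE_* constant first), and it skips "v" because the data value is not part of the if().

```c
#include <stdio.h>
#include <stdlib.h>

/* One "recorded/actual" pair from the debug line, kept as hex text. */
struct ioreq_pair {
    char key;           /* t, a, s, c, d, f, p or v */
    char recorded[32];  /* value found in the saved request */
    char actual[32];    /* value passed in by the current caller */
};

/*
 * Parse space separated "k=rec/act" tokens from line.
 * Returns the number of pairs stored in out[] (at most max).
 */
static int parse_pairs(const char *line, struct ioreq_pair *out, int max)
{
    int n = 0;

    while ( n < max )
    {
        int used = 0;

        /* %c does not skip blanks, so do that by hand. */
        while ( *line == ' ' )
            line++;

        if ( sscanf(line, "%c=%31[0-9a-fA-F]/%31[0-9a-fA-F]%n",
                    &out[n].key, out[n].recorded, out[n].actual,
                    &used) != 3 )
            break;

        line += used;
        n++;
    }

    return n;
}

/*
 * Collect the keys whose two values differ into keys[] (NUL terminated,
 * needs room for n + 1 chars). 'v' is skipped, as the data value is
 * only printed for information. Returns the number of differing keys.
 */
static int mismatched_keys(const struct ioreq_pair *pairs, int n, char *keys)
{
    int i, m = 0;

    for ( i = 0; i < n; i++ )
    {
        if ( pairs[i].key == 'v' )
            continue;

        if ( strtoull(pairs[i].recorded, NULL, 16) !=
             strtoull(pairs[i].actual, NULL, 16) )
            keys[m++] = pairs[i].key;
    }
    keys[m] = '\0';

    return m;
}
```

Fed the line from the log above, this yields the keys t, a, s and d as differing (type, address, size and direction), with c, f and p matching, which fits the reading that the saved request and the current one are unrelated.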