
[Xen-devel] possible I/O emulation state machine issue



Paul,

our PV driver person has found a reproducible crash with ws2k8,
triggered by one of the WHQL tests. The guest gets crashed because
the re-issue check of an ioreq close to the top of hvmemul_do_io()
fails. I've handed him a first debugging patch, the output of which
suggests that we're dealing with a completely new request, which
in turn would mean that we've run into stale STATE_IORESP_READY
state:

(XEN) d2v3: t=0/1 a=3c4/fed000f0 s=2/4 c=1/1 d=0/1 f=0/0 p=0/0 v=100/ffff831873f27a30
(XEN) ----[ Xen-4.10.0_15-0  x86_64  debug=n   Tainted:  C   ]----
(XEN) CPU:    39
(XEN) RIP:    e008:[<ffff82d0802d4b91>] emulate.c#hvmemul_do_io+0x1b1/0x640
(XEN) RFLAGS: 0000000000010292   CONTEXT: hypervisor (d2v3)
(XEN) rax: ffff8308797d802c   rbx: 0000000000000004   rcx: 0000000000000000
(XEN) rdx: ffff831873f27fff   rsi: 000000000000000a   rdi: ffff82d0804433b8
(XEN) rbp: ffff830007d28000   rsp: ffff831873f27728   r8:  0000000000000027
(XEN) r9:  0000000000100000   r10: 0000000000000400   r11: ffff82d08035bd40
(XEN) r12: 0000000000000001   r13: 0000000000000000   r14: 0000000000000001
(XEN) r15: ffff831873f278e0   cr0: 0000000080050033   cr4: 00000000000026e0
(XEN) cr3: 0000003794f02000   cr2: fffffa6000fae10e
(XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 000007fffffdd000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen code around <ffff82d0802d4b91> (emulate.c#hvmemul_do_io+0x1b1/0x640):
(XEN)  54 24 70 e8 cf 87 f7 ff <0f> 0b 48 8d 3d 16 b6 0b 00 48 8d 35 88 f8 0c 00
(XEN) Xen stack trace from rsp=ffff831873f27728:
(XEN)    0000000000000002 0000000000000004 0000000000000001 0000000000000001
(XEN)    0000000000000000 0000000000000001 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000100 ffff831873f27a30
(XEN)    ffff83283fe74010 ffff83284ad22000 0000000000000000 0000000100000000
(XEN)    ffff831873f277d8 ffff831873f277e0 00000000000003c4 0000000000000100
(XEN)    0000000200000001 0000000000000000 ffff8317f8e5b000 0000000000000004
(XEN)    0000000000000001 ffff831873f27a30 ffff831873f27a30 00000000fed000f0
(XEN)    ffff830007d289c8 ffff82d0802d578e 0000000000000000 ffff831873f27a30
(XEN)    0000000000000000 0000000000000004 0000000000000004 0000000000000000
(XEN)    ffff831873f27a30 ffff82d0802d64dd ffff831873f27a30 ffff831873f27d10
(XEN)    00000000fed000f0 ffff831873f27a30 0100000000000003 0000000000000000
(XEN)    ffff831873f278e0 ffffffffffd070f0 0000000400000004 0000000000000004
(XEN)    0000000100000000 ffff831873f27c78 ffff831873f278d8 ffff831873f278d0
(XEN)    ffff831873f27938 00000000fed000f0 0000000000000001 0000000000000001
(XEN)    ffff82d080350ecb 0000000000000004 0000000000000001 ffff831873f27c78
(XEN)    ffff831873f27a30 0000000000000002 ffff830007d28000 ffff82d0802d69f1
(XEN)    0000000000000001 ffff82d0802a313d ffffffffffd070f0 0000000000000001
(XEN)    0000000000000000 00000000000000f0 ffff82d080350ecb ffff831873f27aa0
(XEN)    0000000000000000 ffff831873f27c78 ffff831873f27a28 ffff830007d28a60
(XEN)    ffff82d0803a7620 ffff82d0802a4aad ffff831873f279c8 ffff831873f27ac0
(XEN) Xen call trace:
(XEN)    [<ffff82d0802d4b91>] emulate.c#hvmemul_do_io+0x1b1/0x640
(XEN)    [<ffff82d0802d578e>] emulate.c#hvmemul_do_io_buffer+0x2e/0x70
(XEN)    [<ffff82d0802d64dd>] emulate.c#hvmemul_linear_mmio_access+0x24d/0x540
(XEN)    [<ffff82d080350ecb>] common_interrupt+0x9b/0x120
(XEN)    [<ffff82d0802d69f1>] emulate.c#__hvmemul_read+0x221/0x230
(XEN)    [<ffff82d0802a313d>] x86_emulate.c#x86_decode+0xe2d/0x1e50
(XEN)    [<ffff82d080350ecb>] common_interrupt+0x9b/0x120
(XEN)    [<ffff82d0802a4aad>] x86_emulate+0x94d/0x19150
(XEN)    [<ffff82d08030ebd1>] __get_gfn_type_access+0x101/0x290
(XEN)    [<ffff82d0802d7c0a>] emulate.c#_hvm_emulate_one+0x4a/0x1e0
(XEN)    [<ffff82d0803006e0>] vmx.c#vmx_get_interrupt_shadow+0/0x10
(XEN)    [<ffff82d0802d7a2e>] hvm_emulate_init_once+0x7e/0xb0
(XEN)    [<ffff82d0802e394b>] hvm_emulate_one_insn+0x3b/0x120
(XEN)    [<ffff82d0802bd3a0>] x86_insn_is_mem_access+0/0xc0
(XEN)    [<ffff82d0802dc5b8>] hvm_hap_nested_page_fault+0x138/0x710
(XEN)    [<ffff82d08023bdc0>] timer.c#add_entry+0x50/0xc0
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030517e>] vmx_vmexit_handler+0x8ae/0x1960
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030b59f>] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)    [<ffff82d08030b5ab>] vmx_asm_vmexit_handler+0xab/0x240
(XEN)    [<ffff82d08030b5e2>] vmx_asm_vmexit_handler+0xe2/0x240
(XEN) 
(XEN) domain_crash called from emulate.c:171
(XEN) Domain 2 (vcpu#3) crashed on cpu#39:
(XEN) ----[ Xen-4.10.0_15-0  x86_64  debug=n   Tainted:  C   ]----
(XEN) CPU:    39
(XEN) RIP:    0010:[<fffff8000162411e>]
(XEN) RFLAGS: 0000000000010286   CONTEXT: hvm guest (d2v3)
(XEN) rax: ffffffffffd07000   rbx: 0000000000000003   rcx: 0000000a00005036
(XEN) rdx: 0000000002549700   rsi: fffffa80044b8990   rdi: 00000001adfbbe88
(XEN) rbp: fffffa6001145128   rsp: fffffa60019ffb58   r8:  00000000b57e152b
(XEN) r9:  0000000001d3c1ec   r10: fffff6fb7e980038   r11: 0000000000000003
(XEN) r12: fffffa80044b8990   r13: 0000000000000004   r14: 0000000001d3c1ec
(XEN) r15: fffffa60019dbc00   cr0: 0000000080050031   cr4: 00000000000006f8
(XEN) cr3: 0000000000124000   cr2: fffffa6000fae10e
(XEN) fsb: 00000000fffdf000   gsb: fffffa60019d8000   gss: 000007fffffae000
(XEN) ds: 002b   es: 002b   fs: 0053   gs: 002b   ss: 0018   cs: 0010

The first line shows the recorded / actual values for each of the
fields the if() checks, in that same order, followed by the data
value (patch below for reference). The stack trace also suggests to
me that we're not in the context of a re-issue (which iirc would
always originate from hvm_do_resume()).

I'd appreciate any thoughts on the matter,
Jan

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -164,7 +164,12 @@ static int hvmemul_do_io(
              (p.dir != dir) ||
              (p.df != df) ||
              (p.data_is_ptr != data_is_addr) )
+{//temp
+ printk("%pv: t=%d/%d a=%lx/%lx s=%x/%x c=%x/%lx d=%d/%d f=%d/%d p=%d/%d v=%lx/%lx\n", curr,
+        p.type, is_mmio, p.addr, addr, p.size, size, p.count, *reps, p.dir, dir, p.df, df, p.data_is_ptr, data_is_addr, p.data, data);
+ dump_execution_state();
             domain_crash(currd);
+}
 
         if ( data_is_addr )
             return X86EMUL_UNHANDLEABLE;

