
[Xen-devel] ACPI shutdown unreliable with win7?



In osstest self-gate flight 56412 a change went into production which
makes HVM guests configurably use "xl shutdown -F" (-F == fall back to
an ACPI power event if PV drivers are absent) and applies the option to
the win7 and winxpsp3 tests.
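
For reference, what "xl shutdown -F" amounts to via libxl is roughly
the following (my sketch, not the actual xl code, which handles more
cases; shutdown_with_acpi_fallback is a made-up name):

    #include <libxl.h>

    /* Sketch: try the PV shutdown protocol first; if the guest has no
     * PV drivers, fall back to injecting an ACPI power button event,
     * which the guest OS may or may not act on. */
    static int shutdown_with_acpi_fallback(libxl_ctx *ctx, uint32_t domid)
    {
        int rc = libxl_domain_shutdown(ctx, domid);

        if (rc == ERROR_NOPARAVIRT)
            rc = libxl_send_trigger(ctx, domid, LIBXL_TRIGGER_POWER, 0);

        return rc;
    }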

On winxpsp3 this appears to have been a success, and things have been
working pretty well since that flight:
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-i386-xl-qemuu-winxpsp3.html
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-i386-xl-qemuu-winxpsp3-vcpus1.html
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemut-winxpsp3.html
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-i386-xl-qemut-winxpsp3.html
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-i386-xl-qemut-winxpsp3-vcpus1.html

However, win7 seems to be in somewhat poorer shape:

http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.html
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-i386-xl-qemuu-win7-amd64.html
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemut-win7-amd64.html
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-i386-xl-qemut-win7-amd64.html

it passes less than one time in ten (counting only the runs which don't
get blocked earlier on).

The problem seems to be independent of the qemu version and, with
qemuu, of seabios vs rombios. The dom0 kernel also seems irrelevant.

Some random examples which I looked at:
http://logs.test-lab.xenproject.org/osstest/logs/56929/test-amd64-amd64-xl-qemut-win7-amd64/info.html
http://logs.test-lab.xenproject.org/osstest/logs/56929/test-amd64-i386-xl-qemuu-win7-amd64/info.html
http://logs.test-lab.xenproject.org/osstest/logs/56718/test-amd64-amd64-xl-qemut-win7-amd64/info.html

all showed in their VNC screenshots a guest giving no indication of
shutting down.

Looking a bit closer at the first 56929 example, xl list shows the
Windows domain (domid 18) in the blocked state.

The host serial log shows an apparently successful restore into dom18;
the debug key output taken after the failure then shows CPU1,
corresponding to d18v0, sat in hvm_io_pending (note that the address
being read, 0xfed000f0, falls in the guest's HPET MMIO window):
        May 22 01:27:58.889074 (XEN) *** Dumping CPU1 host state: ***
        May 22 01:27:58.897025 (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Not tainted ]----
        May 22 01:27:58.904993 (XEN) CPU:    1
        May 22 01:27:58.905036 (XEN) RIP:    e008:[<ffff82d0801bdce5>] hvm_io_pending+0/0x52
        May 22 01:27:58.912996 (XEN) RFLAGS: 0000000000000297   CONTEXT: hypervisor (d18v0)
        May 22 01:27:58.913056 (XEN) rax: 0000000000000000   rbx: ffff8300cceeb000   rcx: 00000000fed000f0
        May 22 01:27:58.920996 (XEN) rdx: 0000000000000004   rsi: 00000000fed000f4   rdi: ffff8300cceeb000
        May 22 01:27:58.928997 (XEN) rbp: ffff83022b4bf5b8   rsp: ffff83022b4bf520   r8:  0000000000000001
        May 22 01:27:58.936987 (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: fffff80000b9c9e8
        May 22 01:27:58.944986 (XEN) r12: ffff830219db5000   r13: 0000000000000004   r14: ffff82e0024433a0
        May 22 01:27:58.953029 (XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 00000000001526f0
        May 22 01:27:58.960994 (XEN) cr3: 000000010c38e000   cr2: fffff8a0007d1538
        May 22 01:27:58.968987 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
        May 22 01:27:58.969027 (XEN) Xen stack trace from rsp=ffff83022b4bf520:
        May 22 01:27:58.976990 (XEN)    ffff82d0801b8300 0000000000000001 ffff83022b4bfb60 ffff83022b4bf628
        May 22 01:27:58.985003 (XEN)    0000000100000002 00000000fed000f0 0100000000000001 ffff83022b4bf578
        May 22 01:27:58.993017 (XEN)    00000000801b8d1c 00000000fed000f0 0000000000000000 0000000400000000
        May 22 01:27:59.001010 (XEN)    0120000000000000 ffff83022b4bf630 0000000000000004 0000000000000004
        May 22 01:27:59.009040 (XEN)    00000000000000f0 ffff83022b4bf9a8 0000000000000002 ffff83022b4bf5d8
        May 22 01:27:59.017016 (XEN)    ffff82d0801b85fb ffff830200000000 ffff83022b4bf9a8 ffff83022b4bf668
        May 22 01:27:59.025011 (XEN)    ffff82d0801b9086 ffff83022b4bf9a8 ffff82d0801b85fb ffff830200000000
        May 22 01:27:59.033009 (XEN)    0000000229000000 ffff83022b4bfb60 ffff8300000000f4 ffff83022b4bf978
        May 22 01:27:59.040990 (XEN)    00000000fed000f0 0000000000000001 ffffffffffd090f0 ffff83022b4bf688
        May 22 01:27:59.041030 (XEN)    000000000000008b 0000000000000000 0000000000000000 ffff82d08028d900
        May 22 01:27:59.048993 (XEN)    ffff83022b4bfb60 ffff83022b4bf678 ffff82d0801b92b8 ffff83022b4bf688
        May 22 01:27:59.057007 (XEN)    ffff82d0801958de ffff83022b4bfac8 ffff82d080197875 ffff83022b4bf6c8
        May 22 01:27:59.064999 (XEN)    ffff82d0801f7b1c ffff83012c9ea008 0000000000000001 0000000000000080
        May 22 01:27:59.073007 (XEN)    00000000000001d9 ffff830200000001 ffff82d0801f890f 8086000000008086
        May 22 01:27:59.080990 (XEN)    0000000200020001 ffffffffffd090f0 dc00000000000008 ffff83022b4b0000
        May 22 01:27:59.089011 (XEN)    0000000900000000 00000000ffffff18 ffff830100000004 0000000000000144
        May 22 01:27:59.097029 (XEN)    ffff830100000008 ffff82d0801b92b8 0000000000000000 ffff83012c9ea010
        May 22 01:27:59.105006 (XEN)    ffff83022b4bf8c4 00000000000000f0 ffff83022b4bf7f8 ffff83022b4bf800
        May 22 01:27:59.112998 (XEN)    0000000000000012 000000000012c9e8 0000000000000001 ffff83022b4bf7a8
        May 22 01:27:59.121016 (XEN)    ffff82d0801f7b1c ffff83012c9ea010 0000000000000001 0000000000000000
        May 22 01:27:59.129007 (XEN) Xen call trace:
        May 22 01:27:59.129072 (XEN)    [<ffff82d0801bdce5>] hvm_io_pending+0/0x52
        May 22 01:27:59.137022 (XEN)    [<ffff82d0801b85fb>] hvmemul_do_mmio+0x2d/0x2f
        May 22 01:27:59.137060 (XEN)    [<ffff82d0801b9086>] __hvmemul_read+0x11c/0x29c
        May 22 01:27:59.145011 (XEN)    [<ffff82d0801b92b8>] hvmemul_read+0x12/0x14
        May 22 01:27:59.152979 (XEN)    [<ffff82d0801958de>] read_ulong+0xe/0x10
        May 22 01:27:59.153012 (XEN)    [<ffff82d080197875>] x86_emulate+0x1577/0x10362
        May 22 01:27:59.161001 (XEN)    [<ffff82d0801b8a5a>] _hvm_emulate_one+0x197/0x2bb
        May 22 01:27:59.169007 (XEN)    [<ffff82d0801b8c3f>] hvm_emulate_one+0x10/0x12
        May 22 01:27:59.176996 (XEN)    [<ffff82d0801c8274>] handle_mmio+0x54/0xd4
        May 22 01:27:59.177033 (XEN)    [<ffff82d0801c8338>] handle_mmio_with_translation+0x44/0x46
        May 22 01:27:59.185009 (XEN)    [<ffff82d0801c6624>] hvm_hap_nested_page_fault+0x163/0x541
        May 22 01:27:59.193029 (XEN)    [<ffff82d0801e7df6>] vmx_vmexit_handler+0x16a8/0x1a63
        May 22 01:27:59.201048 (XEN)    [<ffff82d0801ee091>] vmx_asm_vmexit_handler+0x41/0xc0
        May 22 01:27:59.201088 (XEN)
        May 22 01:27:59.209011 (XEN) *** Dumping CPU1 guest state (d18v0): ***
        May 22 01:27:59.209060 (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Not tainted ]----
        May 22 01:27:59.217002 (XEN) CPU:    1
        May 22 01:27:59.217018 (XEN) RIP:    0010:[<fffff8000260eeea>]
        May 22 01:27:59.225029 (XEN) RFLAGS: 0000000000010246   CONTEXT: hvm guest (d18v0)
        May 22 01:27:59.225067 (XEN) rax: ffffffffffd09000   rbx: fffffa8001404000   rcx: 0000001280000b9f
        May 22 01:27:59.233042 (XEN) rdx: 000000000342b947   rsi: 0000000000006ec8   rdi: 0000000000000000
        May 22 01:27:59.241045 (XEN) rbp: 0000000000010000   rsp: fffff80000b9c828   r8:  fffff80000b9c900
        May 22 01:27:59.249008 (XEN) r9:  0000000003b9aca0   r10: 0000000000000000   r11: fffff80000b9c9e8
        May 22 01:27:59.256996 (XEN) r12: 0000000000000020   r13: fffffa80006e7010   r14: fffff80000b9c948
        May 22 01:27:59.265034 (XEN) r15: 0000000000000000   cr0: 0000000080050031   cr4: 00000000000406f8
        May 22 01:27:59.273026 (XEN) cr3: 0000000000187000   cr2: fffff8a0007d1538
        May 22 01:27:59.280994 (XEN) ds: 002b   es: 002b   fs: 0053   gs: 002b   ss: 0018   cs: 0010
        May 22 01:27:59.288972 (XEN)

and later:
        May 22 01:28:33.044999 (XEN) General information for domain 18:
        May 22 01:28:33.045034 (XEN)     refcnt=3 dying=0 pause_count=0
        May 22 01:28:33.053003 (XEN)     nr_pages=131074 xenheap_pages=6 shared_pages=0 paged_pages=0 dirty_cpus={1-2} max_pages=131328
        May 22 01:28:33.061035 (XEN)     handle=e2333298-4442-4ff6-aaaf-fd6168c5c8f7 vm_assist=00000000
        May 22 01:28:33.068986 (XEN)     paging assistance: hap refcounts translate external
        May 22 01:28:33.076975 (XEN) Rangesets belonging to domain 18:
        May 22 01:28:33.077009 (XEN)     I/O Ports  { }
        May 22 01:28:33.085015 (XEN)     log-dirty  { }
        May 22 01:28:33.085047 (XEN)     Interrupts { }
        May 22 01:28:33.085073 (XEN)     I/O Memory { }
        May 22 01:28:33.093013 (XEN) Memory pages belonging to domain 18:
        May 22 01:28:33.093046 (XEN)     DomPage list too long to display
        May 22 01:28:33.101054 (XEN)     PoD entries=0 cachesize=0
        May 22 01:28:33.101098 (XEN)     XenPage 000000000012c8af: caf=c000000000000001, taf=7400000000000001
        May 22 01:28:33.108981 (XEN)     XenPage 000000000012c9ee: caf=c000000000000001, taf=7400000000000001
        May 22 01:28:33.116979 (XEN)     XenPage 000000000010c39d: caf=c000000000000001, taf=7400000000000001
        May 22 01:28:33.125010 (XEN)     XenPage 000000000012c9f5: caf=c000000000000001, taf=7400000000000001
        May 22 01:28:33.132998 (XEN)     XenPage 00000000000ccef2: caf=c000000000000001, taf=7400000000000001
        May 22 01:28:33.141013 (XEN)     XenPage 000000000012c8ec: caf=c000000000000001, taf=7400000000000001
        May 22 01:28:33.148973 (XEN) NODE affinity for domain 18: [0]
        May 22 01:28:33.149028 (XEN) VCPU information and callbacks for domain 18:
        May 22 01:28:33.156972 (XEN)     VCPU0: CPU2 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={2}
        May 22 01:28:33.165038 (XEN)     cpu_hard_affinity={0-47} cpu_soft_affinity={0-3}
        May 22 01:28:33.173029 (XEN)     pause_count=0 pause_flags=1
        May 22 01:28:33.173064 (XEN)     paging assistance: hap, 4 levels
        May 22 01:28:33.181020 (XEN)     No periodic timer
        May 22 01:28:33.181052 (XEN)     VCPU1: CPU1 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={1}
        May 22 01:28:33.189031 (XEN)     cpu_hard_affinity={0-47} cpu_soft_affinity={0-3}
        May 22 01:28:33.197069 (XEN)     pause_count=0 pause_flags=1
        May 22 01:28:33.197105 (XEN)     paging assistance: hap, 4 levels
        May 22 01:28:33.205013 (XEN)     No periodic timer
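
For reference, as best I recall hvm_io_pending() in this tree is
essentially the following (paraphrased from memory, so details may
differ from the actual source):

    /* A vcpu counts as having I/O pending while any ioreq server
     * (i.e. a device model such as qemu) still has a request in
     * flight on its behalf; the vcpu can't make progress until that
     * request completes. */
    bool_t hvm_io_pending(struct vcpu *v)
    {
        struct hvm_ioreq_server *s;

        list_for_each_entry ( s,
                              &v->domain->arch.hvm_domain.ioreq_server.list,
                              list_entry )
        {
            struct hvm_ioreq_vcpu *sv;

            list_for_each_entry ( sv, &s->ioreq_vcpu_list, list_entry )
            {
                if ( sv->vcpu == v && sv->pending )
                    return 1;
            }
        }

        return 0;
    }

So, if I'm reading this right, d18v0 appears to be stuck waiting for
the device model to complete an MMIO read which never finishes.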

The xl log just shows it waiting for the domain to die.

The qemu log
http://logs.test-lab.xenproject.org/osstest/logs/56929/test-amd64-amd64-xl-qemut-win7-amd64/elbling1---var-log-xen-qemu-dm-win.guest.osstest--incoming.log
seems uninteresting.

Can anything be gleaned from this?

Ian.

