
Re: [Xen-devel] S3 crash with VTD Queue Invalidation enabled



On 03/06/13 19:29, Ben Guthro wrote:
> I am seeing a crash on some vPro systems in the S3 path,
> specifically on a Lenovo ThinkPad x220t (Sandy Bridge).
>
> Once I managed to not suspend the console, I got a panic in
> queue_invalidate_wait()
> (I added a dump_execution_state() here, to get some more info)
>
> (XEN) Entering ACPI S3 state.
> (XEN) ----[ Xen-4.2.2  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82c480149091>] invalidate_sync+0x258/0x291
> (XEN) RFLAGS: 0000000000010086   CONTEXT: hypervisor
> (XEN) rax: 0000000000000000   rbx: ffff830137a665c0   rcx: 0000000000000000
> (XEN) rdx: ffff82c48030a0a0   rsi: 000000000000000a   rdi: ffff82c4802766e0
> (XEN) rbp: ffff82c4802bfd30   rsp: ffff82c4802bfce0   r8:  0000000000000004
> (XEN) r9:  0000000000000002   r10: 0000000000000020   r11: 0000000000000010
> (XEN) r12: 0000000bf34a77bc   r13: 0000000000000000   r14: ffff830137a665f8
> (XEN) r15: 0000000137a5c002   cr0: 000000008005003b   cr4: 00000000000426f0
> (XEN) cr3: 00000000ba2cd000   cr2: ffff880024181ff0
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen stack trace from rsp=ffff82c4802bfce0:
> (XEN)    0000000000000002 0000000000000002 0101010000000002 0000000000000082
> (XEN)    00000001802bfd30 ffff830137a665c0 0000000000000000 0000000000000000
> (XEN)    0000000000000000 1000000000000000 ffff82c4802bfd90 ffff82c48014919d
> (XEN)    ffff82c400000000 0000000000000000 ffff82c4802bfd60 0000000000000000
> (XEN)    ffff82c4802bfd90 ffff830137a665c0 ffff830137a66540 0000000000000000
> (XEN)    ffff830137a66670 ffff82c4802679e0 ffff82c4802bfde0 ffff82c480145a60
> (XEN)    0000000000000000 ffff82c4802bfdc0 ffff82c480125d36 ffff82c3ffd7a00c
> (XEN)    0000000000000000 0000000000000003 0000000000000003 ffff82c48030a100
> (XEN)    ffff82c4802bfe20 ffff82c480145b08 ffff830137a4e620 ffff82c3ffd7a00c
> (XEN)    0000000000000000 0000000000000003 0000000000000003 ffff82c48030a100
> (XEN)    ffff82c4802bfe30 ffff82c480141e12 ffff82c4802bfe80 ffff82c48019f315
> (XEN)    ffff82c4802bfe60 0000000000000282 0000000000000003 ffff83010cc0a010
> (XEN)    ffff8300ba0fd000 0000000000000000 0000000000000003 ffff82c48030a100
> (XEN)    ffff82c4802bfea0 ffff82c480105ed4 ffff8300ba0fd188 ffff82c48030a170
> (XEN)    ffff82c4802bfec0 ffff82c480127a1e ffff82c480125b8a ffff82c48030a190
> (XEN)    ffff82c4802bfef0 ffff82c480127d89 ffff82c4802bff18 ffff82c4802bff18
> (XEN)    ffff82c4802bff18 00000000ffffffff ffff82c4802bff10 ffff82c48015a42f
> (XEN)    ffff8300ba59a000 ffff8300ba0fd000 ffff82c4802bfda8 0000000000001403
> (XEN)    0000000000000003 0000000000003403 ffffffff81a6b278 ffff8800049f3d28
> (XEN)    0000000000000000 0000000000000246 0000000000000404 0000000000000003
> (XEN) Xen call trace:
> (XEN)    [<ffff82c480149091>] invalidate_sync+0x258/0x291
> (XEN)    [<ffff82c48014919d>] flush_iotlb_qi+0xd3/0xef
> (XEN)    [<ffff82c480145a60>] iommu_flush_all+0xb5/0xde
> (XEN)    [<ffff82c480145b08>] vtd_suspend+0x23/0xf1
> (XEN)    [<ffff82c480141e12>] iommu_suspend+0x3c/0x3e
> (XEN)    [<ffff82c48019f315>] enter_state_helper+0x1a0/0x3cb
> (XEN)    [<ffff82c480105ed4>] continue_hypercall_tasklet_handler+0x51/0xbf
> (XEN)    [<ffff82c480127a1e>] do_tasklet_work+0x8d/0xc7
> (XEN)    [<ffff82c480127d89>] do_tasklet+0x6b/0x9b
> (XEN)    [<ffff82c48015a42f>] idle_loop+0x67/0x6f
> (XEN)
> (XEN)
> (XEN) DMAR_IQA_REG = 137a5c002
> (XEN) DMAR_IQH_REG = 120
> (XEN) DMAR_IQT_REG = 140
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) queue invalidate wait descriptor was not executed
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
>
>
> This particular dump was with Xen 4.2.2, and Linux 3.8.8
> I have tested the following other combinations, with no difference in 
> behavior:
>
> Xen-unstable git cs da3bca931fbcf0cbdfec971aca234e7ec0f41e16, with
> Linux 3.10-rc3 cs 58f8bbd2e39c3732c55698494338ee19a92c53a0
>
> Xen-4.2.2 / linux-3.8.8
> Xen-4.2.2 / linux-3.8.13
> Xen-4.2.3-pre / linux-3.8.13
>
> Booting with iommu=no-qinval or iommu=off works around the problem,
> but I was wondering if there is a more elegant solution, possibly
> detecting and disabling this feature if it is not working properly?
>
>
> Thanks in advance for any insight.
>
> Ben

This was likely broken by XSA-36.

My fix for the crash path is:
http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=53fd1d8458de01169dfb56feb315f02c2b521a86

You want to inspect the use of iommu_enabled and iommu_intremap.
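
For anyone reading the three DMAR register values in the dump above, here
is a rough sketch of how they can be decoded by hand. It is not taken from
the Xen source: the helper names (iqa_base, iqa_entries, qi_index,
qi_pending, qi_dump) are made up for illustration, and the field layout is
an assumption based on the VT-d specification with 128-bit descriptors.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Assumed register layout (VT-d spec, 128-bit descriptors; hypothetical
 * helpers, not Xen code):
 *   IQA_REG:           bits 63:12 = queue base address, bits 2:0 = QS
 *   IQH_REG / IQT_REG: bits 18:4  = head / tail offset in 16-byte units
 */
#define QI_DESC_SHIFT 4        /* each descriptor is 16 bytes */
#define QI_INDEX_MASK 0x7fffu  /* 15 index bits (register bits 18:4) */

/* Physical base address of the invalidation queue. */
static inline uint64_t iqa_base(uint64_t iqa)
{
    return iqa & ~0xfffULL;
}

/* The queue spans 2^QS 4 KiB pages, i.e. 2^(QS + 8) descriptors. */
static inline unsigned int iqa_entries(uint64_t iqa)
{
    return 1u << ((iqa & 0x7) + 8);
}

/* Convert a raw IQH/IQT register value into a descriptor index. */
static inline unsigned int qi_index(uint64_t reg)
{
    return (unsigned int)(reg >> QI_DESC_SHIFT) & QI_INDEX_MASK;
}

/* Descriptors queued by software that hardware has not consumed yet. */
static inline unsigned int qi_pending(uint64_t iqh, uint64_t iqt,
                                      unsigned int entries)
{
    return (qi_index(iqt) - qi_index(iqh)) & (entries - 1);
}

/* Print a one-line summary of the queue state. */
static inline void qi_dump(uint64_t iqa, uint64_t iqh, uint64_t iqt)
{
    unsigned int entries = iqa_entries(iqa);

    printf("base=%#llx entries=%u head=%u tail=%u pending=%u\n",
           (unsigned long long)iqa_base(iqa), entries,
           qi_index(iqh), qi_index(iqt), qi_pending(iqh, iqt, entries));
}
```

Calling qi_dump(0x137a5c002ULL, 0x120, 0x140) with the values from the log
gives head index 18 and tail index 20 in a 1024-entry queue, so under these
assumptions two descriptors were still outstanding when the timeout hit,
which is consistent with the "wait descriptor was not executed" panic.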

~Andrew

>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel




 

