Re: [Xen-devel] S3 crash with VTD Queue Invalidation enabled

On Tue, Jun 4, 2013 at 4:54 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>> On 03.06.13 at 21:22, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 03/06/13 19:29, Ben Guthro wrote:
>>> (XEN) Xen call trace:
>>> (XEN)    [<ffff82c480149091>] invalidate_sync+0x258/0x291
>>> (XEN)    [<ffff82c48014919d>] flush_iotlb_qi+0xd3/0xef
>>> (XEN)    [<ffff82c480145a60>] iommu_flush_all+0xb5/0xde
>>> (XEN)    [<ffff82c480145b08>] vtd_suspend+0x23/0xf1
>>> (XEN)    [<ffff82c480141e12>] iommu_suspend+0x3c/0x3e
>>> (XEN)    [<ffff82c48019f315>] enter_state_helper+0x1a0/0x3cb
>>> (XEN)    [<ffff82c480105ed4>] continue_hypercall_tasklet_handler+0x51/0xbf
>>> (XEN)    [<ffff82c480127a1e>] do_tasklet_work+0x8d/0xc7
>>> (XEN)    [<ffff82c480127d89>] do_tasklet+0x6b/0x9b
>>> (XEN)    [<ffff82c48015a42f>] idle_loop+0x67/0x6f
>> This was likely broken by XSA-36
>> My fix for the crash path is:
>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=53fd1d8458de01169dfb56feb315f02c2b521a86
>> You want to inspect the use of iommu_enabled and iommu_intremap.
> According to the comment in vtd_suspend(),
> iommu_disable_x2apic_IR() is supposed to run after
> iommu_suspend() (and indeed lapic_suspend() gets called
> immediately after iommu_suspend() by device_power_down()),
> and hence that shouldn't be the reason here. But, Ben, to be
> sure, dumping the state of the various IOMMU related enabling
> variables would be a good idea.

I assume you are referring to the variables below, defined at the top of iommu.c.
At the time of the crash, they look like this:

(XEN) iommu_enabled = 1
(XEN) force_iommu; = 0
(XEN) iommu_verbose; = 0
(XEN) iommu_workaround_bios_bug; = 0
(XEN) iommu_passthrough; = 0
(XEN) iommu_snoop = 0
(XEN) iommu_qinval = 1
(XEN) iommu_intremap = 1
(XEN) iommu_hap_pt_share = 0
(XEN) iommu_debug; = 0
(XEN) amd_iommu_perdev_intremap = 1

If that gives any additional insight, please let me know.
I'm not sure I gleaned anything particularly significant from it though.

Or - perhaps you are referring to other enabling variables?
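
For reference, a dump in the "(XEN) name = value" format above only needs one formatted print per flag. The following is a minimal standalone sketch, not the actual debugging change used here: it uses snprintf() in place of Xen's printk() so that it compiles outside the hypervisor, it covers only three of the flags, and both struct iommu_flags and dump_iommu_flags() are hypothetical names introduced for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Stand-ins for three of the globals defined at the top of iommu.c.
 * In the hypervisor these are separate variables; they are grouped into
 * a struct here only so the sketch is self-contained (assumption).
 */
struct iommu_flags {
    int iommu_enabled;
    int iommu_qinval;
    int iommu_intremap;
};

/*
 * Hypothetical helper: writes one "(XEN) name = value" line per flag
 * into buf. Returns the number of characters written, or -1 if the
 * output did not fit or formatting failed.
 */
int dump_iommu_flags(char *buf, size_t len, const struct iommu_flags *f)
{
    int n = snprintf(buf, len,
                     "(XEN) iommu_enabled = %d\n"
                     "(XEN) iommu_qinval = %d\n"
                     "(XEN) iommu_intremap = %d\n",
                     f->iommu_enabled, f->iommu_qinval, f->iommu_intremap);

    /* snprintf reports the length it wanted; treat truncation as an error. */
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Inside Xen the same thing would be done with printk() on the real globals, called just before the flush in the suspend path, so that the values reflect the state at the time of the crash.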

> Is this perhaps having some similarity with
> http://lists.xen.org/archives/html/xen-devel/2013-04/msg00343.html?
> We're clearly running single-CPU only here and there...

We certainly should be, as we have gone through the
disable_nonboot_cpus() by this point - and I can verify that from the

