
Re: IOMMU faults after S3


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • Date: Tue, 7 Apr 2026 12:02:23 +0200
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 07 Apr 2026 10:02:30 +0000
  • Feedback-id: i1568416f:Fastmail
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Apr 07, 2026 at 08:29:48AM +0200, Jan Beulich wrote:
> On 03.04.2026 01:06, Marek Marczykowski-Górecki wrote:
> > On Thu, Apr 02, 2026 at 04:53:31PM +0200, Jan Beulich wrote:
> >> On 02.04.2026 16:47, Marek Marczykowski-Górecki wrote:
> >>> On Thu, Apr 02, 2026 at 12:48:14PM +0200, Jan Beulich wrote:
> >>>> On 02.04.2026 11:35, Marek Marczykowski-Górecki wrote:
> >>>>> On Thu, Apr 02, 2026 at 10:39:41AM +0200, Jan Beulich wrote:
> >>>>>> On 02.04.2026 10:08, Marek Marczykowski-Górecki wrote:
> >>>>>>> The xl dmesg output (from MTL this time):
> >>>>>>>
> >>>>>>>     (XEN) [  123.477511] Entering ACPI S3 state.
> >>>>>>>     (XEN) [18446743903.571842] _disable_pit_irq:2649: using_pit: 0, 
> >>>>>>> cpu_has_apic: 1
> >>>>>>>     (XEN) [18446743903.571856] _disable_pit_irq:2659: 
> >>>>>>> cpuidle_using_deep_cstate: 1, boot_cpu_has(X86_FEATURE_XEN_ARAT): 0
> >>>>>
> >>>>>> Hmm, but what you didn't log is whether __hpet_setup_msi_irq() actually
> >>>>>> succeeded everywhere. (And if it did, also logging HPET_Tn_ROUTE() values
> >>>>>> might be a good idea, if only to double check.)
> >>>>>
> >>>>> Updated output:
> >>>>>
> >>>>>     (XEN) [18446743899.720395] _disable_pit_irq:2649: using_pit: 0, 
> >>>>> cpu_has_apic: 1
> >>>>>     (XEN) [18446743899.720409] _disable_pit_irq:2659: 
> >>>>> cpuidle_using_deep_cstate: 1, boot_cpu_has(X86_FEATURE_XEN_ARAT): 0
> >>>>>     (XEN) [18446743899.720420] _disable_pit_irq:2662: init: 0
> >>>>>     (XEN) [18446743899.720431] hpet_broadcast_resume:663: hpet_events: 
> >>>>> ffff83046bc1f080
> >>>>>     (XEN) [18446743899.720579] hpet_broadcast_resume:674: 
> >>>>> num_hpets_used: 8
> >>>>>     (XEN) [18446743899.720587] hpet_broadcast_resume:692: cfg: 0x1
> >>>>>     (XEN) [18446743899.720599] hpet_broadcast_resume:697: i:0, 
> >>>>> hpet_events[i].msi.irq: 122, hpet_events[i].flags: 0
> >>>>>     (XEN) [18446743899.720612] hpet_msi_write:283: iommu_intremap: 2 
> >>>>> (iommu_intremap_off: 0), HPET_Tn_ROUTE(ch->idx): 0x110
> >>>>>     (XEN) [18446743899.720638] hpet_msi_write:287: 
> >>>>> iommu_update_ire_from_msi rc: 0
> >>>>
> >>>> So it succeeds, and the low half of HPET_Tn_ROUTE also looks plausible. The high
> >>>> half is, however, the address that the low half value is written to. It's hard
> >>>> to imagine that it would be zero when the low half isn't, but it is about the
> >>>> last thing I can think of which could explain observed behavior. (Yet then, all
> >>>> of this is pretty meaningless; see below.)
> >>>>
> >>>>> And the current debug diff attached.
> >>>>
> >>>> Hmm, you log HPET_Tn_ROUTE _before_ our update. That's not very useful. You want
> >>>> to move that part of logging to the bottom of hpet_msi_write(), or maybe to
> >>>> where you also log the per-channel cfg value in hpet_broadcast_resume() (thus
> >>>> making the logging overall less verbose).
> >>>
> >>> This test is with the updated patch (attached) + your extra
> >>> calculate_host_policy() call and "no-arat" on cmdline:
> >>
> >> And IOMMU faults still occurring as before, I expect.
> >>
> >> Sadly you now log the low halves of HPET_Tn_ROUTE twice, while you don't log
> >> the high halves at all.
> > 
> > I was missing hpet_read32 there...
> > 
> > Updated:
> > (XEN) [  116.921573] Entering ACPI S3 state.
> > (XEN) [18446743895.088893] _disable_pit_irq:2649: using_pit: 0, 
> > cpu_has_apic: 1
> > (XEN) [18446743895.088907] _disable_pit_irq:2659: 
> > cpuidle_using_deep_cstate: 1, boot_cpu_has(X86_FEATURE_XEN_ARAT): 0
> > (XEN) [18446743895.088918] _disable_pit_irq:2662: init: 0
> > (XEN) [18446743895.088928] hpet_broadcast_resume:662: hpet_events: 
> > ffff83046bc1f080
> > (XEN) [18446743895.089072] hpet_broadcast_resume:673: num_hpets_used: 8
> > (XEN) [18446743895.089081] hpet_broadcast_resume:691: cfg: 0x1
> > (XEN) [18446743895.089092] hpet_broadcast_resume:696: i:0, 
> > hpet_events[i].msi.irq: 122, hpet_events[i].flags: 0
> > (XEN) [18446743895.089122] hpet_msi_write:286: iommu_update_ire_from_msi 
> > rc: 0
> > (XEN) [18446743895.089132] hpet_broadcast_resume:700: i:0, 
> > __hpet_setup_msi_irq ret: 0
> > (XEN) [18446743895.089168] hpet_broadcast_resume:710: i:0, cfg: 0xc134, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx)): 0, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx) + 4): 0xf18
> 
> Okay, this would appear to clarify that the address really isn't correct. Yet I'm
> confused now by the low half values: In your earlier log there was
> 
> hpet_broadcast_resume:710: i:0, cfg: 0xc134, 
> HPET_Tn_ROUTE(hpet_events[i].idx): 0x110

My earlier logging printed the literal value of the HPET_Tn_ROUTE() macro,
i.e. the register offset (0x110 for channel 0), not the result of
hpet_read32() on it...
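
To illustrate the difference, here is a minimal standalone sketch (not the
actual debug patch). It assumes the HPET specification's register layout,
where timer N's FSB interrupt route register sits at offset 0x110 + 0x20 * N.
fake_hpet, fake_hpet_read32() and fake_hpet_write32() are hypothetical
stand-ins for the real MMIO window and Xen's hpet_read32()/hpet_write32():

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Offset of timer N's FSB interrupt route register (per the HPET spec). */
#define HPET_Tn_ROUTE(n) (0x110 + 0x20 * (n))

/* Hypothetical stand-in for the 1 KiB HPET MMIO window. */
static uint32_t fake_hpet[0x400 / 4];

/* Hypothetical stand-in for hpet_read32(): return register contents. */
static uint32_t fake_hpet_read32(unsigned int offset)
{
    assert(offset % 4 == 0 && offset < sizeof(fake_hpet));
    return fake_hpet[offset / 4];
}

/* Hypothetical stand-in for hpet_write32(): store a value at an offset. */
static void fake_hpet_write32(uint32_t val, unsigned int offset)
{
    assert(offset % 4 == 0 && offset < sizeof(fake_hpet));
    fake_hpet[offset / 4] = val;
}

/* Log the offset and both register halves for one channel index. */
static void log_route(unsigned int idx)
{
    /* This prints only the offset; it says nothing about the hardware. */
    printf("HPET_Tn_ROUTE(%u): %#x\n", idx, (unsigned int)HPET_Tn_ROUTE(idx));

    /* These print what the register actually holds. */
    printf("low half (value):    %#x\n",
           (unsigned int)fake_hpet_read32(HPET_Tn_ROUTE(idx)));
    printf("high half (address): %#x\n",
           (unsigned int)fake_hpet_read32(HPET_Tn_ROUTE(idx) + 4));
}
```

Printing HPET_Tn_ROUTE(0) always gives 0x110 regardless of what the hardware
holds; only the reads through the accessor reflect the register contents.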

> and the like, i.e. clearly a non-zero value. Now all low halves are zero. I'll try
> to figure out how the logged values here could result, but consistent data (or an
> explanation for the apparent inconsistency) would help.
> 
> Jan
> 
> > (XEN) [18446743895.089180] hpet_broadcast_resume:696: i:1, 
> > hpet_events[i].msi.irq: 123, hpet_events[i].flags: 0
> > (XEN) [18446743895.089203] hpet_msi_write:286: iommu_update_ire_from_msi 
> > rc: 0
> > (XEN) [18446743895.089213] hpet_broadcast_resume:700: i:1, 
> > __hpet_setup_msi_irq ret: 0
> > (XEN) [18446743895.089242] hpet_broadcast_resume:710: i:1, cfg: 0xc104, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx)): 0, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx) + 4): 0xf38
> > (XEN) [18446743895.089254] hpet_broadcast_resume:696: i:2, 
> > hpet_events[i].msi.irq: 124, hpet_events[i].flags: 0
> > (XEN) [18446743895.089278] hpet_msi_write:286: iommu_update_ire_from_msi 
> > rc: 0
> > (XEN) [18446743895.089288] hpet_broadcast_resume:700: i:2, 
> > __hpet_setup_msi_irq ret: 0
> > (XEN) [18446743895.089316] hpet_broadcast_resume:710: i:2, cfg: 0xc104, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx)): 0, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx) + 4): 0xf58
> > (XEN) [18446743895.089327] hpet_broadcast_resume:696: i:3, 
> > hpet_events[i].msi.irq: 125, hpet_events[i].flags: 0
> > (XEN) [18446743895.089350] hpet_msi_write:286: iommu_update_ire_from_msi 
> > rc: 0
> > (XEN) [18446743895.089361] hpet_broadcast_resume:700: i:3, 
> > __hpet_setup_msi_irq ret: 0
> > (XEN) [18446743895.089390] hpet_broadcast_resume:710: i:3, cfg: 0xc104, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx)): 0, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx) + 4): 0xf78
> > (XEN) [18446743895.089401] hpet_broadcast_resume:696: i:4, 
> > hpet_events[i].msi.irq: 126, hpet_events[i].flags: 0
> > (XEN) [18446743895.089425] hpet_msi_write:286: iommu_update_ire_from_msi 
> > rc: 0
> > (XEN) [18446743895.089436] hpet_broadcast_resume:700: i:4, 
> > __hpet_setup_msi_irq ret: 0
> > (XEN) [18446743895.089465] hpet_broadcast_resume:710: i:4, cfg: 0xc104, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx)): 0, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx) + 4): 0xf98
> > (XEN) [18446743895.089476] hpet_broadcast_resume:696: i:5, 
> > hpet_events[i].msi.irq: 127, hpet_events[i].flags: 0
> > (XEN) [18446743895.089499] hpet_msi_write:286: iommu_update_ire_from_msi 
> > rc: 0
> > (XEN) [18446743895.089509] hpet_broadcast_resume:700: i:5, 
> > __hpet_setup_msi_irq ret: 0
> > (XEN) [18446743895.089540] hpet_broadcast_resume:710: i:5, cfg: 0xc104, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx)): 0, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx) + 4): 0xfb8
> > (XEN) [18446743895.089551] hpet_broadcast_resume:696: i:6, 
> > hpet_events[i].msi.irq: 128, hpet_events[i].flags: 0
> > (XEN) [18446743895.089574] hpet_msi_write:286: iommu_update_ire_from_msi 
> > rc: 0
> > (XEN) [18446743895.089584] hpet_broadcast_resume:700: i:6, 
> > __hpet_setup_msi_irq ret: 0
> > (XEN) [18446743895.089622] hpet_broadcast_resume:710: i:6, cfg: 0xc104, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx)): 0, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx) + 4): 0xfd8
> > (XEN) [18446743895.089633] hpet_broadcast_resume:696: i:7, 
> > hpet_events[i].msi.irq: 129, hpet_events[i].flags: 0
> > (XEN) [18446743895.089655] hpet_msi_write:286: iommu_update_ire_from_msi 
> > rc: 0
> > (XEN) [18446743895.089665] hpet_broadcast_resume:700: i:7, 
> > __hpet_setup_msi_irq ret: 0
> > (XEN) [18446743895.089702] hpet_broadcast_resume:710: i:7, cfg: 0xc104, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx)): 0, 
> > hpet_read32(HPET_Tn_ROUTE(hpet_events[i].idx) + 4): 0xff8

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab



 

