Re: IOMMU faults after S3
On Thu, Apr 02, 2026 at 09:01:12AM +0200, Jan Beulich wrote:
> On 02.04.2026 01:17, Marek Marczykowski-Górecki wrote:
> > On Wed, Apr 01, 2026 at 10:52:37AM +0200, Jan Beulich wrote:
> >> On 01.04.2026 09:14, Jan Beulich wrote:
> >>> On 27.03.2026 11:19, Marek Marczykowski-Górecki wrote:
> >>>> I noticed that on some systems, there are a lot of IOMMU faults after
> >>>> S3. I can see it also on a laptop with MTL, but it affects also the ADL
> >>>> gitlab runner:
> >>>>
> >>>> https://gitlab.com/xen-project/hardware/xen/-/jobs/13661033722
> >>>> (XEN) [ 37.201160] [VT-D]DMAR:[DMA Write] Request device
> >>>> [0000:00:1e.6] fault addr 0
> >>>> (XEN) [ 37.201164] [VT-D]DMAR: reason 02 - Present bit in context
> >>>> entry is clear
> >>>> (XEN) [ 37.202332] [VT-D]DMAR:[DMA Write] Request device
> >>>> [0000:00:1e.6] fault addr 0
> >>>> (XEN) [ 37.202339] [VT-D]DMAR: reason 02 - Present bit in context
> >>>> entry is clear
> >>>>
> >>>> Interestingly, the 0000:00:1e.6 device is not even listed by lspci.
> >>>>
> >>>> The issue is present only on staging, not staging-4.21.
> >>>>
> >>>> Bisect says:
> >>>>
> >>>> 5ec93b2f19ff8873fca65d38c1164b0a56d3898b is the first bad commit
> >>>> commit 5ec93b2f19ff8873fca65d38c1164b0a56d3898b
> >>>> Author: Jan Beulich <jbeulich@xxxxxxxx>
> >>>> Date: Thu Jan 22 14:13:35 2026 +0100
> >>>>
> >>>>     x86/HPET: drop .set_affinity hook
> >>>
> >>> Looking into this, I find several things I can't quite understand (yet).
> >>> First there is
> >>>
> >>> (XEN) [000000456c0fe39f] Disabling HPET for being unreliable
> >>>
> >>> which looks to only affect clocksource selection, but not use as
> >>> broadcast source for CPU-idle management. (This may be an independent
> >>> issue.)
> >>>
> >>> Then there is
> >>>
> >>> (XEN) [ 2.760248] HPET: 8 timers usable for broadcast (8 total)
> >>>
> >>> which should only occur on ARAT-incapable systems. That should only be
> >>> older hardware. (On my much older Skylake I don't see this line, for
> >>> example.) What does CPUID leaf 6 have on this system? Sadly xen-cpuid
> >>> is purely featureset based, and hence doesn't expose info about that
> >>> leaf. The leaf also isn't exposed to domains, so CPUID output in Dom0
> >>> isn't useful to look at either. It would need to be CPUID output on a
> >>> bare metal kernel.
> >>>
> >>> Further I suspect the fingered commit may only have uncovered an issue
> >>> elsewhere. I don't think we clear any context table entries during
> >>> suspend or resume. Hence in
> >>>
> >>> (XEN) [ 20.554813] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6]
> >>> fault addr 0
> >>> (XEN) [ 20.554819] [VT-D]DMAR: reason 02 - Present bit in context entry
> >>> is clear
> >>>
> >>> the latter message is confusing me.
> >>>
> >>> The fault address being zero may, otoh, be a hint of hpet_msi_write()
> >>> never having run post-resume. Which may be the connection to the
> >>> dropping of hpet_msi_set_affinity(), as that did call that function.
> >>
> >> There clearly is an issue with the handling of the max_cstate variable,
> >> but I expect you don't use xenpm to limit usable C-states (there clearly
> >> is no respective command line option in the log you referenced)?
> >
> > No, I don't think so.
> >
> >> From what the log has, I conclude hpet_broadcast_resume() is called.
> >
> > I don't think so... I applied changes as attached and got this on
> > resume:
> >
> > (XEN) [ 69.486120] Enabling non-boot CPUs ...
> > (XEN) [ 69.486404] mwait-idle: state C1 is disabled
> > (XEN) [ 69.587869] mwait-idle: state C1 is disabled
> > (XEN) [ 69.588008] mwait-idle: state C1 is disabled
> > (XEN) [ 69.689438] mwait-idle: state C1 is disabled
> > (XEN) [ 69.689608] mwait-idle: state C1 is disabled
> > (XEN) [ 69.791066] mwait-idle: state C1 is disabled
> > (XEN) [ 69.791334] mwait-idle: state C1 is disabled
> > (XEN) [ 69.892938] mwait-idle: state C1 is disabled
> > (XEN) [ 69.893209] mwait-idle: state C1 is disabled
> > (XEN) [ 69.994890] mwait-idle: state C1 is disabled
> > (XEN) [ 69.995096] mwait-idle: state C1 is disabled
> > (XEN) [ 70.096638] mwait-idle: state C1 is disabled
> > (XEN) [ 70.096915] mwait-idle: state C1 is disabled
> > (XEN) [ 70.097093] mwait-idle: state C1 is disabled
> > (XEN) [ 70.097272] mwait-idle: state C1 is disabled
> > (XEN) [ 70.203357] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6]
> > fault addr 0
> > (XEN) [ 70.203363] [VT-D]DMAR: reason 02 - Present bit in context entry
> > is clear
>
> That was on the serial console or from xl dmesg? I ask because console_resume()
> runs after time_resume(), so nothing appearing on the serial console would be
> expected (I think).

Ah, right, that's why I don't see my messages.
The xl dmesg output (from MTL this time):

(XEN) [ 123.477511] Entering ACPI S3 state.
(XEN) [18446743903.571842] _disable_pit_irq:2649: using_pit: 0, cpu_has_apic: 1
(XEN) [18446743903.571856] _disable_pit_irq:2659: cpuidle_using_deep_cstate: 1, boot_cpu_has(X86_FEATURE_XEN_ARAT): 0
(XEN) [18446743903.571866] _disable_pit_irq:2662: init: 0
(XEN) [18446743903.571877] hpet_broadcast_resume:661: hpet_events: ffff83046bc1f080
(XEN) [18446743903.572020] hpet_broadcast_resume:672: num_hpets_used: 8
(XEN) [18446743903.572029] hpet_broadcast_resume:690: cfg: 0x1
(XEN) [18446743903.572040] hpet_broadcast_resume:695: i:0, hpet_events[i].msi.irq: 122, hpet_events[i].flags: 0
(XEN) [18446743903.572081] hpet_broadcast_resume:706: i:0, cfg: 0xc134
(XEN) [18446743903.572089] hpet_broadcast_resume:695: i:1, hpet_events[i].msi.irq: 123, hpet_events[i].flags: 0
(XEN) [18446743903.572123] hpet_broadcast_resume:706: i:1, cfg: 0xc104
(XEN) [18446743903.572132] hpet_broadcast_resume:695: i:2, hpet_events[i].msi.irq: 124, hpet_events[i].flags: 0
(XEN) [18446743903.572167] hpet_broadcast_resume:706: i:2, cfg: 0xc104
(XEN) [18446743903.572175] hpet_broadcast_resume:695: i:3, hpet_events[i].msi.irq: 125, hpet_events[i].flags: 0
(XEN) [18446743903.572210] hpet_broadcast_resume:706: i:3, cfg: 0xc104
(XEN) [18446743903.572218] hpet_broadcast_resume:695: i:4, hpet_events[i].msi.irq: 126, hpet_events[i].flags: 0
(XEN) [18446743903.572252] hpet_broadcast_resume:706: i:4, cfg: 0xc104
(XEN) [18446743903.572261] hpet_broadcast_resume:695: i:5, hpet_events[i].msi.irq: 127, hpet_events[i].flags: 0
(XEN) [18446743903.572294] hpet_broadcast_resume:706: i:5, cfg: 0xc104
(XEN) [18446743903.572303] hpet_broadcast_resume:695: i:6, hpet_events[i].msi.irq: 128, hpet_events[i].flags: 0
(XEN) [18446743903.572338] hpet_broadcast_resume:706: i:6, cfg: 0xc104
(XEN) [18446743903.572347] hpet_broadcast_resume:695: i:7, hpet_events[i].msi.irq: 129, hpet_events[i].flags: 0
(XEN) [18446743903.572382] hpet_broadcast_resume:706: i:7, cfg: 0xc104

And the xen-cpuid -p output from this system:

Xen reports there are maximum 120 leaves and 2 MSRs
Raw policy: 48 leaves, 2 MSRs
 CPUID:
  leaf     subleaf  -> eax      ebx      ecx      edx
  00000000:ffffffff -> 00000023:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000a06a4:20800800:77fafbff:bfebfbff
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
  00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
  00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
  00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
  00000005:ffffffff -> 00000040:00000040:00000003:11112020
  00000006:ffffffff -> 00dfcff7:00000002:00000409:00040003
  00000007:00000000 -> 00000002:239c27eb:994007ac:fc18c410
  00000007:00000001 -> 40400910:00000001:00000000:00040000
  00000007:00000002 -> 00000000:00000000:00000000:0000003f
  0000000a:ffffffff -> 07300805:00000000:00000007:00008603
  0000000b:00000000 -> 00000001:00000002:00000100:00000020
  0000000b:00000001 -> 00000007:00000016:00000201:00000020
  0000000d:00000000 -> 00000207:00000000:00000a88:00000000
  0000000d:00000001 -> 0000000f:00000000:00019900:00000000
  0000000d:00000002 -> 00000100:00000240:00000000:00000000
  0000000d:00000008 -> 00000080:00000000:00000001:00000000
  0000000d:00000009 -> 00000008:00000a80:00000000:00000000
  0000000d:0000000b -> 00000010:00000000:00000001:00000000
  0000000d:0000000c -> 00000018:00000000:00000001:00000000
  0000000d:0000000f -> 00000328:00000000:00000001:00000000
  0000000d:00000010 -> 00000008:00000000:00000001:00000000
  80000000:ffffffff -> 80000008:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000121:2c100800
  80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
  80000003:ffffffff -> 6c552029:20617274:35312037:00004835
  80000006:ffffffff -> 00000000:00000000:08007040:00000000
  80000007:ffffffff -> 00000000:00000000:00000000:00000100
  80000008:ffffffff -> 0000302e:00000000:00000000:00000000
 MSRs:
  index    -> value
  000000ce -> 0000000080000000
  0000010a -> 000000000d89fd6b
Host policy: 41 leaves, 2 MSRs
 CPUID:
  leaf     subleaf  -> eax      ebx      ecx      edx
  00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000a06a4:20800800:77fafbff:bfebfbff
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
  00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
  00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
  00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
  00000005:ffffffff -> 00000040:00000040:00000003:11112020
  00000006:ffffffff -> 00dfcff7:00000002:00000409:00040003
  00000007:00000000 -> 00000002:239c27eb:994007ac:fc18c410
  00000007:00000001 -> 40000910:00000001:00000000:00040000
  00000007:00000002 -> 00000000:00000000:00000000:0000003f
  0000000b:00000000 -> 00000001:00000002:00000100:00000020
  0000000b:00000001 -> 00000007:00000016:00000201:00000020
  0000000d:00000000 -> 00000207:00000000:00000a88:00000000
  0000000d:00000001 -> 0000000f:00000000:00000000:00000000
  0000000d:00000002 -> 00000100:00000240:00000000:00000000
  0000000d:00000009 -> 00000008:00000a80:00000000:00000000
  80000000:ffffffff -> 80000008:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000121:2c100800
  80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
  80000003:ffffffff -> 6c552029:20617274:35312037:00004835
  80000006:ffffffff -> 00000000:00000000:08007040:00000000
  80000007:ffffffff -> 00000000:00000000:00000000:00000100
  80000008:ffffffff -> 0000302e:00000000:00000000:00000000
 MSRs:
  index    -> value
  000000ce -> 0000000080000000
  0000010a -> 400000000d89fd6b
PV Max policy: 58 leaves, 2 MSRs
 CPUID:
  leaf     subleaf  -> eax      ebx      ecx      edx
  00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000a06a4:00800800:f6f83203:1fc9cbf5
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
  00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
  00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
  00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
  00000007:00000000 -> 00000002:218c0329:18400700:ac004410
  00000007:00000001 -> 00000810:00000000:00000000:00000000
  00000007:00000002 -> 00000000:00000000:00000000:00000021
  0000000d:00000000 -> 00000007:00000000:00000340:00000000
  0000000d:00000001 -> 00000007:00000000:00000000:00000000
  0000000d:00000002 -> 00000100:00000240:00000000:00000000
  80000000:ffffffff -> 80000021:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000123:28100800
  80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
  80000003:ffffffff -> 6c552029:20617274:35312037:00004835
  80000006:ffffffff -> 00000000:00000000:08007040:00000000
  80000007:ffffffff -> 00000000:00000000:00000000:00000100
  80000008:ffffffff -> 0000302e:00001000:00000000:00000000
 MSRs:
  index    -> value
  000000ce -> 0000000080000000
  0000010a -> 400000001d0ae167
HVM Max policy: 65 leaves, 2 MSRs
 CPUID:
  leaf     subleaf  -> eax      ebx      ecx      edx
  00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000a06a4:00800800:f7fa3223:1fcbfbff
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
  00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
  00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
  00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
  00000007:00000000 -> 00000002:219c07ab:9840070c:bc004410
  00000007:00000001 -> 00000810:00000000:00000000:00000000
  00000007:00000002 -> 00000000:00000000:00000000:00000037
  0000000d:00000000 -> 00000207:00000000:00000a88:00000000
  0000000d:00000001 -> 0000000f:00000000:00000000:00000000
  0000000d:00000002 -> 00000100:00000240:00000000:00000000
  0000000d:00000009 -> 00000008:00000a80:00000000:00000000
  80000000:ffffffff -> 80000021:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000123:2c100800
  80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
  80000003:ffffffff -> 6c552029:20617274:35312037:00004835
  80000006:ffffffff -> 00000000:00000000:08007040:00000000
  80000007:ffffffff -> 00000000:00000000:00000000:00000100
  80000008:ffffffff -> 0000302e:00101000:00000000:00000000
 MSRs:
  index    -> value
  000000ce -> 0000000080000000
  0000010a -> 400000001d0ae167
PV Default policy: 33 leaves, 2 MSRs
 CPUID:
  leaf     subleaf  -> eax      ebx      ecx      edx
  00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000a06a4:00800800:f6d83203:1fc9cbf5
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
  00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
  00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
  00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
  00000007:00000000 -> 00000002:218c0329:00400700:ac004410
  00000007:00000001 -> 00000810:00000000:00000000:00000000
  00000007:00000002 -> 00000000:00000000:00000000:00000021
  0000000d:00000000 -> 00000007:00000000:00000340:00000000
  0000000d:00000001 -> 00000007:00000000:00000000:00000000
  0000000d:00000002 -> 00000100:00000240:00000000:00000000
  80000000:ffffffff -> 80000008:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000121:28100800
  80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
  80000003:ffffffff -> 6c552029:20617274:35312037:00004835
  80000006:ffffffff -> 00000000:00000000:08007040:00000000
  80000008:ffffffff -> 0000302e:00001000:00000000:00000000
 MSRs:
  index    -> value
  000000ce -> 0000000080000000
  0000010a -> 400000000d08e163
HVM Default policy: 40 leaves, 2 MSRs
 CPUID:
  leaf     subleaf  -> eax      ebx      ecx      edx
  00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000a06a4:00800800:f7fa3203:1fcbfbff
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
  00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
  00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
  00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
  00000007:00000000 -> 00000002:219c07ab:8040070c:bc004410
  00000007:00000001 -> 00000810:00000000:00000000:00000000
  00000007:00000002 -> 00000000:00000000:00000000:00000037
  0000000d:00000000 -> 00000207:00000000:00000a88:00000000
  0000000d:00000001 -> 0000000f:00000000:00000000:00000000
  0000000d:00000002 -> 00000100:00000240:00000000:00000000
  0000000d:00000009 -> 00000008:00000a80:00000000:00000000
  80000000:ffffffff -> 80000008:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000121:2c100800
  80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
  80000003:ffffffff -> 6c552029:20617274:35312037:00004835
  80000006:ffffffff -> 00000000:00000000:08007040:00000000
  80000008:ffffffff -> 0000302e:00101000:00000000:00000000
 MSRs:
  index    -> value
  000000ce -> 0000000080000000
  0000010a -> 400000000d08e163

> Without hpet_broadcast_resume() running, I don't think I could explain how the
> channels (and their FSB interrupts) would get enabled.
>
> Jan

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment: signature.asc
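
For readers who want to check the raw numbers quoted above, here is a minimal sketch that decodes three of them. It is not taken from the thread or from Xen, and all function and table names in it are hypothetical. It rests on these assumptions: the ARAT flag is CPUID.06H:EAX bit 2 (per the Intel SDM); the per-channel "cfg" values printed at hpet_broadcast_resume:706 are HPET Timer N Configuration and Capability registers (bit layout per the HPET 1.0a specification; the debug patch itself is not shown, so this is a guess); and the very large log timestamps are a 64-bit nanosecond count printed as unsigned.

```python
# Hypothetical helpers for decoding values quoted in the message above.
# Nothing here is Xen code; names and the choice of fields are illustrative.

ARAT_BIT = 2  # CPUID.06H:EAX[2], "Always Running APIC Timer" (Intel SDM)


def has_arat(leaf6_eax: int) -> bool:
    """Return True if the ARAT bit is set in CPUID leaf 6 EAX."""
    return bool((leaf6_eax >> ARAT_BIT) & 1)


# HPET Timer N Configuration and Capability register, low 16 bits
# (HPET 1.0a). Bits 9-13 (interrupt route) are a field and are skipped.
HPET_TN_FLAGS = {
    1: "level-triggered",
    2: "interrupt enabled",
    3: "periodic mode",
    4: "periodic capable",
    5: "64-bit capable",
    6: "value set",
    8: "forced 32-bit mode",
    14: "FSB delivery enabled",
    15: "FSB delivery capable",
}


def decode_hpet_tn_cfg(cfg: int) -> list[str]:
    """List the single-bit flags set in a Timer N config value."""
    return [name for bit, name in HPET_TN_FLAGS.items() if (cfg >> bit) & 1]


def as_signed_ns(stamp: str) -> int:
    """Reinterpret a 'seconds.fraction' log stamp as signed 64-bit ns.

    Assumes the stamp was derived from an unsigned 64-bit nanosecond count.
    """
    sec, frac = stamp.split(".")
    ns = int(sec) * 10**9 + int(frac.ljust(9, "0"))
    return ns - 2**64 if ns >= 2**63 else ns


if __name__ == "__main__":
    print("leaf 6 EAX 0x00dfcff7, ARAT:", has_arat(0x00DFCFF7))
    for cfg in (0xC134, 0xC104):
        print(hex(cfg), decode_hpet_tn_cfg(cfg))
    print("18446743903.571842 as signed ns:",
          as_signed_ns("18446743903.571842"))
```

Under those assumptions, leaf 6 EAX in both the raw and host policies has the ARAT bit set, while the debug output shows boot_cpu_has(X86_FEATURE_XEN_ARAT) as 0; the channel values decode as interrupt enabled, 32-bit mode and FSB delivery enabled; and the timestamps read as roughly -170 s. How these observations relate to the faults is left to the thread.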