Re: IOMMU faults after S3
On 02.04.2026 16:02, Marek Marczykowski-Górecki wrote:
> On Thu, Apr 02, 2026 at 12:23:08PM +0200, Jan Beulich wrote:
>> On 02.04.2026 11:42, Marek Marczykowski-Górecki wrote:
>>> On Thu, Apr 02, 2026 at 10:47:53AM +0200, Jan Beulich wrote:
>>>> On 02.04.2026 10:39, Jan Beulich wrote:
>>>>> On 02.04.2026 10:08, Marek Marczykowski-Górecki wrote:
>>>>>> The xl dmesg output (from MTL this time):
>>>>>>
>>>>>> (XEN) [ 123.477511] Entering ACPI S3 state.
>>>>>> (XEN) [18446743903.571842] _disable_pit_irq:2649: using_pit: 0, cpu_has_apic: 1
>>>>>> (XEN) [18446743903.571856] _disable_pit_irq:2659: cpuidle_using_deep_cstate: 1, boot_cpu_has(X86_FEATURE_XEN_ARAT): 0
>>>>>
>>>>> XEN_ARAT being off is the one odd aspect here. That'll want tracking down
>>>>> separately. As per xen-cpuid output (below) ARAT is available.
>>>>
>>>> For this you may want to also add logging to intel_init_arat(): Since opt_arat
>>>> can be false only due to command line option use, it can only be the function
>>>> not being called (which looks impossible on plain staging code), or cpu_has_arat
>>>> being false despite the xen-cpuid output that you supplied earlier (inexplicable
>>>> as well, at least for now).
>>>
>>> Hm, I got this:
>>>
>>> (XEN) [ 11.403340] intel_init_arat:674: opt_arat: 1, cpu_has_arat: 0
>>>
>>> so, cpu_has_arat=0 ...
>>> next lines are those, to hint when it happened in the boot process:
>>>
>>> (XEN) [ 11.409754] mwait-idle: MWAIT substates: 0x11112020
>>> (XEN) [ 11.416130] mwait-idle: v0.4.1 model 0xaa
>>> (XEN) [ 11.422396] mwait-idle: lapic_timer_reliable_states 0x2
>>>
>>> Looks like calculate_host_policy() runs much later...
>>
>> Hmm, yes, and that's the problem. The reason I don't see this is that a newer
>> version of [1] has this
>>
>> --- a/xen/arch/x86/cpu/common.c
>> +++ b/xen/arch/x86/cpu/common.c
>> @@ -628,6 +628,8 @@ void identify_cpu(struct cpuinfo_x86 *c)
>> }
>>
>> /* Now the feature flags better reflect actual CPU features! */
>> + if (c == &boot_cpu_data)
>> + calculate_host_policy();
>>
>> xstate_init(c);
>>
>> --- a/xen/arch/x86/cpu-policy.c
>> +++ b/xen/arch/x86/cpu-policy.c
>> @@ -384,7 +384,7 @@ void calculate_raw_cpu_policy(void)
>> /* Was already added by probe_cpuid_faulting() */
>> }
>>
>> -static void __init calculate_host_policy(void)
>> +void __init calculate_host_policy(void)
>> {
>> struct cpu_policy *p = &host_cpu_policy;
>>
>> @@ -959,6 +959,7 @@ static void __init calculate_hvm_def_pol
>>
>> void __init init_guest_cpu_policies(void)
>> {
>> + /* Do this a 2nd time to account for setup_{clear,force}_cpu_cap() uses. */
>> calculate_host_policy();
>>
>> if ( IS_ENABLED(CONFIG_PV) )
>>
>> and of course I'm doing my work (and my analysis) with that in place.
>
> FWIW, with this patch applied I get:
> (XEN) [18446743899.051851] _disable_pit_irq:2649: using_pit: 0, cpu_has_apic: 1
> (XEN) [18446743899.051865] _disable_pit_irq:2659: cpuidle_using_deep_cstate: 1, boot_cpu_has(X86_FEATURE_XEN_ARAT): 1
>
> And no IOMMU faults anymore.

Right, because then - as intended - HPET broadcast isn't used. You'd see them
again if you put "no-arat" on the command line. (And we really want to figure
out that issue, if at all possible.)

Jan
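
For readers following the thread, the ordering problem discussed above can be sketched as a toy model in C. This is not Xen code: all names below are hypothetical stand-ins, and it assumes (as the thread implies for the series referred to as [1]) that cpu_has_arat is read from the host policy, so it stays false until calculate_host_policy() has run.

```c
#include <stdbool.h>

/* Toy model only; every identifier here is a hypothetical stand-in. */
static const bool hw_arat = true;   /* what the hardware (xen-cpuid) reports */
static const bool opt_arat = true;  /* false only via a command line option */

static bool host_policy_arat;       /* stands in for cpu_has_arat */
static bool xen_arat;               /* stands in for X86_FEATURE_XEN_ARAT */

/* Fills in the host policy from the hardware view. */
static void calculate_host_policy(void)
{
    host_policy_arat = hw_arat;
}

/* Mimics the decision logged from intel_init_arat() in the thread. */
static void init_arat(void)
{
    if (opt_arat && host_policy_arat)
        xen_arat = true;
}

/*
 * Simulates one boot. With policy_first set, the host policy is
 * computed before init_arat() runs (the effect of the extra call
 * added to identify_cpu() in the patch). The second call models the
 * existing, later one in init_guest_cpu_policies().
 */
static bool boot(bool policy_first)
{
    host_policy_arat = false;
    xen_arat = false;

    if (policy_first)
        calculate_host_policy();

    init_arat();
    calculate_host_policy();

    return xen_arat;
}
```

In this model boot(false) leaves the ARAT flag clear even though the hardware supports it, because the later policy calculation comes too late for init_arat(); boot(true) sets it, matching the "XEN_ARAT: 0" versus "XEN_ARAT: 1" log lines quoted above.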