[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH] x86/amd: Add support for AMD TCE



Le 03/04/2025 à 16:08, Andrew Cooper a écrit :
> On 03/04/2025 1:58 pm, Jan Beulich wrote:
>> On 03.04.2025 14:44, Teddy Astie wrote:
>>> Signed-off-by: Teddy Astie <teddy.astie@xxxxxxxxxx>
>>> ---
>>> RFC:


>>>   - is this change actually safe ?
>> Well, before getting here with reading I was already about to ask this very
>> question. It's really you who needs to prove it.
>>
>>>   - should we add a tce/no-tce option to opt-in/out this feature ?
>> Unless we're entirely certain we got this right and didn't overlook any
>> corner case, perhaps better to do so.
>
> To bring across a quote of mine from Mattermost:
>
> "I'm reasonably sure our TLB handling algorithm is safe for it,
> following the PCID work we did for Meltdown"
>
> But, proving this is hard.
> > Some history: INVLPG flushing the entire paging structure cache
> (non-leaf mappings) was a last-minute "fix" to keep Windows working on
> the Pentium(?), where it had started using INVLPG from the 486(?) but
> with a logical error.
>
> AMD's TCE feature is "that's a hefty hit to keep around, so here's an
> option for the behaviour one would more reasonably expect from INVLPG".
>
> Anyway.  I have a suspicion that Intel's INVPCID no longer followed the
> INVLPG behaviour anyway, and that we were forced to account for that.
> However, I'm struggling to find confirmation one way or another in the SDM.
>
> Another mitigating factor is that, because we use recursive pagetables,
> we have to upgrade an INVLPG into a full flush anyway if we edited
> non-leaf entries.
>

Yes, while proving it on the hypervisor side may be doable, I am quite
unsure about PV guests.
Some calls to HYPERVISOR_mmuext_op incidentally call invlpg and alike
which could be affected with this change, as the guest can "assume" some
behavior aspects of invlpg.

>
> As to a cmdline option, there's cpuid=no-tce if we really really need
> it, but I don't think we want a dedicated TCE option.
>
>
>>> --- a/xen/arch/x86/smpboot.c
>>> +++ b/xen/arch/x86/smpboot.c
>>> @@ -372,6 +372,9 @@ void asmlinkage start_secondary(void *unused)
>>>
>>>       microcode_update_one();
>>>
>>> +    if ( boot_cpu_has(X86_FEATURE_TCE) )
>>> +        write_efer(read_efer() | EFER_TCE);
>> Same here. But I wonder if you couldn't set the bit in trampoline_efer.
>
> Yes, do set it in trampoline_efer, and drop this hunk.
>

Will do.

>
> If you add this, ...
>
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -2008,6 +2008,13 @@ void asmlinkage __init noreturn __start_xen(void)
>>      if ( cpu_has_pku )
>>          set_in_cr4(X86_CR4_PKE);
>>
>> +    if ( boot_cpu_has(X86_FEATURE_TCE) )
>
> ... the please also use it.
>

Yes, I forgot to change it.

---

Aside enabling this flag for Xen/PV guests, it can be useful to expose
it to the guests. While it's currently not going to change anything as
most of the related instructions are trapped and managed by the
hypervisor, it does affect the behavior of inside-guest INVLPGB if
enabled in the VMCB.

> ~Andrew

Teddy



Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.