
Re: [Xen-devel] [PATCH v4 7/7] xen/x86: use PCID feature



On 03/04/18 09:14, Juergen Gross wrote:
> On 29/03/18 17:37, Jan Beulich wrote:
>>>>> On 29.03.18 at 17:15, <jgross@xxxxxxxx> wrote:
>>> On 29/03/18 16:19, Jan Beulich wrote:
>>>>>>> On 27.03.18 at 11:07, <jgross@xxxxxxxx> wrote:
>>>>> @@ -102,7 +103,21 @@ void write_cr3_cr4(unsigned long cr3, unsigned long cr4)
>>>>>      t = pre_flush();
>>>>>  
>>>>>      if ( read_cr4() & X86_CR4_PGE )
>>>>> +        /*
>>>>> +         * X86_CR4_PGE set means PCID is inactive.
>>>>> +         * We have to purge the TLB via flipping cr4.pge.
>>>>> +         */
>>>>>          write_cr4(cr4 & ~X86_CR4_PGE);
>>>>> +    else if ( cpu_has_invpcid )
>>>>> +        /*
>>>>> +         * If we are using PCID purge the TLB via INVPCID as loading cr3
>>>>> +         * will affect the current PCID only.
>>>>
>>>> s/current/new/ ?
>>>
>>> Okay.
>>>
>>>>
>>>>> +         * If INVPCID is not supported we don't use PCIDs so loading cr3
>>>>> +         * will purge the TLB (we are in the "global pages off" branch).
>>>>> +         * invpcid_flush_all_nonglobals() seems to be faster than
>>>>> +         * invpcid_flush_all().
>>>>> +         */
>>>>> +        invpcid_flush_all_nonglobals();
>>>>>  
>>>>>      asm volatile ( "mov %0, %%cr3" : : "r" (cr3) : "memory" );
>>>>
>>>> What about the TLB entries that have been re-created between
>>>> the INVPCID and the write of CR3? It's not obvious to me that
>>>> leaving them around is not going to be a problem subsequently,
>>>> especially since you write cr3 frequently with the no-flush bit set.
>>>
>>> The no-flush bit should not make any difference here. It controls only
>>> flushing of TLB-entries with the new PCID.
>>
>> Right, but in a subsequent write to CR3 you may make active again
>> what was the old PCID here. This is in particular so for PCID 0 (which
>> has dual use).
>>
>>>> Don't you need to do a single context invalidation for the prior
>>>> PCID (if different from the new one)?
>>>
>>> Hmm, I don't think so, as the mov to %cr3 is a serializing instruction
>>> acting as a speculation barrier. So the only TLB entries which could
>>> survive would be the ones for the few instruction bytes after the
>>> invpcid instruction until the end of the mov to %cr3. Those are harmless
>>> as they are associated with the hypervisor PCID value, so there is no
>>> risk of any data leak to a guest. Maybe a comment explaining that would
>>> be a good idea.
>>
>> Well, to be honest I don't trust in things like "speculation barrier"
>> anymore, at least not as far as things ahead of the barrier go.
>> Who knows what exactly the CPU does (and hence which TLB
>> entries it creates) between the INVPCID and the CR3 write. I
>> don't think we can blindly assume only entries for Xen mappings
>> could be created during that window.
> 
> Those speculation barriers are one of the main mitigations for Spectre.
> So either we don't think Spectre can be mitigated by using those or we
> should trust the barriers to be effective in this case, too, IMHO.
> 
> Which documented behavior of the processor are you going to trust? I
> think as long as there are no known errata in this regard we should
> assume the cpu will behave as documented. And mov to %cr3 is documented
> to be serializing, and serializing instructions are documented to be
> speculation barriers.

I have done my standard simple performance test with an extra
invpcid_flush_single_context() added for the case where the PCID changes
in write_cr3_cr4().

Performance impact seems to be negligible. OTOH this isn't really
surprising, as the only case where the additional flush would be needed
is a vcpu switch where the PCIDs differ: either the old or the new vcpu
(but not both) needs to be in user mode of an XPTI domain. Guest address
space switches always happen with PCID 0 (the guest is in kernel mode).

So I'm adding the additional flush to the patch. Better safe than sorry.
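
For illustration, a rough sketch (not the final patch) of how the extra
invpcid_flush_single_context() could slot into write_cr3_cr4(). The old
PCID is read back from %cr3 purely for the sketch, the 0xfff mask is just
the architectural PCID field of cr3, and post_flush(), the exact placement
of the new flush and the omitted IRQ save/restore are assumptions on my
side:

void write_cr3_cr4(unsigned long cr3, unsigned long cr4)
{
    unsigned long old_cr3;
    u32 t = pre_flush();

    /* PCID is architecturally the low 12 bits of %cr3. */
    asm volatile ( "mov %%cr3, %0" : "=r" (old_cr3) );

    if ( read_cr4() & X86_CR4_PGE )
        /* PCID inactive: purge the TLB by flipping cr4.pge. */
        write_cr4(cr4 & ~X86_CR4_PGE);
    else if ( cpu_has_invpcid )
        /* PCID active: loading cr3 below affects the new PCID only. */
        invpcid_flush_all_nonglobals();

    asm volatile ( "mov %0, %%cr3" : : "r" (cr3) : "memory" );

    if ( cpu_has_invpcid && (old_cr3 & 0xfff) != (cr3 & 0xfff) )
        /*
         * Drop any TLB entries the CPU may have created for the old
         * PCID between the INVPCID above and the write of cr3.
         */
        invpcid_flush_single_context(old_cr3 & 0xfff);

    if ( read_cr4() != cr4 )
        write_cr4(cr4);

    post_flush(t);
}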


Juergen
