[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/2] x86: use invpcid to do global flushing



On 05/03/18 13:57, Andrew Cooper wrote:
> On 05/03/18 12:54, Jan Beulich wrote:
>>>>> On 05.03.18 at 13:35, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 05/03/18 12:06, Juergen Gross wrote:
>>>> On 05/03/18 12:50, Andrew Cooper wrote:
>>>>> On 05/03/18 11:31, Jan Beulich wrote:
>>>>>>>>> On 05.03.18 at 10:50, <wei.liu2@xxxxxxxxxx> wrote:
>>>>>>> Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
>>>>>> No description at all? I'd at least expect mention of how much of a
>>>>>> performance win this is (for whichever hardware you happen to
>>>>>> know that).
>>>>>>
>>>>>>> @@ -120,11 +121,24 @@ unsigned int flush_area_local(const void *va, 
>>>>>>> unsigned 
>>> int flags)
>>>>>>>          else
>>>>>>>          {
>>>>>>>              u32 t = pre_flush();
>>>>>>> -            unsigned long cr4 = read_cr4();
>>>>>>>  
>>>>>>> -            write_cr4(cr4 & ~X86_CR4_PGE);
>>>>>>> -            barrier();
>>>>>>> -            write_cr4(cr4);
>>>>>>> +            if ( !cpu_has_invpcid )
>>>>>>> +            {
>>>>>>> +                unsigned long cr4 = read_cr4();
>>>>>>> +
>>>>>>> +                write_cr4(cr4 & ~X86_CR4_PGE);
>>>>>>> +                barrier();
>>>>>>> +                write_cr4(cr4);
>>>>>>> +            }
>>>>>>> +            else
>>>>>>> +            {
>>>>>>> +                /*
>>>>>>> +                 * Using invpcid to flush all mappings works
>>>>>>> +                 * regardless of whether PCID is enabled or not.
>>>>>>> +                 * It is faster than read-modify-write CR4.
>>>>>>> +                 */
>>>>> Its a cr4 double write, rather than RMW.  We read from a cached value
>>>>> anyway, not from hardware.
>>>>>
>>>>>>> +                invpcid_flush_all();
>>>>>>> +            }
>>>>>> The reference to PCID in the comment isn't really meaningful imo.
>>>>>> PCID and INVPCID are independent features anyway. Also please
>>>>>> don't create artificially short comment lines.
>>>>>>
>>>>>> Generally I also think such if() conditions would better be inverted:
>>>>>> There's no reason to make the legacy form look as if it was
>>>>>> preferred.
>>>>>>
>>>>>> And then - what about the use in write_cr3() and the two uses that
>>>>>> remain after my XPTI follow-up series (which sadly looks to be stuck
>>>>>> for whatever reason), or (without that series) the write_cr3
>>>>>> assembler macro?
>>>>> I don't think it is safe to use invpcid when we're also switching cr3. 
>>>>> The new cr3 may have global pages with different translations, as they
>>>>> are guest controlled.
>>>> Can you elaborate a little bit more?
>>>>
>>>> How can a guest control any hypervisor mappings? As long as the new cr3
>>>> is being loaded before the TLB is flushed via INVPCID I can't see how
>>>> a problem should occur.
>>>>
>>>> In fact my series does exactly what Jan is asking above: it is replacing
>>>> the remaining cr4 based TLB flushing by INVPCID if possible. So in case
>>>> there is a flaw in my design please tell me.
>>> At the moment, we have guest and hypervisor controlled global mappings.
>>>
>>> The current switch is:
>>> cr4 &= ~PGE;
>>> cr3 = new_cr3;
>>> cr4 |= PGE;
>>>
>>> which means that all global mappings are flushed by the first action,
>>> and no new global mappings can come into existence.  We then switch to
>>> the new cr3 (again with global fully disabled), then allow global
>>> mappings to come back into existence.
>>>
>>> With the invpcid route, we switch via:
>>>
>>> cr3 = new_cr3;
>>> invpcid all+global;
>>>
>>> This has a race window where global mappings are active, and could
>>> mismatch what is in cr3.  This yields #MC on at least some hardware, and
>>> is specified to have undefined behaviour. 
>> Oh, right, this would be okay only without what used to be named
>> USER_MAPPINGS_ARE_GLOBAL (and what is now implied).
> 
> When we start using PCID for user mappings, then we don't need them to
> be global, at which point we can require/expect that the only global
> mappings are hypervisor ones which we expect to remain correct across a
> write to cr3.  However, if we do this, then we need to use a bit other
> than PAGE_GLOBAL to signify guest user mappings.
> 
> I think this is doable, but I don't think it is going to be trivial to
> get correct.

Why would we want to keep any global mappings at all? What are they good
for? Today the only case I could find where they make sense at all is
for 64-bit pv-guests to keep hypervisor mappings in the TLB when the
guest is switching between user and kernel mode. In all other cases the
complete TLB is flushed when cr3 is loaded with a new value (either due
to a Xen context switch or a guest cr3 load being performed by Xen).

With using PCID we can just keep cr4.pge being 0 and everything is fine.
No additional work required, we can just keep the G bit in the PTEs as a
flag which doesn't affect the TLB at all.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.