[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86: slightly reduce Meltdown band-aid overhead



>>> On 29.01.18 at 19:55, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 18/01/18 15:39, Jan Beulich wrote:
>> I'm not sure why I didn't do this right away: By avoiding to make any
>> of the cloned directmap PTEs global, there's no need to fiddle with
> 
> "avoiding to make" is a very odd way of phrasing this.  Something like
> "By avoiding the use of global PTEs in the cloned directmap, " perhaps?

Changed.

>> CR4.PGE on any of the entry paths. Only the exit paths need to flush
>> global mappings.
>>
>> The reduced flushing, however, implies that we now need to have
>> interrupts off on all entry paths until after the page table switch, so
>> that flush IPIs can't arrive with the restricted page tables still
>> active, but only a non-global flush happening with the CR3 loads.
> 
> I can't parse this last clause in the context of the sentence, which
> looks to finish after "active".

There are two different aspects making necessary to keep interrupts
of, both of which I try to name in each half sentence. I would re-
phrase this to

"The reduced flushing, however, implies that we now need to have
 interrupts off on all entry paths until after the page table switch:
 With the CR3 loads being only non-global flushes, flush IPIs mustn't
 arrive with the restricted page tables still active."

Is that any better?

>> Along those lines the "sync" IPI after L4 entry updates now needs to become a
>> real (and global) flush IPI, so that inside Xen we'll also pick up such
>> changes.
> 
> The entry paths don't tick the TLB clock, so we are in no worse of a
> position than before.  IOW, I don't see why this needs to change to
> being a FLUSH_GLOBAL.

In your later reply you talk about flushing (global) user mappings -
that's another way to put it, just that I didn't want to special case
user mappings here. From the perspective of this change, it doesn't
matter what global mappings there might be. All we care about is
that despite the now non-global flushes we still get rid of global
TLB entries hanging off of changed L4 entries.

>> @@ -1009,8 +1010,17 @@ void __init smp_prepare_cpus(unsigned in
>>      if ( rc )
>>          panic("Error %d setting up PV root page table\n", rc);
>>      if ( per_cpu(root_pgt, 0) )
>> +    {
>>          get_cpu_info()->pv_cr3 = __pa(per_cpu(root_pgt, 0));
>>  
>> +        /*
>> +         * All entry points which may need to switch page tables have to 
>> start
>> +         * with interrupts off. Re-write what pv_trap_init() has put there.
>> +         */
>> +        _set_gate(idt_table + LEGACY_SYSCALL_VECTOR, SYS_DESC_irq_gate, 3,
>> +                  &int80_direct_trap);
> 
> The int82 path is also a trap gate.  Given how subtle this is, and how
> hard a resulting crash would be to debug, I think it would be better to
> unilaterally switch both to being interrupt gates.

The int82 path is of no interest: As the comment says, we care about
interrupts off only on entry paths where we switch page tables (back
from the restricted ones).

> Neither are fastpaths, so the single extra sti in their execution paths
> will be negligible.

How is a system call path not performance relevant? Intel's ORM
doesn't say anything about STI afaics, but I assume it's
microcoded as much as it is for AMD (who explicitly say so).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.