Re: [PATCH 4/5] x86/pv: Optimise the segment context switching paths


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 11 Sep 2020 13:53:12 +0100
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 11 Sep 2020 12:53:29 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 11/09/2020 10:49, Jan Beulich wrote:
> On 09.09.2020 11:59, Andrew Cooper wrote:
>> Save the segment selectors with explicit asm, rather than with read_sreg().
>> This permits the use of MOV's m16 encoding, which avoids indirecting the
>> selector value through a register.
> Instead of this, how about making read_sreg() look like
>
> #define read_sreg(val, name) ({                                  \
>     if ( sizeof(val) == 2 )                                      \
>         asm volatile ( "mov %%" STR(name) ",%0" : "=m" (val) );  \
>     else                                                         \
>         asm volatile ( "mov %%" STR(name) ",%k0" : "=r" (val) ); \
> })
>
> which then also covers read_registers()? I have a full patch
> ready to send, if you agree.

That will go wrong for

    uint16_t ds; read_sreg(ds, ds);

as it will force the variable to be spilled onto the stack.
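Concretely (codegen sketch, not taken from a real build):

    uint16_t ds;
    asm volatile ( "mov %%ds, %0" : "=m" (ds) );  /* e.g. mov %ds, -2(%rsp) */

forces a stack slot for 'ds', even when every subsequent use wants the
value in a register.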

I don't think this is a clever move.


Furthermore, it is bad enough that read_sreg() already takes one magic
parameter which doesn't follow normal C rules (renaming it to READ_SREG()
would be an improvement), but this now adds a second, which is a capture
by name.
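To spell out the objection (illustrative only, using the proposed macro
above):

    uint16_t sel;
    read_sreg(sel, fs);  /* 'fs' is not a C expression - it gets pasted
                          * into the asm template via STR(fs) - and 'sel'
                          * is written by name, not passed by address. */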

I'm afraid that is a firm no from me.


There is one other place where we read all segment registers at once.
Maybe a static inline save_sregs(struct cpu_user_regs *) hiding the asm
block would work, but I'm not necessarily convinced of this plan either.
At least it would cleanly separate the "I've obviously got a memory
operand" logic from the "I almost certainly want it in a register
anyway" logic.

>
>> For {save,load}_segments(), opencode the fs/gs helpers, as the optimiser is
>> unable to rearrange the logic down to a single X86_CR4_FSGSBASE check.  This
>> removes several jumps and creates bigger basic blocks.
>>
>> In load_segments(), optimise GS base handling substantially.  The call to
>> svm_load_segs() already needs gsb/gss the correct way around, so hoist the
>> logic for the later path to use it as well.  Swapping the inputs in GPRs is
>> far more efficient than using SWAPGS.
>>
>> Previously, there was optionally one SWAPGS from the user/kernel mode check,
>> two SWAPGS's in write_gs_shadow() and two WRGSBASE's.  Updates to GS (4 or 5
>> here) in quick succession stall all contemporary pipelines repeatedly.
>> (Intel Core/Xeon pipelines have segment register renaming[1], so can
>> continue to speculatively execute with one GS update in flight.  Other
>> pipelines cannot have two updates in flight concurrently, and must stall
>> dispatch of the second until the first has retired.)
>>
>> Rewrite the logic to have exactly two WRGSBASEs and one SWAPGS, which
>> removes two stalls on all contemporary processors.
>>
>> Although modest, the resulting delta is:
>>
>>   add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-106 (-106)
>>   Function                                     old     new   delta
>>   paravirt_ctxt_switch_from                    235     198     -37
>>   context_switch                              3582    3513     -69
>>
>> in a common path.
>>
>> [1] https://software.intel.com/security-software-guidance/insights/deep-dive-intel-analysis-speculative-behavior-swapgs-and-segment-registers
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Preferably re-based onto said change, and maybe with another adjustment
> (see below)
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
>
>> @@ -1556,18 +1557,24 @@ static void load_segments(struct vcpu *n)
>>                     : [ok] "+r" (all_segs_okay)          \
>>                     : [_val] "rm" (val) )
>>  
>> -#ifdef CONFIG_HVM
>> -    if ( cpu_has_svm && !compat && (uregs->fs | uregs->gs) <= 3 )
>> +    if ( !compat )
>>      {
>> -        unsigned long gsb = n->arch.flags & TF_kernel_mode
>> -            ? n->arch.pv.gs_base_kernel : n->arch.pv.gs_base_user;
>> -        unsigned long gss = n->arch.flags & TF_kernel_mode
>> -            ? n->arch.pv.gs_base_user : n->arch.pv.gs_base_kernel;
>> +        gsb = n->arch.pv.gs_base_kernel;
>> +        gss = n->arch.pv.gs_base_user;
>> +
>> +        /*
>> +         * Figure out which way around gsb/gss want to be.  gsb needs to be
>> +         * the active context, and gss needs to be the inactive context.
>> +         */
>> +        if ( !(n->arch.flags & TF_kernel_mode) )
>> +            SWAP(gsb, gss);
>>  
>> -        fs_gs_done = svm_load_segs(n->arch.pv.ldt_ents, LDT_VIRT_START(n),
>> -                                   n->arch.pv.fs_base, gsb, gss);
>> +        if ( IS_ENABLED(CONFIG_HVM) && cpu_has_svm &&
> The change from #ifdef to IS_ENABLED() wants mirroring to the
> prefetching site imo. I wonder though whether the adjustment is a
> good idea: The declaration lives in svm.h, and I would view it as
> quite reasonable for that header to not get included in !HVM builds
> (there may be a lot of disentangling to do to get there, but still).

I'm not overly fussed, but there will absolutely have to be HVM stubs
for normal code.  I don't see why we should special-case svm_load_segs()
to not have a stub.
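i.e. something along these lines (signature sketched from the hunk above,
not checked against the header):

    #ifndef CONFIG_HVM
    static inline bool svm_load_segs(unsigned int ldt_ents,
                                     unsigned long ldt_base,
                                     unsigned long fs_base,
                                     unsigned long gs_base,
                                     unsigned long gs_shadow)
    {
        ASSERT_UNREACHABLE();
        return false;
    }
    #endif

which keeps the IS_ENABLED(CONFIG_HVM) check compiling in !HVM builds,
while the call itself gets discarded as dead code.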

~Andrew
