[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 4/5] x86/pv: Optimise to the segment context switching paths



On 09.09.2020 11:59, Andrew Cooper wrote:
> Save the segment selectors with explicit asm, rather than with read_sreg().
> This permits the use of MOV's m16 encoding, which avoids indirecting the
> selector value through a register.

Instead of this, how about making read_sreg() look like

#define read_sreg(val, name) ({                                  \
    if ( sizeof(val) == 2 )                                      \
        asm volatile ( "mov %%" STR(name) ",%0" : "=m" (val) );  \
    else                                                         \
        asm volatile ( "mov %%" STR(name) ",%k0" : "=r" (val) ); \
})

which then also covers read_registers()? I have a full patch
ready to send, if you agree.

> For {save,load}_segments(), opencode the fs/gs helpers, as the optimiser is
> unable to rearrange the logic down to a single X86_CR4_FSGSBASE check.  This
> removes several jumps and creates bigger basic blocks.
> 
> In load_segments(), optimise GS base handling substantially.  The call to
> svm_load_segs() already needs gsb/gss the correct way around, so hoist the
> logic for the later path to use it as well.  Swapping the inputs in GPRs is
> far more efficient than using SWAPGS.
> 
> Previously, there was optionally one SWAPGS from the user/kernel mode check,
> two SWAPGS's in write_gs_shadow() and two WRGSBASE's.  Updates to GS (4 or 5
> here) in quick succession stall all contemporary pipelines repeatedly.  (Intel
> Core/Xeon pipelines have segment register renaming[1], so can continue to
> speculatively execute with one GS update in flight.  Other pipelines cannot
> have two updates in flight concurrently, and must stall dispatch of the second
> until the first has retired.)
> 
> Rewrite the logic to have exactly two WRGSBASEs and one SWAPGS, which removes
> two stalles all contemporary processors.
> 
> Although modest, the resulting delta is:
> 
>   add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-106 (-106)
>   Function                                     old     new   delta
>   paravirt_ctxt_switch_from                    235     198     -37
>   context_switch                              3582    3513     -69
> 
> in a common path.
> 
> [1] 
> https://software.intel.com/security-software-guidance/insights/deep-dive-intel-analysis-speculative-behavior-swapgs-and-segment-registers
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Preferably re-based onto said change, and maybe with another adjustment
(see below)
Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>

> @@ -1556,18 +1557,24 @@ static void load_segments(struct vcpu *n)
>                     : [ok] "+r" (all_segs_okay)          \
>                     : [_val] "rm" (val) )
>  
> -#ifdef CONFIG_HVM
> -    if ( cpu_has_svm && !compat && (uregs->fs | uregs->gs) <= 3 )
> +    if ( !compat )
>      {
> -        unsigned long gsb = n->arch.flags & TF_kernel_mode
> -            ? n->arch.pv.gs_base_kernel : n->arch.pv.gs_base_user;
> -        unsigned long gss = n->arch.flags & TF_kernel_mode
> -            ? n->arch.pv.gs_base_user : n->arch.pv.gs_base_kernel;
> +        gsb = n->arch.pv.gs_base_kernel;
> +        gss = n->arch.pv.gs_base_user;
> +
> +        /*
> +         * Figure out which way around gsb/gss want to be.  gsb needs to be
> +         * the active context, and gss needs to be the inactive context.
> +         */
> +        if ( !(n->arch.flags & TF_kernel_mode) )
> +            SWAP(gsb, gss);
>  
> -        fs_gs_done = svm_load_segs(n->arch.pv.ldt_ents, LDT_VIRT_START(n),
> -                                   n->arch.pv.fs_base, gsb, gss);
> +        if ( IS_ENABLED(CONFIG_HVM) && cpu_has_svm &&

The change from #ifdef to IS_ENABLED() wants mirroring to the
prefetching site imo. I wonder though whether the adjustment is a
good idea: The declaration lives in svm.h, and I would view it as
quite reasonable for that header to not get included in !HVM builds
(there may be a lot of disentangling to do to get there, but still).

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.