
Re: [Xen-devel] [PATCH 5/7] x86/alt: Support for automatic padding calculations



On 12/02/18 14:39, Wei Liu wrote:
> On Mon, Feb 12, 2018 at 11:23:05AM +0000, Andrew Cooper wrote:
>> The correct amount of padding in an origin patch site can be calculated
>> automatically, based on the relative lengths of the replacements.
>>
>> This requires a bit of trickery to calculate correctly, especially in the
>> ALTERNATIVE_2 case where a branchless max() calculation is needed.  The
>> calculation is further complicated because GAS's idea of true is -1 rather
>> than 1, which is why the extra negations are required.
>>
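As an aside on the paragraph above, for anyone unfamiliar with the
trick: in C, where a true relational evaluates to 1 rather than GAS's
-1, the same branchless max needs one negation fewer.  A minimal
sketch, with hypothetical names:

    #include <assert.h>

    /* Branchless max: mask is all-ones when a < b, all-zeros otherwise. */
    static unsigned int c_max(unsigned int a, unsigned int b)
    {
        return a ^ ((a ^ b) & -(unsigned int)(a < b));
    }

    int main(void)
    {
        /*
         * GAS evaluates (a < b) to -1 when true, hence the extra
         * negation in gas_max(): -(-((a) < (b))) turns that -1 back
         * into the all-ones mask.
         */
        assert(c_max(3, 7) == 7);
        assert(c_max(7, 3) == 7);
        return 0;
    }
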
>> Additionally, have apply_alternatives() attempt to optimise the padding nops.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
>> ---
>>  xen/arch/x86/alternative.c            | 32 ++++++++++++++++++++++++----
>>  xen/include/asm-x86/alternative-asm.h | 40 +++++++++++++++++++++++++++--------
>>  xen/include/asm-x86/alternative.h     | 39 ++++++++++++++++++++++++++--------
>>  3 files changed, 89 insertions(+), 22 deletions(-)
>>
>> diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
>> index f8ddab5..ec87ff4 100644
>> --- a/xen/arch/x86/alternative.c
>> +++ b/xen/arch/x86/alternative.c
>> @@ -180,13 +180,37 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
>>          uint8_t *orig = ALT_ORIG_PTR(a);
>>          uint8_t *repl = ALT_REPL_PTR(a);
>>          uint8_t buf[MAX_PATCH_LEN];
>> +        unsigned int total_len = a->orig_len + a->pad_len;
>>  
>> -        BUG_ON(a->repl_len > a->orig_len);
>> -        BUG_ON(a->orig_len > sizeof(buf));
>> +        BUG_ON(a->repl_len > total_len);
>> +        BUG_ON(total_len > sizeof(buf));
>>          BUG_ON(a->cpuid >= NCAPINTS * 32);
>>  
>>          if ( !boot_cpu_has(a->cpuid) )
>> +        {
>> +            unsigned int i;
>> +
>> +            /* No replacement to make, but try to optimise any padding. */
>> +            if ( a->pad_len <= 1 )
>> +                continue;
>> +
>> +            /* Search the padding area for any byte which isn't a nop. */
>> +            for ( i = a->orig_len; i < total_len; ++i )
>> +                if ( orig[i] != 0x90 )
>> +                    break;
>> +
>> +            /*
>> +             * Only make any changes if all padding bytes are unoptimised
>> +             * nops.  With multiple alternatives over the same origin site, we
>> +             * may have already made a replacement, or optimised the nops.
>> +             */
>> +            if ( i != total_len )
>> +                continue;
>> +
>> +            add_nops(buf, a->pad_len);
>> +            text_poke(orig + a->orig_len, buf, a->pad_len);
>>              continue;
>> +        }
> Is the expectation here that the alternative instructions already contain
> optimised padding (including live patches)? Otherwise, why is the same
> optimisation not needed later?

The problem is that we don't store the actual original bytes, so we
can't trivially detect whether we've already patched this site before.
We've a number of cases which are an ALTERNATIVE_2 based on SMEP and
SMAP, so on a fair chunk of hardware we first make a replacement
because of SMEP, then fail the SMAP check and don't make the second
replacement.

Later, we are discarding everything in orig+pad, and replacing it with
repl+any necessary padding, which is made of optimised nops.
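
To make that concrete, the "has anything touched this site yet?" test
in the quoted hunk amounts to the following standalone C sketch
(hypothetical helper name):

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * True if every byte in the padding area is still an unoptimised
     * single-byte nop (0x90), i.e. no earlier alternative has replaced
     * or re-padded this origin site.
     */
    static bool padding_is_unoptimised(const uint8_t *orig,
                                       unsigned int orig_len,
                                       unsigned int total_len)
    {
        unsigned int i;

        for ( i = orig_len; i < total_len; ++i )
            if ( orig[i] != 0x90 )
                return false;

        return true;
    }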

>
>>  
>>          memcpy(buf, repl, a->repl_len);
>>  
>> @@ -194,8 +218,8 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
>>          if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
>>              *(s32 *)(buf + 1) += repl - orig;
>>  
>> -        add_nops(buf + a->repl_len, a->orig_len - a->repl_len);
>> -        text_poke(orig, buf, a->orig_len);
>> +        add_nops(buf + a->repl_len, total_len - a->repl_len);
>> +        text_poke(orig, buf, total_len);
>>      }
>>  }
>>  
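An aside on the displacement fixup above: 0xe8 and 0xe9 are the rel32
call and jmp opcodes, which is what the (*buf & 0xfe) == 0xe8 test
matches.  The arithmetic, as a hypothetical helper:

    #include <stdint.h>

    /*
     * A rel32 call/jmp at src targeting T encodes disp = T - (src + 5).
     * Copied to dst, it must encode T - (dst + 5), i.e. the original
     * displacement plus (src - dst) -- hence the `+= repl - orig` above.
     */
    static int32_t rebase_rel32(int32_t disp, const uint8_t *src,
                                const uint8_t *dst)
    {
        return disp + (int32_t)(src - dst);
    }
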
>> diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
>> index 150bd1a..f7e37cb 100644
>> --- a/xen/include/asm-x86/alternative-asm.h
>> +++ b/xen/include/asm-x86/alternative-asm.h
>> @@ -9,30 +9,41 @@
>>   * enough information for the alternatives patching code to patch an
>>   * instruction. See apply_alternatives().
>>   */
>> -.macro altinstruction_entry orig repl feature orig_len repl_len
>> +.macro altinstruction_entry orig repl feature orig_len repl_len pad_len
>>      .long \orig - .
>>      .long \repl - .
>>      .word \feature
>>      .byte \orig_len
>>      .byte \repl_len
>> +    .byte \pad_len
>>  .endm
>>  
>>  #define orig_len               (.L\@_orig_e       -     .L\@_orig_s)
>> +#define pad_len                (.L\@_orig_p       -     .L\@_orig_e)
>> +#define total_len              (.L\@_orig_p       -     .L\@_orig_s)
>>  #define repl_len(nr)           (.L\@_repl_e\()nr  -     .L\@_repl_s\()nr)
>>  #define decl_repl(insn, nr)     .L\@_repl_s\()nr: insn; .L\@_repl_e\()nr:
>> +#define gas_max(a, b)          ((a) ^ (((a) ^ (b)) & -(-((a) < (b)))))
> What about clang's assembler? At least give it a stub to cause a
> compilation error?
>
>>  
>>  .macro ALTERNATIVE oldinstr, newinstr, feature
>>  .L\@_orig_s:
>>      \oldinstr
>>  .L\@_orig_e:
>> +     .skip (-((repl_len(1) - orig_len) > 0) * (repl_len(1) - orig_len)), 0x90
> Seeing the negation at the beginning, I suppose this should also be a
> GAS-specific macro?

The build failures are because clang's integrated assembler can't cope
with non-absolute references with .skip, but we already know about this
and have code identical to this in tree.  (I temporarily removed it in
patch 4).
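
For reference, this is the shape of the construct in question (a
minimal sketch with made-up labels; the replacement labels are forward
references at the point of the .skip, so the size expression isn't
absolute when the directive is first evaluated -- GNU as defers and
resolves it, clang's integrated assembler rejects it):

    .L_orig_s:
        nop                               # oldinstr
    .L_orig_e:
        .skip .L_repl_e - .L_repl_s, 0x90 # pad with nops

        .pushsection .altinstr_replacement, "ax", @progbits
    .L_repl_s:
        xor %eax, %eax                    # newinstr
    .L_repl_e:
        .popsection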

~Andrew
