Re: [Xen-devel] [PATCH v2 7/7] x86/build: Use new .nop directive when available
On 02/03/18 07:10, Jan Beulich wrote:
>>>> On 01.03.18 at 17:58, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 01/03/18 10:54, Jan Beulich wrote:
>>>>>> On 01.03.18 at 11:36, <roger.pau@xxxxxxxxxx> wrote:
>>>> On Thu, Mar 01, 2018 at 12:28:27AM -0700, Jan Beulich wrote:
>>>>>>>> Andrew Cooper <andrew.cooper3@xxxxxxxxxx> 02/28/18 7:20 PM >>>
>>>>>> On 28/02/18 16:22, Jan Beulich wrote:
>>>>>>>>>> On 26.02.18 at 12:35, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>>>> --- a/xen/include/asm-x86/alternative-asm.h
>>>>>>>> +++ b/xen/include/asm-x86/alternative-asm.h
>>>>>>>> @@ -1,6 +1,8 @@
>>>>>>>>  #ifndef _ASM_X86_ALTERNATIVE_ASM_H_
>>>>>>>>  #define _ASM_X86_ALTERNATIVE_ASM_H_
>>>>>>>>
>>>>>>>> +#include <asm/nops.h>
>>>>>>>> +
>>>>>>>>  #ifdef __ASSEMBLY__
>>>>>>>>
>>>>>>>>  /*
>>>>>>>> @@ -18,6 +20,14 @@
>>>>>>>>      .byte \pad_len
>>>>>>>>  .endm
>>>>>>>>
>>>>>>>> +.macro mknops nr_bytes
>>>>>>>> +#ifdef HAVE_AS_NOP_DIRECTIVE
>>>>>>>> +    .nop \nr_bytes, ASM_NOP_MAX
>>>>>>> And do you really need to specify ASM_NOP_MAX here? What's
>>>>>>> wrong with letting the assembler pick what it wants as the longest
>>>>>>> NOP?
>>>>>> I don't want a toolchain change to cause us to go beyond 11-byte nops,
>>>>>> because of the associated decode stall on almost all hardware. Using
>>>>>> ASM_NOP_MAX seemed like the easiest way to keep the end result
>>>>>> consistent, irrespective of toolchain support.
>>>>> I don't understand - an earlier patch takes care of runtime replacing them
>>>>> anyway. What stalls can then result?
>>>> The runtime replacement won't happen when using the .nops directive
>>>> AFAICT, because the original padding section will likely be filled
>>>> with opcodes different than 0x90, and thus the runtime nop
>>>> optimization won't be performed.
>>> Oh, indeed. That puts under question the whole idea of using
>>> .nops in favor of .skip. Andrew, I'm sorry, but with this I prefer
>>> to withdraw my ack.
>>>
>>>> I also agree that using the default (not providing a second argument)
>>>> seems like a better solution. Why would the toolchain switch to
>>>> something that leads to worse performance? That would certainly be
>>>> considered a bug.
>>> Why? They may change it based on data available for newer /
>>> older / whatever hardware. Any build-time choice is going to be
>>> suboptimal somewhere, so I think we absolutely should not
>>> bypass runtime replacing these NOPs, the more that now we
>>> may have quite large sequences of them.
>> The point of having the toolchain put out optimised nops is to avoid the
>> need for us to patch the site at all. I.e. calling optimise_nops() on a
>> set of toolchain nops defeats the purpose in the overwhelmingly common
>> case of running on a system which prefers P6 nops.
>>
>> The problem of working out when to optimise is that, when we come to
>> apply an individual alternative, we don't know if we've already patched
>> this site before. Even the unoptimised algorithm has a corner case
>> which explodes, if there is a stream of 0x90's on the end of a
>> replacement, e.g. in an imm or disp field.
>>
>> Put simply, we cannot determine, by peeking at the patch site, whether it
>> has been patched or not (other than keeping a full copy of the origin
>> site as a reference). As soon as we choose to optimise the nops of the
>> origin site, we cannot determine anything at all.
>>
>> Thinking out loud, we could perhaps have a section containing one byte
>> per origin site, which we use to track whether we've already optimised
>> the padding bytes, and whether the contents have been replaced. This
>> would also add an extra long into struct altentry, but it's all cold data
>> after boot.
> What about alternatively simply updating the struct alt_instr
> instances to describe the code _after_ a patch that was applied?
> That'll allow us to always know how much padding there is.

There are multiple alt_instr entries pointing to the same origin site when
using ALTERNATIVE_2, meaning you keep all the safety problems with the
current setup.

~Andrew
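
As context for the 0x90 point Roger and Andrew discuss above, here is a
minimal C sketch of the kind of scan a runtime NOP optimiser relies on. It is
illustrative only, not Xen's actual implementation, and the function name is
invented: padding emitted as single-byte 0x90 NOPs (e.g. via ".skip len,
0x90") passes the scan and can be rewritten into the running CPU's preferred
NOP form, whereas padding already emitted as long NOPs by a ".nop" directive
fails it, so the runtime rewrite is skipped.

#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the padding consists solely of classic one-byte NOPs
 * (0x90).  Only such padding can be safely recognised and rewritten at
 * runtime; assembler-generated long NOPs contain other opcodes and fail
 * this test.
 */
static bool padding_is_single_byte_nops(const uint8_t *site, unsigned int len)
{
    while ( len-- )
        if ( *site++ != 0x90 )
            return false;
    return true;
}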
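
And a rough sketch of the "tracking byte per origin site" idea floated above,
with the caveat that every name below is invented for illustration and, as
Andrew notes, both alt_instr entries of an ALTERNATIVE_2 pair would have to
reference the same state byte for the scheme to help:

#include <stdint.h>

#define ALT_SITE_NOPS_OPTIMISED  (1u << 0)   /* padding already rewritten    */
#define ALT_SITE_REPLACED        (1u << 1)   /* a replacement already copied */

/* Hypothetical extension of the existing alternatives metadata. */
struct alt_instr_sketch {
    int32_t  orig_offset;    /* offset to the origin site                    */
    int32_t  repl_offset;    /* offset to the replacement code               */
    uint16_t cpuid;          /* feature bit gating the replacement           */
    uint8_t  orig_len;       /* origin length, including padding             */
    uint8_t  repl_len;       /* replacement length                           */
    int32_t  state_offset;   /* offset to a shared per-site state byte in a
                              * separate section; cold data after boot       */
};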