Re: [Xen-devel] [PATCH] x86/build: Use new .nops directive when available
>>> On 16.08.18 at 13:48, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 16/08/18 12:34, Jan Beulich wrote:
>>>>> On 16.08.18 at 12:42, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 16/08/18 10:55, Roger Pau Monné wrote:
>>>> On Wed, Aug 15, 2018 at 06:57:38PM +0100, Andrew Cooper wrote:
>>>>> @@ -112,6 +125,11 @@ static void __init arch_init_ideal_nops(void)
>>>>>          ideal_nops = k8_nops;
>>>>>          break;
>>>>>      }
>>>>> +
>>>>> +#ifdef HAVE_AS_NOP_DIRECTIVE
>>>>> +    if ( memcmp(ideal_nops[ASM_NOP_MAX], toolchain_nops, ASM_NOP_MAX) == 0 )
>>>>> +        toolchain_nops_are_ideal = true;
>>>>> +#endif
>>>> You are only comparing that the biggest nop instruction (9 bytes
>>>> AFAICT) generated by the assembler is what Xen believes to be the more
>>>> optimized version. What about shorter nops?
>>> They are all variations on a theme.
>>>
>>> For P6 nops, its the 0f 1f root which is important, which takes a modrm
>>> byte. Traditionally, its always encoded with eax and uses redundant
>>> memory encodings for longer instructions.
>>>
>>> I can't think of any way of detecting if the optimised nops if the
>>> toolchain starts using alternative registers in the encoding, but I
>>> expect this case won't happen in practice.
>> It's not just the register encoding, but also the maximum single-insn
>> length that gets generated. Recall that until not very long ago we
>> had up to 8-byte NOP insns only? The view on the mod (as in ModRM)
>> usage may vary over time, as may the view on which or how many
>> prefixes are reasonable to have.
>
> Strictly speaking, the ORM says "encode the least-recently live
> register", because all the hint nops are still subject to reg/reg
> dependencies.
>
> However, we definitely can't take advantage of this, nor can the
> assembler.

Well, _we_ could, at least when tail padding patched in insns: I very
much hope we know what we've patched in, and hence at least what
registers were used most recently. This is not readily available today,
but could be made so.
> Compilers can't either, because the exact length of the nop
> depends on other relocations. Furthermore, the perf improvement from
> doing this would be fractional.

True.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
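
To illustrate the question raised in the thread about shorter nops, here
is a minimal sketch (not the code from the patch) of comparing every nop
length instead of only the longest one. All names in it (NOP_MAX,
p6_nops, variant_nops, nops_match_all_lengths(), has_p6_root()) are
hypothetical and exist only for this example. The p6_nops bytes are the
commonly documented 0f 1f encodings; variant_nops is an invented
"toolchain" table that differs only in its 3-byte form, which uses ecx
instead of eax as the base register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NOP_MAX 9 /* hypothetical: longest single-insn nop considered here */

/* Commonly documented multi-byte ("P6") nops, lengths 1..9. */
static const unsigned char p6_nop1[] = { 0x90 };
static const unsigned char p6_nop2[] = { 0x66, 0x90 };
static const unsigned char p6_nop3[] = { 0x0f, 0x1f, 0x00 };
static const unsigned char p6_nop4[] = { 0x0f, 0x1f, 0x40, 0x00 };
static const unsigned char p6_nop5[] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
static const unsigned char p6_nop6[] = { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 };
static const unsigned char p6_nop7[] = { 0x0f, 0x1f, 0x80, 0, 0, 0, 0 };
static const unsigned char p6_nop8[] = { 0x0f, 0x1f, 0x84, 0, 0, 0, 0, 0 };
static const unsigned char p6_nop9[] = { 0x66, 0x0f, 0x1f, 0x84, 0, 0, 0, 0, 0 };

/* Invented alternative 3-byte form: same 0f 1f root, ModRM selects (%ecx). */
static const unsigned char alt_nop3[] = { 0x0f, 0x1f, 0x01 };

/* Index 0 is unused so that table[len] is the nop of that length. */
static const unsigned char *const p6_nops[NOP_MAX + 1] = {
    NULL, p6_nop1, p6_nop2, p6_nop3, p6_nop4, p6_nop5,
    p6_nop6, p6_nop7, p6_nop8, p6_nop9,
};

/* Hypothetical toolchain output: identical except for the 3-byte nop. */
static const unsigned char *const variant_nops[NOP_MAX + 1] = {
    NULL, p6_nop1, p6_nop2, alt_nop3, p6_nop4, p6_nop5,
    p6_nop6, p6_nop7, p6_nop8, p6_nop9,
};

/* True only if every length 1..max_len is byte-identical in both tables. */
static bool nops_match_all_lengths(const unsigned char *const a[],
                                   const unsigned char *const b[],
                                   size_t max_len)
{
    assert(max_len >= 1);

    for ( size_t len = 1; len <= max_len; len++ )
        if ( memcmp(a[len], b[len], len) != 0 )
            return false;

    return true;
}

/* Skip leading 0x66 prefixes, then check for the 0f 1f opcode root. */
static bool has_p6_root(const unsigned char *insn, size_t len)
{
    size_t i = 0;

    while ( i < len && insn[i] == 0x66 )
        i++;

    return i + 1 < len && insn[i] == 0x0f && insn[i + 1] == 0x1f;
}
```

With these tables, a memcmp() of only the 9-byte entries reports a
match, while the per-length comparison notices the differing 3-byte
form. has_p6_root() accepts both 3-byte variants, which is one way to
read the "variations on a theme" argument: the root is shared even when
the ModRM byte differs.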