[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 13/16] x86/msr: Use MSR_IMM when available
On 21.08.2025 20:59, Andrew Cooper wrote: > On 19/08/2025 1:55 pm, Jan Beulich wrote: >> On 15.08.2025 22:41, Andrew Cooper wrote: >>> --- a/xen/arch/x86/include/asm/msr.h >>> +++ b/xen/arch/x86/include/asm/msr.h >>> @@ -29,10 +29,52 @@ >>> * wrmsrl(MSR_FOO, val); >>> */ >>> >>> -static inline uint64_t rdmsr(unsigned int msr) >>> +/* >>> + * RDMSR with a compile-time constant index, when available. Falls back to >>> + * plain RDMSR. >>> + */ >>> +static always_inline uint64_t __rdmsr_imm(uint32_t msr) >>> +{ >>> + uint64_t val; >>> + >>> + /* >>> + * For best performance, RDMSR $msr, %r64 is recommended. For >>> + * compatibility, we need to fall back to plain RDMSR. >>> + * >>> + * The combined ABI is awkward, because RDMSR $imm produces an r64, >>> + * whereas WRMSR{,NS} produces a split edx:eax pair. >>> + * >>> + * Always use RDMSR $imm, %rax, because it has the most in common with >>> the >>> + * legacy form. When MSR_IMM isn't available, emit logic to fold %edx >>> + * back into %rax. >>> + * >>> + * Let the compiler do %ecx setup. This does mean there's a useless >>> `mov >>> + * $imm, %ecx` in the instruction stream in the MSR_IMM case, but it >>> means >>> + * the compiler can de-duplicate the setup in the common case of >>> reading >>> + * and writing the same MSR. >>> + */ >>> + alternative_io( >>> + "rdmsr\n\t" >>> + "shl $32, %%rdx\n\t" >>> + "or %%rdx, %%rax\n\t", >>> + >>> + /* RDMSR $msr, %rax */ >>> + ".byte 0xc4,0xe7,0x7b,0xf6,0xc0; .long %c[msr]", >>> X86_FEATURE_MSR_IMM, >>> + >>> + "=a" (val), >> Strictly speaking "=&a". Not that it matters much here; just to not >> set a bad precedent. > > Why? A is not written to until after all inputs are consumed. > > I don't see how it can qualify for being early-clobber. Because the very first insn already writes to the register. >>> @@ -55,11 +97,51 @@ static inline void wrmsr(unsigned int msr, uint64_t val) >>> } >>> #define wrmsrl(msr, val) wrmsr(msr, val) >>> >>> +/* >>> + * Non-serialising WRMSR with a compile-time constant index, when >>> available. >>> + * Falls back to plain WRMSRNS, or to a serialising WRMSR. >>> + */ >>> +static always_inline void __wrmsrns_imm(uint32_t msr, uint64_t val) >>> +{ >>> + /* >>> + * For best performance, WRMSRNS %r64, $msr is recommended. For >>> + * compatibility, we need to fall back to plain WRMSRNS, or to WRMSR. >>> + * >>> + * The combined ABI is awkward, because WRMSRNS $imm takes a single >>> r64, >>> + * whereas WRMSR{,NS} takes a split edx:eax pair. >>> + * >>> + * Always use WRMSRNS %rax, $imm, because it has the most in common >>> with >>> + * the legacy forms. When MSR_IMM isn't available, emit setup logic >>> for >>> + * %edx. >>> + * >>> + * Let the compiler do %ecx setup. This does mean there's a useless >>> `mov >>> + * $imm, %ecx` in the instruction stream in the MSR_IMM case, but it >>> means >>> + * the compiler can de-duplicate the setup in the common case of >>> reading >>> + * and writing the same MSR. >>> + */ >>> + alternative_input_2( >>> + "mov %%rax, %%rdx\n\t" >>> + "shr $32, %%rdx\n\t" >>> + ".byte 0x2e; wrmsr", >>> + >>> + /* CS WRMSRNS %rax, $msr */ >>> + ".byte 0x2e,0xc4,0xe7,0x7a,0xf6,0xc0; .long %c[msr]", >>> X86_FEATURE_MSR_IMM, >>> + >>> + "mov %%rax, %%rdx\n\t" >>> + "shr $32, %%rdx\n\t" >>> + ".byte 0x0f,0x01,0xc6", X86_FEATURE_WRMSRNS, >> Isn't this the wrong way round for hardware which has both features? The >> last active alternative wins, iirc. > > Bah - fooled once again by the nop optimisation. I'll reorder. > > But, we really should swap the order. Especially now that you've > inserted serialisation, it's an expensive waste of time patching the > same site multiple times. I'm not convinced swapping the order would be a good thing. Nor do I see how it would (easily) help here: When both features are present, we always patch twice. To avoid that, while patching one location we would need to peek ahead into the next table entry to see whether that's going to patch the same location, same (or bigger) size. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |