Xen project Mailing List

Re: [Xen-devel] [PATCH v2 03/12] x86: infrastructure to allow converting certain indirect calls to direct ones

To: "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Thu, 30 Aug 2018 01:19:50 -0600

Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Thu, 30 Aug 2018 07:20:02 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

>>> On 29.08.18 at 18:01, <andrew.cooper3@xxxxxxxxxx> wrote: > On 29/08/18 15:02, Jan Beulich wrote: >> @@ -235,13 +243,58 @@ void init_or_livepatch apply_alternative >> continue; >> } >> >> - base->priv = 1; >> - >> memcpy(buf, repl, a->repl_len); >> >> /* 0xe8/0xe9 are relative branches; fix the offset. */ >> if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 ) >> - *(int32_t *)(buf + 1) += repl - orig; >> + { >> + /* >> + * Detect the special case of indirect-to-direct branch >> patching: >> + * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; >> already >> + * checked above), >> + * - replacement's displacement is -5 (pointing back at the very >> + * insn, which makes no sense in a real replacement insn), >> + * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4) >> + * using RIP-relative addressing. >> + * Some function targets may not be available when we come here >> + * the first time. Defer patching of those until the >> post-presmp- >> + * initcalls re-invocation. > > Which calls? That smells like a bug which should be fixed rather than > worked around. Well, hvm_funcs being the primary goal of all this work - HVM init is a pre-SMP initcall. > As for the other complexity here... > >> If at that point the target pointer is >> + * still NULL, insert "UD2; UD0" (for ease of recognition) >> instead >> + * of CALL/JMP. >> + */ >> + if ( a->cpuid == X86_FEATURE_ALWAYS && >> + *(int32_t *)(buf + 1) == -5 && >> + a->orig_len >= 6 && >> + orig[0] == 0xff && >> + orig[1] == (*buf & 1 ? 0x25 : 0x15) ) >> + { >> + long disp = *(int32_t *)(orig + 2); >> + const uint8_t *dest = *(void **)(orig + 6 + disp); >> + >> + if ( dest ) >> + { >> + disp = dest - (orig + 5); >> + ASSERT(disp == (int32_t)disp); >> + *(int32_t *)(buf + 1) = disp; >> + } >> + else if ( force ) >> + { >> + buf[0] = 0x0f; >> + buf[1] = 0x0b; >> + buf[2] = 0x0f; >> + buf[3] = 0xff; >> + buf[4] = 0xff; >> + } >> + else >> + continue; >> + } >> + else if ( force && system_state < SYS_STATE_active ) >> + ASSERT_UNREACHABLE(); >> + else >> + *(int32_t *)(buf + 1) += repl - orig; >> + } >> + else if ( force && system_state < SYS_STATE_active ) >> + ASSERT_UNREACHABLE(); >> /* RIP-relative addressing is easy to check for in VEX-encoded >> insns. */ >> else if ( a->repl_len >= 8 && >> (*buf & ~1) == 0xc4 && >> @@ -149,6 +150,203 @@ extern void alternative_instructions(voi >> /* Use this macro(s) if you need more than one output parameter. */ >> #define ASM_OUTPUT2(a...) a >> >> +/* >> + * Machinery to allow converting indirect to direct calls, when the called >> + * function is determined once at boot and later never changed. >> + */ >> + >> +#define ALT_CALL_arg1 "rdi" >> +#define ALT_CALL_arg2 "rsi" >> +#define ALT_CALL_arg3 "rdx" >> +#define ALT_CALL_arg4 "rcx" >> +#define ALT_CALL_arg5 "r8" >> +#define ALT_CALL_arg6 "r9" >> + >> +#define ALT_CALL_ARG(arg, n) \ >> + register typeof((arg) ? (arg) : 0) a ## n ## _ \ >> + asm ( ALT_CALL_arg ## n ) = (arg) >> +#define ALT_CALL_NO_ARG(n) \ >> + register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) >> + >> +#define ALT_CALL_NO_ARG6 ALT_CALL_NO_ARG(6) >> +#define ALT_CALL_NO_ARG5 ALT_CALL_NO_ARG(5); ALT_CALL_NO_ARG6 >> +#define ALT_CALL_NO_ARG4 ALT_CALL_NO_ARG(4); ALT_CALL_NO_ARG5 >> +#define ALT_CALL_NO_ARG3 ALT_CALL_NO_ARG(3); ALT_CALL_NO_ARG4 >> +#define ALT_CALL_NO_ARG2 ALT_CALL_NO_ARG(2); ALT_CALL_NO_ARG3 >> +#define ALT_CALL_NO_ARG1 ALT_CALL_NO_ARG(1); ALT_CALL_NO_ARG2 >> + >> +/* >> + * Unfortunately ALT_CALL_NO_ARG() above can't use a fake initializer (to >> + * suppress "uninitialized variable" warnings), as various versions of gcc >> + * older than 8.1 fall on the nose in various ways with that (always because >> + * of some other construct elsewhere in the same function needing to use the >> + * same hard register). Otherwise the asm() below could uniformly use "+r" >> + * output constraints, making unnecessary all these ALT_CALL<n>_OUT macros. >> + */ >> +#define ALT_CALL0_OUT "=r" (a1_), "=r" (a2_), "=r" (a3_), \ >> + "=r" (a4_), "=r" (a5_), "=r" (a6_) >> +#define ALT_CALL1_OUT "+r" (a1_), "=r" (a2_), "=r" (a3_), \ >> + "=r" (a4_), "=r" (a5_), "=r" (a6_) >> +#define ALT_CALL2_OUT "+r" (a1_), "+r" (a2_), "=r" (a3_), \ >> + "=r" (a4_), "=r" (a5_), "=r" (a6_) >> +#define ALT_CALL3_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \ >> + "=r" (a4_), "=r" (a5_), "=r" (a6_) >> +#define ALT_CALL4_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \ >> + "+r" (a4_), "=r" (a5_), "=r" (a6_) >> +#define ALT_CALL5_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \ >> + "+r" (a4_), "+r" (a5_), "=r" (a6_) >> +#define ALT_CALL6_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \ >> + "+r" (a4_), "+r" (a5_), "+r" (a6_) >> + >> +#define alternative_callN(n, rettype, func) ({ \ >> + rettype ret_; \ >> + register unsigned long r10_ asm("r10"); \ >> + register unsigned long r11_ asm("r11"); \ >> + asm volatile (__stringify(ALTERNATIVE "call *%c[addr](%%rip)", \ >> + "call .", \ >> + X86_FEATURE_ALWAYS) \ >> + : ALT_CALL ## n ## _OUT, "=a" (ret_), \ >> + "=r" (r10_), "=r" (r11_) \ >> + : [addr] "i" (&(func)), "g" (func) \ > > There was a Linux thread (which I've lost track of) which went around > replacing "i" with "P" for labels like this, which behaves better for > relocatable targets. Looking at Linux sources, do you mean the P operand modifier instead of a P constraint (which I can't find any reference to)? I.e. to replace "%c[addr]" by "%P[addr]"? But all P does, according to gcc sources, is to add @plt() around the symbol. We definitely don't want this here - it's only applicable to direct branches. > Furthermore, I can't work out what the "g" > constraint is for, but if it is simply for making the reference visible, > I'd suggest "X" instead which avoids any interference with register > scheduling. I don't think there's a big difference between "Any register, memory or immediate integer operand is allowed, except for registers that are not general registers" and "Any operand whatsoever is allowed" in the our context, but yes, I can switch to this. > As for the complexity, how about implementing this as an unlikely block > with a single call in? That way, patching turns the entry jmp imm32 > into a call imm32 and we lose the complicated decoding and nop padding > from the result. I don't understand this at all: It would further complicate decoding, as then I also need to follow the JMP. Please carefully look at the logic in _apply_alternatives() again: I need to decode the RIP-relative operand in order to determine the CALL target. Your suggestion has the only benefit of reducing main line code size (as we'd then not need to patch in any NOP anymore). That _could_ be considered as an add-on optimization (provided it is one in the first place, considering the larger overall code size, but perhaps the "CALL <imm32>" could be put in .init.text, as we guarantee to patch all instances), but not something I'd be willing to do right in this patch. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.