Xen project Mailing List

Re: [Xen-devel] [PATCH 2/3 v2] x86emul: conditionally clear BNDn for branches

From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Thu, 5 Jan 2017 18:59:54 +0000

Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Thu, 05 Jan 2017 19:00:27 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 05/01/17 09:13, Jan Beulich wrote: >>>> On 04.01.17 at 22:11, <andrew.cooper3@xxxxxxxxxx> wrote: >> On 12/12/16 10:00, Jan Beulich wrote: >>> @@ -1791,6 +1795,34 @@ static int inject_swint(enum x86_swint_t >>> generate_exception(fault_type, error_code); >>> } >>> >>> +static void clear_bnd(struct x86_emulate_ctxt *ctxt, >>> + const struct x86_emulate_ops *ops, enum vex_pfx pfx) >> As with register_address_adjust(), this would be better as adjust_bnd() >> as we don't necessarily clear them. > Okay. > >>> +{ >>> + uint64_t bndcfg; >>> + int rc; >>> + >>> + if ( pfx == vex_f2 || !vcpu_has_mpx() ) >> This is an awkward edgecase of the rules concerning the host_ variants, >> but we will take a fault from xsave/xrstor for using XSTATE_BND* if the >> host doesn't support MPX. > Right. For now the implication is that guest available MPX implies > host support. But the reason for the host_and_vcpu_* predicate is because we cannot rely on this. While we are not actually using MPX instructions specifically, we are using MPX hardware state/mechanisms to perform the emulation. (MPX is an odd case, as the new MPX instructions are specifically chosen from things which were nops on older processors, so one single binary will execute on newer and older hardware.) > >>> --- a/xen/arch/x86/xstate.c >>> +++ b/xen/arch/x86/xstate.c >>> @@ -723,6 +741,66 @@ int handle_xsetbv(u32 index, u64 new_bv) >>> return 0; >>> } >>> >>> +uint64_t read_bndcfgu(void) >>> +{ >>> + unsigned long cr0 = read_cr0(); >>> + struct xsave_struct *xstate >>> + = idle_vcpu[smp_processor_id()]->arch.xsave_area; >>> + const struct xstate_bndcsr *bndcsr; >>> + >>> + ASSERT(cpu_has_mpx); >>> + clts(); >>> + >>> + if ( cpu_has_xsavec ) >>> + { >>> + asm ( ".byte 0x0f,0xc7,0x27\n" /* xsavec */ >>> + : "=m" (*xstate) >>> + : "a" (XSTATE_BNDCSR), "d" (0), "D" (xstate) ); >>> + >>> + bndcsr = (void *)(xstate + 1); >>> + } >>> + else >>> + { >>> + alternative_io(".byte 0x0f,0xae,0x27\n", /* xsave */ >>> + ".byte 0x0f,0xae,0x37\n", /* xsaveopt */ >>> + X86_FEATURE_XSAVEOPT, >>> + "=m" (*xstate), >>> + "a" (XSTATE_BNDCSR), "d" (0), "D" (xstate)); >> I am not certain this is safe. xsaveopt has an extra optimisation to do >> with whether the state has been internally modified. Because we use an >> xsave area from the idle vcpu, I can't see anything which prevents the >> LAXA (linear address of XSAVE area) optimisation kicking in, causing us >> to observe a zero in xstate_bv despite BNDCSR having a non-zero value >> loaded in hardware. > True - I've dropped the alternative now as well as the use of XSAVEC. Why drop XSAVEC? It appears to only be XSAVEOPT and XSAVES which have the linear address optimisation. XSAVEC claims only to use the XINUSE[i] optimisation which is fine for our needs. Alternatively, the linear address optimisation can be fooled into working sensibly if we add 64bytes to the idle xstate allocation, and alternate between using the two. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.