
Re: [Xen-devel] [PATCH 2/3 v2] x86emul: conditionally clear BNDn for branches



>>> On 05.01.17 at 19:59, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 05/01/17 09:13, Jan Beulich wrote:
>>>>> On 04.01.17 at 22:11, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 12/12/16 10:00, Jan Beulich wrote:
>>>> @@ -1791,6 +1795,34 @@ static int inject_swint(enum x86_swint_t
>>>>      generate_exception(fault_type, error_code);
>>>>  }
>>>>  
>>>> +static void clear_bnd(struct x86_emulate_ctxt *ctxt,
>>>> +                      const struct x86_emulate_ops *ops, enum vex_pfx pfx)
>>>> +{
>>>> +    uint64_t bndcfg;
>>>> +    int rc;
>>>> +
>>>> +    if ( pfx == vex_f2 || !vcpu_has_mpx() )
>>> This is an awkward edge case of the rules concerning the host_ variants,
>>> but we will take a fault from xsave/xrstor for using XSTATE_BND* if the
>>> host doesn't support MPX.
>> Right. For now the implication is that guest-available MPX implies
>> host support.
> 
> But the reason for the host_and_vcpu_* predicate is that we cannot
> rely on this.

Well, not really today: there's currently no case where a guest sees a
CPUID flag (affecting the ISA) set which the hypervisor would see
clear.
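
(Purely for illustration - a belt-and-braces variant of the early exit
in clear_bnd() could additionally gate on the host feature, combining
the vcpu_has_mpx() check from the hunk above with the cpu_has_mpx
predicate asserted in read_bndcfgu() further down; a sketch only, not
what the patch actually does:)

    /* Sketch: bail for F2- (BND-) prefixed branches, for guests
     * without MPX, and - defensively - on hosts lacking MPX, since
     * the emulation relies on host XSAVE/MPX machinery. */
    if ( pfx == vex_f2 || !vcpu_has_mpx() || !cpu_has_mpx )
        return;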

> While we are not actually using MPX instructions specifically, we are
> using MPX hardware state/mechanisms to perform the emulation.

Right, and afaics there's no way to reasonably emulate that without
MPX hardware being available to us (else we'd need to emulate all
instructions, as we can't intercept branches).
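
(As a reminder, the architectural rule being mimicked - a rough sketch
with invented names, per the MPX spec: a near branch lacking the
F2/BND prefix initializes BND0-BND3 whenever bounds checking is
enabled and BNDPRESERVE is clear:)

    /* Hypothetical sketch of the MPX branch rule; bndcfg stands for
     * BNDCFGU or IA32_BNDCFGS depending on CPL, names are invented. */
    if ( !has_bnd_prefix(insn) &&
         (bndcfg & BNDCFG_ENABLE) && !(bndcfg & BNDCFG_PRESERVE) )
        init_bnd_regs();    /* BND0..BND3 <- INIT */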

>>>> --- a/xen/arch/x86/xstate.c
>>>> +++ b/xen/arch/x86/xstate.c
>>>> @@ -723,6 +741,66 @@ int handle_xsetbv(u32 index, u64 new_bv)
>>>>      return 0;
>>>>  }
>>>>  
>>>> +uint64_t read_bndcfgu(void)
>>>> +{
>>>> +    unsigned long cr0 = read_cr0();
>>>> +    struct xsave_struct *xstate
>>>> +        = idle_vcpu[smp_processor_id()]->arch.xsave_area;
>>>> +    const struct xstate_bndcsr *bndcsr;
>>>> +
>>>> +    ASSERT(cpu_has_mpx);
>>>> +    clts();
>>>> +
>>>> +    if ( cpu_has_xsavec )
>>>> +    {
>>>> +        asm ( ".byte 0x0f,0xc7,0x27\n" /* xsavec */
>>>> +              : "=m" (*xstate)
>>>> +              : "a" (XSTATE_BNDCSR), "d" (0), "D" (xstate) );
>>>> +
>>>> +        bndcsr = (void *)(xstate + 1);
>>>> +    }
>>>> +    else
>>>> +    {
>>>> +        alternative_io(".byte 0x0f,0xae,0x27\n", /* xsave */
>>>> +                       ".byte 0x0f,0xae,0x37\n", /* xsaveopt */
>>>> +                       X86_FEATURE_XSAVEOPT,
>>>> +                       "=m" (*xstate),
>>>> +                       "a" (XSTATE_BNDCSR), "d" (0), "D" (xstate));
>>> I am not certain this is safe.  xsaveopt has an extra optimisation to do
>>> with whether the state has been internally modified.  Because we use an
>>> xsave area from the idle vcpu, I can't see anything which prevents the
>>> LAXA (linear address of XSAVE area) optimisation kicking in, causing us
>>> to observe a zero in xstate_bv despite BNDCSR having a non-zero value
>>> loaded in hardware.
>> True - I've dropped the alternative now as well as the use of XSAVEC.
> 
> Why drop XSAVEC?  It appears to only be XSAVEOPT and XSAVES which have
> the linear address optimisation.  XSAVEC claims only to use the
> XINUSE[i] optimisation which is fine for our needs.

Hmm, looks like I've mixed up XSAVEC and XSAVES. However, even
the modified optimization should present no problem: we never load
from this area, so the address will never match y (in SDM speak, the
linear address recorded by the last XRSTOR), and address aliasing
with HVM guests is not an issue either, because x (the VMX non-root
indicator) for a guest-performed XRSTOR{,S} will always be 1, while
the value x is being compared against will always be 0. So I think I
can undo dropping that logic entirely.
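
(To make the SDM condition concrete, here is a toy model of the part
of the XSAVEOPT modified-optimization gate discussed above - the field
and function names are descriptive inventions, and only two of the
components of the SDM's XRSTOR_INFO tuple are modeled:)

    #include <stdbool.h>
    #include <stdint.h>

    /* Toy model: hardware records XRSTOR_INFO on each XRSTOR{,S}, and
     * XSAVEOPT may skip unmodified state only if it matches again. */
    struct xrstor_info {
        bool     vmx_nonroot;  /* "x": was XRSTOR in VMX non-root? */
        uint64_t laxa;         /* "y": linear address of the area */
    };

    static bool may_use_modified_opt(const struct xrstor_info *last,
                                     bool cur_vmx_nonroot,
                                     uint64_t cur_laxa)
    {
        /* Xen never XRSTORs from the idle vCPU's area, so laxa cannot
         * match; a guest XRSTOR records vmx_nonroot = 1 while Xen's
         * own XSAVEOPT compares against 0, so HVM aliasing cannot
         * match either. */
        return last->vmx_nonroot == cur_vmx_nonroot &&
               last->laxa == cur_laxa;
    }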

Jan

