
Re: [Xen-devel] [PATCH 2/3 v2] x86emul: conditionally clear BNDn for branches



>>> On 05.01.17 at 19:59, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 05/01/17 09:13, Jan Beulich wrote:
>>>>> On 04.01.17 at 22:11, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 12/12/16 10:00, Jan Beulich wrote:
>>>> @@ -1791,6 +1795,34 @@ static int inject_swint(enum x86_swint_t
>>>>      generate_exception(fault_type, error_code);
>>>>  }
>>>>  
>>>> +static void clear_bnd(struct x86_emulate_ctxt *ctxt,
>>>> +                      const struct x86_emulate_ops *ops, enum vex_pfx pfx)
>>>> +{
>>>> +    uint64_t bndcfg;
>>>> +    int rc;
>>>> +
>>>> +    if ( pfx == vex_f2 || !vcpu_has_mpx() )
>>> This is an awkward edge case of the rules concerning the host_ variants,
>>> but we will take a fault from xsave/xrstor for using XSTATE_BND* if the
>>> host doesn't support MPX.
>> Right. For now the implication is that guest-available MPX implies
>> host support.
> 
> But the reason for the host_and_vcpu_* predicate is that we cannot
> rely on this.

Well, not really today: there's currently no case where a guest sees a
CPUID flag (affecting the ISA) set which the hypervisor would see
clear.
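
(Purely for illustration - a belt-and-braces variant of the early exit
in clear_bnd() could additionally gate on the host feature, combining
the vcpu_has_mpx() check from the hunk above with the cpu_has_mpx
predicate asserted in read_bndcfgu() further down; a sketch only, not
what the patch actually does:)

    /* Sketch: bail for F2- (BND-) prefixed branches, for guests
     * without MPX, and - defensively - on hosts lacking MPX, since
     * the emulation relies on host XSAVE/MPX machinery. */
    if ( pfx == vex_f2 || !vcpu_has_mpx() || !cpu_has_mpx )
        return;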

> While we are not actually using MPX instructions specifically, we are
> using MPX hardware state/mechanisms to perform the emulation.

Right, and afaics there's no way to reasonably emulate that without
MPX hardware being available to us (else we'd need to emulate all
instructions, as we can't intercept branches).
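
(As a reminder, the architectural rule being mimicked - a rough sketch
with invented names, per the MPX spec: a near branch lacking the
F2/BND prefix initializes BND0-BND3 whenever bounds checking is
enabled and BNDPRESERVE is clear:)

    /* Hypothetical sketch of the MPX branch rule; bndcfg stands for
     * BNDCFGU or IA32_BNDCFGS depending on CPL, names are invented. */
    if ( !has_bnd_prefix(insn) &&
         (bndcfg & BNDCFG_ENABLE) && !(bndcfg & BNDCFG_PRESERVE) )
        init_bnd_regs();    /* BND0..BND3 <- INIT */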

>>>> --- a/xen/arch/x86/xstate.c
>>>> +++ b/xen/arch/x86/xstate.c
>>>> @@ -723,6 +741,66 @@ int handle_xsetbv(u32 index, u64 new_bv)
>>>>      return 0;
>>>>  }
>>>>  
>>>> +uint64_t read_bndcfgu(void)
>>>> +{
>>>> +    unsigned long cr0 = read_cr0();
>>>> +    struct xsave_struct *xstate
>>>> +        = idle_vcpu[smp_processor_id()]->arch.xsave_area;
>>>> +    const struct xstate_bndcsr *bndcsr;
>>>> +
>>>> +    ASSERT(cpu_has_mpx);
>>>> +    clts();
>>>> +
>>>> +    if ( cpu_has_xsavec )
>>>> +    {
>>>> +        asm ( ".byte 0x0f,0xc7,0x27\n" /* xsavec */
>>>> +              : "=m" (*xstate)
>>>> +              : "a" (XSTATE_BNDCSR), "d" (0), "D" (xstate) );
>>>> +
>>>> +        bndcsr = (void *)(xstate + 1);
>>>> +    }
>>>> +    else
>>>> +    {
>>>> +        alternative_io(".byte 0x0f,0xae,0x27\n", /* xsave */
>>>> +                       ".byte 0x0f,0xae,0x37\n", /* xsaveopt */
>>>> +                       X86_FEATURE_XSAVEOPT,
>>>> +                       "=m" (*xstate),
>>>> +                       "a" (XSTATE_BNDCSR), "d" (0), "D" (xstate));
>>> I am not certain this is safe.  xsaveopt has an extra optimisation to do
>>> with whether the state has been internally modified.  Because we use an
>>> xsave area from the idle vcpu, I can't see anything which prevents the
>>> LAXA (linear address of XSAVE area) optimisation kicking in, causing us
>>> to observe a zero in xstate_bv despite BNDCSR having a non-zero value
>>> loaded in hardware.
>> True - I've dropped the alternative now as well as the use of XSAVEC.
> 
> Why drop XSAVEC?  It appears to only be XSAVEOPT and XSAVES which have
> the linear address optimisation.  XSAVEC claims only to use the
> XINUSE[i] optimisation which is fine for our needs.

Hmm, looks like I've mixed up XSAVEC and XSAVES. However, even
the modified optimization should present no problem: we never load
from this area, so the address will never match y (in SDM speak, the
linear address recorded by the last XRSTOR), and address aliasing
with HVM guests is not an issue either, because x (the VMX non-root
indicator) for a guest-performed XRSTOR{,S} will always be 1, while
the value x is being compared against will always be 0. So I think I
can undo dropping that logic entirely.
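
(To make the SDM condition concrete, here is a toy model of the part
of the XSAVEOPT modified-optimization gate discussed above - the field
and function names are descriptive inventions, and only two of the
components of the SDM's XRSTOR_INFO tuple are modeled:)

    #include <stdbool.h>
    #include <stdint.h>

    /* Toy model: hardware records XRSTOR_INFO on each XRSTOR{,S}, and
     * XSAVEOPT may skip unmodified state only if it matches again. */
    struct xrstor_info {
        bool     vmx_nonroot;  /* "x": was XRSTOR in VMX non-root? */
        uint64_t laxa;         /* "y": linear address of the area */
    };

    static bool may_use_modified_opt(const struct xrstor_info *last,
                                     bool cur_vmx_nonroot,
                                     uint64_t cur_laxa)
    {
        /* Xen never XRSTORs from the idle vCPU's area, so laxa cannot
         * match; a guest XRSTOR records vmx_nonroot = 1 while Xen's
         * own XSAVEOPT compares against 0, so HVM aliasing cannot
         * match either. */
        return last->vmx_nonroot == cur_vmx_nonroot &&
               last->laxa == cur_laxa;
    }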

Jan

