[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/3 v2] x86emul: conditionally clear BNDn for branches



On 05/01/17 09:13, Jan Beulich wrote:
>>>> On 04.01.17 at 22:11, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 12/12/16 10:00, Jan Beulich wrote:
>>> @@ -1791,6 +1795,34 @@ static int inject_swint(enum x86_swint_t
>>>      generate_exception(fault_type, error_code);
>>>  }
>>>  
>>> +static void clear_bnd(struct x86_emulate_ctxt *ctxt,
>>> +                      const struct x86_emulate_ops *ops, enum vex_pfx pfx)
>> As with register_address_adjust(), this would be better as adjust_bnd()
>> as we don't necessarily clear them.
> Okay.
>
>>> +{
>>> +    uint64_t bndcfg;
>>> +    int rc;
>>> +
>>> +    if ( pfx == vex_f2 || !vcpu_has_mpx() )
>> This is an awkward edgecase of the rules concerning the host_ variants,
>> but we will take a fault from xsave/xrstor for using XSTATE_BND* if the
>> host doesn't support MPX.
> Right. For now the implication is that guest available MPX implies
> host support.

But the reason for the host_and_vcpu_* predicate is because we cannot
rely on this.

While we are not actually using MPX instructions specifically, we are
using MPX hardware state/mechanisms to perform the emulation.

(MPX is an odd case, as the new MPX instructions are specifically chosen
from things which were nops on older processors, so one single binary
will execute on newer and older hardware.)

>
>>> --- a/xen/arch/x86/xstate.c
>>> +++ b/xen/arch/x86/xstate.c
>>> @@ -723,6 +741,66 @@ int handle_xsetbv(u32 index, u64 new_bv)
>>>      return 0;
>>>  }
>>>  
>>> +uint64_t read_bndcfgu(void)
>>> +{
>>> +    unsigned long cr0 = read_cr0();
>>> +    struct xsave_struct *xstate
>>> +        = idle_vcpu[smp_processor_id()]->arch.xsave_area;
>>> +    const struct xstate_bndcsr *bndcsr;
>>> +
>>> +    ASSERT(cpu_has_mpx);
>>> +    clts();
>>> +
>>> +    if ( cpu_has_xsavec )
>>> +    {
>>> +        asm ( ".byte 0x0f,0xc7,0x27\n" /* xsavec */
>>> +              : "=m" (*xstate)
>>> +              : "a" (XSTATE_BNDCSR), "d" (0), "D" (xstate) );
>>> +
>>> +        bndcsr = (void *)(xstate + 1);
>>> +    }
>>> +    else
>>> +    {
>>> +        alternative_io(".byte 0x0f,0xae,0x27\n", /* xsave */
>>> +                       ".byte 0x0f,0xae,0x37\n", /* xsaveopt */
>>> +                       X86_FEATURE_XSAVEOPT,
>>> +                       "=m" (*xstate),
>>> +                       "a" (XSTATE_BNDCSR), "d" (0), "D" (xstate));
>> I am not certain this is safe.  xsaveopt has an extra optimisation to do
>> with whether the state has been internally modified.  Because we use an
>> xsave area from the idle vcpu, I can't see anything which prevents the
>> LAXA (linear address of XSAVE area) optimisation kicking in, causing us
>> to observe a zero in xstate_bv despite BNDCSR having a non-zero value
>> loaded in hardware.
> True - I've dropped the alternative now as well as the use of XSAVEC.

Why drop XSAVEC?  It appears to only be XSAVEOPT and XSAVES which have
the linear address optimisation.  XSAVEC claims only to use the
XINUSE[i] optimisation which is fine for our needs.

Alternatively, the linear address optimisation can be fooled into
working sensibly if we add 64bytes to the idle xstate allocation, and
alternate between using the two.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.