
Re: [Xen-devel] [PATCH] x86/cpuid: fix dom0 crash on skylake machine



>>> On 01.06.16 at 13:27, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 01/06/16 10:43, Jan Beulich wrote:
>>>>> On 01.06.16 at 11:34, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 01/06/16 10:17, Jan Beulich wrote:
>>>>>>> On 01.06.16 at 11:03, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>> While this does work, it undoes some of the work I started with my cpuid
>>>>> improvements in 4.7
>>>>>
>>>>> Does the attached patch also resolve your issue?
>>>> While that's much better than the original, I don't think it's quite
>>>> enough. The rest of the domain policy should be taken into account
>>>> (and I think I had suggested to do so during review of your CPUID
>>>> rework series), i.e. this can't be calculated once for every domain.
>>> Like the current use of {hvm,pv}_featureset, as an upper bound, this is
>>> just a stopgap fix.
>>>
>>> Fixing this in a properly per-domain way is part of my further plans for
>>> cpuid improvements.  The reason it isn't done like this yet is because
>>> there is a substantial quantity of work required to make this function.
>> What substantial work other than XSTATE leaf handling inquiring
>> other leaves do you see?
> 
> I want to adjust the representation of cpuid information in struct
> domain. The current loop in domain_cpuid() causes an O(N) overhead for
> every query, which is very poor for actions which really should be a
> single bit test at a fixed offset.
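> [To illustrate the point above: a minimal sketch, not Xen's actual code.
> The structure and function names below are invented for illustration;
> the real domain_cpuid() walks the domain's stored leaf array much like
> the O(N) lookup here, whereas a feature query should be one bit test in
> a flat bitmap, as in the second function.]

```c
#include <stdint.h>

#define FEATURESET_WORDS 4

/* Illustrative stand-in for a stored CPUID leaf. */
struct cpuid_leaf { uint32_t idx, eax, ebx, ecx, edx; };

/*
 * O(N) lookup, in the spirit of domain_cpuid(): every query walks the
 * whole array of leaves until it finds (or fails to find) a match.
 */
static int leaf_lookup(const struct cpuid_leaf *leaves, unsigned int n,
                       uint32_t idx, struct cpuid_leaf *out)
{
    for ( unsigned int i = 0; i < n; ++i )
        if ( leaves[i].idx == idx )
        {
            *out = leaves[i];
            return 0;
        }

    return -1;
}

/*
 * O(1) featureset query: a feature is one bit at a fixed offset in a
 * flat bitmap, so the test is a shift and a mask.
 */
static int test_feature(const uint32_t fs[FEATURESET_WORDS], unsigned int bit)
{
    return (fs[bit / 32] >> (bit % 32)) & 1;
}
```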
> 
> This needs to be combined with properly splitting the per-domain and
> per-vcpu information, which requires knowing the expected vcpu topology
> during domain creation.
> 
> On top of that, there needs to be verification logic to check the
> correctness of information passed from the toolstack.
> 
> All of these areas are covered in the "known issues" section of the
> feature doc, and I do plan to fix them all.  However, it isn't a couple
> of hours worth of work.

All understood, yet beside the point: The original remark was that
the XSTATE handling itself could be done better with a far smaller
change, at least afaict without having tried.
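[For readers following along: the reason the XSTATE leaf needs special
handling is that CPUID leaf 0xD cannot be served from a static table.
A hedged sketch, with made-up component offsets and sizes rather than
real hardware values: the save-area size reported in %ebx is a function
of which xfeatures are currently enabled in XCR0, i.e. of state outside
the leaf itself.]

```c
#include <stdint.h>

#define NR_XSTATES 8

/*
 * Per-component (offset, size) pairs, as hardware would enumerate in the
 * 0xD sub-leaves.  The numbers here are illustrative only.
 */
static const struct { uint32_t offset, size; } comp[NR_XSTATES] = {
    { 0,   160 },  /* x87          */
    { 160, 256 },  /* SSE          */
    { 576, 256 },  /* AVX          */
    { 832, 64  },  /* MPX BNDREGS  */
    { 896, 64  },  /* MPX BNDCSR   */
    /* remaining components unused in this sketch */
};

/*
 * Size to report in leaf 0xD sub-leaf 0 %ebx: the end of the highest
 * component enabled in XCR0, so the answer changes whenever XCR0 does.
 */
static uint32_t xsave_size(uint64_t xcr0)
{
    uint32_t size = 0;

    for ( unsigned int i = 0; i < NR_XSTATES; ++i )
        if ( (xcr0 >> i) & 1 )
        {
            uint32_t end = comp[i].offset + comp[i].size;

            if ( end > size )
                size = end;
        }

    return size;
}
```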

>>>> And then, as said in reply to the original patch, handle_xsetbv()'s
>>>> checking should be generalized from the special casing of PKRU (or
>>>> if we don't want that, then that special case would better get
>>>> removed for consistency reasons).
>>> Oh - I hadn't even noticed that.  How about this incremental change?
>>>
>>> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
>>> index a0cfcc2..67c0e4b 100644
>>> --- a/xen/arch/x86/xstate.c
>>> +++ b/xen/arch/x86/xstate.c
>>> @@ -658,8 +658,8 @@ int handle_xsetbv(u32 index, u64 new_bv)
>>>      if ( (new_bv & ~xfeature_mask) || !valid_xcr0(new_bv) )
>>>          return -EINVAL;
>>>  
>>> -    /* XCR0.PKRU is disabled on PV mode. */
>>> -    if ( is_pv_vcpu(curr) && (new_bv & XSTATE_PKRU) )
>>> +    /* Sanity check against domain maximum. */
>>> +    if ( new_bv & ~(is_pv_vcpu(curr) ? pv_xfeature_mask : 
>>> hvm_xfeature_mask) )
>>>          return -EOPNOTSUPP;
>> Looks good.
>>
>> But one other aspect: MPX is tagged S, while PKU is tagged H. Do
>> we perhaps need three masks to get the above right at least with
>> just the global masking?
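> [A sketch of the three-mask idea raised here, under stated assumptions:
> separate upper bounds for PV, shadow-mode HVM, and HAP HVM guests,
> applied in the style of the handle_xsetbv() check quoted above.  The
> mask construction, guest-type enum and XSTATE bit values are invented
> for illustration; they are not Xen's real definitions.  MPX (tagged S)
> is allowed for shadow guests, PKU (tagged H) for HAP only.]

```c
#include <stdint.h>

#define XSTATE_FP      (1ULL << 0)
#define XSTATE_SSE     (1ULL << 1)
#define XSTATE_BNDREGS (1ULL << 3)  /* MPX, tagged S: ok under shadow */
#define XSTATE_BNDCSR  (1ULL << 4)
#define XSTATE_PKRU    (1ULL << 9)  /* PKU, tagged H: HAP only */

enum guest_type { GUEST_PV, GUEST_SHADOW, GUEST_HAP };

/* Upper bound on XCR0 for each guest type; each case widens the next. */
static uint64_t xfeature_max(enum guest_type t)
{
    uint64_t mask = XSTATE_FP | XSTATE_SSE;

    switch ( t )
    {
    case GUEST_HAP:
        mask |= XSTATE_PKRU;
        /* fallthrough */
    case GUEST_SHADOW:
        mask |= XSTATE_BNDREGS | XSTATE_BNDCSR;
        /* fallthrough */
    case GUEST_PV:
        break;
    }

    return mask;
}

/* The handle_xsetbv()-style check: reject bits above the type's maximum. */
static int check_xsetbv(enum guest_type t, uint64_t new_bv)
{
    return (new_bv & ~xfeature_max(t)) ? -1 /* would be -EOPNOTSUPP */ : 0;
}
```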
> 
> So it is.  I wonder why.  (It was previously like that, and I didn't
> alter it as part of my cpuid work).  I can't see anything in MPX which
> makes it unsafe to use with shadow.  All control state works on linear
> addresses rather than physical addresses.

Well, it's MPX that's tagged as available for shadow guests, and
PKU as unavailable.

> For now, I will implement the same kind of runtime check as hides MPX
> from shadow guests in hvm_cpuid().  This again will eventually be
> subsumed by the further work.
> 
> I am also wondering whether we should disable MPX by default for
> domains.  We know it won't work properly for anything which ends up
> being emulated.  Users who specifically want to try it should be able to
> turn it on in their domain configuration file, but it really shouldn't
> be on by default.

For guests being introspected I agree. For others I'm not sure, as
normally we shouldn't hit the instruction emulator for branches (the
main exception being PAE mode shadow guests).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

