[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/3] x86/boot: Remove cached CPUID data from the trampoline



On 12.11.2019 17:09, Andrew Cooper wrote:
> On 04/11/2019 15:31, Jan Beulich wrote:
>> On 04.11.2019 16:22, Andrew Cooper wrote:
>>> On 04/11/2019 15:03, Jan Beulich wrote:
>>>> On 04.11.2019 15:59, Andrew Cooper wrote:
>>>>> On 04/11/2019 13:25, Jan Beulich wrote:
>>>>>> On 01.11.2019 21:25, Andrew Cooper wrote:
>>>>>>> --- a/xen/arch/x86/cpu/intel.c
>>>>>>> +++ b/xen/arch/x86/cpu/intel.c
>>>>>>> @@ -270,6 +270,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>>>>>>>         if (disable) {
>>>>>>>                 wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable);
>>>>>>>                 bootsym(trampoline_misc_enable_off) |= disable;
>>>>>>> +               bootsym(trampoline_efer) |= EFER_NX;
>>>>>>>         }
>>>>>> I'm fine with all other changes here, just this one concerns me:
>>>>>> Before your change we latch a value into trampoline_misc_enable_off
>>>>>> just to be used for subsequent IA32_MISC_ENABLE writes we do. This
>>>>>> means that, on a hypervisor (like Xen itself) simply discarding
>>>>>> guest writes to the MSR (which is "fine" with the described usage
>>>>>> of ours up to now), with your change we'd now end up trying to set
>>>>>> EFER.NX when the feature may not actually be enabled in
>>>>>> IA32_MISC_ENABLE. Wouldn't such an EFER write be liable to #GP(0)?
>>>>>> I.e. don't we need to read back the MSR value here, and verify
>>>>>> we actually managed to clear the bit before actually OR-ing in
>>>>>> EFER_NX?
>>>>> If this is a problem in practice, execution won't continue beyond the
>>>>> next if() condition just out of context, which set EFER.NX on the BSP
>>>>> with an unguarded WRMSR.
>>>> And how is this any good? This kind of crash is exactly what I'm
>>>> asking to avoid.
>>> What is the point of working around a theoretical edge case of broken
>>> nesting under Xen which demonstrably doesn't exist in practice?
>> Well, so far nothing was said about this not being an actual problem.
> 
> Its not an actual problem.  If it were, we would have had crash reports.
> 
>> I simply don't know whether hardware would refuse such an EFER write.
> 
> I've just experimented - writing EFER.NX takes a #GP fault when
> MISC_ENABLE.XD is set.
> 
>> If it does, it would be appropriate for hypervisors to also refuse
>> it. I.e. if we don't do so right now, correcting the behavior would
>> trip the code here.
> 
> MISC_ENABLES.XD is architectural on any Intel system which enumerates
> NX, and if the bit is set, it can be cleared.  (Although the semantics
> described in the SDM are broken.  It is only available if NX is
> enumerated, which is obfuscated by setting XD).
> 
> However, no hypervisor is going to bother virtualising this
> functionality.  Either configure the VM with NX or without.  (KVM for
> example doesn't virtualise MISC_ENABLES at all.)

I'm sorry, but I still don't follow: You say "if the bit is set, it
can be cleared", which is clearly not in line with our current guest
MSR write handling. It just so happens that we have no command line
option allowing to suppress the clearing of XD. If we had, according
to your findings above we'd run into a #GP upon trying to set NX.
How can you easily exclude another hypervisor actually doing so (and
nobody having run into the issue simply because the option is rarely
used)?

Btw - all would be fine if the code in question was guarded by an
NX feature check, but as you say that's not possible because XD set
forces NX clear. However, our setting of EFER.NX could be guarded
this way, as we _expect_ XD to be clear at that point.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.