[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC] x86/Intel: virtualize support for cpuid faulting



Thanks for the reviews!

On Tue, Oct 4, 2016 at 3:31 AM, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
> On 03/10/16 23:38, Kyle Huey wrote:
>> rr (http://rr-project.org/), a Linux userspace record-and-replay reverse-
>> execution debugger, would like to trap and emulate the CPUID instruction.
>> This would allow us to a) mask away certain hardware features that rr does
>> not support (e.g. RDRAND) and b) enable trace portability across machines
>> by providing constant results. Patches for support in the Linux kernel are in
>> flight, and we'd like to be able to use this feature on virtualized Linux
>> instances as well.
>>
>> On HVM guests, the cpuid triggers a vm exit, so we can check the emulated
>> faulting state in vmx_do_cpuid and inject a GP(0) if CPL > 0. Notably no
>> hardware support for faulting on cpuid is necessary to emulate support with 
>> an
>> HVM guest.
>>
>> On PV guests, hardware support is required so that userspace cpuid will trap
>> to xen. Xen already enables cpuid faulting on supported CPUs for pv guests 
>> (that
>> aren't the control domain, see the comment in intel_ctxt_switch_levelling).
>
> I see you found the first part of my cpuid overhaul work.
>
> For your reference (as it will eventually impact the code you add here),
> I am currently working on the 2nd part which will clean up the myriad of
> different cpuid functions we currently have.  After that, I will be
> arranging to include MSR levelling, at which point your cpuid_fault
> parameter can be read straight out of the vcpu's MSR bank, rather than
> living in a magic variable.

Cool.  Sounds like a nicer world :-)

> MSR levelling however is a long way off and shouldn't block this
> feature, so the magic boolean is ok for now.
>
>> Every userspace cpuid will trap via a GP(0) to emulate_privileged_op
>> (via do_general_protection). Once there we can simply decline to emulate the
>> cpuid and instead pass the GP(0) along to the guest kernel, thus emulating
>> the cpuid faulting behavior. PV guest kernels enter pv_cpuid via a different
>> path, so we do not need to check the CPL here.
>
> Which path is this? emulate_forced_invalid_op()?
>
> While typically only used by a guest kernel, it is accessable to
> userspace as well, so should be treated accordingly.

No, emulate_privileged_op is called only by do_general_protection
(unless I'm missing something here).  I see now that I'm looking at it
again that 'gp_in_kernel' here refers to xen's kernel, and not the
guest kernel, so this also needs to grow a CPL check. ('guest_mode'
and 'toggle_guest_mode' having different concepts of 'mode' is
slightly confusing).

>>
>> Signed-off-by: Kyle Huey <khuey@xxxxxxxxxxxx>
>> ---
>>  xen/arch/x86/domain.c        |  2 ++
>>  xen/arch/x86/hvm/vmx/vmx.c   | 24 +++++++++++++++++++-----
>>  xen/arch/x86/traps.c         | 31 ++++++++++++++++++++++++++++++-
>>  xen/include/asm-x86/domain.h |  3 +++
>>  4 files changed, 54 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>> index 3c4b094..f2bac10 100644
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1095,6 +1095,8 @@ int arch_set_info_guest(
>>      for ( i = 0; i < 8; i++ )
>>          (void)set_debugreg(v, i, c(debugreg[i]));
>>
>> +    v->arch.cpuid_fault = 0;
>> +
>>      if ( v->is_initialised )
>>          goto out;
>>
>> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
>> index 50cbfed..f1e04c1 100644
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -2430,11 +2430,16 @@ static void vmx_cpuid_intercept(
>>      HVMTRACE_5D (CPUID, input, *eax, *ebx, *ecx, *edx);
>>  }
>>
>> -static int vmx_do_cpuid(struct cpu_user_regs *regs)
>> +static int vmx_do_cpuid(struct vcpu *v, struct cpu_user_regs *regs)
>>  {
>>      unsigned int eax, ebx, ecx, edx;
>>      unsigned int leaf, subleaf;
>>
>> +    if ( v->arch.cpuid_fault && !ring_0(regs) ) {
>
> ring_0() is not the correct check to use here.  It is unsafe for
> anything other than a PV guest, and I will see about removing it from
> the header files to avoid future misuse.

Good to know!  Is ring_0 the correct thing to use above, in
emulate_privileged_op (which I believe is PV only)?

> CPL must always be read from %ss, not %cs, as %cs is inaccurate in real
> or vm86 mode.
>
> Amazingly, it turns out that we don't actually have an hvm_get_cpl()
> (we should fix that).  Until we gain one, you want:
>
> struct segment_register sreg;
> hvm_get_segment_register(curr, x86_seg_ss, &sreg);
>
> if ( v->arch.cpuid_fault && sreg.attr.fields.dpl > 0 )
>
> ~Andrew

Ok.  Adding an hvm_get_cpl seems pretty straightforward.  I'll do that.

- Kyle

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.