
Re: [Xen-devel] [PATCH 5/5] x86/HVM: cache emulated instruction for retry processing



>>> On 24.12.13 at 12:29, "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> wrote:
> Jan Beulich wrote on 2013-12-18:
>>>>> On 18.12.13 at 10:40, "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> wrote:
>>> Jan Beulich wrote on 2013-12-18:
>>>>>>> On 18.12.13 at 09:36, "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> wrote:
>>>>> Jan Beulich wrote on 2013-09-30:
>>>>>> Rather than re-reading the instruction bytes upon retry
>>>>>> processing, stash away and re-use what we already read. That
>>>>>> way we can be certain that the retry won't do something different
>>>>>> from what originally requested the retry, getting once again
>>>>>> closer to real hardware behavior (where what we use retries for
>>>>>> is simply a bus operation, not involving redundant decoding of
>>>>>> instructions).
>>>>>> 
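(For reference, the mechanism is roughly along these lines -- an
illustrative sketch only, with made-up structure and function names,
not the code from the actual patch:)

    #include <stdint.h>
    #include <string.h>

    /* Illustrative sketch, not the real Xen structures/names. */
    struct insn_buf {
        unsigned int num;          /* valid bytes; 0 => nothing cached */
        uint8_t      data[16];
    };

    /*
     * On the first pass the bytes are read from the guest and stashed;
     * on a retry the stashed bytes are re-used, so the retried emulation
     * cannot decode something different from what requested the retry.
     */
    static int fetch_insn_bytes(struct insn_buf *cache, uint8_t *dst,
                                unsigned int len,
                                int (*read_guest)(uint8_t *, unsigned int))
    {
        if ( cache->num )
        {
            memcpy(dst, cache->data, cache->num);   /* retry path */
            return 0;
        }
        if ( read_guest(dst, len) )                 /* first pass */
            return -1;
        memcpy(cache->data, dst, len);              /* stash for retry */
        cache->num = len;
        return 0;
    }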
>>>>> 
>>>>> This patch doesn't consider the nested case.
>>>>> For example, if the buffer saved an L2 instruction and we then
>>>>> vmexit to L1, L1 may use the wrong instruction.
>>>> 
>>>> I'm having difficulty seeing how the two could get intermixed:
>>>> There should be, at any given point in time, at most one
>>>> instruction being emulated. Can you please give a more elaborate
>>>> explanation of the situation where you see a (theoretical?
>>>> practical?) problem?
>>> 
>>> I saw this issue when booting L1 Hyper-V. I added some debug output
>>> and observed this strange phenomenon:
>>> 
>>> (XEN) write to buffer: eip 0xfffff8800430bc80, size 16,
>>> content:f7420c1f608488b 44000000011442c7
>>> (XEN) read from buffer: eip 0xfffff800002f6138, size 16,
>>> content:f7420c1f608488b 44000000011442c7
>>> 
>>> From the log we can see two different eips using the same buffer.
>>> Since I don't know how Hyper-V works internally, I cannot give more
>>> information on why this happens. I only see it with L1 Hyper-V (Xen
>>> on Xen and KVM on Xen don't have this issue).
>> 
>> But in order to validate that the fix is (a) needed and (b) correct, a
>> proper understanding of the issue is necessary as the very first step.
>> That doesn't require knowing the internals of Hyper-V; all you need is
>> to track emulator state changes in Xen (which I realize is easier said
>> than done, since you want to make sure you don't generate huge amounts
>> of output before actually hitting the issue, making it close to
>> impossible to analyze).
> 
> Ok. It appears to be an old issue that is merely exposed by your patch:
> Sometimes L0 needs to decode an L2 instruction in order to handle an IO
> access directly, for example when L1 passes a device through to L2
> directly (without VT-d). L0 may get X86EMUL_RETRY when handling this IO
> request. If at the same time there is a virtual vmexit pending (for
> example, an interrupt to be injected into L1), the hypervisor will
> switch the VCPU context from L2 to L1. Now we are already in L1's
> context, but since we just got X86EMUL_RETRY, the hypervisor will retry
> handling the IO request later, and unfortunately the retry happens in
> L1's context.
> Without your patch, L0 simply emulates an instruction of L1's and
> everything goes on. With your patch, L0 fetches the instruction from
> the buffer, which belongs to L2, and the problem arises.
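As a standalone toy model (plain C, nothing to do with the real Xen
code; only the eip values are taken from your log, the instruction
bytes are placeholders), the failure mode would look roughly like
this -- the cached bytes are consumed on retry without any check that
they were recorded for the same context:

    #include <inttypes.h>
    #include <stdio.h>
    #include <string.h>

    struct insn_cache {
        uint64_t     eip;      /* eip at fill time (recorded, not checked) */
        unsigned int size;     /* 0 => nothing cached */
        uint8_t      bytes[16];
    };

    static struct insn_cache cache;

    /* Fill the cache while emulating in L2's context. */
    static void cache_fill(uint64_t eip, const uint8_t *buf, unsigned int size)
    {
        cache.eip = eip;
        cache.size = size;
        memcpy(cache.bytes, buf, size);
    }

    /* On retry the cache is consumed regardless of whose eip it was. */
    static const uint8_t *cache_read(uint64_t eip, unsigned int *size)
    {
        if ( !cache.size )
            return NULL;
        if ( eip != cache.eip )
            printf("stale: filled for eip %#" PRIx64 ", read for eip %#" PRIx64 "\n",
                   cache.eip, eip);
        *size = cache.size;
        return cache.bytes;
    }

    int main(void)
    {
        uint8_t l2_insn[16] = { 0x8b, 0x48, 0x60 };   /* placeholder bytes */
        unsigned int size;

        /* L2's IO access is being emulated; X86EMUL_RETRY is returned. */
        cache_fill(0xfffff8800430bc80ULL, l2_insn, 16);
        /* A virtual vmexit switches the vCPU to L1; the cache is not
         * flushed, so the retry below runs in L1's context and gets
         * L2's bytes back. */
        cache_read(0xfffff800002f6138ULL, &size);
        return 0;
    }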
> 
> So the fix is: if there is a pending IO request, no virtual
> vmexit/vmentry is allowed, which follows the hardware's behavior.
> 
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index 41db52b..c5446a9 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -1394,7 +1394,10 @@ void nvmx_switch_guest(void)
>      struct vcpu *v = current;
>      struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
>      struct cpu_user_regs *regs = guest_cpu_user_regs();
> +    ioreq_t *p = get_ioreq(v);
>  
> +    if ( p->state != STATE_IOREQ_NONE )
> +        return;
>      /*
>       * a softirq may interrupt us between a virtual vmentry is
>       * just handled and the true vmentry. If during this window,

This change looks much more sensible; the question is whether simply
ignoring the switch request is acceptable (and I don't know the
nested HVM code well enough to tell). Furthermore, I wonder
whether this is really a VMX-only issue.

Another alternative might be to flush the cached instruction when
a guest mode switch is being done. Again, I don't know the
nested code well enough to tell what the most suitable behavior
here would be.
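If flushing turned out to be the right approach, the operation itself
could be as simple as something like the following (a purely
hypothetical sketch -- the structure and field names are assumptions,
not the real Xen emulator state):

    #include <stdint.h>

    /* Hypothetical per-vCPU emulation state holding the cached bytes. */
    struct emul_state {
        struct {
            unsigned int num;      /* valid cached instruction bytes */
            uint8_t      data[16];
        } insn_cache;
    };

    /*
     * Would be called on every virtual vmexit/vmentry, i.e. whenever the
     * guest mode changes, so a later retry cannot consume bytes that were
     * fetched in the other guest's context.
     */
    static void flush_insn_cache(struct emul_state *st)
    {
        st->insn_cache.num = 0;
    }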

Jan
