
Re: [Xen-devel] [PATCH 1/3] Nested VMX: update nested paging mode when vmswitch is in progress



>>> On 14.01.14 at 03:33, "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> wrote:
> Zhang, Yang Z wrote on 2013-12-24:
>> Zhang, Yang Z wrote on 2013-12-23:
>>> Egger, Christoph wrote on 2013-12-18:
>>>> On 18.12.13 11:24, Zhang, Yang Z wrote:
>>>>> Jan Beulich wrote on 2013-12-18:
>>>>>>>>> On 18.12.13 at 09:58, "Dong, Eddie" <eddie.dong@xxxxxxxxx> wrote:
>>>>>>> Acked by Eddie Dong <eddie.dong@xxxxxxxxx>
>>>>>> 
>>>>>> As long as Christoph's reservations wrt SVM aren't being
>>>>>> addressed/eliminated, I don't think we can apply this patch.
>>>>>> 
>>>>>> Furthermore, while you ack-ed this patch (which isn't really VMX
>>>>>> specific) and patch 3, you didn't ack patch 2, but you also
>>>>>> didn't indicate anything that's possibly wrong with it.
>>>>> 
>>>>> Actually, I asked him to help review the first patch, since
>>>>> Christoph thought it may break AMD. So I hope he can review it
>>>>> to see whether I am wrong.
>>>>> 
>>>>>> 
>>>>>> And finally, with patch 1 needing to be left out for the moment,
>>>>>> I'd like to have confirmation that all three patches can be
>>>>>> applied independently (i.e. with the current state of things only
>>>>>> patch 3 is ready to go in).
>>>>> 
>>>>> Yes, the three patches are independent.
>>>> 
>>>> I have looked through the code.
>>>> 
>>>> The vcpu stays in guest mode until the vmentry/vmexit emulation is
>>>> done. In SVM, the vcpu's guest mode changes right before
>>>> nv_vmswitch_in_progress is set to 0, once the vmentry/vmexit
>>>> emulation was successful (there is a bunch of error checking).
>>>> 
>>> 
>>> After checking the SVM logic, I find that the point at which the
>>> vcpu enters/exits guest mode on the VMX side is different from that
>>> on the SVM side:
>>> VMX side:
>>>     virtual vmentry: the vcpu is put into guest mode right after the
>>> vmcs is switched to vmcs02. This happens at the beginning of vmentry,
>>> which means the whole vmentry emulation code is executed while the
>>> vcpu is in guest mode.
>>>     virtual vmexit: the vcpu leaves guest mode right after the vmcs
>>> is switched to vmcs01. This happens just after vmcs02 is synced to
>>> vmcs12 and before vmcs01's state is restored (e.g. via set_cr), which
>>> means vmcs01's state is restored while the vcpu is not in guest mode.
>>> SVM side:
>>>     virtual vmentry: the vcpu is put into guest mode after the
>>> vmentry emulation is done, which means the emulation code is executed
>>> while the vcpu is not in guest mode.
>>>     virtual vmexit: the vcpu leaves guest mode after the vmexit
>>> emulation is done, which means the emulation code is executed while
>>> the vcpu is still in guest mode.
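>>>
>>> (For illustration, a minimal, hypothetical C sketch of this ordering
>>> difference; emulate() and the flag are placeholders, not the actual
>>> Xen code:)
>>>
>>> #include <stdbool.h>
>>>
>>> static bool in_guestmode;  /* models nestedhvm_vcpu_in_guestmode(v) */
>>>
>>> static void emulate(void) { /* vmentry emulation body */ }
>>>
>>> static void vmx_style_vmentry(void)
>>> {
>>>     in_guestmode = true;   /* VMX: flips once vmcs02 is in place... */
>>>     emulate();             /* ...so emulation runs in guest mode */
>>> }
>>>
>>> static void svm_style_vmentry(void)
>>> {
>>>     emulate();             /* SVM: emulation runs out of guest mode... */
>>>     in_guestmode = true;   /* ...the flag flips only after success */
>>> }
>>>
>>> int main(void)
>>> {
>>>     vmx_style_vmentry();
>>>     in_guestmode = false;
>>>     svm_style_vmentry();
>>>     return 0;
>>> }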
>>> 
>>> Ok, now let us take a look at the current implementation, taking
>>> hvm_set_cr0 as an example: update the nested paging mode when
>>> (vcpu_in_guestmode && !vmswitch_in_progress); otherwise, update L1's
>>> paging mode:
>>>         if ( !nestedhvm_vmswitch_in_progress(v) &&
>>>              nestedhvm_vcpu_in_guestmode(v) )
>>>             paging_update_nestedmode(v);
>>>         else
>>>             paging_update_paging_modes(v);
>>>
>>> Virtual vmentry:
>>>     Expected result: the nested paging mode is updated.
>>>     Current result on SVM:
>>>           !vcpu_in_guestmode and vmswitch_in_progress: L1's paging
>>> mode is updated.  Wrong.
>>>     Current result on VMX:
>>>           vcpu_in_guestmode and vmswitch_in_progress: L1's paging
>>> mode is updated.  Wrong.
>>>
>>> Virtual vmexit:
>>>     Expected result: L1's paging mode is updated.
>>>     Current result on SVM:
>>>           vcpu_in_guestmode and vmswitch_in_progress: L1's paging
>>> mode is updated.  Correct.
>>>     Current result on VMX:
>>>           !vcpu_in_guestmode and vmswitch_in_progress: L1's paging
>>> mode is updated.  Correct.
>>> 
>>> From the above results, we can see that it is vmswitch_in_progress
>>> that actually takes effect, not vcpu_in_guestmode. The original code
>>> doesn't consider that the paging mode may change during
>>> vmentry/vmexit emulation. This seems wrong to me, because paging mode
>>> changes do happen in the real world.
>>> 
>>> Here is the result with my patch: update the nested paging mode when
>>> vcpu_in_guestmode; otherwise, update L1's paging mode:
>>>         if ( nestedhvm_vcpu_in_guestmode(v) )
>>>             paging_update_nestedmode(v);
>>>         else
>>>             paging_update_paging_modes(v);
>>>
>>> Virtual vmentry:
>>>     Expected result: the nested paging mode is updated.
>>>     Current result on SVM:
>>>           !vcpu_in_guestmode: L1's paging mode is updated.  Wrong.
>>>     Current result on VMX:
>>>           vcpu_in_guestmode: the nested paging mode is updated.  Correct.
>>>
>>> Virtual vmexit:
>>>     Expected result: L1's paging mode is updated.
>>>     Current result on SVM:
>>>           vcpu_in_guestmode: the nested paging mode is updated.  Wrong.
>>>     Current result on VMX:
>>>           !vcpu_in_guestmode: L1's paging mode is updated.  Correct.
>>> From the above, we can see the problem is that the SVM and VMX sides
>>> distinguish vmentry from vmexit in different ways. Since I am not
>>> familiar with SVM, I am not sure whether its usage of
>>> vcpu_in_guestmode is right or wrong. But on the VMX side, as soon as
>>> the vmcs is switched, vcpu_in_guestmode changes too, which we believe
>>> follows the hardware's behavior; the sketch below walks both checks
>>> through the four cases.
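>>>
>>> (A tiny, hypothetical harness, not Xen code, that evaluates both
>>> checks for the four cases above; the flag values are those in effect
>>> while hvm_set_cr0() runs during the emulation:)
>>>
>>> #include <stdbool.h>
>>> #include <stdio.h>
>>>
>>> struct state {
>>>     const char *name;
>>>     bool in_guestmode;          /* nestedhvm_vcpu_in_guestmode() */
>>>     bool vmswitch_in_progress;  /* nestedhvm_vmswitch_in_progress() */
>>> };
>>>
>>> int main(void)
>>> {
>>>     const struct state s[] = {
>>>         { "SVM virtual vmentry", false, true },
>>>         { "VMX virtual vmentry", true,  true },
>>>         { "SVM virtual vmexit",  true,  true },
>>>         { "VMX virtual vmexit",  false, true },
>>>     };
>>>     for ( unsigned int i = 0; i < sizeof(s) / sizeof(s[0]); i++ )
>>>     {
>>>         bool old_cond = !s[i].vmswitch_in_progress && s[i].in_guestmode;
>>>         bool new_cond = s[i].in_guestmode;  /* the patched check */
>>>         printf("%-20s original: %-12s patched: %s\n", s[i].name,
>>>                old_cond ? "nested mode" : "L1 mode",
>>>                new_cond ? "nested mode" : "L1 mode");
>>>     }
>>>     return 0;
>>> }
>>>
>>> (Because vmswitch_in_progress is set in all four cases, the original
>>> check always picks L1's mode; the patched check gets both cases right
>>> on VMX but both wrong on SVM.)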
>>> 
>>> Also, I think I found another issue: the paging mode cannot be
>>> tracked correctly either with the current approach or with my patch.
>>> I need more time to investigate.
>> 
>> Ok. The issue is that the paging state can be changed via vmwrite (L1
>> writes L2's vmcs to change the paging mode directly), which L0 is
>> unaware of, so the hypervisor cannot set the right nested paging mode
>> during virtual vmentry. We therefore need to update the nested paging
>> mode unconditionally on each virtual vmentry, as in the sketch below.
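>>
>> (A minimal, hypothetical sketch of that direction; the declarations
>> stand in for the real Xen ones, and this is not an actual patch:)
>>
>> struct vcpu;
>> void nestedhvm_vcpu_enter_guestmode(struct vcpu *v);
>> void paging_update_nestedmode(struct vcpu *v);
>>
>> static void virtual_vmentry(struct vcpu *v)
>> {
>>     nestedhvm_vcpu_enter_guestmode(v); /* vmcs02 in place, guest mode set */
>>     /* ... copy vmcs12 guest state into vmcs02 ... */
>>     /* L1 may have changed L2's CR0/CR4 via vmwrite without taking a CR
>>      * intercept, so L0's cached mode can be stale: refresh it always. */
>>     paging_update_nestedmode(v);
>> }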
>> 
>>> 
>>> 
>>>> This patch breaks both vmentry and vmexit emulation for SVM.
>>>> The vmentry breakage occurs when the l1-hypervisor uses shadow paging.
>>>> 
>>>> During vmexit emulation, hvm_set_cr0 and hvm_set_cr4 are called to
>>>> restore cr0 and cr4 for the l1 guest. With this patch, the paging
>>>> mode is updated for the l2 guest rather than for the l1 guest.
>>>> 
>>>> I think this patch also breaks the case where the l2 guest wants to
>>>> set cr0 or cr4, the l1-hypervisor does not intercept cr0/cr4, and
>>>> the l1-hypervisor uses shadow paging. This may also apply to VMX.
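>>>>
>>>> (For illustration, a hypothetical model of the vmexit breakage, not
>>>> Xen code: on SVM the guest-mode flag is still set while the vmexit
>>>> emulation restores L1's control registers:)
>>>>
>>>> #include <stdbool.h>
>>>> #include <stdio.h>
>>>>
>>>> static bool in_guestmode = true; /* SVM: still set during vmexit */
>>>>
>>>> static void set_cr0_model(void)  /* models the check in hvm_set_cr0 */
>>>> {
>>>>     if ( in_guestmode )          /* the patched check */
>>>>         printf("nested (L2) mode updated, but we are restoring L1\n");
>>>>     else
>>>>         printf("L1 mode updated, which is what the restore needs\n");
>>>> }
>>>>
>>>> int main(void)
>>>> {
>>>>     set_cr0_model();             /* called to restore L1's cr0 */
>>>>     in_guestmode = false;        /* SVM clears the flag only now */
>>>>     return 0;
>>>> }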
>>> 
>>> For this case, am I missing something? If so, please correct me.
>>> 
>>>> 
>>>> This is just from reading the code. As I said, I do not have a
>>>> setup to verify this, unfortunately.
> 
> Any comments?

Considering Christoph's comments and reservations, if you can't
deal with this alone, I think you should work with the AMD people
to eliminate or address his concerns.

Jan
