Xen project Mailing List

Re: [Xen-devel] [PATCH 1/3] Nested VMX: update nested paging mode when vmswitch is in progress

To: "Christoph Egger" <chegger@xxxxxxxxx>, "Eddie Dong" <eddie.dong@xxxxxxxxx>, "Yang Z Zhang" <yang.z.zhang@xxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Tue, 14 Jan 2014 07:29:51 +0000

Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 14 Jan 2014 07:30:26 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 14.01.14 at 03:33, "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> wrote: > Zhang, Yang Z wrote on 2013-12-24: >> Zhang, Yang Z wrote on 2013-12-23: >>> Egger, Christoph wrote on 2013-12-18: >>>> On 18.12.13 11:24, Zhang, Yang Z wrote: >>>>> Jan Beulich wrote on 2013-12-18: >>>>>>>>> On 18.12.13 at 09:58, "Dong, Eddie" <eddie.dong@xxxxxxxxx> wrote: >>>>>>> Acked by Eddie Dong <eddie.dong@xxxxxxxxx> >>>>>> >>>>>> As long as Christoph's reservations wrt SVM aren't being >>>>>> addressed/ eliminated, I don't think we can apply this patch. >>>>>> >>>>>> Furthermore, while you ack-ed this patch (which isn't really VMX >>>>>> specific) and patch 3, you didn't ack patch 2, but you also >>>>>> didn't indicate anything that's possibly wrong with it. >>>>> >>>>> Actually, I asked him help to review the first patch. Since >>>>> Christoph thought >>>> the first patch may break AMD. So I hope he can help to review the >>>> first patch to see whether I am wrong. >>>>> >>>>>> >>>>>> And finally, with patch 1 needing to be left out for the moment, >>>>>> I'd like to have confirmation that all three patches can be >>>>>> applied independently (i.e. with the current state of things only >>>>>> patch 3 is ready to >>>> go in). >>>>> >>>>> Yes, the three patches are independent. >>>> >>>> I have looked through code. >>>> >>>> vcpu is in guestmode till the vmentry/vmexit emulation is done. >>>> In SVM the vcpu guestmode changes right before setting >>>> nv_vmswitch_in_progress to 0 when the vmentry/vmexit emulation was >>>> successfull (there is a bunch of error-checking). >>>> >>> >>> After checking the SVM logic, I find the basic usage of >>> vcpu_in/exitk_guestmode of VMX side is different from that of SVM side: >>> VMX side: >>> virtual vmentry: set vcpu in guestmode after vmcs switched to vmcs02. >>> This happed at the beginning of vmentry which means the whole >>> vmentry emulation code is executed when vcpu is in guest mode. >>> virtual vmexit: set vcpu exit guestmode after vmcs switched to vmcs01. >>> Also, this action occurred just after sync vmcs02 to vmcs12 and >>> before the vmcs01's state restored (like set_cr), which means the >>> vmcs01's state restored when vcpu is not in guest mode. >>> SVM side: >>> virtual vmentry: set vcpu in guest mode after vmentry emulation is >>> done, which means the emulation code is executed when vcpu is not in >>> guest mode. virtual vmexit: set vcpu exit guest mode after vmexit >>> emulation is >>> done, which means the emulation code is executed when vcpu is in guest >>> mode. >>> >>> Ok, now let us take a look at current implementation, take >>> hvm_set_cr0 for example: Update nested mode when (vcpu_in_guestmode >>> && !vmswitch_in_progress). Otherwise, update L1's paging mode: >>> if ( !nestedhvm_vmswitch_in_progress(v) && >>> nestedhvm_vcpu_in_guestmode(v) ) >>> paging_update_nestedmode(v); else >>> paging_update_paging_modes(v); virtual vmentry: >>> Expected result: nested mode is being updated. >>> Current result in SVM: >>> !vcpu_in_guestmode and vmswitch_in_progress: L1's paging >>> mode is updated. Wrong. >>> Current result in VMX: >>> vcpu_in_guestmode and vmswitch_in_progress : L1's paging >>> mode is updated. Wrong. >>> >>> Virtual vmexit: >>> Expected result: L1's paging mode is updated. >>> Current result in SVM: >>> vcpu_in_guestmode and vmswitch_in_progress: L1's paging >>> mode is updated. Correct. >>> Current result in VMX: >>> !vcpu_in_guestmode and vmswitch_in_progress: L1's paging >>> mod is updated. Correct >>> >>> From the above result, we can see the vmswitch_in_progress is >>> actually take effect not vcpu_guest_mode. The original code doesn't >>> consider the paging mode changed during vmentry/vmexit emulation. >>> This seems wrong to me, because paging mode changing happens in real world. >>> >>> Here is the result with my patch: Update nested mode when >>> vcpu_in_guestmode. Otherwise, update L1's paging mode: >>> if (nestedhvm_vcpu_in_guestmode(v) ) >>> paging_update_nestedmode(v); else >>> paging_update_paging_modes(v); virtual vmentry: >>> Expected result: nested mode is being updated. >>> Current result in SVM: >>> !vcpu_in_guestmode: L1's paging mode is updated. Wrong. >>> Current result in VMX: vcpu_in_guestmode : Nested paging >>> mode is updated. Correct. >>> Virtual vmexit: >>> Expected result: L1's paging mode is updated. >>> Current result in SVM: >>> vcpu_in_guestmode: Nested paging mode is updated. Wrong. >>> Current result in VMX: !vcpu_in_guestmode: L1's paging >>> mod >> is >>> updated. Correct >>> From the above, we can see the problem is that the way to >>> distinguish the vmentry and vmexit in SVM and VMX side is different. >>> Since I am not familiar the SVM, so i am not sure whether the usage >>> of vcpu_in_guestmode in SVM is right or wrong. But in VMX side, once >>> the vmcs is changed, then the vcpu_in_guestmode is changed too which >>> we think is following hardware's behavior. >>> >>> Also, I think I found another issue that the paging mode cannot be >>> tracked correctly with current way or with my patch. Need more time >>> to >> investigate. >> >> Ok. The issue is that if the paging state is changed via vmwrite (L1 >> writes L2's vmcs to change paging mode directly) which is unaware to >> L0. And hypervisor cannot set the right nested paging mode during >> virtual vmentry. So we need to update nest mode unconditionally for each > virtual vmentry. >> >>> >>> >>>> This patch breaks both vmentry and vmexit emulation for SVM. >>>> The vmentry breakage comes with l1-hypervisor using shadow-paging. >>>> >>>> During vmexit emulation hvm_set_cr0 and hvm_set_cr4 are called to >>>> restore cr0 and cr4 for the l1 guest. With this patch the paging >>>> mode for the l2 guest is updated rather for the l1 guest. >>>> >>>> I think this patch also breaks the case where l2 guest wants to set >>>> cr0 or cr4 and l1-hypervisor does not intercept cr0/cr4 and >>>> l1-hypervisor uses shadow-paging. This may also count for VMX. >>> >>> For this case, am I missing something? If yes, please correct me. >>> >>>> >>>> This is just from reading the code. As I said, I do not have a >>>> setup to verify this, unfortunately. > > Any comments ? Considering Christoph's comments and reservations, if you can't alone deal with this I think you should work with the AMD people to eliminate or address his concerns. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.