Re: [Xen-devel] [PATCH 9/9] x86/vmx: Don't leak EFER.NXE into guest context
>>> On 25.05.18 at 10:36, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 25/05/2018 08:49, Jan Beulich wrote:
>>>>> On 22.05.18 at 13:20, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> @@ -1650,22 +1641,81 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr,
>>>
>>>  static void vmx_update_guest_efer(struct vcpu *v)
>>>  {
>>> -    unsigned long vm_entry_value;
>>> +    unsigned long entry_ctls, guest_efer = v->arch.hvm_vcpu.guest_efer,
>>> +        xen_efer = read_efer();
>>> +
>>> +    if ( paging_mode_shadow(v->domain) )
>>> +    {
>>> +        /*
>>> +         * When using shadow pagetables, EFER.NX is a Xen-owned bit and is not
>>> +         * under guest control.
>>> +         */
>>> +        guest_efer &= ~EFER_NX;
>>> +        guest_efer |= xen_efer & EFER_NX;
>>> +
>>> +        /*
>>> +         * At the time of writing (May 2018), the Intel SDM "VM Entry: Checks
>>> +         * on Guest Control Registers, Debug Registers and MSRs" section says:
>>> +         *
>>> +         *  If the "Load IA32_EFER" VM-entry control is 1, the following
>>> +         *  checks are performed on the field for the IA32_MSR:
>>> +         *   - Bits reserved in the IA32_EFER MSR must be 0.
>>> +         *   - Bit 10 (corresponding to IA32_EFER.LMA) must equal the value of
>>> +         *     the "IA-32e mode guest" VM-entry control.  It must also be
>>> +         *     identical to bit 8 (LME) if bit 31 in the CR0 field
>>> +         *     (corresponding to CR0.PG) is 1.
>>> +         *
>>> +         * Experimentally what actually happens is:
>>> +         *  - Checks for EFER.{LME,LMA} apply uniformly whether using the
>>> +         *    GUEST_EFER VMCS controls, or MSR load/save lists.
>>> +         *  - Without EPT, LME being different to LMA isn't tolerated by
>>> +         *    hardware.  As writes to CR0 are intercepted, it is safe to
>>> +         *    leave LME clear at this point, and fix up both LME and LMA when
>>> +         *    CR0.PG is set.
>>> +         */
>>> +        if ( !(guest_efer & EFER_LMA) )
>>> +            guest_efer &= ~EFER_LME;
>>> +    }
>> Why is this latter adjustment done only for shadow mode?
>
> How should I go about making the comment clearer?
>
> When EPT is active, hardware is happy with LMA != LME.  When EPT is
> disabled, hardware strictly requires LME == LMA.

Part of my problem may be that "Without EPT" can have two meanings:
hardware without EPT, or EPT disabled on otherwise capable hardware.

> This particular condition occurs architecturally on the transition into
> long mode, between setting LME and setting CR0.PG, and updating EFER
> controls in the naive way results in a vmentry failure.
>
> Having spoken to Intel, they agree with my assessment that the docs
> appear to be correct for Gen1 hardware, and stale for Gen2 hardware,
> where fixing this was one of many parts of making Unrestricted Guest work.

This suggests you mean the former, in which case the check really
doesn't belong inside a paging_mode_shadow() conditional.

>> After the above adjustments, when guest_efer still matches
>> v->arch.hvm_vcpu.guest_efer, couldn't we disable the MSR read
>> intercept?
>
> In principle, yes.  We use load/save lists, as long as we remembered to
> recalculate EFER every time CR0 gets modified in the shadow path.
>
> However, that would be a net performance penalty rather than benefit
> (which is why I've gone to the effort of creating load-only lists).
>
> In practice, EFER is written at boot and not touched again.  Having
> load/save logic might avoid these vmexits, but at the cost of almost
> every other vmexit needing to keep the guest_efer in sync with the
> load/save list or VMCS field.

I can't seem to connect this to my question about the MSR _read_ intercept.
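For illustration, a minimal sketch of what clearing the read intercept could
look like, assuming Xen's vmx_clear_msr_intercept()/vmx_set_msr_intercept()
helpers and the VMX_MSR_R intercept type (illustrative only, not part of the
patch as posted):

    /*
     * Sketch only: when the value the guest expects to read back matches
     * what is loaded into hardware, RDMSR of MSR_EFER could be allowed to
     * complete without a vmexit; otherwise the read must keep trapping so
     * the virtualised value can be supplied.
     */
    if ( guest_efer == v->arch.hvm_vcpu.guest_efer )
        vmx_clear_msr_intercept(v, MSR_EFER, VMX_MSR_R);
    else
        vmx_set_msr_intercept(v, MSR_EFER, VMX_MSR_R);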
Jan