[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 7/8] xen/x86: Rework CR4 handling for PV guests
>>> On 24.06.15 at 18:31, <andrew.cooper3@xxxxxxxxxx> wrote: > PV CR4 settings are now based on mmu_cr4_features, rather than the current > contents of CR4. This causes dom0 to be consistent with domUs, despite > being > constructed in a region with CR4.SMAP purposefully disabled. That'll be fine as long as we're keeping CR4 stable (except for guest controlled bits). Agreed, once that's no longer the case, mmu_cr4_features would no longer be very useful either. Still it would seem more reasonable to me to base things on current values than on some not guaranteed to be up to date. > The major change here is that v->arch.pv_vcpu.ctrlreg[4] now contains only CR4 > bits which Xen wishes to shadow, rather than containing a mix of host and > guest bits. This is likely to elicit a warning when migrating a PV guest > from > an older version of Xen, although no function problems as a result. For one, I'm having a hard time guessing what the part of the sentence after the comma is meant to tell me. And then, from a support perspective, such warnings aren't helpful, as they incur questions. Plus finally, storing neither the current guest view nor the current hypervisor view in that field seems prone to cause headaches in the future. Otoh this gets it in line with CR0 handling as it looks. > A second change is that visible set of CR4 bits is different. In > particular, > VMXE and MCE are not leaked into a guest, while SMAP and SMEP are exposed. Not leaking VMXE into the guest is fine. Not exposing MCE even to Dom0 is questionable. And exposing SMEP and SMAP without exposing their CPUID flags seems bogus. > In addition, add PGE to the set of bits we don't care about a guest > attempting to modify. I agree from a functionality pov this is okay, but do we really want to allow the guest to turn this off from a performance pov? > @@ -683,6 +683,26 @@ void arch_domain_unpause(struct domain *d) > } > > /* > + * CR4 bits Xen will shadow on behalf of the guest. > + * - TSD for vtsc > + * - DE for %%dr4/5 emulation DR4/5 handling is surely secondary here: I/O breakpoints are the main feature to be controlled by this bit. > @@ -699,6 +719,8 @@ static int __init init_pv_cr4_masks(void) > common_mask &= ~X86_CR4_DE; > if ( cpu_has_xsave ) > common_mask &= ~X86_CR4_OSXSAVE; > + if ( cpu_has_pge ) > + common_mask &= ~X86_CR4_PGE; Following the earlier comment, shouldn't this bit then also be added to PV_CR4_SHADOW? > @@ -708,21 +730,61 @@ static int __init init_pv_cr4_masks(void) > if ( cpu_has_fsgsbase ) > pv_cr4_mask &= ~X86_CR4_FSGSBASE; > > + BUILD_BUG_ON(PV_CR4_SHADOW & PV_CR4_READ); > + > return 0; > } > __initcall(init_pv_cr4_masks); > > -unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long > guest_cr4) > +void pv_cr4_write(struct vcpu *v, unsigned long guest_cr4) > { > - unsigned long hv_cr4 = real_cr4_to_pv_guest_cr4(read_cr4()); > + unsigned long host_cr4 = mmu_cr4_features & (PV_CR4_READ | > PV_CR4_SHADOW); > unsigned long mask = is_pv_32bit_vcpu(v) ? compat_pv_cr4_mask : > pv_cr4_mask; > > - if ( (guest_cr4 & mask) != (hv_cr4 & mask) ) > + if ( guest_cr4 == 0 ) > + { > + /* Default - all shadowed bits to 0. */ > + guest_cr4 = mmu_cr4_features & ~PV_CR4_SHADOW; > + } > + else if ( (guest_cr4 & ~PV_CR4_SHADOW) == 0 ) > + { > + /* > + * Only setting shadowed bits. This will be the case on restore, so > + * avoid warning about losing readable host bits. > + */ > + } > + else if ( (guest_cr4 & mask) != (host_cr4 & mask) ) > + { > printk(XENLOG_G_WARNING > - "d%d attempted to change %pv's CR4 flags %08lx -> %08lx\n", > - current->domain->domain_id, v, hv_cr4, guest_cr4); > + "d%d attempted to change %pv's CR4 flags %08lx -> %08lx " > + "(bad +%#lx, -%#lx)\n", No mixture of 0x-prefixed and not-0x-prefixed hex numbers please. I anyway wonder whether printing four numbers here is necessary when printing just three (host_cr4, guest_cr4, and mask) would suffice. > + current->domain->domain_id, v, host_cr4, guest_cr4, > + (host_cr4 ^ guest_cr4) & guest_cr4 & mask, > + (host_cr4 ^ guest_cr4) & ~guest_cr4 & mask); > + } > + > + v->arch.pv_vcpu.ctrlreg[4] = > + ((mmu_cr4_features & mask) | (guest_cr4 & ~mask)) & PV_CR4_SHADOW; With FSGSBASE not in PV_CR4_SHADOW, how would (as the comment in init_pv_cr4_masks() says) the guest manage to observe the flag in its desired state after a migration? > +unsigned long pv_cr4_read(const struct vcpu *v) > +{ > + unsigned long cr4 = read_cr4(); > + > + return (cr4 & PV_CR4_READ & ~PV_CR4_SHADOW) | > + (v->arch.pv_vcpu.ctrlreg[4] & PV_CR4_SHADOW); I don't understand either of the two uses of PV_CR4_SHADOW here: The first seems pointless as earlier an you already assert that it and PV_CR4_READ are disjoint, i.e. the negation of the former surely has no less bits set than the latter. And the second seems pointless since you only ever store bits in PV_CR4_SHADOW into the field. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |