Xen project Mailing List

Re: [Xen-devel] [PATCH] x86/shadow: Correct the behaviour of a guests view of maxphysaddr

To: Tim Deegan <tim@xxxxxxx>

From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Wed, 8 Feb 2017 15:29:58 +0000

Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Wed, 08 Feb 2017 15:30:02 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 08/02/17 14:12, Tim Deegan wrote: > Hi, > > At 12:36 +0000 on 08 Feb (1486557378), Andrew Cooper wrote: >> XSA-173 (c/s 8b1764833) introduces gfn_bits, and an upper limit which might >> be >> lower than the real maxphysaddr, to avoid overflowing the superpage shadow >> backpointer. >> >> However, plenty of hardware has a physical address width less that 44 bits, >> and the code added in shadow_domain_init() is a straight asignment. This >> causes gfn_bits to be increased beyond the physical address width on most >> Intel consumer hardware (which has a width of 39). > Once again, I think this is actually OK. But cpuid and shadow not > agreeing on the limit is a bug and I'm happy for it to be resolved > this way. > >> It also means that the effective maxphysaddr for shadowed guests differs from >> the value reported to the guest in CPUID. This means that a guest can create >> PTEs which either should fault but don't, or shouldn't fault but do. > Can it really create a PTE that shouldn't fault but does? AFAICS the > cpuid policy can report a lower value than 44 but not a higher one. Most Intel hardware has a width of 39. Most AMD hardware has a width of 48. In both cases with this change, CPUID reports these values, but the shadow code limits at 44 bits. On Intel, a guest can get an PTE shadowed which should have faulted and didn't (because the smfn is actually within 39 bits), while on AMD, a guest can try to create a PTE which shouldn't fault (because CPUID says anything up to 48 bits is fine) yet does fault because the shadow code rejects anything above 44 bits. > >> Remove gfn_bits and rework its logic in terms of a guests maxphysaddr. >> recalculate_cpuid_policy() is updated to properly clamp maxphysaddr between >> the host maximum, a possibly-smaller shadow restriction, and 32 as a minimum. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> >> @@ -502,11 +504,17 @@ void recalculate_cpuid_policy(struct domain *d) >> >> cpuid_featureset_to_policy(fs, p); >> >> - p->extd.maxphysaddr = min(p->extd.maxphysaddr, max->extd.maxphysaddr); >> - p->extd.maxphysaddr = min_t(uint8_t, p->extd.maxphysaddr, >> - d->arch.paging.gfn_bits + PAGE_SHIFT); >> - p->extd.maxphysaddr = max_t(uint8_t, p->extd.maxphysaddr, >> - (p->basic.pae || p->basic.pse36) ? 36 : 32); >> + maxphysaddr_limit = paging_mode_hap(d) ? hap_paddr_bits : paddr_bits; >> + >> + if ( !IS_ENABLED(BIGMEM) && paging_mode_shadow(d) && >> + (!is_pv_domain(d) || opt_allow_superpage) ) >> + { >> + /* Shadowed superpages store GFNs in 32-bit page_info fields. */ >> + maxphysaddr_limit = min(maxphysaddr_limit, 32U + PAGE_SHIFT); > I don't like this implementation detail escaping from x86/mm/shadow; > can we have a cpuid_limit_paddr_bits(d) that the shadow code can call > to tell cpuid about this? Or perhaps a paging_max_paddr_bits(d) so > cpuid can ask paging when it needs to know? I will see about doing this. > In either case I wonder whether there needs to be some trigger of > recalculate_cpuid_policy() once shadow mode is enabled -- I guess > currently it relies on it being called by something late enough in hvm > domain creation? Yes, although now I am worried about migrating PV guests. I will double check to see whether that path works sensibly or not, because we can't have maxphysaddr suddenly shrink under the feet of a running guest. > >> + } >> + >> + p->extd.maxphysaddr = min_t(uint8_t, p->extd.maxphysaddr, >> maxphysaddr_limit); >> + p->extd.maxphysaddr = max_t(uint8_t, p->extd.maxphysaddr, 32); > No longer requiring 36 bits for pae/pse36? This is mentioned in the > description but not explained. 32/36 is only the default to be assumed if no maxphysaddr is provided. We always provide maxphysaddr, and passing 32 on a massive machine is legitimate. I don't think we can reasonably get away with offering a VM less than 32 (which is why the check exists), but it is perfectly reasonable to give the VM 34 if we tell it that it only has 16GB of physical address space to use. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.