[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/shadow: Correct the behaviour of a guests view of maxphysaddr

On 08/02/17 14:12, Tim Deegan wrote:
> Hi,
> At 12:36 +0000 on 08 Feb (1486557378), Andrew Cooper wrote:
>> XSA-173 (c/s 8b1764833) introduces gfn_bits, and an upper limit which might 
>> be
>> lower than the real maxphysaddr, to avoid overflowing the superpage shadow
>> backpointer.
>> However, plenty of hardware has a physical address width less that 44 bits,
>> and the code added in shadow_domain_init() is a straight asignment.  This
>> causes gfn_bits to be increased beyond the physical address width on most
>> Intel consumer hardware (which has a width of 39).
> Once again, I think this is actually OK.  But cpuid and shadow not
> agreeing on the limit is a bug and I'm happy for it to be resolved
> this way.
>> It also means that the effective maxphysaddr for shadowed guests differs from
>> the value reported to the guest in CPUID.  This means that a guest can create
>> PTEs which either should fault but don't, or shouldn't fault but do.
> Can it really create a PTE that shouldn't fault but does?  AFAICS the
> cpuid policy can report a lower value than 44 but not a higher one.

Most Intel hardware has a width of 39.
Most AMD hardware has a width of 48.

In both cases with this change, CPUID reports these values, but the
shadow code limits at 44 bits.

On Intel, a guest can get an PTE shadowed which should have faulted and
didn't (because the smfn is actually within 39 bits), while on AMD, a
guest can try to create a PTE which shouldn't fault (because CPUID says
anything up to 48 bits is fine) yet does fault because the shadow code
rejects anything above 44 bits.

>> Remove gfn_bits and rework its logic in terms of a guests maxphysaddr.
>> recalculate_cpuid_policy() is updated to properly clamp maxphysaddr between
>> the host maximum, a possibly-smaller shadow restriction, and 32 as a minimum.
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> @@ -502,11 +504,17 @@ void recalculate_cpuid_policy(struct domain *d)
>>      cpuid_featureset_to_policy(fs, p);
>> -    p->extd.maxphysaddr = min(p->extd.maxphysaddr, max->extd.maxphysaddr);
>> -    p->extd.maxphysaddr = min_t(uint8_t, p->extd.maxphysaddr,
>> -                                d->arch.paging.gfn_bits + PAGE_SHIFT);
>> -    p->extd.maxphysaddr = max_t(uint8_t, p->extd.maxphysaddr,
>> -                                (p->basic.pae || p->basic.pse36) ? 36 : 32);
>> +    maxphysaddr_limit = paging_mode_hap(d) ? hap_paddr_bits : paddr_bits;
>> +
>> +    if ( !IS_ENABLED(BIGMEM) && paging_mode_shadow(d) &&
>> +         (!is_pv_domain(d) || opt_allow_superpage) )
>> +    {
>> +        /* Shadowed superpages store GFNs in 32-bit page_info fields. */
>> +        maxphysaddr_limit = min(maxphysaddr_limit, 32U + PAGE_SHIFT);
> I don't like this implementation detail escaping from x86/mm/shadow;
> can we have a cpuid_limit_paddr_bits(d) that the shadow code can call
> to tell cpuid about this?  Or perhaps a paging_max_paddr_bits(d) so
> cpuid can ask paging when it needs to know?

I will see about doing this.

> In either case I wonder whether there needs to be some trigger of
> recalculate_cpuid_policy() once shadow mode is enabled -- I guess
> currently it relies on it being called by something late enough in hvm
> domain creation?

Yes, although now I am worried about migrating PV guests.  I will double
check to see whether that path works sensibly or not, because we can't
have maxphysaddr suddenly shrink under the feet of a running guest.

>> +    }
>> +
>> +    p->extd.maxphysaddr = min_t(uint8_t, p->extd.maxphysaddr, 
>> maxphysaddr_limit);
>> +    p->extd.maxphysaddr = max_t(uint8_t, p->extd.maxphysaddr, 32);
> No longer requiring 36 bits for pae/pse36?  This is mentioned in the
> description but not explained.

32/36 is only the default to be assumed if no maxphysaddr is provided. 
We always provide maxphysaddr, and passing 32 on a massive machine is

I don't think we can reasonably get away with offering a VM less than 32
(which is why the check exists), but it is perfectly reasonable to give
the VM 34 if we tell it that it only has 16GB of physical address space
to use.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.