[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/2][4.15?] x86/shadow: encode full GFN in magic MMIO entries



On 05.03.2021 17:32, Andrew Cooper wrote:
> On 05/03/2021 15:37, Jan Beulich wrote:
>> Since we don't need to encode all of the PTE flags, we have enough bits
>> in the shadow entry to store the full GFN. Don't use literal numbers -
>> instead derive the involved values. Or, where derivation would become
>> too ugly, sanity-check the result (invoking #error to identify failure).
>>
>> This then allows dropping from sh_l1e_mmio() again the guarding against
>> too large GFNs.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> I wonder if the respective check in sh_audit_l1_table() is actually
>> useful to retain with these changes.
>>
>> --- a/xen/arch/x86/mm/shadow/types.h
>> +++ b/xen/arch/x86/mm/shadow/types.h
>> @@ -283,9 +283,17 @@ shadow_put_page_from_l1e(shadow_l1e_t sl
>>   * This is only feasible for PAE and 64bit Xen: 32-bit non-PAE PTEs don't
>>   * have reserved bits that we can use for this.  And even there it can only
>>   * be used if we can be certain the processor doesn't use all 52 address 
>> bits.
>> + *
>> + * For the MMIO encoding (see below) we need the bottom 4 bits for
>> + * identifying the kind of entry and a full GFN's worth of bits to encode
>> + * the originating frame number.  Set all remaining bits to trigger
>> + * reserved bit faults, if (see above) the hardware permits triggering such.
>>   */
>>  
>> -#define SH_L1E_MAGIC 0xffffffff00000001ULL
>> +#define SH_L1E_MAGIC_NR_META_BITS 4
>> +#define SH_L1E_MAGIC_MASK ((~0ULL << (PADDR_BITS - PAGE_SHIFT + \
>> +                                      SH_L1E_MAGIC_NR_META_BITS)) | \
>> +                           _PAGE_PRESENT)
>>  
>>  static inline bool sh_have_pte_rsvd_bits(void)
>>  {
>> @@ -294,7 +302,8 @@ static inline bool sh_have_pte_rsvd_bits
>>  
>>  static inline bool sh_l1e_is_magic(shadow_l1e_t sl1e)
>>  {
>> -    return (sl1e.l1 & SH_L1E_MAGIC) == SH_L1E_MAGIC;
>> +    BUILD_BUG_ON(!(PADDR_MASK & SH_L1E_MAGIC_MASK));
>> +    return (sl1e.l1 & SH_L1E_MAGIC_MASK) == SH_L1E_MAGIC_MASK;
>>  }
>>  
>>  /* Guest not present: a single magic value */
>> @@ -320,20 +329,26 @@ static inline bool sh_l1e_is_gnp(shadow_
>>  
>>  /*
>>   * MMIO: an invalid PTE that contains the GFN of the equivalent guest l1e.
>> - * We store 28 bits of GFN in bits 4:32 of the entry.
>> + * We store the GFN in bits 4:43 of the entry.
>>   * The present bit is set, and the U/S and R/W bits are taken from the 
>> guest.
>>   * Bit 3 is always 0, to differentiate from gnp above.
>>   */
>> -#define SH_L1E_MMIO_MAGIC       0xffffffff00000001ULL
>> -#define SH_L1E_MMIO_MAGIC_MASK  0xffffffff00000009ULL
>> -#define SH_L1E_MMIO_GFN_MASK    0x00000000fffffff0ULL
>> +#define SH_L1E_MMIO_MAGIC       SH_L1E_MAGIC_MASK
>> +#define SH_L1E_MMIO_MAGIC_BIT   ((_PAGE_PRESENT | _PAGE_RW | _PAGE_USER) + 
>> 1)
>> +#if SH_L1E_MMIO_MAGIC_BIT & (SH_L1E_MMIO_MAGIC_BIT - 1)
>> +# error SH_L1E_MMIO_MAGIC_BIT needs to be a power of 2
>> +#endif
>> +#if SH_L1E_MMIO_MAGIC_BIT >> SH_L1E_MAGIC_NR_META_BITS
>> +# error SH_L1E_MMIO_MAGIC_BIT and SH_L1E_MAGIC_NR_META_BITS are out of sync
>> +#endif
>> +#define SH_L1E_MMIO_MAGIC_MASK  (SH_L1E_MAGIC_MASK | SH_L1E_MMIO_MAGIC_BIT)
>> +#define SH_L1E_MMIO_GFN_MASK    ~(SH_L1E_MMIO_MAGIC_MASK | _PAGE_RW | 
>> _PAGE_USER)
> 
> In practice, it is 4:36, because we have to limit shadow guests to 32
> bits of gfn for XSA-173 (width of the superpage backpointer IIRC).

When !BIGMEM - yes.

> Also, this property is important for L1TF.  The more guest-controllable
> bits we permit in here, the greater the chance of being vulnerable to
> L1TF on massive machines.
> 
> (I'm a little concerned that I can't spot an L1TF comment which has been
> made stale by these changes...  Probably the fault of whichever numpty
> prepared the L1TF patches, because I'm certain we discussed this at the
> time)
> 
> Going from 32 to 36 bits moves the upper safety barrier from TOP-4G to
> TOP-64G but I recall us agreed that that was ok, especially as the main
> safety guestimate is "no RAM in the top quarter of the address space".
> 
> However, I don't think we want to accidentally creep beyond bit 36, so
> I'd suggest that the easy fix here is just adjusting a nibble in the
> MMIO masks.

With BIGMEM I'm not sure we want to be this strict. Nor do we need
to as long as we only need 4 bits at the bottom - we only go up to
bit 43 with what we allow guests control over. IOW we will need to
be careful on old hardware when l1d_maxphysaddr == 44, but on
anything newer we're still far enough away I would think. So I
guess instead of outright dropping the GFN check from sh_l1e_mmio()
I want to replace it by use of is_l1tf_safe_maddr() (on the
produced shadow_l1e_t).

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.