[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/8] x86/shadow: properly handle get_page() failing


  • To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Date: Wed, 27 Jul 2022 12:46:50 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=0qISryQ9rNGSgvE/QLjbWx0uIyXwCUUkFfqeYJOve8s=; b=REadaCjUO+0cxMeVaOPmyPKmY+ZF8mSMZgQsDb6WN5NzQSiX/RuLDMt1fPJzvYpJci/AGqcVmD4H7wCBstsXCowL4KccvaBQFURxyuMkT3F3q3Hh+VoU4FceWvqzo78e8GSVjGegpXnU2NOLQLwyNEuu5l/Fj26l8In0Q+0FLtJdSMqRsgKd3RB7MLjKB3EPzwb6mg4paOvot7jsqh6PEeztZbCRDXjG7v68DMhuor5H+45j1R+2Fudf2wTOxjO5JMHWNt2UcWOGe8WajA5EnYkhCpfJBaP2HuuQ/TfReWWqgRf4OhppXlfzw0Vsi0p7UujfzMbXSc3PbJbzj5a2fg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=TUPIPub54G4MY8Aj2uIN+9Iw92XbOvPzkKeMqnGW3pRzYqoT8g3HNnrsBAa7eYj2uaAjHNOoGYa4+jKJqD9SFc1MFNvgUYupzs20NMT9rL1XCwrA2qug74Sy+uMNe0hIvlA5c2Hn0EMYOS6LCW6reetY7RUjPYKeCj8vqr9LtF6BOFv4WthDnk3i0gbRBEBvlwdK7gJr/4Pxncawig9Ael9/PCICq76l38a51oQOge/MN1H1lvorJQvY3Jf7ZFS8xuKucVtRtFTgzk7eSG+OpzvbBlrhJcavqu/L55skhN1AiJttvIFmgEfYNT5ZkQO1ulPRtODeaaxf2VJGoQYZ9w==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxx>, "Tim (Xen.org)" <tim@xxxxxxx>
  • Delivery-date: Wed, 27 Jul 2022 12:47:12 +0000
  • Ironport-data: A9a23:THmPkKMfi7HOF/PvrR2JlsFynXyQoLVcMsEvi/4bfWQNrUpxhD1Tx zQXXWnXOP7cM2PwfY93bI609kgGusDcnd5jGwto+SlhQUwRpJueD7x1DKtR0wB+jCHnZBg6h ynLQoCYdKjYdleF+lH3dOCJQUBUjcmgXqD7BPPPJhd/TAplTDZJoR94kqsyj5UAbeKRWmthg vuv5ZyEULOZ82QsaDhMtPjc8EoHUMna41v0gHRvPZing3eG/5UlJMp3Db28KXL+Xr5VEoaSL woU5Ojklo9x105F5uKNyt4XQGVTKlLhFVHmZk5tc7qjmnB/Shkaic7XAha+hXB/0F1ll/gpo DlEWAfZpQ0BZsUgk8xFO/VU/r0X0QSrN9YrLFDm2fF/wXEqfFPH2edlNEQ3FLYA3cgrK2J8x fMVeAo0O0Xra+KemNpXS8FKr+F6dIzBGtxavXttizbEEfwhXJbPBb3Q4sNV1ysxgcYIGuvCY 80eanxkaxGojx9nYw9LTs5h2rjwwCCnK1W0q3rMzUYzy0HVwBZ8z/7GN93Nd8bRbc5UglyZt iTN+GGR7hQya4LCk2vfryvEaunnuQWjaYZRFLGD+fc13H6t7FQ9CDARbA7uyRW+ogvkMz5FE GQW8Cczqak59GSwU8LwGRa/pRasrhMaHtZdDeA+wAWM0bbPpRaUAHAeSTxMY8Bgs9U5LRQ10 neZktWvAiZg2JWJSHe15rqS6zSoNkAowXQqYCYFSU4P5YblqYRq1hbXFI87SOiyk8H/Hiz2z 3aSti8iir4PjMkNkaKm4VTAhDHqrZ/MJuIo2jjqsquexlsRTOaYi0aAsDA3Md4owF6lc2S8
  • Ironport-hdrordr: A9a23:o01+OK2rMsa1XBGmbhziaAqjBK8kLtp133Aq2lEZdPUzSL37qy nOpoV56faaslsssR0b9exoW5PhfZq/z/BICOAqVN/INjUO0FHYSb2KhrGC/9SPIULDytI=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHYoQlhV37xX0Hh2kWLBKJVPAkGN62SK82A
  • Thread-topic: [PATCH 2/8] x86/shadow: properly handle get_page() failing

On 26/07/2022 17:04, Jan Beulich wrote:
> We should not blindly (in a release build) insert the new entry in the
> hash if a reference to the guest page cannot be obtained, or else an
> excess reference would be put when removing the hash entry again. Crash
> the domain in that case instead. The sole caller doesn't further care
> about the state of the guest page: All it does is return the
> corresponding shadow page (which was obtained successfully before) to
> its caller.
>
> To compensate we further need to adjust hash removal: Since the shadow
> page already has had its backlink set, domain cleanup code would try to
> destroy the shadow, and hence still cause a put_page() without
> corresponding get_page(). Leverage that the failed get_page() leads to
> no hash insertion, making shadow_hash_delete() no longer assume it will
> find the requested entry. Instead return back whether the entry was
> found. This way delete_shadow_status() can avoid calling put_page() in
> the problem scenario.
>
> For the other caller of shadow_hash_delete() simply reinstate the
> otherwise dropped assertion at the call site.
>
> While touching the conditionals in {set,delete}_shadow_status() anyway,
> also switch around their two pre-existing parts, to have the cheap one
> first (frequently allowing to avoid evaluation of the expensive - due to
> evaluate_nospec() - one altogether).
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, although with
some observations and a suggestion.

>
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -1581,7 +1581,7 @@ void shadow_hash_insert(struct domain *d
>      sh_hash_audit_bucket(d, key);
>  }
>  
> -void shadow_hash_delete(struct domain *d, unsigned long n, unsigned int t,
> +bool shadow_hash_delete(struct domain *d, unsigned long n, unsigned int t,
>                          mfn_t smfn)
>  /* Excise the mapping (n,t)->smfn from the hash table */
>  {
> @@ -1606,10 +1606,8 @@ void shadow_hash_delete(struct domain *d
>      {
>          /* Need to search for the one we want */
>          x = d->arch.paging.shadow.hash_table[key];
> -        while ( 1 )
> +        while ( x )
>          {
> -            ASSERT(x); /* We can't have hit the end, since our target is
> -                        * still in the chain somehwere... */
>              if ( next_shadow(x) == sp )
>              {
>                  x->next_shadow = sp->next_shadow;
> @@ -1617,10 +1615,14 @@ void shadow_hash_delete(struct domain *d
>              }
>              x = next_shadow(x);
>          }
> +        if ( !x )
> +            return false;
>      }

This entire if/else block is "delete matching element from single linked
list", opencoded in a very confusing way to follow.  I can't help but
feel there's a way to simplify it.  (Not in this patch, but for future
clean-up.)

>      set_next_shadow(sp, NULL);
>  
>      sh_hash_audit_bucket(d, key);
> +
> +    return true;
>  }
>  
>  typedef int (*hash_vcpu_callback_t)(struct vcpu *v, mfn_t smfn, mfn_t 
> other_mfn);
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -132,7 +132,8 @@ delete_fl1_shadow_status(struct domain *
>      SHADOW_PRINTK("gfn=%"SH_PRI_gfn", type=%08x, smfn=%"PRI_mfn"\n",
>                     gfn_x(gfn), SH_type_fl1_shadow, mfn_x(smfn));
>      ASSERT(mfn_to_page(smfn)->u.sh.head);
> -    shadow_hash_delete(d, gfn_x(gfn), SH_type_fl1_shadow, smfn);
> +    if ( !shadow_hash_delete(d, gfn_x(gfn), SH_type_fl1_shadow, smfn) )
> +        ASSERT_UNREACHABLE();

I'd recommend crashing the domain here too.  I know the we've already
got the return value we want, but this represents corruption in the
shadow pagetable metadata, and I highly doubt it is safe to let the
guest continue to execute in such circumstances.

>  }
>  
>  
> --- a/xen/arch/x86/mm/shadow/private.h
> +++ b/xen/arch/x86/mm/shadow/private.h
> @@ -375,7 +375,7 @@ shadow_size(unsigned int shadow_type)
>  mfn_t shadow_hash_lookup(struct domain *d, unsigned long n, unsigned int t);
>  void  shadow_hash_insert(struct domain *d,
>                           unsigned long n, unsigned int t, mfn_t smfn);
> -void  shadow_hash_delete(struct domain *d,
> +bool  shadow_hash_delete(struct domain *d,
>                           unsigned long n, unsigned int t, mfn_t smfn);
>  
>  /* shadow promotion */
> @@ -773,18 +773,19 @@ static inline void
>  set_shadow_status(struct domain *d, mfn_t gmfn, u32 shadow_type, mfn_t smfn)
>  /* Put a shadow into the hash table */
>  {
> -    int res;
> -
>      SHADOW_PRINTK("d%d gmfn=%lx, type=%08x, smfn=%lx\n",
>                    d->domain_id, mfn_x(gmfn), shadow_type, mfn_x(smfn));
>  
>      ASSERT(mfn_to_page(smfn)->u.sh.head);
>  
>      /* 32-bit PV guests don't own their l4 pages so can't get_page them */
> -    if ( !is_pv_32bit_domain(d) || shadow_type != SH_type_l4_64_shadow )
> +    if ( (shadow_type != SH_type_l4_64_shadow || !is_pv_32bit_domain(d)) &&

This is the sensible way around anyway, but this is also a great example
of a predicate which really doesn't want to be eval_nospec().

We've got a growing pile of such usecases, so really do need to find a
predicate we can use which indicates "safely not speculation relevant".

~Andrew

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.