[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/grant-table: Properly acquire the vCPU maptrack freelist lock


  • To: Ruben Hakobyan <hakor@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 2 May 2023 10:38:48 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Xuhrjsvox4jEM2bJBgm4PuEwQYTn8ZcnwU6nTDq3t+Y=; b=Fp3h7WbOh3Hekd+IwRQV+qzZOnzdbfVVgN3zs7qbZ0r0JqrLYepW8teNZo49VIc6JMNN5tVc/z0zStWUy70WWAtM08dJ6PARKJXbK+XFMv0Y7FVXGq5hX0AuXG1ro1Oc0uciY1A/zIcmK6mdHku6EhoGXVk6umwZ9TlDnX/pyrcKzu2A7XU+v3X2fUKDDCNR62L/z87+59+GYqvqJUGqc/ccDp47Bjouxw3r9AMIyimQX5VJQ0mk9yMnY9R/oG4Bk+9XYK0a8D2A0OCk4li0KXcygbxbQ8nzF9usOlQLY/uHGUOd11GBfYTN5uSgqP0EBLi4+L3AcCHmoqpnbG1nbA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=EgyPGO1kksQmiFNu/Kmo6csu45oBbMdf3mtqKjvozgwEIuyaupOZM2nP/H9lg2wpREH1kkTIgTKtH0zmOjgA7V1n+XQZNarmk1Kf+yMimm+yh6NX7UD9Ep+YMfbBjfdXRpfgRWtEKt0EOglhEvipF8CEmtFPeQg58ebKTygPazEvANqMqfjYLag2Lah0ZVvCxqaZFsv16WyRII5oZQxzEHnd7ADu6vEyv+OJO2opQufJ6hGAfQILyVhwhGxHwSB0lXscK80NTrqz7C0Lg9TBEldRRkIAFMhTAC7Y974J4qyk5A0RLc+VRMW25ENWYfupxwnh2pZNZ71MT+hXLG8uUA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 02 May 2023 08:39:05 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 28.04.2023 12:26, Ruben Hakobyan wrote:
> Introduced as part of XSA-228, the maptrack_freelist_lock is meant to
> protect all accesses to entries in the vCPU freelist as well as the
> head and tail pointers.
> 
> However, this principle is violated twice in get_maptrack_handle(),
> where the tail pointer is directly accessed without taking the lock.
> The first occurrence is when stealing an extra entry for the tail
> pointer, and the second occurrence is when directly setting the tail of
> an empty freelist after allocating its first page.

I don't read out of the doc we have that this is a violation (the lock
still fully protects the list, just - as you say below - that in two
cases taking the lock isn't necessary to achieve that goal), and iirc
the relaxation was done quite deliberately. Did you, as an alternative,
consider making the doc more explicit?

> Make sure to correctly acquire the freelist lock before accessing and
> modifying the tail pointer to fully comply with XSA-228.
> 
> It should be noted that with the current setup, it is not possible for
> these accesses to race with anything. However, it is still important
> to correctly take the lock here to avoid any future possible races. For
> example, a race could be possible with put_maptrack_handle() if the
> maptrack code is modified to allow vCPU freelists to temporarily
> include handles not directly assigned to them in the maptrack.
> 
> Note that the tail and head pointers can still be accessed without
> taking the lock when initialising the freelist in grant_table_init_vcpu()
> as concurrent access will not be possible here.

This "no concurrent accesses" is the justification also for at least the
"set tail directly ..." case, aiui.

> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -660,23 +660,27 @@ get_maptrack_handle(
>      if ( !new_mt )
>      {
>          spin_unlock(&lgt->maptrack_lock);
> +        handle = steal_maptrack_handle(lgt, curr);
> +        if ( handle == INVALID_MAPTRACK_HANDLE )
> +            return handle;
> +
> +        spin_lock(&curr->maptrack_freelist_lock);
> +        if ( curr->maptrack_tail != MAPTRACK_TAIL )
> +        {
> +            spin_unlock(&curr->maptrack_freelist_lock);
> +            return handle;
> +        }
>  
>          /*
>           * Uninitialized free list? Steal an extra entry for the tail
>           * sentinel.
>           */
> -        if ( curr->maptrack_tail == MAPTRACK_TAIL )
> -        {
> -            handle = steal_maptrack_handle(lgt, curr);
> -            if ( handle == INVALID_MAPTRACK_HANDLE )
> -                return handle;
> -            spin_lock(&curr->maptrack_freelist_lock);
> -            maptrack_entry(lgt, handle).ref = MAPTRACK_TAIL;
> -            curr->maptrack_tail = handle;
> -            if ( curr->maptrack_head == MAPTRACK_TAIL )
> -                curr->maptrack_head = handle;
> -            spin_unlock(&curr->maptrack_freelist_lock);
> -        }
> +        maptrack_entry(lgt, handle).ref = MAPTRACK_TAIL;
> +        curr->maptrack_tail = handle;
> +        if ( curr->maptrack_head == MAPTRACK_TAIL )
> +            curr->maptrack_head = handle;
> +        spin_unlock(&curr->maptrack_freelist_lock);
> +
>          return steal_maptrack_handle(lgt, curr);
>      }

While this transformation looks to provide the intended guarantees (but
the comment would then need re-wording some), ...

> @@ -696,8 +700,10 @@ get_maptrack_handle(
>      }
>  
>      /* Set tail directly if this is the first page for the local vCPU. */
> +    spin_lock(&curr->maptrack_freelist_lock);
>      if ( curr->maptrack_tail == MAPTRACK_TAIL )
>          curr->maptrack_tail = handle + MAPTRACK_PER_PAGE - 1;
> +    spin_unlock(&curr->maptrack_freelist_lock);
>  
>      lgt->maptrack[nr_maptrack_frames(lgt)] = new_mt;
>      smp_wmb();

... I don't think this change does: It now leaves the free list in a
transiently corrupt state until the freelist-lock is taken again a few
lines down and the list is further manipulated. Ftaod it's not the state
itself that I'm worried about - that remains as before - but the
supposed implication of the list now always being in consistent/correct
state when having successfully acquired the lock. (Iirc one of the goals
with omitting the lock here was to avoid lock nesting, even if this
nesting arrangement is permitted. Plus of course said observation that
acquiring the lock once here and then once again later in the function
doesn't really help. I don't recall though whether there was a reason
why setting the tail needs to be done right here. Yet if it was to be
moved into the locked region further down, it would need to be
clarified that such movement is correct/legitimate.)

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.