[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 1/4] x86/mem_sharing: reorder when pages are unlocked and released



On Mon, Apr 29, 2019 at 9:44 AM George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
>
> On 4/26/19 6:21 PM, Tamas K Lengyel wrote:
> > Calling _put_page_type while also holding the page_lock
> > for that page can cause a deadlock.
>
> I can't immediately see what the mem_sharing_page_[un]lock() functions
> are meant to be protecting against.  Is there a comment anywhere
> describing what they're used for or how they work?

There are none. The whole subsystem and its of page_lock/unlock use is
undocumented. The lock is used to protect the page_sharing_info
structure from simultaneous updates (when the page is shared/unshared)
and also to make freeing that structure safe.

>
> Because...
>
> > Signed-off-by: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
> > Cc: Jan Beulich <jbeulich@xxxxxxxx>
> > Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> > Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> > Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> > ---
> > v3: simplified patch by keeping the additional references already in-place
> > ---
> >  xen/arch/x86/mm/mem_sharing.c | 19 ++++++++++---------
> >  1 file changed, 10 insertions(+), 9 deletions(-)
> >
> > diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> > index dfc279d371..e2f74ac770 100644
> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -648,10 +648,6 @@ static int page_make_private(struct domain *d, struct 
> > page_info *page)
> >          return -EBUSY;
> >      }
> >
> > -    /* We can only change the type if count is one */
> > -    /* Because we are locking pages individually, we need to drop
> > -     * the lock here, while the page is typed. We cannot risk the
> > -     * race of page_unlock and then put_page_type. */
>
> ...the comment you're removing explicitly says that what you're doing is
> "risk"-y.; but you don't explain at all why the comment is wrong.

AFAICT that comment was correct when it was added but since then it is
out-of-date. There is now an explicit switch case added to
_put_page_type that seem to handle this situation (case PGT_locked |
1) that was not there when this comment was introduced. That's what I
was able to decipher using git archeology. IMHO the whole
page_lock/put_type and reference counting is undecipherable (at least
I'm no closer to understanding it after staring at this for 2 weeks
now) and I wish there was a way to just getting rid of the whole thing
but unfortunately there doesn't seem to be one due to not being able
to easily grow page_info.

>
> Ultimately you're the maintainer, and this is non-security-supported, so
> if you insist it's safe, I won't oppose it; but it seems like having
> some clarity about what is or is not risky and why would be helpful.

I'm just trying to revive the subsystem as the way things are right
now it's broken. Doing this reorder makes the deadlock go away.

Tamas

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.