
Re: [Xen-devel] [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c



On Wed, 15 Nov 2017, Stefano Stabellini wrote:
> On Wed, 15 Nov 2017, Boris Ostrovsky wrote:
> > On 11/15/2017 02:09 PM, Stefano Stabellini wrote:
> > > On Wed, 15 Nov 2017, Juergen Gross wrote:
> > >>>>>         while (mutex_is_locked(&map->active.in_mutex) ||
> > >>>>>                mutex_is_locked(&map->active.out_mutex))
> > >>>>>                 cpu_relax();
> > >>>>>
> > >>>>> ?
> > >>>> I'm not convinced there isn't a race.
> > >>>>
> > >>>> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
> > >>>> then in_mutex is taken. What happens if pvcalls_front_release() resets
> > >>>> sk_send_head and manages to test the mutex before the mutex is locked?
> > >>>>
> > >>>> Even in case this is impossible: the whole construct seems to be rather
> > >>>> fragile.
> > > I agree it looks fragile, and I agree that it might be best to avoid the
> > > usage of in_mutex and out_mutex as refcounts. More comments on this
> > > below.
> > >
> > >  
> > >>> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
> > >>> not rely on mutex state.
> > >> Yes, this would work.
> > > Yes, I agree it would work and for the sake of getting something in
> > > shape for the merge window I am attaching a patch for it. Please go
> > > ahead with it. Let me know if you need anything else immediately, and
> > > I'll work on it ASAP.
> > >
> > >
> > >
> > > However, I should note that this is a pretty big hammer we are using:
> > > the refcount is global, while we only need to wait until it's only us
> > > _on this specific socket_.
> > 
> > Can you explain why the socket is important?
> 
> Yes, of course: there are going to be many open sockets on a given
> pvcalls connection. pvcalls_refcount is global, so waiting on
> pvcalls_refcount means waiting until operations on any unrelated
> sockets stop, while we only need to wait until operations on the one
> socket we want to close have stopped.
> 
> 
> > >
> > > We really need a per socket refcount. If we don't want to use the mutex
> > > internal counters, then we need another one.
> > >
> > > See the appended patch that introduces a per socket refcount. However,
> > > for the merge window, also using pvcalls_refcount is fine.
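The per-socket refcount patch referred to above is not reproduced in this
message, so the following is only a rough sketch of the idea being described,
assuming a hypothetical field named sock_refcount in struct sock_mapping (the
field and function names are invented for illustration, and the snippet is
meant as a fragment of drivers/xen/pvcalls-front.c rather than standalone
code):

struct sock_mapping {
        /* ... existing pvcalls-front fields ... */
        atomic_t sock_refcount;         /* hypothetical per-socket counter */
};

/* Every sendmsg/recvmsg on the socket would bracket its work like this: */
static int pvcalls_sendrecv_sketch(struct sock_mapping *map)
{
        atomic_inc(&map->sock_refcount);
        /* ... the actual send/receive work on this one socket ... */
        atomic_dec(&map->sock_refcount);
        return 0;
}

/* Release would then wait only for in-flight calls on *this* socket: */
static void pvcalls_release_wait_sketch(struct sock_mapping *map)
{
        while (atomic_read(&map->sock_refcount) > 0)
                cpu_relax();
        /* now no sendmsg/recvmsg can still be using map */
}

Unlike waiting on the global pvcalls_refcount (as the patch below does), this
would only serialize release against calls on the socket being closed, not
against traffic on unrelated sockets.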
> > >
> > > The race Juergen is concerned about is only theoretically possible:
> > >
> > > recvmsg:                                 release:
> > >   
> > >   test sk_send_head                      clear sk_send_head
> > >   <only few instructions>                <prepare a message>
> > >   grab in_mutex                          <send a message to the server>
> > >                                          <wait for an answer>
> > >                                          test in_mutex
> > >
> > > Without kernel preemption it is not possible for release to clear
> > > sk_send_head and test in_mutex after recvmsg tests sk_send_head and
> > > before recvmsg grabs in_mutex.
> > 
> > Sorry, I don't follow --- what does preemption have to do with this? If
> > recvmsg and release happen on different processors the order of
> > operations can be
> > 
> > CPU0                                   CPU1
> > 
> > test sk_send_head
> > <interrupt>
> >                                         clear sk_send_head
> >                                         <send a message to the server>
> >                                         <wait for an answer>
> >                                         test in_mutex
> >                                         free everything
> > grab in_mutex
> > 
> > I actually think RCU should take care of all of this.
> 
> Preemption could cause something very similar to happen, but your
> example is even better, because it could trigger the issue even with
> preemption disabled. I'll think more about this and submit a separate
> patch on top of the simple pvcalls_refcount patch below.
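For reference, the RCU idea mentioned above normally takes the shape of the
"unpublish, synchronize, free" idiom sketched below. This is only a schematic
with invented names (published_map, the *_sketch functions), not the follow-up
patch; note that the real sendmsg/recvmsg paths can sleep while using the map,
so a plain rcu_read_lock() section would not cover them as-is and a sleepable
variant (SRCU) or a refcount would still be needed.

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct sock_mapping;                             /* opaque in this sketch */

static struct sock_mapping __rcu *published_map; /* invented pointer */

/* Reader side: must not sleep between rcu_read_lock()/rcu_read_unlock(). */
static void reader_sketch(void)
{
        struct sock_mapping *map;

        rcu_read_lock();
        map = rcu_dereference(published_map);
        if (map) {
                /* short, non-blocking use of map goes here */
        }
        rcu_read_unlock();
}

/* Teardown side: unpublish, wait for pre-existing readers, then free. */
static void teardown_sketch(struct sock_mapping *map)
{
        RCU_INIT_POINTER(published_map, NULL);
        synchronize_rcu();      /* every reader that saw map has finished */
        kfree(map);
}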
> 
> 
> 
> > But for now I will take your refcount-based patch. However, it also
> > needs a comment update.
> > 
> > How about
> > 
> > /*
> >  * We need to make sure that sendmsg/recvmsg on this socket have not
> >  * started before we've cleared sk_send_head here. The easiest (though
> >  * not optimal) way to guarantee this is to see that no pvcall (other
> >  * than us) is in progress.
> >  */
> 
> Yes, this is the patch:
> 
> ---
> 
> 
> xen/pvcalls: fix potential endless loop in pvcalls-front.c
> 
> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> take in_mutex on the first try, but you can't take out_mutex. The next
> time you call mutex_trylock(), in_mutex is going to fail. It's an
> endless loop.
> 
> Solve the problem by waiting until the global refcount is 1 instead (the
> refcount is 1 when the only active pvcalls frontend function is
> pvcalls_front_release).
> 
> Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> CC: boris.ostrovsky@xxxxxxxxxx
> CC: jgross@xxxxxxxx
> 
> 
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index 0c1ec68..54c0fda 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -1043,13 +1043,12 @@ int pvcalls_front_release(struct socket *sock)
>               wake_up_interruptible(&map->active.inflight_conn_req);
>  
>               /*
> -              * Wait until there are no more waiters on the mutexes.
> -              * We know that no new waiters can be added because sk_send_head
> -              * is set to NULL -- we only need to wait for the existing
> -              * waiters to return.
> -              */
> -             while (!mutex_trylock(&map->active.in_mutex) ||
> -                        !mutex_trylock(&map->active.out_mutex))
> +              * We need to make sure that sendmsg/recvmsg on this socket have
> +              * not started before we've cleared sk_send_head here. The
> +              * easiest (though not optimal) way to guarantee this is to see
> +              * that no pvcall (other than us) is in progress.
> +             */
> +             while (atomic_read(&pvcalls_refcount) > 1)
>                       cpu_relax();
>  
>               pvcalls_front_free_map(bedata, map);


Sorry, code style issue: one missing space in the comment. I'll send it
again separately.
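
To spell out the failure mode described in the commit message, here is the
loop removed by the hunk above, annotated with one problematic execution (the
comments are editorial, the code is the deleted hunk):

                /*
                 * 1st pass: mutex_trylock(&map->active.in_mutex)  returns 1,
                 *           so pvcalls_front_release() now owns in_mutex;
                 *           mutex_trylock(&map->active.out_mutex) returns 0
                 *           because a concurrent sendmsg still holds it,
                 *           so we loop.
                 * 2nd pass: mutex_trylock(&map->active.in_mutex)  returns 0,
                 *           since we already own it and mutexes are not
                 *           recursive; the || short-circuits, out_mutex is
                 *           never tried again, and the loop spins forever
                 *           even after out_mutex is released.
                 */
                while (!mutex_trylock(&map->active.in_mutex) ||
                           !mutex_trylock(&map->active.out_mutex))
                        cpu_relax();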


