Xen project Mailing List

Re: [Xen-devel] [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c

To: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>

From: Juergen Gross <jgross@xxxxxxxx>

Date: Wed, 15 Nov 2017 09:14:42 +0100

Delivery-date: Wed, 15 Nov 2017 08:15:03 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 14/11/17 22:46, Boris Ostrovsky wrote: > On 11/14/2017 04:11 AM, Juergen Gross wrote: >> On 13/11/17 19:33, Stefano Stabellini wrote: >>> On Mon, 13 Nov 2017, Juergen Gross wrote: >>>> On 11/11/17 00:57, Stefano Stabellini wrote: >>>>> On Tue, 7 Nov 2017, Juergen Gross wrote: >>>>>> On 06/11/17 23:17, Stefano Stabellini wrote: >>>>>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you >>>>>>> take in_mutex on the first try, but you can't take out_mutex. Next times >>>>>>> you call mutex_trylock() in_mutex is going to fail. It's an endless >>>>>>> loop. >>>>>>> >>>>>>> Solve the problem by moving the two mutex_trylock calls to two separate >>>>>>> loops. >>>>>>> >>>>>>> Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx> >>>>>>> Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx> >>>>>>> CC: boris.ostrovsky@xxxxxxxxxx >>>>>>> CC: jgross@xxxxxxxx >>>>>>> --- >>>>>>> drivers/xen/pvcalls-front.c | 5 +++-- >>>>>>> 1 file changed, 3 insertions(+), 2 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c >>>>>>> index 0c1ec68..047dce7 100644 >>>>>>> --- a/drivers/xen/pvcalls-front.c >>>>>>> +++ b/drivers/xen/pvcalls-front.c >>>>>>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock) >>>>>>> * is set to NULL -- we only need to wait for the >>>>>>> existing >>>>>>> * waiters to return. >>>>>>> */ >>>>>>> - while (!mutex_trylock(&map->active.in_mutex) || >>>>>>> - !mutex_trylock(&map->active.out_mutex)) >>>>>>> + while (!mutex_trylock(&map->active.in_mutex)) >>>>>>> + cpu_relax(); >>>>>>> + while (!mutex_trylock(&map->active.out_mutex)) >>>>>>> cpu_relax(); >>>>>> Any reason you don't just use mutex_lock()? >>>>> Hi Juergen, sorry for the late reply. >>>>> >>>>> Yes, you are right. Given the patch, it would be just the same to use >>>>> mutex_lock. >>>>> >>>>> This is where I realized that actually we have a problem: no matter if >>>>> we use mutex_lock or mutex_trylock, there are no guarantees that we'll >>>>> be the last to take the in/out_mutex. Other waiters could be still >>>>> outstanding. >>>>> >>>>> We solved the same problem using a refcount in pvcalls_front_remove. In >>>>> this case, I was thinking of reusing the mutex internal counter for >>>>> efficiency, instead of adding one more refcount. >>>>> >>>>> For using the mutex as a refcount, there is really no need to call >>>>> mutex_trylock or mutex_lock. I suggest checking on the mutex counter >>>>> directly: >>>>> >>>>> >>>>> while (atomic_long_read(&map->active.in_mutex.owner) != 0UL || >>>>> atomic_long_read(&map->active.out_mutex.owner) != 0UL) >>>>> cpu_relax(); >>>>> >>>>> Cheers, >>>>> >>>>> Stefano >>>>> >>>>> >>>>> --- >>>>> >>>>> xen/pvcalls: fix potential endless loop in pvcalls-front.c >>>>> >>>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you >>>>> take in_mutex on the first try, but you can't take out_mutex. Next time >>>>> you call mutex_trylock() in_mutex is going to fail. It's an endless >>>>> loop. >>>>> >>>>> Actually, we don't want to use mutex_trylock at all: we don't need to >>>>> take the mutex, we only need to wait until the last mutex waiter/holder >>>>> releases it. >>>>> >>>>> Instead of calling mutex_trylock or mutex_lock, just check on the mutex >>>>> refcount instead. >>>>> >>>>> Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx> >>>>> Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx> >>>>> CC: boris.ostrovsky@xxxxxxxxxx >>>>> CC: jgross@xxxxxxxx >>>>> >>>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c >>>>> index 0c1ec68..9f33cb8 100644 >>>>> --- a/drivers/xen/pvcalls-front.c >>>>> +++ b/drivers/xen/pvcalls-front.c >>>>> @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock) >>>>> * is set to NULL -- we only need to wait for the existing >>>>> * waiters to return. >>>>> */ >>>>> - while (!mutex_trylock(&map->active.in_mutex) || >>>>> - !mutex_trylock(&map->active.out_mutex)) >>>>> + while (atomic_long_read(&map->active.in_mutex.owner) != 0UL || >>>>> + atomic_long_read(&map->active.out_mutex.owner) != 0UL) >>>> I don't like this. >>>> >>>> Can't you use a kref here? Even if it looks like more overhead it is >>>> much cleaner. There will be no questions regarding possible races, >>>> while with an approach like yours will always smell racy (can't there >>>> be someone taking the mutex just after above test?). >>>> >>>> In no case you should make use of the mutex internals. >>> Boris' suggestion solves that problem well. Would you be OK with the >>> proposed >>> >>> while(mutex_is_locked(&map->active.in_mutex.owner) || >>> mutex_is_locked(&map->active.out_mutex.owner)) >>> cpu_relax(); >>> >>> ? >> I'm not convinced there isn't a race. >> >> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only >> then in_mutex is taken. What happens if pvcalls_front_release() resets >> sk_send_head and manages to test the mutex before the mutex is locked? >> >> Even in case this is impossible: the whole construct seems to be rather >> fragile. > > > I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and > not rely on mutex state. Yes, this would work. Juergen _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.