[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c



On 14/11/17 22:46, Boris Ostrovsky wrote:
> On 11/14/2017 04:11 AM, Juergen Gross wrote:
>> On 13/11/17 19:33, Stefano Stabellini wrote:
>>> On Mon, 13 Nov 2017, Juergen Gross wrote:
>>>> On 11/11/17 00:57, Stefano Stabellini wrote:
>>>>> On Tue, 7 Nov 2017, Juergen Gross wrote:
>>>>>> On 06/11/17 23:17, Stefano Stabellini wrote:
>>>>>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>>>>>> take in_mutex on the first try, but you can't take out_mutex. Next times
>>>>>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>>>>>> loop.
>>>>>>>
>>>>>>> Solve the problem by moving the two mutex_trylock calls to two separate
>>>>>>> loops.
>>>>>>>
>>>>>>> Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
>>>>>>> Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>>>>>>> CC: boris.ostrovsky@xxxxxxxxxx
>>>>>>> CC: jgross@xxxxxxxx
>>>>>>> ---
>>>>>>>  drivers/xen/pvcalls-front.c | 5 +++--
>>>>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>>>>>> index 0c1ec68..047dce7 100644
>>>>>>> --- a/drivers/xen/pvcalls-front.c
>>>>>>> +++ b/drivers/xen/pvcalls-front.c
>>>>>>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
>>>>>>>                  * is set to NULL -- we only need to wait for the 
>>>>>>> existing
>>>>>>>                  * waiters to return.
>>>>>>>                  */
>>>>>>> -               while (!mutex_trylock(&map->active.in_mutex) ||
>>>>>>> -                          !mutex_trylock(&map->active.out_mutex))
>>>>>>> +               while (!mutex_trylock(&map->active.in_mutex))
>>>>>>> +                       cpu_relax();
>>>>>>> +               while (!mutex_trylock(&map->active.out_mutex))
>>>>>>>                         cpu_relax();
>>>>>> Any reason you don't just use mutex_lock()?
>>>>> Hi Juergen, sorry for the late reply.
>>>>>
>>>>> Yes, you are right. Given the patch, it would be just the same to use
>>>>> mutex_lock.
>>>>>
>>>>> This is where I realized that actually we have a problem: no matter if
>>>>> we use mutex_lock or mutex_trylock, there are no guarantees that we'll
>>>>> be the last to take the in/out_mutex. Other waiters could be still
>>>>> outstanding.
>>>>>
>>>>> We solved the same problem using a refcount in pvcalls_front_remove. In
>>>>> this case, I was thinking of reusing the mutex internal counter for
>>>>> efficiency, instead of adding one more refcount.
>>>>>
>>>>> For using the mutex as a refcount, there is really no need to call
>>>>> mutex_trylock or mutex_lock. I suggest checking on the mutex counter
>>>>> directly:
>>>>>
>>>>>
>>>>>           while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
>>>>>                  atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>>>>>                   cpu_relax();
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Stefano
>>>>>
>>>>>
>>>>> ---
>>>>>
>>>>> xen/pvcalls: fix potential endless loop in pvcalls-front.c
>>>>>
>>>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>>>> take in_mutex on the first try, but you can't take out_mutex. Next time
>>>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>>>> loop.
>>>>>
>>>>> Actually, we don't want to use mutex_trylock at all: we don't need to
>>>>> take the mutex, we only need to wait until the last mutex waiter/holder
>>>>> releases it.
>>>>>
>>>>> Instead of calling mutex_trylock or mutex_lock, just check on the mutex
>>>>> refcount instead.
>>>>>
>>>>> Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
>>>>> Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>>>>> CC: boris.ostrovsky@xxxxxxxxxx
>>>>> CC: jgross@xxxxxxxx
>>>>>
>>>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>>>> index 0c1ec68..9f33cb8 100644
>>>>> --- a/drivers/xen/pvcalls-front.c
>>>>> +++ b/drivers/xen/pvcalls-front.c
>>>>> @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
>>>>>            * is set to NULL -- we only need to wait for the existing
>>>>>            * waiters to return.
>>>>>            */
>>>>> -         while (!mutex_trylock(&map->active.in_mutex) ||
>>>>> -                    !mutex_trylock(&map->active.out_mutex))
>>>>> +         while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
>>>>> +                atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>>>> I don't like this.
>>>>
>>>> Can't you use a kref here? Even if it looks like more overhead it is
>>>> much cleaner. There will be no questions regarding possible races,
>>>> while with an approach like yours will always smell racy (can't there
>>>> be someone taking the mutex just after above test?).
>>>>
>>>> In no case you should make use of the mutex internals.
>>> Boris' suggestion solves that problem well. Would you be OK with the
>>> proposed
>>>
>>>         while(mutex_is_locked(&map->active.in_mutex.owner) ||
>>>               mutex_is_locked(&map->active.out_mutex.owner))
>>>                 cpu_relax();
>>>
>>> ?
>> I'm not convinced there isn't a race.
>>
>> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
>> then in_mutex is taken. What happens if pvcalls_front_release() resets
>> sk_send_head and manages to test the mutex before the mutex is locked?
>>
>> Even in case this is impossible: the whole construct seems to be rather
>> fragile.
> 
> 
> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
> not rely on mutex state.

Yes, this would work.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.