[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2] rwlock: allow recursive read locking when already locked in write mode
On 21.02.2020 16:13, Jürgen Groß wrote: > On 21.02.20 15:51, Julien Grall wrote: >> >> >> On 21/02/2020 14:35, Jürgen Groß wrote: >>> On 21.02.20 15:32, Julien Grall wrote: >>>> >>>> >>>> On 21/02/2020 14:16, Jürgen Groß wrote: >>>>> On 21.02.20 15:11, Julien Grall wrote: >>>>>> Hi Juergen, >>>>>> >>>>>> On 21/02/2020 14:06, Jürgen Groß wrote: >>>>>>> On 21.02.20 14:49, Julien Grall wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 21/02/2020 13:46, Jürgen Groß wrote: >>>>>>>>> On 21.02.20 14:36, Jan Beulich wrote: >>>>>>>>>> On 21.02.2020 10:10, Roger Pau Monné wrote: >>>>>>>>>>> On Thu, Feb 20, 2020 at 07:20:06PM +0000, Julien Grall wrote: >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> On 20/02/2020 17:31, Roger Pau Monne wrote: >>>>>>>>>>>>> Allow a CPU already holding the lock in write mode to also >>>>>>>>>>>>> lock it in >>>>>>>>>>>>> read mode. There's no harm in allowing read locking a rwlock >>>>>>>>>>>>> that's >>>>>>>>>>>>> already owned by the caller (ie: CPU) in write mode. >>>>>>>>>>>>> Allowing such >>>>>>>>>>>>> accesses is required at least for the CPU maps use-case. >>>>>>>>>>>>> >>>>>>>>>>>>> In order to do this reserve 14bits of the lock, this allows >>>>>>>>>>>>> to support >>>>>>>>>>>>> up to 16384 CPUs. Also reduce the write lock mask to 2 bits: >>>>>>>>>>>>> one to >>>>>>>>>>>>> signal there are pending writers waiting on the lock and the >>>>>>>>>>>>> other to >>>>>>>>>>>>> signal the lock is owned in write mode. Note the write >>>>>>>>>>>>> related data >>>>>>>>>>>>> is using 16bits, this is done in order to be able to clear >>>>>>>>>>>>> it (and >>>>>>>>>>>>> thus release the lock) using a 16bit atomic write. >>>>>>>>>>>>> >>>>>>>>>>>>> This reduces the maximum number of concurrent readers from >>>>>>>>>>>>> 16777216 to >>>>>>>>>>>>> 65536, I think this should still be enough, or else the lock >>>>>>>>>>>>> field >>>>>>>>>>>>> can be expanded from 32 to 64bits if all architectures >>>>>>>>>>>>> support atomic >>>>>>>>>>>>> operations on 64bit integers. >>>>>>>>>>>> >>>>>>>>>>>> FWIW, arm32 is able to support atomic operations on 64-bit >>>>>>>>>>>> integers. >>>>>>>>>>>> >>>>>>>>>>>>> static inline void _write_unlock(rwlock_t *lock) >>>>>>>>>>>>> { >>>>>>>>>>>>> - /* >>>>>>>>>>>>> - * If the writer field is atomic, it can be cleared >>>>>>>>>>>>> directly. >>>>>>>>>>>>> - * Otherwise, an atomic subtraction will be used to >>>>>>>>>>>>> clear it. >>>>>>>>>>>>> - */ >>>>>>>>>>>>> - atomic_sub(_QW_LOCKED, &lock->cnts); >>>>>>>>>>>>> + /* Since the writer field is atomic, it can be cleared >>>>>>>>>>>>> directly. */ >>>>>>>>>>>>> + ASSERT(_is_write_locked_by_me(atomic_read(&lock->cnts))); >>>>>>>>>>>>> + BUILD_BUG_ON(_QR_SHIFT != 16); >>>>>>>>>>>>> + write_atomic((uint16_t *)&lock->cnts, 0); >>>>>>>>>>>> >>>>>>>>>>>> I think this is an abuse to cast an atomic_t() directly into >>>>>>>>>>>> a uint16_t. You >>>>>>>>>>>> would at least want to use &lock->cnts.counter here. >>>>>>>>>>> >>>>>>>>>>> Sure, I was wondering about this myself. >>>>>>>>>>> >>>>>>>>>>> Will wait for more comments, not sure whether this can be >>>>>>>>>>> fixed upon >>>>>>>>>>> commit if there are no other issues. >>>>>>>>>> >>>>>>>>>> It's more than just adding another field specifier here. A cast >>>>>>>>>> like >>>>>>>>>> this one is endianness-unsafe, and hence a trap waiting for a big >>>>>>>>>> endian port attempt to fall into. At the very least this should >>>>>>>>>> cause >>>>>>>>>> a build failure on big endian systems, even better would be if >>>>>>>>>> it was >>>>>>>>>> endianness-safe. >>>>>>>>> >>>>>>>>> Wouldn't a union be the better choice? >>>>>>>> >>>>>>>> You would not be able to use atomic_t in that case as you can't >>>>>>>> assume the layout of the structure. >>>>>>> >>>>>>> union rwlockword { >>>>>>> atomic_t cnts; >>>>>>> uint32_t val; >>>>>>> struct { >>>>>>> uint16_t write; >>>>>>> uint16_t readers; >>>>>>> }; >>>>>>> }; >>>>>>> >>>>>>> static inline const uint32_t _qr_bias( >>>>>>> const union rwlockword { >>>>>>> .write = 0, >>>>>>> .readers = 1 >>>>>>> } x; >>>>>>> >>>>>>> return x.val; >>>>>>> } >>>>>>> >>>>>>> ... >>>>>>> cnts = atomic_add_return(_qr_bias(), &lock->cnts); >>>>>>> ... >>>>>>> >>>>>>> I guess this should do the trick, no? >>>>>> >>>>>> You are assuming the atomic_t layout which I would rather no want >>>>>> to happen. >>>>> >>>>> That already happened. rwlock.h already assumes atomic_t to be 32 bits >>>>> wide. I agree it would be better to have an atomic32_t type for this >>>>> use case. >>>> >>>> I don't think you understood my point here. An atomic32_t will not be >>>> better as you still assume the layout of the structure. I.e the >>>> structure has only one field. >>> >>> I don't understand your reasoning here, sorry. >>> >>> Are you saying that a structure of two uint16_t isn't known to be 32 >>> bits long? >> >> You are assuming that atomic_t will always be: >> >> struct >> { >> int counter; >> } >> >> What if we decide to turn into >> >> struct >> { >> bool a; >> int counter; >> } > > As said before: then queue_write_lock_slowpath() is already broken. Why? The atomic_*() used would still only act on the counter field (for their actual operation, i.e. what matters to callers; the other field(s) could be statistics or whatever). Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |