[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/flush: use APIC ALLBUT destination shorthand when possible


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 9 Jan 2020 13:20:31 +0100
  • Authentication-results: esa3.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none; spf=None smtp.pra=roger.pau@xxxxxxxxxx; spf=Pass smtp.mailfrom=roger.pau@xxxxxxxxxx; spf=None smtp.helo=postmaster@xxxxxxxxxxxxxxx
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 09 Jan 2020 12:21:06 +0000
  • Ironport-sdr: qtuSyNNSb32h2mEl6YpjwEqBulDP+JQDwH69MQdxmIEOa+Uto7KCzzTq//VKUntGpBBZDEsWoi a6U41wp0NhqVqQvfOC31VFch/0P1orMhhqUZ8iV4cydmwaYMdx9o4R4XloGQwlOapS17pprEIz HV729nVUWdeiFzNxkTYOzX4ZHcYDE+FP8Jkc1oyi6L9omW5w8/9qjSAMjzcvfd7oWsrFBRdw+T 6K3qyiagSeDrGMhMxOxS3x9WaYfQRbOYEcf7351ExUe/cIEe8hWzm/Hv1bCPnvlBZHVJAMngJW 7oQ=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Jan 09, 2020 at 12:24:05PM +0100, Roger Pau Monné wrote:
> On Thu, Jan 09, 2020 at 10:43:01AM +0100, Jan Beulich wrote:
> > On 08.01.2020 19:14, Roger Pau Monné wrote:
> > > On Wed, Jan 08, 2020 at 02:54:57PM +0100, Jan Beulich wrote:
> > >> On 08.01.2020 14:30, Roger Pau Monné  wrote:
> > >>> On Fri, Jan 03, 2020 at 01:55:51PM +0100, Jan Beulich wrote:
> > >>>> On 03.01.2020 13:34, Roger Pau Monné wrote:
> > >>>>> On Fri, Jan 03, 2020 at 01:08:20PM +0100, Jan Beulich wrote:
> > >>>>>> On 24.12.2019 13:44, Roger Pau Monne wrote:
> > >>>>>> Further a question on lock nesting: Since the commit message
> > >>>>>> doesn't say anything in this regard, did you check there are no
> > >>>>>> TLB flush invocations with the get_cpu_maps() lock held?
> > >>>>>
> > >>>>> The CPU maps lock is a recursive one, so it should be fine to attempt
> > >>>>> a TLB flush with the lock already held.
> > >>>>
> > >>>> When already held by the same CPU - sure. It being a recursive
> > >>>> one (which I paid attention to when writing my earlier reply)
> > >>>> doesn't make it (together with any other one) immune against
> > >>>> ABBA deadlocks, though.
> > >>>
> > >>> There's no possibility of a deadlock here because get_cpu_maps does a
> > >>> trylock, so if another cpu is holding the lock the flush will just
> > >>> fallback to not using the shorthand.
> > >>
> > >> Well, with the _exact_ arrangements (flush_lock used only in one
> > >> place, and cpu_add_remove_lock only used in ways similar to how it
> > >> is used now), there's no such risk, I agree. But there's nothing
> > >> at all making sure this doesn't change. Hence, as said, at the very
> > >> least this needs reasoning about in the description (or a code
> > >> comment).
> > > 
> > > I'm afraid you will have to bear with me, but I'm still not sure how
> > > the addition of get_cpu_maps in flush_area_mask can lead to deadlocks.
> > > As said above get_cpu_maps does a trylock, which means that it will
> > > never deadlock, and that's the only way to lock cpu_add_remove_lock.
> > 
> > That's why I said "cpu_add_remove_lock only used in ways similar to
> > how it is used now" - any non-trylock addition would break the
> > assumptions you make afaict. And you can't really expect people
> > adding another (slightly different) use of an existing lock to be
> > aware they now need to check whether this introduces deadlock
> > scenarios on unrelated pre-existing code paths. Hence my request to
> > at least discuss this in the description (raising awareness, and
> > hence allowing reviewers to judge whether further precautionary
> > measures should be taken).
> 
> Thanks for the clarification, I did indeed misunderstood your
> complain.
> 
> Regarding the generalization of the shorthand and integration into
> send_IPI_mask, I've found an issue related to locking. send_IPI_mask
> is called with both interrupts enabled and disabled, and so
> get_cpu_maps panics because of the lock checking.
> 
> I however don't think that such panic is accurate: the logic in
> check_lock specifically relates to the spinning done when picking a
> lock, but I would say the call to check_lock in
> _spin_trylock{_recursive} is not strictly needed, the scenario
> described in check_lock doesn't apply when using trylock.

Never mind, this is not actually true. You can still trigger the
deadlock if you mix trylock with regular locking, so I guess I will
leave the shorthand usage to flush_area_mask unless I can make all the
callers of send_IPI_mask use the same irq status.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.