[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/flush: use APIC ALLBUT destination shorthand when possible


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 9 Jan 2020 12:24:05 +0100
  • Authentication-results: esa3.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none; spf=None smtp.pra=roger.pau@xxxxxxxxxx; spf=Pass smtp.mailfrom=roger.pau@xxxxxxxxxx; spf=None smtp.helo=postmaster@xxxxxxxxxxxxxxx
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 09 Jan 2020 11:24:29 +0000
  • Ironport-sdr: rFVfMACi1DB6viumUg9oEGN9XgnZs9Nh4J0DTjNtuUZd4WHDjFjZ36Rx4lyoXjNk1Wg2ZHyT2p /Sng7OH8fh6PSsW++fMPUJiZc8Otm2uJoFTymlOf0PtWRVmrEyIAzNrbIKjMzWwEyVcJgCtGYr OkdqEA4tRqJuUcfju+XxbgScDrckNCMhsjJi6DtlgocJzXmgE6mRQQfXJavn64D8H0IelRoWoO rtm/nwy1GKSiZ7R4EBLg8Lg/ZTDZve3KX0QIwz7ZLzSnCtqYvPaCLjIHsiNYEg1HOPJw4kw5Ly A44=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Jan 09, 2020 at 10:43:01AM +0100, Jan Beulich wrote:
> On 08.01.2020 19:14, Roger Pau Monné wrote:
> > On Wed, Jan 08, 2020 at 02:54:57PM +0100, Jan Beulich wrote:
> >> On 08.01.2020 14:30, Roger Pau Monné  wrote:
> >>> On Fri, Jan 03, 2020 at 01:55:51PM +0100, Jan Beulich wrote:
> >>>> On 03.01.2020 13:34, Roger Pau Monné wrote:
> >>>>> On Fri, Jan 03, 2020 at 01:08:20PM +0100, Jan Beulich wrote:
> >>>>>> On 24.12.2019 13:44, Roger Pau Monne wrote:
> >>>>>> Further a question on lock nesting: Since the commit message
> >>>>>> doesn't say anything in this regard, did you check there are no
> >>>>>> TLB flush invocations with the get_cpu_maps() lock held?
> >>>>>
> >>>>> The CPU maps lock is a recursive one, so it should be fine to attempt
> >>>>> a TLB flush with the lock already held.
> >>>>
> >>>> When already held by the same CPU - sure. It being a recursive
> >>>> one (which I paid attention to when writing my earlier reply)
> >>>> doesn't make it (together with any other one) immune against
> >>>> ABBA deadlocks, though.
> >>>
> >>> There's no possibility of a deadlock here because get_cpu_maps does a
> >>> trylock, so if another cpu is holding the lock the flush will just
> >>> fallback to not using the shorthand.
> >>
> >> Well, with the _exact_ arrangements (flush_lock used only in one
> >> place, and cpu_add_remove_lock only used in ways similar to how it
> >> is used now), there's no such risk, I agree. But there's nothing
> >> at all making sure this doesn't change. Hence, as said, at the very
> >> least this needs reasoning about in the description (or a code
> >> comment).
> > 
> > I'm afraid you will have to bear with me, but I'm still not sure how
> > the addition of get_cpu_maps in flush_area_mask can lead to deadlocks.
> > As said above get_cpu_maps does a trylock, which means that it will
> > never deadlock, and that's the only way to lock cpu_add_remove_lock.
> 
> That's why I said "cpu_add_remove_lock only used in ways similar to
> how it is used now" - any non-trylock addition would break the
> assumptions you make afaict. And you can't really expect people
> adding another (slightly different) use of an existing lock to be
> aware they now need to check whether this introduces deadlock
> scenarios on unrelated pre-existing code paths. Hence my request to
> at least discuss this in the description (raising awareness, and
> hence allowing reviewers to judge whether further precautionary
> measures should be taken).

Thanks for the clarification, I did indeed misunderstood your
complain.

Regarding the generalization of the shorthand and integration into
send_IPI_mask, I've found an issue related to locking. send_IPI_mask
is called with both interrupts enabled and disabled, and so
get_cpu_maps panics because of the lock checking.

I however don't think that such panic is accurate: the logic in
check_lock specifically relates to the spinning done when picking a
lock, but I would say the call to check_lock in
_spin_trylock{_recursive} is not strictly needed, the scenario
described in check_lock doesn't apply when using trylock.

So my question is whether you would be OK to remove the calls to
check_lock in the trylock variants, or would you rather keep the
shorthand usage limited to flush_area_mask.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.