[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain

On Wed, Dec 19, 2018 at 12:38:50PM +0000, George Dunlap wrote:
> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
> > On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
> >> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
> >>> paging_log_dirty_op function takes mm locks from a subject domain and
> >>> then attempts to perform copy to operations against the caller
> >>> domain in order to copy the result of the hypercall into the caller
> >>> provided buffer.
> >>>
> >>> This works fine when the caller is a non-paging domain, but triggers a
> >>> lock order panic when the caller is a paging domain due to the fact
> >>> that at the point where the copy to operation is performed the subject
> >>> domain paging lock is locked, and the copy operation requires locking
> >>> the caller p2m lock which has a lower level.
> >>>
> >>> Fix this limitation by adding a bias to the level of the caller domain
> >>> mm locks, so that the lower caller domain mm lock always has a level
> >>> greater than the higher subject domain lock level. This allows locking
> >>> the subject domain mm locks and then locking the caller domain mm
> >>> locks, while keeping the same lock ordering and the changes mostly
> >>> confined to mm-locks.h.
> >>>
> >>> Note that so far only this flow (locking a subject domain locks and
> >>> then the caller domain ones) has been identified, but not all possible
> >>> code paths have been inspected. Hence this solution attempts to be a
> >>> non-intrusive fix for the problem at hand, without discarding further
> >>> changes in the future if other valid code paths are found that require
> >>> more complex lock level ordering.
> >>>
> >>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >>
> >> As a quick fix I think this general approach is OK; the thing I don't
> >> like is that it's symmetric.  We don't *expect* to ever have a situation
> >> where A grabs one of its own MM locks and then one of B's, *and* B then
> >> grabs one of its own locks and then A's; but it could happen.
> > 
> > I have not identified such scenario ATM, but we cannot discard future
> > features needing such interlocking I guess. In any case, I think this
> > is something that would have to be solved when we came across such
> > scenario IMO.
> Right -- and the purpose of these macros is to make sure that we
> discover such potential deadlocks in testing rather than in production.
> >> Since we've generally identified dom0 which may be grabbing locks of a
> >> PVH stubdom, which may be grabbing logs of a normal domU, would it be
> >> possible / make sense instead to give a 2x bonus for dom0, and a 1x
> >> bonus for "is_priv_for" domains?
> > 
> > Jan pointed out such case, but I'm not sure I can see how this is
> > supposedly to happen even given the scenario above, I have to admit
> > however I'm not that familiar with the mm code, so it's likely I'm
> > missing something.
> > 
> > Hypercalls AFAIK have a single target (or subject) domain, so even if
> > there's a stubdomain relation I'm not sure I see why that would
> > require this kind of locking, any domain can perform hypercalls
> > against a single subject domain, and the hypervisor itself doesn't
> > even know about stubdomain relations.
> We're considering three potential cases:
> A. dom0 makes a hypercall w/ domU as a target.
> B. dom0 makes a hypercall w/ stubdom as a target.
> c. stubdom makes a hypercall w/ domU as a target.
> We could give only dom0 a bonus.  In that case, A and B would work, but
> C might fail (since stubdom's lock values are the same as domU's).
> We could give both dom0 and stubdoms a bonus.  In that case, A and C
> would work, but B might fail (since the stubdom's lock values are the
> same as dom0's).
> Or, we could do what I've proposed: give stubdom a bonus, and dom0 a
> double bonus.  That way all 3 work, since dom0's lock values !=
> stubdom's lock values, and stubdom's lock values != domU's lock values.
> On the other hand, starting simple and adding things in as you find you
> need them isn't a bad approach; so possibly just giving a bonus to dom0
> is a good place to start.

IMO just giving a bonus to the caller domain (current->domain) is even
easier. I've only spotted a single case where there's such
interleaved domain locking, which is due to a copy_to into a caller
provided buffer while having some subject's domain mm locks taken.

On the line of your reply below, I would leave more complex locking
level adjustment to further patches if there's such a need.

> >> I also think it would be good to start recording specific instances of
> >> ordering requirements, so that we don't have to go track them down.
> >> (It's not immediately obvious to me, for instance, why the paging lock
> >> is so far down the list.)  Could you add somewhere into the comments in
> >> this section something like this:
> >>
> >> paging_log_dirty_op grabs tgt paging_lock, then does copy_to_user which
> >> grabs dom0's P2M lock.
> >>
> >> The other thing we could do is generate the lock level as (OWNER_TYPE <<
> >> 11) | (LOCK_TYPE << 5) | (domain_id), where OWNER_TYPE is 2 for dom0, 1
> >> for stub domains, 0 for normal domains, and LOCK_TYPE is the current
> >> lock level; and then fail if the new lock level >= current lock level.
> >> That would flag up any potential inter-domain deadlocks as well.
> > 
> > I'm not sure it will catch all inter-domain locks. For example Xen
> > would still allow to take the same lock type from a domain with a
> > higher domain_id than the currently locked one if both have the same
> If we wanted to ASSERT on all interdomain locks, we could just change
> the '>' to a '>=' and call it a day.  The idea instead is to allow
> inter-domain locking *if* it follows the "grab the lower domid lock
> first" discipline (in which case there won't be any deadlocks).
> But maybe that's beginning to get into 'feature creep'.  Best leave that
> for an improvement patch, not a bug fix.

As you say, I would rather perform such changes when there's a real
need for them.

Thanks, Roger.

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.