[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain

On Wed, Dec 19, 2018 at 02:46:55PM +0000, George Dunlap wrote:
> On 12/19/18 2:40 PM, Roger Pau Monné wrote:
> > On Wed, Dec 19, 2018 at 12:38:50PM +0000, George Dunlap wrote:
> >> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
> >>> On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
> >>>> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
> >>>>> paging_log_dirty_op function takes mm locks from a subject domain and
> >>>>> then attempts to perform copy to operations against the caller
> >>>>> domain in order to copy the result of the hypercall into the caller
> >>>>> provided buffer.
> >>>>>
> >>>>> This works fine when the caller is a non-paging domain, but triggers a
> >>>>> lock order panic when the caller is a paging domain due to the fact
> >>>>> that at the point where the copy to operation is performed the subject
> >>>>> domain paging lock is locked, and the copy operation requires locking
> >>>>> the caller p2m lock which has a lower level.
> >>>>>
> >>>>> Fix this limitation by adding a bias to the level of the caller domain
> >>>>> mm locks, so that the lower caller domain mm lock always has a level
> >>>>> greater than the higher subject domain lock level. This allows locking
> >>>>> the subject domain mm locks and then locking the caller domain mm
> >>>>> locks, while keeping the same lock ordering and the changes mostly
> >>>>> confined to mm-locks.h.
> >>>>>
> >>>>> Note that so far only this flow (locking a subject domain locks and
> >>>>> then the caller domain ones) has been identified, but not all possible
> >>>>> code paths have been inspected. Hence this solution attempts to be a
> >>>>> non-intrusive fix for the problem at hand, without discarding further
> >>>>> changes in the future if other valid code paths are found that require
> >>>>> more complex lock level ordering.
> >>>>>
> >>>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >>>>
> >>>> As a quick fix I think this general approach is OK; the thing I don't
> >>>> like is that it's symmetric.  We don't *expect* to ever have a situation
> >>>> where A grabs one of its own MM locks and then one of B's, *and* B then
> >>>> grabs one of its own locks and then A's; but it could happen.
> >>>
> >>> I have not identified such scenario ATM, but we cannot discard future
> >>> features needing such interlocking I guess. In any case, I think this
> >>> is something that would have to be solved when we came across such
> >>> scenario IMO.
> >>
> >> Right -- and the purpose of these macros is to make sure that we
> >> discover such potential deadlocks in testing rather than in production.
> >>
> >>>> Since we've generally identified dom0 which may be grabbing locks of a
> >>>> PVH stubdom, which may be grabbing logs of a normal domU, would it be
> >>>> possible / make sense instead to give a 2x bonus for dom0, and a 1x
> >>>> bonus for "is_priv_for" domains?
> >>>
> >>> Jan pointed out such case, but I'm not sure I can see how this is
> >>> supposedly to happen even given the scenario above, I have to admit
> >>> however I'm not that familiar with the mm code, so it's likely I'm
> >>> missing something.
> >>>
> >>> Hypercalls AFAIK have a single target (or subject) domain, so even if
> >>> there's a stubdomain relation I'm not sure I see why that would
> >>> require this kind of locking, any domain can perform hypercalls
> >>> against a single subject domain, and the hypervisor itself doesn't
> >>> even know about stubdomain relations.
> >>
> >> We're considering three potential cases:
> >>
> >> A. dom0 makes a hypercall w/ domU as a target.
> >> B. dom0 makes a hypercall w/ stubdom as a target.
> >> c. stubdom makes a hypercall w/ domU as a target.
> >>
> >> We could give only dom0 a bonus.  In that case, A and B would work, but
> >> C might fail (since stubdom's lock values are the same as domU's).
> >>
> >> We could give both dom0 and stubdoms a bonus.  In that case, A and C
> >> would work, but B might fail (since the stubdom's lock values are the
> >> same as dom0's).
> >>
> >> Or, we could do what I've proposed: give stubdom a bonus, and dom0 a
> >> double bonus.  That way all 3 work, since dom0's lock values !=
> >> stubdom's lock values, and stubdom's lock values != domU's lock values.
> >>
> >> On the other hand, starting simple and adding things in as you find you
> >> need them isn't a bad approach; so possibly just giving a bonus to dom0
> >> is a good place to start.
> > 
> > IMO just giving a bonus to the caller domain (current->domain) is even
> > easier. I've only spotted a single case where there's such
> > interleaved domain locking, which is due to a copy_to into a caller
> > provided buffer while having some subject's domain mm locks taken.
> > 
> > On the line of your reply below, I would leave more complex locking
> > level adjustment to further patches if there's such a need.
> Not sure how it's easier -- one is (current == d), the other is
> is_control_domain(d).

Using is_control_domain will limit the usage of paging_log_dirty_op to
Dom0 (because that's the only domain that has is_privileged set,
there's no way for the toolstack to create a domain with is_privileged
set), no other paging domain will be able to use such hypercall
without triggering the lock level check. Is that really what we

I assume that under a distributed Xen system it would be possible to
issue this hypercall from a domain different that Dom0 if it's given
the right privileges?

> Using 'current' means that potential deadlocks which would now cause a
> BUG() won't anymore.  I'm fine with not adding extra protections that
> aren't there now; but I don't want to remove protections that are.

The lock ordering enforcement is still kept as-is, but Xen is allowed
to lock the caller mm locks in the right order (be it privileged or
not) after having locked a subject domain ones also in the correct

Thanks, Roger.

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.