[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

x86/mm lock order enforcement

  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 30 Sep 2021 12:02:18 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=UjD7HxCfcb4pbywXa2JcJ2B7izfbj4CYQEkWazOF0kc=; b=kJf9r6Vlaya/SlARVC3rk+ZNMY/sYlyhmFxqHUJ+V88HhH+2xe3T/l+HKFdiZciI+IlrAjsAohlo/4usbWZLsisMUA3oFNpazhk8QaIa+uNOo1cSw5EyLJr62Jji5pRzfD4QBuXnxpJSMPx2FfDuDcT8eEy4vvl4KPuYZeQ8xuAhB1YHEFOuiPEifmqR6Q3IzQopZ6YnvUS5ZDB7NgBHvBGXsJLa8K7H1inEAX2qPHsXAOkKCrlYwoPY0AChr9bII92c7nCLYsZcxwLJrYeBt9Pt9DiwStulZ2ad3oee1/oKeIUIsQpJ/JQGnnUslZyqxF1Or7dfU3cDCXtw8I7hcA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=KTxULLFYfRsnAvrvBXSGQGZtJoBLb3JA+t/fvFjf/TgiM2ugxK1zTr3icFtufi/jNpcuKj3e42fAHljhJVvrL2W+Tr9JaEge36Os4VC0BUIGiM3QwH8TNDsTVmRS2kqwHhTPO/YlyYK5CbTyTaaQOnMBxElw3oNwZBx1EqtscNxLBOyIw1wUZe63+DmiLdEbZ9DThAqJcz7Bk9tsiAcKKo7apZm33UrlDO3/bac1+Hb7aECjxvbLrwg35dUnvsUkrGE61ChveB1VfJwbVKOhgl0TPyx/pufKkfoyiFfCcanSQMC0q4z2izU2J+UPiUWT29LipIlfhZCFJdtQtMsIlA==
  • Authentication-results: citrix.com; dkim=none (message not signed) header.d=none;citrix.com; dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>
  • Delivery-date: Thu, 30 Sep 2021 10:02:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


the issue pointed out in [1] (and hence the proposed fix in a reply) is
actually pointing out a bigger problem, addressing of which may also
invalidate that proposed fix.

First, fundamentally any code path reaching p2m_pt_set_entry() (i.e. in
particular any caller of p2m_set_entry()) is liable to hit the same
lock order violation if it holds _any_ of the locks numbered higher than
the p2m one in mm-locks.h, due to write_p2m_entry()'s call to

Similarly I think ept_set_entry() -> ept_sync_domain() ->
ept_sync_domain_prepare() -> p2m_flush_nestedp2m() is affected.

But what's imo worse - there's no enforcement (and afaics not even a
rule written down) when it comes to the many p2m locks associated with
a single domain: The host p2m has one, and every altp2m as well as every
nested p2m have one, too. _check_lock_level() only checks whether the
already held lock has a strictly lower number in the scheme, so all of
these p2m locks are only put in just two separate groups, the members of
which are then all equal right now. The two groups are altp2m (all the
altp2m locks) and the rest (host p2m and all nested p2ms). Within a
group the various locks can be freely acquired in any order without the
resulting issue getting pointed out.

A similar issue looks to have been recognized for mem-sharing's per-
page locking: These also all fall into a single group. share_pages()
has special logic to guarantee uniform order of acquires. (I don't
think the comment in mm-locks.h is quite right there, though: "We
enforce order here against the p2m lock, which is taken after the
page_lock to change the gfn's p2m entry." To me this contradicts both
the code in share_pages() and the numbering in the scheme.)

A first step here might be to have _check_lock_level() consider not
only the level, but also the involved lock. Since recursive acquires
are frequently intended, the two involved levels being equal would
only be permitted if it is the same lock that gets used.

Otoh it may be somehow implicit that all of the altp2m and similarly
all of the nested p2m locks may not nest with one another, in which
case it might be good enough to separate the nested p2m ones from the
host p2m one (just like the altp2m ones have their own group). If so,
I'm afraid I would see no way for myself to determine where in the
numbering scheme they should be placed - them currently sharing
position with the host p2m lock suggests that any lock numbered
higher may already get acquired somewhere with a nested p2m lock held.
Yet to address the original report they would need to go above of at
least the PoD lock.

Thoughts anyone?


[1] https://lists.xen.org/archives/html/xen-devel/2021-09/msg02231.html



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.