Re: [Xen-devel] [PATCH v3] altp2m: Allow the hostp2m to be shared
On Wed, May 25, 2016 at 5:58 PM, Tamas K Lengyel <tamas@xxxxxxxxxxxxx> wrote:
> On Wed, May 25, 2016 at 10:08 AM, George Dunlap
> <george.dunlap@xxxxxxxxxx> wrote:
>> On Wed, May 25, 2016 at 4:31 PM, Tamas K Lengyel <tamas@xxxxxxxxxxxxx> wrote:
>>>
>>> On May 25, 2016 05:27, "George Dunlap" <george.dunlap@xxxxxxxxxx> wrote:
>>>>
>>>> On Fri, Apr 29, 2016 at 6:42 PM, Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
>>>> wrote:
>>>> > Don't propagate altp2m changes from ept_set_entry for memshare, as
>>>> > memshare already has the lock. We call altp2m propagate changes once
>>>> > memshare successfully finishes. Allow the hostp2m entries to be of
>>>> > type p2m_ram_shared when applying mem_access. Also, do not trigger
>>>> > PoD for hostp2m when setting altp2m mem_access, to be in line with
>>>> > the non-altp2m mem_access path.
>>>>
>>>> Hey Tamas,
>>>>
>>>> Sorry for the long delay in getting back to you on this.
>>>
>>> No problem, thanks for taking a closer look!
>>>
>>>> So the main issue here (correct me if I'm wrong) is the locking
>>>> discipline: namely, mem_sharing_share_pages():
>>>> - Grabs the hostp2m lock
>>>> - Grabs the appropriate domain memsharing locks
>>>> - Calls set_shared_p2m_entry(), which ends up calling ept_set_entry(),
>>>>   which (when altp2m is active) grabs the altp2mlist and altp2m locks.
>>>>
>>>> This causes an ASSERT(), since the altp2mlist lock is ahead of the
>>>> memsharing locks in the list.
>>>>
>>>> But having taken a closer look at the code, I'm not sure the change is
>>>> quite correct. Please correct me if I've misread something:
>>>>
>>>> mem_sharing_share_pages() is passed two <domain,gfn> pairs -- the
>>>> <sd,sgfn> (which I assume stands for "shared gfn") and <cd,cgfn>
>>>> (which I assume stands for "copy"); and it
>>>
>>> Here s/c stands for source/client.
>>>
>>>> 1) Looks up smfn and cmfn, which back sgfn and cgfn respectively
>>>> 2) Looks up cmfn, which backs cgfn, then replaces all gfn entries which
>>>> point to cmfn with smfn (updating accounting as appropriate)
>>>
>>> Hm, I might have missed that. Where does it do the lookup for all other
>>> cgfns backed by this cmfn?
>>
>> I was looking at the loop in the middle of the function:
>>
>>     while ( (gfn = rmap_iterate(cpage, &ri)) != NULL ) {
>>         ...
>>     }
>>
>> I haven't chased it down, but it looks like this walks the reverse map
>> of all gfns which map cpage; and for each such gfn it finds, it:
>>  * Removes the cpage -> gfn rmap
>>  * Adds an spage -> gfn map
>>  * Reduces the type count of cpage
>>  * Sets the p2m entry for that gfn to the smfn (rather than cmfn).
>>
>> Obviously the common case is that the number of mappings is exactly 1;
>> but we need to either ensure that this is always true, or we need to
>> handle the case where it's not true. :-)
>>
>>>> But this change will only call p2m_altp2m_propagate_change() for the
>>>> original cgfn -- any other gfns which are backed by cmfn will not have
>>>> the corresponding altp2m entries propagated properly.
>>>
>>> Right, if there is some other place where it does sharing in the back we
>>> would have to propagate that change.
>>>
>>>> This sort of mistake is easy to make, which is why I think we should
>>>> try to always update the altp2ms in ept_set_entry() if we can, to
>>>> minimize the opportunity for making this sort of mistake.
>>>>
>>>> Is there ever a reason to grab the altp2m lock and *then* grab the
>>>> sharing lock? Could we just move the sharing lock up between the p2m
>>>> lock and the altp2mlist lock instead?
>>>
>>> I can't think of a scenario where we would get to sharing from altp2m
>>> with altp2m locking first. Not sure what you mean by moving the sharing
>>> lock up though.
>>> The problem is that sharing already has the lock by the time altp2m
>>> tries to lock, so we could pass that info down to make altp2m aware it
>>> needs no locking. It would require extending a bunch of functions though
>>> with an extra input that is barely ever used.
>>
>> If you have altp2m there are three locks. There's one p2m lock for
>> the "host" p2m (that is, Xen's idea of what the mapping should look
>> like). Then there's the altp2mlist lock, which protects the *list* of
>> altp2ms; then each altp2m itself has its own lock. These are defined
>> in mm-locks.h and must be grabbed in that order: p2m before
>> altp2mlist, altp2mlist before altp2m.
>>
>> I assume that the memsharing code is grabbing the hostp2m lock (it
>> should be anyway), then grabbing the memsharing locks. This is allowed
>> because the memsharing locks are defined after the p2m lock in
>> mm-locks.h. But then when updating the p2m entry, if you have an
>> altp2m active, it then tries to grab the altp2mlist lock so it can
>> iterate over the altp2ms. Since the altp2mlist lock is *before* the
>> sharing lock in mm-locks.h, this triggers an assert.
>>
>> Is that not what your issue is?
>
> Ahh, I see! Let me give that a try - TBH this locking order
> enforcement based on position in mm-locks.h was not entirely clear to
> me =)

Indeed, it is a bit strange, but if you see the number of locks that must
be ordered properly to avoid deadlock (what, 8 or so?) it's really the
only sane way to make sure things are kept straight. The original
implementation actually used line numbers from mm-locks.h to declare the
order, but I *think* a patch recently went in to change those to explicit
enumerations, to make xsplice patches easier.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
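[Editor's note] The rmap walk George describes can be sketched as a small standalone toy in C. This is a hypothetical simplification, not Xen's actual mem_sharing code: `struct page`, `share_all_mappings()`, and the flat `p2m[]` table are invented here purely to illustrate why every gfn in the reverse map, not just the one passed in, needs its p2m (and hence altp2m) entry updated.

```c
/* Hypothetical sketch of the rmap walk discussed above -- NOT Xen's
 * actual mem_sharing code.  Each shared page keeps a reverse map
 * (rmap) of every gfn that maps it; sharing must repoint *every*
 * such gfn to the source page, not just the one gfn passed in. */
#include <assert.h>
#include <stddef.h>

#define MAX_RMAP 8
#define MAX_GFN  16

struct page {
    int    mfn;              /* machine frame backing this page */
    int    rmap[MAX_RMAP];   /* gfns currently mapping this page */
    size_t nr_rmap;
    int    type_count;       /* number of p2m entries using this page */
};

int p2m[MAX_GFN];            /* toy p2m table: gfn -> mfn */

/* Repoint every gfn in cpage's rmap from cpage to spage, keeping the
 * rmap and type-count accounting consistent -- the analogue of the
 * while ( (gfn = rmap_iterate(cpage, &ri)) != NULL ) loop above. */
void share_all_mappings(struct page *spage, struct page *cpage)
{
    while (cpage->nr_rmap > 0) {
        int gfn = cpage->rmap[--cpage->nr_rmap]; /* remove cpage -> gfn rmap */

        spage->rmap[spage->nr_rmap++] = gfn;     /* add spage -> gfn rmap */
        cpage->type_count--;                     /* cpage loses a user */
        spage->type_count++;                     /* spage gains one */
        p2m[gfn] = spage->mfn;                   /* set p2m entry to smfn */

        /* George's point: in Xen, this per-gfn p2m write is also where
         * altp2m propagation must happen -- once per gfn, not once per
         * top-level sharing call. */
    }
}
```

The common case is a single rmap entry, but the loop makes the many-mappings case work too, which is exactly why propagating the altp2m change only for the original cgfn would miss entries.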
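[Editor's note] The position-based lock-ordering discipline George explains can also be sketched in C. This is a minimal hypothetical model, not the real mm-locks.h macros: the level values, `mm_lock()`, and `mm_unlock()` are invented for illustration; Xen's version derives each lock's level from its declaration position in mm-locks.h (originally the line number, later an explicit enumeration) and tracks the held level per CPU.

```c
/* Hypothetical sketch of ordered lock acquisition -- NOT the real
 * mm-locks.h.  Every lock gets a fixed level, and locks may only be
 * taken in increasing level order; violating the order trips an
 * assert immediately instead of deadlocking rarely. */
#include <assert.h>

enum mm_lock_level {
    LVL_P2M        = 1,   /* hostp2m lock: taken first */
    LVL_ALTP2MLIST = 2,   /* protects the list of altp2ms */
    LVL_ALTP2M     = 3,   /* each altp2m's own lock */
    LVL_SHARING    = 4,   /* memsharing lock: ordered after altp2mlist */
};

static enum mm_lock_level held[8];  /* stack of held levels (per-CPU in Xen) */
static int nr_held;

/* A new lock may only be taken at a strictly deeper level than the
 * deepest lock already held. */
int lock_order_ok(enum mm_lock_level level)
{
    return nr_held == 0 || level > held[nr_held - 1];
}

void mm_lock(enum mm_lock_level level)
{
    /* Taking a lock at or below the deepest held level could form an
     * ABBA deadlock against another CPU, so it is forbidden outright. */
    assert(lock_order_ok(level));
    held[nr_held++] = level;
    /* ... the actual spinlock acquisition would go here ... */
}

void mm_unlock(void)
{
    assert(nr_held > 0);
    nr_held--;
}
```

In this model the bug from the thread is visible directly: after `mm_lock(LVL_P2M)` and `mm_lock(LVL_SHARING)`, a subsequent `mm_lock(LVL_ALTP2MLIST)` fails `lock_order_ok()` because 2 is not greater than 4 -- the analogue of the ASSERT the sharing path trips when ept_set_entry() tries to iterate the altp2ms.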