[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3] altp2m: Allow the hostp2m to be shared
On Wed, May 25, 2016 at 10:08 AM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote: > On Wed, May 25, 2016 at 4:31 PM, Tamas K Lengyel <tamas@xxxxxxxxxxxxx> wrote: >> >> On May 25, 2016 05:27, "George Dunlap" <george.dunlap@xxxxxxxxxx> wrote: >>> >>> On Fri, Apr 29, 2016 at 6:42 PM, Tamas K Lengyel <tamas@xxxxxxxxxxxxx> >>> wrote: >>> > Don't propagate altp2m changes from ept_set_entry for memshare as >>> > memshare >>> > already has the lock. We call altp2m propagate changes once memshare >>> > successfully finishes. Allow the hostp2m entries to be of type >>> > p2m_ram_shared when applying mem_access. Also, do not trigger PoD for >>> > hostp2m >>> > when setting altp2m mem_access to be in-line with non-altp2m mem_access >>> > path. >>> >>> Hey Tamas, >>> >>> Sorry for the long delay in getting back to you on this. >> >> No problem, thanks for taking a closer look! >> >>> >>> So the main issue here (correct me if I'm wrong) is the locking >>> discipline: namely, men_sharing_share_pages(): >>> - Grabs the hostp2m lock >>> - Grabs the appropriate domain memsharing locks >>> - Calls set_shared_p2m_entry(), which ends up calling ept_set_entry(), >>> which (when altp2m is active) grabs the altp2mlist and altp2m locks. >>> >>> This causes an ASSERT(), since the altp2mlist lock is ahead of the >>> memsharing locks in the list. >>> >>> But having taken a closer look at the code, I'm not sure the change is >>> quite correct. Please correct me if I've misread something: >>> >>> mem_sharing_share_pages() is passed two <domain,gfn> pairs -- the >>> <sd,sgfn> (which I assume stands for "shared gfn") and <cd,cgfn> >>> (which I assume stands for "copy"); and it >> >> Here s/c stands for source/client. >> >>> 1) Looks up smfn and cmfn, which back sgfn and cmfn respectively >>> 2) Looks up cmfn, which backs cgfn then replaces all gfn entries which >>> point to cmfn with smfn (updating accounting as appropriate) >> >> Hm, I might have missed that. Where does it do the lookup for all other >> cgfns backed by this cmfn? > > I was looking at the loop in the middle of the function: > > while ( (gfn = rmap_iterate(cpage, &ri)) != NULL) { > ... > } > > I haven't chased it down, but it looks like this walks the reverse map > of all gfns which map cpage; and for each such gfn it finds it: > * removes the cpage -> gfn rmap > * Adds an spage -> gfn map > * Reduces the type count of cpage > * Sets the p2m entry for that gfn to the smfn (rather than cmfn). > > Obviously the common case is that the number of mappings is exactly 1; > but we need to either ensure that this is always true, or we need to > handle the case where it's not true. :-) > >>> But this change will only call p2m_altp2m_propagate_change() for the >>> original cgfn -- any other gfns which are backed by cmfn will not have >>> the corresponding altp2m entries propagated properly. >> >> Right, if there is some other place where it does sharing in the back we >> would have to propagate that change. >> >>> This sort of mistake is easy to make, which is why I think we should >>> try to always update the altp2ms in ept_set_entry() if we can, to >>> minimize the opportunity for making this sort of mistake. >>> >>> Is there ever a reason to grab the altp2m lock and *then* grab the >>> sharing lock? Could we just move the sharing lock up between the p2m >>> lock and the altp2mlist lock instead? >>> >> >> I can't think of a scenario where we would get to sharing from altp2m with >> altp2m locking first. Not sure what you mean by moving the sharing lock up >> though. The problem is that sharing already has the lock by the time altp2m >> tries to lock, so we could pass that info down to make altp2m aware it needs >> no locking. It would require extending a bunch of functions though with an >> extra input that is barely ever used.. > > If you have altp2m there are three locks. There's one p2m lock for > the "host" p2m (that is, Xen's idea of what the mapping should look > like). Then there's the altp2mlist lock, which protects the *list* of > altp2ms; then each altp2m itself has its own lock. These are defined > in mm-lock.h and must be grabbed in that order: p2m before > altp2mlist, altp2mlist before altp2m. > > I assume that the memsharing code is grabbing the hostp2m lock (it > should be anyway), then grabbing the memsharing locks. This is allowed > because the memsharing locks are defined after the p2m lock in > mm-lock.h. But then when updating the p2m entry, if you have an > altp2m active, it then tries to grab the altp2mlist lock so it can > iterate over the altp2ms. Since the altp2mlist lock is *before* the > sharing lock in mm-lock.h, this triggers an assert. > > Is that not what your issue is? Ahh, I see! Let me give that a try - TBH this locking order enforcement based on position in mm-lock.h was not entirely clear to me =) Tamas _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |