Re: [Xen-devel] [PATCH] x86: fix paging_log_dirty_op to work with paging guests
On Wed, Dec 12, 2018 at 09:07:14AM -0700, Jan Beulich wrote:
> >>> On 12.12.18 at 15:54, <roger.pau@xxxxxxxxxx> wrote:
> > Fix this by releasing the target paging lock before attempting to
> > perform the copy of the dirty bitmap, and then forcing a restart of
> > the whole process in case there have been changes to the dirty bitmap
> > tables.
>
> I'm afraid it's not that simple: The writer side (paging_mark_pfn_dirty())
> uses the same lock, and I think the assumption is that while the copying
> takes place no updates to the bitmap can occur. Then again the subject
> domain gets paused anyway, so updates probably can't happen (the
> "probably" of course needs to be eliminated).

I've looked into this, and I think you are right: the copy must be done
with the paging lock held. Even if the domain is paused, the device model
can still mark gfns as dirty using XEN_DMOP_modified_memory, so the only
option would be to copy to a temporary page while holding the lock, then
release the lock and copy the temporary page to the caller's buffer
(rough sketch at the end of this mail).

> > @@ -488,6 +490,16 @@ static int paging_log_dirty_op(struct domain *d,
> >              bytes = (unsigned int)((sc->pages - pages + 7) >> 3);
> >              if ( likely(peek) )
> >              {
> > +                if ( paging_mode_enabled(current->domain) )
> > +                    /*
> > +                     * Drop the target p2m lock, or else Xen will panic
> > +                     * when trying to acquire the p2m lock of the caller
> > +                     * due to invalid lock order. Note that there are no
> > +                     * lock ordering issues here, and the panic is due to
> > +                     * the fact that the lock level tracking doesn't record
> > +                     * the domain the lock belongs to.
> > +                     */
> > +                    paging_unlock(d);
>
> This makes it sound as if tracking the domain would help. It doesn't,
> at least not as long as there is not also some sort of ordering or
> hierarchy between domains. IOW I'd prefer if the "doesn't" became
> "can't".

Well, just keeping the correct order between each domain's locks should
be enough? Ie: exactly the same as Xen currently does, but on a
per-domain basis. This is feasible, but each CPU would need to store the
lock order of each possible domain:

DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);

This would consume ~32KB per CPU, which is not that much, but seems a
waste when most of the time a single entry will be used.

> Now before we go this route I think we need to consider whether
> this really is the only place where the two locks get into one
> another's way. That's because I don't think we want to introduce
> more of such, well, hackery, and hence we'd otherwise need a
> better solution. For example the locking model could be adjusted
> to allow such nesting in the general case: Dom0 and any domain
> whose ->target matches the subject domain here could be given
> a slightly different "weight" in the lock order violation check logic.

So locks from domains != current would be given a lower order, let's say:

#define MM_LOCK_ORDER_nestedp2m                8
#define MM_LOCK_ORDER_p2m                     16
#define MM_LOCK_ORDER_per_page_sharing        24
#define MM_LOCK_ORDER_altp2mlist              32
#define MM_LOCK_ORDER_altp2m                  40
#define MM_LOCK_ORDER_pod                     48
#define MM_LOCK_ORDER_page_alloc              56
#define MM_LOCK_ORDER_paging                  64
#define MM_LOCK_ORDER_MAX     MM_LOCK_ORDER_paging

If domain != current, the above values are used. If domain == current,
the above values plus MM_LOCK_ORDER_MAX are used. So in that case the
order of MM_LOCK_ORDER_p2m against the current domain would be
16 + 64 = 80?

This has the slight inconvenience that not all mm lock call sites have
the target domain available, but that can be solved.
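For reference, a minimal (untested) sketch of the temporary-page copy
mentioned at the top of this mail. The helper name is made up, the
on-stack buffer is for illustration only (a real version would use a
preallocated per-vCPU scratch page), and the relock/restart logic of
the surrounding loop is omitted:

/*
 * Snapshot one chunk of the dirty bitmap while holding the target's
 * paging lock, then drop the lock before copying out to the caller's
 * buffer (which may involve taking the caller's p2m lock when the
 * caller is itself a paging guest).
 */
static int peek_dirty_chunk(struct domain *d,
                            XEN_GUEST_HANDLE_64(uint8) dirty_bitmap,
                            const unsigned long *l1,
                            unsigned int pages, unsigned int bytes)
{
    uint8_t buf[PAGE_SIZE]; /* illustration only: too big for the stack */

    ASSERT(paging_locked_by_me(d));
    memcpy(buf, l1, bytes);
    paging_unlock(d);

    /* Safe to fault on the caller's buffer now that the lock is gone. */
    return copy_to_guest_offset(dirty_bitmap, pages >> 3, buf, bytes)
           ? -EFAULT : 0;
}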
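The per-domain tracking could then look something along these lines
(hypothetical adaptation of the current __check_lock_level() /
__set_lock_level() helpers in mm-locks.h, taking the subject domain's
ID as an extra parameter):

DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);

static inline void __check_lock_level(domid_t domid, int l)
{
    /* Order is only enforced among locks of the same domain. */
    if ( unlikely(this_cpu(mm_lock_level)[domid] > l) )
    {
        printk("mm locking order violation: d%d %i > %i\n",
               domid, this_cpu(mm_lock_level)[domid], l);
        BUG();
    }
}

static inline void __set_lock_level(domid_t domid, int l)
{
    this_cpu(mm_lock_level)[domid] = l;
}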
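And the weighting itself could be computed with something like the
below (again hypothetical, just encoding the "+ MM_LOCK_ORDER_MAX for
the current domain" rule from above):

/*
 * Locks taken on the current domain get the base order plus
 * MM_LOCK_ORDER_MAX, so they always nest inside locks already held
 * on a foreign (target) domain.
 */
static inline int mm_lock_order(const struct domain *d, int base)
{
    return (d == current->domain) ? base + MM_LOCK_ORDER_MAX : base;
}

Eg: mm_lock_order(d, MM_LOCK_ORDER_p2m) would yield 16 for a foreign
target and 80 when d == current->domain.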
Thanks, Roger.