Re: [Xen-devel] [PATCH] xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings
On Wed, 19 Jul 2017 11:00:26 -0700 (PDT) Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:

> My expectation is that unlocked mappings are much more frequent than
> locked mappings. Also, I expect that only very rarely we'll be able to
> reuse locked mappings. Over the course of a VM lifetime, it seems to me
> that walking the list every time would cost more than it would benefit.
>
> These are only "expectations", I would love to see numbers. Numbers make
> for better decisions :-) Would you be up for gathering some of these
> numbers? Such as how many times you get to reuse locked mappings and how
> many times we walk items on the list fruitlessly?
>
> Otherwise, would you be up for just testing the modified version of the
> patch I sent to verify that solves the bug?

Numbers will show that there is a single entry in the bucket's list most
of the time. :) Even two entries are a rare sight, typically seen only
when the guest performs some intensive I/O. OK, I'll collect some real
stats for different scenarios; these are interesting numbers and might
come in useful for later optimizations.

The approach you proposed is good, but it allows reusing a suitable
locked entry only when it comes first in the list (the existing
behavior). We could actually reuse a locked entry which comes next (if
any) in the list as well. In the situation where a lock=0 entry comes
first in the list and a lock=1 entry is second, there is a chance the
first entry was a 2MB-type one (there must be some reason why the 2nd
entry was added to the list), so picking it for a lock=0 request might
result in xen_remap_bucket... which should be avoided. Anyway, it is not
a big deal which approach is better, as these situations are uncommon.
After all, it is mostly just a single entry in the bucket's list.

> > One possible minor optimization for xen-mapcache would be to reuse
> > larger mappings for mappings of lesser cache_size. Right now existing
> > code does checks in the "entry->size == cache_size" manner, while we
> > can use "entry->size >= cache_size" here. However, we may end up with
> > resident MapCacheEntries being mapped to a bigger mapping sizes than
> > necessary and thus might need to add remapping back to the normal size
> > in xen_invalidate_map_cache_entry_unlocked() when there are no other
> > mappings.
>
> Yes, I thought about it, that would be a good improvement to have.

Well, it appears there is a lot of room for improvement in xen-mapcache
usage. Probably getting rid of the lock=0/lock=1 request separation
would allow us to drastically reduce the number of xen_remap_bucket
calls.

There also might be a bug where lock=0 mappings get remapped by
concurrent xen-mapcache requests. The whole xen_map_cache(addr, 0,
lock=0) thing looks very strange. It seems the idea was to have a way to
obtain a temporary mapping in order to read some tiny item from the
guest's RAM, and afterwards to leave this mapping on its own without
bothering to unmap it. So it will either be reused later by some other
lock=0/1 request or even remapped. It appears that lock=0 mappings are
very fragile. Their typical usage scenario looks like this:

    rcu_read_lock();
    ...
    ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
    memcpy(buf, ptr, len);
    ...
    rcu_read_unlock();

Here qemu_map_ram_ptr calls xen_map_cache(lock=0), which returns the
actual ptr. This scenario assumes there will be no intervention between
qemu_map_ram_ptr and rcu_read_unlock, which is what keeps ptr valid.
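For reference, the lock=0 path looks roughly like this. This is a
trimmed sketch from memory, not verbatim code, so details like the
offset==0 special case and the helper names (xen_enabled, ramblock_ptr)
may not match the tree under discussion exactly:

    /* Rough sketch of qemu_map_ram_ptr() under Xen -- trimmed, not verbatim.
     * When the RAMBlock has no host mapping, the pointer comes straight out
     * of the mapcache; for the block at offset 0 it is an *unlocked* entry. */
    void *qemu_map_ram_ptr(RAMBlock *block, ram_addr_t addr)
    {
        if (xen_enabled() && block->host == NULL) {
            if (block->offset == 0) {
                /* lock=0: nothing pins this MapCacheEntry once we return */
                return xen_map_cache(addr, 0, 0);
            }
            block->host = xen_map_cache(block->offset, block->max_length, 1);
        }
        return ramblock_ptr(block, addr);
    }

So the ptr handed back for the memcpy above may be backed by a
MapCacheEntry that nothing is holding a reference to.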
This might be OK for QEMU alone, but with the underlying xen-mapcache
usage it seems to be assumed that the RCU read lock provides protection
against concurrent remappings of ptr's MapCacheEntry... which it
obviously doesn't. The problem is that rcu_read_lock() seems to protect
QEMU's own structures only, leaving us with just mapcache_(un)lock to
synchronize execution. But upon return from qemu_map_ram_ptr we no
longer hold the xen-mapcache lock, so the question is how the RCU read
lock is supposed to save us from concurrent
qemu_map_ram_ptr/qemu_ram_ptr_length calls. If there is some DMA mapping
(lock=1) for that address_index, or even another lock=0 read (of a
different size), it will see an unlocked entry which can be remapped
without hesitation, breaking the ptr mapping which might still be in
use.
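To make that window concrete, here is an approximation of the lookup in
xen_map_cache_unlocked (again from memory, with the test_bits/valid
mapping checks and the allocation of a new list entry dropped, so treat
it as a sketch rather than the exact upstream code):

    /* Approximated bucket lookup in xen_map_cache_unlocked() -- not verbatim. */
    entry = &mapcache->entry[address_index % mapcache->nr_buckets];

    /* Only *locked* entries that don't match the request are skipped over... */
    while (entry && entry->lock && entry->vaddr_base &&
           (entry->paddr_index != address_index ||
            entry->size != cache_size)) {
        entry = entry->next;
    }

    /* ...so the first unlocked entry is fair game: if it doesn't match the
     * new request, it simply gets remapped, invalidating any ptr an earlier
     * lock=0 caller obtained from it and may still be dereferencing.         */
    if (entry && !entry->lock &&
        (entry->paddr_index != address_index || entry->size != cache_size)) {
        xen_remap_bucket(entry, cache_size, address_index);
    }

Nothing on that path checks whether somebody is still using a pointer
handed out from the old mapping; the mapcache lock only serializes the
lookups themselves, and the RCU read-side critical section is invisible
here.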
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel