
Re: [Xen-devel] Possible Xen grant table locking improvements



On 23/10/14 10:46, Tim Deegan wrote:
> Hi,
> 
> At 18:35 +0100 on 20 Oct (1413826547), David Vrabel wrote:
>> Most guests do not map a grant reference more than twice (Linux, for
>> example, will typically map a gref once in the kernel address space, or
>> twice if a userspace mapping is required).  The maptrack entries for
>> these two mappings can be stored in the active entry (the "fast"
>> entries).  If more than two mappings are required, the existing maptrack
>> table can be used (the "slow" entries).
> 
> Sounds good, as long as the hit rate is indeed high.  Do you know if
> the BSD/windows client code behaves this way too?

I don't know about BSD, but XenServer's Windows PV drivers don't
support mapping grants (only granting access).

>> A maptrack handle for a "fast" entry is encoded as:
>>
>>     31 30          16  15            0
>>   +---+---------------+---------------+
>>   | F | domid         | gref          |
>>   +---+---------------+---------------+
>>
>> F is set for a "fast" entry, and clear for a "slow" one. Grant
>> references above 2^16 will have to be tracked with "slow" entries.
> 
> How restricting is that limit?  Would 2^15½ and also encoding
> which of the two entries to look at be good?

Oh,  I forgot about the bit for the entry index.

2^15 entries allows for (e.g.) 8 multiqueue VIFs in an 8 VCPU guest,
which is not a huge number and not a limit I would like to introduce.

One possibility would be to require guests wanting to use the fast path
to use a new grant unmap hypercall that also passes the original grant
ref and domain.
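
To make the encoding in the quoted proposal concrete, here is a minimal
sketch of packing/unpacking such a handle (names and helpers are mine
and purely illustrative; the entry-index bit Tim raises, or the
alternative unmap hypercall above, would change this layout):

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative encoding of a "fast" maptrack handle (hypothetical names). */
    #define MAPTRACK_FAST         (1u << 31)   /* F: set for a "fast" entry    */
    #define MAPTRACK_DOMID_SHIFT  16
    #define MAPTRACK_GREF_MASK    0xffffu      /* grefs >= 2^16 must go "slow" */

    static inline uint32_t fast_handle(uint16_t domid, uint16_t gref)
    {
        return MAPTRACK_FAST | ((uint32_t)domid << MAPTRACK_DOMID_SHIFT) | gref;
    }

    static inline bool handle_is_fast(uint32_t handle)
    {
        return (handle & MAPTRACK_FAST) != 0;
    }

    static inline uint16_t handle_domid(uint32_t handle)
    {
        return (handle >> MAPTRACK_DOMID_SHIFT) & 0x7fffu;  /* 15-bit field */
    }

    static inline uint16_t handle_gref(uint32_t handle)
    {
        return handle & MAPTRACK_GREF_MASK;
    }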

>> We can omit taking the grant table lock to check the validity of a grant
>> ref or maptrack handle since these tables only grow and do not shrink.
> 
> Can you also avoid the lock for accessing the entry itself, with a bit
> of RCU magic?  Maybe that's overengineering things.

I don't think this will be necessary -- the active entry lock won't be
contended.
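
To illustrate the point above about omitting the lock for the validity
check, assuming a grow-only entry count (structures and field names
here are hypothetical, not the real Xen ones):

    #include <stdbool.h>

    /* Hypothetical, simplified structures -- not the real Xen ones. */
    typedef unsigned int grant_ref_t;

    struct grant_table {
        unsigned int nr_grant_entries;   /* only ever increases */
        /* ... entries, maptrack, locks ... */
    };

    /*
     * Because the table never shrinks, any ref below a value of
     * nr_grant_entries we once observed remains in bounds, so a single
     * atomic read is enough -- no grant table lock needed for the check.
     */
    static inline bool gref_is_valid(const struct grant_table *gt,
                                     grant_ref_t ref)
    {
        return ref < __atomic_load_n(&gt->nr_grant_entries, __ATOMIC_ACQUIRE);
    }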

>> If strict IOMMU mode is used, IOMMU mappings are updated on every grant
>> map/unmap.  These are currently setup such that BFN == MFN which
>> requires reference counting the IOMMU mappings so they are only torn
>> down when all grefs for that MFN are unmapped.  This requires an
>> expensive mapcount() operation that iterates over the whole maptrack table.
>>
>> There is no requirement for BFN == MFN so each grant map can create its
>> own IOMMU mapping.  This will require a region of bus address space that
>> does not overlap with RAM.
> 
> Hrmn.  That could be tricky to arrange.  And the reference counting
> might end up being cheaper than the extra IOMMU flush operations.
> (Also, how much would you bet that clients actually use the returned
> BFN correctly?)

Hmm. Yes, if a guest has assumed that BFN == MFN, it could read the PTE
and get the BFN that way.
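
For context, the mapcount() cost being discussed is essentially a
linear scan over the whole maptrack table on every unmap -- roughly
like this simplified sketch (not the actual Xen code; entry layout and
flag are illustrative):

    /* Hypothetical, simplified maptrack entry -- not the real Xen layout. */
    struct maptrack_entry {
        unsigned long mfn;
        unsigned int  flags;
    };

    #define MTF_IN_USE 0x1u   /* hypothetical "entry in use" flag */

    /*
     * O(maptrack size) scan: count how many live entries still map the
     * given MFN, so the IOMMU mapping is only torn down when the count
     * drops to zero.  This is the expensive part.
     */
    static unsigned int count_mappings_of_mfn(const struct maptrack_entry *mt,
                                              unsigned int nr_entries,
                                              unsigned long mfn)
    {
        unsigned int i, count = 0;

        for ( i = 0; i < nr_entries; i++ )
            if ( (mt[i].flags & MTF_IN_USE) && mt[i].mfn == mfn )
                count++;

        return count;
    }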

> Would it be enough to optimise mapcount() a bit?  We could organise the
> in-use maptrack entries as a hash table instead of (or as well as) a
> single linked list.
> 
> On similar lines, would it be worth fragmenting the maptrack itself
> (e.g. with per-page locks) to reduce locking contention instead of
> moving maptrack entries into the active entry?  If might be Good
> Enough[tm], and simpler to build/maintain than this proposal.

With a per-maptrack-page lock you would still need a per-domain
maptrack lock protecting the maptrack free list.

A better idea may be to hash the domain and grant ref and have a
maptrack table per hash bucket.  Each maptrack table would have its own
maptrack lock.

The maptrack handle could be:

    31               16  15            0
    +-------------------+---------------+
    | bucket            | index         |
    +-------------------+---------------+
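
A rough sketch of how that could look (bucket count, hash function and
names are illustrative only):

    #include <stdint.h>

    #define MAPTRACK_BUCKETS       16     /* illustrative bucket count */
    #define MAPTRACK_BUCKET_SHIFT  16

    /* Any cheap mix of domid and gref would do for picking a bucket. */
    static inline unsigned int maptrack_bucket(uint16_t domid, uint32_t gref)
    {
        return (domid ^ gref ^ (gref >> 8)) % MAPTRACK_BUCKETS;
    }

    /* Handle layout from the diagram above: bucket in 31:16, index in 15:0. */
    static inline uint32_t maptrack_handle(unsigned int bucket, uint16_t index)
    {
        return ((uint32_t)bucket << MAPTRACK_BUCKET_SHIFT) | index;
    }

    static inline unsigned int handle_bucket(uint32_t handle)
    {
        return handle >> MAPTRACK_BUCKET_SHIFT;
    }

    static inline uint16_t handle_index(uint32_t handle)
    {
        return handle & 0xffffu;
    }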

We should probably try something simpler like this before getting
carried away...

David
