
Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen



>>> On 22.04.16 at 04:36, <haozhong.zhang@xxxxxxxxx> wrote:
> On 04/21/16 01:04, Jan Beulich wrote:
>> >>> On 21.04.16 at 07:09, <haozhong.zhang@xxxxxxxxx> wrote:
>> > On 04/12/16 16:45, Haozhong Zhang wrote:
>> >> On 04/08/16 09:52, Jan Beulich wrote:
>> >> > >>> On 08.04.16 at 07:02, <haozhong.zhang@xxxxxxxxx> wrote:
>> >> > > On 03/29/16 04:49, Jan Beulich wrote:
>> >> > >> >>> On 29.03.16 at 12:10, <haozhong.zhang@xxxxxxxxx> wrote:
>> >> > >> > On 03/29/16 03:11, Jan Beulich wrote:
>> >> > >> >> >>> On 29.03.16 at 10:47, <haozhong.zhang@xxxxxxxxx> wrote:
>> >> > > [..]
>> >> > >> >> > I still cannot find a neat approach to managing guest
>> >> > >> >> > permissions for nvdimm pages. A possible one is to use a
>> >> > >> >> > per-domain bitmap to track permissions, with each bit
>> >> > >> >> > corresponding to one nvdimm page. The bitmap can save a lot
>> >> > >> >> > of space and can even be stored in normal RAM, but operating
>> >> > >> >> > on it for a large nvdimm range, especially a contiguous one,
>> >> > >> >> > is slower than a rangeset.
>> >> > >> >> 
>> >> > >> >> I don't follow: What would a single bit in that bitmap mean? Any
>> >> > >> >> guest may access the page? That surely wouldn't be what we
>> >> > >> >> need.
>> >> > >> >>
>> >> > >> > 
>> >> > >> > For a host with N pages of nvdimm, each domain would have an
>> >> > >> > N-bit bitmap. If the m'th bit of a domain's bitmap is set, then
>> >> > >> > that domain has permission to access the m'th host nvdimm page.
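>> >> > >> > 
>> >> > >> > As a rough illustration only (the struct and function names
>> >> > >> > below are made up for this sketch, not actual Xen code; only
>> >> > >> > the __set_bit()/test_bit() helpers are assumed to exist), such
>> >> > >> > a bitmap might look like:
>> >> > >> > 
>> >> > >> >   /* Illustrative sketch, not actual Xen code. */
>> >> > >> >   struct nvdimm_perm_bitmap {
>> >> > >> >       unsigned long nr_pages; /* N: number of host nvdimm pages */
>> >> > >> >       unsigned long *bits;    /* one bit per host nvdimm page */
>> >> > >> >   };
>> >> > >> > 
>> >> > >> >   /* Grant the owning domain access to the m'th host nvdimm page. */
>> >> > >> >   static void nvdimm_perm_grant(struct nvdimm_perm_bitmap *b,
>> >> > >> >                                 unsigned long m)
>> >> > >> >   {
>> >> > >> >       __set_bit(m, b->bits);
>> >> > >> >   }
>> >> > >> > 
>> >> > >> >   /* Does the owning domain have access to the m'th page? */
>> >> > >> >   static bool nvdimm_perm_test(const struct nvdimm_perm_bitmap *b,
>> >> > >> >                                unsigned long m)
>> >> > >> >   {
>> >> > >> >       return m < b->nr_pages && test_bit(m, b->bits);
>> >> > >> >   }
>> >> > >> > 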
>> >> > >> 
>> >> > >> Which will be more overhead as soon as there are enough such
>> >> > >> domains in a system.
>> >> > >>
>> >> > > 
>> >> > > Sorry for the late reply.
>> >> > > 
>> >> > > I think we can make some optimizations to reduce the space
>> >> > > consumed by the bitmap.
>> >> > > 
>> >> > > A per-domain bitmap covering the entire host NVDIMM address range
>> >> > > is wasteful, especially if the actually used ranges are clustered
>> >> > > together. We could take the following approaches to reduce its
>> >> > > space:
>> >> > > 
>> >> > > 1) Split the per-domain bitmap into multiple sub-bitmaps, each
>> >> > >    covering a smaller, contiguous sub-range of the host NVDIMM
>> >> > >    address space. Initially, no sub-bitmap is allocated for the
>> >> > >    domain. When a domain is granted access to a host NVDIMM page
>> >> > >    in a sub-range, only the sub-bitmap for that sub-range is
>> >> > >    allocated. When access to all host NVDIMM pages in a sub-range
>> >> > >    has been removed from a domain, the corresponding sub-bitmap
>> >> > >    can be freed.
>> >> > > 
>> >> > > 2) If a domain has access to all host NVDIMM pages in a sub-range,
>> >> > >    the corresponding sub-bitmap is replaced by a range struct. If
>> >> > >    range structs track adjacent ranges, they are merged into one.
>> >> > >    If access to some pages in that sub-range is later removed from
>> >> > >    the domain, the range struct is converted back to bitmap
>> >> > >    segment(s).
>> >> > > 
>> >> > > 3) Because there may be many such bitmap segments and range
>> >> > >    structs per domain, we can organize them in a balanced interval
>> >> > >    tree to quickly search/add/remove an individual structure (a
>> >> > >    rough sketch of such a node follows below).
>> >> > > 
>> >> > > In the worst case, where every sub-range has non-contiguous pages
>> >> > > assigned to a domain, the above solution uses all sub-bitmaps and
>> >> > > consumes more space than a single flat bitmap because of the extra
>> >> > > organizational structures. I assume the sysadmin is responsible
>> >> > > for keeping the host nvdimm ranges assigned to each domain as
>> >> > > contiguous and clustered as possible in order to avoid the worst
>> >> > > case. However, if the worst case does happen, the Xen hypervisor
>> >> > > should refuse to assign nvdimm to the guest when it runs out of
>> >> > > memory.
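>> >> > > 
>> >> > > A rough sketch of the per-sub-range node described in 1)-3) above
>> >> > > (all names here are illustrative only, not actual Xen code):
>> >> > > 
>> >> > >   /*
>> >> > >    * Illustrative sketch, not actual Xen code. One node per sub
>> >> > >    * host NVDIMM address range; nodes would be kept in a balanced
>> >> > >    * interval tree keyed by [first_pfn, first_pfn + nr_pages).
>> >> > >    */
>> >> > >   struct nvdimm_perm_node {
>> >> > >       unsigned long first_pfn; /* start of the sub-range */
>> >> > >       unsigned long nr_pages;  /* length of the sub-range */
>> >> > >       bool full;               /* whole sub-range accessible: no
>> >> > >                                 * bitmap needed (case 2) */
>> >> > >       unsigned long *bitmap;   /* otherwise one bit per page
>> >> > >                                 * (case 1); NULL when full */
>> >> > >       /* tree linkage (e.g. an rb_node) omitted for brevity */
>> >> > >   };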
>> >> > 
>> >> > To be honest, this all sounds pretty unconvincing wrt not using
>> >> > existing code paths - a lot of special treatment, and hence a lot
>> >> > of things that can go (slightly) wrong.
>> >> > 
>> >> 
>> >> Well, using the existing range structs to manage guest access
>> >> permissions to nvdimm could consume too much space, which might not
>> >> fit in either memory or nvdimm. If the above solution looks too
>> >> error-prone, perhaps we can still fall back to the existing one and
>> >> restrict the number of range structs each domain may have for nvdimm
>> >> (e.g. reserve one 4K page per domain for them) to make it work,
>> >> though it may reject nvdimm mappings that are terribly fragmented.
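>> >> 
>> >> Back-of-envelope for the one-4K-page idea (assuming a range entry of
>> >> roughly sizeof(struct list_head) + 2 * sizeof(unsigned long) = 32
>> >> bytes on x86-64; the exact size depends on the actual struct):
>> >> 
>> >>     4096 bytes / 32 bytes per range entry = 128 entries
>> >> 
>> >> i.e. one reserved 4K page per domain would cap a domain at roughly
>> >> 128 discontiguous nvdimm ranges before further mappings get refused.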
>> > 
>> > Hi Jan,
>> > 
>> > Any comments on this?
>> 
>> Well, nothing new, i.e. my previous opinion on the old proposal didn't
>> change. I'm really opposed to any artificial limitations here, as I am to
>> any secondary (and hence error-prone) code paths. IOW I continue
>> to think that there's no reasonable alternative to re-using the existing
>> memory management infrastructure for at least the PMEM case.
> 
> By re-using the existing memory management infrastructure, do you mean
> re-using the existing MMIO model for passthrough PCI devices to handle
> the permissions of pmem?

No, re-using struct page_info.

>> The only remaining open question is where to place the control
>> structures, and I think the thresholding proposal of yours was quite
>> sensible.
> 
> I'm a little confused here. Is 'restrict the number of range structs'
> in my previous reply the 'thresholding proposal' you mean? Or is it one
> of the 'artificial limitations'?

Neither. It's the decision on where to place the struct page_info
arrays needed to manage the PMEM ranges.
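
For rough scale (assuming sizeof(struct page_info) is 32 bytes on
x86-64; the exact figure may differ): one page_info per 4K PMEM page
means an overhead of 32 / 4096, i.e. about 0.8% of the PMEM size, or
roughly 8GB of control structures for 1TB of PMEM, which is why it
matters whether those arrays live in RAM or on the PMEM itself.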

Jan
