
Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen



>>> On 21.04.16 at 07:09, <haozhong.zhang@xxxxxxxxx> wrote:
> On 04/12/16 16:45, Haozhong Zhang wrote:
>> On 04/08/16 09:52, Jan Beulich wrote:
>> > >>> On 08.04.16 at 07:02, <haozhong.zhang@xxxxxxxxx> wrote:
>> > > On 03/29/16 04:49, Jan Beulich wrote:
>> > >> >>> On 29.03.16 at 12:10, <haozhong.zhang@xxxxxxxxx> wrote:
>> > >> > On 03/29/16 03:11, Jan Beulich wrote:
>> > >> >> >>> On 29.03.16 at 10:47, <haozhong.zhang@xxxxxxxxx> wrote:
>> > > [..]
>> > >> >> > I still cannot find a neat approach to manage guest permissions for
>> > >> >> > nvdimm pages. A possible one is to use a per-domain bitmap to track
>> > >> >> > permissions: each bit corresponding to an nvdimm page. The bitmap
>> > >> >> > can save a lot of space and can even be stored in normal RAM, but
>> > >> >> > operating on it for a large nvdimm range, especially a contiguous
>> > >> >> > one, is slower than operating on a rangeset.
>> > >> >> 
>> > >> >> I don't follow: What would a single bit in that bitmap mean? Any
>> > >> >> guest may access the page? That surely wouldn't be what we
>> > >> >> need.
>> > >> >>
>> > >> > 
>> > >> > For a host with N pages of nvdimm, each domain will have an N-bit
>> > >> > bitmap. If the m'th bit of a domain's bitmap is set, then that domain
>> > >> > has the permission to access the m'th host nvdimm page.
>> > >> 
>> > >> Which will be more overhead as soon as there are enough such
>> > >> domains in a system.
>> > >>
>> > > 
>> > > Sorry for the late reply.
>> > > 
>> > > I think we can make some optimization to reduce the space consumed by
>> > > the bitmap.
>> > > 
>> > > A per-domain bitmap covering the entire host NVDIMM address range is
>> > > wasteful, especially if the ranges actually in use are clustered. We
>> > > can reduce its size in the following ways.
>> > > 
>> > > 1) Split the per-domain bitmap into multiple sub-bitmaps, each
>> > >    covering a smaller, contiguous sub-range of the host NVDIMM
>> > >    address range. In the beginning, no sub-bitmap is allocated for the
>> > >    domain. If the access permission to a host NVDIMM page in a sub
>> > >    host address range is added to a domain, only the sub-bitmap for
>> > >    that address range is allocated for the domain. If access
>> > >    permissions to all host NVDIMM pages in a sub range are removed
>> > >    from a domain, the corresponding sub-bitmap can be freed.
>> > > 
>> > > 2) If a domain has access permissions to all host NVDIMM pages in a
>> > >    sub range, the corresponding sub-bitmap will be replaced by a range
>> > >    struct. If range structs are used to track adjacent ranges, they
>> > >    will be merged into one range struct. If access permissions to some
>> > >    pages in that sub range are removed from a domain, the range struct
>> > >    should be converted back to bitmap segment(s).
>> > > 
>> > > 3) Because there might be many of the above bitmap segments and range
>> > >    structs per domain, we can organize them in a balanced interval
>> > >    tree to quickly search/add/remove an individual structure.
>> > > 
>> > > In the worst case, where each sub-range has non-contiguous pages
>> > > assigned to a domain, the above solution will use all sub-bitmaps and
>> > > consume more space than a single bitmap because of the extra space
>> > > for organization. I assume that the sysadmin should be responsible
>> > > for keeping the host nvdimm ranges assigned to each domain as
>> > > contiguous and clustered as possible in order to avoid the worst
>> > > case. However, if the worst case does happen, the Xen hypervisor
>> > > should refuse to assign nvdimm to a guest when it runs out of memory.
>> > 
>> > To be honest, this all sounds pretty unconvincing wrt not using
>> > existing code paths - a lot of special treatment, and hence a lot
>> > of things that can go (slightly) wrong.
>> > 
>> 
>> Well, using the existing range structs to manage guest access
>> permissions to nvdimm could consume too much space to fit in either
>> memory or nvdimm. If the above solution looks really error-prone,
>> perhaps we can still come back to the existing one and restrict the
>> number of range structs each domain could have for nvdimm
>> (e.g. reserve one 4K page per domain for them) to make it work for
>> nvdimm, though it may reject nvdimm mappings that are terribly
>> fragmented.
> 
> Hi Jan,
> 
> Any comments on this?

Well, nothing new, i.e. my previous opinion on the old proposal hasn't
changed. I'm really opposed to any artificial limitations here, as I am to
any secondary (and hence error-prone) code paths. IOW I continue
to think that there's no reasonable alternative to re-using the existing
memory management infrastructure for at least the PMEM case. The
only remaining open question is where to place the control structures,
and I think the thresholding proposal of yours was quite sensible.

Jan
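
For readers following the quoted proposal, here is a minimal sketch of
its step 1 only (lazily allocated per-domain sub-bitmaps). It is not
Xen code: every name in it (nvdimm_perm, sub_bitmap, perm_grant,
PAGES_PER_CHUNK, ...) is hypothetical, and the chunk size is an
arbitrary assumption. Steps 2 and 3 (range structs and the interval
tree) are deliberately left out. For scale, assuming 4 KiB pages and,
purely for illustration, 1 TiB of host NVDIMM, a flat bitmap would need
2^28 bits, i.e. 32 MiB per domain, which is the overhead the
sub-bitmaps try to avoid.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical chunk size: host NVDIMM pages covered by one sub-bitmap. */
#define PAGES_PER_CHUNK 4096UL
#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_PER_CHUNK (PAGES_PER_CHUNK / BITS_PER_WORD)

struct sub_bitmap {
    unsigned long bits[WORDS_PER_CHUNK];
    unsigned long nr_set;          /* pages currently granted in this chunk */
};

struct nvdimm_perm {
    unsigned long nr_pages;        /* host NVDIMM pages being tracked */
    unsigned long nr_chunks;
    struct sub_bitmap **chunks;    /* NULL: nothing granted in that chunk */
};

/* Create an empty permission set; no sub-bitmap is allocated yet. */
struct nvdimm_perm *perm_create(unsigned long nr_pages)
{
    struct nvdimm_perm *p;

    assert(PAGES_PER_CHUNK % BITS_PER_WORD == 0);
    if (nr_pages == 0 || (p = malloc(sizeof(*p))) == NULL)
        return NULL;
    p->nr_pages = nr_pages;
    p->nr_chunks = (nr_pages - 1) / PAGES_PER_CHUNK + 1;
    p->chunks = calloc(p->nr_chunks, sizeof(*p->chunks));
    if (p->chunks == NULL) {
        free(p);
        return NULL;
    }
    return p;
}

/* Does the domain have access to host NVDIMM page pfn? */
bool perm_test(const struct nvdimm_perm *p, unsigned long pfn)
{
    const struct sub_bitmap *sb;
    unsigned long off = pfn % PAGES_PER_CHUNK;

    if (pfn >= p->nr_pages || (sb = p->chunks[pfn / PAGES_PER_CHUNK]) == NULL)
        return false;
    return (sb->bits[off / BITS_PER_WORD] >> (off % BITS_PER_WORD)) & 1UL;
}

/* Grant access to one page; returns 0 on success, -1 on bad pfn or OOM. */
int perm_grant(struct nvdimm_perm *p, unsigned long pfn)
{
    struct sub_bitmap **slot;
    unsigned long off = pfn % PAGES_PER_CHUNK;

    if (pfn >= p->nr_pages)
        return -1;
    slot = &p->chunks[pfn / PAGES_PER_CHUNK];
    if (*slot == NULL && (*slot = calloc(1, sizeof(**slot))) == NULL)
        return -1;                 /* out of memory: refuse the grant */
    if (!perm_test(p, pfn)) {
        (*slot)->bits[off / BITS_PER_WORD] |= 1UL << (off % BITS_PER_WORD);
        (*slot)->nr_set++;
    }
    return 0;
}

/* Revoke access to one page; frees the sub-bitmap once it is empty. */
void perm_revoke(struct nvdimm_perm *p, unsigned long pfn)
{
    struct sub_bitmap **slot;
    unsigned long off = pfn % PAGES_PER_CHUNK;

    if (!perm_test(p, pfn))
        return;
    slot = &p->chunks[pfn / PAGES_PER_CHUNK];
    (*slot)->bits[off / BITS_PER_WORD] &= ~(1UL << (off % BITS_PER_WORD));
    if (--(*slot)->nr_set == 0) {
        free(*slot);
        *slot = NULL;
    }
}

void perm_destroy(struct nvdimm_perm *p)
{
    for (unsigned long i = 0; i < p->nr_chunks; i++)
        free(p->chunks[i]);
    free(p->chunks);
    free(p);
}
```

Even in this reduced form the sketch needs its own allocation, freeing
and failure handling, separate from the existing rangeset code, which
is the kind of secondary code path objected to above.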
