Re: [Xen-devel] [RFC Design Doc v2] Add vNVDIMM support for Xen



On 07/19/16 08:58, Tian, Kevin wrote:
> > From: Zhang, Haozhong
> > Sent: Monday, July 18, 2016 5:02 PM
> > 
> > On 07/18/16 16:36, Tian, Kevin wrote:
> > > > From: Zhang, Haozhong
> > > > Sent: Monday, July 18, 2016 8:29 AM
> > > >
> > > > Hi,
> > > >
> > > > Following is version 2 of the design doc for supporting vNVDIMM in
> > > > Xen. It's basically the summary of discussion on previous v1 design
> > > > (https://lists.xenproject.org/archives/html/xen-devel/2016-02/msg00006.html).
> > > > Any comments are welcome. The corresponding patches are WIP.
> > > >
> > > > Thanks,
> > > > Haozhong
> > >
> > > It's a very clear doc. Thanks a lot!
> > >
> > > >
> > > > 4.2.2 Detection of Host pmem Devices
> > > >
> > > >  Detecting and initializing host pmem devices requires a non-trivial
> > > >  driver to interact with the corresponding ACPI namespace devices,
> > > >  parse namespace labels and take necessary recovery actions. Instead
> > > >  of duplicating the comprehensive Linux pmem driver in Xen hypervisor,
> > > >  our design leaves it to Dom0 Linux and lets Dom0 Linux report
> > > >  detected host pmem devices to Xen hypervisor.
> > > >
> > > >  Our design takes the following steps to detect host pmem devices when
> > > >  Xen boots.
> > > >  (1) As when booting on bare metal, host pmem devices are detected by
> > > >      the Dom0 Linux NVDIMM driver.
> > > >
> > > >  (2) Our design extends the Linux NVDIMM driver to report the SPAs and
> > > >      sizes of the pmem devices and reserved areas to Xen hypervisor
> > > >      via a new hypercall.
> > >
> > > Does Linux need to provide the reserved area to Xen? Why not leave Xen
> > > to decide the reserved area within the reported pmem regions and then
> > > return the reserved info to the Dom0 NVDIMM driver to balloon out?
> > >
> > 
> > NVDIMM can be used as persistent storage like a disk drive, so the
> > reservation should be done outside of Xen and Dom0, for example, by an
> > administrator who is expected to make the necessary data backups in
> > advance.
> 
> What prevents NVDIMM driver from reserving some region itself before
> reporting to user space?
>

Nothing in theory prevents the driver from doing the reservation. I just
mean the reservation should be initiated by someone who can ensure, for
example, that the current data on pmem is either useless or properly
backed up. The reservation itself is of course ultimately done by the driver.

> > 
> > Therefore, dom0 linux actually reports (instead of providing) the
> > reserved area to Xen, and the latter checks if the reserved area is
> > large enough and (if yes) asks dom0 to balloon out the reserved area.
> 
> It looks non-intuitive since the administrator doesn't know the actual
> requirement of Xen. The administrator then has to guess and try. Even if
> it finally works, the reserved size may not be optimal.
> 
> If the Dom0 NVDIMM driver does the reservation itself and notifies Xen, at
> least there is a way for Xen to return a failure with the required size,
> and then the second time the NVDIMM driver can adjust the reservation as
> desired.
>
> Did I misunderstand the flow here?
>

My design was to let the administrator calculate the reserved size and
pass it to the driver. You are right, though: I now think it's better to
let Xen advise the reserved size to the NVDIMM driver in dom0, so there is
no need for a manually calculated size.
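
To make the revised flow concrete, here is a rough sketch in Python rather
than real hypervisor code. Everything in it is an assumption for
illustration only: the names (PmemRegion, report_pmem, driver_report), the
4 KiB page size, and the 32-byte page_info size are hypothetical and do not
come from any existing Xen or Linux interface. It models the checks from
step (3) plus Xen returning the required size on failure so the driver can
retry once.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096        # assumption: 4 KiB pages
PAGE_INFO_SIZE = 32     # assumption: bytes per page_info struct


@dataclass(frozen=True)
class PmemRegion:
    spa: int            # start system physical address of the pmem device
    size: int           # size of the pmem device in bytes
    rsv_size: int = 0   # bytes reserved inside the device for Xen


def pages(nbytes: int) -> int:
    """Number of pages needed to cover nbytes, rounding up."""
    return (nbytes + PAGE_SIZE - 1) // PAGE_SIZE


def required_reserved_size(size: int) -> int:
    """Page-aligned bytes needed for one page_info per page of the device."""
    return pages(pages(size) * PAGE_INFO_SIZE) * PAGE_SIZE


def overlaps(a: PmemRegion, b: PmemRegion) -> bool:
    """True if the two SPA ranges share at least one byte."""
    return a.spa < b.spa + b.size and b.spa < a.spa + a.size


def report_pmem(known: list, new: PmemRegion):
    """Hypothetical Xen-side handler; returns (accepted, advised_size)."""
    # Check 1: the new device must not overlap a previously reported one.
    if any(overlaps(new, r) for r in known):
        return False, 0
    # Check 2: the reserved area must fit in the device and be large
    # enough to hold the page_info structs for the device itself.
    need = required_reserved_size(new.size)
    if new.rsv_size > new.size or new.rsv_size < need:
        return False, need   # reject, but advise the size to retry with
    known.append(new)
    return True, need


def driver_report(known: list, spa: int, size: int):
    """Hypothetical dom0 driver side: report, then retry with Xen's advice."""
    ok, advised = report_pmem(known, PmemRegion(spa, size, 0))
    if not ok and advised:
        ok, advised = report_pmem(known, PmemRegion(spa, size, advised))
    return ok, advised
```

Under these assumed constants, a 1 GiB device needs 8 MiB reserved, and the
driver gets that number from the first (failed) report instead of the
administrator having to guess it.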

Thanks,
Haozhong

> > 
> > > >
> > > >  (3) Xen hypervisor then checks
> > > >      - whether the SPA range of the newly reported pmem device
> > > >        overlaps with any previously reported pmem devices;
> > > >      - whether the reserved area can fit in the pmem device and is
> > > >        large enough to hold page_info structs for itself.
> > > >
> > > >      If any checks fail, the reported pmem device will be ignored by
> > > >      Xen hypervisor and hence will not be used by any
> > > >      guests. Otherwise, Xen hypervisor will record the reported
> > > >      parameters and create page_info structs in the reserved area.
> > > >
> > > >  (4) Because the reserved area is now used by Xen hypervisor, it
> > > >      should not be accessible by Dom0 any more. Therefore, if a host
> > > >      pmem device is recorded by Xen hypervisor, Xen will unmap its
> > > >      reserved area from Dom0. Our design also needs to extend Linux
> > > >      NVDIMM driver to "balloon out" the reserved area after it
> > > >      successfully reports a pmem device to Xen hypervisor.
> > >
> > > Then both ndctl and Xen become sources of reserved-area requests to
> > > the Linux NVDIMM driver. You don't need to change ndctl as described in
> > > 4.2.1. Users can still use ndctl to reserve for Dom0's own purposes.
> > >
> > 
> > I missed something here: the Dom0 pmem driver should also prevent
> > further operations on the host namespace after it successfully reports
> > to Xen. In this way, we can prevent userspace tools like ndctl from
> > breaking the host pmem device.
> > 
> 
> yes, Dom0 driver is expected to reserve the region allocated for Xen.
> 
> Thanks
> Kevin
