Re: [Xen-devel] [RFC Design Doc v2] Add vNVDIMM support for Xen
Hey Haozhong,

On 07/18/2016 08:29 AM, Haozhong Zhang wrote:
> Hi,
>
> Following is version 2 of the design doc for supporting vNVDIMM in

This version is really good, very clear, and includes almost everything
I'd like to know.

> Xen. It's basically the summary of discussion on previous v1 design
> (https://lists.xenproject.org/archives/html/xen-devel/2016-02/msg00006.html).
> Any comments are welcome. The corresponding patches are WIP.
>

So are you (or Intel) going to write all the patches? Is there any task
for the community to take part in?

[..snip..]

> 3. Usage Example of vNVDIMM in Xen
>
> Our design is to provide virtual pmem devices to HVM domains. The
> virtual pmem devices are backed by host pmem devices.
>
> Dom0 Linux kernel can detect the host pmem devices and create
> /dev/pmemXX for each detected device. Users in Dom0 can then create
> a DAX file system on /dev/pmemXX and create several pre-allocated
> files in the DAX file system.
>
> After setting up the file system on the host pmem, users can add the
> following lines to the xl configuration file to assign the host pmem
> regions to domains:
>     vnvdimm = [ 'file=/dev/pmem0' ]
> or
>     vnvdimm = [ 'file=/mnt/dax/pre_allocated_file' ]
>

Could you please also consider the case when a driver domain gets
involved? E.g. vnvdimm = [ 'file=/dev/pmem0', backend='xxx' ]?

> The first type of configuration assigns the entire pmem device
> (/dev/pmem0) to the domain, while the second assigns the space
> allocated to /mnt/dax/pre_allocated_file on the host pmem device to
> the domain.
>

[..snip..]

> 4.2.2 Detection of Host pmem Devices
>
> Detecting and initializing host pmem devices requires a non-trivial
> driver to interact with the corresponding ACPI namespace devices,
> parse namespace labels and take necessary recovery actions. Instead
> of duplicating the comprehensive Linux pmem driver in the Xen
> hypervisor, our design leaves this to Dom0 Linux and lets Dom0 Linux
> report detected host pmem devices to the Xen hypervisor.
>
> Our design takes the following steps to detect host pmem devices when
> Xen boots.
> (1) As when booting on bare metal, host pmem devices are detected by
>     the Dom0 Linux NVDIMM driver.
>
> (2) Our design extends the Linux NVDIMM driver to report the SPAs and
>     sizes of the pmem devices and reserved areas to the Xen
>     hypervisor via a new hypercall.
>
> (3) The Xen hypervisor then checks
>     - whether the SPA and size of the newly reported pmem device
>       overlap with any previously reported pmem device;
>     - whether the reserved area fits in the pmem device and is large
>       enough to hold page_info structs for itself.
>
>     If any check fails, the reported pmem device will be ignored by
>     the Xen hypervisor and hence will not be used by any guest.
>     Otherwise, the Xen hypervisor will record the reported parameters
>     and create page_info structs in the reserved area.
>
> (4) Because the reserved area is now used by the Xen hypervisor, it
>     should not be accessible by Dom0 any more. Therefore, if a host
>     pmem device is recorded by the Xen hypervisor, Xen will unmap its
>     reserved area from Dom0. Our design also needs to extend the
>     Linux NVDIMM driver to "balloon out" the reserved area after it
>     successfully reports a pmem device to the Xen hypervisor.
>
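For concreteness, here is a minimal sketch of what the argument buffer
of the new hypercall in step (2) could look like. All names below are
hypothetical; nothing here is taken from the design doc or the WIP
patches:

    /* Hypothetical argument of the "report pmem" hypercall in
     * step (2), e.g. carried by a new platform op.  All names are
     * made up for illustration. */
    struct xen_pmem_report {
        uint64_t spa;        /* start SPA of the pmem region */
        uint64_t size;       /* size of the pmem region, in bytes */
        uint64_t rsv_spa;    /* start SPA of the reserved area, which */
        uint64_t rsv_size;   /* Xen fills with page_info structs      */
    };

With such a layout, the checks in step (3) reduce to interval
comparisons against the previously recorded regions, plus verifying
that [rsv_spa, rsv_spa + rsv_size) lies inside [spa, spa + size) and
that rsv_size covers one page_info per page of the device.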
> 4.2.3 Get Host Machine Address (SPA) of Host pmem Files
>
> Before a pmem file is assigned to a domain, we need to know the host
> SPA ranges that are allocated to this file. We do this work in xl.
>
> If a pmem device /dev/pmem0 is given, xl will read
> /sys/block/pmem0/device/{resource,size} respectively for the start
> SPA and size of the pmem device.
>
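As a sketch of that read, assuming both sysfs attributes exist and
parse cleanly (function name is made up, error handling is minimal):

    /* Sketch: read the start SPA and size of /dev/pmem0 from sysfs. */
    #include <inttypes.h>
    #include <stdio.h>

    static int pmem0_spa_size(uint64_t *spa, uint64_t *size)
    {
        FILE *f = fopen("/sys/block/pmem0/device/resource", "r");

        if (!f || fscanf(f, "%" SCNx64, spa) != 1)  /* e.g. "0x240000000" */
            return -1;
        fclose(f);

        f = fopen("/sys/block/pmem0/device/size", "r");
        if (!f || fscanf(f, "%" SCNu64, size) != 1)
            return -1;
        fclose(f);
        return 0;
    }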
> If a pre-allocated file /mnt/dax/file is given,
> (1) xl first finds the host pmem device where /mnt/dax/file is. Then
>     it uses the method above to get the start SPA of the host pmem
>     device.
> (2) xl then uses the fiemap ioctl to get the extent mappings of
>     /mnt/dax/file, and adds the physical offset and length of each
>     mapping entry to the above start SPA to get the SPA ranges
>     pre-allocated for this file.
>
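A sketch of step (2), again with a made-up function name and minimal
error handling; dev_spa is the start SPA of the underlying pmem
device, obtained from sysfs as above:

    /* Sketch: turn the extents of an open pmem-backed file into SPA
     * ranges via the fiemap ioctl. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>        /* FS_IOC_FIEMAP */
    #include <linux/fiemap.h>

    #define MAX_EXTENTS 32

    static int file_spa_ranges(int fd, uint64_t dev_spa)
    {
        struct fiemap *fm = calloc(1, sizeof(*fm) +
                                   MAX_EXTENTS * sizeof(struct fiemap_extent));
        unsigned int i;

        if (!fm)
            return -1;
        fm->fm_length = FIEMAP_MAX_OFFSET;   /* map the whole file */
        fm->fm_flags = FIEMAP_FLAG_SYNC;     /* sync the file first */
        fm->fm_extent_count = MAX_EXTENTS;

        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
            free(fm);
            return -1;
        }

        for (i = 0; i < fm->fm_mapped_extents; i++) {
            struct fiemap_extent *e = &fm->fm_extents[i];

            /* SPA range backing this extent of the file: */
            printf("spa %#llx len %#llx\n",
                   (unsigned long long)(dev_spa + e->fe_physical),
                   (unsigned long long)e->fe_length);
        }
        free(fm);
        return 0;
    }

Note that fe_physical is an offset into the block device as the caller
sees it, which is exactly what my driver-domain question below is
about.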
Looks like pmem can't be passed through to a driver domain directly
like, e.g., PCI devices. So suppose we create a driver domain with:
    vnvdimm = [ 'file=/dev/pmem0' ]
make a DAX file system in the driver domain, and then create new
guests with:
    vnvdimm = [ 'file=dax file in driver domain', backend='driver domain' ]
Is this going to work? In my understanding, fiemap can only get the
GPFN instead of the real SPA of the pmem in this case.

> The resulting host SPA ranges will be passed to QEMU, which allocates
> guest address space for vNVDIMM devices and calls the Xen hypervisor
> to map the guest addresses to the host SPA ranges.
>

Can Dom0 still access the same SPA range when Xen decides to assign it
to a new domU? I assume the range will be unmapped automatically from
Dom0 in the hypercall?

Thanks,
-Bob