Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen
On 02/16/16 05:55, Jan Beulich wrote:
> >>> On 16.02.16 at 12:14, <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > On Mon, 15 Feb 2016, Zhang, Haozhong wrote:
> >> On 02/04/16 20:24, Stefano Stabellini wrote:
> >> > On Thu, 4 Feb 2016, Haozhong Zhang wrote:
> >> > > On 02/03/16 15:22, Stefano Stabellini wrote:
> >> > > > On Wed, 3 Feb 2016, George Dunlap wrote:
> >> > > > > On 03/02/16 12:02, Stefano Stabellini wrote:
> >> > > > > > On Wed, 3 Feb 2016, Haozhong Zhang wrote:
> >> > > > > >> Or, we can make a file system on /dev/pmem0, create files
> >> > > > > >> on it, set the owner of those files to
> >> > > > > >> xen-qemuuser-domid$domid, and then pass those files to
> >> > > > > >> QEMU. In this way, non-root QEMU should be able to mmap
> >> > > > > >> those files.
> >> > > > > >
> >> > > > > > Maybe that would work. Worth adding it to the design, I
> >> > > > > > would like to read more details on it.
> >> > > > > >
> >> > > > > > Also note that QEMU initially runs as root but drops
> >> > > > > > privileges to xen-qemuuser-domid$domid before the guest is
> >> > > > > > started. Initially QEMU *could* mmap /dev/pmem0 while it is
> >> > > > > > still running as root, but then it wouldn't work for any
> >> > > > > > devices that need to be mmap'ed at run time (hotplug
> >> > > > > > scenario).
> >> > > > >
> >> > > > > This is basically the same problem we have for a bunch of
> >> > > > > other things, right? Having xl open a file and then pass it
> >> > > > > via qmp to qemu should work in theory, right?
> >> > > >
> >> > > > Is there one /dev/pmem? per assignable region?
> >> > >
> >> > > Yes.
> >> > >
> >> > > BTW, I'm wondering whether and how non-root qemu works with an xl
> >> > > disk configuration that is going to access a host block device,
> >> > > e.g.
> >> > >     disk = [ '/dev/sdb,,hda' ]
> >> > > If that works with non-root qemu, I may take a similar solution
> >> > > for pmem.
> >> >
> >> > Today the user is required to give the correct ownership and access
> >> > mode to the block device, so that non-root QEMU can open it. However,
> >> > in the case of PCI passthrough, QEMU needs to mmap /dev/mem; as a
> >> > consequence the feature doesn't work at all with non-root QEMU
> >> > (http://marc.info/?l=xen-devel&m=145261763600528).
> >> >
> >> > If there is one /dev/pmem device per assignable region, then it
> >> > would be conceivable to change its ownership so that non-root QEMU
> >> > can open it. Or, better, the file descriptor could be passed by the
> >> > toolstack via qmp.
> >>
> >> Passing a file descriptor via qmp is not enough.
> >>
> >> Let me clarify where the requirement for root/privileged permissions
> >> comes from. The primary workflow in my design that maps a host pmem
> >> region or files in a host pmem region to a guest is as follows:
> >>  (1) QEMU in Dom0 mmaps the host pmem (the host /dev/pmem0 or files
> >>      on /dev/pmem0) to its virtual address space, i.e. the guest
> >>      virtual address space.
> >>  (2) QEMU asks the Xen hypervisor to map the host physical address,
> >>      i.e. the SPA occupied by the host pmem, to a DomU. This step
> >>      requires the translation from the guest virtual address (where
> >>      the host pmem is mmaped in (1)) to the host physical address.
> >>      The translation can be done by either
> >>      (a) QEMU, which parses its own /proc/self/pagemap,
> >>      or
> >>      (b) the Xen hypervisor, which does the translation by itself [1]
> >>          (though this choice is not quite doable according to
> >>          Konrad's comments [2]).
> >>
> >> [1] http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg00434.html
> >> [2] http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg00606.html
> >>
> >> For 2-a, reading /proc/self/pagemap requires the CAP_SYS_ADMIN
> >> capability since Linux kernel 4.0. Furthermore, if we don't mlock the
> >> mapped host pmem (by adding the MAP_LOCKED flag to mmap or calling
> >> mlock after mmap), pagemap will not contain all mappings. However,
> >> mlock may require privileged permission to lock memory larger than
> >> RLIMIT_MEMLOCK. Because mlock operates on memory, the permission to
> >> open(2) the host pmem files does not solve the problem, and therefore
> >> passing a file descriptor via qmp does not help.
> >>
> >> For 2-b, from Konrad's comments [2], mlock is also required, and
> >> privileged permission may consequently be required as well.
> >>
> >> Note that the mapping and the address translation are done before
> >> QEMU drops privileged permissions, so non-root QEMU should be able to
> >> work with the above design until we start considering vNVDIMM hotplug
> >> (which is not yet supported by the current vNVDIMM implementation in
> >> QEMU). In the hotplug case, we may let Xen pass explicit flags to
> >> QEMU to keep it running with root permissions.
> >
> > Are we all good with the fact that vNVDIMM hotplug won't work (unless
> > the user explicitly asks for it at domain creation time, which is
> > very unlikely, otherwise she could use coldplug)?
>
> No, at least there needs to be a road towards hotplug, even if
> initially this may not be supported/implemented.

Guangrong: any plan or design for vNVDIMM hotplug in QEMU?

Haozhong
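
For reference, the translation in 2-a and the RLIMIT_MEMLOCK concern could
be sketched roughly as below. This is only an illustration in Python, not
QEMU code: the helper names (decode_pagemap_entry, pagemap_offset,
fits_memlock_limit, virt_to_phys) are made up for this example, and it
assumes the pagemap entry layout described in the Linux kernel
documentation (bits 0-54 hold the PFN, bit 63 is the "present" flag).

```python
import os
import resource
import struct

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")
PFN_MASK = (1 << 55) - 1      # bits 0-54: page frame number
PRESENT_BIT = 1 << 63         # bit 63: page is present in RAM


def decode_pagemap_entry(entry):
    """Split one 64-bit pagemap entry into (present, pfn)."""
    present = bool(entry & PRESENT_BIT)
    pfn = entry & PFN_MASK if present else 0
    return present, pfn


def pagemap_offset(vaddr):
    """Byte offset in /proc/<pid>/pagemap of the entry covering vaddr."""
    return (vaddr // PAGE_SIZE) * 8


def fits_memlock_limit(length):
    """True if `length` bytes fit under the soft RLIMIT_MEMLOCK."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return soft == resource.RLIM_INFINITY or length <= soft


def virt_to_phys(vaddr):
    """Translate a virtual address of this process, or return None."""
    try:
        with open("/proc/self/pagemap", "rb") as f:
            f.seek(pagemap_offset(vaddr))
            data = f.read(8)
    except OSError:
        return None               # e.g. open refused without privileges
    if len(data) != 8:
        return None
    present, pfn = decode_pagemap_entry(struct.unpack("=Q", data)[0])
    if not present or pfn == 0:   # PFN may read as 0 without CAP_SYS_ADMIN
        return None
    return pfn * PAGE_SIZE + vaddr % PAGE_SIZE
```

A caller would check fits_memlock_limit() on the size of the region before
locking it, and treat a None result from virt_to_phys() as "page not locked
in memory, or not enough privilege to see the PFN". Both failure modes are
exactly the ones that opening the pmem file (or receiving its fd via qmp)
does not address.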