Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen
> > > > Open: It seems no system call/ioctl is provided by Linux kernel to
> > > >       get the physical address from a virtual address.
> > > >       /proc/<qemu_pid>/pagemap provides information of mapping from
> > > >       VA to PA. Is it an acceptable solution to let QEMU parse this
> > > >       file to get the physical address?
> > >
> > > Does it work in a non-root scenario?
> > >
> >
> > Seemingly no, according to Documentation/vm/pagemap.txt in Linux kernel:
> > | Since Linux 4.0 only users with the CAP_SYS_ADMIN capability can get PFNs.
> > | In 4.0 and 4.1 opens by unprivileged fail with -EPERM. Starting from
> > | 4.2 the PFN field is zeroed if the user does not have CAP_SYS_ADMIN.
> > | Reason: information about PFNs helps in exploiting Rowhammer vulnerability.

Ah right.

> > A possible alternative is to add a new hypercall similar to
> > XEN_DOMCTL_memory_mapping but receiving virtual address as the address
> > parameter and translating to machine address in the hypervisor.

That might work.

That won't work. This is a userspace VMA - which means that once the
ioctl is done we swap to kernel virtual addresses. Now we may know that
the prior cr3 has the userspace virtual address and walk it down - but
what if the domain that is doing this is PVH? (or HVM) - the cr3 of
userspace is tucked somewhere inside the kernel. Which means this
hypercall would need to know the Linux kernel task structure to find
this.

May I propose another solution - a stacking driver (similar to loop).
You set it up (ioctl /dev/pmem0/guest.img, get some
/dev/mapper/guest.img created). Then mmap the /dev/mapper/guest.img -
all of the operations are the same - except it may have an extra
ioctl - get_pfns - which would provide the data in similar form to
pagemap.txt.

But folks will then ask - why don't you just use pagemap? Could the
pagemap have an extra security capability check? One that can be set
for QEMU?
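For reference, a minimal sketch of what parsing pagemap from userspace involves, written in Python for brevity rather than as QEMU code. The entry layout (bit 63 = page present, bits 0-54 = PFN) is the one described in Documentation/vm/pagemap.txt; the helper names decode_pagemap_entry and virt_to_phys are hypothetical and exist only for this illustration.

```python
import os
import struct

PAGEMAP_ENTRY_BYTES = 8      # pagemap holds one 64-bit entry per virtual page
PFN_MASK = (1 << 55) - 1     # bits 0-54: page frame number (if present)
PAGE_PRESENT = 1 << 63       # bit 63: page is present in RAM


def decode_pagemap_entry(entry):
    """Return (present, pfn) for one raw 64-bit pagemap entry."""
    present = bool(entry & PAGE_PRESENT)
    pfn = entry & PFN_MASK if present else 0
    return present, pfn


def virt_to_phys(vaddr, pid="self"):
    """Translate a virtual address of process `pid` via /proc/<pid>/pagemap.

    Returns None when the page is not present, or when the kernel has
    zeroed the PFN field (caller lacks CAP_SYS_ADMIN on Linux >= 4.2).
    """
    page_size = os.sysconf("SC_PAGE_SIZE")
    offset = (vaddr // page_size) * PAGEMAP_ENTRY_BYTES
    # Unbuffered, so exactly one entry is requested from the kernel.
    with open(f"/proc/{pid}/pagemap", "rb", buffering=0) as f:
        f.seek(offset)
        raw = f.read(PAGEMAP_ENTRY_BYTES)
    if raw is None or len(raw) != PAGEMAP_ENTRY_BYTES:
        return None
    (entry,) = struct.unpack("=Q", raw)   # native byte order
    present, pfn = decode_pagemap_entry(entry)
    if not present or pfn == 0:
        return None
    return pfn * page_size + (vaddr % page_size)
```

Without CAP_SYS_ADMIN on a 4.2 or later kernel, the PFN field reads back as zero, so virt_to_phys would return None even for a resident page. That is the limitation discussed above.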
> > > > Open: For a large pmem, mmap(2) is very possible to not map all SPA
> > > >       occupied by pmem at the beginning, i.e. QEMU may not be able to
> > > >       get all SPA of pmem from buf (in virtual address space) when
> > > >       calling XEN_DOMCTL_memory_mapping.
> > > >       Can mmap flag MAP_LOCKED or mlock(2) be used to enforce the
> > > >       entire pmem being mmaped?
> > >
> > > Ditto
> > >
> >
> > No. If I take the above alternative for the first open, maybe the new
> > hypercall above can inject page faults into dom0 for the unmapped
> > virtual address so as to enforce dom0 Linux to create the page
> > mapping.

Ugh. That sounds hacky. And you wouldn't necessarily be safe.
Imagine that the system admin decides to defrag the /dev/pmem
filesystem. Or move the files (disk images) around. If they do that -
we may still have the guest mapped to system addresses which may
contain filesystem metadata now, or a different guest image. We MUST
mlock or lock the file for the duration of the guest.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel