Re: [Xen-devel] Bug: privcmd failure in mmaping 160+ pages
2013/8/5 Guanglin Xu <mzguanglin@xxxxxxxxx>:
> 2013/7/31 Andres Lagar-Cavilla <andreslc@xxxxxxxxxxxxxx>:
>> On Jul 31, 2013, at 8:39 PM, Guanglin Xu <mzguanglin@xxxxxxxxx> wrote:
>>
>>> Hi Andres,
>>>
>>> Thanks for your reply. However, I ran into some problems both in
>>> applying your diff and in compiling the resulting privcmd.c (after
>>> patching it by hand).
>>>
>>> Which kernel version are you using? Can you provide the complete
>>> privcmd.c file?
>>
>> I posted this against linux-next.
>
> Hi Andres,
>
> I have successfully applied your patch to linux-next too, but it
> doesn't solve the problem of mmapping 160+ pages.
>
> My configuration: linux-next 3.11.0-rc3; Xen 4.1.
>
> Do you think I should try a Xen release newer than 4.1?
>
>> Please be less vague when posting. For example, how did the final
>> diff look and what is your compile error? We can probably fix the
>> compile easily.
>
> Apologies for the confusion; it comes from my lack of experience in
> building kernels.
>
> 1. The diff didn't apply simply because some blank lines were
> missing. For example:
>
>     -#ifndef HAVE_ARCH_PRIVCMD_MMAP
>     <!!a blank should be here!!>static int
>     privcmd_enforce_singleshot_mapping(struct vm_area_struct *vma);
>     -#endif
>
> 2. The compile error was because I didn't point the KDIR variable at
> the installed kernel headers.
>
>     Correct:   KDIR := /usr/src/linux-headers-3.11.0-rc3-xen-next-20130801
>     Incorrect: KDIR := /root/linux-next
>     Incorrect: KDIR := /lib/modules/$(shell uname -r)/build
>
> Thanks,
>
> Guanglin
>
>> Posting the privcmd.c from linux-next would not really help.
>>
>> Thanks,
>> Andres
>>
>>> Thanks,
>>> Guanglin
>>>
>>> 2013/7/31 Andres Lagar-Cavilla <andreslc@xxxxxxxxxxxxxx>:
>>>> Hi Konrad, David and everybody else,
>>>>
>>>> I have found a privcmd bug: no more than 160 pages can be mmapped
>>>> via IOCTL_PRIVCMD_MMAPBATCH and IOCTL_PRIVCMD_MMAPBATCH_V2.
>>>>
>>>> I started a thread (see below) on the xen-users list and then
>>>> discussed it with Mr. Ian Campbell, who recommended taking it to
>>>> the xen-devel list and CC'ing you.
>>>>
>>>> I appreciate your comments.
>>>>
>>>>
>>>> You know, this might have been me. Or it may be in the process of
>>>> being fixed by a patch I just posted.
>>>>
>>>> As you may know by now, xc_map_foreign* first sets up a region of
>>>> virtual memory with an mmap call, and then issues the mapping ioctl
>>>> to populate the region.
>>>>
>>>> Let's say for the sake of argument that the mapping operation is
>>>> broken up into pieces. Current checks in the implementation of
>>>> PRIVCMD_MMAPBATCH* prevent you from populating the memory region
>>>> piecemeal.
>>>>
>>>> So: can you try the patch I pasted below?
>>>>
>>>> If that doesn't fix it, then you'll need to give us the output of
>>>> those printks.
>>>>
>>>> Thanks
>>>> Andres
>>>>
>>>> [PATCH] Xen: Fix retry calls into PRIVCMD_MMAPBATCH*.
>>>>
>>>> When a foreign mapper attempts to map guest frames that are paged
>>>> out, the mapper receives an ENOENT response and will have to try
>>>> again while a helper process pages the target frame back in.
>>>>
>>>> Gating checks on PRIVCMD_MMAPBATCH* ioctl args were preventing
>>>> retries of mapping calls.
>>>>
>>>> Signed-off-by: Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>
>>>> ---
>>>>  drivers/xen/privcmd.c | 32 +++++++++++++++++++++++++++-----
>>>>  1 files changed, 27 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>>>> index f8e5dd7..44a26c6 100644
>>>> --- a/drivers/xen/privcmd.c
>>>> +++ b/drivers/xen/privcmd.c
>>>> @@ -43,9 +43,12 @@ MODULE_LICENSE("GPL");
>>>>
>>>>  #define PRIV_VMA_LOCKED ((void *)1)
>>>>
>>>> -#ifndef HAVE_ARCH_PRIVCMD_MMAP
>>>>  static int privcmd_enforce_singleshot_mapping(struct vm_area_struct *vma);
>>>> -#endif
>>>> +
>>>> +static int privcmd_enforce_singleshot_mapping_granular(
>>>> +        struct vm_area_struct *vma,
>>>> +        unsigned long addr,
>>>> +        unsigned long nr_pages);
>>>>
>>>>  static long privcmd_ioctl_hypercall(void __user *udata)
>>>>  {
>>>> @@ -422,9 +425,9 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
>>>>          vma = find_vma(mm, m.addr);
>>>>          if (!vma ||
>>>>              vma->vm_ops != &privcmd_vm_ops ||
>>>> -            (m.addr != vma->vm_start) ||
>>>> -            ((m.addr + (nr_pages << PAGE_SHIFT)) != vma->vm_end) ||
>>>> -            !privcmd_enforce_singleshot_mapping(vma)) {
>>>> +            (m.addr < vma->vm_start) ||
>>>> +            ((m.addr + (nr_pages << PAGE_SHIFT)) > vma->vm_end) ||
>>>> +            !privcmd_enforce_singleshot_mapping_granular(vma, m.addr, nr_pages)) {
>>>>                  up_write(&mm->mmap_sem);
>>>>                  ret = -EINVAL;
>>>>                  goto out;
>>>> @@ -540,11 +543,30 @@ static int privcmd_mmap(struct file *file, struct vm_area_struct *vma)
>>>>          return 0;
>>>>  }
>>>>
>>>> +/* For MMAP */
>>>>  static int privcmd_enforce_singleshot_mapping(struct vm_area_struct *vma)
>>>>  {
>>>>          return !cmpxchg(&vma->vm_private_data, NULL, PRIV_VMA_LOCKED);
>>>>  }
>>>>
>>>> +/* For MMAPBATCH*. This allows asserting the singleshot mapping
>>>> + * on a per pfn/pte basis. Mapping calls that fail with ENOENT
>>>> + * can then be retried until success. */
>>>> +static int enforce_singleshot_mapping_fn(pte_t *pte, struct page *pmd_page,
>>>> +                                         unsigned long addr, void *data)
>>>> +{
>>>> +        return pte_none(*pte) ? 0 : -EBUSY;
>>>> +}
>>>> +
>>>> +static int privcmd_enforce_singleshot_mapping_granular(
>>>> +        struct vm_area_struct *vma,
>>>> +        unsigned long addr,
>>>> +        unsigned long nr_pages)
>>>> +{
>>>> +        return apply_to_page_range(vma->vm_mm, addr, nr_pages << PAGE_SHIFT,
>>>> +                                   enforce_singleshot_mapping_fn, NULL) == 0;
>>>> +}
>>>> +
>>>>  const struct file_operations xen_privcmd_fops = {
>>>>          .owner = THIS_MODULE,
>>>>          .unlocked_ioctl = privcmd_ioctl,
>>>> --
>>>> 1.7.1
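
For context, the user-space side of the interface this patch gates follows
the two-step pattern Andres describes: reserve a virtual address range by
calling mmap() on the privcmd file descriptor, then ask the driver to
populate it with IOCTL_PRIVCMD_MMAPBATCH_V2 and inspect the per-page error
array. The code below is only a sketch: the structure and ioctl number are
re-declared locally as stand-ins for the kernel's xen/privcmd.h (use the
real header in practice, as the definitions may differ between kernel
versions), and the device node path, domain id, and pfn list are
assumptions.

    /* Sketch of the user-space calling pattern, not the actual libxc code. */
    #include <stddef.h>
    #include <stdint.h>
    #include <errno.h>
    #include <sys/ioctl.h>
    #include <linux/ioctl.h>
    #include <sys/mman.h>

    typedef uint64_t xen_pfn_t;          /* stand-in; see the Xen headers */

    struct privcmd_mmapbatch_v2 {        /* stand-in; see xen/privcmd.h */
        unsigned int num;                /* number of pages to populate */
        uint16_t dom;                    /* target domain id */
        uint64_t addr;                   /* start of the mmap'd region */
        const xen_pfn_t *arr;            /* guest pfns to map */
        int *err;                        /* per-page error codes */
    };
    #define IOCTL_PRIVCMD_MMAPBATCH_V2 \
        _IOC(_IOC_NONE, 'P', 4, sizeof(struct privcmd_mmapbatch_v2))

    /* fd is an open privcmd file descriptor (e.g. /dev/xen/privcmd; the
     * path varies by system). Returns the mapping, or NULL on failure. */
    static void *map_foreign(int fd, uint16_t domid, const xen_pfn_t *pfns,
                             int *err, unsigned int nr)
    {
        /* Step 1: reserve the virtual address range on the privcmd fd. */
        void *addr = mmap(NULL, (size_t)nr * 4096, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
        if (addr == MAP_FAILED)
            return NULL;

        /* Step 2: ask privcmd to populate the range. Per-page results land
         * in err[]: -ENOENT marks a paged-out frame that the mapper is
         * expected to retry later (the case the patch above is about);
         * other values mark pfns that could not be mapped. */
        struct privcmd_mmapbatch_v2 batch = {
            .num  = nr,
            .dom  = domid,
            .addr = (uint64_t)(uintptr_t)addr,
            .arr  = pfns,
            .err  = err,
        };
        if (ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &batch) < 0 &&
            errno != ENOENT) {
            munmap(addr, (size_t)nr * 4096);
            return NULL;
        }
        return addr;
    }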
>>>>
>>>> Thanks,
>>>>
>>>> Guanglin
>>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Ian Campbell <Ian.Campbell@xxxxxxxxx>
>>>> Date: 2013/7/30
>>>> Subject: Re: [Xen-users] BUG : privcmd mmap 160+ pages
>>>> To: Guanglin Xu <mzguanglin@xxxxxxxxx>
>>>> Cc: "xen-users@xxxxxxxxxxxxx" <xen-users@xxxxxxxxxxxxx>
>>>>
>>>> On Tue, 2013-07-30 at 07:55 -0400, Guanglin Xu wrote:
>>>>
>>>> 2013/7/30 Ian Campbell <Ian.Campbell@xxxxxxxxx>:
>>>>
>>>> On Mon, 2013-07-29 at 16:16 -0400, Guanglin Xu wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I have just found an xc_map_foreign_range() problem in Xen.
>>>>
>>>> xc_map_foreign_range(), a libxc API backed by privcmd (a Xen kernel
>>>> module), can be used to mmap guest VM memory into dom0. However, if
>>>> we mmap more than 160 pages, the mapping fails.
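
To make the reported failure concrete, a minimal reproducer at the libxc
level might look like the sketch below. It assumes the Xen 4.x prototype
void *xc_map_foreign_range(xc_interface *xch, uint32_t dom, int size, int
prot, unsigned long mfn); the domain id is a placeholder, and under the
behaviour reported here the call fails (returns NULL) once the requested
range includes pages that privcmd will not map.

    /* Sketch: try to map 256 consecutive guest pages through libxc. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <xenctrl.h>

    int main(int argc, char **argv)
    {
        uint32_t domid = (argc > 1) ? atoi(argv[1]) : 1;  /* placeholder */
        int nr_pages = 256;               /* more than 160 */
        int size = nr_pages * 4096;

        xc_interface *xch = xc_interface_open(NULL, NULL, 0);
        if (!xch) {
            perror("xc_interface_open");
            return 1;
        }

        /* Map guest pages [0, nr_pages) read-only, starting at frame 0. */
        void *m = xc_map_foreign_range(xch, domid, size, PROT_READ, 0);
        if (!m)
            perror("xc_map_foreign_range");  /* fails for ranges > 160 pages */
        else
            munmap(m, size);

        xc_interface_close(xch);
        return 0;
    }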
>>>> Inside xc_map_foreign_range(), ioctl() is used to communicate with
>>>> privcmd. There are two ioctl commands: IOCTL_PRIVCMD_MMAPBATCH
>>>> (legacy) and IOCTL_PRIVCMD_MMAPBATCH_V2 (newer). Both of them
>>>> consistently return 0 (success), but the mappings fail after 160
>>>> pages.
>>>>
>>>> Firstly, when my Linux kernel version was 3.5,
>>>> IOCTL_PRIVCMD_MMAPBATCH was the only ioctl command available:
>>>> rc = ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH, &ioctlx). After mapping 160
>>>> pages, the subsequent ioctl calls would set the ioctlx.arr[] items
>>>> to 140737344202616, overwriting the original pfn numbers.
>>>>
>>>> I then updated my Linux kernel to 3.8 so as to test
>>>> IOCTL_PRIVCMD_MMAPBATCH_V2: rc = ioctl(fd,
>>>> IOCTL_PRIVCMD_MMAPBATCH_V2, &ioctlx). This time, the post-160 ioctl
>>>> calls would set the ioctlx.err[] items to EINVAL.
>>>>
>>>> Although I have inserted printk() calls in privcmd.c to track its
>>>> execution flow, the result showed what looks like a complete path,
>>>> which is quite weird. I have no idea what is happening in privcmd.
>>>>
>>>> Can anyone figure out this problem?
>>>>
>>>>
>>>> I wouldn't be all that surprised if there was a hardcoded batch
>>>> size limit somewhere in either libxc or the privcmd driver.
>>>>
>>>>
>>>> Hi Ian,
>>>>
>>>> Thank you very much for your reply.
>>>>
>>>> I can confirm that the problem is in privcmd, because I have
>>>> debugged libxc and narrowed the problem down to ioctl(), where
>>>> privcmd would set ioctlx.arr[] (IOCTL_PRIVCMD_MMAPBATCH) or
>>>> ioctlx.err[] (IOCTL_PRIVCMD_MMAPBATCH_V2) to indicate the failure.
>>>>
>>>>
>>>> If you map your memory in batches of e.g. 128 pages does it all
>>>> work OK?
>>>>
>>>>
>>>> Yes. For example, [0-127] succeeds. However, the subsequent
>>>> [128-255] fails because the size of the whole region has exceeded
>>>> 160 pages.
>>>>
>>>>
>>>> That's quite interesting.
>>>>
>>>>
>>>> If you want to get to the bottom of the 160 page limit you'll
>>>> probably have to trace through the code looking for hardcoded sizes
>>>> or limits on sizes (e.g. of arrays) or perhaps integer indexes etc.
>>>> which are too small and are overflowing.
>>>>
>>>> 160 * the size of the various structs involved doesn't look to be
>>>> all that interesting (i.e. just over a page size boundary or
>>>> something), but that's the sort of direction I would start by
>>>> looking in.
>>>>
>>>> If you can't spot it by eye then you'll likely have to instrument
>>>> the code paths with prints to try and track the progress of the
>>>> initially supplied buffer through to the hypercall etc.
>>>>
>>>>
>>>> Yes. I have been debugging libxc and instrumenting privcmd, but I
>>>> couldn't find any hardcoded limit or other limiting code.
>>>>
>>>> Code in libxc is easier to trace, but privcmd is hard for me, as I
>>>> lack kernel development experience. I couldn't even find where
>>>> privcmd copy_to_user()s or put_user()s the ioctlx.err[] or
>>>> ioctlx.arr[] items, even though the execution path of
>>>> privcmd_ioctl() appears complete according to my printk()s.
>>>>
>>>> Do you have any idea how privcmd might set ioctlx.err[] in some
>>>> other way?
>>>>
>>>>
>>>> I'm not familiar with this code. It sounds like this is most
>>>> probably a kernel bug; I'd recommend taking it to the xen-devel
>>>> list, perhaps CC'ing Konrad Wilk and David Vrabel.
>>>>
>>>> Ian.
>>>>

Dear Andres and all,

After a few days of exploration, I realize that the problem I found is
actually about "memory holes". I must clarify that privcmd can definitely
map more than 160 pages; it just can't map certain memory holes.
For example, in a 1 GB guest memory configuration, where 262144 pages
(4096 bytes per page) should be available, privcmd refuses to map pages
160 ~ 191 and pages 260096 ~ 262143. According to
www.intel.la/content/dam/www/public/us/en/documents/datasheets/4th-gen-core-family-desktop-vol-2-datasheet.pdf ,
pages 160 ~ 191 are the legacy video buffer, and in my configuration pages
260096 ~ 262143 are 8 MB of memory reserved for VGA.

However, this result is not satisfying at all. I would be happy if you
could respond to two further topics:

1. Have privcmd add dummy padding for the above memory holes rather than
   refusing to map them.
   http://lists.xen.org/archives/html/xen-devel/2013-08/msg01052.html

2. Fix Xen so that it does not report the wrong memory size for a domU.
   http://lists.xen.org/archives/html/xen-users/2013-08/msg00098.html

Thanks,

Guanglin
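
A per-page mapping call makes those holes visible directly from user
space. The sketch below assumes the Xen 4.x libxc prototype
void *xc_map_foreign_bulk(xc_interface *xch, uint32_t dom, int prot,
const xen_pfn_t *arr, int *err, unsigned int num), with a placeholder
domain id. It maps the first 256 guest frames in one call and prints the
per-page error codes, so the unmappable pfns (for example 160 ~ 191, the
legacy video buffer described above) show up individually while the rest
of the region maps normally.

    /* Sketch: identify unmappable guest pfns via the per-page err[] array. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <xenctrl.h>

    int main(int argc, char **argv)
    {
        uint32_t domid = (argc > 1) ? atoi(argv[1]) : 1;  /* placeholder */
        unsigned int nr = 256;
        unsigned int i;

        xen_pfn_t *pfns = calloc(nr, sizeof(*pfns));
        int *err = calloc(nr, sizeof(*err));
        xc_interface *xch = xc_interface_open(NULL, NULL, 0);
        if (!xch || !pfns || !err)
            return 1;

        for (i = 0; i < nr; i++)
            pfns[i] = i;                  /* guest frames 0 .. 255 */

        /* One call for the whole batch; failures are reported per page. */
        void *m = xc_map_foreign_bulk(xch, domid, PROT_READ, pfns, err, nr);
        if (m) {
            for (i = 0; i < nr; i++)
                if (err[i])               /* e.g. the hole at pfns 160-191 */
                    printf("pfn %u not mapped: err=%d\n", i, err[i]);
            munmap(m, (size_t)nr * 4096);
        }

        xc_interface_close(xch);
        free(pfns);
        free(err);
        return 0;
    }

Mapping around the holes (a user-space stand-in for the dummy-padding
behaviour requested in topic 1 above) could then be approximated by
issuing separate batch calls that simply skip the frames reported in
err[].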