[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] xen.git branch reorg / success with 2.6.30-rc3 pv_ops dom0



On Thu, Jun 11, 2009 at 10:02:18AM +0100, Ian Campbell wrote:
> On Tue, 2009-06-09 at 13:28 -0400, Jeremy Fitzhardinge wrote:
> > Ian Campbell wrote:
> > > I wonder how this interacts with the logic in
> > > arch/x86/xen/mmu.c:xen_pin_page() which holds the lock while waiting for
> > > the (deferred) pin multicall to occur? Hmm, no this is about the
> > > PagePinned flag on the struct page which is out of date WRT the actual
> > > pinned status as Xen sees it -- we update the PagePinned flag early in
> > > xen_pin_page() long before Xen the pin hypercall so this window is the
> > > other way round to what would be needed to trigger this bug.
> > >   
> > 
> > Yes, it looks like you could get a bad mapping here.  An obvious fix 
> > would be to defer clearing the pinned flag in the page struct until 
> > after the hypercall has issued.  That would make the racy 
> > kmap_atomic_pte map RO, which would be fine unless it actually tries to 
> > modify it (but I can't imagine it would do that unlocked).
> 
> But would it redo the mapping after taking the lock? It doesn't look
> like it does (why would it). So we could end up writing to an unpinned
> pte via a R/O mapping.
> 
> As an experiment I tried the simple approach of flushing the multicalls
> explicitly in xen_unpin_page and then clearing the Pinned bit and it all
> goes a bit wrong. eip is "ptep->pte_low = 0" so I think the unpinned but
> R/O theory holds...
>         BUG: unable to handle kernel paging request at f57ab240
>         IP: [<c0486f8b>] unmap_vmas+0x32e/0x5bb
>         *pdpt = 00000001002d6001
>         Oops: 0003 [#1] SMP
>         last sysfs file:
>         Modules linked in:
>         
>         Pid: 719, comm: init Not tainted (2.6.30-rc6-x86_32p-highpte-tip #15)
>         EIP: 0061:[<c0486f8b>] EFLAGS: 00010202 CPU: 0
>         EIP is at unmap_vmas+0x32e/0x5bb
>         EAX: 1dcfb025 EBX: 00000001 ECX: f57ab240 EDX: 00000001
>         ESI: 00000001 EDI: 08048000 EBP: e18d8d54 ESP: e18d8cd4
>          DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
>         Process init (pid: 719, ti=e18d8000 task=e24d8cc0 task.ti=e18d8000)
>         Stack:
>          00000002 e18ce000 e3204a50 00000000 00000000 e18ad058 e18d8d70 
> 003ff000
>          08048000 00000001 00000000 00000001 e25d3e00 08050000 e18ce000 
> e25d3e00
>          f57ab240 00000000 00000000 c17f03ec 00000000 00000000 008d8d4c 
> e18bf200
>         Call Trace:
>          [<c048a752>] ? exit_mmap+0x74/0xba
>          [<c0436b92>] ? mmput+0x37/0x81
>          [<c04a0e69>] ? flush_old_exec+0x3bc/0x635
>          [<c04a0274>] ? kernel_read+0x34/0x46
>          [<c04c7594>] ? load_elf_binary+0x329/0x1189
>          [<c049c2da>] ? fsnotify_access+0x4f/0x5a
> 
> kmap_atomic_pte doesn't get passed the mm so there is no way to get at
> the ptl we would need to do something like clearing the pinned flag
> under the lock in xen_unpin_page and holding the lock in
> xen_kmap_atomic_pte. (I don't know if that would be valid anyway under
> the locking scheme).
> 
> The experimental patch:
> 


So should I try with only this patch applied? 

-- Pasi


> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index 1729178..e997813 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -1108,9 +1108,7 @@ static void __init xen_mark_init_mm_pinned(void)
>  static int xen_unpin_page(struct mm_struct *mm, struct page *page,
>                         enum pt_level level)
>  {
> -     unsigned pgfl = TestClearPagePinned(page);
> -
> -     if (pgfl && !PageHighMem(page)) {
> +     if (PagePinned(page) && !PageHighMem(page)) {
>               void *pt = lowmem_page_address(page);
>               unsigned long pfn = page_to_pfn(page);
>               spinlock_t *ptl = NULL;
> @@ -1136,10 +1134,12 @@ static int xen_unpin_page(struct mm_struct *mm, 
> struct page *page,
>                                       pfn_pte(pfn, PAGE_KERNEL),
>                                       level == PT_PGD ? UVMF_TLB_FLUSH : 0);
>  
> -             if (ptl) {
> -                     /* unlock when batch completed */
> -                     xen_mc_callback(xen_pte_unlock, ptl);
> -             }
> +             xen_mc_flush();
> +
> +             ClearPagePinned(page);
> +
> +             if (ptl)
> +                     xen_pte_unlock(ptl);
>       }
>  
>       return 0;               /* never need to flush on unpin */
> 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.