Re: [Xen-devel] Crashing kernel with dom0/libxc gnttab/gntshr
On Tue, 30 Jul 2013, Daniel De Graaf wrote:
> On 07/30/2013 12:58 PM, David Vrabel wrote:
> [...]
> >
> > [ 902.729307] BUG: Bad page map in process vchan-node1 pte:12bfff167 pmd:b9b5c067
> > [ 902.729312] page:ffffea0004afffc0 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
> >
> > I think this is the test for page_mapcount(page) < 0 in zap_pte_range().
> > This has looked up the page using the PTE it is trying to clear. Has
> > it found the correct page? Since the MFN is currently mapped into the
> > same domain, has the m2p_override stuff confused the look up and it is
> > checking the grantee page not the granter?
> >
> > David
>
> I think something like this is happening, since while reproducing this
> on my test system, some linked list corruption was found that I believe
> to be the cause of this problem. The gnttab_map_refs function on PV uses
> m2p_add_override on the page, which threads page->lru to an
> m2p_overrides list. However, something else is using page->lru during
> the use of gntdev, as shown by the following debug patch:

I have never managed to prove that something else is trying to use
page->lru while the m2p_override is using it.

Jeremy, at the time the code was written, you were pretty confident that
page->lru couldn't be used by anybody else. Why was that?

> diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
> index 3c8803f..198e57e 100644
> --- a/drivers/xen/gntdev.c
> +++ b/drivers/xen/gntdev.c
> @@ -294,6 +294,11 @@ static int map_grant_pages(struct grant_map *map)
>  	if (err)
>  		return err;
>  
> +	printk("map page0 lru: %p prev=%p:%p next=%p:%p\n",
> +		&map->pages[0]->lru,
> +		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
> +		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
> +
>  	for (i = 0; i < map->count; i++) {
>  		if (map->map_ops[i].status)
>  			err = -EINVAL;
> @@ -320,6 +325,10 @@ static int __unmap_grant_pages(struct grant_map *map, int offset, int pages)
>  		}
>  	}
>  
> +	printk("unmap page0 lru: %p prev=%p:%p next=%p:%p\n",
> +		&map->pages[0]->lru,
> +		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
> +		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
>  	err = gnttab_unmap_refs(map->unmap_ops + offset,
>  			use_ptemod ? map->kmap_ops + offset : NULL, map->pages + offset,
>  			pages);
>
> Output:
> [ 88.610644] map page0 lru: ffffea0001dee160 prev=ffffffff82f2d510:ffffea0001dee160 next=ffffffff82f2d510:ffffea0001dee160
> [ 88.611515] BUG: Bad page map in process a.out pte:8000000077b85167 pmd:2541a067
> [ 88.611525] page:ffffea0001dee140 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
> [ 88.611532] page flags: 0x1000000000000814(referenced|dirty|private)
> [ 88.611541] addr:00007f1adaef3000 vm_flags:140400fb anon_vma: (null) mapping:ffff8800692974a0 index:0
> [ 88.611547] vma->vm_ops->fault: (null)
> [ 88.611555] vma->vm_file->f_op->mmap: gntalloc_mmap+0x0/0x1d0
> [...backtrace cropped...]
> [ 88.614301] unmap page0 lru: ffffea0001dee160 prev=ffff8800254c9d08:ffff88001ea0b120 next=ffff8800254c9d08:ffff88001ea0b938
>
> The initial map is a linked list with only that element, so the address
> 0xffffffff82f2d510 is the m2p_overrides entry. This means the page being
> found by zap_pte_range is not a valid struct page.
>
> The struct page* being used by the gntalloc device was 0xffffea0000952740,
> for reference; it's not a direct collision between the page used by the
> gntdev and gntalloc devices.
>
> Not sure what the best fix is for this at the moment.
>
> --
> Daniel De Graaf
> National Security Agency
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel