
RE: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT



Thanks for the quick response.
 
Actually, I decoded part of the backtrace last Friday.
Using the RIP ffff82c4801153c3, I extracted the matching part of the
"objdump -dS xen-syms" output (please see below). It looks like the bug
happened while walking the domain page list, which is beyond my
understanding. As I understand it, those domain pages come from the kernel
memory zone, so they always reside in physical memory, and their addresses
should never change, right?
If so, what is the relationship between these panics and free_heap_pages?
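
For reference, the lookup step (finding which instruction a given RIP falls in) can be sketched roughly as below. This is only an illustration: `parse_objdump` and `find_insn` are hypothetical helper names, the regular expression assumes the usual "address: bytes mnemonic" layout of `objdump -d` output, and the sample lines are copied from the listing in this mail.

```python
import re

# Assumed layout of an objdump -d instruction line:
#   <hex address>: <hex bytes separated by spaces> <mnemonic and operands>
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2} )+)\s*(.*)$")


def parse_objdump(text):
    """Return a list of (address, length_in_bytes, instruction_text)."""
    insns = []
    for line in text.splitlines():
        m = INSN_RE.match(line)
        if m:
            addr = int(m.group(1), 16)
            nbytes = len(m.group(2).split())
            insns.append((addr, nbytes, m.group(3).strip()))
    return insns


def find_insn(insns, rip):
    """Return (address, text) of the instruction whose bytes cover rip, or None."""
    for addr, nbytes, text in insns:
        if addr <= rip < addr + nbytes:
            return addr, text
    return None


# A few lines taken from the listing in this mail (source annotations dropped).
SAMPLE = """\
ffff82c4801153b8:  8b 73 04                mov    0x4(%rbx),%esi
ffff82c4801153bb:  49 8d 0c 06             lea    (%r14,%rax,1),%rcx
ffff82c4801153bf:  48 8d 05 fa 10 26 00    lea    2494714(%rip),%rax
ffff82c4801153c6:  48 c1 e1 04             shl    $0x4,%rcx
"""

if __name__ == "__main__":
    hit = find_insn(parse_objdump(SAMPLE), 0xffff82c4801153c3)
    if hit is not None:
        print(f"{hit[0]:x}: {hit[1]}")
```

Matching on an address range rather than an exact address means the lookup still returns something when the RIP does not sit exactly on an instruction boundary in the listing.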
 
Several servers (at least 3) hit the same panic during the same test.
Those servers have identical hardware, kernel, and Xen configuration.
Right now memtest is running on one server (24G memory); it should finish
in a few hours.
 
------------------------------------------------------------------------------------
static inline void
page_list_del(struct page_info *page, struct page_list_head *head)
{
    struct page_info *next = pdx_to_page(page->list.next);
    struct page_info *prev = pdx_to_page(page->list.prev);
ffff82c4801153b8:   8b 73 04                mov    0x4(%rbx),%esi
ffff82c4801153bb:   49 8d 0c 06             lea    (%r14,%rax,1),%rcx
ffff82c4801153bf:   48 8d 05 fa 10 26 00    lea    2494714(%rip),%rax        # ffff82c4803764c0 <_heap>
ffff82c4801153c6:   48 c1 e1 04             shl    $0x4,%rcx
ffff82c4801153ca:   4a 03 0c f8             add    (%rax,%r15,8),%rcx
}
static inline void
page_list_del(struct page_info *page, struct page_list_head *head)
{
    struct page_info *next = pdx_to_page(page->list.next);
ffff82c4801153ce:   8b 03                   mov    (%rbx),%eax
ffff82c4801153d0:   48 c1 e0 05             shl    $0x5,%rax
ffff82c4801153d4:   48 29 e8                sub    %rbp,%rax
ffff82c4801153d7:   48 3b 19                cmp    (%rcx),%rbx
ffff82c4801153da:   0f 84 95 01 00 00       je     ffff82c480115575 <free_heap_pages+0x405>
    struct page_info *prev = pdx_to_page(page->list.prev);
ffff82c4801153e0:   89 f2                   mov    %esi,%edx
ffff82c4801153e2:   48 c1 e2 05             shl    $0x5,%rdx
ffff82c4801153e6:   48 29 ea                sub    %rbp,%rdx
ffff82c4801153e9:   48 3b 59 08             cmp    0x8(%rcx),%rbx
ffff82c4801153ed:   0f 84 bd 01 00 00       je     ffff82c4801155b0 <free_heap_pages+0x440>

    if ( !__page_list_del_head(page, head, next, prev) )
    {
------------------------------------------------------------------------------------
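
As a rough cross-check of the address arithmetic the listing seems to perform for pdx_to_page (shift the 32-bit index left by 5, then adjust by a base held in %rbp), here is a small sketch. Everything in it is an assumption rather than something confirmed in this thread: the frame table base 0xffff82f600000000 is guessed from page addresses such as ffff82f60bd64000 quoted below, the 32-byte struct size is guessed from "shl $0x5", and the helper names are made up for illustration.

```python
# Assumptions (not confirmed by the thread):
FRAME_TABLE_BASE = 0xffff82f600000000  # guessed from addresses like ffff82f60bd64000
PAGE_INFO_SIZE = 1 << 5                # guessed from "shl $0x5" in the listing
LIST_PREV_OFFSET = 4                   # "mov 0x4(%rbx),%esi" reads page->list.prev


def pdx_to_page_addr(pdx):
    """Address of the page_info slot a given index would select."""
    return FRAME_TABLE_BASE + pdx * PAGE_INFO_SIZE


def split_fault_addr(addr):
    """Split an address into (index, byte offset inside the page_info slot)."""
    delta = addr - FRAME_TABLE_BASE
    return delta // PAGE_INFO_SIZE, delta % PAGE_INFO_SIZE


if __name__ == "__main__":
    # Fault address reported in the quoted message below.
    pdx, offset = split_fault_addr(0xffff8315ffffffe4)
    print(hex(pdx), offset)  # prints: 0xffffffff 4
```

If these assumptions hold, the faulting address would correspond to offset 4 inside the slot selected by an all-ones 32-bit index, rather than to a real page. I am not sure this interpretation is right.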
 
> Date: Mon, 30 Aug 2010 10:02:05 +0100
> Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
> From: keir.fraser@xxxxxxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>
> On 30/08/2010 09:47, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:
>
> > 3) Every panic points to the same address: ffff8315ffffffe4, which is
> > not a valid page address.
> > I printed the pages of the domain in assign_pages, which all look like
> > ffff82f60bd64000; at least
> > ffff82f60 is the same.
>
> Yes, well you may not be crashing on a supposed page address. Certainly the
> page pointer that relinquish_memory() is working on, and passed to
> put_page->free_domheap_pages is valid enough to not cause any of those
> functions to crash when dereferencing it. At the moment you really have no
> idea what is causing free_heap_pages() to crash.
>
> > I am a bit lost on how to go further. Thanks.
>
> You need to find out which line of code in free_heap_pages() is crashing,
> and what variable it is trying to dereference when it crashes. You have a
> nice backtrace with an EIP value, so you can 'objdump -d xen-syms' and
> search for the EIP in the disassembly. If you have a debug build of Xen you
> can even do 'objdump -S xen-syms' and have the disassembly annotated with
> corresponding source lines.
>
> Have you seen this on more than one physical machine? If not, have you run
> memtest on the offending machine?
>
> -- Keir
>
>

 

