
Re: [Xen-devel] Hunting down an oops in Xen 3.1.0's 2.6.18 kernel


  • To: "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>
  • From: "Michael Marineau" <mike@xxxxxxxxxxxx>
  • Date: Wed, 3 Oct 2007 13:39:24 -0700
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 03 Oct 2007 13:40:07 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On 9/17/07, Michael Marineau <mike@xxxxxxxxxxxx> wrote:
> On 9/15/07, Keir Fraser <Keir.Fraser@xxxxxxxxxxxx> wrote:
> > On 14/9/07 23:51, "Michael Marineau" <mike@xxxxxxxxxxxx> wrote:
> >
> > > I have been unable to reproduce this with 3.0.4's 2.6.16 kernel but
> > > 2.6.18 will oops on both 3.0.4 and 3.1.0. Also, x86_64 appears to be
> > > ok.
> > >
> > > I'm guessing this issue is the same as the oops reported here:
> > > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=975
> > >
> > > Below is an example of the oops on my 2.6.18 pae kernel with a couple
> > > extra debugging lines added:
> >
> > Looks like xen_l1_entry_update() is passed a virtual address which has no
> > corresponding machine address. So the pte page or its mapping is corrupted
> > somehow. deadbeef in the register dumps is also not a good sign. I'll have a
> > go at repro'ing.
> >
> >  -- Keir
> >
> >
> >
>
> As for the deadbeef, I kind of doubt it is important. Those values
> show up after the hypercall to xen. Using the attached patch which
> checks for the bogus value prior to the call I get the following oops:
>
> virtptr: f57b40c0 machineptr: 7fffffff0c0
> ------------[ cut here ]------------
> kernel BUG at arch/i386/mm/hypervisor.c:64!
> invalid opcode: 0000 [#1]
> SMP
> Modules linked in:
> CPU:    0
> EIP:    0061:[<c0117893>]    Not tainted VLI
> EFLAGS: 00010286   (2.6.18-xen-r5-try2 #10)
> EIP is at xen_l1_entry_update+0xd7/0x100
> eax: 0000002d   ebx: 00000000   ecx: 00000000   edx: 00000001
> esi: fffff0c0   edi: 000007ff   ebp: ed45cd10   esp: ed45ccd8
> ds: 007b   es: 007b   ss: 0069
> Process bash (pid: 5044, ti=ed45c000 task=ec835a70 task.ti=ed45c000)
> Stack: c037b964 f57b40c0 fffff0c0 000007ff 00000000 00000000 f57b40c0 fffff0c0
>        000007ff 00000000 00000000 00000000 00000000 00000000 ed45cd84 c01586b7
>        35371025 00000000 ecd95ec0 ecd95f08 c04bce70 00000000 00000004 00000000
> Call Trace:
>  [<c01586b7>] zap_pte_range+0x265/0x658
>  [<c0158c16>] unmap_page_range+0x16c/0x2b4
>  [<c0158e2c>] unmap_vmas+0xce/0x1cb
>  [<c015f0b8>] exit_mmap+0x7d/0xf4
>  [<c011e0f3>] mmput+0x36/0x8c
>  [<c01782d3>] exec_mmap+0x156/0x229
>  [<c0178a78>] flush_old_exec+0x59/0x25a
>  [<c0198a18>] load_elf_binary+0x33c/0xc52
>  [<c0178f2a>] search_binary_handler+0x89/0x23c
>  [<c017922f>] do_execve+0x152/0x1be
>  [<c010391c>] sys_execve+0x32/0x84
>  [<c0104dfb>] syscall_call+0x7/0xb
>  [<b7efd899>] 0xb7efd899
> Code: b4 97 fe ff 85 c0 78 42 83 c4 2c 5b 5e 5f 5d c3 8b 45 e0 89 74 24 08 89 7c 24 0
> EIP: [<c0117893>] xen_l1_entry_update+0xd7/0x100 SS:ESP 0069:ed45ccd8
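
The bogus machine pointer in the oops above can be taken apart with a
little arithmetic. What follows is a minimal Python sketch, not kernel
code; the helper names split_maddr and halves are made up for
illustration. It shows that the page offset (0x0c0) matches the virtual
pointer and that esi/edi hold the low and high halves of the 64-bit PAE
value. Reading the frame number 0x7fffffff as an invalid p2m entry (all
ones) with the foreign-frame bit (bit 31) masked off is my assumption
about how the 2.6.18 Xen tree behaves, not something the log itself
proves.

```python
PAGE_SHIFT = 12                       # 4 KiB pages on i386
PAGE_MASK = (1 << PAGE_SHIFT) - 1     # low 12 bits: offset within a page


def split_maddr(maddr):
    """Return (machine frame number, offset within the page)."""
    return maddr >> PAGE_SHIFT, maddr & PAGE_MASK


def halves(maddr):
    """Return (low 32 bits, high 32 bits) of a 64-bit PAE machine address."""
    return maddr & 0xFFFFFFFF, maddr >> 32


# Values copied from the "virtptr: ... machineptr: ..." debug line.
virtptr = 0xF57B40C0
machineptr = 0x7FFFFFFF0C0

mfn, offset = split_maddr(machineptr)
lo, hi = halves(machineptr)

# ASSUMPTION: an all-ones p2m entry with bit 31 cleared would give this mfn.
assumed_invalid_mfn = 0xFFFFFFFF & ~(1 << 31)

print(f"mfn={mfn:#x} offset={offset:#x}")
print(f"virtptr page offset={virtptr & PAGE_MASK:#x}")
print(f"lo={lo:#010x} (esi) hi={hi:#010x} (edi)")
print("mfn matches assumed invalid entry:", mfn == assumed_invalid_mfn)
```

If that reading is right, the pte page's pfn has no valid p2m entry at
the time of the call, which fits Keir's description of a virtual
address with no corresponding machine address.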

I can still reproduce this problem on the 3.1.1-rc2 xen kernel. Has
anyone had a chance to take a look at this or try to reproduce it? I
can reproduce this far too easily :-(

Is there any further debugging information I can provide?

-- 
Michael Marineau
Oregon State University
mike@xxxxxxxxxxxx

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

