
RE: [Xen-users] xen 3.0 amd64 crash... seems to be tied into disk i/o, > 4 gig ram



 
Looking at the oops message (tw_interrupt is in the call trace), this is
with a 3ware card, right? We've had at least one other report of them
causing problems on systems with more than 4GB of RAM enabled (or maybe
that was you?)
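
For reference, the fault can be read fairly directly off the register dump
quoted below. What follows is a minimal Python sketch, assuming the standard
x86 page-fault error-code bit layout; the function names are made up for
illustration and the input values are copied from the first oops.

```python
# Sketch only: decode the "Oops: 0003" error code and the rep movsq state.
# Assumes the standard x86 page-fault error-code bits; names are illustrative.
PF_PROT = 0x1   # 0 = page not present, 1 = protection violation
PF_WRITE = 0x2  # 0 = read access,      1 = write access
PF_USER = 0x4   # 0 = kernel mode,      1 = user mode


def decode_oops_code(code: int) -> dict:
    """Translate the hex value printed after 'Oops:' into plain words."""
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
    }


def movsq_bytes_remaining(rcx: int) -> int:
    """rep movsq copies 8 bytes per iteration; RCX holds the iterations left."""
    return rcx * 8


# Values taken from the first oops in the quoted report.
oops_code = 0x0003
rcx = 0x200
rax = 0xFFFF88001E61B000  # memcpy return value, i.e. the original destination
rdi = 0xFFFF88001E61B000  # current destination pointer
cr2 = 0xFFFF88001E61B000  # faulting address

print(decode_oops_code(oops_code))
print("bytes still to copy:", movsq_bytes_remaining(rcx))
print("faulted on destination:", rdi == cr2)
print("destination not yet advanced:", rdi == rax)
```

If that reading is right, the fault is a kernel-mode write to a present but
non-writable page, on the first quadword of what looks like a 4096-byte copy
into the destination buffer passed to __sync_single. That is only an
interpretation of the dump, not a confirmed diagnosis.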

Ian

> This seems to be a repeatable crash. Just do some disk-intensive 
> work in domU and then type "sync" :(
> 
> The box is a dual Opteron 720 with 8 GB of RAM, one domU 
> and (duh) one dom0, both with approx. 500 MB of RAM allocated.
> 
> The box has remote power control and a serial console, and I can 
> provide developer access if it helps. The kernel was compiled 
> locally (CentOS 4.2 amd64 on both domU and dom0).
> 
> The box seems stable under raw Linux 2.6.14.2, but does generate 
> occasional MCE messages pointing at the northbridge/GART. 
> I spent a day researching that, and didn't come to any 
> conclusion other than it could be a bogus report specific to 
> amd64 systems with > 4 GB of RAM. There is an IBM page to that 
> effect for an older RHE system. The box has a 3ware controller 
> and SATA drives.
> 
> Anyhow, any help would be appreciated. I'm probably going to 
> try to see if the PAE stuff is more stable... but obviously 
> not tonight.
> 
> In theory this is a 3.0.0 box, but might be 3.0-testing...
> 
> This is pretty Greek to me, but given that it seems 
> reproducible, I should be able to produce any other info required...?
> 
> Or should I be dumping this into bugzilla?
> 
> -Tom
> 
> >From root@xxxxxxxxxxxxxxxxxxxxx Thu Dec  8 00:33:19 2005
> Date: Thu, 8 Dec 2005 00:21:56 -0800
> From: root <root@xxxxxxxxxxxxxxxxxxxxx>
> To: tbrown@xxxxxxxxxxxxx
> Subject: oops.2.ksymoops
> ksymoops 2.4.11 on x86_64 2.6.12.6-xen0.  Options used
>      -V (default)
>      -K (specified)
>      -l /proc/modules (default)
>      -o /lib/modules/2.6.12.6-xen0/ (default)
>      -m /boot/System.map-2.6.12.6-xen0 (specified)
> 
> No modules in ksyms, skipping objects
> No ksyms, skipping lsmod
> Unable to handle kernel paging request at ffff88001e61b000 RIP:
> <ffffffff80220bfb>{memcpy+11}
> Oops: 0003 [1]
> CPU 0
> Pid: 0, comm: swapper Not tainted 2.6.12.6-xen0
> RIP: e030:[<ffffffff80220bfb>] <ffffffff80220bfb>{memcpy+11} 
> Using defaults from ksymoops -t elf64-x86-64 -a i386:x86-64
> RSP: e02b:ffffffff80525d50  EFLAGS: 00010246
> RAX: ffff88001e61b000 RBX: 000000000000500c RCX: 0000000000000200
> RDX: 0000000000000000 RSI: ffff8800040a2000 RDI: ffff88001e61b000
> RBP: 0000000000000002 R08: 0000000000000002 R09: ffff8800040a2000
> R10: ffff8800040a2000 R11: 0000000000000246 R12: 0000000000000000
> R13: ffff800000000000 R14: 7fffffffffffffff R15: 6db6db6db6db6db7
> FS:  00002aaaaaac9360(0000) GS:ffffffff80511a00(0000) knlGS:0000000055572460
> CS:  e033 DS: 0000 ES: 0000
> Stack: ffffffff8011a094 ffff8800016a55e8 0000000000000000 ffff880005ac42d8
>        ffffffff8011a2cd ffff8800016a55e8 0000000000000000 0000000100000000
>        ffff8800147221c0 0000000000000001
> Call Trace:<ffffffff8011a094>{__sync_single+100} <ffffffff8011a2cd>{unmap_single+109}
>        <ffffffff8011aa40>{swiotlb_unmap_sg+192} <ffffffff802eb517>{tw_interrupt+1799}
>        <ffffffff8014cd9d>{handle_IRQ_event+61} <ffffffff8014ce87>{__do_IRQ+167}
>        <ffffffff80114dc4>{do_IRQ+52} <ffffffff8010d958>{evtchn_do_upcall+136}
>        <ffffffff80111e7d>{do_hypervisor_callback+17} <ffffffff8010f793>{xen_idle+83}
>        <ffffffff8010f793>{xen_idle+83} <ffffffff8010f7cf>{cpu_idle+31}
>        <ffffffff8052671f>{start_kernel+495} <ffffffff80526193>{_sinittext+403}
> Code: f3 48 a5 89 d1 f3 a4 c3 66 66 66 90 66 66 66 90 66 66 66 90
> 
> 
> >>RIP; ffffffff80220bfb <memcpy+b/b0>   <=====
> 
> >>RAX; ffff88001e61b000 <__start___xen_guest+ffff88001e612144/ffffffff800f7144>
> >>RSI; ffff8800040a2000 <__start___xen_guest+ffff880004099144/ffffffff800f7144>
> >>RDI; ffff88001e61b000 <__start___xen_guest+ffff88001e612144/ffffffff800f7144>
> >>R09; ffff8800040a2000 <__start___xen_guest+ffff880004099144/ffffffff800f7144>
> >>R10; ffff8800040a2000 <__start___xen_guest+ffff880004099144/ffffffff800f7144>
> >>R13; ffff800000000000 <__start___xen_guest+ffff7fffffff7144/ffffffff800f7144>
> >>R14; 7fffffffffffffff <__start___xen_guest+7fffffffffff7143/ffffffff800f7144>
> >>R15; 6db6db6db6db6db7 <__start___xen_guest+6db6db6db6dadefb/ffffffff800f7144>
> 
> Trace; ffffffff8011a094 <__sync_single+64/70>
> Trace; ffffffff8011aa40 <swiotlb_unmap_sg+c0/e0>
> Trace; ffffffff8014cd9d <handle_IRQ_event+3d/80>
> Trace; ffffffff80114dc4 <do_IRQ+34/50>
> Trace; ffffffff80111e7d <do_hypervisor_callback+11/18>
> Trace; ffffffff8010f793 <xen_idle+53/70>
> Trace; ffffffff8052671f <start_kernel+1ef/200>
> 
> Code;  ffffffff80220bfb <memcpy+b/b0>
> 0000000000000000 <_RIP>:
> Code;  ffffffff80220bfb <memcpy+b/b0>   <=====
>    0:   f3 48 a5                  repz movsq %ds:(%rsi),%es:(%rdi)   <=====
> Code;  ffffffff80220bfe <memcpy+e/b0>
>    3:   89 d1                     mov    %edx,%ecx
> Code;  ffffffff80220c00 <memcpy+10/b0>
>    5:   f3 a4                     repz movsb %ds:(%rsi),%es:(%rdi)
> Code;  ffffffff80220c02 <memcpy+12/b0>
>    7:   c3                        retq
> Code;  ffffffff80220c03 <memcpy+13/b0>
>    8:   66                        data16
> Code;  ffffffff80220c04 <memcpy+14/b0>
>    9:   66                        data16
> Code;  ffffffff80220c05 <memcpy+15/b0>
>    a:   66                        data16
> Code;  ffffffff80220c06 <memcpy+16/b0>
>    b:   90                        nop
> Code;  ffffffff80220c07 <memcpy+17/b0>
>    c:   66                        data16
> Code;  ffffffff80220c08 <memcpy+18/b0>
>    d:   66                        data16
> Code;  ffffffff80220c09 <memcpy+19/b0>
>    e:   66                        data16
> Code;  ffffffff80220c0a <memcpy+1a/b0>
>    f:   90                        nop
> Code;  ffffffff80220c0b <memcpy+1b/b0>
>   10:   66                        data16
> Code;  ffffffff80220c0c <memcpy+1c/b0>
>   11:   66                        data16
> Code;  ffffffff80220c0d <memcpy+1d/b0>
>   12:   66                        data16
> Code;  ffffffff80220c0e <memcpy+1e/b0>
>   13:   90                        nop
> 
> CR2: ffff88001e61b000
>  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
> 
> 
> 
> >From root@xxxxxxxxxxxxxxxxxxxxx Thu Dec  8 00:43:16 2005
> Date: Thu, 8 Dec 2005 00:40:51 -0800
> From: root <root@xxxxxxxxxxxxxxxxxxxxx>
> To: tbrown@xxxxxxxxxxxxx
> Subject: tmpx3.ksymoops
> 
> ksymoops 2.4.11 on x86_64 2.6.12.6-xen0.  Options used
>      -V (default)
>      -K (specified)
>      -l /proc/modules (default)
>      -o /lib/modules/2.6.12.6-xen0/ (default)
>      -m /usr/src/linux/System.map (default)
> 
> No modules in ksyms, skipping objects
> No ksyms, skipping lsmod
> Unable to handle kernel paging request at ffff88001e527000 RIP:
> <ffffffff80220bfb>{memcpy+11}
> Oops: 0003 [1]
> CPU 0
> Pid: 0, comm: swapper Not tainted 2.6.12.6-xen0
> RIP: e030:[<ffffffff80220bfb>] <ffffffff80220bfb>{memcpy+11} 
> Using defaults from ksymoops -t elf64-x86-64 -a i386:x86-64
> RSP: e02b:ffffffff80525d50  EFLAGS: 00010246
> RAX: ffff88001e527000 RBX: 0000000000003968 RCX: 0000000000000200
> RDX: 0000000000000000 RSI: ffff880003550000 RDI: ffff88001e527000
> RBP: 0000000000000002 R08: 0000000000000002 R09: ffff880003550000
> R10: ffff880003550000 R11: 0000000000000246 R12: 0000000000000000
> R13: ffff800000000000 R14: 7fffffffffffffff R15: 6db6db6db6db6db7
> FS:  00002aaaabe8f280(0000) GS:ffffffff80511a00(0000) knlGS:0000000055572460
> CS:  e033 DS: 0000 ES: 0000
> Stack: ffffffff8011a094 ffff8800016a2088 ffffffff00000000 ffff880005ac42d8
>        ffffffff8011a2cd ffff8800016a2088 ffffffff00000000 0000000100000000
>        ffff8800078caf20 0000000000000001
> Call Trace:<ffffffff8011a094>{__sync_single+100} <ffffffff8011a2cd>{unmap_single+109}
>        <ffffffff8011aa40>{swiotlb_unmap_sg+192} <ffffffff802eb517>{tw_interrupt+1799}
>        <ffffffff8014cd9d>{handle_IRQ_event+61} <ffffffff8014ce87>{__do_IRQ+167}
>        <ffffffff80114dc4>{do_IRQ+52} <ffffffff8010d958>{evtchn_do_upcall+136}
>        <ffffffff80111e7d>{do_hypervisor_callback+17} <ffffffff8010f793>{xen_idle+83}
>        <ffffffff8010f793>{xen_idle+83} <ffffffff8010f7cf>{cpu_idle+31}
>        <ffffffff8052671f>{start_kernel+495} <ffffffff80526193>{_sinittext+403}
> Code: f3 48 a5 89 d1 f3 a4 c3 66 66 66 90 66 66 66 90 66 66 66 90
> 
> 
> >>RIP; ffffffff80220bfb <bitmap_parse+bb/210>   <=====
> 
> >>RAX; ffff88001e527000 <phys_startup_64+ffff88001e426f00/ffffffff7fffff00>
> >>RSI; ffff880003550000 <phys_startup_64+ffff88000344ff00/ffffffff7fffff00>
> >>RDI; ffff88001e527000 <phys_startup_64+ffff88001e426f00/ffffffff7fffff00>
> >>R09; ffff880003550000 <phys_startup_64+ffff88000344ff00/ffffffff7fffff00>
> >>R10; ffff880003550000 <phys_startup_64+ffff88000344ff00/ffffffff7fffff00>
> >>R13; ffff800000000000 <phys_startup_64+ffff7fffffefff00/ffffffff7fffff00>
> >>R14; 7fffffffffffffff <phys_startup_64+7fffffffffeffeff/ffffffff7fffff00>
> >>R15; 6db6db6db6db6db7 <phys_startup_64+6db6db6db6cb6cb7/ffffffff7fffff00>
> 
> Trace; ffffffff8011a094 <touch_nmi_watchdog+4/30>
> Trace; ffffffff8011aa40 <pin_2_irq+60/130>
> Trace; ffffffff8014cd9d <kfifo_init+8d/90>
> Trace; ffffffff80114dc4 <pda_init+94/110>
> Trace; ffffffff80111e7d <handle_lost_ticks+13d/170>
> Trace; ffffffff8010f793 <oops_begin+23/70>
> Trace; ffffffff8052671f <__log_buf+e15f/20000>
> 
> Code;  ffffffff80220bfb <bitmap_parse+bb/210>
> 0000000000000000 <_RIP>:
> Code;  ffffffff80220bfb <bitmap_parse+bb/210>   <=====
>    0:   f3 48 a5                  repz movsq %ds:(%rsi),%es:(%rdi)   <=====
> Code;  ffffffff80220bfe <bitmap_parse+be/210>
>    3:   89 d1                     mov    %edx,%ecx
> Code;  ffffffff80220c00 <bitmap_parse+c0/210>
>    5:   f3 a4                     repz movsb %ds:(%rsi),%es:(%rdi)
> Code;  ffffffff80220c02 <bitmap_parse+c2/210>
>    7:   c3                        retq
> Code;  ffffffff80220c03 <bitmap_parse+c3/210>
>    8:   66                        data16
> Code;  ffffffff80220c04 <bitmap_parse+c4/210>
>    9:   66                        data16
> Code;  ffffffff80220c05 <bitmap_parse+c5/210>
>    a:   66                        data16
> Code;  ffffffff80220c06 <bitmap_parse+c6/210>
>    b:   90                        nop
> Code;  ffffffff80220c07 <bitmap_parse+c7/210>
>    c:   66                        data16
> Code;  ffffffff80220c08 <bitmap_parse+c8/210>
>    d:   66                        data16
> Code;  ffffffff80220c09 <bitmap_parse+c9/210>
>    e:   66                        data16
> Code;  ffffffff80220c0a <bitmap_parse+ca/210>
>    f:   90                        nop
> Code;  ffffffff80220c0b <bitmap_parse+cb/210>
>   10:   66                        data16
> Code;  ffffffff80220c0c <bitmap_parse+cc/210>
>   11:   66                        data16
> Code;  ffffffff80220c0d <bitmap_parse+cd/210>
>   12:   66                        data16
> Code;  ffffffff80220c0e <bitmap_parse+ce/210>
>   13:   90                        nop
> 
> CR2: ffff88001e527000
>  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
> 
> 
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
> 



 

