
[Xen-devel] domU's crashing Dom0 (Xen + iSCSI = timebomb)


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: "Luis Vinay" <luisvinay@xxxxxxxxx>
  • Date: Thu, 7 Dec 2006 12:04:58 -0300
  • Delivery-date: Thu, 07 Dec 2006 07:04:55 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

I'm experimenting with Xen + iSCSI, and I have found that under heavy stress a domU can crash the entire system. I've reproduced this many times.

My setup is as follows:

Software:
- iSCSI Enterprise Target v0.4.13
- RedHat AS4 update 4 64bit + Xen 3.0.3-0 Kernel 2.6.16.29 + Open iSCSI v2.0.730 (Initiator)
- Bonnie++ v1.03a

VM:
- Debian 3.1r3 + Open iSCSI v2.0.730 (Initiator)
- RedHat AS4 update 4 + Open iSCSI v2.0.730 (Initiator)

Tests:
Four instances of bonnie++, run as root, on the filesystem to be stressed.

hda (OS and swap) + hdb (stressed ext3 filesystem) over iSCSI, both Debian and RH
   Result: crash
hda (OS and swap) local + hdb (stressed ext3 filesystem) over iSCSI, both Debian and RH
   Result: crash

hda (OS and swap) + hdb (stressed ext3 filesystem in writeback mode) over iSCSI, both Debian and RH
   Result: crash
hda (OS and swap) local + hdb (stressed ext3 filesystem in writeback mode) over iSCSI, both Debian and RH
   Result: crash

hda (OS and swap) + hdb (stressed ext2 filesystem) over iSCSI, both Debian and RH
   Result: crash
hda (OS and swap) local + hdb (stressed ext2 filesystem) over iSCSI, both Debian and RH
   Result: crash

hda (OS and swap) + hdb (stressed xfs filesystem) over iSCSI, both Debian and RH
   Result: crash
hda (OS and swap) local + hdb (stressed xfs filesystem) over iSCSI, both Debian and RH
   Result: crash
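
For anyone trying to reproduce the load, here is a rough sketch of how four parallel bonnie++ instances could be launched as root against the stressed filesystem. It is not the exact invocation used in the tests above: the mount point /mnt/stress and the helper names are made up for illustration, and only the standard bonnie++ options -d (directory) and -u (user) are used.

```python
import shlex
import subprocess


def bonnie_commands(target_dir, instances=4, user="root"):
    """Build one bonnie++ command line per instance, all aimed at target_dir."""
    return [["bonnie++", "-d", target_dir, "-u", user] for _ in range(instances)]


def run_stress(target_dir, instances=4, dry_run=True):
    """Start all instances in parallel and wait for them.

    With dry_run=True the commands are only printed, so the sketch can be
    checked on a machine without bonnie++ installed.
    """
    cmds = bonnie_commands(target_dir, instances)
    if dry_run:
        for cmd in cmds:
            print(shlex.join(cmd))
        return []
    procs = [subprocess.Popen(cmd) for cmd in cmds]
    return [p.wait() for p in procs]  # exit codes, one per instance


if __name__ == "__main__":
    # /mnt/stress is a hypothetical mount point for the iSCSI-backed hdb.
    run_stress("/mnt/stress", dry_run=True)
```

Calling run_stress with dry_run=False inside the domU would start the real load.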

Also tested:

Xen 3.0.2-2 Dom0 kernel 2.6.16-xen0 (stressed ext2 filesystem) over iSCSI
   Result: ~60 h of testing with no problems (tests then stopped)

kernel 2.6.16.29 (stressed ext2 filesystem) over iSCSI
   Result: ~24.30 h of testing with no problems (tests then stopped)

Xen 3.0.2-2 Dom0 kernel 2.6.16.29-xen0 (stressed ext2 filesystem) over iSCSI
   Result: crashed after 15 min

I managed to capture the error:
Unable to handle kernel NULL pointer dereference at 00000000000000e8 RIP:
<ffffffff88009a1e>{:bnx2:bnx2_poll+231}
PGD 1f4d7067 PUD 1f613067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: xt_physdev iptable_filter ip_tables x_tables bridge 8021q netloop ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc crc32c libcrc32c iscsi_tcp libiscsi scsi_transport_iscsi joydev tsdev binfmt_misc dm_mirror dm_mod usb_storage video thermal processor usbmouse usbhid usbkbd fan container button battery ac uhci_hcd ehci_hcd usbcore hw_random e1000 bnx2 piix ide_generic
Pid: 0, comm: swapper Not tainted 2.6.16.29-xen0 #3
RIP: e030:[<ffffffff88009a1e>] <ffffffff88009a1e>{:bnx2:bnx2_poll+231}
RSP: e02b:ffffffff80503de8  EFLAGS: 00010286
RAX: 000000000000c9f8 RBX: ffff880017778e30 RCX: ffff880014eee000
RDX: 0000000000000001 RSI: 000000000000c9f7 RDI: 00000000000000e3
RBP: 0000000000000000 R08: 0000000100215d2c R09: 000000000000002c
R10: 0000000000000200 R11: 0000000000000246 R12: 000000000000c9e3
R13: 0000000100215d29 R14: ffff88001e15ed00 R15: 0000000000000000
FS:  00002b02608bf360(0000) GS:ffffffff804b3000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff804ca000, task ffffffff80428bc0)
Stack: 0000000000000001 0000000000000001 0000000000000bf0 ffff88001877abf0
       00000000000000d0 ffffffff80111c93 0000000000000bf0 ffffffff8800d0c3
       ffff880000000002 ffffffff00000000
Call Trace: <IRQ> <ffffffff80111c93>{dma_map_page+43}
       <ffffffff8800d0c3>{:bnx2:bnx2_start_xmit+801} <ffffffff803548be>{net_rx_action+230}
       <ffffffff801325d6>{__do_softirq+114} <ffffffff8010bac6>{call_softirq+30}
       <ffffffff8010d575>{do_softirq+71} <ffffffff8010d3ed>{do_IRQ+63}
       <ffffffff802f6b82>{evtchn_do_upcall+192} <ffffffff8010b5f6>{do_hypervisor_callback+30} <EOI>
       <ffffffff801073aa>{hypercall_page+938} <ffffffff801073aa>{hypercall_page+938}
       <ffffffff8010f702>{safe_halt+132} <ffffffff80108d77>{xen_idle+106}
       <ffffffff80108e36>{cpu_idle+171} <ffffffff804cd77a>{start_kernel+488}
       <ffffffff804cd223>{_sinittext+547}

Code: 48 8b 85 e8 00 00 00 66 83 78 06 00 74 25 0f b7 40 04 41 8d
RIP <ffffffff88009a1e>{:bnx2:bnx2_poll+231} RSP <ffffffff80503de8>
CR2: 00000000000000e8
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
 (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
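
As a rough cross-check of the trace above, the small script below decodes the first instruction on the Code: line and compares the address it would access with CR2. This is only a sketch under two assumptions: that the Code: bytes start at the faulting RIP, and that handling the single encoding seen here (REX.W 8B with a 32-bit displacement) is enough. The helper names are my own and the excerpt is copied from the oops.

```python
import re

# Excerpt of the oops above: only the lines the check needs.
OOPS = """\
RAX: 000000000000c9f8 RBX: ffff880017778e30 RCX: ffff880014eee000
RDX: 0000000000000001 RSI: 000000000000c9f7 RDI: 00000000000000e3
RBP: 0000000000000000 R08: 0000000100215d2c R09: 000000000000002c
Code: 48 8b 85 e8 00 00 00 66 83 78 06 00 74 25 0f b7 40 04 41 8d
CR2: 00000000000000e8
"""

# x86-64 register numbers 0..7 as used in the ModRM byte (no REX.R/REX.B).
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def parse_registers(text):
    """Return {name: value} for every 'NAME: <16 hex digits>' pair."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name.lower(): int(value, 16) for name, value in pairs}


def decode_mov_load(code):
    """Decode 'REX.W 8B /r' with a base+disp32 operand (ModRM mod=10).

    Only this one encoding is handled; anything else raises ValueError.
    """
    rex, opcode, modrm = code[0], code[1], code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if rex != 0x48 or opcode != 0x8B or mod != 0b10 or rm == 0b100:
        raise ValueError("unsupported encoding")
    disp = int.from_bytes(code[3:7], "little", signed=True)
    return REG_NAMES[reg], REG_NAMES[rm], disp


regs = parse_registers(OOPS)
code = bytes.fromhex(re.search(r"Code: ([0-9a-f ]+)", OOPS).group(1))
dest, base, disp = decode_mov_load(code)
fault_addr = (regs[base] + disp) & 0xFFFFFFFFFFFFFFFF

print(f"mov {disp:#x}(%{base}),%{dest}")
print(f"computed address {fault_addr:#x}, CR2 {regs['cr2']:#x}")
```

On this trace the computed address and CR2 agree (both 0xe8, with RBP = 0), which would be consistent with bnx2_poll reading a field at offset 0xe8 through a NULL pointer held in RBP. Which structure that pointer belongs to cannot be told from the oops alone.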

Luis Vinay
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel