
[Xen-users] Xen Dom0 crash doing some I/O with "Out of SW-IOMMU space"



Hi,
My dom0 crashes while doing I/O on the local hard drive.
* System is a Dell PowerEdge R710 with a Perc H200 controller ("mpt2sas") /
  96 GB RAM / 2x Xeon X5650
* Hard drives are configured as RAID1.
* OS is Debian Squeeze with
  * Xen version 4.0.1 (Debian 4.0.1-1) - amd64 (Xen option: dom0_mem=512M)
  * Dom0 kernel (distribution kernel): 2.6.32-5-xen-686 (no special options)
* After doing some moderate I/O on the local RAID1 with "dd if=/dev/zero
  of=bigfile bs=1024 count=100000", the system crashes.
* Strange: if the RAID1 is degraded, the system does not crash, even when
  doing I/O over the whole hard drive - perhaps because with both mirror
  halves active every write gets DMA-mapped twice, roughly doubling the
  pressure on the bounce-buffer pool (see the check sketched below this
  list).
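
The "Out of SW-IOMMU space" message suggests the SWIOTLB bounce-buffer
pool in dom0 is being exhausted. As a quick sanity check (a minimal
sketch; the exact wording of the boot message may vary between kernel
versions), the size of the pool dom0 actually got can be read from the
boot log:

  # size/location of the bounce-buffer pool allocated at boot
  dmesg | grep -i 'software IO TLB'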

Does anyone have an idea how to fix or work around this "bug"? In the
meantime I tested different settings, without any success:
- VT-d enabled/disabled (in the BIOS, and with iommu=1)
- dom0_mem=512M (my default) and different other values
- a modified swiotlb size
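
One more variation of the swiotlb experiment (a sketch, not verified on
this kernel: the upstream swiotlb= parameter counts 2 KB slabs, while
some Xen-patched kernels take a size instead, so please check
Documentation/kernel-parameters.txt for 2.6.32-5-xen-686) would be to
enlarge the bounce-buffer pool well beyond the 64 MB default on the dom0
kernel line, e.g. in /boot/grub/menu.lst (or via GRUB_CMDLINE_LINUX in
/etc/default/grub plus update-grub on GRUB 2):

  # append to the existing dom0 kernel line
  # 131072 slabs * 2 KB = 256 MB of bounce buffers
  kernel /vmlinuz-2.6.32-5-xen-686 [existing options] swiotlb=131072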

The last lines the kernel reports before the crash:
[ 5822.499666] mpt2sas 0000:03:00.0: DMA: Out of SW-IOMMU space for 65536 bytes.
[ 5822.499743] BUG: unable to handle kernel NULL pointer dereference at 00000008
[ 5822.499919] IP: [<e09a10a4>] _scsih_qcmd+0x412/0x4d0 [mpt2sas]
[ 5822.500024] *pdpt = 0000000001466007 *pde = 0000000000000000 
[ 5822.500147] Oops: 0000 [#1] SMP 
[ 5822.500269] last sysfs file: /sys/devices/virtual/block/md0/md/mismatch_cnt
[ 5822.500330] Modules linked in: netconsole configfs xen_evtchn xenfs fuse 
8021q garp bridge stp reiserfs loop snd_pcm snd_timer ioatdma snd soundcore 
snd_page_alloc psmouse dca dcdbas serio_raw evdev processor button power_meter 
pcspkr joydev acpi_processor ext3 jbd mbcache dm_mod raid1 md_mod sg sr_mod 
sd_mod cdrom crc_t10dif usbhid hid usb_storage uhci_hcd mpt2sas ehci_hcd 
scsi_transport_sas usbcore nls_base scsi_mod bnx2 thermal thermal_sys [last 
unloaded: netconsole]
[ 5822.502221] 
[ 5822.502272] Pid: 442, comm: md0_raid1 Not tainted (2.6.32-5-xen-686 #1) 
PowerEdge R710
[ 5822.502348] EIP: 0061:[<e09a10a4>] EFLAGS: 00010002 CPU: 1
[ 5822.502406] EIP is at _scsih_qcmd+0x412/0x4d0 [mpt2sas]
[ 5822.502462] EAX: dd9ba344 EBX: 00000009 ECX: e099b05d EDX: 14000000
[ 5822.502520] ESI: 00000000 EDI: dd145b30 EBP: 0000000f ESP: dd5efd64
[ 5822.502615]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
[ 5822.502679] Process md0_raid1 (pid: 442, ti=dd5ee000 task=c1f4f2c0 
task.ti=dd5ee000)
[ 5822.502754] Stack:
[ 5822.502804]  000000b6 dd9ba344 c1dde400 d5000000 94000000 fffffff1 bf145b00 
00000000
[ 5822.503086] <0> dd105b00 14000000 dd145b00 c1dde000 dada6240 dd9ba000 
dd9b0228 e096597b
[ 5822.503442] <0> dd0a0f90 c1dde000 de50f560 dd9ba000 e096a33c dd0a0f90 
c1dde0b0 dada6240
[ 5822.503844] Call Trace:
[ 5822.503907]  [<e096597b>] ? scsi_dispatch_cmd+0x179/0x1e5 [scsi_mod]
[ 5822.503971]  [<e096a33c>] ? scsi_request_fn+0x343/0x47a [scsi_mod]
[ 5822.504032]  [<c1131da3>] ? __generic_unplug_device+0x23/0x25
[ 5822.504091]  [<c11323a4>] ? __make_request+0x364/0x3d9
[ 5822.505487]  [<c107655b>] ? rcu_process_callbacks+0x33/0x39
[ 5822.505546]  [<c103c4f6>] ? __do_softirq+0x128/0x151
[ 5822.505605]  [<c1005fb4>] ? xen_force_evtchn_callback+0xc/0x10
[ 5822.505663]  [<c1130f81>] ? generic_make_request+0x266/0x2b4
[ 5822.505723]  [<e08f3d12>] ? flush_pending_writes+0x58/0x74 [raid1]
[ 5822.505783]  [<e08f3df3>] ? raid1d+0x61/0xccc [raid1]
[ 5822.505842]  [<c1007c85>] ? __switch_to+0x124/0x141
[ 5822.505900]  [<c1032342>] ? finish_task_switch+0x3c/0x95
[ 5822.505958]  [<c128d196>] ? schedule+0x78f/0x7dc
[ 5822.506015]  [<c1005fb4>] ? xen_force_evtchn_callback+0xc/0x10
[ 5822.506074]  [<c10066d3>] ? xen_restore_fl_direct_end+0x0/0x1
[ 5822.506133]  [<c128e2f9>] ? _spin_unlock_irqrestore+0xd/0xf
[ 5822.506192]  [<c104241a>] ? try_to_del_timer_sync+0x4f/0x56
[ 5822.506251]  [<c104242b>] ? del_timer_sync+0xa/0x14
[ 5822.506308]  [<c128d512>] ? schedule_timeout+0x89/0xb0
[ 5822.506365]  [<c10424d3>] ? process_timeout+0x0/0x5
[ 5822.506424]  [<c1005fb4>] ? xen_force_evtchn_callback+0xc/0x10
[ 5822.506483]  [<c10066dc>] ? check_events+0x8/0xc
[ 5822.506542]  [<e0acd050>] ? md_thread+0xe1/0xf8 [md_mod]
[ 5822.506601]  [<c104b0ea>] ? autoremove_wake_function+0x0/0x2d
[ 5822.506661]  [<e0accf6f>] ? md_thread+0x0/0xf8 [md_mod]
[ 5822.506718]  [<c104aeb8>] ? kthread+0x61/0x66
[ 5822.506774]  [<c104ae57>] ? kthread+0x0/0x66
[ 5822.506830]  [<c1009a67>] ? kernel_thread_helper+0x7/0x10
[ 5822.506886] Code: 08 89 eb 8b 7c 24 28 eb 48 8b 7c 24 28 e9 a9 00 00 00 8b 
44 24 04 83 fb 01 8b 88 14 02 00 00 75 06 8b 54 24 10 eb 04 8b 54 24 24 <0b> 56 
08 89 f8 ff 76 10 4b ff 76 0c ff d1 58 89 f0 5a e8 7d 4c 
[ 5822.509030] EIP: [<e09a10a4>] _scsih_qcmd+0x412/0x4d0 [mpt2sas] SS:ESP 
0069:dd5efd64
[ 5822.509183] CR2: 0000000000000008
[ 5822.509238] ---[ end trace 3c25d9a65cc7a879 ]---


Regards,
        Ulli

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
