
Re: [Xen-users] Debian Squeeze, xen, multipath and iscsi




Hi!

Thanks for the answer.
Below is the kernel message I am repeatedly getting in the log.
The system crashes only when xen, iscsi and multipath interact.
Again, the system is Debian Squeeze running on a Fujitsu PRIMERGY RX200 S4
with an 8-core Intel(R) Xeon(R) E5405 CPU @ 2.00GHz.
The iSCSI storage comes from an EMC cabinet.

Any help, please?

Agustin


Modules linked in: dm_round_robin scsi_dh_emc crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp xen_evtchn xenfs dm_multipath dm_mod scsi_dh loop i2c_i801 usbhid ioatdma dca hid shpchp i2c_cor
Feb 16 12:08:01 ariete kernel: [  358.633003] Pid: 1672, comm: dmsetup Tainted: 
G      D    2.6.32-5-xen-amd64 #1 PRIMERGY RX200 S4
Feb 16 12:08:01 ariete kernel: [  358.633003] RIP: e030:[<ffffffff8130cb16>]  
[<ffffffff8130cb16>] _spin_lock+0x13/0x1b
Feb 16 12:08:01 ariete kernel: [  358.633003] RSP: e02b:ffff8807dbccdb10  
EFLAGS: 00000297
Feb 16 12:08:01 ariete kernel: [  358.633003] RAX: 0000000000000022 RBX: 
ffff8807dbccdb28 RCX: ffff8807dbccdb68
Feb 16 12:08:01 ariete kernel: [  358.633003] RDX: 0000000000000021 RSI: 
0000000000000200 RDI: ffff8807dbe1c300
Feb 16 12:08:01 ariete kernel: [  358.633003] RBP: 0000000000000200 R08: 
0000000000000008 R09: ffffffff814eb870
Feb 16 12:08:01 ariete kernel: [  358.633003] R10: 000000000000000b R11: 
ffff8807dbe1c280 R12: ffff8807dbe1c280
Feb 16 12:08:01 ariete kernel: [  358.633003] R13: 000000000000c580 R14: 
ffff8807dbccdb28 R15: ffffffff814eb830
Feb 16 12:08:01 ariete kernel: [  358.633003] FS:  00007fe9c607a7a0(0000) 
GS:ffff8800280c7000(0000) knlGS:0000000000000000
Feb 16 12:08:01 ariete kernel: [  358.633003] CS:  e033 DS: 0000 ES: 0000 CR0: 
000000008005003b
Feb 16 12:08:01 ariete kernel: [  358.633003] CR2: 00007fe9c5803420 CR3: 
0000000001001000 CR4: 0000000000002660
Feb 16 12:08:01 ariete kernel: [  358.633003] DR0: 0000000000000000 DR1: 
0000000000000000 DR2: 0000000000000000
Feb 16 12:08:01 ariete kernel: [  358.633003] DR3: 0000000000000000 DR6: 
00000000ffff0ff0 DR7: 0000000000000400
Feb 16 12:08:01 ariete kernel: [  358.633003] Call Trace:
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff8100dd87>] ? 
xen_exit_mmap+0xf8/0x136
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff810d1208>] ? 
exit_mmap+0x5a/0x148
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff8104cb09>] ? 
mmput+0x3c/0xdf
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff81050702>] ? 
exit_mm+0x102/0x10d
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff8130ca72>] ? 
_spin_lock_irq+0x7/0x22
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff81052127>] ? 
do_exit+0x1f8/0x6c6
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8100ecdf>] ? 
xen_restore_fl_direct_end+0x0/0x1
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8130cb3a>] ? 
_spin_unlock_irqrestore+0xd/0xe
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8104f3af>] ? 
release_console_sem+0x17e/0x1af
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8130d9dd>] ? 
oops_end+0xaf/0xb4
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810135f0>] ? 
do_invalid_op+0x8b/0x95
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8100c694>] ? 
pin_pagetable_pfn+0x2d/0x36
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffffa01bb9ea>] ? 
copy_params+0x71/0xb1 [dm_mod]
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810baf07>] ? 
__alloc_pages_nodemask+0x11c/0x5f5
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8101293b>] ? 
invalid_op+0x1b/0x20
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8100c694>] ? 
pin_pagetable_pfn+0x2d/0x36
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8100c690>] ? 
pin_pagetable_pfn+0x29/0x36
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810cd4e2>] ? 
__pte_alloc+0x6b/0xc6
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810cb394>] ? 
pmd_alloc+0x28/0x5b
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810cd60b>] ? 
handle_mm_fault+0xce/0x80f
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810fbc5c>] ? 
do_vfs_ioctl+0x48d/0x4cb
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8130f016>] ? 
do_page_fault+0x2e0/0x2fc
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8130ceb5>] ? 
page_fault+0x25/0x30


On 11/02/2011 15:18, Henrik Langos wrote:
On Fri, Feb 11, 2011 at 02:35:53PM +0100, Agustin Lopez wrote:
Hi all!

I want to update my Debian Lenny Xen servers to Squeeze.
I am testing with a new install. Everything installs fine, but when I
install the multipath package I get a kernel crash.

I have searched a bit with Google but have not found any solution.

Is anybody on the list working with this configuration?

PS: If I boot my server without Xen, with a standard kernel, multipath
and everything else work fine.

What exactly crashes? dom0? Do you get a kernel dump on the console?

I have pretty much the same setup here (iSCSI + multipath + Xen +
Squeeze dom0 + Lenny/Etch PVM domUs) and I had some trouble with
multipath and iSCSI being a little touchy.

Basically my dom0 kernel hates to have fast iSCSI logout/login
sequences.

You'll have to give multipathd some time to cleanly remove
multipath devices before you do another login.
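As an illustration, here is a minimal shell sketch of that ordering. The map name `mpath0`, the 30-second timeout, and the 5-second settle delay are placeholder values, not from my setup; the `multipath` and `iscsiadm` invocations assume the stock Squeeze multipath-tools and open-iscsi packages.

```shell
#!/bin/sh
# Sketch: let multipathd finish tearing a map down before the next
# iSCSI logout/login cycle. "mpath0" and the timeouts are examples.

# Poll a command until it produces no output, for up to $2 seconds.
wait_quiet() {
    _cmd=$1 _left=$2
    while [ "$_left" -gt 0 ]; do
        [ -z "$($_cmd 2>/dev/null)" ] && return 0
        sleep 1
        _left=$((_left - 1))
    done
    return 1
}

# Only attempt the real sequence when the tools are present.
if command -v multipath >/dev/null && command -v iscsiadm >/dev/null; then
    multipath -f mpath0                       # flush the (example) map
    wait_quiet "multipath -ll mpath0" 30 \
        || echo "warning: map still present" >&2
    iscsiadm -m node -u                       # iSCSI logout
    sleep 5                                   # let the teardown settle...
    iscsiadm -m node -l                       # ...before logging back in
fi
```

The point is simply to serialize the teardown: flush the map, wait for multipathd to actually drop it, and only then cycle the session.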

Otherwise I get stuff like this, where kpartx (the thing that
manages the device nodes for partitions) triggers some
race condition:

Feb 10 06:46:43 xenhost03 kernel: [225060.039126] BUG: unable to handle kernel 
paging request at ffff88001558b010
Feb 10 06:46:43 xenhost03 kernel: [225060.039172] IP: [<ffffffff8100e428>] 
xen_set_pmd+0x15/0x2c
Feb 10 06:46:43 xenhost03 kernel: [225060.039210] PGD 1002067 PUD 1006067 PMD 
18a067 PTE 801000001558b065
Feb 10 06:46:43 xenhost03 kernel: [225060.039253] Oops: 0003 [#1] SMP
Feb 10 06:46:43 xenhost03 kernel: [225060.039284] last sysfs file: 
/sys/devices/virtual/block/dm-6/dm/suspended
Feb 10 06:46:43 xenhost03 kernel: [225060.039319] CPU 0
Feb 10 06:46:43 xenhost03 kernel: [225060.039344] Modules linked in: tun 
dm_round_robin crc32c xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state 
nf_conntrack xt_physdev iptable_filter ip_tables x_tables bridge stp xen_evtchn 
xenfs ib_iser rdma_cm ib_c
m iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi dm_multipath scsi_dh loop snd_hda_intel snd_hda_codec 
snd_hwdep snd_pcm i915 drm_kms_helper drm snd_timer i2c_i801 evdev parport_pc 
psmouse serio_raw pcspkr i2c_
algo_bit parport i2c_core snd soundcore video output snd_page_alloc button 
processor acpi_processor ext3 jbd mbcache dm_mod sd_mod crc_t10dif usbhid hid 
uhci_hcd ata_generic ata_piix libata ehci_hcd scsi_mod e1000e usbcore nls_base 
thermal thermal_sys
  [last unloaded: scsi_wait_scan]
Feb 10 06:46:43 xenhost03 kernel: [225060.039851] Pid: 9259, comm: kpartx_id 
Not tainted 2.6.32-5-xen-amd64 #1 To Be Filled By O.E.M.
Feb 10 06:46:43 xenhost03 kernel: [225060.039904] RIP: e030:[<ffffffff8100e428>]  
[<ffffffff8100e428>] xen_set_pmd+0x15/0x2c
Feb 10 06:46:43 xenhost03 kernel: [225060.039959] RSP: e02b:ffff880013ad3b18  
EFLAGS: 00010246
Feb 10 06:46:43 xenhost03 kernel: [225060.039990] RAX: 0000000000000000 RBX: 
ffff88001558b010 RCX: ffff880000000000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] RDX: ffffea0000000000 RSI: 
0000000001cc0000 RDI: ffff88001558b010
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] RBP: 0000000000000000 R08: 
0000000001cc0000 R09: ffff880073c03100
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] R10: 0000000000000000 R11: 
ffff88002ce3bd78 R12: 000000000061c000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] R13: 0000000000400000 R14: 
ffff88001558b010 R15: ffff88002e156000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] FS:  00007f5094f64700(0000) 
GS:ffff880003630000(0000) knlGS:0000000000000000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] CS:  e033 DS: 0000 ES: 0000 
CR0: 000000008005003b
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] CR2: ffff88001558b010 CR3: 
0000000011a2b000 CR4: 0000000000002660
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] DR0: 0000000000000000 DR1: 
0000000000000000 DR2: 0000000000000000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] DR3: 0000000000000000 DR6: 
00000000ffff0ff0 DR7: 0000000000000400
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] Process kpartx_id (pid: 9259, 
threadinfo ffff880013ad2000, task ffff880002747810)
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] Stack:
Feb 10 06:46:43 xenhost03 kernel: [225060.040006]  ffff880000000000 
0000000000600000 0000000000400000 ffffffff810cf886
Feb 10 06:46:43 xenhost03 kernel: [225060.040006]<0>  ffff880013ad3fd8 
0000000017ab2067 ffff880002159180 0000000000000000
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]<0>  0000000000000000 
000000000061bfff 000000000061bfff 0000000001c00000
Feb 10 06:46:43 xenhost03 kernel: [225060.040636] Call Trace:
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810cf886>] ? 
free_pgd_range+0x226/0x3bf
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810cfabb>] ? 
free_pgtables+0x9c/0xbd
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810d129d>] ? 
exit_mmap+0xef/0x148
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff8104cb09>] ? 
mmput+0x3c/0xdf
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810f44d6>] ? 
flush_old_exec+0x45c/0x548
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff811270d0>] ? 
load_elf_binary+0x0/0x1954
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff8112746d>] ? 
load_elf_binary+0x39d/0x1954
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810cc572>] ? 
follow_page+0x2ad/0x303
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810ce136>] ? 
__get_user_pages+0x3ea/0x47b
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810f4fcb>] ? 
get_arg_page+0x61/0x110
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff811270d0>] ? 
load_elf_binary+0x0/0x1954
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810f3caa>] ? 
search_binary_handler+0xb4/0x245
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810f54a7>] ? 
do_execve+0x1e4/0x2c3
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff81010500>] ? 
sys_execve+0x35/0x4c
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff81011f9a>] ? 
stub_execve+0x6a/0xc0
Feb 10 06:46:43 xenhost03 kernel: [225060.040636] Code: fb ff ff e8 c6 f4 01 00 bf 01 
00 00 00 e8 c9 ea ff ff 59 5e 5b c3 55 48 89 f5 53 48 89 fb 48 83 ec 08 e8 6e e3 ff 
ff 84 c0 75 08<48>  89 2b 41 59 5b 5d c3 41 58 48 89 df 48 89 ee 5b 5d e9 7e ff
Feb 10 06:46:43 xenhost03 kernel: [225060.042981] RIP  [<ffffffff8100e428>] 
xen_set_pmd+0x15/0x2c
Feb 10 06:46:43 xenhost03 kernel: [225060.042981]  RSP<ffff880013ad3b18>
Feb 10 06:46:43 xenhost03 kernel: [225060.042981] CR2: ffff88001558b010
Feb 10 06:46:43 xenhost03 kernel: [225060.042981] ---[ end trace 
9939eec096f5a2de ]---


Also I noticed dom0 lockups of more than a minute when starting
HVM domUs while another domU was creating heavy IO load.
Those only disappeared when I gave my dom0 a fixed amount of
RAM instead of ballooning it down.
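For reference, the fixed-dom0-memory setup can be sketched like this on a Squeeze dom0; 2048M is an example value, adjust it to your host:

```shell
# /etc/default/grub -- pin dom0's memory at boot (example value 2048M),
# so the hypervisor never hands this RAM out to domUs:
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=2048M"

# /etc/xen/xend-config.sxp -- stop xend from ballooning dom0 down:
#   (enable-dom0-ballooning no)
#   (dom0-min-mem 2048)

# Then run update-grub and reboot.
```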


Other than that I had no bad trouble. (Well, live migration of
Lenny 32-bit domUs on a 64-bit dom0 doesn't work because the
Lenny domU kernel is not good at that.)


I didn't do a new install of squeeze though. I started
with lenny and upgraded to squeeze.

cheers
-henrik


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users





 

