
Re: [Xen-API] XCP 1.0-beta: md on dom0 kernel causes kernel oops



Hmm. I'm not too sure about kernel issues myself; Simon Rowe (cc'd) may be
able to help.

Jon


On 29 Nov 2010, at 07:24, "Tomoe Sugihara" <sugihara@xxxxxxxxxxxxx> wrote:

> Hi,
>
> I'm testing XCP 1.0-beta with VastSky. However, I'm facing a serious issue.
>
> I need to use the md driver in the dom0 kernel for VastSky. However, the
> kernel oopses with a NULL pointer dereference, as shown in the call stack
> below. The issue is 100% reproducible just by creating an md device and
> performing I/O on it.
>
> I found that the dom0 custom code in bio_fs_destructor() sets the variable
> that bio_free() dereferences.
>
> Let me know if more information is needed to investigate the issue.
>
> --------- excerpt from /var/crash/20101129-155406-JST/domain0.log
>
>
>        <6>device-mapper: multipath round-robin: version 1.0.0 loaded
>        <6>md: bind<dm-4>
>        <6>md: bind<dm-2>
>        <6>md: bind<dm-5>
>        <6>md: raid1 personality registered for level 1
>        <6>raid1: raid set md127 active with 3 out of 3 mirrors
>        <6>md127: bitmap file is out of date (0 < 1) -- forcing full recovery
>        <6>md127: bitmap file is out of date, doing full recovery
>        <1>BUG: unable to handle kernel NULL pointer dereference at 00000004
>        <1>IP: [<c01b9aec>] bio_free+0x2c/0x50
>        <4>*pdpt = 0000000054744027 *pde = 0000000000000000
>        <0>Oops: 0000 [#1] SMP
>        <0>last sysfs file: /sys/class/net/lo/carrier
>        <4>Modules linked in: raid1 dm_round_robin iscsi_tcp libiscsi_tcp 
> libiscsi scsi_transport_iscsi dm_snapshot dm_multipath scsi_dh lockd sunrpc 
> bridge stp llc binfmt_misc dm_mirror video output sbs sbshc fan battery ac 
> parport_pc lp parport nvram evdev container usbhid sg thermal button 
> processor thermal_sys sr_mod cdrom tg3 e1000e serio_raw 8250_pnp 8250 
> rtc_cmos serial_core rtc_core rtc_lib tpm_tis i2c_i801 tpm tpm_bios i2c_core 
> pcspkr dm_region_hash dm_log dm_mod ide_gd_mod pata_acpi ata_piix ata_generic 
> libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore fbcon font 
> tileblit bitblit softcursor [last unloaded: microcode]
>        <4>
>        <4>Pid: 10495, comm: kdmflush Not tainted 
> (2.6.32.12-0.7.1.xs1.0.0.298.170582xen #1) ProLiant ML110 G5
>        <4>EIP: 0061:[<c01b9aec>] EFLAGS: 00010246 CPU: 0
>        <4>EIP is at bio_free+0x2c/0x50
>        <4>EAX: ed6dec0c EBX: ed6debc0 ECX: ee9c1dc0 EDX: ed6dec0c
>        <4>ESI: 00000000 EDI: c76cd080 EBP: ed991ef4 ESP: ed991eec
>        <4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
>        <0>Process kdmflush (pid: 10495, ti=ed990000 task=eeb4a830 
> task.ti=ed990000)
>        <0>Stack:
>        <4> ed442800 ed6debc0 ed991efc c01b9b1b ed991f04 c01b83b5 ed991f24 
> c02e0213
>        <4><0> c76cd080 ed6dea40 ed991f1c c76cd080 ed6dea40 ed6debc0 ed991f44 
> c02e196f
>        <4><0> 00100100 00000000 ed442800 c02e1930 ed6debc0 e9baa67c ed991f50 
> c01b82c0
>        <0>Call Trace:
>        <4> [<c01b9b1b>] ? bio_fs_destructor+0xb/0x10
>        <4> [<c01b83b5>] ? bio_put+0x25/0x30
>        <4> [<c02e0213>] ? super_written+0x53/0xa0
>        <4> [<c02e196f>] ? super_written_barrier+0x3f/0xb0
>        <4> [<c02e1930>] ? super_written_barrier+0x0/0xb0
>        <4> [<c01b82c0>] ? bio_endio+0x20/0x40
>        <4> [<f04789f9>] ? dm_wq_work+0x79/0x1f0 [dm_mod]
>        <4> [<c013d7b2>] ? worker_thread+0xf2/0x240
>        <4> [<c011ea68>] ? __wake_up_common+0x48/0x70
>        <4> [<f0478980>] ? dm_wq_work+0x0/0x1f0 [dm_mod]
>        <4> [<c0140730>] ? autoremove_wake_function+0x0/0x50
>        <4> [<c013d6c0>] ? worker_thread+0x0/0x240
>        <4> [<c0140474>] ? kthread+0x74/0x80
>        <4> [<c0140400>] ? kthread+0x0/0x80
>        <4> [<c010480b>] ? kernel_thread_helper+0x7/0x10
>        <0>Code: 89 e5 83 ec 08 89 1c 24 89 c3 89 74 24 04 89 d6 8b 50 38 85 
> d2 74 14 8d 40 4c 39 c2 74 0d 8b 4b 10 89 f0 c1 e9 1c e8 a4 ff ff ff <2b> 5e 
> 04 8b 56 08 89 d8 e8 97 99 fa ff 8b 1c 24 8b 74 24 04 89
>        <0>EIP: [<c01b9aec>] bio_free+0x2c/0x50 SS:ESP 0069:ed991eec
>        <0>CR2: 0000000000000004
>        <1>BUG: unable to handle kernel NULL pointer dereference at 00000004
>        <1>IP: [<c01b9aec>] bio_free+0x2c/0x50
>        <4>*pdpt = 000000004f721007 *pde = 0000000000000000
>        <0>Oops: 0000 [#2] SMP
>        <0>last sysfs file: /sys/class/net/lo/carrier
>        <4>Modules linked in: raid1 dm_round_robin iscsi_tcp libiscsi_tcp 
> libiscsi scsi_transport_iscsi dm_snapshot dm_multipath scsi_dh lockd sunrpc 
> bridge stp llc binfmt_misc dm_mirror video output sbs sbshc fan battery ac 
> parport_pc lp parport nvram evdev container usbhid sg thermal button 
> processor thermal_sys sr_mod cdrom
>        <4>---[ end trace e8caf3b7a56e7eff ]---
>        <4> tg3 e1000e serio_raw 8250_pnp 8250 rtc_cmos serial_core rtc_core 
> rtc_lib tpm_tis i2c_i801 tpm tpm_bios i2c_core pcspkr dm_region_hash dm_log 
> dm_mod ide_gd_mod pata_acpi ata_piix ata_generic libata sd_mod scsi_mod ext3 
> jbd uhci_hcd ohci_hcd ehci_hcd usbcore fbcon font tileblit bitblit softcursor 
> [last unloaded: microcode]
>        <4>
>        <4>Pid: 10661, comm: kdmflush Tainted: G      D    
> (2.6.32.12-0.7.1.xs1.0.0.298.170582xen #1) ProLiant ML110 G5
>        <4>EIP: 0061:[<c01b9aec>] EFLAGS: 00010246 CPU: 1
>        <4>EIP is at bio_free+0x2c/0x50
>        <4>EAX: c7788c0c EBX: c7788bc0 ECX: ee9c1dc0 EDX: c7788c0c
>        <4>ESI: 00000000 EDI: c76cd180 EBP: c6581ef4 ESP: c6581eec
>        <4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
>        <0>Process kdmflush (pid: 10661, ti=c6580000 task=ee2ca0b0 
> task.ti=c6580000)
>        <0>Stack:
>        <4> ed442800 c7788bc0 c6581efc c01b9b1b c6581f04 c01b83b5 c6581f24 
> c02e0213
>        <4><0> c76cd180 c7788440 c6581f1c c76cd180 c7788440 c7788bc0 c6581f44 
> c02e196f
>        <4><0> 00100100 00000000 ed442800 c02e1930 c7788bc0 ed70b47c c6581f50 
> c01b82c0
>        <0>Call Trace:
>        <4> [<c01b9b1b>] ? bio_fs_destructor+0xb/0x10
>        <4> [<c01b83b5>] ? bio_put+0x25/0x30
>        <4> [<c02e0213>] ? super_written+0x53/0xa0
>        <4> [<c02e196f>] ? super_written_barrier+0x3f/0xb0
>        <4> [<c02e1930>] ? super_written_barrier+0x0/0xb0
>        <4> [<c01b82c0>] ? bio_endio+0x20/0x40
>        <4> [<f04789f9>] ? dm_wq_work+0x79/0x1f0 [dm_mod]
>        <4> [<c013d7b2>] ? worker_thread+0xf2/0x240
>        <4> [<c011ea68>] ? __wake_up_common+0x48/0x70
>        <4> [<f0478980>] ? dm_wq_work+0x0/0x1f0 [dm_mod]
>        <4> [<c0140730>] ? autoremove_wake_function+0x0/0x50
>        <4> [<c013d6c0>] ? worker_thread+0x0/0x240
>        <4> [<c0140474>] ? kthread+0x74/0x80
>        <4> [<c0140400>] ? kthread+0x0/0x80
>        <4> [<c010480b>] ? kernel_thread_helper+0x7/0x10
>        <0>Code: 89 e5 83 ec 08 89 1c 24 89 c3 89 74 24 04 89 d6 8b 50 38 85 
> d2 74 14 8d 40 4c 39 c2 74 0d 8b 4b 10 89 f0 c1 e9 1c e8 a4 ff ff ff <2b> 5e 
> 04 8b 56 08 89 d8 e8 97 99 fa ff 8b 1c 24 8b 74 24 04 89
>        <0>EIP: [<c01b9aec>] bio_free+0x2c/0x50 SS:ESP 0069:c6581eec
>        <0>CR2: 0000000000000004
>        <4>---[ end trace e8caf3b7a56e7f00 ]---
>        <1>BUG: unable to handle kernel NULL pointer dereference at 00000004
>        <1>IP: [<c01b9aec>] bio_free+0x2c/0x50
>        <4>*pdpt = 000000005044b007 *pde = 0000000000000000
>        <0>Oops: 0000 [#3] SMP
>        <6>md127: bitmap initialized from disk: read 1/1 pages, set 161 bits
>        <6>created bitmap (1 pages) for device md127
>        <0>last sysfs file: /sys/class/net/lo/carrier
>        <4>Modules linked in: raid1 dm_round_robin iscsi_tcp libiscsi_tcp 
> libiscsi scsi_transport_iscsi dm_snapshot dm_multipath scsi_dh lockd sunrpc 
> bridge stp llc binfmt_misc dm_mirror video output sbs sbshc fan battery ac 
> parport_pc lp parport nvram evdev container usbhid sg thermal button 
> processor thermal_sys sr_mod cdrom tg3 e1000e serio_raw 8250_pnp 8250 
> rtc_cmos serial_core rtc_core rtc_lib tpm_tis i2c_i801 tpm tpm_bios i2c_core 
> pcspkr dm_region_hash dm_log dm_mod ide_gd_mod pata_acpi ata_piix ata_generic 
> libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore fbcon font 
> tileblit bitblit softcursor [last unloaded: microcode]
>        <4>
>        <4>Pid: 10647, comm: kdmflush Tainted: G      D    
> (2.6.32.12-0.7.1.xs1.0.0.298.170582xen #1) ProLiant ML110 G5
>        <4>EIP: 0061:[<c01b9aec>] EFLAGS: 00010246 CPU: 1
>        <4>EIP is at bio_free+0x2c/0x50
>        <4>EAX: ed6def0c EBX: ed6deec0 ECX: ed4429d8 EDX: ed6def0c
>        <4>ESI: 00000000 EDI: c76a8dc0 EBP: eda81ef4 ESP: eda81eec
>        <4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
>        <0>Process kdmflush (pid: 10647, ti=eda80000 task=ee88b470 
> task.ti=eda80000)
>        <0>Stack:
>        <4> ed442800 ed6deec0 eda81efc c01b9b1b eda81f04 c01b83b5 eda81f24 
> c02e0213
>        <4><0> 00000000 c77888c0 eda81f1c c76a8dc0 c77888c0 ed6deec0 eda81f44 
> c02e196f
>        <4><0> 00100100 00000000 ed442800 c02e1930 ed6deec0 c764887c eda81f50 
> c01b82c0
>        <0>Call Trace:
>        <4> [<c01b9b1b>] ? bio_fs_destructor+0xb/0x10
>        <4> [<c01b83b5>] ? bio_put+0x25/0x30
>        <4> [<c02e0213>] ? super_written+0x53/0xa0
>        <4> [<c02e196f>] ? super_written_barrier+0x3f/0xb0
>        <4> [<c02e1930>] ? super_written_barrier+0x0/0xb0
>        <4> [<c01b82c0>] ? bio_endio+0x20/0x40
>        <4> [<f04789f9>] ? dm_wq_work+0x79/0x1f0 [dm_mod]
>        <4> [<c013d7b2>] ? worker_thread+0xf2/0x240
>        <4> [<c011ea68>] ? __wake_up_common+0x48/0x70
>        <4> [<f0478980>] ? dm_wq_work+0x0/0x1f0 [dm_mod]
>        <4> [<c0140730>] ? autoremove_wake_function+0x0/0x50
>        <4> [<c013d6c0>] ? worker_thread+0x0/0x240
>        <4> [<c0140474>] ? kthread+0x74/0x80
>        <4> [<c0140400>] ? kthread+0x0/0x80
>        <4> [<c010480b>] ? kernel_thread_helper+0x7/0x10
>        <0>Code: 89 e5 83 ec 08 89 1c 24 89 c3 89 74 24 04 89 d6 8b 50 38 85 
> d2 74 14 8d 40 4c 39 c2 74 0d 8b 4b 10 89 f0 c1 e9 1c e8 a4 ff ff ff <2b> 5e 
> 04 8b 56 08 89 d8 e8 97 99 fa ff 8b 1c 24 8b 74 24 04 89
>        <0>EIP: [<c01b9aec>] bio_free+0x2c/0x50 SS:ESP 0069:eda81eec
>        <0>CR2: 0000000000000004
>        <4>---[ end trace e8caf3b7a56e7f01 ]---
>        <4>------------[ cut here ]------------
>        <4>WARNING: at arch/x86/mm/ioremap-xen.c:324 
> __ioremap_caller+0x3e7/0x450()
>        <4>Hardware name: ProLiant ML110 G5
>        <4>Modules linked in: raid1 dm_round_robin iscsi_tcp libiscsi_tcp 
> libiscsi scsi_transport_iscsi dm_snapshot dm_multipath scsi_dh lockd sunrpc 
> bridge stp llc binfmt_misc dm_mirror video output sbs sbshc fan battery ac 
> parport_pc lp parport nvram evdev container usbhid sg thermal button 
> processor thermal_sys sr_mod cdrom tg3 e1000e serio_raw 8250_pnp 8250 
> rtc_cmos serial_core rtc_core rtc_lib tpm_tis i2c_i801 tpm tpm_bios i2c_core 
> pcspkr dm_region_hash dm_log dm_mod ide_gd_mod pata_acpi ata_piix ata_generic 
> libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore fbcon font 
> tileblit bitblit softcursor [last unloaded: microcode]
>        <4>Pid: 24169, comm: hexdump Tainted: G      D    
> 2.6.32.12-0.7.1.xs1.0.0.298.170582xen #1
>        <4>Call Trace:
>        <4> [<c0114747>] ? __ioremap_caller+0x3e7/0x450
>        <4> [<c012bccc>] warn_slowpath_common+0x7c/0xa0
>        <4> [<c0114747>] ? __ioremap_caller+0x3e7/0x450
>        <4> [<c012bd05>] warn_slowpath_null+0x15/0x20
>        <4> [<c0114747>] __ioremap_caller+0x3e7/0x450
>        <4> [<c01452a9>] ? sched_clock_local+0xc9/0x1a0
>        <4> [<c02a1bce>] ? read_mem+0x7e/0xe0
>        <4> [<c011481a>] ioremap_nocache+0x1a/0x20
>        <4> [<c02a1bce>] ? read_mem+0x7e/0xe0
>        <4> [<c02a1bce>] read_mem+0x7e/0xe0
>        <4> [<c0193214>] vfs_read+0x94/0x150
>        <4> [<c02a1b50>] ? read_mem+0x0/0xe0
>        <4> [<c01936ad>] sys_read+0x3d/0x70
>        <4> [<c01044e1>] syscall_call+0x7/0xb
>        <4>---[ end trace e8caf3b7a56e7f02 ]---
>        <6>md: bind<sdc>
>        <6>md: bind<sdd>
>        <5>raid1: md1 is not clean -- starting background reconstruction
>        <6>raid1: raid set md1 active with 2 out of 2 mirrors
>        <6>md1: detected capacity change from 0 to 2000398843904
>        <6>md: resync of RAID array md1
>        <6>md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
>        <6>md: using maximum available idle IO bandwidth (but not more than 
> 200000 KB/sec) for resync.
>        <6>md: using 128k window, over a total of 1953514496 blocks.
>        <6> md1:
>        <1>BUG: unable to handle kernel NULL pointer dereference at 00000004
>        <1>IP: [<c01b9aec>] bio_free+0x2c/0x50
>        <4>*pdpt = 00000000546b4027 *pde = 0000000000000000
>        <0>Oops: 0000 [#4] SMP
>        <0>last sysfs file: /sys/class/net/lo/carrier
>        <4>Modules linked in: raid1 dm_round_robin iscsi_tcp libiscsi_tcp 
> libiscsi scsi_transport_iscsi dm_snapshot dm_multipath scsi_dh lockd sunrpc 
> bridge stp llc binfmt_misc dm_mirror video output sbs sbshc fan battery ac 
> parport_pc lp parport nvram evdev container usbhid sg thermal button 
> processor thermal_sys sr_mod cdrom tg3 e1000e serio_raw 8250_pnp 8250 
> rtc_cmos serial_core rtc_core rtc_lib tpm_tis i2c_i801 tpm tpm_bios i2c_core 
> pcspkr dm_region_hash dm_log dm_mod ide_gd_mod pata_acpi ata_piix ata_generic 
> libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore fbcon font 
> tileblit bitblit softcursor [last unloaded: microcode]
>        <4>
>        <4>Pid: 0, comm: swapper Tainted: G      D W  
> (2.6.32.12-0.7.1.xs1.0.0.298.170582xen #1) ProLiant ML110 G5
>        <4>EIP: 0061:[<c01b9aec>] EFLAGS: 00010246 CPU: 1
>        <4>EIP is at bio_free+0x2c/0x50
>        <4>EAX: c766e90c EBX: c766e8c0 ECX: ed756314 EDX: c766e90c
>        <4>ESI: 00000000 EDI: ed7562c0 EBP: ee863cbc ESP: ee863cb4
>        <4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
>        <0>Process swapper (pid: 0, ti=ee862000 task=ee83e3f0 task.ti=ee862000)
>        <0>Stack:
>        <4> ee270c70 00000000 ee863cc4 c01b9b1b ee863ccc c01b83b5 ee863ce4 
> f1313ee0
>        <4><0> ee270c40 00000001 ed7562c0 ee270c40 ee863d34 f1315670 d48f1d32 
> 0000146f
>        <4><0> 52b86e59 00000a7b 00000000 00000000 6777d963 000009f3 29edf241 
> 00000000
>        <0>Call Trace:
>        <4> [<c01b9b1b>] ? bio_fs_destructor+0xb/0x10
>        <4> [<c01b83b5>] ? bio_put+0x25/0x30
>        <4> [<f1313ee0>] ? raid_end_bio_io+0x70/0x90 [raid1]
>        <4> [<f1315670>] ? raid1_end_read_request+0x50/0x120 [raid1]
>        <4> [<c01452a9>] ? sched_clock_local+0xc9/0x1a0
>        <4> [<f1315620>] ? raid1_end_read_request+0x0/0x120 [raid1]
>        <4> [<c01b82c0>] ? bio_endio+0x20/0x40
>        <4> [<c020f15c>] ? req_bio_endio+0x5c/0xd0
>        <4> [<c020f25e>] ? blk_update_request+0x8e/0x390
>        <4> [<c020f576>] ? blk_update_bidi_request+0x16/0x60
>        <4> [<c0210086>] ? blk_end_bidi_request+0x26/0x70
>        <4> [<c02100e2>] ? blk_end_request+0x12/0x20
>        <4> [<f036613c>] ? scsi_io_completion+0x9c/0x480 [scsi_mod]
>        <4> [<f0365cfc>] ? scsi_device_unbusy+0x8c/0xc0 [scsi_mod]
>        <4> [<f035f64d>] ? scsi_finish_command+0x9d/0x100 [scsi_mod]
>        <4> [<f036319e>] ? scsi_decide_disposition+0x15e/0x170 [scsi_mod]
>        <4> [<f036661d>] ? scsi_softirq_done+0xfd/0x130 [scsi_mod]
>        <4> [<c0135abd>] ? run_timer_softirq+0x1d/0x200
>        <4> [<c021579a>] ? trigger_softirq+0x8a/0xa0
>        <4> [<c0215818>] ? blk_done_softirq+0x68/0x80
>        <4> [<c01311aa>] ? __do_softirq+0xba/0x180
>        <4> [<c01591d7>] ? handle_IRQ_event+0x37/0x100
>        <4> [<c015c2f4>] ? move_native_irq+0x14/0x50
>        <4> [<c01312e5>] ? do_softirq+0x75/0x80
>        <4> [<c01315cb>] ? irq_exit+0x2b/0x40
>        <4> [<c0298817>] ? evtchn_do_upcall+0x1e7/0x330
>        <4> [<c01046ef>] ? hypervisor_callback+0x43/0x4b
>        <4> [<c0107035>] ? xen_safe_halt+0xb5/0x150
>        <4> [<c010ac7e>] ? xen_idle+0x1e/0x50
>        <4> [<c0102a7b>] ? cpu_idle+0x3b/0x60
>        <4> [<c037b00d>] ? cpu_bringup_and_idle+0xd/0x10
>        <0>Code: 89 e5 83 ec 08 89 1c 24 89 c3 89 74 24 04 89 d6 8b 50 38 85 
> d2 74 14 8d 40 4c 39 c2 74 0d 8b 4b 10 89 f0 c1 e9 1c e8 a4 ff ff ff <2b> 5e 
> 04 8b 56 08 89 d8 e8 97 99 fa ff 8b 1c 24 8b 74 24 04 89
>        <0>EIP: [<c01b9aec>] bio_free+0x2c/0x50 SS:ESP 0069:ee863cb4
>        <0>CR2: 0000000000000004
>
>
>
> --
> Best,
> Tomoe
>
> _______________________________________________
> xen-api mailing list
> xen-api@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/mailman/listinfo/xen-api

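For triage, the fields that matter in each oops quoted above are the fault address, the faulting function, and its callers. Below is a minimal Python sketch that pulls them out of a log in this format. The function name summarize_oops and the trimmed SAMPLE text are illustrative choices, not part of the original report, and the regexes assume only the line layout visible in the excerpt.

```python
import re

# Sample lines copied from the first oops in the quoted domain0.log excerpt.
SAMPLE = """\
<1>BUG: unable to handle kernel NULL pointer dereference at 00000004
<1>IP: [<c01b9aec>] bio_free+0x2c/0x50
<4>ESI: 00000000 EDI: c76cd080 EBP: ed991ef4 ESP: ed991eec
<4> [<c01b9b1b>] ? bio_fs_destructor+0xb/0x10
<4> [<c01b83b5>] ? bio_put+0x25/0x30
<0>CR2: 0000000000000004
"""


def summarize_oops(text):
    """Return the fault address, faulting symbol/offset, and call-trace symbols."""
    fault = re.search(r"NULL pointer dereference at ([0-9a-f]+)", text)
    ip = re.search(r"IP: \[<[0-9a-f]+>\] (\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+", text)
    # Call-trace entries look like "<4> [<addr>] ? symbol+0xOFF/0xLEN";
    # the "?" marker is optional.
    callers = re.findall(r"^<\d> \[<[0-9a-f]+>\] \?? ?(\w+)\+0x", text, re.M)
    return {
        "fault_addr": int(fault.group(1), 16) if fault else None,
        "function": ip.group(1) if ip else None,
        "offset": int(ip.group(2), 16) if ip else None,
        "callers": callers,
    }


summary = summarize_oops(SAMPLE)
print(summary)
```

All four bio_free oopses in the excerpt report the same fault address (00000004) with ESI at zero, which is consistent with a read at offset 4 through a NULL pointer.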