[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-bugs] [Bug 1371] New: On heavy IO load in domU, dom0 shows a kernel bug and domU hangs on 100% io wait
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1371 Summary: On heavy IO load in domU, dom0 shows a kernel bug and domU hangs on 100% io wait Product: Xen Version: unstable Platform: x86-64 OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Unspecified AssignedTo: xen-bugs@xxxxxxxxxxxxxxxxxxx ReportedBy: michael.wyraz@xxxxxxxxxxx System: Gentoo Xen: 3.3.0 Kernel: 2.6.21 The virtual disks are LVM on Raid5. I tried both, tap:aio and phy: to access the disks. The system worked well for a few weeks. but now i have the following error every nicht. During our nightly backup of domU the dom0 shows errors in the logfile. "top" in domU shows 100% io wait. I need to kill the domU and reboot the host to fix it. It happend with different domUs. >From the log: Oct 17 00:02:20 office-host ------------[ cut here ]------------ Oct 17 00:02:20 office-host kernel BUG at fs/direct-io.c:1113! Oct 17 00:02:20 office-host invalid opcode: 0000 [1] SMP Oct 17 00:02:20 office-host CPU 0 Oct 17 00:02:20 office-host Modules linked in: nfs lockd nfs_acl sunrpc bridge w83627ehf hwmon_vid hwmon i2c_isa autofs4 skge sk98lin i2c_i801 iTCO_wdt i2c_core iTCO_vendor_support floppy ehci_hcd pcspkr uhci_hcd Oct 17 00:02:20 office-host Pid: 7887, comm: tapdisk Not tainted 2.6.21-xen #5 Oct 17 00:02:20 office-host RIP: e030:[<ffffffff802b20e7>] [<ffffffff802b20e7>] __blockdev_direct_IO+0xa57/0xb70 Oct 17 00:02:20 office-host RSP: e02b:ffff880035475cd8 EFLAGS: 00010202 Oct 17 00:02:20 office-host RAX: 0000000000000001 RBX: ffff8800e64b8000 RCX: 0000000000000000 Oct 17 00:02:20 office-host RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000 Oct 17 00:02:20 office-host RBP: ffff8800e64b8324 R08: ffff880000ddc838 R09: ffff880035f014c0 Oct 17 00:02:20 office-host R10: ffff880001d4b910 R11: ffffffff80432840 R12: 0000000000000001 Oct 17 00:02:20 office-host R13: fffffffffffffdef R14: ffff8800e64b8088 R15: 0000000000000000 Oct 17 00:02:20 office-host FS: 00002b96dda23bb0(0000) GS:ffffffff8072d000(0000) knlGS:0000000000000000 Oct 17 00:02:20 office-host CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b Oct 17 00:02:20 office-host CR2: 0000000000000000 CR3: 0000000035476000 CR4: 0000000000002620 Oct 17 00:02:20 office-host Process tapdisk (pid: 7887, threadinfo ffff880035474000, task ffff8800cb2e8080) Oct 17 00:02:20 office-host Stack: 0000000000001000 0000005d5a262000 ffff8801ed00be38 ffff8801ee7c8a48 Oct 17 00:02:20 office-host 0000000c00000000 0000000000000000 ffff8801ed00be38 ffff8800099f6708 Oct 17 00:02:20 office-host ffff880000000001 ffff8801edd4f3c0 0000000cffffffff 0000000000000001 Oct 17 00:02:20 office-host Call Trace: Oct 17 00:02:20 office-host [<ffffffff802b0b95>] blkdev_direct_IO+0x45/0x50 Oct 17 00:02:20 office-host [<ffffffff802b0a90>] blkdev_get_blocks+0x0/0xc0 Oct 17 00:02:20 office-host [<ffffffff8025fb9f>] generic_file_direct_IO+0xef/0x150 Oct 17 00:02:20 office-host [<ffffffff802612fb>] generic_file_aio_read+0xcb/0x1b0 Oct 17 00:02:20 office-host [<ffffffff80261230>] generic_file_aio_read+0x0/0x1b0 Oct 17 00:02:20 office-host [<ffffffff802a168f>] aio_rw_vect_retry+0x7f/0x170 Oct 17 00:02:20 office-host [<ffffffff802a1610>] aio_rw_vect_retry+0x0/0x170 Oct 17 00:02:20 office-host [<ffffffff802a21d7>] aio_run_iocb+0xc7/0x160 Oct 17 00:02:20 office-host [<ffffffff802a2ef4>] io_submit_one+0x344/0x3e0 Oct 17 00:02:20 office-host [<ffffffff802a3555>] sys_io_submit+0xb5/0x130 Oct 17 00:02:20 office-host [<ffffffff80209ae6>] system_call+0x86/0x8b Oct 17 00:02:20 office-host [<ffffffff80209a60>] system_call+0x0/0x8b Oct 17 00:02:20 office-host Oct 17 00:02:20 office-host Oct 17 00:02:20 office-host Code: 0f 0b eb fe 4d 85 e4 75 1d 48 8b 74 24 08 44 89 ea 48 89 df Oct 17 00:02:20 office-host RIP [<ffffffff802b20e7>] __blockdev_direct_IO+0xa57/0xb70 Oct 17 00:02:20 office-host RSP <ffff880035475cd8> ... Oct 17 03:07:36 office-host eth0: port 4(tap5.0) entering disabled state Oct 17 03:07:36 office-host device tap5.0 left promiscuous mode Oct 17 03:07:36 office-host audit(1224205656.682:20): dev=tap5.0 prom=0 old_prom=256 auid=4294967295 Oct 17 03:07:36 office-host eth0: port 4(tap5.0) entering disabled state Oct 17 03:07:36 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/vfb/5/0 Oct 17 03:07:36 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/console/5/0 Oct 17 03:07:36 office-host eth0: port 5(vif5.0) entering disabled state Oct 17 03:07:36 office-host logger: /etc/xen/scripts/blktap: remove XENBUS_PATH=backend/tap/5/768 Oct 17 03:07:36 office-host logger: /etc/xen/scripts/vif-bridge: offline XENBUS_PATH=backend/vif/5/0 Oct 17 03:07:36 office-host device vif5.0 left promiscuous mode Oct 17 03:07:36 office-host audit(1224205656.898:21): dev=vif5.0 prom=0 old_prom=256 auid=4294967295 Oct 17 03:07:36 office-host eth0: port 5(vif5.0) entering disabled state Oct 17 03:07:36 office-host logger: /etc/xen/scripts/blktap: remove XENBUS_PATH=backend/tap/5/768 Oct 17 03:07:36 office-host logger: /etc/xen/scripts/blktap: remove XENBUS_PATH=backend/tap/5/832 Oct 17 03:07:36 office-host logger: /etc/xen/scripts/blktap: remove XENBUS_PATH=backend/tap/5/832 Oct 17 03:07:36 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/tap/5/768 Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:691: Unmapping vbd Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:312: Write_msg called: CTLMSG_CLOSE Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:240: DEL_DISKTYPE: Freeing entry Oct 17 03:07:36 office-host TAPDISK[5249]: tapdisk.c:389: Tapdisk: Received msg, len 16, type 7, UID 2 Oct 17 03:07:36 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/tap/5/832 Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:691: Unmapping vbd Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:312: Write_msg called: CTLMSG_CLOSE Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:240: DEL_DISKTYPE: Freeing entry Oct 17 03:07:36 office-host TAPDISK[5292]: tapdisk.c:389: Tapdisk: Received msg, len 16, type 7, UID 3 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:609: Received a poll for a new vbd Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:56: Sent domid 12 and be_id 768 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:23: /dev/xen/blktap3 device already exists Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:75: Received device id 3 and major 251 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:167: Detected handle: [aio] Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:562: tapdisk process does not exist: Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:580: Launching process, CMDLINE [tapdisk /var/run/tap/tapctrlwrite3 /var/run/tap/tapctrlread3] Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:327: Write_msg called: CTLMSG_PID Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:426: Received CTLMSG_PID_RSP Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:432: PID: [15534] Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:270: Write_msg called: CTLMSG_PARAMS, sending [aio:/dev/VolData1/TermSrv2, /dev/VolData1/TermSrv2] Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:284: Generated cookie, 22 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:400: Received CTLMSG_IMG: 167772160, 512, 0 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:674: Received a poll for a new devmap Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:293: Write_msg called: CTLMSG_NEWDEV Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:411: Received CTLMSG_NEWDEV_RSP Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:684: Exiting map_new_blktapctrl Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:389: Tapdisk: Received msg, len 16, type 9, UID 22 Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:389: Tapdisk: Received msg, len 39, type 1, UID 22 Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:398: Received CTLMSG_PARAMS: [/dev/VolData1/TermSrv2] Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:406: Loaded driver: name [tapdisk_aio], type [0] Oct 17 03:07:37 office-host TAPDISK[15534]: block-aio.c:127: block-aio open('/dev/VolData1/TermSrv2') Oct 17 03:07:37 office-host TAPDISK[15534]: tapaio.c:237: AIO state initialised Oct 17 03:07:37 office-host TAPDISK[15534]: block-aio.c:147: open(/dev/VolData1/TermSrv2) with O_DIRECT Oct 17 03:07:37 office-host TAPDISK[15534]: block-aio.c:84: Image size: pre sector_shift [85899345920] post sector_shift [167772160] Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:159: Adding fd_list_entry Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:421: Entered cookie 22 Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:389: Tapdisk: Received msg, len 20, type 4, UID 22 Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:447: Retrieving state, cookie 22.....[OK] Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: brctl delif eth0 vif5.0 failed Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: add XENBUS_PATH=backend/tap/12/832 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:609: Received a poll for a new vbd Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:56: Sent domid 12 and be_id 832 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:23: /dev/xen/blktap4 device already exists Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:75: Received device id 4 and major 251 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:167: Detected handle: [aio] Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:562: tapdisk process does not exist: Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:580: Launching process, CMDLINE [tapdisk /var/run/tap/tapctrlwrite4 /var/run/tap/tapctrlread4] Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:327: Write_msg called: CTLMSG_PID Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:389: Tapdisk: Received msg, len 16, type 9, UID 23 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:426: Received CTLMSG_PID_RSP Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:432: PID: [15569] Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:270: Write_msg called: CTLMSG_PARAMS, sending [aio:/dev/VolStore1/TermSrv2Swap, /dev/VolStore1/TermSrv2Swap] Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:284: Generated cookie, 23 Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:389: Tapdisk: Received msg, len 44, type 1, UID 23 Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:398: Received CTLMSG_PARAMS: [/dev/VolStore1/TermSrv2Swap] Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:406: Loaded driver: name [tapdisk_aio], type [0] Oct 17 03:07:37 office-host TAPDISK[15569]: block-aio.c:127: block-aio open('/dev/VolStore1/TermSrv2Swap') Oct 17 03:07:37 office-host TAPDISK[15569]: tapaio.c:237: AIO state initialised Oct 17 03:07:37 office-host TAPDISK[15569]: block-aio.c:147: open(/dev/VolStore1/TermSrv2Swap) with O_DIRECT Oct 17 03:07:37 office-host TAPDISK[15569]: block-aio.c:84: Image size: pre sector_shift [2147483648] post sector_shift [4194304] Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:159: Adding fd_list_entry Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:421: Entered cookie 23 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:400: Received CTLMSG_IMG: 4194304, 512, 0 Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:674: Received a poll for a new devmap Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:293: Write_msg called: CTLMSG_NEWDEV Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:389: Tapdisk: Received msg, len 20, type 4, UID 23 Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:447: Retrieving state, cookie 23.....[OK] Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:411: Received CTLMSG_NEWDEV_RSP Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:684: Exiting map_new_blktapctrl Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: ifconfig vif5.0 down failed Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: add XENBUS_PATH=backend/tap/12/768 Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge offline for vif5.0, bridge eth0. Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: add XENBUS_PATH=backend/tap/12/832 Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: add XENBUS_PATH=backend/tap/12/768 Oct 17 03:07:37 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/vif/5/0 Oct 17 03:07:37 office-host device tap12.0 entered promiscuous mode Oct 17 03:07:37 office-host audit(1224205657.330:22): dev=tap12.0 prom=256 old_prom=0 auid=4294967295 Oct 17 03:07:37 office-host eth0: port 4(tap12.0) entering learning state Oct 17 03:07:37 office-host eth0: topology change detected, propagating Oct 17 03:07:37 office-host eth0: port 4(tap12.0) entering forwarding state Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: online XENBUS_PATH=backend/vif/12/0 Oct 17 03:07:37 office-host device vif12.0 entered promiscuous mode Oct 17 03:07:37 office-host audit(1224205657.383:23): dev=vif12.0 prom=256 old_prom=0 auid=4294967295 Oct 17 03:07:37 office-host eth0: port 5(vif12.0) entering learning state Oct 17 03:07:37 office-host eth0: topology change detected, propagating Oct 17 03:07:37 office-host eth0: port 5(vif12.0) entering forwarding state Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge online for vif12.0, bridge eth0. Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: Writing backend/vif/12/0/hotplug-status connected to xenstore. Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: Writing backend/tap/12/832/hotplug-status connected to xenstore. Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: Writing backend/tap/12/768/hotplug-status connected to xenstore. Oct 17 03:08:14 office-host blktap: ring-ref 2047, event-channel 5, protocol 1 (unspecified, assuming native) Oct 17 03:08:14 office-host blk_tap: unknown operation [255] Oct 17 03:08:14 office-host blk_tap: Bad number of segments in request (0) Oct 17 03:08:14 office-host blktap: ring-ref 1758, event-channel 6, protocol 1 (unspecified, assuming native) Oct 17 03:08:14 office-host blk_tap: unknown operation [255] Oct 17 03:08:14 office-host blk_tap: Bad number of segments in request (0) ... -- Configure bugmail: http://bugzilla.xensource.com/bugzilla/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. _______________________________________________ Xen-bugs mailing list Xen-bugs@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-bugs
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |