Re: [Xen-devel] phy disks and vifs timing out in DomU
[Ian, I copied you on this b/c of the netbk issue - read on]

> >>>>> On Thu, Jul 28, 2011 at 7:24 AM, Anthony Wright
> >>>>> <anthony@xxxxxxxxxxxxxxx> wrote:
> >>>>>> I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a
> >>>>>> 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two
> >>>>>> tap:aio disks are working fine, but the phy disks and the vif don't
> >>>>>> work and I get the following error messages from the DomU kernel
> >>>>>> during boot:
> >>>>>>
> >>>>>> [ 1.783658] Using IPI No-Shortcut mode
> >>>>>> [ 11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3)
> >>>>>> [ 11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3)

Hm, which version of DomU were these? I wonder if this is related to
the 'feature-barrier' that is not supported with 3.0. Do you see anything
in the DomU about the disks, or about xen-blkfront? Can you run the guests
with 'initcall_debug loglevel=8 debug' to see if blkfront is actually
running on those disks?

Any idea where the source for those DomU's is? If it is an issue with
'feature-barrier', it looks like the guest can't handle that option being
absent, which it should.

> > What device does that correspond to (hint: run xl block-list or xm
> > block-list)?
> >
> The output from block-list is:
>
> Vdev  BE handle state evt-ch ring-ref BE-path
> 51729 0  764    3     10     10       /local/domain/0/backend/vbd/764/51729
> 51745 0  764    3     11     11       /local/domain/0/backend/vbd/764/51745
> 51713 0  764    4     8      8        /local/domain/0/backend/qdisk/764/51713
> 51714 0  764    4     9      9        /local/domain/0/backend/qdisk/764/51714
>
> The two vbds map to two LVM logical volumes in two different volume groups.

qdisk.. OK, so it does swap over to the QEMU internal AIO path. From the
output, it looks like the ones that hang are the 'phy' types. Is that right?
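
For anyone reading the block-list output quoted above, here is a minimal Python sketch that decodes the Vdev and state columns. It assumes the guest uses the usual xvd naming (Linux major 202, 16 minors per disk); the state names follow the XenbusState enum in Xen's public io/xenbus.h. The function and table names are illustrative only and are not part of any Xen tool.

```python
# Sketch: decode the "Vdev" and "state" columns of `xm block-list` output.
# Assumption: the virtual devices are xvd* devices (Linux major 202),
# where each disk owns 16 minors and minor % 16 is the partition number.
XVD_MAJOR = 202
MINORS_PER_DISK = 16

# XenbusState values as defined in Xen's public io/xenbus.h.
XENBUS_STATES = {
    0: "Unknown",
    1: "Initialising",
    2: "InitWait",
    3: "Initialised",
    4: "Connected",
    5: "Closing",
    6: "Closed",
    7: "Reconfiguring",
    8: "Reconfigured",
}


def decode_vdev(vdev: int) -> str:
    """Turn a numeric vdev (major << 8 | minor) into an xvd device name."""
    major, minor = vdev >> 8, vdev & 0xFF
    if major != XVD_MAJOR:
        raise ValueError(f"vdev {vdev} is not an xvd device (major {major})")
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "xvd" + chr(ord("a") + disk)
    return name + (str(part) if part else "")


# (vdev, state) pairs copied from the block-list output in this thread.
rows = [(51729, 3), (51745, 3), (51713, 4), (51714, 4)]
for vdev, state in rows:
    print(vdev, decode_vdev(vdev), XENBUS_STATES[state])
```

Read this way, the two devices that time out sit in state 3 (Initialised) and never reach state 4 (Connected), while the two qdisk-backed devices are Connected.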
>
> On 29/07/2011 17:06, Konrad Rzeszutek Wilk wrote:
> >> > I have installed virtually identical systems on two physical machines -
> >> > identical (and I mean identical) xen, dom0, domU with possibly a
> > md5sum match?
> Yes - md5sum match on all the key components, i.e. xen, dom0 kernel,
> 99.9% of the root filesystem, the domU kernel & 99.9% of the domU
> filesystem. Where there isn't a precise match is on some of the config
> files. I don't think these should have any effect, but I will have a go
> at mirroring the disks (I can't swap disks since one is SATA & the other
> IDE).
>
> I also was having problems with the vif device, and got a kernel bug
> report that could potentially relate to it. I've attached two syslogs.

Yeah, that is bad. I actually see a similar issue if I forcibly kill the
guests. I haven't narrowed it down yet. You look to be using 4.1, but not
4.1.1, right?

Can you describe to me how you get the netbk crash?

> 2011 Jul 29 07:02:10 kernel: [ 33.242680] vbd vbd-1-51745: 1 mapping ring-ref 11 port 11
> 2011 Jul 29 07:02:10 kernel: [ 33.253038] vif vif-1-0: vif1.0: failed to map tx ring. err=-12 status=-1
> 2011 Jul 29 07:02:10 kernel: [ 33.253065] vif vif-1-0: 1 mapping shared-frames 768/769 port 12
> 2011 Jul 29 07:02:43 kernel: [ 66.103514] vif vif-1-0: 2 reading script
> 2011 Jul 29 07:02:43 kernel: [ 66.106265] br-internal: port 1(vif1.0) entering disabled state
> 2011 Jul 29 07:02:43 kernel: [ 66.106309] libfcoe_device_notification: NETDEV_UNREGISTER vif1.0
> 2011 Jul 29 07:02:43 kernel: [ 66.106333] br-internal: port 1(vif1.0) entering disabled state
> 2011 Jul 29 07:02:43 kernel: [ 66.106372] br-internal: mixed no checksumming and other settings.
> 2011 Jul 29 07:02:43 kernel: [ 66.114097] ------------[ cut here ]------------
> 2011 Jul 29 07:02:43 kernel: [ 66.114878] kernel BUG at mm/vmalloc.c:2164!
> 2011 Jul 29 07:02:43 kernel: [ 66.115058] invalid opcode: 0000 [#1] SMP
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Modules linked in:
> 2011 Jul 29 07:02:43 kernel: [ 66.115376]
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Pid: 20, comm: xenwatch Not tainted 3.0.0 #1 MSI MS-7309/MS-7309
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] EIP: 0061:[<c0494bff>] EFLAGS: 00010203 CPU: 1
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] EIP is at free_vm_area+0xf/0x19
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] EAX: 00000000 EBX: cf866480 ECX: 00000018 EDX: 00000000
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] ESI: cfa06800 EDI: d076c400 EBP: cfa06c00 ESP: d0ce7eb4
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Process xenwatch (pid: 20, ti=d0ce6000 task=d0c55140 task.ti=d0ce6000)
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Stack:
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] cfa06c00 c09e87aa fffc6e63 c0c4bd65 d0ce7ecc cfa06844 d0ce7ecc d0ce7ecc
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] cfa06c00 cfa06800 d076c400 cfa06c94 c09eace0 d04cd380 00000000 fffffffe
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] d0ce7f9c c061fe74 d04cd2e0 d076c420 d076c400 d0ce7f9c c09e9f8c d076c400
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Call Trace:
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c09e87aa>] ? xen_netbk_unmap_frontend_rings+0xbf/0xd3
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0c4bd65>] ? netdev_run_todo+0x1b7/0x1cc
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c09eace0>] ? xenvif_disconnect+0xd0/0xe4
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c061fe74>] ? xenbus_rm+0x37/0x3e
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c09e9f8c>] ? netback_remove+0x40/0x5d
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c062075d>] ? xenbus_dev_remove+0x2c/0x3d
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06620e6>] ? __device_release_driver+0x42/0x79
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06621ac>] ? device_release_driver+0xf/0x17
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0661818>] ? bus_remove_device+0x75/0x84
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0660693>] ? device_del+0xe6/0x125
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06606da>] ? device_unregister+0x8/0x10
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06205f0>] ? xenbus_dev_changed+0x71/0x129
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0405394>] ? check_events+0x8/0xc
> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c061f711>] ? xenwatch_thread+0xeb/0x113
> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c044792c>] ? wake_up_bit+0x53/0x53
> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c061f626>] ? xenbus_thread+0x1cc/0x1cc
> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c0447616>] ? kthread+0x63/0x68
> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c04475b3>] ? kthread_worker_fn+0x122/0x122
> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c0e0f036>] ? kernel_thread_helper+0x6/0x10
> 2011 Jul 29 07:02:43 kernel: [ 66.129624] Code: c1 00 00 00 01 89 f0 e8 a1 ff ff ff 81 6b 08 00 10 00 00 eb 02 31 db 89 d8 5b 5e c3 53 89 c3 8b 40 04 e8 9b ff ff ff 39 d8 74 04 <0f> 0b eb fe 5b e9 73 95 00 00 57 89 d7 56 31 f6 53 89 c3 eb 09
> 2011 Jul 29 07:02:43 kernel: [ 66.129624] EIP: [<c0494bff>] free_vm_area+0xf/0x19 SS:ESP 0069:d0ce7eb4
> 2011 Jul 29 07:02:43 kernel: [ 66.129624] ---[ end trace 7bb110af96f32256 ]---

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
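
A side note on the "failed to map tx ring. err=-12" line in the syslog above: kernel messages report errors as negated errno values, so -12 is -ENOMEM. The short Python sketch below does that lookup with the standard errno module; the helper name is made up for illustration, and it says nothing about why the mapping ran out of memory.

```python
import errno
import os


def describe_kernel_err(err: int) -> str:
    """Map a kernel-style (negative) error code to its errno name and text."""
    code = -err if err < 0 else err
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# Value taken from the "failed to map tx ring" line in the log.
print(describe_kernel_err(-12))
```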