Re: [Xen-users] iscsi conn error: Xen related?
Hey Tomasz,

I got an interesting and clear trace when things started going south this morning, for no apparent reason (i.e. no load). Load shouldn't be a problem in this environment yet. It started with a nop-out timing out, then sank from there until all I/O failed.

I have indeed also opened a ticket with open-e, but haven't gotten an answer yet. I also launched a ping -s 8192 -i 3 -I ethXX to the storage, to see whether I am losing ICMP packets when the iSCSI connections are lost.

An upgrade may be an option soon. I also saw that Xen 3.1.2 is out, so I may upgrade everything at once in a while if the problem persists and no solution is found.

The switches don't have anything in their logs that could indicate an issue with jumbo frames, or anything else for that matter.

Thanks all,
fred

Tomasz Chmielewski wrote:
> Fred Blaise wrote:
>> Hello all,
>>
>> I got some severe iscsi connection loss on my dom0 (Gentoo 2.6.20-xen-r6, xen 3.1.1). Happening several times a day.
>> open-iscsi version is 2.0.865.12. Target iscsi is the open-e DSS product.
>>
>> Here is a snip of my messages log file:
>>
>> May 5 16:52:50 ying connection226:0: iscsi: detected conn error (1011)
>> May 5 16:52:51 ying iscsid: connect failed (111)
>> May 5 16:52:51 ying iscsid: Kernel reported iSCSI connection 226:0 error (1011) state (3)
>> May 5 16:52:53 ying connection215:0: iscsi: detected conn error (1011)
>> May 5 16:52:53 ying iscsid: connect failed (111)
>> May 5 16:52:53 ying iscsid: connect failed (111)
>> May 5 16:52:53 ying iscsid: connect failed (111)
>> May 5 16:52:53 ying iscsid: connect failed (111)
>> [...]
>>
>> and sometimes:
>>
>> May 5 16:53:11 ying iscsid: connection227:0 is operational after recovery (6 attempts)
>> May 5 16:53:11 ying iscsid: connection221:0 is operational after recovery (6 attempts)
>> May 5 16:53:12 ying iscsid: connection214:0 is operational after recovery (9 attempts)
>
> I doubt it's Xen related.
>
> I'm running lots of dom0s and domUs (and non-Xen machines) as iSCSI initiators, mostly without such problems.
>
> If it ever happens, it can point to:
> 1) a problem in the iSCSI target implementation,
> 2) the target or the initiator (or both) being very loaded.
>
> Did you try changing the iSCSI target, either to tgt or SCST? I'm not sure what target you have with open-e; I think they wanted to migrate to SCST, but used the buggy IET before (or still use it, I'm not sure).
>
> Any other messages/logs?
>
> 2.6.25 has a nice feature with soft lockup detection, i.e. it prints messages like this one when the machine is severely loaded (which may indicate some problems):
>
> May 3 00:46:33 backup1 kernel: INFO: task sync:4875 blocked for more than 120 seconds.

Attachment:
snap_msglog.txt
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
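
For anyone wanting to see how often each connection drops and how many attempts recovery takes, the iscsid lines quoted in this thread can be tallied with a small script. This is only a rough sketch: the regular expressions are written against the exact message wording shown above (other open-iscsi versions may word them differently), and the names summarize and SAMPLE are made up for illustration.

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this thread (assumed wording).
CONN_ERROR = re.compile(r"connection(\d+:\d+): iscsi: detected conn error \((\d+)\)")
RECOVERED = re.compile(r"connection(\d+:\d+) is operational after recovery \((\d+) attempts\)")
CONNECT_FAILED = re.compile(r"iscsid: connect failed \((\d+)\)")


def summarize(lines):
    """Return (errors per connection, recovery attempts per connection,
    connect-failed counts per errno) for an iterable of syslog lines."""
    errors = Counter()
    recoveries = {}
    connect_failed = Counter()
    for line in lines:
        m = CONN_ERROR.search(line)
        if m:
            errors[m.group(1)] += 1
            continue
        m = RECOVERED.search(line)
        if m:
            # Keep the attempt count of the most recent recovery.
            recoveries[m.group(1)] = int(m.group(2))
            continue
        m = CONNECT_FAILED.search(line)
        if m:
            # On Linux, errno 111 is ECONNREFUSED.
            connect_failed[int(m.group(1))] += 1
    return errors, recoveries, connect_failed


# A few lines copied from the messages quoted above.
SAMPLE = [
    "May 5 16:52:50 ying connection226:0: iscsi: detected conn error (1011)",
    "May 5 16:52:51 ying iscsid: connect failed (111)",
    "May 5 16:52:51 ying iscsid: Kernel reported iSCSI connection 226:0 error (1011) state (3)",
    "May 5 16:52:53 ying connection215:0: iscsi: detected conn error (1011)",
    "May 5 16:52:53 ying iscsid: connect failed (111)",
    "May 5 16:53:11 ying iscsid: connection227:0 is operational after recovery (6 attempts)",
    "May 5 16:53:12 ying iscsid: connection214:0 is operational after recovery (9 attempts)",
]

if __name__ == "__main__":
    errors, recoveries, connect_failed = summarize(SAMPLE)
    print("conn errors per connection:", dict(errors))
    print("recovery attempts per connection:", recoveries)
    print("connect failures per errno:", dict(connect_failed))
```

To run it against a real log, pass an open file instead of SAMPLE, e.g. summarize(open("/var/log/messages")); the path is an assumption and depends on the syslog setup. Comparing the timestamps of the errors with the ping output should show whether the drops coincide with packet loss.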