Re: [Xen-users] Possible bug with scsi disk and Xen
On Wed, Feb 02, 2011 at 11:43:29PM +0100, Jordan Pittier wrote:
> Hi,
> Finally I managed to compile the LSI MPT Fusion 4.22 driver. I took the
> source from the 2.6.34 kernel shipped with SLES, then slightly changed the
> driver sources to "backport" it to a Debian 2.6.32 kernel.
> Now my servers seem 100% stable, so I am very happy :) Thanks for your
> big hint toward a possibly outdated driver.
>

Good to hear it helped!

-- Pasi

> Jordan
>
> On Sat, Jan 29, 2011 at 7:49 PM, Pasi Kärkkäinen <pasik@xxxxxx> wrote:
> > On Sat, Jan 29, 2011 at 07:32:31PM +0100, Jordan Pittier wrote:
> > > > Xen dom0 kernel does irq handling through Xen hypervisor,
> > > > so that might make some drivers behave in a different way
> > > > baremetal vs. dom0.
> > >
> > > Ok, so the driver is a good candidate for being "responsible" for
> > > this SCSI craziness.
> >
> > I'm not sure if it is, but it *could* be.
> >
> > > > What driver version does the squeeze kernel have?
> > >
> > > 3.04. Which seems to be several years old. There are a lot of users
> > > complaining about LSI drivers all over the Internet.
> > > I will keep you posted as soon as I manage to build the latest driver.
> >
> > See here for tips on how to build an updated megaraid_sas driver:
> > http://lists.xensource.com/archives/html/xen-devel/2010-11/msg00250.html
> > Maybe it helps also with your driver.
> >
> > -- Pasi
> >
> > > On Sat, Jan 29, 2011 at 7:25 PM, Pasi Kärkkäinen <pasik@xxxxxx> wrote:
> > > > On Sat, Jan 29, 2011 at 07:03:16PM +0100, Jordan Pittier wrote:
> > > > > Thanks for your reply. LSI does indeed have a newer driver for the
> > > > > controller, but I can't "build" it; there's an error when I try to
> > > > > compile it [see attachment]. I will give it another try in the
> > > > > next few days.
> > > > >
> > > > > What is puzzling is that the IO errors only occur with the Xen HV.
> > > > > I am 100% willing to accept that the problem is the driver, but how
> > > > > come the exact same kernel (the xenified one) works fine without
> > > > > Xen loaded? I am almost a noob in kernel/driver stuff, but I
> > > > > thought the drivers were entirely in the kernel.
> > > >
> > > > Yep, the driver is entirely in the kernel, but that's not the whole
> > > > story.
> > > >
> > > > Xen dom0 kernel does irq handling through Xen hypervisor,
> > > > so that might make some drivers behave in a different way
> > > > baremetal vs. dom0.
> > > >
> > > > Also remember dom0 is a *vm*, so some timing stuff might happen
> > > > differently on baremetal vs. dom0.
> > > >
> > > > > I will try with the latest kernel in a few days.
> > > > >
> > > > > SLES11SP1 ships mptfusion 4.22
> > > > > (http://www.novell.com/linux/releasenotes/x86_64/SUSE-SLES/11-SP1/#driver-updates-storage).
> > > > > I don't know for RHEL.
> > > >
> > > > What driver version does the squeeze kernel have?
> > > >
> > > > -- Pasi
> > > >
> > > > > On Sat, Jan 29, 2011 at 6:02 PM, Pasi Kärkkäinen <pasik@xxxxxx> wrote:
> > > > > > On Sat, Jan 29, 2011 at 04:27:25PM +0100, Jordan Pittier wrote:
> > > > > > > Hi,
> > > > > > > I have been tracking a bug affecting all my servers running
> > > > > > > Debian Squeeze for more than a month now, and I desperately
> > > > > > > need your help :)
> > > > > > > I have 10 Sun v20z servers (2*66GB SCSI disk in RAID 1 ==
> > > > > > > mirror). 4 of them are running Debian Squeeze with the latest
> > > > > > > Xen Debian kernel (2.6.32-5-xen-amd64 == 2.6.32-29). The rest
> > > > > > > are running Debian Lenny (2.6.26-2-xen-amd64 ==
> > > > > > > 2.6.26-26lenny1).
> > > > > > > On a Squeeze box, under very high IO (such as running an IO
> > > > > > > stress test, i.e. bonnie++), the server starts behaving weirdly
> > > > > > > and I see messages like these in kernel.log: [see attachment].
> > > > > > > Then the server becomes totally unresponsive (but doesn't
> > > > > > > "freeze") and commands such as "ls" or "reboot" don't work
> > > > > > > anymore. I have to do a hard reboot. After the server has
> > > > > > > rebooted, the RAID array seems degraded (I am using the
> > > > > > > mpt-status command) and starts rebuilding. After several hours,
> > > > > > > the RAID array is "fine" ("clean"). The RAID controller is
> > > > > > > "LSI53C1030" U320, with driver "Fusion MPT SPI Host driver
> > > > > > > 3.04.06". I have attached the result of "lsmod".
> > > > > > > None of my Lenny boxes are affected by this issue, all of my
> > > > > > > Squeeze boxes are.
> > > > > > > What does it have to do with Xen? When I boot my Squeeze boxes
> > > > > > > without the Xen hypervisor but the same Xen kernel, bonnie++
> > > > > > > runs absolutely fine. The issue appears only with the Xen
> > > > > > > hypervisor loaded.
> > > > > > > There is a Debian bug report for this:
> > > > > > > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=603727
> > > > > > > Any suggestion?
> > > > > >
> > > > > > Did you check if LSI has newer driver version available?
> > > > > >
> > > > > > Also you might check which driver version for example RHEL6
> > > > > > or SLES11SP1 ships with.. both of those distros have 2.6.32
> > > > > > kernels too.
> > > > > >
> > > > > > On one of my testboxes I need to upgrade the LSI driver
> > > > > > to a newer version to make it work. This is SAS based LSI though.
> > > > > >
> > > > > > Can you try using another disk controller?
> > > > > >
> > > > > > Also: Did you try using the latest kernel (-30) ?
> > > > > >
> > > > > > -- Pasi

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
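
Editorial note: the thread turns on which Fusion MPT driver version a given kernel is running (3.04.06 on Squeeze versus 4.22 on SLES11SP1). The following is a minimal sketch, not something posted by the participants, of one way to read the version a loaded module reports and to compare two dotted version strings. It assumes the driver exports a version under /sys/module/<name>/version, which only modules that declare a version do, and it assumes the module names mptbase, mptscsih and mptspi; neither the names nor the helper functions come from the thread.

```python
from pathlib import Path
from typing import Optional, Tuple


def module_version(name: str, sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return the version string a loaded kernel module exports, or None.

    None means the module is not loaded or does not export a version file.
    """
    version_file = Path(sysfs_root) / name / "version"
    try:
        return version_file.read_text().strip()
    except OSError:
        return None


def version_key(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '3.04.06' into a tuple of integers.

    Parts that are not purely numeric are skipped, so this is only a rough
    ordering for versions written as dot-separated numbers.
    """
    return tuple(int(part) for part in version.split(".") if part.isdigit())


if __name__ == "__main__":
    # Module names are assumptions; adjust them to what lsmod shows.
    for name in ("mptbase", "mptscsih", "mptspi"):
        print(name, module_version(name) or "not loaded or no version exported")
```

With these helpers, `version_key("3.04.06") < version_key("4.22")` evaluates to true, which matches the ordering discussed in the thread. The `sysfs_root` parameter exists only so the lookup can be pointed at a test directory.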