Re: [Xen-users] Possible bug with scsi disk and Xen
On Sat, Jan 29, 2011 at 07:32:31PM +0100, Jordan Pittier wrote:
> >Xen dom0 kernel does irq handling through Xen hypervisor,
> >so that might make some drivers behave in a different way baremetal vs. dom0.
> Ok, so the driver is a likely culprit for this SCSI craziness.

I'm not sure if it is, but it *could* be.

> >What driver version does the squeeze kernel have?
> 3.04, which seems to be several years old. There are a lot of users
> complaining about LSI drivers all over the Internet.
> I will keep you posted as soon as I manage to build the latest driver.

See here for tips on how to build an updated megaraid_sas driver:
http://lists.xensource.com/archives/html/xen-devel/2010-11/msg00250.html

Maybe it also helps with your driver.

-- Pasi

> On Sat, Jan 29, 2011 at 7:25 PM, Pasi Kärkkäinen <[1]pasik@xxxxxx> wrote:
> > On Sat, Jan 29, 2011 at 07:03:16PM +0100, Jordan Pittier wrote:
> >> Thanks for your reply. LSI does indeed have a newer driver for the controller,
> >> but I can't build it; there's an error when I try to compile it [see
> >> attachment]. I will give it another try in the next few days.
> >>
> >> What is puzzling is that the IO errors only occur with the Xen HV. I am
> >> 100% willing to accept that the problem is the driver, but how come
> >> the exact same kernel (the xenified one) works fine without Xen
> >> loaded? I am almost a noob in kernel/driver stuff, but I thought
> >> the drivers were entirely in the kernel.
> >>
> >
> > Yep, the driver is entirely in the kernel, but that's not the whole story.
> >
> > Xen dom0 kernel does irq handling through Xen hypervisor,
> > so that might make some drivers behave in a different way baremetal vs. dom0.
> >
> > Also remember dom0 is a *vm*, so some timing stuff might happen
> > differently on baremetal vs. dom0.
> >
> >> I will try with the latest kernel in a few days.
> >>
> >> SLES11SP1 ships mptfusion 4.22
> >> ([2]http://www.novell.com/linux/releasenotes/x86_64/SUSE-SLES/11-SP1/#driver-updates-storage)
> >> I don't know about RHEL.
> >>
> >
> > What driver version does the squeeze kernel have?
> >
> > -- Pasi
> >
> >> On Sat, Jan 29, 2011 at 6:02 PM, Pasi Kärkkäinen <[3]pasik@xxxxxx> wrote:
> >> > On Sat, Jan 29, 2011 at 04:27:25PM +0100, Jordan Pittier wrote:
> >> >>    Hi,
> >> >>    I have been tracking a bug affecting all my servers running Debian Squeeze
> >> >>    for more than a month now, and I *desperately* need your help :)
> >> >>    I have 10 Sun v20z servers (2*66GB SCSI disks in RAID 1 == mirror). 4 of
> >> >>    them are running Debian Squeeze with the latest Xen Debian kernel
> >> >>    (2.6.32-5-xen-amd64 == 2.6.32-29). The rest are running Debian Lenny
> >> >>    (2.6.26-2-xen-amd64 == 2.6.26-26lenny1).
> >> >>    On a Squeeze box, under very high IO (such as running an IO stress test,
> >> >>    i.e. bonnie++), the server starts behaving weirdly and I see messages like
> >> >>    these in kernel.log: [see attachment]. Then the server becomes totally
> >> >>    unresponsive (but doesn't "freeze") and commands such as "ls" or "reboot"
> >> >>    don't work anymore. I have to do a hard reboot. After the server has
> >> >>    rebooted, the RAID array seems degraded (I am using the mpt-status command)
> >> >>    and starts rebuilding. After several hours, the RAID array is "fine"
> >> >>    ("clean"). The RAID controller is "LSI53C1030" U320, with driver "Fusion
> >> >>    MPT SPI Host driver 3.04.06". I have attached the result of "lsmod".
> >> >>    None of my Lenny boxes are affected by this issue; all of my Squeeze boxes
> >> >>    are.
> >> >>    What does it have to do with Xen? When I boot my Squeeze boxes without
> >> >>    the Xen hypervisor but with the same Xen kernel, bonnie++ runs
> >> >>    *absolutely* fine.
> >> >>    The issue appears only with the Xen hypervisor loaded.
> >> >>    There is a Debian bug report for this:
> >> >>    [4]http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=603727
> >> >>    Any suggestions?
> >> >
> >> > Did you check if LSI has a newer driver version available?
> >> >
> >> > Also you might check which driver version for example RHEL6
> >> > or SLES11SP1 ships with.. both of those distros have 2.6.32 kernels too.
> >> >
> >> > On one of my testboxes I need to upgrade the LSI driver
> >> > to a newer version to make it work. This is SAS based LSI though.
> >> >
> >> > Can you try using another disk controller?
> >> >
> >> > Also: Did you try using the latest kernel (-30)?
> >> >
> >> > -- Pasi
> >> >
>
> References
>
>    Visible links
>    1. mailto:pasik@xxxxxx
>    2. http://www.novell.com/linux/releasenotes/x86_64/SUSE-SLES/11-SP1/#driver-updates-storage
>    3. mailto:pasik@xxxxxx
>    4. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=603727

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users