
Re: [Xen-users] Possible bug with scsi disk and Xen


  • To: Pasi Kärkkäinen <pasik@xxxxxx>
  • From: Jordan Pittier <jordan.pittier@xxxxxxxxx>
  • Date: Sat, 29 Jan 2011 19:32:31 +0100
  • Cc: xen-users@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Sat, 29 Jan 2011 10:33:45 -0800
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

>The Xen dom0 kernel does IRQ handling through the Xen hypervisor,
>so that might make some drivers behave differently on bare metal vs. dom0.
OK, so the driver is a likely culprit for this SCSI craziness.

>What driver version does the squeeze kernel have?
3.04.06, which seems to be several years old. Lots of users are complaining about LSI drivers all over the Internet.
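
For anyone who wants to check the same thing on their own box, here is a minimal sketch. It assumes the controller is driven by the mptspi module and that the module exports its version under /sys/module/mptspi/version; both are assumptions, and the helper names are made up for illustration.

```python
from pathlib import Path
from typing import Optional


def read_module_version(module: str, sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return the version string a loaded kernel module exports, or None.

    Assumption: the module publishes a 'version' file in sysfs. Modules
    that are not loaded, or that declare no version, yield None.
    """
    version_file = Path(sysfs_root) / module / "version"
    try:
        return version_file.read_text().strip()
    except OSError:
        return None


def version_tuple(version: str) -> tuple:
    """Turn '3.04.06' into (3, 4, 6) so versions compare numerically."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


if __name__ == "__main__":
    # 'mptspi' is assumed to be the module behind "Fusion MPT SPI Host driver".
    loaded = read_module_version("mptspi")
    print("mptspi version:", loaded or "module not loaded or no version exported")
```

Comparing version_tuple() results makes it easy to see whether the loaded driver is older than the one another distro ships.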

I will keep you posted as soon as I manage to build the latest driver.

On Sat, Jan 29, 2011 at 7:25 PM, Pasi Kärkkäinen <pasik@xxxxxx> wrote:
> On Sat, Jan 29, 2011 at 07:03:16PM +0100, Jordan Pittier wrote:
>> Thanks for your reply. LSI does indeed have a newer driver for the controller,
>> but I can't build it: there's an error when I try to compile it [see
>> attachment]. I will give it another try in the next few days.
>>
>> What is puzzling is that the IO errors only occur with the Xen hypervisor. I am
>> 100% willing to accept that the problem is the driver, but how come
>> the exact same kernel (the xenified one) works fine without Xen
>> loaded? I am almost a noob in kernel and driver matters, but I thought
>> the drivers were entirely in the kernel.
>>
>
> Yep, the driver is entirely in the kernel, but that's not the whole story.
>
> The Xen dom0 kernel does IRQ handling through the Xen hypervisor,
> so that might make some drivers behave differently on bare metal vs. dom0.
>
> Also remember dom0 is a *vm*, so some timing stuff might happen
> differently on baremetal vs. dom0.
>
>> I will try with the latest kernel in a few days.
>>
>> SLES11SP1 ships mptfusion 4.22
>> (http://www.novell.com/linux/releasenotes/x86_64/SUSE-SLES/11-SP1/#driver-updates-storage)
>> I don't know about RHEL.
>>
>
> What driver version does the squeeze kernel have?
>
>
> -- Pasi
>
>
>> On Sat, Jan 29, 2011 at 6:02 PM, Pasi Kärkkäinen <pasik@xxxxxx> wrote:
>> > On Sat, Jan 29, 2011 at 04:27:25PM +0100, Jordan Pittier wrote:
>> >>    Hi,
>> >>    I have been tracking a bug affecting all my servers running Debian Squeeze
>> >>    for more than a month now, and I desperately need your help :)
>> >>    I have 10 Sun v20z servers (2 x 66 GB SCSI disks in RAID 1, i.e. mirrored). 4 of
>> >>    them are running Debian Squeeze with the latest Xen Debian kernel
>> >>    (2.6.32-5-xen-amd64 == 2.6.32-29). The rest are running Debian Lenny
>> >>    (2.6.26-2-xen-amd64 == 2.6.26-26lenny1).
>> >>    On a Squeeze box, under very high IO (such as an IO stress test like
>> >>    bonnie++), the server starts behaving weirdly and I see messages like these
>> >>    in kernel.log: [see attachment]. Then the server becomes totally
>> >>    unresponsive (but doesn't "freeze") and commands such as "ls" or "reboot"
>> >>    don't work anymore. I have to do a hard reboot. After the server has
>> >>    rebooted, the RAID array appears degraded (according to the mpt-status command)
>> >>    and starts rebuilding. After several hours, the RAID array is "fine"
>> >>    ("clean"). The RAID controller is an "LSI53C1030" U320, with driver "Fusion
>> >>    MPT SPI Host driver 3.04.06". I have attached the output of "lsmod".
>> >>    None of my Lenny boxes are affected by this issue; all of my Squeeze boxes
>> >>    are.
>> >>    What does it have to do with Xen? When I boot my Squeeze boxes without
>> >>    the Xen hypervisor but with the same Xen kernel, bonnie++ runs absolutely fine.
>> >>    The issue appears only with the Xen hypervisor loaded.
>> >>    There is a Debian bug report for this:
>> >>    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=603727
>> >>    Any suggestions?
>> >
>> > Did you check whether LSI has a newer driver version available?
>> >
>> > Also, you might check which driver version RHEL6 or SLES11SP1
>> > ships with; both of those distros have 2.6.32 kernels too.
>> >
>> > On one of my testboxes I need to upgrade the LSI driver
>> > to a newer version to make it work. This is SAS based LSI though.
>> >
>> > Can you try using another disk controller?
>> >
>> > Also: Did you try using the latest kernel (-30) ?
>> >
>> > -- Pasi
>> >
>> >
>

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

 

