
Re: [Xen-users] Disk errors on one xen domain



I forgot to say: I'm running Debian Lenny 64-bit and using the Xen packages shipped with the distro.

On 05/25/2010 08:43 AM, Nicolas Michel wrote:
Hello,

I have 3 physical servers with some virtual machines on each.
When I look at dmesg on one of them I get these errors:

*************************************************************************
[34783.559174] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[34783.559248] hda: task_in_intr: error=0x04 { AbortedCommand }
[34783.559289] ide: failed opcode was: 0xec
[121232.732355] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[121232.732413] hda: task_in_intr: error=0x04 { AbortedCommand }
[121232.732455] ide: failed opcode was: 0xec
[207708.187565] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[207708.187623] hda: task_in_intr: error=0x04 { AbortedCommand }
[207708.187664] ide: failed opcode was: 0xec
[294224.164969] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[294224.165029] hda: task_in_intr: error=0x04 { AbortedCommand }
[294224.165075] ide: failed opcode was: 0xec
[380705.378232] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[380705.378232] hda: task_in_intr: error=0x04 { AbortedCommand }
[380705.378232] ide: failed opcode was: 0xec
[467193.505658] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[467193.505717] hda: task_in_intr: error=0x04 { AbortedCommand }
[467193.505758] ide: failed opcode was: 0xec
[553683.657031] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[553683.657091] hda: task_in_intr: error=0x04 { AbortedCommand }
[553683.657132] ide: failed opcode was: 0xec
[640176.673218] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[640176.673218] hda: task_in_intr: error=0x04 { AbortedCommand }
[640176.673218] ide: failed opcode was: 0xec
[726657.593721] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[726657.593721] hda: task_in_intr: error=0x04 { AbortedCommand }
[726657.593721] ide: failed opcode was: 0xec:
******************************************************************
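
The timestamps in the excerpt above can be checked for a pattern. What follows is only a minimal sketch in Python, not a diagnostic tool: the function name error_intervals and the embedded two-entry SAMPLE are assumptions made for illustration, and the regular expression matches only the "status=" line that opens each error triplet. It prints the time elapsed between consecutive errors. Applied to the timestamps shown above, the gaps all come out at roughly 24 hours, which may point to something that runs once a day rather than to random disk failures.

```python
import re

# Matches the first line of each error triplet, e.g.
# "[34783.559174] hda: task_in_intr: status=0x51 { DriveReady ..."
STATUS_RE = re.compile(r"^\[\s*(\d+\.\d+)\] hda: task_in_intr: status=")

# Two entries copied from the excerpt above, used only as demo input.
SAMPLE = """\
[34783.559174] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[34783.559248] hda: task_in_intr: error=0x04 { AbortedCommand }
[34783.559289] ide: failed opcode was: 0xec
[121232.732355] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[121232.732413] hda: task_in_intr: error=0x04 { AbortedCommand }
[121232.732455] ide: failed opcode was: 0xec
"""


def error_intervals(dmesg_text):
    """Return the seconds elapsed between consecutive hda status errors."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = STATUS_RE.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    # Pair each timestamp with the next one and take the difference.
    return [round(later - earlier, 3)
            for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    for gap in error_intervals(SAMPLE):
        print(f"{gap:.0f} s ({gap / 3600:.2f} h)")
```

To run it against a whole log, the SAMPLE string can be replaced with the saved dmesg output.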

The full dmesg output is in the attached file.
Searching Google, I found some comments saying that these errors mean
the disk is dying. But this is a relatively recent server (1 year old)
with 6 disks in RAID 10.

Since I put that server into production, it has crashed 3 times. It
responds to pings, but there is no ssh access (neither on the Xen domain
nor on the virtual machines). Some services on the virtual machines
continue to respond, others don't. The only solution is a hard reboot.



_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

