
Re: [Xen-users] VM getting hang



Hello Fajar,

 Thank you so much for those suggestions. I'll surely let you know my findings 
when this issue occurs again. 

Once again, Thanks
 Gopi.  

On Sunday 25 January 2009 01:50, Fajar A. Nugraha wrote:
> On Sat, Jan 24, 2009 at 11:29 PM, gopikrishnan
>
> <gopikrishnan@xxxxxxxxxxxx> wrote:
> > From the above result, it appears like everything is normal. Can you give
> > any suggestions?
>
> A "normal" device should not trigger
>
> ===========
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> ============
>
> In my setup, similar cases have happened several times because of
> three problems:
>
> (1) the disks were simply busy.
> For example, some hosting appliances use a lot of I/O during startup.
> Putting several hosting domUs on the same dom0 and starting them all
> at the same time makes startup take a loooooong time.
> When this happens:
> - "iostat -x 3" on dom0 during the boot process will show that the
> disk is busy with high throughput
> - There are no weird messages in syslog
> - all you have to do is wait patiently
>
> (2) problems on the SAN switches/connections or HW raid controller
> For example, when your SAN switch is rebooted. This would block all
> disk I/O for some time, and in some cases can lead to data corruption.
> When this happens:
> - "iostat -x 3" on dom0 (at the time the problem occurs) will show
> that the disk is busy with very low or no throughput
> - depending on your setup, you might get "rejecting I/O to offline
> device" messages (check the CONSOLE to be sure, not just
> /var/log/messages)
> - sometimes the problem seems to "fix itself" without you having to do
> anything
>
> (3) broken disks or controller
> Similar to (2), but this can also happen on local storage. Everything
> seems to work correctly, but accessing certain data takes a loooong
> time or fails. This one's hardest to diagnose, but it sometimes has
> symptoms similar to (2).
>
> From your earlier mail I suspect it was (3). Then again, from "After a
> few hours (may be 8-10hrs), all these VPS will come up automatically."
> it can also be (1).
>
> To be sure though, you'll need to collect some diagnostics when the
> problem occurs:
> - how was disk throughput at that time (check with "iostat -x 3" or
> similar commands)
> - were there any weird messages on the CONSOLE or in /var/log/messages
> at that time (depending on the problem, it is possible that error
> messages were not written to /var/log/messages)
> - what was the domU load at that time. Do all domUs use 100% CPU?
>
> Note that some diagnostics have to be done at the time the problem
> occurs, not AFTER.
>
> Good luck!
>
> Regards,
>
> Fajar
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
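
The checks suggested in the quoted message could be scripted roughly as
follows. This is only a minimal sketch under stated assumptions: the
function names are made up for illustration, the regular expression is
modelled on the two "rejecting I/O to offline device" lines quoted above,
and the busy/low-throughput thresholds are hypothetical values that do not
come from the thread. The utilisation and throughput figures would have to
be read off "iostat -x 3" by hand or by a parser not shown here.

```python
import re
from collections import Counter

# Pattern modelled on the kernel message quoted in the thread, e.g.
# "sd 0:0:0:0: rejecting I/O to offline device".
OFFLINE_RE = re.compile(r"sd (\d+:\d+:\d+:\d+): rejecting I/O to offline device")


def count_offline_rejections(lines):
    """Count 'rejecting I/O to offline device' messages per SCSI address."""
    counts = Counter()
    for line in lines:
        match = OFFLINE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


def classify_disk(util_pct, throughput_kb_s, busy_pct=90.0, low_kb_s=100.0):
    """Roughly map one disk sample onto the cases described in the thread.

    busy_pct and low_kb_s are hypothetical thresholds chosen for
    illustration; tune them for your own hardware.
    """
    if util_pct < busy_pct:
        return "not busy"
    if throughput_kb_s <= low_kb_s:
        return "busy, low throughput: resembles case (2) or (3)"
    return "busy, high throughput: resembles case (1)"


if __name__ == "__main__":
    # The two console lines quoted in the thread.
    sample = [
        "sd 0:0:0:0: rejecting I/O to offline device",
        "sd 0:0:0:0: rejecting I/O to offline device",
    ]
    print(count_offline_rejections(sample))
    # Made-up sample values, only to show the call.
    print(classify_disk(util_pct=99.0, throughput_kb_s=5.0))
```

Since some errors may never reach /var/log/messages, the lines fed to
count_offline_rejections would ideally come from a capture of the console
taken while the problem is happening.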
