
Re: [Xen-users] Dell Poweredge 2650 - heavy IO hangs domU machines; xen 2.0.7, xen kernel 2.6.11.12



This is a pretty serious problem... no bites yet? This doesn't sound
familiar to anyone?

It's not a Bacula problem. Bacula doesn't do this to us when it runs on
unvirtualized hardware.

Stephen Bosch wrote:
> Hello:
> 
> At the off-list suggestion of another user, we have tried adding
> 'noirqbalance' to the xen start line in grub, we've disabled USB in the
> system BIOS, and we've added 'nousb' to the kernel parameters.
> 
> The problem is still there, exactly as before, even with all those changes.
> 
> *All* the virtual machines lose network connectivity, not just the ones
> involved in the backup. We have an LDAP server VM running on this
> hardware that is totally idle when this hang happens. We cannot ping or
> ssh into them. We can get a console using 'xm console', but after
> entering the userid, the login times out (after 60 seconds) before we
> ever get a password prompt.
> 
> I still suspect an interrupt problem: it would appear that the tty is
> unable to do a disk read to do authentication. At the same time, the
> tape backup process hangs.
> 
> If we kill the bacula storage daemon on dom0, all of the virtual
> machines release and we can log in again. At no point does anything
> reboot -- it just hangs, and it's not a fatal hang. If the backup
> process stops, whether through a timeout or by forcibly stopping the
> storage daemon, the virtual machines are again pingable and we can log
> in with either ssh or 'xm console'.
> 
> We tried monitoring the memory usage during the backup test by running
> 'top' in separate console windows. Loads were actually modest and there
> was plenty of memory remaining on all the virtual machines (over 1 GB of
> free RAM in one case).
> 
> To recap: this is a Dell *2650*, not a 2850. It has a Serverworks, not
> an Intel chipset. The RAID controller is a PERC 3 DC (LSI Logic) which
> uses the Megaraid drivers. The controller firmware has been upgraded to
> 3.35/1.07, the most recent available.
> 
> Note also -- dom0 is unaffected. We can still interact with dom0 without
> trouble. This hang affects only the virtual machines.
> 
> Cheers,
> 
> -Stephen-
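
For anyone who wants to test the interrupt theory above, here is a minimal
sketch (not from the original posts) that compares two snapshots of
/proc/interrupts taken on dom0, one before and one during the hang, and
lists the IRQ lines whose counters stopped advancing. The function names
are hypothetical, and it assumes the usual /proc/interrupts layout: a header
row of CPU columns followed by "label: count count ... description" rows.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq_label: total_count}."""
    lines = text.strip().splitlines()
    # The header row names one column per CPU (CPU0, CPU1, ...).
    ncpu = len(lines[0].split())
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Only the first ncpu fields can be per-CPU counters; the rest
        # describe the controller and the device.
        counts = [int(f) for f in fields[:ncpu] if f.isdigit()]
        totals[label.strip()] = sum(counts)
    return totals


def stalled_irqs(before, after):
    """Return labels that had activity before but did not advance after."""
    return sorted(
        label
        for label, count in before.items()
        if count > 0 and after.get(label, 0) == count
    )


def snapshot(path="/proc/interrupts"):
    """Read the current interrupt counters (path is an assumption)."""
    with open(path) as handle:
        return parse_interrupts(handle.read())
```

Take one snapshot() before starting the backup and another once the domUs
stop responding, then pass both to stalled_irqs(). A stalled line only
hints at where to look; it does not prove that interrupts are the cause.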


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

