RE: [Xen-users] Live migration problem
Spoke too soon... it failed after about the 20th or so migration. But it is more stable than it was...

-- Ray

-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Cole, Ray
Sent: Wednesday, August 31, 2005 4:14 PM
To: Steven Hand
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-users] Live migration problem

I think I have it fixed, but I'm not sure why :-) I modified reboot.c's shutdown_handler routine to NOT call ctrl_if_send_response(). This appears to make live migration rock solid on my machines.

It appears to me that if the xenU kernel attempts to respond to the suspend command, it runs the risk of locking up. I have very little knowledge of the Xen code, but it seems to me that if it works when the response is removed, then either nobody is expecting a response on the other end of the conversation or a response is already being sent from somewhere else.

I realize that commenting this out also means no response is sent for SYSRQ commands and the like, so this is by no means a proper 'fix'. But I think the root cause of the problem I've been having, with live migration periodically giving me errors that it cannot suspend, has perhaps been found. I've now performed a live migration about 14 times without it failing with this change in place.

Is this enough information for someone to figure out what the real cure should be? I'm starting to think that shutdown_handler should not call ctrl_if_send_response if it is a suspend request and no previous suspend request was pending, and should call ctrl_if_send_response otherwise. But I'd just be guessing.
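To make that guess concrete, here is a rough user-space model of the rule, not the actual reboot.c code. The names request_type, handler_model, should_send_response and handle_request are made up for illustration, and the two counters only stand in for the places where the real handler would call ctrl_if_send_response() and schedule_work(). Whether this is the right cure is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request kinds; the real message subtypes live in the Xen tree. */
enum request_type { REQ_POWEROFF, REQ_REBOOT, REQ_SUSPEND, REQ_SYSRQ };

/* Hypothetical model of the handler's state, for illustration only. */
struct handler_model {
    bool suspend_pending;  /* a suspend has been queued and has not run yet */
    int  responses_sent;   /* stands in for calls to ctrl_if_send_response() */
    int  work_scheduled;   /* stands in for calls to schedule_work() */
};

/* The guessed rule: skip the response only for a suspend request that
 * arrives while no earlier suspend is pending; respond in every other case. */
static bool should_send_response(const struct handler_model *m,
                                 enum request_type req)
{
    return !(req == REQ_SUSPEND && !m->suspend_pending);
}

/* Model of one pass through the handler. */
static void handle_request(struct handler_model *m, enum request_type req)
{
    assert(m != NULL);

    /* Decide before touching the pending flag, so the rule sees the
     * state as it was when the request arrived. */
    bool respond = should_send_response(m, req);

    if (req == REQ_SUSPEND && !m->suspend_pending) {
        m->suspend_pending = true;
        m->work_scheduled++;
    }
    if (respond)
        m->responses_sent++;
}
```

In this model a first suspend request queues the work and sends nothing, a repeated suspend request while one is pending gets a response, and SYSRQ, reboot and poweroff requests are answered as before.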
-- Ray

-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Cole, Ray
Sent: Wednesday, August 31, 2005 3:25 PM
To: Steven Hand
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-users] Live migration problem

Looks like the suspend message is received in the shutdown handler. schedule_work is called to schedule the work but, sporadically, that work is never executed. It is as if schedule_work doesn't really schedule it or it is unable to get executed.

-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Cole, Ray
Sent: Wednesday, August 31, 2005 12:41 PM
To: Steven Hand
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-users] Live migration problem

I decided to put some printk's into reboot.c's __do_suspend. During a "good" live migration run I see the printk's show up on the console. In the bad one I see that __do_suspend never gets called :-( I'll continue to follow it up the chain to see if it never gets the message to suspend at all or if something is going bad between getting the message and suspending.

I'm running xen-2.0-testing with the xen-2.0 2.6.11.12-xenU kernel BTW.

-- Ray

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users